Gemma 4 · macOS Architecture

Local AI infrastructure
on Apple Silicon

Gemma 4 via Ollama with remote API, browser UI, Screen Share & SSH. Click any node to explore setup details.

Model gemma4:12b
Ollama API :11434
Unified RAM 9.6 GB
Animated — primary data flow
Solid — secondary flow
Dashed — remote access path
Remote devices
API client
:11434
Python · Node · curl
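A minimal remote API call, sketched with curl. The LAN address is a placeholder; it assumes Ollama is listening on `0.0.0.0:11434` as configured below.

```shell
# Query the Ollama API from a remote client on the LAN.
OLLAMA_HOST_ADDR="192.168.1.50"   # placeholder — replace with the Mac's LAN IP

curl "http://${OLLAMA_HOST_ADDR}:11434/api/generate" \
  -d '{
    "model": "gemma4:12b",
    "prompt": "Summarize unified memory in one sentence.",
    "stream": false
  }'
```

The same endpoint works from Python or Node with any HTTP client; only the base URL changes.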
Browser
:3000
Open WebUI
Screen Share
:5900
VNC / Finder
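From another Mac, Screen Sharing can be opened via the `vnc://` URL scheme; the hostname below is a placeholder Bonjour name.

```shell
# Launch the macOS Screen Sharing client against the host (port 5900).
# "user" and "studio.local" are placeholders.
open "vnc://user@studio.local"
```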
SSH terminal
:22
Remote Login
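A sketch of the SSH path, including a local port forward so a remote machine can reach the API as if it were local. User and hostname are placeholders.

```shell
# SSH into the host and tunnel the Ollama port back to this machine.
ssh -L 11434:localhost:11434 user@studio.local

# While the tunnel is up, the remote machine can use:
#   curl http://localhost:11434/api/tags
```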
Network — LAN (192.168.x.x) or internet via an ngrok or SSH tunnel
macOS host · Apple Silicon
macOS services
Firewall
Allow :11434 :5900 :22
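The macOS application firewall is per-app rather than per-port, so "allowing" these ports means allowing the binaries that listen on them. A sketch using `socketfilterfw`; the Ollama binary path may differ per install.

```shell
# Allow the listening binaries through the macOS application firewall.
FW=/usr/libexec/ApplicationFirewall/socketfilterfw
sudo "$FW" --add /usr/local/bin/ollama        # path is install-dependent
sudo "$FW" --unblockapp /usr/local/bin/ollama
sudo "$FW" --setglobalstate on
```

Screen Sharing (5900) and Remote Login (22) are system services and are permitted automatically when enabled in System Settings.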
Open WebUI
Docker · port 3000
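Open WebUI can be started with roughly the invocation from its own docs: map host port 3000 to the container's 8080 and point it at the host's Ollama API.

```shell
# Run Open WebUI in Docker, reachable at http://localhost:3000.
docker run -d \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

`host.docker.internal` lets the container reach the Ollama server running on the macOS host.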
Screen Sharing
System Settings → Sharing
Remote Login
SSH daemon · port 22
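Remote Login can also be toggled from the command line instead of System Settings → Sharing:

```shell
# Enable the SSH daemon (Remote Login) and confirm its state.
sudo systemsetup -setremotelogin on
sudo systemsetup -getremotelogin
```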
Ollama
OLLAMA_HOST
0.0.0.0:11434 via LaunchAgent
Keep-alive
OLLAMA_KEEP_ALIVE=-1
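Both environment variables can be set for the LaunchAgent-managed Ollama process with `launchctl setenv` (restart Ollama afterwards for them to take effect):

```shell
# Expose the API on all interfaces and keep the model resident in RAM.
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_KEEP_ALIVE "-1"   # -1 = never unload the model
```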
REST API
OpenAI-compatible /v1
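The `/v1` endpoint accepts standard OpenAI chat-completions payloads, so existing OpenAI client code can be repointed at it. A curl sketch against the local server:

```shell
# OpenAI-compatible chat completion against the local Ollama server.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma4:12b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```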
ngrok tunnel
Public HTTPS endpoint
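A sketch of publishing the local API over HTTPS with ngrok; the host-header rewrite is needed because Ollama validates the incoming Host header.

```shell
# Expose port 11434 on a public HTTPS URL (printed by ngrok on start).
ngrok http 11434 --host-header="localhost:11434"
```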
llama.cpp / MLX inference engine
Apple Silicon Metal · unified memory · automatic GPU offload
Gemma 4 model weights
e4b · 12b · 27b — GGUF 4-bit · 128K–256K context
Local storage
~/.ollama/models/blobs — SHA256-named GGUF files on disk