Chat with open-source LLMs powered by WebGPU. Everything runs in your browser: your data never leaves your machine. Free, private, and fast.
All AI inference runs locally on your GPU via WebGPU. No server, no API calls, no telemetry. Once the model is loaded, it even works offline.
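For the curious, here is roughly what local-first inference looks like in code. This is a minimal sketch assuming a web-llm-style runtime: `CreateMLCEngine`, the model ID, and the progress callback come from the `@mlc-ai/web-llm` package and are assumptions about this app's internals, not a description of its actual source.

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function startLocalChat() {
  // On-device inference requires WebGPU; fail fast if the browser lacks it.
  if (!("gpu" in navigator)) {
    throw new Error("WebGPU is not supported in this browser.");
  }

  // Downloads and caches the weights, then prepares them for your GPU.
  // After this step, generation needs no network connection at all.
  const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  // The familiar chat-completions shape, but every token is produced locally.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Hello! Are you running locally?" }],
  });
  console.log(reply.choices[0].message.content);
}
```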

Our models support multimodal input. Upload an image and ask questions about it; everything is processed locally on your device.
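Under the hood, an image question can be expressed as an OpenAI-style multimodal message. A minimal sketch, assuming the engine accepts `image_url` content parts (the `MLCEngine` type and this message shape mirror `@mlc-ai/web-llm`; the app may wire it differently):

```ts
import { MLCEngine } from "@mlc-ai/web-llm";

async function askAboutImage(engine: MLCEngine, file: File, question: string) {
  // Encode the uploaded file as a data URL; it never leaves the browser.
  const dataUrl: string = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });

  // Send text and image together in one user message (assumed message shape).
  const reply = await engine.chat.completions.create({
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  });
  return reply.choices[0].message.content;
}
```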
Adjust temperature, top-p, top-k, repetition penalty, and more. It's like having a local AI playground with fine-grained control over generation behavior.
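These knobs map onto per-request sampling parameters. A sketch of the standard OpenAI-style fields, assuming the same engine as above; top-k and repetition penalty are left out here only because their field names vary between runtimes:

```ts
import { MLCEngine } from "@mlc-ai/web-llm";

async function generateWithControls(engine: MLCEngine, prompt: string) {
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
    temperature: 0.7,       // higher values sample more randomly
    top_p: 0.9,             // nucleus sampling: keep the top 90% probability mass
    frequency_penalty: 0.5, // penalize tokens by how often they already appeared
    max_tokens: 256,        // cap the reply length
  });
  return reply.choices[0].message.content;
}
```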

No sign-up required. No data collected. Load the model and start chatting; it's that simple.