Browser support
Which runtime each browser gets, hardware requirements, model download sizes, and what travels over the network.
Last updated 2026-06-03
The widget picks its runtime on two axes — mode (mobile or desktop) and engine (llm or scenarios) — based on browser, OS, and available hardware. Override either with data-mode-override or data-engine-override for testing.
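The override behaviour can be pictured as a small merge step that runs after auto-detection. The sketch below is illustrative only: the type names, the `applyOverrides` helper, and the rule that unrecognised values are ignored are assumptions, not the widget's documented internals. It also assumes the attributes accept the same values as the two axes (mobile/desktop and llm/scenarios).

```typescript
type Mode = "mobile" | "desktop";
type Engine = "llm" | "scenarios";

interface Runtime {
  mode: Mode;
  engine: Engine;
}

// Hypothetical shape: raw attribute values as they might be read from the
// embed element (data-mode-override, data-engine-override). null = not set.
interface Overrides {
  modeOverride: string | null;
  engineOverride: string | null;
}

function isMode(value: string | null): value is Mode {
  return value === "mobile" || value === "desktop";
}

function isEngine(value: string | null): value is Engine {
  return value === "llm" || value === "scenarios";
}

// Each axis is overridden independently. Missing or unrecognised values
// fall back to whatever auto-detection chose (an assumption of this sketch).
function applyOverrides(detected: Runtime, overrides: Overrides): Runtime {
  return {
    mode: isMode(overrides.modeOverride) ? overrides.modeOverride : detected.mode,
    engine: isEngine(overrides.engineOverride) ? overrides.engineOverride : detected.engine,
  };
}
```

For example, forcing the scenarios engine on a desktop browser that would normally get the LLM leaves the mode untouched and changes only the engine.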
Device matrix
| Browser | Version | Mode | Engine | Notes |
|---|---|---|---|---|
| Chrome / Edge | 113+ | desktop | llm | WebGPU — full Qwen3-0.6B. |
| Safari macOS | 17+ | desktop | llm | WASM SIMD fallback (slower than WebGPU). |
| Firefox | 130+ | desktop | llm | WASM SIMD fallback. |
| Safari iOS | 17+ | mobile | scenarios | Phones — instant scenario matching, no model. |
| Chrome Android | any | mobile | scenarios | Phones — instant scenario matching, no model. |
| Older browsers / low memory | — | desktop | scenarios | LLM unavailable — scenarios still work. |
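Read as a decision procedure, the matrix might look roughly like the sketch below. The `Environment` shape and `pickRuntime` function are hypothetical, and the sketch keys off browser name and version only to mirror the table; real detection more likely probes for WebGPU and WASM SIMD directly. It also treats every phone as mobile/scenarios, which the matrix states only for Safari iOS 17+ and Chrome Android, and it borrows the 2 GB RAM floor from the hardware requirements table.

```typescript
type Mode = "mobile" | "desktop";
type Engine = "llm" | "scenarios";
type Backend = "webgpu" | "wasm" | "none";
type Browser = "chrome" | "edge" | "safari" | "firefox" | "other";

// Hypothetical, pre-digested description of the visitor's environment.
interface Environment {
  browser: Browser;
  majorVersion: number;
  isPhone: boolean;
  memoryGb: number;
}

interface Runtime {
  mode: Mode;
  engine: Engine;
  backend: Backend;
}

// Minimum desktop versions for the llm engine, copied from the device matrix.
const MIN_LLM_VERSION: Partial<Record<Browser, number>> = {
  chrome: 113,
  edge: 113,
  safari: 17,
  firefox: 130,
};

// RAM floor for the llm engine, from the hardware requirements table.
const MIN_LLM_MEMORY_GB = 2;

function pickRuntime(env: Environment): Runtime {
  // Phones get scenario matching with no model download.
  if (env.isPhone) {
    return { mode: "mobile", engine: "scenarios", backend: "none" };
  }
  const minVersion = MIN_LLM_VERSION[env.browser];
  const canRunLlm =
    minVersion !== undefined &&
    env.majorVersion >= minVersion &&
    env.memoryGb >= MIN_LLM_MEMORY_GB;
  // Older browsers and low-memory devices keep scenarios only.
  if (!canRunLlm) {
    return { mode: "desktop", engine: "scenarios", backend: "none" };
  }
  // Chrome and Edge use WebGPU; Safari and Firefox use the WASM SIMD fallback.
  const backend: Backend =
    env.browser === "chrome" || env.browser === "edge" ? "webgpu" : "wasm";
  return { mode: "desktop", engine: "llm", backend };
}
```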
What each engine delivers
- LLM engine — exact scenarios + AI fallback. Qwen3-0.6B runs in-browser on WebGPU when available, WASM otherwise. Grounded in your published vectors.
- Scenarios engine — exact scenarios only. No model download, no AI fallback. The scenario fallback message (configured in the app) is shown when nothing matches.
Hardware requirements
| Engine | RAM | Why |
|---|---|---|
| llm | ≥ 2 GB | Holds the embedder and the language model in browser memory. |
| scenarios | ~5 MB | Just scenarios.json and the matcher. |
Model download sizes
| Asset | Engine | Approx. size |
|---|---|---|
| Qwen3-0.6B q4f16 | llm | ~570 MB |
| multilingual-e5-small (embedder) | llm | ~100 MB |
| scenarios.json + matcher | all | ~5 MB |
All assets are cached in IndexedDB after the first download. Subsequent visits skip the bulk download and only check for updates.
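To get a feel for the first-visit cost per engine, the approximate sizes in the table can simply be summed. The snippet below is a worked example rather than part of the widget; the `ASSETS` list and `firstVisitDownloadMb` helper are names made up for illustration, and the figures are the approximate ones from the table.

```typescript
type Engine = "llm" | "scenarios";

interface Asset {
  name: string;
  engines: Engine[]; // engines that need this asset
  approxMb: number;  // approximate size from the table above
}

const ASSETS: Asset[] = [
  { name: "Qwen3-0.6B q4f16", engines: ["llm"], approxMb: 570 },
  { name: "multilingual-e5-small (embedder)", engines: ["llm"], approxMb: 100 },
  { name: "scenarios.json + matcher", engines: ["llm", "scenarios"], approxMb: 5 },
];

// Sum the approximate size of every asset the given engine needs.
function firstVisitDownloadMb(engine: Engine): number {
  return ASSETS
    .filter((asset) => asset.engines.indexOf(engine) !== -1)
    .reduce((total, asset) => total + asset.approxMb, 0);
}

const llmTotalMb = firstVisitDownloadMb("llm");             // 570 + 100 + 5 = 675
const scenariosTotalMb = firstVisitDownloadMb("scenarios"); // 5
```

So a first visit on the llm engine pulls roughly 675 MB, against roughly 5 MB for scenarios.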
What travels over the network
- Your config.json, scenarios.json, and vectors.json: from cdn.zupport.chat or your custom data-config-base-url.
- Model weights: from Hugging Face Hub by default, or your custom data-model-base-url.
- transformers.js runtime modules: from jsDelivr on first use (workers only).
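The way the two base-URL attributes redirect traffic can be sketched as a simple resolution step. Everything structural here is an assumption made for illustration: the `resolveAssetSources` helper, the `https` scheme, and the idea that the three JSON files sit directly under the config base URL. Only the host names and attribute names come from this page.

```typescript
// Hypothetical options mirroring data-config-base-url and data-model-base-url.
interface SourceOptions {
  configBaseUrl?: string;
  modelBaseUrl?: string;
}

interface AssetSources {
  configFiles: string[]; // config.json, scenarios.json, vectors.json
  modelBase: string;     // where model weights are fetched from
}

// Defaults named on this page; the https scheme is assumed.
const DEFAULT_CONFIG_BASE = "https://cdn.zupport.chat";
const DEFAULT_MODEL_BASE = "https://huggingface.co";

// Join a base URL and a file name, tolerating trailing slashes on the base.
function joinUrl(base: string, file: string): string {
  return base.replace(/\/+$/, "") + "/" + file;
}

function resolveAssetSources(options: SourceOptions): AssetSources {
  const configBase = options.configBaseUrl ?? DEFAULT_CONFIG_BASE;
  const modelBase = options.modelBaseUrl ?? DEFAULT_MODEL_BASE;
  // Assumed layout: the three JSON files live directly under the base URL.
  const files = ["config.json", "scenarios.json", "vectors.json"];
  return {
    configFiles: files.map((file) => joinUrl(configBase, file)),
    modelBase: modelBase.replace(/\/+$/, ""),
  };
}
```

With no options set, requests go to the default hosts; setting both options keeps config and model traffic on your own origin.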
IndexedDB caching
The embed stores model weights and the vector index in IndexedDB so repeat visits skip the large downloads. To force a fresh download (for debugging, say), set data-disable-cache="true" or call chat.refresh({ bypassCache: true }).
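The two switches above amount to the same decision: skip the cache lookup and download again. The sketch below shows that cache-first pattern under stated assumptions. A plain object stands in for IndexedDB (which is asynchronous in a real browser), and `shouldBypassCache` and `loadAsset` are hypothetical helpers, not the embed's actual API.

```typescript
// Hypothetical inputs: the data-disable-cache attribute value and the
// bypassCache option passed to chat.refresh(...).
interface CacheSwitches {
  disableCacheAttr: string | null;
  bypassCache?: boolean;
}

// Either switch forces a fresh download.
function shouldBypassCache(switches: CacheSwitches): boolean {
  return switches.disableCacheAttr === "true" || switches.bypassCache === true;
}

// Cache-first lookup. The record is a synchronous stand-in for IndexedDB.
function loadAsset<T>(
  key: string,
  cache: Record<string, T>,
  download: () => T,
  bypass: boolean
): T {
  if (!bypass && Object.prototype.hasOwnProperty.call(cache, key)) {
    return cache[key]; // repeat visit: served from the cache
  }
  const fresh = download(); // first visit or forced refresh
  cache[key] = fresh;       // refresh the cached copy either way
  return fresh;
}
```

Note that a forced download still writes back to the cache, so the next ordinary visit picks up the fresh copy.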
Self-hosting
For air-gapped or compliance-driven deploys, you can host everything yourself:
- The config and scenarios: via data-config-base-url (or data-config-url).
- The model weights: via data-model-base-url.
- The script itself: host embed.js on your own CDN.