Browser support

Which runtime each browser gets, hardware requirements, model download sizes, and what travels over the network.

Last updated 2026-06-03

The widget picks its runtime along two axes, mode (mobile or desktop) and engine (llm or scenarios), based on the browser, OS, and available hardware. For testing, you can override either axis with data-mode-override or data-engine-override.
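As a rough illustration of how the two override attributes could be interpreted, the sketch below validates their values against the mode and engine names listed above. The helper name readOverrides, the assumption that the attribute values are exactly those names, and the choice to ignore unrecognized values are all illustrative; none of this is confirmed widget behavior.

```typescript
type Mode = "mobile" | "desktop";
type Engine = "llm" | "scenarios";

interface Overrides {
  mode?: Mode;
  engine?: Engine;
}

// Hypothetical helper: takes the embed's attributes as a plain record,
// e.g. { "data-mode-override": "mobile" }, and keeps only valid overrides.
function readOverrides(attrs: Record<string, string | undefined>): Overrides {
  const out: Overrides = {};
  const mode = attrs["data-mode-override"];
  const engine = attrs["data-engine-override"];
  // Assumption: values outside the documented names are silently ignored.
  if (mode === "mobile" || mode === "desktop") out.mode = mode;
  if (engine === "llm" || engine === "scenarios") out.engine = engine;
  return out;
}
```

Keeping the two axes independent means a test page can, for example, force the scenarios engine while leaving mode detection automatic.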

Device matrix

Browser	Version	Mode	Engine	Notes
Chrome / Edge	113+	desktop	llm	WebGPU — full Qwen3-0.6B.
Safari macOS	17+	desktop	llm	WASM SIMD fallback (slower than WebGPU).
Firefox	130+	desktop	llm	WASM SIMD fallback.
Safari iOS	17+	mobile	scenarios	Phones — instant scenario matching, no model.
Chrome Android	any	mobile	scenarios	Phones — instant scenario matching, no model.
Older browsers / low memory	—	desktop	scenarios	LLM unavailable — scenarios still work.
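The matrix above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the widget's actual detection code: the Capabilities shape, the pickRuntime name, and the rule that unknown memory counts as sufficient are assumptions. The 2 GB threshold comes from the Hardware requirements section.

```typescript
type Mode = "mobile" | "desktop";
type Engine = "llm" | "scenarios";
type Backend = "webgpu" | "wasm" | "none";

// Hypothetical capability snapshot; how each field is detected is up to the caller.
interface Capabilities {
  isPhone: boolean;        // e.g. Safari iOS or Chrome Android
  hasWebGPU: boolean;      // e.g. Chrome / Edge 113+
  hasWasmSimd: boolean;    // e.g. Safari macOS 17+, Firefox 130+
  memoryGB: number | null; // null when the browser does not report it
}

interface Runtime {
  mode: Mode;
  engine: Engine;
  backend: Backend;
}

const MIN_LLM_MEMORY_GB = 2;

function pickRuntime(caps: Capabilities): Runtime {
  // Phones always get instant scenario matching with no model.
  if (caps.isPhone) {
    return { mode: "mobile", engine: "scenarios", backend: "none" };
  }
  // Assumption: unknown memory is treated as sufficient.
  const lowMemory = caps.memoryGB !== null && caps.memoryGB < MIN_LLM_MEMORY_GB;
  const canRunModel = caps.hasWebGPU || caps.hasWasmSimd;
  if (lowMemory || !canRunModel) {
    return { mode: "desktop", engine: "scenarios", backend: "none" };
  }
  // Prefer WebGPU; fall back to WASM SIMD when it is missing.
  return { mode: "desktop", engine: "llm", backend: caps.hasWebGPU ? "webgpu" : "wasm" };
}
```

In a browser, hasWebGPU could be derived from a check such as "gpu" in navigator. Memory reporting varies by browser, which is why the sketch allows null.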

What each engine delivers

LLM engine: exact scenario matching plus an AI fallback. Qwen3-0.6B runs in the browser, on WebGPU when available and on WASM otherwise, and its answers are grounded in your published vectors.
Scenarios engine: exact scenario matching only, with no model download and no AI fallback. When nothing matches, the widget shows the scenario fallback message configured in the app.

Hardware requirements

Engine	RAM	Why
llm	≥ 2 GB	Holds the embedder and the language model in browser memory.
scenarios	~5 MB	Just scenarios.json and the matcher.

Model download sizes

Asset	Engine	Approx. size
Qwen3-0.6B q4f16	llm	~570 MB
multilingual-e5-small (embedder)	llm	~100 MB
scenarios.json + matcher	all	~5 MB

All assets are cached in IndexedDB after the first download. Subsequent visits skip the bulk download and only check for updates.
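To get a feel for the first-visit cost, the sizes in the table above can be summed per engine. The sketch below does that arithmetic and converts it to a rough transfer time. The function names and the 50 Mbps bandwidth are illustrative assumptions, and the total leaves out the transformers.js runtime modules, whose size is not stated here.

```typescript
// Approximate sizes in MB, copied from the table above.
const ASSET_SIZES_MB = {
  model: 570,    // Qwen3-0.6B q4f16
  embedder: 100, // multilingual-e5-small
  scenarios: 5,  // scenarios.json + matcher
} as const;

type Engine = "llm" | "scenarios";

// Total uncached download for a given engine, in MB.
function firstVisitDownloadMB(engine: Engine): number {
  const base = ASSET_SIZES_MB.scenarios; // needed by every engine
  return engine === "llm"
    ? base + ASSET_SIZES_MB.model + ASSET_SIZES_MB.embedder
    : base;
}

// Transfer time in seconds at a given bandwidth in megabits per second.
function estimateSeconds(sizeMB: number, mbps: number): number {
  return (sizeMB * 8) / mbps;
}

console.log(firstVisitDownloadMB("llm"));                      // 675
console.log(firstVisitDownloadMB("scenarios"));                // 5
console.log(estimateSeconds(firstVisitDownloadMB("llm"), 50)); // 108
```

At the assumed 50 Mbps, the llm engine's first visit takes on the order of two minutes of download, while the scenarios engine needs under a second.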

What travels over the network

Your config.json, scenarios.json, and vectors.json — from cdn.zupport.chat or your custom data-config-base-url.
Model weights — from Hugging Face Hub by default, or your custom data-model-base-url.
transformers.js runtime modules — from jsDelivr on first use (workers only).

No conversation server

Visitor messages and replies never leave the browser. All inference happens locally. There is nothing to log, no transcript to subpoena, and no cookies to regulate.

IndexedDB caching

The embed stores model weights and the vector index in IndexedDB so repeat visits skip the download. To force a fresh download, for example when debugging, set data-disable-cache="true" or call chat.refresh({ bypassCache: true }).
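One way to use the refresh call during debugging is to gate it behind a query-string flag, so that ordinary visitors keep the cache. The sketch below assumes a chat object that exposes only the refresh({ bypassCache }) call mentioned above. The ChatHandle interface, the refreshIfRequested helper, and the ?fresh=1 flag are hypothetical, and how the chat object is obtained is not covered here.

```typescript
// Assumed minimal shape of the chat object; only refresh() is taken from the text above.
interface ChatHandle {
  refresh(options: { bypassCache: boolean }): unknown;
}

// Hypothetical helper: bypass the cache only when the URL carries ?fresh=1.
// `search` is the query string, e.g. window.location.search in a browser.
function refreshIfRequested(chat: ChatHandle, search: string): boolean {
  const wantsFresh = new URLSearchParams(search).get("fresh") === "1";
  if (wantsFresh) {
    chat.refresh({ bypassCache: true });
  }
  return wantsFresh;
}
```

This keeps the forced re-download an explicit opt-in, which matters because a cache bypass on the llm engine repeats the full model download.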

Self-hosting

For air-gapped or compliance-driven deployments, you can host everything yourself:

The config and scenarios — via data-config-base-url (or data-config-url).
The model weights — via data-model-base-url.
The script itself — host embed.js on your own CDN.
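The three self-hosting settings above can be combined in a single embed tag. The sketch below generates one from your own URLs. It assumes the data- attributes are placed on the script tag that loads embed.js, which this page does not confirm; the selfHostedScriptTag helper and the example.com URLs in the usage note are placeholders.

```typescript
interface SelfHostOptions {
  scriptUrl: string;     // where you host embed.js
  configBaseUrl: string; // value for data-config-base-url
  modelBaseUrl: string;  // value for data-model-base-url
}

// Escape characters that would break out of a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Assumption: the widget reads its data- attributes from the script tag itself.
function selfHostedScriptTag(opts: SelfHostOptions): string {
  const attrs = [
    `src="${escapeAttr(opts.scriptUrl)}"`,
    `data-config-base-url="${escapeAttr(opts.configBaseUrl)}"`,
    `data-model-base-url="${escapeAttr(opts.modelBaseUrl)}"`,
  ];
  return `<script ${attrs.join(" ")}></script>`;
}
```

For example, calling selfHostedScriptTag with scriptUrl "https://assets.example.com/embed.js", configBaseUrl "https://assets.example.com/chat/", and modelBaseUrl "https://assets.example.com/models/" yields a tag that points the script, config, and model weights at a single internal host.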