Browser support

Which runtime each browser gets, hardware requirements, model download sizes, and what travels over the network.

Last updated 2026-06-03

The widget picks its runtime on two axes — mode (mobile or desktop) and engine (llm or scenarios) — based on browser, OS, and available hardware. Override either with data-mode-override or data-engine-override for testing.

Device matrix

| Browser | Version | Mode | Engine | Notes |
|---|---|---|---|---|
| Chrome / Edge | 113+ | desktop | llm | WebGPU; full Qwen3-0.6B. |
| Safari macOS | 17+ | desktop | llm | WASM SIMD fallback (slower than WebGPU). |
| Firefox | 130+ | desktop | llm | WASM SIMD fallback. |
| Safari iOS | 17+ | mobile | scenarios | Phones: instant scenario matching, no model. |
| Chrome Android | any | mobile | scenarios | Phones: instant scenario matching, no model. |
| Older browsers / low memory | | desktop | scenarios | LLM unavailable; scenarios still work. |
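
The matrix above can be read as a small decision function. The following is a minimal sketch of that reading, not the widget's actual detection code: the `Env` shape, the `pickRuntime` name, and the way memory is measured are all assumptions made for illustration. It also assumes the override attributes accept the same values as the mode and engine names, and that every phone gets the mobile/scenarios runtime.

```typescript
type Mode = "mobile" | "desktop";
type Engine = "llm" | "scenarios";

interface Runtime {
  mode: Mode;
  engine: Engine;
}

// Hypothetical description of the visitor's environment (illustrative field names).
interface Env {
  browser: "chrome" | "edge" | "safari" | "firefox" | "other";
  majorVersion: number;
  isPhone: boolean;
  memoryGB: number; // how the widget measures memory is not documented
}

// Minimum desktop versions for the llm engine, copied from the device matrix.
const MIN_LLM_VERSION: Record<string, number> = {
  chrome: 113,
  edge: 113,
  safari: 17,
  firefox: 130,
};

// Overrides mirror data-mode-override / data-engine-override (values assumed).
function pickRuntime(env: Env, override: Partial<Runtime> = {}): Runtime {
  let picked: Runtime;
  if (env.isPhone) {
    // Phones: instant scenario matching, no model.
    picked = { mode: "mobile", engine: "scenarios" };
  } else {
    const min = MIN_LLM_VERSION[env.browser];
    // The llm engine needs a supported browser version and at least 2 GB of RAM.
    const llmCapable =
      min !== undefined && env.majorVersion >= min && env.memoryGB >= 2;
    picked = { mode: "desktop", engine: llmCapable ? "llm" : "scenarios" };
  }
  return {
    mode: override.mode ?? picked.mode,
    engine: override.engine ?? picked.engine,
  };
}
```

For example, under these assumptions Firefox 129 on a desktop with 8 GB of RAM falls through to the scenarios engine, while passing `{ engine: "scenarios" }` as the override forces the scenarios engine even on Chrome 120.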

What each engine delivers

  • LLM engine — exact scenarios + AI fallback. Qwen3-0.6B runs in-browser on WebGPU when available, WASM otherwise. Grounded in your published vectors.
  • Scenarios engine — exact scenarios only. No model download, no AI fallback. The scenario fallback message (configured in the app) is shown when nothing matches.

Hardware requirements

| Engine | RAM | Why |
|---|---|---|
| llm | ≥ 2 GB | Holds the embedder and the language model in browser memory. |
| scenarios | ~5 MB | Just scenarios.json and the matcher. |

Model download sizes

| Asset | Engine | Approx. size |
|---|---|---|
| Qwen3-0.6B q4f16 | llm | ~570 MB |
| multilingual-e5-small (embedder) | llm | ~100 MB |
| scenarios.json + matcher | all | ~5 MB |

All assets are cached in IndexedDB after the first download. Subsequent visits skip the bulk download and only check for updates.
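
To make the first-visit cost concrete, the sizes above can be summed per engine. This is a rough sketch using the approximate figures from the table; the function names and the 50 Mbps connection speed are illustrative assumptions, and real download times also depend on latency and CDN throughput.

```typescript
// Approximate asset sizes in MB, taken from the table above.
const ASSET_MB = {
  model: 570, // Qwen3-0.6B q4f16
  embedder: 100, // multilingual-e5-small
  scenarios: 5, // scenarios.json + matcher
};

// Total first-visit download for an engine (before anything is cached).
function firstVisitDownloadMB(engine: "llm" | "scenarios"): number {
  return engine === "llm"
    ? ASSET_MB.model + ASSET_MB.embedder + ASSET_MB.scenarios
    : ASSET_MB.scenarios;
}

// Idealized transfer time: megabytes * 8 bits, divided by link speed in Mbps.
function estimateSeconds(sizeMB: number, linkMbps: number): number {
  return (sizeMB * 8) / linkMbps;
}

const llmTotalMB = firstVisitDownloadMB("llm"); // 675
const llmSecondsAt50Mbps = estimateSeconds(llmTotalMB, 50); // 108
```

So a first visit on the llm engine downloads roughly 675 MB, which takes about 108 seconds on an idealized 50 Mbps link, while the scenarios engine downloads only about 5 MB.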

What travels over the network

  • Your config.json, scenarios.json, and vectors.json — from cdn.zupport.chat or your custom data-config-base-url.
  • Model weights — from Hugging Face Hub by default, or your custom data-model-base-url.
  • transformers.js runtime modules — from jsDelivr on first use (workers only).

No conversation server

Visitor messages and replies never leave the browser. All inference happens locally. There is nothing to log, no transcript to subpoena, and no cookies to regulate.

IndexedDB caching

The embed stores model weights and the vector index in IndexedDB so repeat visits are instant. To force a fresh download — e.g. for debugging — set data-disable-cache="true" or call chat.refresh({ bypassCache: true }).
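
As an illustration of the programmatic route, the helper below wraps the `chat.refresh({ bypassCache: true })` call. It is a sketch under assumptions: the `ChatHandle` interface and the `forceFreshDownload` name are hypothetical, how you obtain the `chat` handle is not covered here, and whether `refresh` returns a promise is unknown (awaiting it is harmless either way).

```typescript
// Minimal shape assumed for the widget handle. Only refresh({ bypassCache })
// is named in this page; everything else about the handle is unknown.
interface ChatHandle {
  refresh(options: { bypassCache: boolean }): unknown;
}

// Ask the widget to re-download its assets instead of reading IndexedDB.
async function forceFreshDownload(chat: ChatHandle): Promise<void> {
  // await works whether refresh() returns a promise or a plain value.
  await chat.refresh({ bypassCache: true });
}
```

A debugging session might call `forceFreshDownload(chat)` once after publishing new vectors, then go back to normal cached loads.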

Self-hosting

For air-gapped or compliance-driven deploys, you can host everything yourself:

  • The config and scenarios — via data-config-base-url (or data-config-url).
  • The model weights — via data-model-base-url.
  • The script itself — host embed.js on your own CDN.
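
The three pieces above can be combined into a single script tag. The sketch below builds that tag as a string; the `selfHostedScriptTag` helper, the `SelfHostOptions` shape, and the example.com URLs are illustrative assumptions, and your real embed snippet may need additional attributes that are not listed on this page.

```typescript
// Hypothetical options for a fully self-hosted embed.
interface SelfHostOptions {
  scriptUrl: string; // where you host embed.js
  configBaseUrl: string; // value for data-config-base-url
  modelBaseUrl: string; // value for data-model-base-url
}

// Escape characters that would break a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Render a script tag that points every documented source at your own hosts.
function selfHostedScriptTag(options: SelfHostOptions): string {
  const attrs: Array<[string, string]> = [
    ["src", options.scriptUrl],
    ["data-config-base-url", options.configBaseUrl],
    ["data-model-base-url", options.modelBaseUrl],
  ];
  const rendered = attrs
    .map(([name, value]) => `${name}="${escapeAttr(value)}"`)
    .join(" ");
  return `<script ${rendered}></script>`;
}

// Placeholder URLs; substitute your own hosts.
const tag = selfHostedScriptTag({
  scriptUrl: "https://static.example.com/embed.js",
  configBaseUrl: "https://static.example.com/config",
  modelBaseUrl: "https://static.example.com/models",
});
```

This page lists no attribute for relocating the transformers.js runtime modules, so the sketch does not cover them.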