Private AI chat for any website, running in the user’s browser.
Drop in one <script> tag and add a context-aware assistant to any page.
On modern browsers the model runs entirely on the visitor’s device (WebGPU),
so conversations never leave the browser — with a hosted remote fallback for everyone else.
- ✓ On-device by default
- ✓ No data sent to third parties
- ✓ Live in ~5 minutes
This bubble is InferKit running on its own site. Open it and ask “What is InferKit?” or “How does on-device mode work?”
<script src="https://cdn.jsdelivr.net/npm/synapticortex-chat"></script>
<script>
InferKit.init({ apiKey: 'ik_pub_live_…' });
</script>
From script tag to private assistant in four steps
- 01 Drop in the script
  Add one tag and call InferKit.init() with your publishable key. No build step, no backend to run.
- 02 It reads the page
  A deterministic grounding gate scopes answers to your page content; off-topic questions are refused before the model runs.
- 03 Runs on the device
  On WebGPU browsers the model runs in-browser (WebLLM), so conversations never leave the visitor’s device. Otherwise it falls back to hosted remote inference.
- 04 You stay in control
  Publishable + secret keys, domain/IP allowlists, bot protection, quotas, and abuse auto-suspend, all managed from your dashboard.
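To make step 03 concrete, here is a minimal sketch of how a page could decide between on-device and hosted inference. It relies only on the standard WebGPU entry point (`navigator.gpu.requestAdapter()`). The function name `chooseInferenceMode` and the `'local'` / `'remote'` labels are assumptions made for illustration, not InferKit's actual API or internals.

```javascript
// Sketch only: chooseInferenceMode and the 'local' / 'remote' labels are
// hypothetical names, not part of the InferKit API.
async function chooseInferenceMode(nav = globalThis.navigator) {
  // Browsers without WebGPU do not expose navigator.gpu at all.
  if (!nav || !nav.gpu) return 'remote';
  try {
    // requestAdapter() resolves to null when no suitable GPU is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'local' : 'remote';
  } catch {
    // Treat any adapter error as "no usable GPU" and fall back.
    return 'remote';
  }
}
```

Passing the navigator object in as a parameter keeps the check easy to exercise outside a browser. In a page you would simply call `chooseInferenceMode()`.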
Private by architecture
On-device inference means page content and questions never reach a third-party LLM in local mode.
Multi-provider + BYOK
OpenAI, Groq, Anthropic, and Gemini — or bring your own key and keep inference billing with your provider.
Edge economics
When the visitor’s GPU does the work, your marginal inference cost trends to zero.
Abuse-resistant
Turnstile bot protection, rate limits, hard quotas, and anomaly auto-suspend keep public keys safe.
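As a rough illustration of how a rate limit and a hard quota might combine to protect a public key, the sketch below uses a simple fixed-window counter. Everything here is assumed for the example: `createKeyGuard`, its options, and the returned status strings are hypothetical and do not describe InferKit's implementation.

```javascript
// Hypothetical sketch: createKeyGuard, its options, and the status strings
// are illustrative and not InferKit's actual implementation.
function createKeyGuard({ maxPerWindow, windowMs, hardQuota }) {
  let windowStart = 0;   // start time of the current rate-limit window
  let countInWindow = 0; // requests accepted in the current window
  let totalUsed = 0;     // requests accepted overall, counted against the quota

  return function check(nowMs) {
    // The hard quota is absolute: once spent, nothing else is accepted.
    if (totalUsed >= hardQuota) return 'quota_exceeded';
    // Open a fresh window once the previous one has elapsed.
    if (nowMs - windowStart >= windowMs) {
      windowStart = nowMs;
      countInWindow = 0;
    }
    if (countInWindow >= maxPerWindow) return 'rate_limited';
    countInWindow += 1;
    totalUsed += 1;
    return 'ok';
  };
}
```

A caller would create one guard per key and pass the current time on each request, for example `guard(Date.now())`. Rejected requests do not count against the quota in this sketch.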
Local inference is free — you pay only for hosted fallback.
Because the visitor’s device does the work, in-browser chat costs you nothing. Upgrade when you need a remote fallback and higher limits. Change plans anytime, self-serve.
Free
Hobby projects, blogs, docs, evaluation. Local-only; visitors need WebGPU.
Start free
- In-browser inference
- 1 domain per key
- Team & roles (RBAC)
- Community support
Starter
Production sites that want a remote fallback for every visitor.
Choose Starter
- Remote inference fallback
- 500K remote tokens / mo
- 5 domains per key
- Bot protection
- Email support
Pro
Growing products, agencies, and teams wanting BYOK + white-label.
Choose Pro
- 5M remote tokens / mo
- 20 domains per key
- Bring your own key (BYOK)
- Remove “Powered by” badge
- Priority support
Enterprise
SSO, audit logs, dedicated infrastructure, custom SLAs.
Talk to sales
- Custom remote tokens
- Unlimited domains
- SSO / audit logs
- Dedicated infra
- SLA + dedicated support
Remote usage beyond the included allowance is billed as overage. With BYOK (Pro/Enterprise), remote tokens are billed by your provider — InferKit charges only the platform fee.
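The overage arithmetic can be sketched as follows, using the included allowances from the Starter and Pro plans above. The overage price is not stated on this page, so `ratePerMillion` is a placeholder parameter, and `overageCharge` is a hypothetical helper rather than part of any InferKit SDK.

```javascript
// Included remote-token allowances, taken from the Starter and Pro plans.
const INCLUDED_REMOTE_TOKENS = { starter: 500_000, pro: 5_000_000 };

// ratePerMillion is a placeholder: the page does not state an overage price.
function overageCharge(plan, remoteTokensUsed, ratePerMillion) {
  const included = INCLUDED_REMOTE_TOKENS[plan];
  if (included === undefined) {
    throw new Error(`no fixed allowance for plan: ${plan}`);
  }
  // Only usage beyond the included allowance is billable.
  const overageTokens = Math.max(0, remoteTokensUsed - included);
  return { overageTokens, charge: (overageTokens / 1_000_000) * ratePerMillion };
}
```

For example, a Starter site that uses 750,000 remote tokens in a month has 250,000 overage tokens. Usage at or below the allowance yields zero overage.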
Add private AI chat to your site today.
Free to start, no credit card. Your visitors’ conversations stay on their device.