Overview
Add a private, context-aware AI chatbot to any website in one line — and keep your visitors’ conversations on their own device.
InferKit is a drop-in JavaScript SDK by SynaptiCortex. Paste one <script> tag,
and your page gets a chat assistant that answers questions grounded in that
page’s content. On capable devices it runs the model entirely in the
browser (WebGPU) — so the conversation never leaves the user’s machine.
Why InferKit
| 🔒 Privacy-first | In local mode, inference runs in-browser via WebLLM. No prompts, no page content, and no conversations are sent to any server. A genuine differentiator for healthcare, finance, legal, and EU/GDPR-sensitive sites. |
| ⚡ Drop-in | One script tag + an API key. No backend to build, no model to host, no infra to run. |
| 💸 Cost-efficient | When the user’s GPU does the work, we bear no inference cost — and neither do you. Remote inference is a fallback, not the default. |
| 🎯 Grounded answers | A deterministic grounding gate refuses off-page questions before calling the model, so “what’s the capital of France?” doesn’t hijack your support bot. |
| 🔌 Multi-provider | When remote inference is used, route to OpenAI, Groq, Anthropic, or Google Gemini — or bring your own key (BYOK). |
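To make the grounding gate more concrete, here is a minimal sketch of one way a deterministic, model-free check could work: compare the question's keywords against the page text and refuse when too few of them appear. The rules InferKit actually applies are not described on this page, so the names `tokenize` and `isGrounded`, the stopword list, and the `minOverlap` threshold are all illustrative assumptions.

```typescript
// Hypothetical sketch only: not InferKit's actual gate or API.
// Small stopword list so filler words do not count as evidence of grounding.
const STOPWORDS = new Set([
  "a", "an", "the", "is", "are", "of", "to", "in", "on", "for", "and", "or",
  "what", "how", "do", "does", "it", "this", "that", "can", "with", "my", "your",
]);

// Lowercase, strip punctuation, and drop stopwords and single characters.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((t) => t.length > 1 && !STOPWORDS.has(t));
}

// Returns true when at least `minOverlap` of the question's keywords
// also occur in the page text. No model call is involved, so the same
// inputs always give the same answer.
function isGrounded(question: string, pageText: string, minOverlap = 0.5): boolean {
  const pageTerms = new Set(tokenize(pageText));
  const questionTerms = tokenize(question);
  if (questionTerms.length === 0) return false; // nothing to ground
  const hits = questionTerms.filter((t) => pageTerms.has(t)).length;
  return hits / questionTerms.length >= minOverlap;
}
```

With a pricing page as `pageText`, a question about the Pro plan would pass this check, while “what’s the capital of France?” would be refused before any model is called. A production gate would likely need stemming or synonym handling, which this sketch leaves out.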
How it works
- Drop in the SDK. The script mounts a chat widget and reads the current page.
- Pick the mode automatically. Capable browser → local (in-browser WebLLM). Otherwise → remote fallback through the InferKit API.
- Answer in context. The page’s content is the knowledge base; the grounding gate keeps answers on-topic.
Visitor ──▶ InferKit widget ──┬─▶ Local model (WebGPU, in-browser)  ← default, private
                              └─▶ InferKit API ──▶ LLM provider     ← fallback / paid tiers
The platform API is the control plane — keys, tiers, usage, billing, bot protection — not a data pipe. In local mode it’s contacted exactly once, at startup, to validate the key.
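The automatic mode selection from step 2 can be sketched as a small capability probe. This is an illustration under stated assumptions, not the SDK's code: `Mode`, `GpuLike`, and `pickMode` are hypothetical names, and the page does not say which checks InferKit performs. The only real API referenced is the standard WebGPU entry point, `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Hypothetical sketch only: not InferKit's actual detection logic.
type Mode = "local" | "remote";

// Structural type for the one WebGPU method used here, so the sketch
// compiles without extra type packages and can be exercised with a stub.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

// Choose local inference only when WebGPU exists and yields an adapter;
// any missing API, null adapter, or thrown error falls back to remote.
async function pickMode(gpu: GpuLike | undefined): Promise<Mode> {
  if (!gpu) return "remote";
  try {
    const adapter = await gpu.requestAdapter();
    return adapter ? "local" : "remote";
  } catch {
    return "remote";
  }
}

// In a browser, the caller would pass `navigator.gpu`
// (which is undefined where WebGPU is unsupported).
```

Passing the GPU object in as a parameter keeps the function easy to test with a stub. A real check would probably also consider available memory and model size, which are outside this sketch.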
Who it’s for
- Documentation & knowledge sites — instant Q&A grounded in your docs.
- SaaS products — in-app help that understands the current screen.
- Privacy-sensitive sites — healthcare, finance, legal: answers without data egress.
- Agencies — one integration, deployed across many client sites.
At a glance
- SDK: synapticortex-chaton npm + jsDelivr/unpkg CDN
- Models (local): Qwen, Llama 3.2, Phi-3.5, SmolLM2 (4-bit, WebGPU)
- Providers (remote): OpenAI · Groq · Anthropic · Google Gemini · BYOK
- Tiers: Free (local-only) · Starter · Pro · Enterprise — see pricing
- Get started: Quickstart · Security: Security & Privacy