Overview

Add a private, context-aware AI chatbot to any website in one line, and on capable devices keep your visitors’ conversations on their own device.

InferKit is a drop-in JavaScript SDK by SynaptiCortex. Paste one <script> tag, and your page gets a chat assistant that answers questions grounded in that page’s content. On capable devices it runs the model entirely in the browser (WebGPU) — so the conversation never leaves the user’s machine.

Why InferKit


🔒 Privacy-first	In local mode, inference runs in-browser via WebLLM. No prompts, no page content, and no conversations are sent to any server. A genuine differentiator for healthcare, finance, legal, and EU/GDPR-sensitive sites.
⚡ Drop-in	One script tag + an API key. No backend to build, no model to host, no infra to run.
💸 Cost-efficient	When the user’s GPU does the work, we bear no inference cost — and neither do you. Remote inference is a fallback, not the default.
🎯 Grounded answers	A deterministic grounding gate refuses off-page questions before calling the model, so “what’s the capital of France?” doesn’t hijack your support bot.
🔌 Multi-provider	When remote inference is used, route to OpenAI, Groq, Anthropic, or Google Gemini — or bring your own key (BYOK).
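
To make the grounding gate concrete, here is a minimal sketch of one way a deterministic check could run before any model call. It is not InferKit's actual gate: the word-overlap heuristic, the stop-word list, the 0.5 threshold, and the names `tokenize`, `isGrounded`, and `answer` are all assumptions made for illustration.

```javascript
// Hypothetical sketch: a deterministic gate based on word overlap.
// The heuristic, stop words, and threshold are illustrative assumptions,
// not InferKit's real implementation or API.
const STOP_WORDS = new Set([
  "a", "an", "the", "is", "are", "of", "to", "in", "on", "for",
  "and", "or", "what", "how", "do", "does", "my", "your", "it",
]);

// Lowercase, split on non-alphanumerics, drop stop words and 1-char tokens.
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 1 && !STOP_WORDS.has(w));
}

// True when enough of the question's content words appear on the page.
function isGrounded(question, pageText, minOverlap = 0.5) {
  const pageWords = new Set(tokenize(pageText));
  const questionWords = tokenize(question);
  if (questionWords.length === 0) return false;
  const hits = questionWords.filter((w) => pageWords.has(w)).length;
  return hits / questionWords.length >= minOverlap;
}

// The gate runs first; the model is only called for on-page questions.
function answer(question, pageText, callModel) {
  if (!isGrounded(question, pageText)) {
    return "Sorry, I can only answer questions about this page.";
  }
  return callModel(question, pageText);
}
```

Because the check is plain string processing, the same question against the same page always gets the same accept or refuse decision, and a refused question costs no inference at all.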

How it works

1. Drop in the SDK. The script mounts a chat widget and reads the current page.
2. Pick the mode automatically. WebGPU-capable browser → local (in-browser WebLLM). Otherwise → remote fallback through the InferKit API.
3. Answer in context. The page’s content is the knowledge base; the grounding gate keeps answers on-topic.

Visitor ──▶ InferKit widget ──┬─▶ Local model (WebGPU, in-browser)   ← default, private
                              └─▶ InferKit API ──▶ LLM provider       ← fallback / paid tiers
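
The routing in the diagram depends on detecting WebGPU support. The sketch below shows one plausible check using the standard `navigator.gpu.requestAdapter()` call. The function name `pickMode` and the `"local"` / `"remote"` labels are hypothetical, and the real SDK may also weigh other signals, such as available memory or model download size.

```javascript
// Sketch only: `pickMode` and the "local"/"remote" labels are hypothetical
// names, not part of the InferKit SDK.
async function pickMode(nav = globalThis.navigator) {
  // navigator.gpu is undefined in browsers without WebGPU support.
  if (!nav || !nav.gpu) return "remote";
  try {
    // requestAdapter() resolves to null when no suitable GPU is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "local" : "remote";
  } catch {
    // Treat any failure while probing the GPU as "not capable".
    return "remote";
  }
}
```

Passing the navigator object as a parameter keeps the check testable outside a browser, for example with a stub object in place of `navigator`.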

The platform API is the control plane — keys, tiers, usage, billing, bot protection — not a data pipe. In local mode it’s contacted exactly once, at startup, to validate the key.
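
One way to guarantee the "exactly once" behaviour is to memoize the validation request, as in the sketch below. Everything specific here is assumed for illustration: the factory name `createKeyValidator`, the placeholder endpoint URL, and the request and response shapes are not taken from the InferKit API.

```javascript
// Hypothetical sketch: the endpoint URL and payload shape are placeholders,
// not the real InferKit control-plane API.
function createKeyValidator(apiKey, fetchFn, endpoint = "https://api.example.invalid/validate") {
  let pending = null; // memoized promise, so repeated calls share one request

  return function validate() {
    if (!pending) {
      pending = fetchFn(endpoint, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        // Only the key is sent: no prompts, page content, or conversations.
        body: JSON.stringify({ key: apiKey }),
      }).then((res) => res.ok);
    }
    return pending;
  };
}
```

In a browser, `fetchFn` would simply be `fetch`. This sketch also caches a failed request, so a production version would likely clear `pending` on a network error to allow a retry.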

Who it’s for

Documentation & knowledge sites — instant Q&A grounded in your docs.
SaaS products — in-app help that understands the current screen.
Privacy-sensitive sites — healthcare, finance, legal: answers without data egress.
Agencies — one integration, deployed across many client sites.

At a glance

SDK: synapticortex-chat on npm + jsDelivr/unpkg CDN
Models (local): Qwen, Llama 3.2, Phi-3.5, SmolLM2 (4-bit, WebGPU)
Providers (remote): OpenAI · Groq · Anthropic · Google Gemini · BYOK
Tiers: Free (local-only) · Starter · Pro · Enterprise — see pricing
Get started: Quickstart
Security: Security & Privacy