Your AI. Your choice. Your data.
Five AI providers, hybrid orchestration, and BYOK. Pick the best model for each task, or run entirely on-premise with Ollama for zero data egress.
Five providers. One platform.
| Provider | Models | Best for | Data egress | Cost |
|---|---|---|---|---|
| Groq | Llama-3.3-70b, Llama-3.1-8b/70b | Speed-critical, high-volume | Cloud | Per token |
| OpenAI | GPT-4o, GPT-4.1, GPT-4.1-mini | General intelligence, complex reasoning | Cloud | Per token |
| Google Gemini | Gemini-2.5-Flash/Pro, Gemini-1.5 | Multimodal, long context windows | Cloud | Per token |
| Anthropic Claude | Claude-3.5-Sonnet, Haiku, Opus | Careful reasoning, safety-focused | Cloud | Per token |
| Ollama (on-premise) | Llama3, Mistral, Phi3, Gemma2 | Sensitive data, zero API cost | None | Free |
Zero data egress. Zero API cost.
Run AI models entirely on your own servers. No data ever leaves your infrastructure. Ideal for healthcare, finance, legal, and any business with sensitive data requirements.
Patient records, financial data, legal documents: none of it leaves your network.
Run unlimited queries on your own hardware. No API bills, no usage caps.
Cloud-quality responses from Llama3, Mistral, Phi3, and Gemma2 running on your own hardware.
Assign different local models for entity extraction, response generation, summarization, and quality scoring.
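A per-task assignment like this can be sketched as a simple mapping from pipeline stage to local model. The task names, model choices, and lookup function below are illustrative assumptions, not the platform's actual configuration format; the model names come from the supported list above.

```python
# Hypothetical task-to-model mapping for a local Ollama deployment.
TASK_MODELS = {
    "entity_extraction": "phi3",      # small, fast model for structured extraction
    "response_generation": "llama3",  # strongest general model for user-facing text
    "summarization": "mistral",       # good compression at low latency
    "quality_scoring": "gemma2",      # lightweight judge model
}

def model_for(task: str) -> str:
    """Return the local model assigned to a pipeline task."""
    if task not in TASK_MODELS:
        raise ValueError(f"unknown task: {task}")
    return TASK_MODELS[task]

print(model_for("summarization"))  # mistral
```

Keeping the mapping in one place makes it easy to swap a stage to a bigger or smaller model without touching pipeline code.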
Supported Ollama models
The right model for every query
BYOK
Bring your own API key for any provider. Tenant-level isolation โ your key, your usage, your billing. Never shared.
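Tenant-level isolation means each tenant's keys live in their own namespace and are never visible to another tenant. A minimal sketch of that idea, with class and method names that are assumptions for illustration rather than the platform's real API:

```python
# Illustrative tenant-level key store for BYOK; not the platform's actual API.
class TenantKeyStore:
    """Each tenant's provider keys live in their own namespace."""

    def __init__(self):
        self._keys = {}  # {tenant_id: {provider: api_key}}

    def set_key(self, tenant_id: str, provider: str, api_key: str) -> None:
        self._keys.setdefault(tenant_id, {})[provider] = api_key

    def get_key(self, tenant_id: str, provider: str) -> str:
        # A tenant can only ever read its own keys.
        tenant_keys = self._keys.get(tenant_id, {})
        if provider not in tenant_keys:
            raise KeyError(f"no {provider} key configured for tenant {tenant_id}")
        return tenant_keys[provider]

store = TenantKeyStore()
store.set_key("acme", "openai", "sk-acme-example")
store.set_key("globex", "openai", "sk-globex-example")
```

Because every lookup is scoped by `tenant_id`, usage and billing naturally attach to the tenant whose key made the call.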
Primary + backup
Set a primary provider and a backup. Automatic failover if the primary goes down or rate-limits.
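The failover pattern is simple: try the primary, and on failure hand the same request to the backup. The provider callables and error type below are placeholders; a real client would catch each SDK's own rate-limit and outage exceptions.

```python
# Minimal primary/backup failover sketch; provider functions are stand-ins.
class ProviderError(Exception):
    pass

def complete_with_failover(prompt, primary, backup):
    """Try the primary provider; fall back to the backup on failure."""
    try:
        return primary(prompt)
    except ProviderError:
        return backup(prompt)

def flaky_primary(prompt):
    raise ProviderError("429 rate limited")

def steady_backup(prompt):
    return f"backup answered: {prompt}"

result = complete_with_failover("hello", flaky_primary, steady_backup)
print(result)  # backup answered: hello
```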
Confidence routing
Queries are automatically routed to the best engine based on confidence scoring and query complexity.
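Confidence routing can be sketched as a threshold cascade: high-confidence queries go to a fast, cheap engine, and lower scores escalate to stronger models. The thresholds and engine names here are illustrative assumptions, not the platform's documented routing table.

```python
# Sketch of confidence-based routing across the providers listed above.
def route(confidence: float, threshold: float = 0.75) -> str:
    """Pick an engine based on a confidence score in [0, 1]."""
    if confidence >= threshold:
        return "groq/llama-3.1-8b"        # high confidence: fast and cheap
    if confidence >= 0.4:
        return "openai/gpt-4.1-mini"      # medium: balanced
    return "anthropic/claude-3.5-sonnet"  # low: careful reasoning

print(route(0.92))  # groq/llama-3.1-8b
print(route(0.55))  # openai/gpt-4.1-mini
print(route(0.10))  # anthropic/claude-3.5-sonnet
```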
Load balancing
Distribute queries across multiple providers or models to optimise cost and performance.
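One common way to distribute load is a weighted round-robin: providers receive traffic in proportion to assigned weights. The weights below are illustrative; a production balancer would also factor in live latency and per-token cost.

```python
# Weighted round-robin sketch for spreading queries across providers.
import itertools

def weighted_cycle(weights: dict):
    """Yield provider names in proportion to their weights."""
    pool = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(pool)

balancer = weighted_cycle({"groq": 3, "gemini": 1})
picks = [next(balancer) for _ in range(8)]
# groq receives three queries for every one sent to gemini
print(picks)
```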
Choose your AI. Deploy in minutes.
All five providers are available on every plan. Ollama on-premise is available on Pro and Enterprise.