
When your company wants ChatGPT but compliance says no


Every week we hear the same story from enterprise clients: teams want to use ChatGPT for their work, but compliance, legal, or security has blocked it. And for good reason — sending proprietary data to a third-party API is a non-starter in healthcare, finance, defense, and plenty of other industries. But the productivity gains are real, and telling people "just don't use AI" isn't a strategy.

The requirements

A mid-size enterprise came to us with a clear mandate:

- Employees need a ChatGPT-like interface for daily work
- Zero data can leave company infrastructure
- No third-party API calls: not OpenAI, not Anthropic, nothing
- Must work with their existing SSO and access controls
- Audit logs for every conversation
- IT must be able to monitor usage without reading content

This ruled out every hosted AI solution on the market. We needed to run the entire stack — model included — on their servers.

Choosing the model

We evaluated several open-weight models for self-hosting: Llama, Mistral, and a few others. The selection criteria weren't just about benchmark scores. We needed:

- Good performance on business writing, summarization, and code review (their top use cases)
- Reasonable hardware requirements (they weren't going to buy a cluster of A100s)
- A license that allows commercial internal use
- Active community and regular updates

We went with a Llama variant, with Ollama handling model management. That combination gave us the best balance of quality, resource efficiency, and ease of deployment. The model runs on two GPUs that fit within their existing server infrastructure.

The architecture

The stack is straightforward by design — complexity is the enemy of security audits.

Frontend: A clean chat interface built with Next.js, styled to match their internal tools. Supports markdown, code highlighting, and file uploads (for document Q&A).

Backend: A NestJS API layer that handles authentication (plugs into their existing SAML SSO), conversation management, and audit logging.

Model layer: Ollama running the LLM on dedicated GPU servers within their private cloud. The API layer talks to Ollama over an internal network — no external traffic.

Storage: All conversations stored in their existing PostgreSQL cluster. Encrypted at rest. Retention policies controlled by IT.

The entire system runs in a Docker Compose stack that their DevOps team can manage with their existing tooling. No exotic infrastructure. No Kubernetes cluster. Just containers on servers they already own.
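To make the model-layer hop concrete, here is a rough sketch of how an API layer could call Ollama's HTTP chat endpoint (`POST /api/chat`). It is not the client's actual code: the host name `ollama.internal`, the model tag, and the function names `buildChatRequest` and `chat` are placeholders we made up for illustration. Port 11434 is Ollama's default.

```typescript
// Sketch only. Host name, model tag, and function names are assumptions.
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

interface OllamaChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
}

// Hypothetical address that resolves only inside the private network.
const OLLAMA_URL = "http://ollama.internal:11434";

// Placeholder tag: the post does not say which Llama variant was deployed.
const MODEL = "llama3.1";

// Build the JSON body for Ollama's /api/chat endpoint.
// stream: false asks for one complete response instead of chunks.
function buildChatRequest(model: string, messages: ChatMessage[]): OllamaChatRequest {
  return { model, messages, stream: false };
}

// Send the conversation to Ollama and return the assistant's reply text.
async function chat(messages: ChatMessage[], model: string = MODEL): Promise<string> {
  const res = await fetch(`${OLLAMA_URL}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(model, messages)),
  });
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  // A non-streaming reply carries the answer in `message.content`.
  const data = (await res.json()) as { message: ChatMessage };
  return data.message.content;
}
```

Because the base URL points at an internal host, the only network hop is between the API container and the GPU server, which is the property an auditor would want to check.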

What surprised us

The biggest challenge wasn't technical — it was adoption. We built the system, deployed it, and usage was... low. People had gotten used to ChatGPT's speed and polish. A self-hosted model on modest hardware is slower. The responses are good but not identical to GPT-4.

What turned it around:

- We added system prompts tailored to each department. The finance team's instance knows their reporting formats. The legal team's instance understands their contract templates. Generic AI is decent at everything. Specialized AI is great at the things that matter.
- We built a shared prompt library so teams could save and share effective prompts.
- We added document context: users can upload internal documents and ask questions about them, which is something they couldn't do safely with external AI.

Usage tripled within a month of these changes.
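The department-specific prompts and document context described above can be pictured roughly like this. It is a minimal sketch under our own assumptions: the prompt wording, the `Department` values, and the `buildMessages` helper are invented for illustration, and real prompts would spell out each team's actual formats and templates.

```typescript
// Sketch only. Department names, prompt text, and helper names are assumptions.
type Department = "finance" | "legal" | "general";

interface PromptMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// One system prompt per department (illustrative wording).
const SYSTEM_PROMPTS: Record<Department, string> = {
  finance: "You assist the finance team. Follow the company's reporting formats.",
  legal: "You assist the legal team. Work from the company's contract templates.",
  general: "You are an internal assistant for day-to-day work.",
};

// Assemble the message list sent to the model:
// department prompt, optional uploaded document, prior turns, new question.
function buildMessages(
  department: Department,
  history: PromptMessage[],
  userInput: string,
  documentText?: string,
): PromptMessage[] {
  const messages: PromptMessage[] = [
    { role: "system", content: SYSTEM_PROMPTS[department] },
  ];
  if (documentText) {
    messages.push({
      role: "system",
      content: `Answer using this internal document:\n${documentText}`,
    });
  }
  return [...messages, ...history, { role: "user", content: userInput }];
}

// Example: a finance user asking about an uploaded document.
const exampleMessages = buildMessages(
  "finance",
  [],
  "Summarize the key figures.",
  "Q3 draft report text goes here",
);
```

Keeping the prompts in one table also makes them easy to review, which matters when the compliance team wants to see exactly what each department's instance is told.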

The compliance win

The security audit went smoothly because we designed for it from the start. Every conversation is logged. Every user authenticates through SSO. Network traffic stays internal. The compliance team can verify this independently — they don't have to trust us.
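One way to reconcile "every conversation is logged" with "IT can monitor usage without reading content" is to derive a separate, content-free usage record from each logged turn. The sketch below illustrates that idea only: the post does not describe the real schema, so every field name and both helper functions are hypothetical.

```typescript
// Sketch only. All field and function names are assumptions.

// Full audit record: stored encrypted, readable only by authorized reviewers.
interface ConversationTurn {
  conversationId: string;
  userId: string;
  department: string;
  prompt: string;
  response: string;
  timestamp: Date;
}

// What a usage dashboard would see: who, when, and how much, never the text.
interface UsageEvent {
  conversationId: string;
  userId: string;
  department: string;
  timestamp: string; // ISO 8601
  promptChars: number;
  responseChars: number;
}

// Copy metadata and sizes from a turn, dropping prompt and response text.
function toUsageEvent(turn: ConversationTurn): UsageEvent {
  return {
    conversationId: turn.conversationId,
    userId: turn.userId,
    department: turn.department,
    timestamp: turn.timestamp.toISOString(),
    promptChars: turn.prompt.length,
    responseChars: turn.response.length,
  };
}

// Count events per department for a simple usage report.
function usageByDepartment(events: UsageEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    counts.set(e.department, (counts.get(e.department) ?? 0) + 1);
  }
  return counts;
}
```

With a split like this, the monitoring side can be given access to usage events alone, while the full turns stay behind whatever access controls the audit process requires.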

The client's CISO told us this was the first AI project that didn't give them a headache. That might be the best compliment we've gotten.

For companies stuck between "AI would help" and "compliance says no" — the answer isn't to wait for the hosted providers to solve your compliance problems. The answer is to bring the model in-house and own the entire stack. It's more work upfront, but it's the only approach that actually satisfies both sides.
