A hybrid AI architecture enabled secure search, internal chat, and direct API access

Client

Danmarks Statistik

Industry

Public Data / Government Infrastructure

Challenge

Enabling AI on sensitive data without letting raw data leave internal infrastructure

Results

A hybrid AI architecture enabled secure semantic search, internal chat, and direct API access for external agents

Scope

AI capabilities without exposing sensitive data

Danmarks Statistik manages some of Denmark’s most sensitive public data. That ruled out tools like ChatGPT and Gemini from day one.

The project began as an internal research effort to evaluate how AI could be used at scale at Danmarks Statistik. The organisation already had strong internal development capabilities and had invested in its own infrastructure, including next-generation GPUs, to run AI workloads in-house. The challenge was not whether AI could be used, but how to build useful capabilities at scale without compromising on security or relying entirely on commercial APIs.

The scope included:

Designing an AI architecture where raw data never left internal systems
Enabling semantic search across large and difficult-to-navigate datasets
Building an internal chat tool for sensitive data use cases
Creating a more reliable interface for external AI agents interacting with public data
Training teams across observability, tooling, compliance, and AI workflows

Concept

Security and capability, without compromise

The architecture was built around a simple principle: raw data stays inside, while meaning can move safely through vectors.

Internal models running on Danmarks Statistik’s own hardware handle anonymisation and vectorisation. External models operate only on vectors at runtime, never on the underlying raw data. This creates a hybrid setup that combines the security of a closed system with the flexibility and performance of external models.

Alongside the core architecture, BCT built an internal chat tool powered by a large open-source model, enabling staff to work conversationally with sensitive data in a secure environment. BCT also set up an MCP server around Danmarks Statistik’s public API, allowing external AI agents to query structured data directly instead of scraping the public website.

The result was not just new tooling, but a more scalable and sustainable AI foundation.

Process

BCT designed and implemented the technical setup while also helping the organisation build internal capability around it.

Hybrid AI architecture

We designed an architecture where internal models handle anonymisation and vectorisation on Danmarks Statistik’s own infrastructure, while external models interact only with vector representations at runtime. This ensured raw data stayed protected while still enabling modern AI capabilities.
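As a rough illustration of that boundary, the sketch below anonymises and embeds a record inside the internal environment and exports only an identifier and a vector. It is a toy under stated assumptions, not the production pipeline: `anonymise`, `embed`, and `prepare_for_export` are hypothetical names, the regex masking stands in for the internal anonymisation model, and the hashed bag-of-words vector stands in for the internal embedding model.

```python
import hashlib
import math
import re

# Illustrative pattern for Danish CPR-style numbers (DDMMYY-XXXX).
# Assumption: the real system uses an internal model, not a regex.
CPR_PATTERN = re.compile(r"\b\d{6}-?\d{4}\b")


def anonymise(text: str) -> str:
    """Stand-in for the internal anonymisation step: mask CPR-style numbers."""
    return CPR_PATTERN.sub("[REDACTED]", text)


def embed(text: str, dim: int = 16) -> list[float]:
    """Stand-in for the internal embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def prepare_for_export(records: dict[str, str]) -> list[dict]:
    """Runs inside the boundary and returns ids and vectors only, never text."""
    return [
        {"id": record_id, "vector": embed(anonymise(text))}
        for record_id, text in records.items()
    ]


# The raw text stays in internal storage; only `exported` would cross the boundary.
internal_records = {"row-1": "Person 010190-1234 moved to Aarhus in 2021"}
exported = prepare_for_export(internal_records)
```

The design point is that the export function has no code path that returns text: whatever runs outside the boundary receives numeric vectors keyed by opaque ids.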

Vector databases and semantic search

Danmarks Statistik’s data tables are large and complex. We embedded the data and built vector databases on top, making previously hard-to-navigate content searchable through semantic queries.
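To make the idea of semantic search concrete, here is a minimal sketch of nearest-neighbour lookup by cosine similarity. Everything in it is assumed for illustration: the `VectorIndex` class, the table names, and the hand-written three-dimensional vectors, which stand in for real embeddings. A production setup would typically use a dedicated vector database with approximate nearest-neighbour indexing rather than a linear scan.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Tiny in-memory index: stores (id, vector) pairs and scans them linearly."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items.append((item_id, vector))

    def search(self, query_vector: list[float], k: int = 3) -> list[tuple[str, float]]:
        scored = [(cosine(query_vector, vec), item_id) for item_id, vec in self._items]
        scored.sort(reverse=True)  # highest similarity first
        return [(item_id, score) for score, item_id in scored[:k]]


index = VectorIndex()
# Hypothetical table names with made-up vectors standing in for embeddings.
index.add("population_by_municipality", [0.9, 0.1, 0.0])
index.add("consumer_price_index", [0.0, 0.2, 0.9])
index.add("migration_between_regions", [0.7, 0.6, 0.1])

# Pretend embedding of a query such as "how many people live in each town".
query = [1.0, 0.2, 0.0]
top = index.search(query, k=2)
```

With these toy vectors the population table ranks first and the migration table second, even though the query shares no keywords with either name. That is the property that makes hard-to-navigate tables discoverable.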

MCP server for the public API

To reduce the load caused by AI agents scraping the website, we built an MCP server around the existing API. This gave agents a structured, reliable way to access public data directly through the backend instead of through the client-side experience.
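The sketch below shows the general shape of such a tool interface. MCP messages are JSON-RPC 2.0, and tools are exposed through the `tools/list` and `tools/call` methods; the rest is assumed for illustration. The `get_table` tool, the `fetch_table` helper, the table id, and the figures are hypothetical placeholders, and a real server would normally be built on an MCP SDK that also handles the initialisation handshake and transport.

```python
import json

# Hypothetical stand-in for the public statistics API; the values are placeholders.
FAKE_BACKEND = {
    "EXAMPLE_TABLE": [{"year": 2023, "value": 100}, {"year": 2024, "value": 104}],
}

# Tool description in the form returned by tools/list.
TOOLS = [
    {
        "name": "get_table",
        "description": "Return rows of a public statistics table as JSON.",
        "inputSchema": {
            "type": "object",
            "properties": {"table_id": {"type": "string"}},
            "required": ["table_id"],
        },
    }
]


def fetch_table(table_id: str) -> list[dict]:
    """Placeholder for a structured call to the backend API."""
    return FAKE_BACKEND[table_id]


def rpc_error(request: dict, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request.get("id"), "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Dispatch a single JSON-RPC request to the matching MCP-style method."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        params = request.get("params", {})
        if params.get("name") != "get_table":
            return rpc_error(request, -32602, "Unknown tool")
        try:
            rows = fetch_table(params.get("arguments", {})["table_id"])
            result = {"content": [{"type": "text", "text": json.dumps(rows)}], "isError": False}
        except KeyError:
            # Covers both a missing argument and an unknown table id.
            result = {"content": [{"type": "text", "text": "Unknown or missing table_id"}], "isError": True}
    else:
        return rpc_error(request, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


response = handle({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_table", "arguments": {"table_id": "EXAMPLE_TABLE"}},
})
```

An agent calling this interface receives structured rows in one request, instead of loading and parsing rendered pages, which is where the reduction in website load comes from.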

Internal chat tool

We built a secure internal chat environment powered by a 120-billion-parameter open-source model running on internal infrastructure. Staff could work with vectorised sensitive data in natural language, with dynamic visualisation features in development.

Training and enablement

Alongside the technical work, we ran a six-month training programme covering AI observability, automation tooling, IDE integration, and compliance strategy, helping the team build lasting internal capability around AI.

Results

The result was an AI setup that did not force Danmarks Statistik to choose between security and capability.

Sensitive data could now be vectorised, searched, and used in AI-powered workflows without leaving internal infrastructure. External models could deliver performance at scale without ever touching the raw data.

This made it possible to:

Enable secure semantic search across complex internal data
Provide staff with an internal chat tool for sensitive data use cases
Replace website scraping by external AI agents with a structured API-based interface
Reduce client-side resource strain through direct backend access
Build stronger internal capability through training and practical enablement

More broadly, the project created a model for how highly regulated organisations can work with AI responsibly, without giving up either performance or control.

Need AI capabilities your current security model won't allow?

If your organisation is sitting on sensitive data and trying to figure out what's actually possible within your constraints, we should talk.
