Major Life Insurance Company: RAG System Development Support
Global Consulting Firm
Overview
Led the design and development of an internal document search RAG system on Microsoft Azure, with the goal of helping the client internalize AI engineering and grow their talent. Forked `Azure-Samples/azure-search-openai-demo` and extended it to meet the closed-network requirements and Entra ID-based authentication expected by a financial institution, owning production deployment into a private environment end-to-end.
Architecture
A React + TypeScript frontend (Vite) is paired with a Python Quart async backend. Azure AI Search stacks vector search, BM25, and the semantic ranker into a multi-stage pipeline — the combination keeps both recall and relevance high on internal policy documents where proper nouns and paraphrases mix freely.
Ingestion chains Azure Document Intelligence (OCR and table parsing) with Integrated Vectorization so uploads to Blob Storage become searchable through an event-driven flow. Prompts were migrated from the upstream Jinja2 layout to `.prompty` files (e.g. `chat_answer_question.prompty`), which lets prompt review and diff tracking be handled with the same tools as code.
Environment differences live in `azure-dev.yaml` as 130+ environment variables, so per-tenant and per-department deployments can be forked from one codebase. Hosting runs on Azure Container Apps with scale and availability declared in Bicep IaC.
Security & Private Networking
To meet financial-industry security requirements, `network-isolation.bicep` defines the VNet, Private Endpoints, and private DNS zones as a single declarative bundle. The closed-network setup — only partially covered upstream — is fully integrated into IaC, leaving no public path to the internet.
Authentication and authorization are tightened in `authentication.py`, which strictly manages the Entra ID tenant ID and server / client App IDs; document-level ACLs then prevent cross-department leakage. Secrets are externalized to Azure Key Vault, and a 43 KB Bicep template builds an Application Insights custom KPI dashboard.
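The document-level ACL check described above can be enforced as an Azure AI Search `$filter` derived from the caller's validated token claims. The sketch below is an added example, not code from this project or its `authentication.py`; the index field names `oids` and `groups`, the function names, and the sample claims are assumptions.

```python
import uuid

# Assumed index fields holding each document's allowed user / group object IDs.
OID_FIELD, GROUP_FIELD = "oids", "groups"


def canonical_guid(value) -> str:
    """Return value as a canonical lowercase GUID, or raise ValueError.

    Re-serialising through uuid.UUID means no quote, comma or OData operator
    taken from the token can reach the filter string.
    """
    if not isinstance(value, str):
        raise ValueError("object ID must be a string")
    return str(uuid.UUID(value))


def build_security_filter(claims: dict) -> str:
    """Build the $filter limiting results to documents the caller may read."""
    if "oid" not in claims:
        # Fail closed: no identity means no query, never an unfiltered one.
        raise PermissionError("token carries no oid claim")
    oid = canonical_guid(claims["oid"])

    # When a user has too many groups, Entra ID drops the `groups` claim and
    # sets an overage marker; memberships must then come from Microsoft Graph.
    if "groups" in claims.get("_claim_names", {}):
        raise LookupError("groups overage: resolve memberships before searching")
    groups = sorted({canonical_guid(g) for g in claims.get("groups", [])})

    clauses = [f"{OID_FIELD}/any(g: search.in(g, '{oid}', ','))"]
    if groups:
        joined = ",".join(groups)
        clauses.append(f"{GROUP_FIELD}/any(g: search.in(g, '{joined}', ','))")
    return "(" + " or ".join(clauses) + ")"


# Constructed sample claims, as they would look after token validation.
SAMPLE_CLAIMS = {
    "oid": "11111111-1111-1111-1111-111111111111",
    "groups": ["BBBBBBBB-BBBB-BBBB-BBBB-BBBBBBBBBBBB",
               "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"],
}

if __name__ == "__main__":
    print(build_security_filter(SAMPLE_CLAIMS))
```

The filter must be applied server-side to every search request, including the vector and semantic-ranker stages, so that no retrieval path returns chunks outside the caller's ACL.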
An Azure Log Analytics dashboard built with KQL continuously tracks RAG-specific KPIs — answer latency, retrieval hit rate, per-user usage — so quality regressions are caught early.
Technology Transfer & Training
Delivered structured lectures spanning web fundamentals (HTML / CSS / JavaScript) through Python and Azure architecture, and incrementally embedded Git / GitHub, pull-request, and code-review practices into the team. Scrum with two-week sprints and daily standups was introduced to plan and drive a half-year transition from external dependency to a self-directed team.