CLIENT
HealthTech Platform (Anonymized)
TIMELINE
6 weeks
SERVICES
RAG Systems, Vector Databases, API Infrastructure
STACK
LangChain, Pinecone, GPT-4, FastAPI, Redis
// 01
TheProblem
12,000+ articles across three legacy systems. 18 minutes per ticket. Declining satisfaction.
A mid-market healthtech platform was drowning in support complexity. Their knowledge base spanned 12,000+ articles across three legacy systems, and support agents spent an average of 18 minutes per ticket just locating the right documentation. Customer satisfaction scores were declining, and the support team was burning out from manual search across disconnected tools.
The existing keyword-based search returned irrelevant results for nuanced clinical queries, and agents had developed their own informal workarounds — tribal knowledge that never made it back into the system. New hires took 3+ months to reach competency because the information architecture was fundamentally broken.
// 02
OurApproach
A semantic search layer that unified all knowledge sources into a single retrieval pipeline.
We designed a retrieval-augmented generation pipeline that unified all knowledge sources into a single semantic search layer. Rather than replacing the existing documentation systems, we built an abstraction layer that ingests, chunks, and embeds content from all three sources into a Pinecone vector store with metadata-aware filtering.
The architecture uses a multi-stage retrieval strategy: an initial broad semantic search followed by a reranking pass that accounts for document recency, clinical domain relevance, and user query intent classification. We implemented hybrid search combining dense vector similarity with sparse BM25 matching to handle both conceptual and exact-match queries.
The API layer, built on FastAPI with Redis caching, serves sub-200ms responses even for complex multi-hop queries. We integrated the system directly into the existing support dashboard through a lightweight widget, requiring zero workflow changes from the support team.
// 03
TheResult
Resolution time dropped from 18 minutes to under 5.
Within two weeks of deployment, average ticket resolution time dropped from 18 minutes to under 5. The RAG system correctly surfaced relevant documentation on the first query 89% of the time, compared to 34% with the previous keyword search. Support agents reported that the system felt like having a senior colleague available 24/7.
New hire onboarding time decreased from 3 months to 6 weeks, as the system effectively democratized institutional knowledge. The knowledge base itself improved as agents began flagging gaps that the system surfaced, creating a virtuous cycle of documentation quality improvement.
The platform has since expanded the system to power a customer-facing self-service portal, deflecting 40% of incoming tickets before they reach a human agent.
// IMPACT
faster response time vs. manual lookup
from kickoff to production deployment
throughput increase in support resolution
“Modulo turned our fragmented knowledge base into a system that actually thinks. Support tickets that took 20 minutes now resolve in under 5.”