
How Zendesk Cut Support Costs by 14% Using Custom Semantic Search Models

Discover how Zendesk reduced support costs by 14% through custom semantic search models. Learn their ML implementation strategies, search optimization techniques, and efficiency improvements. Get actionable insights for using AI to improve customer support while reducing operational expenses.

6 min read
Apr 10, 2026

When your customers can't find the answers they need, everyone loses. Support tickets pile up, operational costs skyrocket, and customer satisfaction plummets. At Zendesk, this reality became crystal clear through their data analysis: even minor improvements in search relevance could dramatically reduce operational costs for businesses using their platform.

The stakes were particularly high for Zendesk Guide, their knowledge base solution that serves millions of search queries daily. Support agents rely on efficient search to resolve tickets quickly, while end users need self-service capabilities to find answers without creating tickets in the first place. According to the Zendesk team, the ripple effects of poor search extend far beyond user frustration: they translate directly into measurable business impact through increased support loads and decreased productivity.

Their existing Elasticsearch infrastructure was fast and scalable, but it had a critical limitation: traditional keyword search fails when there's little word overlap between queries and relevant documents. This gap between user intent and search results was costing their customers significant money in operational overhead.

The Breaking Point: When Keyword Search Wasn't Enough

The problem became increasingly apparent as Zendesk analyzed user behavior patterns. Their diverse client base spans industries from e-commerce to medical devices to financial services, each with specialized terminology and unique search patterns. A user searching for help with a "digital wallet issue" might never find an article titled "cryptocurrency payment troubleshooting", despite the content being perfectly relevant.

Even more challenging, real-world search queries rarely match the well-formed questions that traditional search systems expect. Instead, users typically enter short, informal, or poorly written queries that keyword matching struggles to interpret effectively. This mismatch between user behavior and system capabilities was creating a massive opportunity cost.

The Zendesk team recognized that solving this problem required more than incremental improvements to their existing keyword search. They needed a fundamental shift toward understanding the semantic meaning behind queries, not just matching words.

The Strategic Decision: Building Custom vs. Buying Off-the-Shelf

Initially, Zendesk integrated existing semantic search capabilities using off-the-shelf embedding models in 2023, which provided significant improvements in search quality. However, the team quickly realized that generic models, typically trained on well-formed questions from general domains, couldn't capture the nuanced requirements of their diverse customer base.

The decision point came down to a critical question: Could they achieve better business outcomes by training custom models using their own user interaction data? The answer required balancing the substantial engineering investment against the potential for dramatically improved results.

Rather than choosing between keyword and semantic search, the Zendesk team opted for an innovative hybrid approach. They designed a two-step system where Elasticsearch first extracts candidate documents using traditional keyword matching, then re-ranks these results based on semantic similarity in vector space. The final ranking uses a weighted average of both keyword match scores and cosine similarity scores.

This hybrid strategy proved crucial: their experiments showed that combining both approaches outperformed either method alone.
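The two-step ranking described above can be sketched as follows. This is a minimal illustration, not Zendesk's implementation: the weight, score normalization, and candidate format are assumptions for the example.

```python
import numpy as np

def hybrid_rank(candidates, query_vec, keyword_weight=0.5):
    """Re-rank keyword-retrieved candidates by a weighted average of the
    normalized keyword score and cosine similarity to the query vector.

    candidates: list of (doc_id, keyword_score, doc_vec) tuples, as returned
    by a first-pass lexical retriever such as Elasticsearch.
    """
    max_kw = max(score for _, score, _ in candidates) or 1.0
    ranked = []
    for doc_id, kw_score, doc_vec in candidates:
        # Semantic similarity between query and document embeddings
        cos = float(np.dot(query_vec, doc_vec) /
                    (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))
        # Weighted average of lexical and semantic signals
        score = keyword_weight * (kw_score / max_kw) + (1 - keyword_weight) * cos
        ranked.append((doc_id, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

A document with strong keyword overlap but poor semantic fit can thus be overtaken by a semantically closer document, and vice versa, which is the insurance the hybrid design provides.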

The Custom Solution: Turning User Clicks into Training Gold

Transforming Implicit Feedback into Training Data

The breakthrough insight came from recognizing that Zendesk's millions of daily search queries represented an untapped goldmine of training data. Every click on a search result provides implicit feedback about relevance, but raw click data is notoriously noisy due to position bias and other factors.

The team developed a sophisticated data preparation process that addresses these challenges through careful aggregation and transformation. They organize the data into triplets: search query, clicked result, and logarithm of click frequency. This approach offers several advantages:

  • Click frequency indicates relative relevance for each query
  • Logarithmic transformation reduces variance and handles extreme values
  • The method allows for preprocessing to mitigate position bias
  • Aggregate training is computationally more efficient than processing individual clicks
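The aggregation into triplets might look like the sketch below. The use of `log1p` (so single-click pairs keep a nonzero weight) is an assumption for illustration; the article only states that the logarithm of click frequency is used.

```python
import math
from collections import Counter

def build_triplets(click_log):
    """Aggregate raw (query, clicked_doc) click events into training triplets
    of (query, doc, log click frequency)."""
    counts = Counter(click_log)  # (query, doc) -> total number of clicks
    return [(query, doc, math.log1p(n)) for (query, doc), n in counts.items()]
```

Aggregating first and training on the compressed triplets is what makes this cheaper than processing each click event individually.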

Solving the Data Imbalance Challenge

One of the most complex challenges involved handling Zendesk's diverse customer base. Some accounts generate orders of magnitude more search traffic than others, creating severe data imbalance that could skew model performance toward high-traffic accounts.

The team's solution was elegantly practical: instead of training one universal model, they developed customer-specific models using Adapters, lightweight modules that fine-tune only a small portion of the base model's parameters. This approach provides several business advantages:

  • Enhanced transparency for tracking performance and errors
  • Simplified data governance with account-specific data usage
  • Better performance for specialized domains and terminology
  • Fallback to baseline models for accounts with insufficient data
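Conceptually, an adapter is a small bottleneck module trained per account while the base encoder stays frozen, with a fallback to the baseline when no adapter exists. The sketch below is a simplified NumPy illustration of that idea, not Zendesk's architecture; the bottleneck shape and zero-initialized up-projection are assumptions.

```python
import numpy as np

class Adapter:
    """Lightweight bottleneck: down-project, ReLU, up-project, plus a residual
    connection. Only these small matrices would be trained per account."""
    def __init__(self, dim, bottleneck, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.down = rng.normal(scale=0.02, size=(dim, bottleneck))
        self.up = np.zeros((bottleneck, dim))  # starts as an identity mapping

    def __call__(self, hidden):
        return hidden + np.maximum(hidden @ self.down, 0.0) @ self.up

def encode(base_encoder, adapters, account_id, text):
    """Route through the account's adapter if one exists; otherwise fall back
    to the frozen baseline encoder output."""
    hidden = base_encoder(text)
    adapter = adapters.get(account_id)
    return adapter(hidden) if adapter is not None else hidden
```

Because only the adapter weights differ per account, one frozen base model can serve many customers while each account's fine-tuning stays isolated for governance.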

Implementation Insights: Overcoming Real-World Challenges

Smart Data Sampling for Better Training

The team discovered that naive data sampling from click logs creates poor quality training sets. Popular documents dominate the data, while the diversity needed for robust model training gets lost. Their solution promotes document diversity by sampling one query at a time for each unique document, ensuring balanced representation across the knowledge base.
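One way to realize "one query at a time for each unique document" is a round-robin draw over documents, sketched below under that assumption:

```python
import random
from collections import defaultdict

def diversity_sample(triplets, n, seed=0):
    """Round-robin over unique documents, drawing one query per document per
    pass, so popular documents cannot dominate the training sample."""
    rng = random.Random(seed)
    by_doc = defaultdict(list)
    for query, doc, weight in triplets:
        by_doc[doc].append((query, doc, weight))
    for queries in by_doc.values():
        rng.shuffle(queries)  # random order within each document
    sample = []
    while len(sample) < n and any(by_doc.values()):
        for doc in list(by_doc):
            if by_doc[doc]:
                sample.append(by_doc[doc].pop())
                if len(sample) == n:
                    break
    return sample
```

With naive sampling, a document with thousands of clicks would swamp the batch; here each document contributes at most one example per pass.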

Choosing the Right Loss Function

After experimenting with multiple approaches, the team found that Kullback-Leibler (KL) divergence loss delivered the best performance for their specific needs. This choice supports multiple relevant documents per query during training and effectively utilizes click frequency information by modeling the relative popularity of all clicked documents.
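A minimal sketch of such a loss, assuming the target distribution is the normalized click frequencies over a query's clicked documents and the predicted distribution is a softmax over model scores (the exact formulation Zendesk uses is not public):

```python
import numpy as np

def kl_loss(model_scores, click_counts):
    """KL(target || predicted): target is the normalized click-frequency
    distribution over the query's clicked documents; predicted is the softmax
    over the model's similarity scores for those documents."""
    target = np.asarray(click_counts, dtype=float)
    target /= target.sum()
    logits = np.asarray(model_scores, dtype=float)
    pred = np.exp(logits - logits.max())  # numerically stable softmax
    pred /= pred.sum()
    mask = target > 0
    return float(np.sum(target[mask] * np.log(target[mask] / pred[mask])))
```

Unlike a single-positive contrastive loss, this formulation lets every clicked document for a query contribute, weighted by its relative popularity.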

Rapid Experimentation Framework

To accelerate development, the team created a hand-annotated test set covering diverse industries and query types. They pooled results from various retrieval models and manually annotated query-document pairs on a relevance scale from 0 to 4. This investment in quality evaluation data enabled rapid iteration and reliable performance measurement using standard metrics like Normalized Discounted Cumulative Gain (NDCG).
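NDCG over graded labels like the 0-4 scale above can be computed as in this sketch (using the common exponential-gain variant, which is an assumption about their exact formula):

```python
import numpy as np

def ndcg_at_k(relevances, k=10):
    """NDCG@k for graded relevance labels (e.g. 0-4) given in ranked order."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float(np.sum((2.0 ** rel - 1) * discounts))
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = float(np.sum((2.0 ** ideal - 1) * discounts[:ideal.size]))
    return dcg / idcg if idcg > 0 else 0.0
```

A perfectly ordered ranking scores 1.0; burying a highly relevant document lowers the score, which makes the metric a good fit for comparing re-rankers on a fixed annotated pool.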


Results: Measurable Business Impact Across the Board

The results speak for themselves in both technical metrics and business outcomes. The fine-tuned models typically achieved NDCG gains of 2 to 6 percentage points over baseline models, with improvements scaling based on the amount of available search data for each account.

Online Performance Gains

The real-world impact proved even more impressive than laboratory results:

  • Up to 9% improvement in Click Through Rate (CTR) - More users finding relevant results on their first attempt
  • Up to 14% improvement in Mean Reciprocal Rank (MRR) - Relevant results appearing higher in search rankings
  • Particularly strong improvements for long queries and product-specific searches - Addressing the exact pain points identified in the original problem analysis
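For reference, MRR (the metric behind the 14% figure above) averages the reciprocal rank of the first relevant result across queries; a small sketch:

```python
def mean_reciprocal_rank(result_sets):
    """result_sets: one list per query, each entry True if that ranked result
    is relevant. MRR averages 1/rank of the first relevant hit (0 if none)."""
    total = 0.0
    for ranking in result_sets:
        for rank, relevant in enumerate(ranking, start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(result_sets)
```

Moving a relevant answer from rank 2 to rank 1 on even a fraction of queries moves this metric noticeably, which is why re-ranking gains show up so directly in MRR.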

Qualitative Improvements

Beyond the numbers, the team observed significant qualitative improvements in user experience. Complex product names and industry-specific terminology that previously yielded poor results now connect users with relevant content effectively. The semantic understanding bridges the gap between how users naturally express their problems and how solutions are documented.

Key Lessons for Enterprise AI Implementation

1. Domain-Specific Training Delivers Outsized Returns

Generic models, no matter how sophisticated, cannot capture the nuanced terminology and usage patterns specific to your business domain. The investment in custom training using your own user interaction data can yield dramatically better results than off-the-shelf solutions.

2. Hybrid Approaches Often Outperform Pure Solutions

Rather than replacing keyword search entirely, combining lexical and semantic approaches leverages the strengths of both methods. This hybrid strategy provides insurance against the weaknesses inherent in any single approach.

3. Data Quality Trumps Data Quantity

Careful attention to data preparation, sampling strategies, and bias mitigation proves more valuable than simply processing larger volumes of raw data. The Zendesk team's sophisticated approach to handling click data demonstrates the importance of understanding your data's limitations and designing around them.

4. Customer-Specific Models Enable Better Governance

Account-specific models using lightweight adapters solve multiple problems simultaneously: data imbalance, performance optimization, and governance requirements. This approach makes enterprise AI deployment more practical and transparent.

5. Invest in Evaluation Infrastructure Early

Creating high-quality test sets and evaluation frameworks enables rapid experimentation and confident deployment decisions. This upfront investment accelerates the entire development process.

The Path Forward: Scaling Semantic Search Success

The success of Zendesk's semantic search implementation demonstrates that significant business impact is achievable when technical innovation focuses on real user needs. Their approach, combining sophisticated machine learning techniques with practical business considerations, provides a roadmap for organizations looking to implement similar solutions.

The key insight extends beyond search to enterprise AI generally: the most successful implementations solve specific business problems using custom approaches informed by real user behavior, rather than applying generic solutions and hoping for the best.

As businesses increasingly recognize that information retrieval quality directly impacts operational costs and customer satisfaction, Zendesk's experience offers valuable lessons about building AI solutions that deliver measurable business value.
