How HubSpot Scores 3.5 Billion CRM Objects Daily: A 78% Load Reduction Success Story
Discover how HubSpot processes 3.5 billion CRM object scores daily while achieving 78% load reduction. Learn their proven scaling strategies, database optimization techniques, and performance improvements. Get practical insights for handling massive data volumes efficiently in production systems.

When Your CRM Becomes Too Successful to Handle
Picture this: your customer relationship management platform is growing so fast that it's drowning in its own success. Every second, 18,000 updates flood your system. During peak times, that number skyrockets to 60,000 updates per second. You're managing 11 billion contact objects, with 18 million new ones added daily. Your sales teams are overwhelmed, your systems are buckling, and traditional solutions just can't keep up.
This was the reality facing HubSpot's engineering team. Their CRM platform had become a victim of its own success, generating massive volumes of data that threatened to overwhelm both their infrastructure and their users. The solution they built, called Prediction Engine, didn't just solve the immediate problem. It transformed how they handle predictive scoring at scale, ultimately reducing system load by up to 78% while processing millions of predictions daily.
The story of how they built this system offers valuable lessons for any organization grappling with the challenges of scaling AI and machine learning infrastructure in high-volume environments.
The Scale Problem That Forced Innovation
According to the HubSpot team, their CRM wasn't just handling typical business data volumes; it was operating at internet scale. With 11 billion contact objects in their system and millions of deals, companies, and other CRM records, they faced a unique challenge: how do you provide intelligent, predictive insights without overwhelming your users or your infrastructure?
The core business problem was straightforward but massive in scope. Sales representatives were managing hundreds of contacts and dozens of deals simultaneously. Without predictive scoring to help prioritize their efforts, they were essentially flying blind, unable to identify which opportunities deserved their immediate attention and which deals were likely to stagnate.
The Technical Reality Behind the Business Challenge
The numbers tell the story of the scale challenge:
- 18,000-60,000 updates per second during peak periods
- 11 billion contact objects in the system
- 18 million new contacts added daily
- 3.5 billion objects scanned in offline processing
- Hundreds of predictive models requiring real-time inference
Traditional approaches to predictive scoring, where each use case implements its own inference pipeline, would have meant multiple teams solving the same scalability challenges independently. This duplication of effort wasn't just inefficient; it was unsustainable at HubSpot's scale.
The Strategic Decision: Platform vs. Point Solutions
Rather than building individual scoring systems for each predictive use case, the HubSpot team made a crucial strategic decision: they would build a unified platform that could handle all predictive scoring needs across their CRM.
This decision was driven by three key insights that apply to any organization considering AI infrastructure investments:
Consistent Data Architecture: HubSpot's CRM uses a standardized data structure where all objects (contacts, deals, companies) share common patterns, identifiers, properties, and associations. This consistency meant a single platform could leverage existing data relationships across all predictive use cases.
Similar Inference Patterns: Whether scoring contact conversion likelihood or deal closing probability, the inference pipeline follows the same basic steps: feature extraction, model invocation, result processing, and feedback collection. Building these pipelines repeatedly would waste engineering resources.
Shared Scalability Challenges: Every predictive use case at HubSpot's scale faces identical challenges around high-volume processing, real-time inference, and system reliability. Solving these problems once and reusing the solution was the only sustainable approach.
Engineering a Solution for Internet-Scale Predictions
The Prediction Engine platform they built addresses both online (real-time) and offline (batch) inference scenarios through a modular architecture designed for reusability and scale.
Real-Time Processing That Actually Scales
For online inference, the system processes approximately 20,000 requests per second while generating around 60 updates per second. This seemingly counterintuitive ratio (far more requests processed than updates written) is the result of several intelligent optimizations:
Smart Property Filtering: Instead of reacting to every property change, the system monitors only "critical" properties identified by ML engineers and product managers. This approach alone reduced processing volume by 41% for contact scoring use cases.
Delta Threshold Logic: New scores are only written if they differ significantly from previous scores. Small, insignificant changes are filtered out, reducing CRM updates by approximately 22% for online inference in deal scoring scenarios.
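Both filters can be expressed as cheap predicates evaluated before any expensive scoring or write work happens. Here is a minimal sketch in Python; the property names and threshold value are illustrative assumptions, not HubSpot's actual configuration:

```python
# Hypothetical critical-property set and delta threshold for illustration.
CRITICAL_PROPERTIES = {"lifecycle_stage", "last_email_open", "deal_amount"}
DELTA_THRESHOLD = 0.05  # minimum score change worth persisting

def should_trigger_scoring(changed_properties: set) -> bool:
    """Only changes to critical properties trigger re-scoring."""
    return bool(changed_properties & CRITICAL_PROPERTIES)

def should_write_score(old_score, new_score: float) -> bool:
    """Persist a new score only if it moved meaningfully."""
    if old_score is None:
        return True  # first score for this object: always write
    return abs(new_score - old_score) >= DELTA_THRESHOLD
```

The key design property is that both checks run before model inference or CRM writes, so filtered-out updates cost almost nothing.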
Intelligent Load Segregation: When entire customer portals upgrade their subscriptions and need all objects scored, the system routes small and large portals through separate processing queues, ensuring smaller customers don't wait behind massive processing jobs.
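The routing itself can be as simple as a size check at enqueue time; the cutoff value and queue names below are assumptions for illustration, not HubSpot's actual topology:

```python
LARGE_PORTAL_CUTOFF = 100_000  # object count; illustrative cutoff

def choose_scoring_queue(portal_object_count: int) -> str:
    """Route a portal-wide scoring job to a size-appropriate queue so
    small portals are not stuck behind massive backfill jobs."""
    if portal_object_count >= LARGE_PORTAL_CUTOFF:
        return "scoring-jobs-large"
    return "scoring-jobs-small"
```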
The Debouncer: Turning Chaos Into Efficiency
One of the most innovative components is the "Debouncer", a system that consolidates multiple rapid updates to the same object within specified timeframes. During high-activity periods, the same deal or contact might receive multiple property updates in quick succession, potentially triggering redundant scoring requests.
The Debouncer ensures each object is scored at most once within a configured timeframe, using different strategies based on volume and business requirements. For contact scoring, this approach contributed to a 44% reduction in request volume. The engineering team discovered that extending the timeframe from 1 hour to 6 hours reduced task requests by only 3% but increased memory usage by 135%, highlighting the careful balance required between efficiency and resource utilization.
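The core idea can be sketched as a per-object timestamp map consulted before scoring. This is a minimal in-memory version; HubSpot's actual storage and per-use-case strategies are not public:

```python
import time

class Debouncer:
    """Minimal sketch: score each object at most once per window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self._last_scored = {}  # object_id -> timestamp of last score

    def should_score(self, object_id, now=None):
        now = time.monotonic() if now is None else now
        last = self._last_scored.get(object_id)
        if last is not None and now - last < self.window:
            return False  # scored recently: coalesce this update
        self._last_scored[object_id] = now
        return True
```

Note that widening the window means `_last_scored` retains entries longer, which illustrates the memory-versus-efficiency trade-off the team observed when moving from 1-hour to 6-hour windows.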
Batch Processing That Handles Billions
For offline inference, the system scans over 3.5 billion objects and scores approximately 250 million objects in scheduled batch jobs. This isn't just about processing volume; it's about handling implicit feature drift that occurs over time.
Consider a deal scoring model that includes "Days Since Last Update" as a feature. Even if a deal record isn't explicitly modified, this time-based feature changes continuously, potentially affecting the deal's score. The offline system captures these implicit changes by periodically re-scoring objects that haven't been updated within preconfigured timeframes.
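Selecting candidates for re-scoring reduces to a staleness check over two timestamps. A sketch, with an assumed window length and illustrative field names:

```python
from datetime import datetime, timedelta

STALENESS_WINDOW = timedelta(days=7)  # assumed re-score interval

def needs_offline_rescore(last_updated: datetime,
                          last_scored: datetime,
                          now: datetime) -> bool:
    """Re-score objects that were neither updated (which would trigger
    online scoring) nor re-scored within the window, so time-based
    features like 'days since last update' stay fresh."""
    return (now - last_updated > STALENESS_WINDOW
            and now - last_scored > STALENESS_WINDOW)
```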
To manage this massive processing load without impacting live systems, the batch jobs operate on daily snapshots stored in S3 rather than querying the live HBase database directly.
Implementation Insights: What Worked and What They Learned
The Explanation Performance Challenge
One unexpected discovery during implementation involved SHAP-like explanations that help users understand why particular scores were assigned. The team found that generating explanations took approximately 20 times longer than generating scores alone.
Their solution demonstrates elegant problem-solving: instead of generating explanations for every score, they implemented a two-step process. First, calculate the score and evaluate whether it meets the delta threshold for significance. Only if the score change is meaningful enough to save do they generate the computationally expensive explanations.
This splitting approach reduced offline inferencing time by 57% for deal scoring use cases, showing how understanding your computational bottlenecks can lead to dramatic efficiency improvements.
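The split amounts to a guard around the expensive explanation step. In this sketch, `model` and `explainer` are stand-ins for whatever scoring and SHAP-style explanation calls a system uses, and the threshold is an assumed value:

```python
DELTA_THRESHOLD = 0.05  # illustrative significance cutoff

def score_then_maybe_explain(model, explainer, features, previous_score):
    """Step 1: compute the cheap score. Step 2: only if the change
    clears the delta threshold, run the far slower explanation."""
    new_score = model(features)
    if previous_score is not None and \
            abs(new_score - previous_score) < DELTA_THRESHOLD:
        return None  # change too small to persist; explanation skipped
    return new_score, explainer(features)
```

Because most score changes fall below the threshold, the slow path runs only for the minority of updates that will actually be saved.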
The 84% Insight That Changed Everything
Perhaps the most striking discovery was that in offline inferencing, approximately 84% of object updates failed to meet the required delta threshold. This meant the vast majority of computational work was being spent on updates that would never be shown to users because the changes were too small to matter.
This insight reinforced the value of their delta threshold approach and highlighted a broader lesson: not all technically accurate updates are business-relevant updates. Sometimes the most important optimization is knowing when not to compute something at all.
Results That Speak to Both Engineers and Executives
The business impact of Prediction Engine extends far beyond technical metrics, though the technical achievements are impressive in their own right:
Technical Performance Metrics
- 20,000 requests per second processed in online inference
- 3.5 billion objects scanned in offline batch processing
- 250 million objects scored in batch jobs
- Up to 78% reduction in load on upstream services
Business Impact Translation
These technical improvements translate directly into business value:
Sales Team Efficiency: Representatives can now quickly identify high-potential deals and prioritize their time strategically, focusing efforts on opportunities most likely to close successfully.
System Reliability: By reducing upstream service load by up to 78%, the platform prevents system overloads that could impact user experience during peak usage periods.
Development Velocity: The modular, reusable architecture allows new predictive models to be integrated quickly, reducing time-to-market for new AI-powered features.
Cost Optimization: Processing only meaningful updates and implementing intelligent filtering reduces computational costs while maintaining prediction accuracy.
Lessons for Organizations Building AI Infrastructure
HubSpot's Prediction Engine success offers several transferable insights for organizations grappling with similar challenges:
Platform Thinking Pays Dividends at Scale
The decision to build a unified platform rather than point solutions becomes more valuable as your AI initiatives multiply. If you're planning multiple machine learning use cases, investing in shared infrastructure early can prevent technical debt and resource duplication later.
Not All Updates Are Created Equal
The delta threshold approach, only processing changes that matter to end users, is applicable beyond predictive scoring. Any system generating frequent updates should evaluate whether all those updates provide meaningful value to users.
Optimization Requires Understanding Your Bottlenecks
HubSpot's discovery that explanation generation was 20x slower than scoring, and their splitting solution, demonstrates the importance of profiling your system's actual performance characteristics rather than making assumptions.
Design for Debuggability and Feedback
The Feedback Manager component that tracks inferences and sends model performance data shows the importance of building observability into AI systems from the start. You can't improve what you can't measure.
Scale Changes Everything
Solutions that work at moderate scale may fail completely at internet scale. HubSpot's challenges with 60,000 updates per second during peak periods required fundamentally different approaches than traditional CRM systems handling hundreds or thousands of updates.

Building the Future of Intelligent CRM
The Prediction Engine platform represents more than a technical achievement: it's a blueprint for how organizations can successfully integrate AI capabilities into high-volume, mission-critical business systems.
By focusing on modularity, reusability, and intelligent optimization, HubSpot created infrastructure that not only solved their immediate scaling challenges but positioned them to rapidly deploy new predictive capabilities as business needs evolve.
For technical leaders evaluating similar challenges, the key takeaway isn't the specific technologies used, but the strategic approach: understand your scale requirements, identify commonalities across use cases, and build platforms that can grow with your AI ambitions.
The future belongs to organizations that can deliver AI-powered insights at scale without overwhelming their users or their infrastructure. HubSpot's Prediction Engine shows one path to that future.
VegaStack Blog
VegaStack Blog publishes articles about CI/CD, DevSecOps, Cloud, Docker, Developer Hacks, DevOps News and more.