"AI-powered" has become the most overused term in M&A deal materials. Our analysis shows that 60% of companies claiming AI capabilities have nothing more sophisticated than basic rule engines or simple statistical models. Here's how to separate genuine AI value from marketing hype.
The AI Premium Problem
Companies with credible AI capabilities command significant valuation premiums—often 2-3x revenue multiples compared to traditional software. This creates a powerful incentive for sellers to overstate their AI sophistication.
We've seen targets describe Excel macros as "proprietary algorithms" and if-then rules as "machine learning." At best these characterizations are generous; at worst they are flatly false. Either way, they dramatically overstate the defensibility and value of the technology.
The Five Levels of AI Sophistication
We use a simple framework to categorize AI capabilities:
Level 0: No AI (Despite Claims)
Rule-based systems, hard-coded logic, simple statistical thresholds. Common in legacy systems that have added "AI" to their marketing materials.
Level 1: Basic Machine Learning
Simple classification or regression models (logistic regression, decision trees). Valuable but not defensible: competitors can replicate quickly, as the sketch below illustrates.
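To make the replication point concrete, here is a minimal sketch of what a Level 1 capability often amounts to in practice. It uses scikit-learn on synthetic data as a stand-in for a target's actual stack and dataset:

```python
# Minimal Level 1 capability: a logistic-regression classifier.
# make_classification generates synthetic data (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
print(f"test AUC: {roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]):.3f}")
```

A competent engineer can stand this up in an afternoon, which is exactly why Level 1 alone carries little defensibility premium.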
Level 2: Applied ML with Custom Features
More sophisticated models with domain-specific feature engineering. Some defensibility if the feature engineering reflects deep domain expertise.
Level 3: Deep Learning / Advanced ML
Neural networks, ensemble methods, sophisticated architectures. Requires significant expertise to develop and maintain. More defensible.
Level 4: Proprietary AI with Data Moat
Advanced models trained on proprietary datasets that competitors cannot access. The combination of algorithmic sophistication and unique data creates true competitive advantage.
Key Questions for AI Due Diligence
Model Validation
- What metrics demonstrate model performance? (Accuracy alone is insufficient; ask for precision, recall, F1, and AUC as appropriate. A sketch follows this list.)
- How does the model perform in production vs. testing? (Beware significant degradation)
- Has the model been validated on out-of-sample data?
- How often does the model require retraining?
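As a reference point for the metrics question above, here is a hedged sketch of the numbers a seller should be able to produce on demand for a held-out set. The labels and scores below are synthetic placeholders, not real model output:

```python
# Hypothetical sketch: the metrics a seller should produce for a held-out set.
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1_000)                          # stand-in labels
y_score = np.clip(y_true * 0.3 + rng.random(1_000) * 0.7, 0, 1)  # stand-in scores
y_pred = (y_score >= 0.5).astype(int)

print(f"precision: {precision_score(y_true, y_pred):.3f}")
print(f"recall:    {recall_score(y_true, y_pred):.3f}")
print(f"F1:        {f1_score(y_true, y_pred):.3f}")
print(f"AUC:       {roc_auc_score(y_true, y_score):.3f}")
```

If a target cannot produce the equivalent of these four lines of output for its own models, treat the performance claims in the deck as unverified.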
Data Pipeline Assessment
- Where does training data come from?
- Is there a data moat—proprietary data that competitors can't access?
- How is data quality maintained? (A sketch of basic quality gates follows this list.)
- Are there data licensing or privacy concerns?
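For the data-quality question, a mature pipeline typically enforces automated gates rather than relying on manual review. The sketch below assumes a pandas-based pipeline; the column names and the 5% null threshold are hypothetical:

```python
# Hypothetical sketch of basic data-quality gates in a training pipeline.
# Column names and thresholds are illustrative, not from any real target.
import pandas as pd

def quality_report(df: pd.DataFrame, max_null_rate: float = 0.05) -> list[str]:
    """Return a list of data-quality issues found in a training dataframe."""
    issues = []
    for col, rate in df.isna().mean().items():
        if rate > max_null_rate:
            issues.append(f"{col}: {rate:.1%} null values")
    dup_rate = df.duplicated().mean()
    if dup_rate > 0:
        issues.append(f"{dup_rate:.1%} duplicate rows")
    return issues

df = pd.DataFrame({"age": [34, None, 51, 51],
                   "dx_code": ["I10", "E11", "I10", "I10"]})
print(quality_report(df) or "no issues found")
```

In diligence, ask whether checks like these run on every training refresh and what happens when they fail.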
MLOps Maturity
- How are models deployed and monitored?
- What's the process for model updates?
- How is model drift detected and addressed? (A drift-check sketch follows this list.)
- Is there version control for models and datasets?
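On drift detection, one common (though not the only) approach is to compare live feature distributions against the training distribution. This sketch uses a two-sample Kolmogorov-Smirnov test from scipy; the data and the p-value threshold are illustrative assumptions:

```python
# Hypothetical sketch of feature-drift detection: compare the production
# feature distribution against the training distribution with a KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)  # at training time
live_feature = rng.normal(loc=0.4, scale=1.2, size=2_000)    # in production

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:
    print(f"drift detected (KS={stat:.3f}, p={p_value:.2e}); consider retraining")
```

A target with genuine MLOps maturity will have some version of this running automatically, with alerts wired to the team that owns retraining.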
Team and Knowledge
- Who built the models? Are they still with the company?
- Is the ML knowledge documented or concentrated in key individuals?
- What's the data science team's background and capability?
Red Flags in AI Due Diligence
Watch for these warning signs:
- "Black box" explanations: Inability to explain how models work suggests they may not understand their own technology
- No performance metrics: If they can't quantify model accuracy, they probably haven't measured it
- Training data concerns: Models trained on purchased or scraped data may have licensing issues
- Single data scientist: Key person risk is extreme in AI—one departure can cripple the capability
- No production monitoring: Models that aren't monitored degrade silently
- Overfitting indicators: Perfect test performance paired with poor production results signals fundamental problems (a quick check follows this list)
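A quick way to operationalize the overfitting red flag is to compare scores across train, holdout, and production. The function and the 0.05 gap threshold below are hypothetical; the example holdout/production pair deliberately echoes the case study at the end of this piece:

```python
# Hypothetical sketch: quantify the score gaps that signal overfitting or a
# holdout set that does not represent production traffic.
def score_gap_check(train: float, holdout: float, production: float) -> None:
    """Flag suspicious gaps between reported and realized performance."""
    if train - holdout > 0.05:
        print("warning: possible overfitting to training data")
    if holdout - production > 0.05:
        print("warning: holdout does not reflect production traffic")

score_gap_check(train=0.96, holdout=0.94, production=0.67)
# -> warning: holdout does not reflect production traffic
```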
Valuation Implications
AI capabilities should be valued based on:
- Defensibility: How hard is it for competitors to replicate?
- Revenue attribution: What portion of revenue depends on AI capabilities?
- Improvement trajectory: Is the model getting better over time with more data? (A learning-curve sketch follows this list.)
- Replacement cost: What would it cost to build equivalent capability from scratch?
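To test the improvement-trajectory claim, ask for (or compute) a learning curve: validation performance as a function of training-set size. If the curve has already flattened, more data will not make the model meaningfully better. This sketch uses scikit-learn on synthetic data, not any target's real pipeline:

```python
# Hypothetical sketch: a learning curve tests the "more data helps" claim.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1_000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring="roc_auc")

for n, score in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n:>5} samples -> validation AUC {score:.3f}")
```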
A Level 4 AI capability with a genuine data moat might justify a 2-3x premium. A Level 1 capability that could be replicated in six months by a competent team? That's table stakes, not differentiation.
Case Study: The $18M AI Discount
We assessed an "AI-powered" clinical analytics company for a healthcare PE firm. The target claimed proprietary NLP models for extracting insights from medical records.
Our findings:
- The "proprietary NLP" was actually an off-the-shelf model with minimal customization
- Training data had been licensed from a vendor—no data moat
- The single ML engineer who built the system had already given notice
- Model accuracy in production was 67%, not the 94% claimed in marketing materials
Result: The buyer renegotiated from $45M to $27M, an $18M reduction based on our assessment of actual AI capabilities versus claims.