The 10-Category Scoring Framework
Every tool is scored 0–100 across ten weighted categories. The weights reflect what actually drives buying decisions; a worked example of the weighted calculation follows the table.
| Category | Weight | What We Measure |
|---|---|---|
| Core AI Performance | 25% | Accuracy, reasoning quality, task completion rate, hallucination frequency |
| Data Privacy & Security | 20% | Data handling policies, encryption, retention, third-party sharing |
| Transparency & Explainability | 15% | Model disclosure, training data clarity, decision transparency |
| Reliability & Uptime | 10% | API reliability, documented SLA, historical uptime, error handling |
| Compliance & Regulatory Fit | 10% | GDPR, HIPAA, SOC 2, EU AI Act readiness, audit logging |
| Pricing Fairness | 8% | Price-to-value ratio, hidden fees, contract lock-in, free tier honesty |
| Integration & Usability | 4% | API quality, documentation completeness, onboarding experience |
| Human Override Capability | 4% | Ability of humans to override AI decisions, clarity of escalation paths |
| Bias & Fairness | 3% | Documented bias testing, demographic fairness, known failure modes |
| Vendor Accountability | 1% | Bug reporting responsiveness, public incident history, SLA enforcement |
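To make the arithmetic concrete, here is a minimal sketch of how the weighted composite could be computed from per-category scores on the 0–100 scale. The weight table mirrors the one above; the category keys, function name, and example scores are illustrative, not our production tooling.

```python
# Minimal sketch: combine per-category scores (each 0-100) into one
# weighted composite. Weights mirror the table above and sum to 1.0.
WEIGHTS = {
    "core_ai_performance": 0.25,
    "data_privacy_security": 0.20,
    "transparency_explainability": 0.15,
    "reliability_uptime": 0.10,
    "compliance_regulatory_fit": 0.10,
    "pricing_fairness": 0.08,
    "integration_usability": 0.04,
    "human_override_capability": 0.04,
    "bias_fairness": 0.03,
    "vendor_accountability": 0.01,
}

def composite_score(category_scores: dict[str, float]) -> float:
    """Weighted sum of per-category scores; every category must be scored."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return round(sum(w * category_scores[c] for c, w in WEIGHTS.items()), 1)

# Hypothetical tool: strong core performance, weak transparency.
print(composite_score({
    "core_ai_performance": 90,
    "data_privacy_security": 75,
    "transparency_explainability": 40,
    "reliability_uptime": 85,
    "compliance_regulatory_fit": 70,
    "pricing_fairness": 60,
    "integration_usability": 80,
    "human_override_capability": 50,
    "bias_fairness": 55,
    "vendor_accountability": 65,
}))  # -> 71.3
```

Because the weights sum to 100%, the composite stays on the same 0–100 scale as the individual category scores.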
The 5-Step Review Process
Every review follows these five steps in order. No steps are skipped. No exceptions.
Step 1: Independent Testing (Minimum 2 Weeks)
We use the tool for its stated purpose across realistic scenarios. We pay for every tool we test at the tier being reviewed. Testing occurs at different times to capture reliability variance.
Step 2: Documentation Deep-Dive
Privacy policy reviewed line by line. Terms of service flagged for unusual clauses. API documentation assessed. Security certifications verified through primary sources.
Step 3: Developer Interview
Questions about model architecture, data handling, breach protocols, and roadmap. If a vendor declines, this is noted publicly. Interview summaries are published.
Step 4: Independent Researcher Cross-Check
Draft reviews are verified by independent researchers. Their inputs are published verbatim. Researchers are never anonymous—full name and affiliation required.
Step 5: Publication & Score Assignment
Final score calculated. Rationale published for every category. The company is notified and may submit a response, but cannot request edits to scores.
Vendor Response Policy
Companies have 30 days from publication to submit a formal written response. Responses are published in full, unedited, as a dedicated section within the review.
Important: This is not an appeal. Scores do not change based on vendor objection. Scores change only when verifiable new information is provided or when the product materially changes.
Methodology Versioning
Every review is tagged with the methodology version under which it was conducted (see the sketch after this list). When the methodology is updated:
- All affected reviews are flagged for re-evaluation
- Each version update is logged with what changed and why
- Reviews conducted under old versions are clearly labeled
- Major version changes trigger mandatory re-review
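As a minimal sketch, assuming major.minor version tags, the code below shows one way a review could be stamped with a methodology version and flagged when that version changes. The `Review` fields, the helper names, and the `ExampleAI` entry are hypothetical, not a description of our internal systems.

```python
from dataclasses import dataclass

@dataclass
class Review:
    tool: str
    score: float
    methodology_version: str  # set at publication time, e.g. "1.0"

def needs_reevaluation(review: Review, current_version: str) -> bool:
    """Any review published under an older methodology version gets flagged."""
    return review.methodology_version != current_version

def requires_full_rereview(review: Review, current_version: str) -> bool:
    """A major-version bump (1.x -> 2.x) triggers a mandatory re-review."""
    old_major = int(review.methodology_version.split(".")[0])
    new_major = int(current_version.split(".")[0])
    return new_major > old_major

# Hypothetical example: a v1.0-era review after the methodology moves to v2.0.
r = Review(tool="ExampleAI", score=71.3, methodology_version="1.0")
print(needs_reevaluation(r, "2.0"))      # True -> flag for re-evaluation
print(requires_full_rereview(r, "2.0"))  # True -> mandatory re-review
```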
What We Will Never Do
- Accept payment to improve, change, or remove a score
- Offer priority review scheduling for paying companies
- Publish anonymous reviews or researcher contributions
- Quietly edit reviews—all changes are logged publicly
- Review tools in which we hold any financial interest
- Assign a review to anyone who holds equity, an advisory role, or a commercial relationship with the tool's vendor
Methodology Changelog
- v1.0 — February 2026 — Initial framework published.