ZDNET’s main highlights
- ZDNET evaluates AI through hands-on, real-world applications.
- Companies cannot preview or alter reviews.
- Consistent testing keeps our “best of” comparisons fair.
At ZDNET, we recognize the significant duty we carry. Many of you rely on our assessments to guide your buying choices, so we strive to deliver clear, impartial, and carefully considered insights. This ensures you have a dependable foundation for investing both your money and time wisely.
We apply the same commitment to free offerings, recognizing that time is just as valuable as cash. We aim to help you avoid wasting either resource.
Also: ZDNET AI guidelines
To review AI tools and services, we often partner with vendors for product access. However, they are never granted a preview of our content or given any power to sway our conclusions. Our analysis remains unbiased and focused on how these products benefit our audience.
Our AI testing approach in 2026
Let’s examine how we test AI at ZDNET. Given that AI features are now integrated into nearly everything, our scope is vast. We cover large language models, development tools, image generators, AI-powered apps, and even specialized hardware, whether that’s an effective application like a robotic vacuum or a less successful gadget like an AI pin.
We judge products and services using a broad set of benchmarks. Most importantly, every review involves direct hands-on interaction and practical, real-world trials. While we might acknowledge benchmark results shared in press releases, they do not form the basis of our evaluations.
Generally, our reviews fall into two categories. To identify the top performers in specific areas, we create “Best of” selections. For individual products or services, we often provide in-depth narratives based on long-term personal usage, offering diverse viewpoints for our readers.
Creating comparative assessments
Compiling our “best” lists is a three-phase operation. First, we create evaluation metrics to compare items objectively. Next, we select the products for comparison. Finally, we perform the actual head-to-head testing.
The process begins by asking, “How do we fairly judge this category?” I then develop a series of tests, which I outline in the final article. These tests assess performance, value, utility, precision, security, and privacy, using standardized procedures to keep our comparisons objective.
For instance, our evaluations of chatbots and AI image creators include full methodologies within the articles. The same rigorous process applies to other categories.
Some products are obvious choices for our candidate pools. When evaluating chatbots, for instance, tools like ChatGPT, Gemini, and Claude are immediate inclusions.
We also delve deeper by considering reader requests and general buzz from online communities. Occasionally, if a vendor suggests a product that fits our criteria and we find it relevant, we may include it as well.
Typically, we end up with a pool of five to ten contenders. Often, a quick review of our testing criteria will disqualify some; perhaps they are overpriced or simply do not align with the category guidelines.
For example, we may receive pitches from course creators wanting their paid training featured in our compilation of free classes, but payment-based courses do not qualify for that list.
The time required to finalize candidates and set up access varies. Last year, testing AI website builders required 231 emails and over six months of preparation to ensure all products were ready for evaluation. In contrast, this year’s update was completed in two months with fewer than 50 emails.
This brings us to the actual execution of tests and re-testing. Once our methodology is set, the testing phase is straightforward, though it takes considerable time. We record results meticulously for each test and screen, later normalizing the data with comparative scoring and weighting. These metrics are all explained within the articles.
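For readers curious what that scoring step can look like, here is a minimal sketch of weighted normalization. It is purely illustrative: the categories, weights, and raw scores are hypothetical, not ZDNET’s actual tooling, and the real criteria and weightings are spelled out in each article.

```python
# Illustrative sketch only: the categories, weights, and raw scores below
# are hypothetical, not ZDNET's actual data or scoring tool.

WEIGHTS = {"performance": 0.30, "value": 0.20, "utility": 0.25,
           "accuracy": 0.15, "privacy": 0.10}

raw_scores = {
    "Product A": {"performance": 8, "value": 6, "utility": 9, "accuracy": 7, "privacy": 5},
    "Product B": {"performance": 6, "value": 9, "utility": 7, "accuracy": 8, "privacy": 8},
}

def weighted_score(scores, max_raw=10):
    """Normalize each raw score to a 0-1 scale, then apply the category weights."""
    return sum(WEIGHTS[cat] * (val / max_raw) for cat, val in scores.items())

for product, scores in raw_scores.items():
    print(f"{product}: {weighted_score(scores):.2f}")
```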
While publication marks the initial milestone, the work is far from over.
In the swiftly evolving AI landscape, products constantly change. Some may fade away or face financial hurdles, while others continuously improve. After 6 to 12 months, our lists often require updates. For example, last year’s crop of AI website builders was modest at best, but this year, several options are genuinely impressive.
Extended product use
Another review method involves using products and integrating them into personal and professional projects over extended periods. These assessments go beyond standard reviews as we put the tools through weeks—or even months—of regular use.
This is particularly evident in my coding-focused articles. Comparing AI coding tools objectively is difficult without building something real. Academic coding tasks differ greatly from professional product development or troubleshooting live issues.
These projects are frequently ongoing and generate fresh insights and evolving perspectives.
I first tested OpenAI’s Codex in its early days and was unimpressed. After later updates, I revisited it for a security project and completed 24 days of coding in just 12 hours, though I also ran into some limitations. As it continued to improve, a later test let me simulate four years of development in just four days.
We’ve published similar long-term experiential reviews of Gemini, ChatGPT, Claude Code, various image generators, and other tools. As these tools evolve, we continue to find new ways to integrate and test them.
Your vital contribution
We receive substantial feedback via emails, social platforms, and article comments, which helps guide our future coverage. Your high expectations are appreciated; they drive us to deliver the most accurate insights.
Additionally, sharing your personal experiences with the products we cover is invaluable. Many readers are highly skilled experts. Your input helps us stay informed, which in turn allows us to provide deeper insights to the wider audience. Essentially, our work benefits from peer review by our millions of professional readers and enthusiasts.
We maintain a rigorous standard for our reviews because we understand the stakes of your financial and time investments, which frequently depend on our reporting.
Please reach out whenever you have a suggestion for a new AI topic or tool. What product, service, or category should we explore next? Share your thoughts in the comments below.
For regular project updates, follow me on social media. Subscribe to my newsletter and connect with me on Twitter/X at @DavidGewirtz, Facebook at Facebook.com/DavidGewirtz, Instagram at Instagram.com/DavidGewirtz, Bluesky at @DavidGewirtz.com, and YouTube at YouTube.com/DavidGewirtzTV.



