Quick API Data Quality Checks Without Writing Python Scripts
Here's an uncomfortable truth about API data: it's rarely clean.
APIs return nulls where you expect values. They return strings where you expect numbers. They silently change field names between versions. They return stale data without telling you. They paginate inconsistently. They include test records in production responses.
Every analyst who works with API data has a story about a report that went wrong because the source data was garbage. A metric that was off because of duplicate records. A trend line that spiked because of a timezone issue in the timestamps.
The standard solution is to write a data quality script in Python. Install pandas. Write a DataFrame profiler. Check nulls, dtypes, duplicates, value ranges. Output a report. Set it up to run before your analysis.
It's good practice. It's also a significant time investment — especially if you're not a Python developer, or if you need to check a new API quickly without setting up a project.
There's a faster path.
What "Data Quality" Actually Means for API Data
Before we talk tools, let's be clear about what we're checking:
1. Completeness
Are the fields I expect actually populated? A user_id column shouldn't be 30% null. A revenue field shouldn't have blanks.
2. Uniqueness
Does the API return duplicates? Pagination bugs, caching issues, and API version differences can cause the same record to appear multiple times.
3. Validity
Are the values in a sensible range? Negative prices. Future timestamps in historical data. Age fields with values of 0 or 999. These are validity failures.
4. Consistency
Does the data agree with itself? start_date should always precede end_date, country_code should match country_name, and totals should equal the sum of line items.
5. Freshness
Is the data up to date? If the API claims to update daily but the latest timestamp is from last week, that's a freshness failure.
6. Schema Drift
Has the API silently changed? New fields, renamed fields, changed data types — these break downstream analysis silently and painfully.
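Of the six, schema drift is the easiest to miss by eye, because nothing in the data itself looks wrong. A minimal sketch of how you might compare two crawls in plain Python (the field names here are hypothetical, and this is a simplification, not how any particular tool implements it):

```python
def schema_of(records):
    """Map each field name to the set of value types seen across records."""
    schema = {}
    for rec in records:
        for field, value in rec.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema

def diff_schemas(old, new):
    """Report added fields, removed fields, and fields whose types changed."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "type_changed": sorted(
            f for f in set(old) & set(new) if old[f] != new[f]
        ),
    }

last_week = schema_of([{"id": 1, "revenue": 100}])
today = schema_of([{"id": 1, "revenue": "100", "region": "EU"}])
print(diff_schemas(last_week, today))
# {'added': ['region'], 'removed': [], 'type_changed': ['revenue']}
```

Here the revenue field silently became a string and a new region field appeared: exactly the kind of change that breaks downstream analysis without raising an error.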
The Old Way: Python Quality Checks
Here's what a thorough data quality check on an API response looks like in the traditional workflow:
Step 1 — Set up the environment: Create a virtual environment. Install requests, pandas, numpy, maybe great_expectations or ydata-profiling.
Step 2 — Fetch the data: Write the API call. Handle authentication. Handle pagination. Handle rate limits. Flatten the JSON.
Step 3 — Load into pandas: df = pd.DataFrame(data). Deal with nested columns. Cast types.
Step 4 — Write the checks:
- df.isnull().sum() — null counts per column
- df.duplicated().sum() — duplicate rows
- df.describe() — stats per column
- Manual range checks for each critical field
- Custom consistency checks
- Timestamp max for freshness
Step 5 — Interpret the results: 200 lines of output. Decide what's a problem and what isn't.
Step 6 — Document for next time: Hope you remember what you checked and why.
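For a sense of scale, here is roughly what steps 3 and 4 condense to once the data is already fetched: a stdlib-only sketch rather than the pandas version, over a list of dict records with illustrative field names:

```python
from collections import Counter

def quality_report(records, id_field="id", ts_field="created_at"):
    """Null counts per field, duplicate IDs, and freshness for dict records."""
    fields = {f for rec in records for f in rec}
    nulls = {f: sum(1 for r in records if r.get(f) is None) for f in fields}
    id_counts = Counter(r.get(id_field) for r in records)
    dupes = [i for i, n in id_counts.items() if n > 1]
    latest = max(r[ts_field] for r in records if r.get(ts_field))
    return {"rows": len(records), "nulls": nulls,
            "duplicate_ids": dupes, "latest": latest}

data = [
    {"id": 1, "revenue": 120.0, "created_at": "2025-01-06"},
    {"id": 2, "revenue": None, "created_at": "2025-01-07"},
    {"id": 2, "revenue": 95.5, "created_at": "2025-01-07"},
]
report = quality_report(data)
print(report["nulls"]["revenue"])  # 1
print(report["duplicate_ids"])     # [2]
print(report["latest"])            # 2025-01-07
```

And that's the easy part: the fetching, pagination, authentication, and JSON flattening in step 2 usually take longer than the checks themselves.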
This is a 2–4 hour project the first time. Even with experience, it's 30–60 minutes per new API source. And it requires Python fluency throughout.
The New Way: Harbinger Explorer
Harbinger Explorer collapses this workflow to minutes — for people who know what questions to ask, but don't want to write a program to ask them.
Here's the workflow:
Step 1: Crawl Your API (2 minutes)
Add your API endpoint to Harbinger Explorer's Source Catalog. Authenticate once. Run the crawl. The data is loaded into DuckDB WASM — in your browser, instantly queryable.
Step 2: Ask Quality Questions in Plain English or SQL
The beauty of having your API data in a SQL engine is that every data quality check is a query. And with Harbinger's AI agent chat, you can ask in plain English:
Completeness checks:
- "How many rows have a null value in the user_id column?"
- "What percentage of records are missing revenue data?"
- "Show me all columns and their null counts."
Uniqueness checks:
- "Are there any duplicate record IDs?"
- "Show me any rows where the same email appears more than once."
Validity checks:
- "Are there any negative values in the price column?"
- "Show me records where the end_date is before the start_date."
- "What's the min and max value of the age field?"
Freshness checks:
- "What's the most recent timestamp in this dataset?"
- "How many records were created in the last 7 days?"
- "Show me the distribution of records by date."
Schema drift detection:
- Compare current schema to a previous crawl
- "What new columns appeared since last week?"
- "Has the data type of the revenue column changed?"
All of these translate to DuckDB SQL under the hood — fast, accurate, and reproducible.
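You can see the shape of those generated queries by running the same three checks against SQLite from Python. The table and column names below are made up for illustration, and these particular queries would look much the same in DuckDB:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, email TEXT,
                        revenue REAL, created_at TEXT);
    INSERT INTO users VALUES
        (1, 'a@x.com', 120.0, '2025-01-06'),
        (2, 'b@x.com', NULL,  '2025-01-07'),
        (2, 'b@x.com', 95.5,  '2025-01-07');
""")

# Completeness: how many rows have a null revenue?
null_revenue = conn.execute(
    "SELECT COUNT(*) FROM users WHERE revenue IS NULL").fetchone()[0]

# Uniqueness: which user_ids appear more than once?
dupes = conn.execute("""
    SELECT user_id, COUNT(*) FROM users
    GROUP BY user_id HAVING COUNT(*) > 1
""").fetchall()

# Freshness: most recent timestamp in the dataset
latest = conn.execute("SELECT MAX(created_at) FROM users").fetchone()[0]

print(null_revenue, dupes, latest)  # 1 [(2, 2)] 2025-01-07
```

The point is not that you need to write this SQL yourself; it's that every plain-English question above has a short, deterministic query behind it, which is why the answers are reproducible.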
Example: Validating an External Data Feed
You're an analyst receiving daily data from a third-party market intelligence API. You've been burned before — a field that went null for a week without warning, causing incorrect calculations in your weekly report.
Your new quality gate with Harbinger Explorer:
Monday, 8:50 AM — Open Harbinger Explorer. Re-crawl the API. 60 seconds.
8:52 AM — Ask: "How many rows are in today's data? Is that roughly the same as last week?"
8:53 AM — Ask: "Are there any nulls in the signal_score or market_region columns?"
8:54 AM — Ask: "What's the most recent update timestamp in the data?"
8:55 AM — Ask: "Are there any duplicate record IDs?"
8:57 AM — Green light. Data is clean. Start your analysis.
Total quality check time: 7 minutes. No scripts. No environment setup. No Python.
Compare that to the old way, and you've saved 40+ minutes every single day — while actually being more thorough because you're checking the right things for your specific use case.
The Most Important Quality Checks for API Data
Here are the checks that catch 80% of API data problems, phrased as questions you can ask directly in Harbinger Explorer:
| Check | Question to Ask |
|---|---|
| Row count sanity | "How many rows did we get? Is that normal?" |
| Null completeness | "Show me null counts for each column" |
| Duplicate detection | "Are there any duplicate IDs?" |
| Value range validation | "What's the min and max of [critical numeric field]?" |
| Timestamp freshness | "What's the most recent date in the dataset?" |
| Category distribution | "Show me unique values in [category field] with counts" |
| Cross-field consistency | "Show me rows where end_date < start_date" |
| Outlier detection | "Show me records where [metric] is more than 3x the average" |
These eight checks will catch the vast majority of data quality issues before they reach your reports.
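Two of the trickier rows in that table, cross-field consistency and the 3x-average outlier rule, look like this as queries. Again the table and column names are invented, run here against SQLite; the SQL a tool generates for DuckDB would be similar in shape:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, amount REAL,
                         start_date TEXT, end_date TEXT);
    INSERT INTO orders VALUES
        (1, 100.0, '2025-01-01', '2025-01-05'),
        (2,  90.0, '2025-01-10', '2025-01-03'),  -- end before start
        (3, 110.0, '2025-01-02', '2025-01-04'),
        (4,  95.0, '2025-01-03', '2025-01-06'),
        (5, 900.0, '2025-01-02', '2025-01-04');  -- large outlier
""")

# Cross-field consistency: end_date should never precede start_date
bad_dates = conn.execute(
    "SELECT order_id FROM orders WHERE end_date < start_date").fetchall()

# Outlier detection: amounts more than 3x the average
outliers = conn.execute("""
    SELECT order_id, amount FROM orders
    WHERE amount > 3 * (SELECT AVG(amount) FROM orders)
""").fetchall()

print(bad_dates)  # [(2,)]
print(outliers)   # [(5, 900.0)]
```

One caveat worth knowing: a big enough outlier inflates the average it's compared against, so the 3x rule is a coarse first filter, not a statistical test.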
Who Needs This Most
Freelance Data Consultants
When you're delivering analysis to clients, data quality is your responsibility — even when the source is a third-party API you don't control. A quick quality gate before every deliverable protects your reputation. With Harbinger Explorer, it's a 10-minute habit, not a half-day project.
Internal Analysts at Fast-Moving Companies
Your data team is heads-down on roadmap. You can't create a ticket every time you want a new API validated. Harbinger gives you the autonomy to run your own checks.
Researchers Working with Public APIs
Academic and public datasets are notoriously inconsistent. APIs revise historical data, update field definitions, and change response formats without announcement. Regular quality checks catch these changes before they corrupt your research.
Bootcamp Graduates Entering Analyst Roles
You know what data quality means, but you haven't built the Python tooling yet. Harbinger Explorer gives you the outcome — validated, understood data — while you develop your scripting skills.
Competitor Comparison
| Tool | For Non-Devs | API Crawling | NL Queries | Data Quality Checks | Price |
|---|---|---|---|---|---|
| pandas profiling | ❌ Python required | ❌ | ❌ | ✅ Auto-profile | Free |
| Great Expectations | ❌ Engineering heavy | ❌ | ❌ | ✅ Test suite | Open source |
| Ataccama | ✅ UI-based | ❌ | ❌ | ✅ Full platform | Enterprise $$$ |
| Metabase | ✅ | ❌ (needs DB) | ⚠️ | ⚠️ | $500+/mo |
| Harbinger Explorer | ✅ | ✅ | ✅ | ✅ Via SQL/NL | €8/mo |
Harbinger is the only tool that combines API access, browser-based SQL, and natural language queries at a price accessible to freelancers and small teams.
What Harbinger Explorer Doesn't Do (Be Honest)
It's worth being transparent about limitations:
- No automated alerting — Harbinger doesn't send you a notification if today's crawl has more nulls than yesterday. You run the checks manually. Think of it as a tool you use, not a sentinel that watches for you.
- No persistent data storage — Data lives in your browser session. You're not building a long-term quality history unless you export your results.
- Not a replacement for a full data quality platform — If your organization needs automated, continuous, enterprise-grade data contracts across 50 sources, that's a different tool category. Harbinger is for individuals and small teams who need fast, ad-hoc validation.
For what it is designed to do — fast, browser-based quality checks on API data for non-engineers — it's uniquely positioned.
Time Savings: By the Numbers
| Task | Python Script | Harbinger Explorer |
|---|---|---|
| Environment setup | 15–30 min | 0 |
| Fetch and flatten API data | 30–60 min | 2 min |
| Write null checks | 10 min | 30 sec |
| Write duplicate checks | 10 min | 30 sec |
| Write freshness checks | 10 min | 30 sec |
| Write consistency checks | 15–20 min | 1 min |
| Interpret and document results | 15–30 min | 5 min |
| Total first-time quality check | 1.5–3 hours | ~10 minutes |
| Total repeat check (same source) | 20–30 min | 5–7 min |
For a consultant running quality checks on 3–4 API sources per week, that's 4–6 hours saved every week. That's time that goes back into actual analysis, client communication, and deliverables.
Getting Started
Data quality checks don't need to be a project. They can be a habit.
- Visit harbingerexplorer.com
- Start your 7-day free trial
- Crawl your most important API source
- Ask: "Are there any nulls in the columns I care about?"
- Ask: "Are there any duplicates?"
- Ask: "How fresh is this data?"
You'll have a quality assessment in under 10 minutes — and a new habit that will save your analysis from bad data.
Pricing
| Plan | Price | Best For |
|---|---|---|
| Starter | €8/month | Freelancers, solo analysts, researchers |
| Pro | €24/month | Power users with multiple API sources |
| Free Trial | 7 days | Validate your most important source today |
Bad data is silent. It produces reports that look right, decisions that feel informed, and insights that are subtly wrong. The only defence is asking the right questions before you trust the numbers.
With Harbinger Explorer, those questions take 10 minutes, not half a day.
Continue Reading
Search and Discover API Documentation Efficiently: Stop Losing Hours in the Docs
API documentation is the final boss of data work. Learn how to find what you need faster, stop getting lost in sprawling docs sites, and discover APIs you didn't know existed.
Automatically Discover API Endpoints from Documentation — No More Manual Guesswork
Reading API docs to manually map out endpoints is slow, error-prone, and tedious. Harbinger Explorer's AI agent does it for you — extracting endpoints, parameters, and auth requirements automatically.
Track API Rate Limits Without Writing Custom Scripts
API rate limits are silent project killers. Learn how to monitor them proactively — without building a custom monitoring pipeline — and stop losing hours to 429 errors.
Try Harbinger Explorer for free
Connect any API, upload files, and explore with AI — all in your browser. No credit card required.
Start Free Trial