xhub.io
A Digital Service by BeeBack UG
Software Development

Nanook: Open-Source Test Data Generation with Equivalence Classes

Manual test data wastes time and misses edge cases. Nanook generates systematic test data from spreadsheets — open source, CI/CD-ready, and used by Deutsche Bahn.

Author: xhub.io Team
Published: April 10, 2026
Read time: 7 min

The Problem: Test Data Is Software Development's Blind Spot

Every developer knows this story. The feature is done, unit tests pass, everything is green. Then it hits production, and it breaks. Not because the code was wrong, but because nobody thought of that one edge case. The order with a negative quantity. February 29 in a leap year. The customer name with special characters.

The problem is rarely the code. It's the test data. Or more precisely: the lack of systematic test data.

Most teams create test data manually. A few standard cases, the obvious error scenarios, and then we hope it's enough. Spoiler: it never is.

The Method: Equivalence Class Tables

Before we talk about the tool, let's talk about the method. Because Nanook isn't just another data generator like Faker.js. It's the implementation of a proven testing methodology.

Equivalence class partitioning is a technique from systematic software testing. The idea: instead of testing every possible input value (impossible), we divide input values into classes. All values within a class are expected to be handled the same way by the system under test, so one representative per class is enough: if it works, the rest of the class is assumed to work too.

An example. An age field accepts values from 0 to 150:

| Class | Values | Expected |
| --- | --- | --- |
| Valid | 0–150 | Accepted |
| Invalid (too small) | < 0 | Rejected |
| Invalid (too large) | > 150 | Rejected |
| Invalid (not a number) | "abc" | Rejected |
| Boundary | 0, 150 | Accepted |

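Instead of testing all 151 integer values from 0 to 150, plus every conceivable invalid input, we test one representative per class and the two boundary values. Systematically. Completely. Traceably.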
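To make the table concrete, here is a minimal sketch in plain JavaScript. It does not use Nanook. The `ageClasses` list, the chosen representative values, and the `validateAge` stand-in are all assumptions made for illustration.

```javascript
// Hypothetical equivalence classes for the age field (0 to 150).
// Each class carries one representative value and the expected outcome.
const ageClasses = [
  { name: "valid", value: 42, expected: "accepted" },
  { name: "too small", value: -1, expected: "rejected" },
  { name: "too large", value: 151, expected: "rejected" },
  { name: "not a number", value: "abc", expected: "rejected" },
  { name: "lower boundary", value: 0, expected: "accepted" },
  { name: "upper boundary", value: 150, expected: "accepted" },
];

// Stand-in for the system under test: accepts integers from 0 to 150.
function validateAge(input) {
  const ok = Number.isInteger(input) && input >= 0 && input <= 150;
  return ok ? "accepted" : "rejected";
}

// Run every representative and return the names of classes whose outcome differs.
function failingClasses(classes, validate) {
  return classes
    .filter((c) => validate(c.value) !== c.expected)
    .map((c) => c.name);
}

console.log(failingClasses(ageClasses, validateAge).length); // prints 0
```

A validator with an off-by-one bug, for example `input > 0` instead of `input >= 0`, would be caught by the "lower boundary" class. That is the reason the boundaries get their own row.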

The problem: creating these tables is easy. Generating matching test data from them is tedious. That's exactly where Nanook comes in.

What Nanook Does

Nanook connects two things that have traditionally been separate: planning test cases and generating test data.

Step 1: Plan Test Cases — in a Spreadsheet

You define your equivalence classes in a regular spreadsheet. Excel, LibreOffice, Google Sheets — whatever you prefer. No new software to learn, no special syntax. Just rows and columns.

Each column is an input field. Each row defines a class of values. Nanook understands the structure and knows which combinations need to be tested.
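As an illustration of the column-per-field idea, the following sketch reads a tiny CSV export and collects the classes defined for each field. The layout shown here (header row with field names, one class label per cell, empty cells ignored) is an assumption for this example. Nanook's actual table format is described in its documentation and may differ.

```javascript
// Hypothetical CSV export of an equivalence class table.
// Header row: input fields. Each following row: one class label per field.
const csv = [
  "age,email",
  "valid,valid",
  "too small,missing @",
  "too large,",
].join("\n");

// Collect the classes defined for each field (column), skipping empty cells.
// Note: this naive split does not handle quoted cells that contain commas.
function classesPerField(text) {
  const [header, ...rows] = text
    .trim()
    .split("\n")
    .map((line) => line.split(","));
  const result = {};
  header.forEach((field, col) => {
    result[field] = rows
      .map((row) => (row[col] || "").trim())
      .filter((cell) => cell !== "");
  });
  return result;
}

console.log(classesPerField(csv));
```

Here `age` ends up with three classes and `email` with two, because the last row leaves the `email` cell empty.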

Step 2: Assign Data Generators

For each equivalence class, you define how concrete test data should be generated. Nanook includes generators for common data types:

  • Email addresses (valid and invalid)
  • Names (various character sets and lengths)
  • Dates (with boundary values and formats)
  • Numbers (ranges, decimals, signs)

Need something specific? No problem. Write custom generators in JavaScript. An IBAN generator, an article number generator matching your internal schema, a generator for your domain-specific codes — all possible.
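What the core logic of such a custom generator could look like is sketched below for German IBANs. These are plain JavaScript functions, not Nanook's generator interface, which this article does not show. All function names are hypothetical. The check digits follow the standard mod-97 rule, so the results are structurally valid, but the bank codes inside them are random and need not exist.

```javascript
// Compute the two check digits for a German IBAN from its 18-digit BBAN.
function germanIbanCheckDigits(bban) {
  // Append country code and "00" as digits (D=13, E=14): "DE00" -> "131400".
  const remainder = BigInt(bban + "131400") % 97n;
  return (98n - remainder).toString().padStart(2, "0");
}

// Valid class: build a structurally valid German IBAN.
function makeGermanIban(bban) {
  if (!/^\d{18}$/.test(bban)) throw new Error("BBAN must be 18 digits");
  return "DE" + germanIbanCheckDigits(bban) + bban;
}

// Invalid class: same BBAN, but with deliberately wrong check digits.
function makeInvalidGermanIban(bban) {
  const wrong = Number(germanIbanCheckDigits(bban)) + 1;
  return "DE" + String(wrong).padStart(2, "0") + bban;
}

// Generate a random valid IBAN; pass a seeded `random` for repeatable output.
function randomGermanIban(random = Math.random) {
  let bban = "";
  for (let i = 0; i < 18; i++) bban += Math.floor(random() * 10);
  return makeGermanIban(bban);
}

// Mod-97 validation for IBANs made of digits and uppercase letters.
function isValidIban(iban) {
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  const numeric = rearranged.replace(/[A-Z]/g, (ch) => String(ch.charCodeAt(0) - 55));
  return BigInt(numeric) % 97n === 1n;
}

console.log(makeGermanIban("370400440532013000")); // prints DE89370400440532013000
```

Pairing the valid generator with an invalid one mirrors the idea of the table: each equivalence class gets its own source of concrete values.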

Step 3: Generate and Export

One command. Nanook reads the table, combines the equivalence classes, invokes the generators, and writes the results. Choose JSON, CSV, or a custom format via a pluggable writer.

npm install @xhubio/nanook-table

The result: a complete set of test data that systematically covers all relevant equivalence classes. Reproducible. Versionable. CI/CD-ready.
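The combine-and-export step can be pictured with the sketch below. It builds the full Cartesian product of the classes and serializes the rows as JSON and CSV. This is an assumption for illustration only: the article does not state which combination strategy Nanook applies, and the `fields`, `combine`, and `toCsv` names are hypothetical.

```javascript
// Representative values per field, one entry per equivalence class (hypothetical).
const fields = {
  age: [42, -1, 151],
  email: ["a@example.com", "no-at-sign"],
};

// Build every combination of one value per field (Cartesian product).
function combine(fieldValues) {
  return Object.entries(fieldValues).reduce(
    (rows, [name, values]) =>
      rows.flatMap((row) => values.map((value) => ({ ...row, [name]: value }))),
    [{}]
  );
}

// Serialize a non-empty list of rows as CSV with a header line, quoting every cell.
function toCsv(rows) {
  const header = Object.keys(rows[0]);
  const quote = (cell) => '"' + String(cell).replace(/"/g, '""') + '"';
  const lines = rows.map((row) => header.map((key) => quote(row[key])).join(","));
  return [header.map(quote).join(","), ...lines].join("\n");
}

const rows = combine(fields);
console.log(rows.length); // prints 6
console.log(JSON.stringify(rows[0])); // prints {"age":42,"email":"a@example.com"}
console.log(toCsv(rows).split("\n")[1]); // prints "42","a@example.com"
```

A full product grows quickly with the number of fields, which is why the combinations are worth planning in the table rather than leaving them to chance.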

Why Not Just Use Faker.js?

Faker.js is great at what it does: generating realistic random data. But Faker.js solves a different problem.

| | Faker.js | Nanook |
| --- | --- | --- |
| Approach | Random data | Systematic data |
| Foundation | No methodology | Equivalence classes |
| Goal | Realistic dummy data | Complete test coverage |
| Boundary values | Random, if at all | Explicitly defined |
| Reproducible | Only with seed | Always |
| Planning | In code | In spreadsheet |

Faker.js fills a database with test data. Nanook ensures the right test cases exist. Both have their place — but they solve different problems.
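The "only with seed" and "random, if at all" rows can be illustrated with a small sketch. It uses neither Faker.js nor Nanook. It relies on mulberry32, a simple public-domain seeded random number generator, and the `randomAges` helper and `explicitAges` list are hypothetical.

```javascript
// mulberry32: a small seeded pseudo-random number generator returning values in [0, 1).
function mulberry32(seed) {
  let a = seed | 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw `count` random integer ages between 0 and 150 from a seeded generator.
function randomAges(seed, count) {
  const next = mulberry32(seed);
  return Array.from({ length: count }, () => Math.floor(next() * 151));
}

// Explicitly planned values: both boundaries and their invalid neighbours.
const explicitAges = [0, 150, -1, 151];

// Same seed, same sequence: random data is repeatable only if the seed is pinned.
console.log(randomAges(7, 5).join(",") === randomAges(7, 5).join(",")); // prints true

// Random draws stay inside 0..150, so -1 and 151 can never appear by chance.
console.log(randomAges(7, 5).some((age) => age < 0 || age > 150)); // prints false
```

The random draws may or may not hit 0 or 150, and they never produce the invalid neighbours. The explicit list contains all four values on every run.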

In Production: Deutsche Bahn

Nanook isn't a hobby project. The toolkit is used at Deutsche Bahn, where systematic testing isn't optional — it's a necessity. When booking systems, timetable data, and customer information need to work together, random test data doesn't cut it.

The combination of structured test planning in spreadsheets and automated data generation has proven itself in one of Germany's most complex IT environments.

CI/CD Integration

Nanook is a Node.js module. That means it runs everywhere Node.js runs. In your pipeline, in your pre-commit hook, in your nightly build.

# In your CI pipeline
npx nanook generate --input tests/equivalence-tables/ --output tests/data/

Generated test data becomes part of your repo. Changes to equivalence class tables automatically produce updated test data. No manual steps. No forgotten updates.
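One way to guard against forgotten updates is a drift check in the pipeline: regenerate the data and compare it with what is committed. The sketch below shows only the comparison, on in-memory strings keyed by file name. The `findDrift` function and the file names are hypothetical, and reading the files from disk is left out.

```javascript
// Compare committed test data with freshly generated data, keyed by file name.
// Returns the sorted names of files that are missing, extra, or different.
function findDrift(committed, regenerated) {
  const names = new Set([...Object.keys(committed), ...Object.keys(regenerated)]);
  return [...names].filter((name) => committed[name] !== regenerated[name]).sort();
}

// Hypothetical file contents: the age table gained a class since the last commit.
const committed = {
  "age.json": '[{"age":42}]',
  "email.json": '[{"email":"a@example.com"}]',
};
const regenerated = {
  "age.json": '[{"age":42},{"age":151}]',
  "email.json": '[{"email":"a@example.com"}]',
};

const drift = findDrift(committed, regenerated);
console.log(JSON.stringify(drift)); // prints ["age.json"]
// In a pipeline, a non-empty result would fail the job, for example via a non-zero exit code.
```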

Open Source and MIT Licensed

Nanook is MIT licensed. Free for personal and commercial use. No hidden costs, no enterprise-tier restrictions.

The complete source code is on GitHub. Fork it, extend it, contribute your own generators, or just use it.

Who Is Nanook For?

  • QA teams that need systematic test coverage, not just spot checks
  • Developers who want to generate test data as part of their CI/CD pipeline
  • Test managers who want to plan test cases in an understandable format (spreadsheet, not code)
  • Enterprise teams that need documented, reproducible test coverage

Get Started

Nanook is ready to use in minutes:

npm install @xhubio/nanook-table

Full documentation, tutorials, and API reference at nanook.xhub.io.

Source code on GitHub.


Questions about using Nanook in your project? Talk to us — we're happy to help, even if you're using the toolkit for free.

Tags: Testing, Open Source, QA, Test Data, Automation