Plyworks

csvops

csvops — CSV Inspection for Operators

A fast, local CLI for inspecting CSV files and surfacing structural issues. Designed for operators who need answers immediately — no schemas, notebooks, or setup.

The Problem

Every operational workflow eventually produces a CSV that someone needs to make sense of. Accounting exports, production logs, vendor data feeds, compliance reports. Before you can use the data, you need to know what you’re looking at: what’s in it, what’s missing, what’s broken, and whether anything changed since last time.

The existing tools require too much setup. Pandas means a Python environment and usually a notebook. Excel chokes on large files. Database imports need schema definitions. For an operator who just needs to know “is this file clean enough to use,” the overhead kills the momentum.

How It Works

One command. Immediate answers.

csvops profile data.csv

The tool reads the file and reports everything an operator needs: delimiter detection, column types, missing values, numeric statistics, cardinality, top values, mixed-type warnings, and outlier flags. No configuration files. No database connections. No cloud accounts.

Key Capabilities

  • Delimiter detection — auto-detects comma, tab, pipe, and semicolon from the first 8KB
  • Type inference — classifies columns as integer, float, boolean, datetime, or string
  • Missing value detection — counts empties, NA, NULL, None, NaN with per-column percentages
  • Numeric statistics — min, max, mean, standard deviation, percentiles via streaming algorithms
  • Cardinality estimation — exact counts up to 10K distinct values, HyperLogLog approximation above that
  • Mixed-type warnings — alerts when a column contains multiple data types
  • Outlier flagging — surfaces values far outside the normal range
  • Drift detection — groups rows by a time column to surface how the data changes across periods
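
As a rough illustration of the first three capabilities, the sketch below guesses a delimiter from the first 8 KB and classifies individual cells, including the missing-value tokens listed above. It is not taken from the csvops source: the names `sniff_delimiter`, `count`, `classify`, and `CellKind` are hypothetical, the consistency score is an assumed heuristic, quoted fields are ignored, and datetime detection is left out so the example needs only the standard library.

```rust
/// Candidate delimiters from the capability list: comma, tab, pipe, semicolon.
const CANDIDATES: [u8; 4] = [b',', b'\t', b'|', b';'];

/// Count occurrences of `delim` in one line (quoting is ignored in this sketch).
fn count(line: &[u8], delim: u8) -> usize {
    line.iter().filter(|&&b| b == delim).count()
}

/// Hypothetical heuristic: look only at the first 8 KB and prefer the
/// delimiter that appears on the first line and shows the same per-line
/// count on the most lines. Falls back to a comma when nothing matches.
fn sniff_delimiter(bytes: &[u8]) -> u8 {
    let sample = &bytes[..bytes.len().min(8 * 1024)];
    let lines: Vec<&[u8]> = sample
        .split(|&b| b == b'\n')
        .filter(|l| !l.is_empty())
        .collect();

    let mut best = (b',', 0usize);
    for &delim in CANDIDATES.iter() {
        let first = lines.first().map_or(0, |l| count(l, delim));
        if first == 0 {
            continue; // delimiter absent from the header line
        }
        let consistent = lines.iter().filter(|&&l| count(l, delim) == first).count();
        let score = consistent * first;
        if score > best.1 {
            best = (delim, score);
        }
    }
    best.0
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum CellKind {
    Missing,
    Integer,
    Float,
    Boolean,
    Text,
}

/// Classify a single cell. Missing tokens are checked first because
/// "NaN" would otherwise parse successfully as an f64.
fn classify(cell: &str) -> CellKind {
    let t = cell.trim();
    if t.is_empty() || ["NA", "NULL", "None", "NaN"].contains(&t) {
        return CellKind::Missing;
    }
    if t.parse::<i64>().is_ok() {
        return CellKind::Integer;
    }
    if t.parse::<f64>().is_ok() {
        return CellKind::Float;
    }
    if t.eq_ignore_ascii_case("true") || t.eq_ignore_ascii_case("false") {
        return CellKind::Boolean;
    }
    CellKind::Text
}
```

Under these assumptions, `sniff_delimiter(b"a;b;c\n1;2;3\n")` picks the semicolon and `classify(" NULL ")` reports a missing value. A mixed-type warning could then be raised whenever a single column yields more than one non-missing kind.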

Built With

Rust. Streaming architecture for constant-memory processing on large files. Reservoir sampling for percentile estimation. Single-pass statistics.
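
The single-pass statistics and reservoir sampling mentioned here can be sketched with the standard library alone. This is an illustration under stated assumptions, not the csvops implementation: `RunningStats` uses Welford's online algorithm, `Reservoir` uses the classic Algorithm R, and the small `XorShift64` generator stands in for a real random number crate. All three type names are hypothetical.

```rust
/// Running count, mean, variance, min, and max in one pass (Welford's algorithm).
struct RunningStats {
    n: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the running mean
    min: f64,
    max: f64,
}

impl RunningStats {
    fn new() -> Self {
        Self { n: 0, mean: 0.0, m2: 0.0, min: f64::INFINITY, max: f64::NEG_INFINITY }
    }

    fn push(&mut self, x: f64) {
        self.n += 1;
        let delta = x - self.mean;
        self.mean += delta / self.n as f64;
        self.m2 += delta * (x - self.mean);
        self.min = self.min.min(x);
        self.max = self.max.max(x);
    }

    /// Sample standard deviation (n - 1 denominator); None for fewer than two values.
    fn std_dev(&self) -> Option<f64> {
        if self.n < 2 {
            None
        } else {
            Some((self.m2 / (self.n - 1) as f64).sqrt())
        }
    }
}

/// Tiny xorshift generator so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fixed-size uniform sample of a stream (Algorithm R).
struct Reservoir {
    capacity: usize,
    seen: u64,
    sample: Vec<f64>,
    rng: XorShift64,
}

impl Reservoir {
    fn new(capacity: usize, seed: u64) -> Self {
        // xorshift must not start from zero, so the seed is clamped to at least 1.
        Self { capacity, seen: 0, sample: Vec::with_capacity(capacity), rng: XorShift64(seed.max(1)) }
    }

    fn push(&mut self, x: f64) {
        self.seen += 1;
        if self.sample.len() < self.capacity {
            self.sample.push(x);
        } else {
            // Keep the new value with probability capacity / seen.
            // The modulo introduces a slight bias that a sketch can tolerate.
            let j = (self.rng.next_u64() % self.seen) as usize;
            if j < self.capacity {
                self.sample[j] = x;
            }
        }
    }

    /// Nearest-rank percentile over the sample; exact only while the
    /// stream is no longer than the reservoir.
    fn percentile(&self, p: f64) -> Option<f64> {
        if self.sample.is_empty() {
            return None;
        }
        let mut sorted = self.sample.clone();
        sorted.sort_by(|a, b| a.total_cmp(b));
        let rank = (p.clamp(0.0, 100.0) / 100.0 * (sorted.len() - 1) as f64).round() as usize;
        Some(sorted[rank])
    }
}
```

Both structures hold a fixed amount of state regardless of how many rows pass through `push`, which is the property a constant-memory streaming design depends on. The reservoir capacity trades memory for percentile accuracy.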

Status

Active. Open source under the MIT license.
