A CLI tool for viewing, querying, and converting tabular data files. Supports AWS, Azure, and Google Cloud Storage URLs.

Supported formats:

- JSONL
- CSV
- TSV
- Parquet
- Avro

Display rows from a tabular data file:

    tab view data.csv

Output to different formats:

    tab view data.parquet -o jsonl
    tab view data.parquet -o csv

Show a file's schema or a summary of its contents:

    tab schema data.parquet
    tab summary data.parquet

Run SQL queries on your data. The table is referenced as `t`:

    tab sql 'SELECT * FROM t WHERE Metric_A_Value > 80' test.csv

Convert between formats:

    tab convert data.csv data.parquet
    tab convert data.parquet data.jsonl -o jsonl

Write partitioned output:

    tab convert data.csv output_dir/ -o parquet -n 4

Concatenate multiple files into one output:

    tab cat data1.csv data2.csv data3.csv -o jsonl > output.jsonl

| Option | Description |
|---|---|
| `-i` | Input format (parquet, csv, tsv, jsonl). Auto-detected from extension. |
| `-o` | Output format (parquet, csv, tsv, jsonl). |
| `--limit` | Maximum number of rows to display. |
| `--skip` | Number of rows to skip from the beginning. |
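
The `--skip` and `--limit` options can be combined to page through a large file. The sketch below uses a hypothetical helper, `page_args` (not part of tab), that turns a 1-based page number and a page size into the two flags. It assumes both options are accepted by `tab view`; `data.csv` is a placeholder file name.

```shell
# Hypothetical helper, not part of tab: print the --skip/--limit flags
# for a 1-based page number and a page size.
page_args() {
  local page=$1 size=$2
  echo "--skip $(( (page - 1) * size )) --limit $size"
}

# Page 3 with 10 rows per page starts after the first 20 rows.
args=$(page_args 3 10)

# Run tab only if it is installed and the placeholder file exists.
if command -v tab >/dev/null 2>&1 && [ -f data.csv ]; then
  # $args is intentionally unquoted so it splits into separate flags.
  tab view data.csv $args
fi
```

Computing the offset from the page number keeps the two flags consistent, so a caller only has to track which page it is on.
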

| Option | Description |
|---|---|
| `-n` | Number of output partitions. Creates a directory with part files. |
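
As a sketch of how `-n` might be used in a batch job, the loop below converts each CSV file in the current directory into its own partitioned Parquet directory. The `out_dir_for` helper and the `_parts/` naming scheme are assumptions made for illustration; only the `tab convert <input> <dir>/ -o parquet -n 4` invocation comes from the usage shown in this README.

```shell
# Hypothetical helper, not part of tab: derive an output directory name
# from an input file, e.g. data.csv -> data_parts/
out_dir_for() {
  local base
  base=$(basename "$1")
  echo "${base%.*}_parts/"
}

# Convert every CSV in the current directory, if tab is installed.
if command -v tab >/dev/null 2>&1; then
  for f in ./*.csv; do
    [ -e "$f" ] || continue   # no CSV files: the glob stays unexpanded
    tab convert "$f" "$(out_dir_for "$f")" -o parquet -n 4
  done
fi
```

The names of the part files inside each directory are not documented here, so downstream steps should list the directory rather than assume a naming pattern.
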