sample
The sample command selects a random sample of rows from a CSV file.
Syntax
qsv sample <num_samples> [<input_file>]
Description
The sample command randomly selects a specified number of rows from the input file (or from stdin if no file is specified) and writes them to stdout.
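To make the idea of picking a fixed number of rows from a stream concrete, here is a minimal Python sketch of reservoir sampling, one common way to draw a uniform random sample in a single pass without knowing the row count in advance. This is an illustration only and is not taken from qsv's source. The names reservoir_sample and sample_csv, and the seed parameter, are hypothetical. The sketch assumes the first row is a header that is always kept.

```python
import csv
import io
import random


def reservoir_sample(rows, k, rng):
    """Return up to k rows chosen uniformly at random from an iterable."""
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Pick a slot in [0, i]; the new row survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


def sample_csv(text, k, seed=None):
    """Hypothetical helper: sample k data rows from CSV text, keeping the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # assumption: the first row is a header
    rng = random.Random(seed)
    return header, reservoir_sample(reader, k, rng)
```

If the input has fewer data rows than requested, this sketch simply returns all of them. Rows come back in reservoir order, which is not necessarily their order in the input.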
Options
- None specific to sample
Exit Codes
- 0: Sampling successful
- Non-zero: An error occurred
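A script that calls the command can branch on these exit codes. The following Python sketch shows one way to do that with the standard subprocess module. The helper name run_checked is hypothetical, and the commented usage assumes that qsv is on the PATH and that olympics2024.csv exists in the working directory.

```python
import subprocess
import sys


def run_checked(argv, stdin_text=None):
    """Run a command and return its stdout; raise if the exit code is non-zero."""
    result = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Example usage (assumes qsv is installed and the file exists):
#
#   try:
#       print(run_checked(["qsv", "sample", "5", "olympics2024.csv"]), end="")
#   except (RuntimeError, FileNotFoundError) as err:
#       sys.exit(f"sampling failed: {err}")
```

FileNotFoundError is caught in the usage example because subprocess.run raises it when the executable itself cannot be found, which happens before any exit code is produced.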
Examples
Basic Usage
Get a random sample of 5 rows from a CSV file:
qsv sample 5 olympics2024.csv | qsv table
Output:
Rank  Country   Country Code  Gold  Silver  Bronze  Total
23    Hungary   HUN           6     7       7       20
45    Colombia  COL           0     4       1       5
62    Portugal  POR           1     1       2       4
19    Poland    POL           4     5       5       14
56    India     IND           1     2       4       7
Common Use Cases
- Quickly inspecting large datasets
- Creating smaller test datasets for development or testing
- Reducing dataset size for faster processing or analysis
Tips
- Use sample to get a representative subset of your data
- Combine with other qsv commands for more complex data processing pipelines