replace
The replace command substitutes specific values in your dataset with new ones, useful for data cleaning and standardization.
Syntax
qsv replace [options] <old_value> <new_value> [<input_file>]
Description
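The replace command searches the input file for occurrences of <old_value> and replaces them with <new_value>. <old_value> is interpreted as a regular expression, so escape any regex metacharacters that should match literally. If no input file is specified, it reads from stdin. If no output file is specified, it writes to stdout.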
Options
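- -i, --ignore-case: Match <old_value> case-insensitively
- -s, --select <columns>: Apply replacements only to the selected columns
- -o, --output <file>: Write output to <file> instead of stdout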
Exit Codes
- 0: Replacements were made successfully
- 1: No replacements were made
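A script can branch on these exit codes. The sketch below is a minimal, hypothetical wrapper (the name replace_and_report is not part of qsv) that passes its arguments through to qsv replace and reports the outcome on stderr. Treating any code other than 0 or 1 as a failure is an assumption; this section only documents 0 and 1.

```shell
# Hypothetical wrapper around `qsv replace`; all arguments are passed through.
replace_and_report() {
  rc=0
  # Capture the exit code without aborting scripts that run under `set -e`.
  qsv replace "$@" || rc=$?
  case "$rc" in
    0) echo "replacements were made" >&2 ;;
    1) echo "no replacements were made" >&2 ;;
    *) echo "unexpected exit code: $rc" >&2 ;;  # assumption: anything else is an error
  esac
  return "$rc"
}
```

For example, `replace_and_report 'USA' 'United States' data.csv > cleaned.csv` sends the data to cleaned.csv while the status message stays on stderr.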
Examples
Basic Usage
Replace a typo in the 'nearest_metro_en' column:
qsv select 'nearest_metro_en' DLD_Transactions_English.csv |
qsv replace 'Buj Khalifa Dubai Mall Metro Station' 'Burj Khalifa Dubai Mall Metro Station' |
qsv sample 5 |
qsv table
Output:
106920
nearest_metro_en
Burj Khalifa Dubai Mall Metro Station
Rashidiya Metro Station
Rashidiya Metro Station
""
Rashidiya Metro Station
Replacing Multiple Values
You can chain multiple replace commands:
qsv select 'country' data.csv |
qsv replace 'USA' 'United States' |
qsv replace 'UK' 'United Kingdom' |
qsv sample 5 |
qsv table
Common Use Cases
- Correcting typos or inconsistencies
- Standardizing values (e.g., country names)
- Replacing missing values with placeholders
- Removing unwanted characters or strings
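Two of the use cases above, removing unwanted strings and filling in missing values, might be combined as in the following sketch. It assumes the search value is treated as a regular expression, so that `^$` matches an empty field. The function name clean_csv, the ' (draft)' suffix, and the UNKNOWN placeholder are all hypothetical.

```shell
# Hypothetical cleanup filter: reads CSV on stdin, writes CSV on stdout.
clean_csv() {
  # Strip an unwanted suffix first (parentheses escaped so they match literally),
  # then fill any empty fields with a placeholder.
  qsv replace ' \(draft\)' '' |
  qsv replace '^$' 'UNKNOWN'
}
```

Usage would look like `clean_csv < data.csv > cleaned.csv`. Stripping before filling means a field that becomes empty after the first step also receives the placeholder.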
Tips
- Always verify your replacements by sampling the output
- Matching is case-sensitive by default, so 'usa' does not match 'USA'
- Use in combination with other qsv commands for more complex data cleaning pipelines
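The first two tips can be folded into one small helper, sketched below. It assumes your qsv build supports the -i (--ignore-case) flag on replace. The function name replace_ci_and_sample and the default sample size of 5 are hypothetical choices for illustration.

```shell
# Hypothetical helper: case-insensitive replace on stdin, then preview a sample.
# $1 = old value, $2 = new value, $3 = sample size (optional, defaults to 5)
replace_ci_and_sample() {
  qsv replace -i "$1" "$2" |
  qsv sample "${3:-5}" |
  qsv table
}
```

For example, `qsv select 'country' data.csv | replace_ci_and_sample 'usa' 'United States'` would rewrite 'USA', 'usa', and 'Usa' alike and print five sampled rows for inspection.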