replace

The replace command substitutes specific values in your dataset with new ones, which makes it useful for data cleaning and standardization.

Syntax

qsv replace [options] <old_value> <new_value> [<input_file>]

Description

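The replace command searches the input file for occurrences of <old_value> and replaces them with <new_value>. qsv interprets <old_value> as a regular expression, so escape characters such as . or ( when they should match literally. If no input file is specified, it reads from stdin. If no output file is specified, it writes to stdout.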

Options

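  • -i, --ignore-case: match <old_value> without regard to case
  • -s, --select <columns>: restrict replacements to the selected columns
  • -o, --output <file>: write the result to <file> instead of stdout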

Exit Codes

  • 0: Replacements were made successfully
  • 1: No replacements were made
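
Because exit code 1 means "nothing was replaced" rather than "something went wrong", a script that runs under set -e or treats every non-zero status as a failure will stop on an unchanged file. The following is a minimal sketch of one way to tell the cases apart. The function name describe_replace_status is hypothetical, and treating any status other than 0 or 1 as an error is an assumption of this sketch, not something stated above.

```shell
# Hypothetical helper: map an exit status from "qsv replace" to a label.
# 0 and 1 follow the Exit Codes list above; treating every other value
# as an error is an assumption of this sketch.
describe_replace_status() {
  case "$1" in
    0) echo "replaced" ;;
    1) echo "unchanged" ;;
    *) echo "error" ;;
  esac
}
```

For example, after running qsv replace 'USA' 'United States' data.csv > cleaned.csv, calling describe_replace_status $? prints replaced or unchanged. Capture $? immediately after the command, before another command overwrites it. In a pipeline, $? reflects only the last stage, so run replace on its own or as the final stage when its status matters.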

Examples

Basic Usage

Replace a typo in the 'nearest_metro_en' column:

qsv select 'nearest_metro_en' DLD_Transactions_English.csv | 
qsv replace 'Buj Khalifa Dubai Mall Metro Station' 'Burj Khalifa Dubai Mall Metro Station' |
qsv sample 5 |
qsv table

Output (the first line is the number of replacements made, which qsv reports on stderr):

106920
nearest_metro_en
Burj Khalifa Dubai Mall Metro Station
Rashidiya Metro Station
Rashidiya Metro Station
""
Rashidiya Metro Station

Replacing Multiple Values

You can chain multiple replace commands:

qsv select 'country' data.csv | 
qsv replace 'USA' 'United States' |
qsv replace 'UK' 'United Kingdom' |
qsv sample 5 |
qsv table

Common Use Cases

  • Correcting typos or inconsistencies
  • Standardizing values (e.g., country names)
  • Replacing missing values with placeholders
  • Removing unwanted characters or strings

Tips

  • Always verify your replacements by sampling the output
  • Matching is case-sensitive by default, so 'usa' does not match 'USA'; pass --ignore-case (-i) when the casing in your data varies
  • Combine replace with other qsv commands to build more complex data-cleaning pipelines
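
Sampling shows a few rows, but it cannot confirm that every occurrence is gone. As a complementary check, the sketch below counts how many lines of a saved file still contain the old value. It relies only on standard grep, the function name count_lines_containing is hypothetical, and it counts lines of raw CSV text rather than individual cells.

```shell
# Hypothetical helper: count the lines of a file that contain a literal string.
# grep -F matches the text literally; -c prints the number of matching lines.
# grep exits 1 when nothing matches, so "|| true" keeps the helper from
# aborting scripts that run under "set -e" (the count 0 is still printed).
count_lines_containing() {
  grep -cF -- "$1" "$2" || true
}
```

Calling count_lines_containing 'Buj Khalifa' on a saved copy of the data before and after the replacement should show the count dropping to 0 if every occurrence was corrected.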

See Also

  • select - for selecting specific columns
  • sample - for sampling rows from your data