Thursday, March 21, 2013

Useful Unix commands for data processing

I use IBM Datastage for ETL at work. Datastage has an "Execute Command Activity" that lets us run a command on the operating system, which in our case is Linux.

Below are the commands I have found handy and efficient for processing data or augmenting my data workflow.

To convert a single column into a comma separated row of values:
paste -s -d, your_input_file
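
As a quick illustration, here is a minimal sketch of that command on a throwaway file. The temporary file and its three values are made up for the demo and are not part of my actual workflow.

```shell
# Hypothetical sample data: one value per line in a temporary file.
input_file=$(mktemp)
printf '%s\n' 101 102 103 > "$input_file"

# -s joins all lines of the file into a single line;
# -d, uses a comma as the delimiter between the joined values.
csv_row=$(paste -s -d, "$input_file")
echo "$csv_row"   # prints 101,102,103

# Clean up the temporary file.
rm -f "$input_file"
```

The resulting row can be handy, for example, when a list of keys has to be dropped into an SQL IN (...) clause.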

To get last row:
tail -1 your_input_file
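
Here is a small sketch of the same idea on a made-up file (the file name and contents are assumptions for the demo). It uses tail -n 1, which should be the more portable spelling of tail -1.

```shell
# Hypothetical sample data: a header line followed by two rows.
input_file=$(mktemp)
printf '%s\n' header row1 row2 > "$input_file"

# -n 1 asks tail for only the final line of the file.
last_row=$(tail -n 1 "$input_file")
echo "$last_row"   # prints row2

# Clean up the temporary file.
rm -f "$input_file"
```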
