Monday, January 19, 2009

Using awk to transpose flat files

Here's a pretty nifty awk script that will take a flat file database that has entries segmented in stanzas rather than single lines. By stanzas I mean a group of lines, in which each entry is delimited by a blank line. Here's how it works:

Here is your flat file database:

Name:Tony Cesaro
Age:25
E-mail:acesaro@gmail.com

Name:Jim Jones
Age:22
E-mail:jim@gmail.com

Here is the file that you are trying to create:

Tony Cesaro,25,acesaro@gmail.com
Jim Jones,22,jim@gmail.com

Here is the script, utilizing the power of awk, showing how to achieve this:

#!/usr/bin/ksh

FFDB=/home/acesaro/people.txt
awk -F"\n" 'BEGIN { RS=""} {print $1" "$2" "$3" "$4}' $FFDB | awk '{print $2","$4","$6"}'

Now let's dissect this quickly:

Obvious, setting the variable:
FFDB=/home/acesaro/people.txt

Transpose the stanzas to single lines; the $1, $2, etc. are set to the values of each line in each stanza:
awk -F"\n" 'BEGIN { RS=""} {print $1" "$2" "$3" "$4}' $FFDB

And finally, use a typical awk line to only print out the values on each line that we want, using -f: to specify ":" as the delimiter and separating each variable with a comma to create a CSV:

awk -F: '{print $2","$4","$6"}'

This is a handy way to translate flat files between formats that previously seemed pretty difficult...to me at least. :)

I utilized this at work to take the flat file DBs that NerveCenter uses to store the nodes that it polls and their associated properties and convert them into a tab delimited lookup file for Netcool.

No comments: