dnnslbrwn
10th June 2003, 18:47
I need to parse the output of a BAAN query and I would like to use awk. The problem is that some of the fields that I need to work with are strings. They contain spaces that I don't want awk to parse as field separators if they are inside the single quotes of a string.

Thoughts?

-Dennis

patvdv
10th June 2003, 18:53
Can you post some examples what the raw data looks like?

dnnslbrwn
10th June 2003, 19:16
Order Country Item ItemDescripton SafetyStk OrderQuantity
104047 ' NZ' 'part01 ' 'Desc of part 1 1000ml ' 0.000000 368.000000
104047 ' NZ' 'part02 ' 'Desc of part 2 500ml NZ ' 6500.000000 9180.000000

The first line is just a UNIX echo that I added in my script, but the 2nd and 3rd lines are part of the output... item codes changes to protect the innocent :)

The output from qptool is great for loading into excel as the text to columns wizard is able to recognize the single quote as a string delimiter and ignores spaces inside of the string... but when I need to work with the file in UNIX first.. sigh.

-Dennis

patvdv
10th June 2003, 20:06
Dennis,

Same approach could apply to AWK. You can set the field delimiter to the single quote and throw away any empty fields


cat <your_file> | awk '
BEGIN {
FS="\047"
}

{
for (i=1;i<=NF; i++) {
if ($i !~ /^ +$/ && i != 7) {
sub(/^ +/,"",$i)
print $i
}
if (i == 7) {
z=split($i,array," ")
for (y=1;y<=z; y++) {
print array[y]
}
}
}
}'


Probably not the most elegant solution but it seems to work (unless your format is not fixed). Don't underestimate the power of AWK! ;)

dnnslbrwn
10th June 2003, 21:18
Pat,

Thanks for your reply... what you posted does work - but I guess I was looking for something a bit more generic (that I can pass the output of any query through to clean it up - not fixed where only the 7th field needs extra effort).

I will spend some time on it later. I guess what I am thinking is something that still breaks fields on space - but then concatenates fields back together based on if the number of single quotes found so far is odd or even (if you have seen an odd number of quotes, then you are inside a string).

However, based on what you posted, I changed it slightly so the output looked like this...

104047 |NZ|part01 |Desc of part 1 1000ml |0.000000|368.000000|
104047 |NZ|part02 |Desc of part 2 500ml NZ |6500.000000|9180.000000|

... which I can now pass back into another AWK script and clean up into nice columns with headers and do the math that I need - THANKS.

Cheers,

-Dennis

dnnslbrwn
10th June 2003, 21:55
Well - guess I answered my own question...



awk 'BEGIN{FS="|"; instring=0}
{
for (i=1;i<=length($1);i++) {
l = substr($1,i,1)
if ( l == "\047" ) {
if ( instring == 0 )
instring = 1
else
instring = 0
}
else {
if ( l == " " && instring == 0 )
printf "|"
else
printf "%s", l
}
}
printf "\n"
}' [data file]




.... turns this....
104047 'X NZ' 'part01 ' 'Desc of part 1 1000ml ' 0.000000 368.000000
104047 ' NZ' 'part02 ' 'Desc of part 2 500ml NZ ' 6500.000000 9180.000000

.... into this.....
104047|X NZ|part01 |Desc of part 1 1000ml |0.000000|368.000000
104047| NZ|part02 |Desc of part 2 500ml NZ |6500.000000|9180.000000

Thank you for you help Patrick.

Cheers,

-Dennis