[Tfug] text file help

christopher skeptikos at gmail.com
Thu Jun 26 17:15:55 MST 2008


> So... this could be VERY simple to fix. How consistent is the data? If it is 
> well formatted, you could just do a sort and uniq count. (Even if it isnt 
> completely, it will clean it up and make the differences apparent.) Using 
> your snippet of data:
[snip]

It's very consistent. The only thing I didn't think
to add to the original post was the header for each
location. Here is the printout for one whole location:

&l0H
                    Physical Inventory Detail Report for Univ. of Ariz. Bkst Valued at Standard Retail
Date: 06/20/08                        UNIVERSITY OF ARIZONA BOOKSTORES                        Page:   2
------------------------------------------------------------------------------------------------------------------------------------
PLU \ SKU            DESCRIPTION                               DCC        QTY        COST    EXT COST
------------------------------------------------------------------------------------------------------------------------------------
Fixture #: 111

   Area #: 1
660534011073         ANTENNA BALL WHITE A     /NAVY    /       492009     169        0.87      147.42
                                                                          Area Total           147.42
   Area #: 2
13671853             BOWL DOG AZ POOCHIE      /NONE    /       492099      91        3.08      280.51
                                                                          Area Total           280.51
   Area #: 3
13671853             BOWL DOG AZ POOCHIE      /NONE    /       492099     111        3.08      342.16
                                                                          Area Total           342.16
   Area #: 4
15404473             FACE PAINT 2 PK WHITE    /NONE    /       504099      38        1.06       40.40
15404480             FACE PAINT 2 PK RED      /NONE    /       504099      23        1.06       24.45
                                                                          Area Total            64.85
   Area #: 5
14318283             TATTOO GLITTER A         /NONE    /       492011      18        2.20       39.54
13361181             TATOO JEWELED CATS       /NONE    /       504009      50        1.58       78.85
                                                                          Area Total           118.39
   Area #: 6
15183095             LEASH DOG RE RIBBON A CCA/NONE    /       492099      51        7.07      360.79
                                                                          Area Total           360.79
   Area #: 7
15155207             DOG COLLAR XL            /NONE    /       492099      23        3.97       91.30
                                                                          Area Total            91.30

                                                               Fixture Total                  1405.42

The only difference is that the headers are centered on
the page.


> From your 14 lines of data you end up with 5 lines of results showing you have 
> 8 of SKU 883230063751 in location 402008 listed as a single quantity. Since 

Actually, the 402008 is the DCC, not the quantity. I
won't bore you with the details because they are pretty
irrelevant. Just look at the example above for where
you would see the location, and the area within the location.

> it is sorted by SKU/Location/Quantity (assuming the desc field is the same) 

Actually it's location/area within location/SKU, so I think
the script would look at everything between two headers
and then tally each duplicate entry of a SKU.
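That tallying idea can be sketched in awk. This is only a sketch: it assumes the wrapped report lines have first been rejoined so that each item record sits on one line (SKU, description, DCC, QTY, COST, EXT COST), with the "Fixture #:" and "Area #:" header lines intact as in the example above; `report.txt` is a placeholder filename.

```shell
# Sketch: sum quantities per fixture/area/SKU, assuming one record per line.
awk '
  /Fixture #:/ { fixture = $3 }          # remember the current fixture
  /Area #:/    { area = $3 }             # remember the current area
  $1 ~ /^[0-9]+$/ && length($1) >= 8 {   # item records start with a long SKU
      tally[fixture "/" area "/" $1] += $(NF-2)   # QTY is 3rd field from the end
  }
  END { for (k in tally) print k, tally[k] }
' report.txt
```

Keying the array on fixture/area/SKU is what keeps the location attached to each count, so duplicate SKU lines within one area get summed without mixing areas together.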

> other quantity counts of the same SKU would be in sequential lines.
> 
> Simple first step reduction in the data to group and count all the same 
> entries. 

Yes, but the locations would have to be preserved.

> Afterward add in a couple awk and sed commands, perhaps multiplying 
> the uniq count and the qty fields as a total field and then reordering the 
> columns, lather, rinse, repeat to continue consolidating the data.

Seems that way. I was hoping to get that quick and easy
here (again, not out of laziness, but out of
desperation). I've got the sed and awk O'Reilly manual
here, so I think I'll be getting into it tonight.
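For what it's worth, the sort/uniq route suggested above can be sketched like this (same assumption as before: records rejoined to one line each, `report.txt` a placeholder). `uniq -c` prefixes each repeated line with its count, and awk multiplies that count by the QTY field; note this pass on its own loses the location grouping, which is the catch already mentioned.

```shell
# Sketch: collapse identical record lines, then total units per entry.
sort report.txt | uniq -c | awk '
  $2 ~ /^[0-9]+$/ && length($2) >= 8 {   # item records start with a long SKU
      count = $1                         # how many identical lines uniq collapsed
      print $2, count * $(NF-2)          # SKU, count times QTY
  }
'
```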

Thanks, I'll let you know ~ Chris

-- 
christopher <skeptikos at gmail.com>



