[Tfug] Database file import/consolidation

Thu Jul 15 16:04:20 MST 2004

Choprboy wrote:

>Anybody doing regular imports into a database of a large set of partially 
>updated data? I.e. you get an updated file regularly and need to accumulate 
>the changes in the DB?
>
>I'm talking about large data sets here, ~500,000+ records... How do you do 
>efficent updates (additions really) of the DB when there is no unique key 
>across the rows (or a practical way to make one)???
>
>At the moment, to process updates, I have to take the current 500,000+ row 
>table and copy/index all the data it into a second temporary table. Then 
>import the 500,000+ rows of the update file (more than 95% of which is the 
>same) into the temporary table. Then do a SELECT DISTINCT() across the entire  
>secondary temp table and insert/index the results into a third table... After 
>I'm done, I can then quickly R/W lock the original table, drop it, and swap 
>in my newly created/consolidated table... But it obviously takes a while and 
>severely hammers the server to be building/rebuilding/destroying multiple 
>500,000+ row tables to integrate new updates.
>
>Is there an easier way to do this? Or am I out of luck?
>  
>
Is the whole database ~500,000 records?  Possibly export the data and 
remove the duplicate records with command line tools and/or scripts and 
the then just update the actually changed records.  It would probably be 
faster.

Paul Scott