[Tfug] Database file import/consolidation
Paul Scott
waterhorse at ultrasw.com
Thu Jul 15 16:04:20 MST 2004
Choprboy wrote:
>Anybody doing regular imports into a database of a large set of partially
>updated data? I.e. you get an updated file regularly and need to accumulate
>the changes in the DB?
>
>I'm talking about large data sets here, ~500,000+ records... How do you do
>efficent updates (additions really) of the DB when there is no unique key
>across the rows (or a practical way to make one)???
>
>At the moment, to process updates, I have to take the current 500,000+ row
>table and copy/index all the data it into a second temporary table. Then
>import the 500,000+ rows of the update file (more than 95% of which is the
>same) into the temporary table. Then do a SELECT DISTINCT() across the entire
>secondary temp table and insert/index the results into a third table... After
>I'm done, I can then quickly R/W lock the original table, drop it, and swap
>in my newly created/consolidated table... But it obviously takes a while and
>severely hammers the server to be building/rebuilding/destroying multiple
>500,000+ row tables to integrate new updates.
>
>Is there an easier way to do this? Or am I out of luck?
>
>
Is the whole database ~500,000 records? Possibly export the data and
remove the duplicate records with command line tools and/or scripts and
the then just update the actually changed records. It would probably be
faster.
Paul Scott
More information about the tfug
mailing list