I usually find it helpful to have the table enforce uniqueness in
circumstances like yours:
create unique index IX_tbl_FINAL on tbl_FINAL ( fld_FIRST, fld_SECOND,
fld_THIRD )
In addition to preventing duplicates from being inserted, it also prevents
duplicates from occuring due to modification of existing values.
In answer to #2, it depends on how records are "appended." If it is via an
INSERT and the set includes a row that would violate the unique constraint,
then the entire INSERT will be rolled back.
You could eliminate the duplicates during pre-processing. Either delete them
or flag them (in tbl_INITIAL) with a DoNotExport column (default 0). Set the
flag with something like:
update tbl_INITIAL
set DoNotExport = 1
from tbl_INITIAL i join tbl_FINAL f on
f.fld_FIRST = i.fld_FIRST and f.fld_SECOND = i.fld_SECOND and f.fld_THIRD =
i.fld_THIRD
There are many other ways to accomplish this step, and if perf is an issue,
you might need to experiment to find the fastest.
The append would then look like:
insert into tbl_FINAL
( fld_FIRST, fld_SECOND, fld_THIRD )
select fld_FIRST, fld_SECOND, fld_THIRD
from tbl_INITIAL
where DoNotExport = 0
These suggestions are untested.
et
[quoted text, click to view] "T Carnahan" wrote:
> I am new to SQL Server and am having a problem trying to figure out the best way to approach it.
>
> I am working with SQL Server 2000.
>
> My database has two tables of interest with fields that are 60% identical (that is, most of the
>
> fields have the same data type and context, although they have different field names). For the purpose
>
> of this inquiry, let me call the first table "tbl_INITIAL" and the second, "tbl_FINAL". Table
>
> "tbl_INITIAL" is a temp table that is used to hold the contents of imported data for some
>
> pre-processing. It is populated by a DTS package that imports a .CSV file. It is preferable that it not
>
> be constrained until all pre-processing is finished.
>
> "tbl_FINAL" has 10 of its 15 fields that have "identical" counterparts in "tbl_INITIAL". By
>
> identical, I mean they have the same structure and context, but have different names. The remaining 5
>
> fields in "tbl_FINAL" have no bearing on this discussion. "tbl_INITIAL" has records that I want to
>
> "append" to "tbl_FINAL".
>
> I need to make sure that no record in "tbl_INITIAL" with the "combination" of the following 3
>
> fields, fld_FIRST, fld_SECOND, and fld_THIRD, are duplicated in "tbl_FINAL" during insertion or
>
> modification. All 3 of these fields are varchar. It is not important or desired to notify the user when
>
> one of these records is prohibited from being inserted.
>
> I was thinking of 3 approaches, all of which are new areas of learning for me:
>
> 1) Write a procedure (view, stored proc, or function???) to insert the records, but check for and
>
> prevent "3-field combination" duplicates from being committed. I don't think this help when a record in
>
> "tbl_FINAL" is modified.
>
> 2) Modify the structure of "tbl_FINAL" so that the combination of the 3 fields is UNIQUE. I don't know
>
> what the consequences of this would be during batch updates. Would I get errors that would stop the
>
> process, or would it just prevent the insert and keep going?
>
> 3) Build a trigger that would check for this condition on insert or modify and prevent the action if a
>
> duplicate, 3-field combination shows up.
>
> Needless to say, I don't know which approach is best OR how to write the code to do any them.
>
> Can someone suggest which of the 3 or combination of the 3 approaches is the best way to go?
>
> Could you point me to some examples?
>
> Thanks ahead of time! [:)]
>
>
>
>
>
> **********************************************************************
> Sent via Fuzzy Software @
http://www.fuzzysoftware.com/ > Comprehensive, categorised, searchable collection of links to ASP & ASP.NET resources...
Hi
You could cover the three fields with a unique index, or possibly make them
the primary key.
You may hit problems with NULLs and NULLABILITY but without DDL and example
data it would be speculation!
tbl_INITIAL is quite often called a staging table, temp tables can have a
different meaning in SQL Server.
If you wrote a stored procedure to do the inserts/updates you can add a step
to your DTS package that will execute it.
To Update existing rows (assuming extra columns cola and colb)
UPDATE F
SET cola = I.cola
colb = I.colb
FROM tbl_Final F
JOIN tbl_Initial I ON F.fld_FIRST = I.fld_FIRST AND F.fld_SECOND =
I.fld_SECOND AND F.fld_THIRD = I.fld_THIRD
To Insert into tbl_Final
INSERT INTO tbl_Final ( fld_FIRST, fld_SECOND, fld_THIRD, cola, colb )
SELECT fld_FIRST, fld_SECOND, fld_THIRD, cola, colb
FROM tbl_Iniatial I
WHERE NOT EXISTS ( SELECT * FROM tbl_Final F
WHERE F.fld_FIRST = I.fld_FIRST AND F.fld_SECOND = I.fld_SECOND AND
F.fld_THIRD = I.fld_THIRD )
John
[quoted text, click to view] "T Carnahan" wrote:
> I am new to SQL Server and am having a problem trying to figure out the best way to approach it.
>
> I am working with SQL Server 2000.
>
> My database has two tables of interest with fields that are 60% identical (that is, most of the
>
> fields have the same data type and context, although they have different field names). For the purpose
>
> of this inquiry, let me call the first table "tbl_INITIAL" and the second, "tbl_FINAL". Table
>
> "tbl_INITIAL" is a temp table that is used to hold the contents of imported data for some
>
> pre-processing. It is populated by a DTS package that imports a .CSV file. It is preferable that it not
>
> be constrained until all pre-processing is finished.
>
> "tbl_FINAL" has 10 of its 15 fields that have "identical" counterparts in "tbl_INITIAL". By
>
> identical, I mean they have the same structure and context, but have different names. The remaining 5
>
> fields in "tbl_FINAL" have no bearing on this discussion. "tbl_INITIAL" has records that I want to
>
> "append" to "tbl_FINAL".
>
> I need to make sure that no record in "tbl_INITIAL" with the "combination" of the following 3
>
> fields, fld_FIRST, fld_SECOND, and fld_THIRD, are duplicated in "tbl_FINAL" during insertion or
>
> modification. All 3 of these fields are varchar. It is not important or desired to notify the user when
>
> one of these records is prohibited from being inserted.
>
> I was thinking of 3 approaches, all of which are new areas of learning for me:
>
> 1) Write a procedure (view, stored proc, or function???) to insert the records, but check for and
>
> prevent "3-field combination" duplicates from being committed. I don't think this help when a record in
>
> "tbl_FINAL" is modified.
>
> 2) Modify the structure of "tbl_FINAL" so that the combination of the 3 fields is UNIQUE. I don't know
>
> what the consequences of this would be during batch updates. Would I get errors that would stop the
>
> process, or would it just prevent the insert and keep going?
>
> 3) Build a trigger that would check for this condition on insert or modify and prevent the action if a
>
> duplicate, 3-field combination shows up.
>
> Needless to say, I don't know which approach is best OR how to write the code to do any them.
>
> Can someone suggest which of the 3 or combination of the 3 approaches is the best way to go?
>
> Could you point me to some examples?
>
> Thanks ahead of time! [:)]
>
>
>
>
>
> **********************************************************************
> Sent via Fuzzy Software @
http://www.fuzzysoftware.com/ > Comprehensive, categorised, searchable collection of links to ASP & ASP.NET resources...
Don't see what you're looking for? Try a search.