Groups | Blog | Home
all groups > sql server programming > april 2005 >

sql server programming : UPDATE using self-referencing JOIN ?


Itzik Ben-Gan
4/2/2005 9:22:57 AM
Specify the alias of the side you want to update instead of the original
table name:

UPDATE t1
SET Price = NULL
FROM test AS t1
INNER JOIN test AS t2
ON t1.Parent = t2.ID

--
BG, SQL Server MVP
www.SolidQualityLearning.com


[quoted text, click to view]

--CELKO--
4/2/2005 10:01:40 AM
Have you considered using SQL instead of dialect?

UPDATE Test
SET Price = NULL
WHERE EXISTS
(SELECT *
FROM Test AS T2
WHERE Test.parent_id = T2.node_id);

Better, use a nested set model for the hierarchy.
Axel Dahmen
4/2/2005 7:13:25 PM
Hi,

I'd like to update a self-referencing table like

CREATE TABLE test (ID INT PRIMARY KEY,Price SMALLMONEY,Parent INT REFERENCES
test (ID))
....
UPDATE test SET Price = NULL
FROM test AS t1 INNER JOIN test AS t2
ON t1.Parent = t2.ID

OK, this is not the actual statement, of course. But how can I define to use
t1.Price as update target?

I know there are alternatives to do this update. I just want to make sure
this is a design flaw and not me missing something.

TIA,
Axel Dahmen

Axel Dahmen
4/4/2005 9:58:38 PM
I did, but I was actually curious about the dialect version having some flaw
in its design.

Online Books isn't clear about the fact that a table alias can be used after
the UPDATE keyword. But Itzik helped me out here. Thanks a lot to him!

Can you enlighten me briefly on the term "nested set model"? You made me
curious... ;)

TIA,
Axel Dahmen



-----------------
"--CELKO--" <jcelko212@earthlink.net> schrieb im Newsbeitrag
news:1112464900.106186.105930@z14g2000cwz.googlegroups.com...
[quoted text, click to view]

Axel Dahmen
4/4/2005 10:01:54 PM
....ahh, forgot one:

Do you perhaps know about some case studies on the runtime behaviour of
those two different approaches (dialect vs. ANSI)? For me that would be the
only point preferring one version over the other.

Again, TIA,
Axel

--CELKO--
4/7/2005 12:22:24 PM
This is a simple example from Adam Machanic of some of the dialect
problems. Never use an alias in the UPDATE clause; it makes no sense
in the Standard SQL model, since the alas has to behave as if it were
materialized, used and then discard without ever touching the base
table.

CREATE TABLE Foo
(col_a CHAR(1) NOT NULL,
col_b INTEGER NOT NULL);

INSERT INTO Foo VALUES ('A', 0);
INSERT INTO Foo VALUES ('B', 0);
INSERT INTO Foo VALUES ('C', 0);

CREATE TABLE Bar
(col_a CHAR(1) NOT NULL,
col_b INTEGER NOT NULL);

INSERT INTO Bar VALUES ('A', 1);
INSERT INTO Bar VALUES ('A', 2);
INSERT INTO Bar VALUES ('B', 1);
INSERT INTO Bar VALUES ('C', 1);

You run this proprietary UPDATE with a FROM clause:

UPDATE Foo
SET Foo.col_b = Bar.col_b
FROM Foo INNER JOIN Bar
ON Foo.col_a = Bar.col_a;

The result of the update cannot be determined. The value of the column
will depend upon either order of insertion, (if there are no clustered
indexes present), or on order of clustering (but only if the cluster
isn't fragmented).

You can goolge for the Nested Set model or buy my book TREES &
HIERARCHIES IN SQL for the code and painful details. The basic idea is
to use pairs of integers (lft, rgt) to define the range of subtrees --
think of XML tags or parentheses in algebra.
--CELKO--
4/7/2005 12:36:41 PM
[quoted text, click to view]
other. <<

What about correct answers? What about maintaining and porting the
code? Remember that 80% or more of the cost of a system is in
maintaining it, not coding it.
Axel Dahmen
4/8/2005 12:00:00 AM
[quoted text, click to view]

Both syntaxes are equal to me, so maintaining both has equal costs. Neither
will I port. I'm using so many SQL specific features (like UDF) already. If
SQL Server 2005 would break with 2000 T-SQL that would mean a problem.

My databases are more often executed than my code is maintained or ported,
so efficient runtime behaviour is my primary goal.

Axel Dahmen
4/8/2005 12:00:00 AM
a) The example is not a self-join. Is there a way to avoid an alias in a
self-join update?

b) How would you write the UPDATE line from the example in ANSI?

c) How are you planning to make the ANSI version more predictable than the
proprietary version? The comparison in the example is not sufficient to
select a particular source row...


Thanks for the hint! I'll see for the book.



---------------
"--CELKO--" <jcelko212@earthlink.net> schrieb im Newsbeitrag
news:1112901744.284061.94790@f14g2000cwb.googlegroups.com...
[quoted text, click to view]

Axel Dahmen
4/8/2005 12:00:00 AM
I've read through the thread you gave. What I noticed was that in your
example you are ignoring the fact of double rows in the T-SQL version
whereas you are aware of it by filtering it out using MIN() in the ANSI
version.

Still I admit that the ANSI version would throw an exception whereas the
T-SQL version wouldn't.


--------
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> schrieb im
Newsbeitrag news:1112984565.457958.125160@z14g2000cwz.googlegroups.com...
[quoted text, click to view]
http://www.google.co.uk/groups?hl=en&lr=&selm=-P2dnVZ2I9M31WXdRVn-jg%40gigan
ews.com
[quoted text, click to view]

Axel Dahmen
4/8/2005 12:00:00 AM
I've read through the thread you gave. What I noticed there is that in your
example you are ignoring the fact of double rows in the T-SQL version
whereas you are aware of it by filtering it out using MIN() in the ANSI
version.

Still I admit that the ANSI version would throw an exception whereas the
T-SQL version wouldn't. But I don't think someone would intentionally
perform such an update in the first place as the column definition is so
obviously unconstrained.



---------------
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> schrieb im
Newsbeitrag news:1112984565.457958.125160@z14g2000cwz.googlegroups.com...
[quoted text, click to view]
http://www.google.co.uk/groups?hl=en&lr=&selm=-P2dnVZ2I9M31WXdRVn-jg%40gigan
ews.com
[quoted text, click to view]

Axel Dahmen
4/8/2005 12:00:00 AM
[quoted text, click to view]

That's not porting. That's just adapting to a newer version. You can never
be safe that a column name you have been using becomes a reserved word in
any new version. But tell me, how often in YOUR lifetime have you ever
ported a running, bigger project to another vendor's system?



[quoted text, click to view]

That's pathetic... And it's bullshit. After a couple of years every
application is worth being rewritten. Experience, demands and
infrastructures grow with the years. If one is denying that he proves his
ignorance.



[quoted text, click to view]

I didn't say that. In fact I was asking that some postings before.
Nonetheless, good code is also efficient code. Or would you decline that?



[quoted text, click to view]

That ist correct. But you don't have to go that far. If one legal syntax is
quicker than the other, it's sufficient to use that. Not to do that would
result in a not as good product just because you fear porting at some
nebulous day in the future... It's like going barefoot all day just because
you're afraid of replacing your wornout shoes sometime.

To make it clear: I'm not saying that one syntax is necesserily more
efficient than the other. This is still my question. But the T-SQL syntax
exists, so it's worth being used if it is more efficient. On the other hand,
if one version is error prone, then don't use it. I fully agree with this.
But please... don't preach.



[quoted text, click to view]

Pathetic again... Did I write somewhere that I insist on the T-SQL syntax?



[quoted text, click to view]

I am a programmer.

Hugo Kornelis
4/8/2005 12:00:00 AM
[quoted text, click to view]

Hi Axel,

All self-joins require an alias. That goes for the proiprietary T-SQL
UPDATE FROM syntax just as it goes for the ANSI standard SELECT syntax.

Without aliases, there is no way to know which of the versions of the
same table you are refering to.


[quoted text, click to view]

UPDATE Foo
SET col_b = (SELECT col_b
FROM Bar
WHERE col_a = Foo.col_a)
WHERE EXISTS (SELECT *
FROM Bar
WHERE col_a = Foo.col_a)

This will result in an error, because the subquery returns more than one
row. Not good - but lots better than the server just randomly picking
one of the available values without giving even as much as a hint to me
that the query was ambiguous.


[quoted text, click to view]

Go back to the design stage. Find out what the query is actually
supposed to do. Talk with the responsible managers and/or the end users
to find out what really needs to be done. And if everyone says that it
really doensn't matter which value is selected, define my own definition
(probably something involving lowest value of primary key).

Then, go back to QA and rewrite the update to match the new and improved
specs.

Best, Hugo
--

David Portas
4/8/2005 11:22:45 AM
The best way to determine performance is to try it out and see. There
are few general answers to performance questions.

I find that the convoluted proprietary UPDATE syntax is very prone to
logical errors. There is also the nasty bug-feature where it will give
indeterminate results without returning an error - silently destroying
data integrity. For these reasons I will avoid it.

http://www.google.co.uk/groups?hl=en&lr=&selm=-P2dnVZ2I9M31WXdRVn-jg%40giganews.com

Regardless of what you find convenient, someday someone else might have
to maintain your code. So clarity and standard conventions ARE
important IMO. Sometimes they are more important than small, short-term
performance gains.

--
David Portas
SQL Server MVP
--
--CELKO--
4/8/2005 12:27:11 PM
[quoted text, click to view]
costs.<<

Did you run the sample code that gives you two different results? That
is not equality by any definition -- it is right answer versus wrong
answer. What is the cost of doing a complete data audit and correcting
all the errors the proprietary version might have made in the database?


[quoted text, click to view]

Anyone with a few years in IT is laughing his head off at that
statement.

[quoted text, click to view]

You are actually saying that you do not know how to write Standard,
non-procedural SQL code and are falling back on UDFs? Does your boss
know about this? Will he pay to get you training before you do some
harm?

[quoted text, click to view]
problem. <<

Like when the outer join changed, the ALL predicate changed, the UPDATE
changed the first time, etc. People who think that they will never
port code forget that the next release of their current product can
have all kinds of changes. Vendors will retain or add standard syntax
and semantics.

People who say that they will never port are telling us that their
application is so weak, useless or badly written than nobody else will
want to move it to another platform. That is is cheaper and better to
throw it out and start over. Or perhaps they are saying that they have
no dsire for professionalism and career growth.

[quoted text, click to view]
ported, so efficient runtime behaviour is my primary goal. <<

Why do you think there is a conflict between good code and good
performance? If raw performance were the only goal, you would write in
directly machine language or use a custom optimizer designed for your
hardware (like BLISS for the old DEC computers).

Would you go to your boss and tell him "My code gets the wrong answers
some of the time, it is a bitch to maintain, will cost the company far
more than it should over the next five years and it will never move to
a new platform; but it gets those wrong answer really fast!"

You are thinking like a programmer and not like a total systems person.
David Portas
4/8/2005 12:32:05 PM
[quoted text, click to view]

Then you haven't read enough posts in this group! I've seen it in
production code as well.

You are right that the problem is obvious in my simplistic example, but
in more complex joins, maybe involving more than two tables, it's easy
to make that error. The illogical syntax that forces you to reference
the same table twice (!) doesn't help. The difference is that the ANSI
version fails safe, the Microsoft version doesn't.

--
David Portas
SQL Server MVP
--
Axel Dahmen
4/9/2005 12:00:00 AM
I perfectly agree with you, David. And as I might add, good systems
architecture and efficient runtime behaviour is no contradiction at most of
the times.

Axel


----------------------
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> schrieb im
Newsbeitrag news:1113049747.888645.135830@g14g2000cwa.googlegroups.com...
[quoted text, click to view]

David Portas
4/9/2005 5:29:07 AM
Many of us who work in corporate environments have ported plenty of
code between platforms. That doesn't necessarily mean entire systems,
although I've done that a few times too. Most enterprises mandate
development standards precisely to increase developer productivity,
encourage code re-use and extend the life of the codebase. If you think
that code has a lifetime of only two years then just look up some of
the recent debate on the VB6 lifecycle, or the pre-2000 Millennium
compliance effort that had to address many years of legacy code.

Industry Standards don't mean you should ignore proprietary features -
there are other considerations too - but "use whatever is performant"
isn't always the best policy. Performance is one factor, not
necessarily the overriding one.

In SMEs and other environments with relatively small IT assets the
short-term priorities may carry more weight. However, it's prudent to
expect change. Companies of any size may change management or be taken
over; business priorities change; careers change. Good systems
architecture and developent means anticipating change and building
systems that will last and will scale.

--
David Portas
SQL Server MVP
--
--CELKO--
4/10/2005 8:07:24 AM

[quoted text, click to view]

Then you better tell the SEI to stop calling porting. They have a
classification system of mainenance activities, such as adaptive,
corrective, etc.

[quoted text, click to view]
a reserved word in any new version. <<

Use the ISO-11179 naming conventions and put an underscore in the name.
There is an agreement that further names will be single words. Use the
list of possible future reserve words given the SQL-92 standard.

[quoted text, click to view]
running, bigger project to another vendor's system? <<

Yes; I think about five of them which I would call "running bigger
projects" and I have a few dozen systems were prototyped and then
ported. You forget that I am old, so I have worked on equipment from
the BUNCH (Burroughs, Univac, NCR, Control Data Corp, and Honeywell) in
my career. When the vendor stops making the hardware, you have to port
Medicaid, State Prison Systems, State Highway accunting, etc. Today,
you guys would be moving stuff from Windows to Linux, to Z/os, to Xbox.


[quoted text, click to view]
Experience, demands and infrastructures grow with the years. If one is
denying that he proves his ignorance. <<

Ever hear the expression "legacy system" in the trade? The truth is
that at some point, a system can become so vital, so large that it is
not economic or physically possible to do a re-write. It has to be
maintained.

Smaller sub-systems with high cohesion and low coupling can be thrown
away and replaced. That was the idea behind most of the Software
Engineering research in the 1970-80's.
AddThis Social Bookmark Button