all groups > sql server (alternate) > january 2004 >
Dave, Try www.TechnicalVideos.net. The videos on triggers and Stored Procedures will help you a lot. There is also a video specifically dealing with cursors. Hope this helps, Chuck Conover www.TechnicalVideos.net [quoted text, click to view] "Dave Anderson" <anderdw2@cvn.net> wrote in message news:ELORb.112$uM2.98@newsread1.news.pas.earthlink.net... > Can anyone point me to a good resource for learning cursors in MSSQL? > > Thanks > > Dave > >
Can anyone point me to a good resource for learning cursors in MSSQL? Thanks Dave
David, I'm not disagreeing exactly, but I would be interested in your opinion. Typically I use cursors to load data from an outside source. Since you can't control what that outside data looks like, I will do the following: - load data into a temp table - cursor thru the temp table, verifying data types and generally massaging the data, if needed - insert into our production table(s) if the data passes examination (done in the cursor) I'm not certain if you'd consider this an administrative task. Some of the data checking we do would be hard or impossible to do using a set-based structure, I think. Appreciate your views. Best regards, Chuck [quoted text, click to view] "David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in message news:JbWdnTdWGp4je4rd4p2dnA@giganews.com... > Remember that you should generally try to avoid using cursors at all. They > are not usually a good solution to a problem in SQL because of their > performance and resource constraints compared to the set-based alternatives. > > My view is that the majority of cursors fall into one of three categories: > Procedural administrative tasks (a reasonable use of a cursor); Written by > procedural programmers who don't know SQL; Used to work around a poor data > design (e.g. lack of keys in tables). There are other cases but they are few > and far between in my experience. > > -- > David Portas > SQL Server MVP > -- > >
Remember that you should generally try to avoid using cursors at all. They are not usually a good solution to a problem in SQL because of their performance and resource constraints compared to the set-based alternatives. My view is that the majority of cursors fall into one of three categories: Procedural administrative tasks (a reasonable use of a cursor); Written by procedural programmers who don't know SQL; Used to work around a poor data design (e.g. lack of keys in tables). There are other cases but they are few and far between in my experience. -- David Portas SQL Server MVP --
"Chuck Conover" <cconover@commspeed.net> wrote in news:1075323791.146752@news.commspeed.net: [quoted text, click to view] > David, > I'm not disagreeing exactly, but I would be interested in your > opinion. Typically I use cursors to load data from an outside > source. Since you can't control what that outside data looks like, > I will do the following: > > - load data into a temp table> - cursor thru the temp table, verifying data types and generally > massaging the data, if needed > - insert into our production table(s) if the data passes > examination (done in the cursor)
Here's what I do: 1) bcp the data to load into a staging table 2) using set-based logic to validate against master tables putting the 'clean' results into a 'good' table and 'bad' data into another (for later reporting) 3) once all the validation is done, a single insert from the 'good' table is used to slam all the data into the target table. -- Pablo Sanchez - Blueoak Database Engineering, Inc
As Pablo says, many data cleansing and transformation operations can be performed using set-based statements, even if you don't have keys in your source data. For the rest I would use a dedicated ETL tool. DTS is the component that ships with SQLServer but there are also other packages specifically designed to automate the type of data prep you are describing and then load the cleansed data into your database. This approach has lots of advantages over hand-coded conversions in an RDBMS that isn't really designed and optimised for the job. http://pervasive.datajunction.com/djcosmos/ http://www.embarcadero.com/products/dtstudio/index.html http://www.informatica.com -- David Portas SQL Server MVP --
David Portas (REMOVE_BEFORE_REPLYING_dportas@acm.org) writes: [quoted text, click to view] > My view is that the majority of cursors fall into one of three > categories: Procedural administrative tasks (a reasonable use of a > cursor); Written by procedural programmers who don't know SQL; Used to > work around a poor data design (e.g. lack of keys in tables). There are > other cases but they are few and far between in my experience.
I'll add one more case: you already have a stored procedures that performs some complex logic, and this procedure accepts its input in scalar parameters, and you want to apply that logic to entire set of data. -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at
[quoted text, click to view] >> My view is that the majority of cursors fall into one of three
categories: Procedural administrative tasks (a reasonable use of a cursor); Written by procedural programmers who don't know SQL; Used to work around a poor data design (e.g. lack of keys in tables). There are other cases but they are few and far between in my experience. << The only other one that comes to mind is an NP complete problem (traveling salesman, etc.) where you can use the first near-optimal answer. Being a set-oriented language, SQL tends to find the **entire set** of solutions and that can take a **lot** of time. Thank god you
Erland Sommarskog <sommar@algonet.se> wrote in news:Xns947F2060989AYazorman@127.0.0.1: [quoted text, click to view] > David Portas (REMOVE_BEFORE_REPLYING_dportas@acm.org) writes: > > I'll add one more case: you already have a stored procedures that > performs some complex logic, and this procedure accepts its input > in scalar parameters, and you want to apply that logic to entire > set of data.
I'll counter the above with the fact that you're using the wrong tool for the job. The SP was written to handle single row inserts therefore if you plan on doing batch inserts, then the SQL should be written and tuned accordingly. Unless you wish this side of sucky performance. :) -- Pablo Sanchez - Blueoak Database Engineering, Inc
Pablo Sanchez (honeypot@blueoakdb.com) writes: [quoted text, click to view] > Erland Sommarskog <sommar@algonet.se> wrote in > news:Xns947F2060989AYazorman@127.0.0.1: >> I'll add one more case: you already have a stored procedures that >> performs some complex logic, and this procedure accepts its input >> in scalar parameters, and you want to apply that logic to entire >> set of data. > > I'll counter the above with the fact that you're using the wrong > tool for the job. The SP was written to handle single row inserts > therefore if you plan on doing batch inserts, then the SQL should be > written and tuned accordingly. Unless you wish this side of sucky > performance. :)
I'm not talking of simple SPs that inserts a single row into a table. I'm talking about stored procedures with more than 1500 lines of code, and which calls plenty of over procedures, activating another 1500 lines of code. Those lines of code include important business rules, that you don't want to have duplicated in a scalar version of the procedure and a table-oriented one. And when many of the calls to the procedure for busieness reasons are in fact one off, there may be a performance penalty of the procedure is rewritten to be table-oriented. -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at
Most of our data comes from outside sources in which we have no control over the quality or accuracy of the data. As a business, we felt that it was more important to get 99% of the data into our systems, then deal with those rows that fail afterwards. Since I was unable to figured out a way to do this as sets (since a set either all works or all fails together) I have had to cursor processing all over the place. I also make use of stored procedures which (as far as I know) cannot be passed in data sets, only scalar
Another thought (more of a question). When you need to provide logging (auditing) of each transaction the easiest way is to run through each row. Do the action, then log it. Go to next row. Am I wrong? What are the alternatives? I provide a real life situation as an example. Derivatives (futures, options, and future options) are stocks that expire at some date. After their expiry date, I need to mark the record in our Security Master table as being inactive. I want a log of each security that was marked as inactive. If I do not use a cursor, how else can I do it? The only other way I can think our is do perform the query twice (one select, then one update) inside a begin / end transaction. The select statement would give me info to output to a log file and the update
Erland Sommarskog <sommar@algonet.se> wrote in news:Xns94803655A9EEYazorman@127.0.0.1: [quoted text, click to view] > I'm not talking of simple SPs that inserts a single row into a
Nope, I didn't assume that you were ... I assumed complicated beasts. [quoted text, click to view] > Those lines of code include important business rules, that you don't > want to have duplicated in a scalar version of the procedure and a > table-oriented one.
If the batch processing is important, you do. If it's not something that requires to be loaded within a time constraint, I completely agree with you. [quoted text, click to view] > And when many of the calls to the procedure for busieness reasons > are in fact one off, there may be a performance penalty of the > procedure is rewritten to be table-oriented.
For example? -- Pablo Sanchez - Blueoak Database Engineering, Inc
[quoted text, click to view] > After their expiry date, I need to mark the record in our Security > Master table as being inactive.
Why? If the expiry date is recorded in your system then you already *know* whether a stock has expired or not based on the current date and time. An active / inactive column would just be redundant data. Put the status in a view if you like - not in a table. [quoted text, click to view] > The only other way I can think our is do perform the query twice (one > select, then one update)
Yes. But you would need two statements whether you do it in a cursor or not. It should be quicker without a cursor. [quoted text, click to view] > inside a begin / end transaction. The select
If you really wanted to do it you don't need a transaction but I don't see the point of the AuditLog unless Stocks are going to change Inactive -> Active as well as Active -> Inactive. On limited info here's a guess: INSERT INTO AuditLog (status, col1, col2, .... ) SELECT 'inactive', col1, col2, .... FROM Stocks WHERE expirydate <= CURRENT_TIMESTAMP UPDATE Stocks SET status = 'inactive' WHERE status = 'active' AND EXISTS (SELECT * FROM AuditLog WHERE status = 'inactive' AND pkcol = Stocks.pkcol) -- David Portas SQL Server MVP --
[quoted text, click to view] > Most of our data comes from outside sources in which we have no > control over the quality or accuracy of the data. As a business, we > felt that it was more important to get 99% of the data into our > systems, then deal with those rows that fail afterwards.
See my reply earlier in this thread. There are tools specifically designed to solve this problem. -- David Portas SQL Server MVP --
Pablo Sanchez (honeypot@blueoakdb.com) writes: [quoted text, click to view] > Erland Sommarskog <sommar@algonet.se> wrote in > news:Xns94803655A9EEYazorman@127.0.0.1: >> And when many of the calls to the procedure for busieness reasons >> are in fact one off, there may be a performance penalty of the >> procedure is rewritten to be table-oriented. > > For example?
We did take to task to rewrite one of our procedures to be table-oriented. We did this, because this one creates an account transaction, updates positions, balances and a whole lot more things. The scope for the database transaction for this may be a singe account transaction, for instance a simple deposit of money. The scope may also be over 50000 account transactions, for instance capitalization of interest, or a corporate action in a major company like Ericsson. The outcome of this adventure is that we can now rewrite the multi- transaction updaets to be set-based and be a lot faster than before. But anything that is still one-by-one due to legacy is now slower, say one second instead of 200 ms. Instead of having single values in variables, it is now in 43 table variables, and that is of course slower. -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at
David Portas (REMOVE_BEFORE_REPLYING_dportas@acm.org) writes: [quoted text, click to view] >> After their expiry date, I need to mark the record in our Security >> Master table as being inactive. > > Why? If the expiry date is recorded in your system then you already *know* > whether a stock has expired or not based on the current date and time. An > active / inactive column would just be redundant data. Put the status in a > view if you like - not in a table.
It isn't that easy. I don't know Jason's business, but I know my own. When an instrument has expired, it should indeed be inactivated, or deregistered to use the terminology in our system. But you cannot deregister if there are still are positions or unsettled trades. Even if the instrument has expired, you may still have to register transactions in it. For instance, you may not until now discover that you have registered a trade for the wrong account, and have to cancel and create a replacement note. So even if the instrument is expired, it should still be fully valid in transactions - but of course there should be validation that you don't specify a trade date after expiration. Once everything has been cleared up, all trades resulting from expiration has been registered, and all unused options has been booked out, you can deregister the instrument. But there is of course no reason to use a cursor just because of this. A very simple-minded approach is: INSERT #temp(id) SELECT id FROM instruments WHERE -- should be deregistered UPDATE instruments SET deregdate = getdate() FROM instruments ins JOIN #temp t ON ins.id = t.id INSERT audirlog (...) SELECT ... FROM #temp ... -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at
Erland Sommarskog <sommar@algonet.se> wrote in news:Xns94814EAFE04FYazorman@127.0.0.1: [quoted text, click to view] > The outcome of this adventure is that we can now rewrite the > multi- transaction updaets to be set-based and be a lot faster > than before. But anything that is still one-by-one due to legacy > is now slower, say one second instead of 200 ms. Instead of having > single values in variables, it is now in 43 table variables, and > that is of course slower.
[ I don't know if you want to pursue it further so if you don't respond, I'll assume not. ] The fact that there's some legacy sounds like that legacy code needs to be refactored as well. It's only natural that that's what needs to be done when taking row-at-a-time and converting to set-based. -- Pablo Sanchez - Blueoak Database Engineering, Inc
Pablo Sanchez (honeypot@blueoakdb.com) writes: [quoted text, click to view] > The fact that there's some legacy sounds like that legacy code needs > to be refactored as well. It's only natural that that's what needs > to be done when taking row-at-a-time and converting to set-based.
Needs to is a relative item. I don't know how much work it took to do rewrite that particular stored procedure, but I seem to recall that my time estimate was 100 hours. In a company like hours, 100 hours is not something you take out of thin air. Some of these jobs running one-by-one have been rewritten, rest assured. But there is at least one process where a rewrite will take another 100 hours, maybe more.) (This one is however not keeping all calls in one database transaction.) And I would not expect this to happen to either we have someone coughing up money for it, or a customer yelling loud enough about the performance. (The latter is of course much more likely than the former.) Anyway, this particular procedure that we rewrote had to be rewritten, for our system being able to scale. But I have another case, where most calls are one at a time, and where only one functions make sucessive calls, and this function is used by one customer only. I'd be cautious before I init a rewrite here. -- Erland Sommarskog, SQL Server MVP, sommar@algonet.se Books Online for SQL Server SP3 at
Thanks for the imput guys. I thought about auditing everything as a step and then performing the action as another step but I felt it was somehow "wrong". Erland summed it up nicely about the inactive status and unprocessed trades. To add to that, there are other status's (like Halted) that can be applied to a security. That is another reason for the Status
Don't see what you're looking for? Try a search.
|
|
|