all groups > sql server programming > january 2007 >
You're in the

sql server programming

group:

Which One Is Faster


Re: Which One Is Faster Steve
1/2/2007 8:06:20 AM
sql server programming:
Did you SET STATISTICS TIME and STATISTICS IO (if you're using Query
Editor) both on and check CPU time and reads or do the same with
Profiler?

[quoted text, click to view]
Re: Which One Is Faster Roy Harvey
1/2/2007 1:11:59 PM
The example you give using copied tables is not completely valid
unless they are the same size and indexed the same way as in
production.

So how many rows are in the table when it takes 8 minutes? How many
are being deleted?

An index on [date] might speed things up IF the number of rows being
deleted is a small percentage of the entire table. Delete rows older
than 90 days when run daily would only delete one day of data. It
will not help if the number of rows deleted is a large percentage of
the table; run it after 180 days and it would have half the table to
delete.

Roy Harvey
Beacon Falls, CT

[quoted text, click to view]
Re: Which One Is Faster Mike C#
1/2/2007 2:34:05 PM
The second query should be more efficient; i.e.:

declare @d datetime
set @d = dateadd(day,-90,getdate())
delete from table where [date] < @d

The reason is that this query:

Delete from Table where datediff(dd,[date],getdate()) > 90

Has to perform a DATEDIFF() calculation on every single row of your table.

The more efficient version only performs one DATEADD calculation and
comapres that result to the [Date] column. Beyond that your indexes,
statistics, etc., are probably the biggest factors if you think the results
are inconsistent each time you run the query.


[quoted text, click to view]

Re: Which One Is Faster Roy Harvey
1/2/2007 2:44:22 PM
[quoted text, click to view]

That may be the key issue. Certainly it is the only explanation that
comes to mind that would explain:

[quoted text, click to view]

I suggest investigating that locking.

Roy Harvey
Re: Which One Is Faster vt
1/2/2007 3:47:59 PM
Max

To know why the first delete was slow read the following
http://www.sql-server-performance.com/transact_sql.asp

VT



[quoted text, click to view]

Which One Is Faster Max
1/2/2007 11:25:26 PM
I'm using SQL Server 2000 in a Windows 2000 Server Box
There is a table with a datetime type column, named date, for logging
propose
I want to delete all records what are older then 90 days
So I do

"Delete from Table where datediff(dd,[date],getdate()) > 90"

It's so slow, it cost me about 8 mins
Then I tried

"declare @d datetime
set @d = dateadd(day,-90,getdate())
delete from table where [date] < @d"

And the results are confused
It takes 10 secs to 8 mins
Considerate that my box is pretty busy (Exchange 2000, SQL Server 2000, IIS
5, Anti-Spam and Anti-Virus.........etc )
So it's not perfect for testing

Could someone tell me
1.Which one is faster? In principle
2.What is fastest way to do that?Beside my codes

Thanx

Max

Re: Which One Is Faster Hugo Kornelis
1/3/2007 1:10:56 AM
[quoted text, click to view]

(snip)
[quoted text, click to view]

Hi Max,

That would explain the inconsistent execution times.

Proper indexing *might* reduce or even completely remove the lock
contention. Please post the CREATE TABLE statement used to create the
table; don't forget to include all constraints, properties and indexes.
Post the statement(s) used for the additions to the table as well. That
information can help us find a suggestion to reduce the lock contention.

--
Hugo Kornelis, SQL Server MVP
Re: Which One Is Faster Max
1/3/2007 1:55:36 AM
Great point
I clone 3 table for testing
select * into t1 from table
select * into t2 from t1
select * into t3 from t1

and the results are


1. Delete from t1 where datediff(dd,[date],getdate()) > 90

SQL Server parse and compile time:
CPU time = 0 ms,elapsed time = 152 ms?
Table 't1'?Scan count 1,logical reads 6463,physical reads 0,read-ahead
reads 0?

SQL Server Execution Times:
CPU time = 172 ms,elapsed time = 5996 ms?

(930 row(s) affected)


2. delete from t2 where [date] < dateadd(day,-90,getdate()) /* Alejandro
Mesa's solution */

SQL Server parse and compile time:
CPU time = 47 ms,elapsed time = 257 ms?
Table 't2'?Scan count 1,logical reads 6524,physical reads 0,read-ahead
reads 312?

SQL Server Execution Times:
CPU time = 141 ms,elapsed time = 987 ms?

(991 row(s) affected)


3. declare @d datetime
set @d = dateadd(day,-90,getdate())
delete from t3 where [date] < @d

SQL Server parse and compile time:
CPU time = 0 ms,elapsed time = 68 ms?

SQL Server Execution Times:
CPU time = 0 ms,elapsed time = 0 ms?

(993 row(s) affected)

Table 't3'?Scan count 1,logical reads 6526,physical reads 0,read-ahead
reads 0?

SQL Server Execution Times:
CPU time = 140 ms,elapsed time = 217 ms?

#3 is the best
Cus I use getdate() function, That costs affected rows different
I'm still seeking a faster solution
Any suggestion?

"Steve" <morriszone@hotmail.com>
???????:1167753980.556052.79940@h40g2000cwb.googlegroups.com...
[quoted text, click to view]

Re: Which One Is Faster Max
1/3/2007 2:13:42 AM
VT
Thanks
My Have-To-Read list is 8 inches longer
Max

"vt" <vinu.t.1976@gmail.com> ¼¶¼g©ó¶l¥ó·s»D:uaa3wToLHHA.5104@TK2MSFTNGP06.phx.gbl...
[quoted text, click to view]

Re: Which One Is Faster Max
1/3/2007 3:17:47 AM
[quoted text, click to view]
That's why I copied t1 from original table
And copied another 2 from t1
all of them are not indexed

[quoted text, click to view]
I don't think amount of rows quite related
The original table is a GFI MailEssentials created table for logging SMTP
traffic
And it's a Data-Keeps-Coming-In table
It's pretty difficult for me to works with, due to some locking job or
whatelse
It takes 9 mins and 21 secs for a "Select count(*) from Table" query
Since you asked,The result is 83039(GREAT JOB, SPAMMER)
Nearly 920 rows will be deleted everyday

Max



"Roy Harvey" <roy_harvey@snet.net>
???????:en7lp25c6amq42sinq10nt896ll7sakp2b@4ax.com...
[quoted text, click to view]

Re: Which One Is Faster Roy Harvey
1/3/2007 8:24:42 AM
Hugo also asked that you:

[quoted text, click to view]

To which I would add, please post the query being blocked and
executing with such wildly different times. If nothing else it would
tell us which of the five tables described are being queried.

In the meanwhile, looking at the table definitions I noticed some
choices that might be considered unusual. For example, nchar(255)
always takes up 510 bytes, even when NULL. Two of the tables have
five such columns totalling 2550 bytes just for those columns, which
means only three rows per page. For long character data is it far
more common to use varying lengh than fixed length - nvarchar rather
than nchar for unicode.

Another point is the column named id of type float. Since id occurs
once in every table it sounds like some sort of key. I have never
seen a numeric key stored as a float before. What does the data in
this column look like?

Roy Harvey
Beacon Falls, CT

[quoted text, click to view]
Re: Which One Is Faster Stu
1/3/2007 11:58:28 AM
Welcome to my own personal hell, or "crappy vendor design" as I like
to call it. I work with a variety of syslogging databases, and most of
them are horribly "designed" as simple logging structures to quickly
write data.

Since most logging vendors don't care about data retrieval, they tend
to create tables with no indexes; heaps are very easy to insert data.

My general rule of thumb is to NOT change column types, because I've
had applications blow up because some developer had to check for a
column type before adding a row of data. Usually all we do is add a
few select indexes to speed up data retrieval.

This is also one of the few scenarios where you may want to experiment
with transaction isolation levels (using NOLOCK hints), because the
data is typcially INSERTed, not UPDATEd, and there ususally is no
transactional consistency anyway.

You may also want to consider pulling the data out of their database
into a properly structured database of your own, and avoid the issue of
DELETE's altogether.

Stu

[quoted text, click to view]
Re: Which One Is Faster Max
1/3/2007 1:58:03 PM
Hi Hugo
Here it is
/*---------------------------------------------------------------------------------

CREATE TABLE [log_as_neutral] (
[date] [smalldatetime] NOT NULL ,
[entry] [nchar] (255) NULL ,
[id] [float] NOT NULL ,
[type] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [log_as] (
[action_details] [ntext] NULL ,
[action_id] [int] NOT NULL ,
[date] [smalldatetime] NOT NULL ,
[id] [float] NOT NULL ,
[mfrom] [nchar] (255) NULL ,
[module_id] [int] NOT NULL ,
[mto] [nchar] (255) NULL ,
[reason_details_id] [int] NULL ,
[reason_id] [int] NOT NULL ,
[subject] [nchar] (255) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
CREATE TABLE [log_in] (
[attach_count] [int] NULL ,
[clean_subject] [nchar] (255) NULL ,
[date] [smalldatetime] NOT NULL ,
[domains] [nchar] (255) NULL ,
[full_subject] [nchar] (255) NULL ,
[id] [float] NOT NULL ,
[mfrom] [nchar] (255) NULL ,
[mto] [nchar] (255) NULL ,
[size] [int] NULL
) ON [PRIMARY]
GO
CREATE TABLE [log_in_att] (
[attid] [int] NOT NULL ,
[attsize] [int] NULL ,
[filename] [nchar] (255) NULL ,
[id] [float] NOT NULL
) ON [PRIMARY]
GO
CREATE TABLE [log_out] (
[attach_count] [int] NULL ,
[clean_subject] [nchar] (255) NULL ,
[date] [smalldatetime] NOT NULL ,
[domains] [nchar] (255) NULL ,
[full_subject] [nchar] (255) NULL ,
[id] [float] NOT NULL ,
[mfrom] [nchar] (255) NULL ,
[mto] [nchar] (255) NULL ,
[size] [int] NULL
) ON [PRIMARY]
GO
CREATE TABLE [log_out_att] (
[attid] [int] NOT NULL ,
[attsize] [int] NULL ,
[filename] [nchar] (255) NULL ,
[id] [float] NOT NULL
) ON [PRIMARY]
GO

---------------------------------------------------------------------------------*/

"Hugo Kornelis" <hugo@perFact.REMOVETHIS.info.INVALID>
???????:7vslp2t91aggqdlevjer1tntqhafas17t5@4ax.com...
[quoted text, click to view]

Re: Which One Is Faster Roy Harvey
1/3/2007 5:41:32 PM
On Wed, 03 Jan 2007 23:22:00 +0100, Hugo Kornelis
[quoted text, click to view]

Yes, but from which table(s)?

Re: Which One Is Faster Stu
1/3/2007 7:22:18 PM
Max,

on each of the three log tables, add a clustered index on the date
column; haven't worked with GFI in a long time, but I believe that the
column represents the date that the logging software receives the
event, so it should be monotonically increasing. The impact of
INSERTing against a table with a single clustered index that occurs in
the same direction as the natural data flow should be minimal

Don't change anything else, because the vendor application may cease
functioning. Since you're doing DELETE's, you'll also need to develop
a maintenance plan to handle fragmentation on those indexes.

Stu

[quoted text, click to view]
Re: Which One Is Faster Hugo Kornelis
1/3/2007 11:22:00 PM
[quoted text, click to view]

Hi Roy,

Max has posted two different version of the same query (to delete rows
older than 90 days) at the start of this thread. That's the query with
the varying execution times.

--
Hugo Kornelis, SQL Server MVP
Re: Which One Is Faster Hugo Kornelis
1/3/2007 11:24:27 PM
[quoted text, click to view]
(snip)

Hi Max,

There ain't a single index or constraint in sight!!

If that is indeed your real design, the reason for the blocking is
obvious. Each delete has to scan the complete table, and will thus be
blocked by any pending inserts and selects.

But I assume that you simply forgot to script the indexes and
constraints. You might want to post again. And do look into the very
valid comments Roy made about your data types!

--
Hugo Kornelis, SQL Server MVP
Re: Which One Is Faster Stu
1/4/2007 4:22:04 AM
Hey Max,

Depending on the volume of your data, you may still need to place an
index on the date column (or the ID column if you know for sure that it
always increases, and never writes out of order).

Usually, in situations like these, the method I use is based on the
following steps:

1. Design your target database. Include a parameters table that
allows you to record the last event that you transfer. For example, if
you use the ID, enter a 0 for the first time

2. Use DTS to select data from the original where the ID is greater
than the parameter you specified in the last run. Copy the data over to
the destination, and update the parameters table.

3. Schedule the job to run every minute or so.


Obviously, there will be lag, and your server will have to be able to
handle any transformations that you need to make from Source to
Destination. This is just the basics; the actual implementation can be
very complex (even for simple transfers). For example, in your case,
you wouldn't transfer over the rows of data that you don't want to
report on.

You will still need to maintain the original database; hopefully the
GFI application has some method of removing older rows of data.

Hope that gets you started
Stu


[quoted text, click to view]
Re: Which One Is Faster Max
1/4/2007 9:43:34 AM
I didn't forget anything
That's what it is
BTW
That's not my design, It's an application created database
Max

"Hugo Kornelis" <hugo@perFact.REMOVETHIS.info.INVALID>
???????:b4bop2p8729m0q148mu27nsjp6s6b79dfg@4ax.com...
[quoted text, click to view]

Re: Which One Is Faster Max
1/4/2007 9:45:45 AM
Almost anyone
but the "Table" I mentioned in this group
It's log_as_neutral
Max

"Roy Harvey" <roy_harvey@snet.net>
???????:p6cop2tbugj8res9thalliob4s6udqpmbo@4ax.com...
[quoted text, click to view]

Re: Which One Is Faster Max
1/4/2007 3:40:54 PM
Stu,

Instead of to modify the original database, I prefer your another approach
<quote>
You may also want to consider pulling the data out of their database
into a properly structured database of your own
</quote>

Max
"Stu" <stuart.ainsworth@gmail.com>
???????:1167880936.923725.130840@42g2000cwt.googlegroups.com...
[quoted text, click to view]

Re: Which One Is Faster Hugo Kornelis
1/4/2007 10:41:18 PM
[quoted text, click to view]

Hi Max,

Well, in that case you will never get any reliable performance on the
deletions of old data, since they will be blocked by new inserts.

And while deleting, your inserts will be blocked as well, so that part
of your application will stall.

I'd suggest adding at least an index on the datetime column used for the
deletion plus choosing a query syntax in the form of
WHERE column_name < (non correlated expression)
That is, the indexed column must appear by itself on one side of the
equation, and the expression on the other side must not refer to any
columns in the same table.

--
Hugo Kornelis, SQL Server MVP
AddThis Social Bookmark Button