Groups | Blog | Home
all groups > sql server notification services > july 2005 >

sql server notification services : un-answered question got removed


InvalidLastName
7/14/2005 5:59:23 PM
Hi,
I posted a qustion in this newsgroup on 07/08 without getting any =
responses. However, for some reasons, the post has been removed from the =
news server today (at least I cannot see it from my outlook express)

I can still found it from MSDN newsgroup web site =
http://msdn.microsoft.com/newsgroups/default.aspx?&lang=3Den&cr=3DUS&guid=
=3D&sloc=3Den-us&dg=3Dmicrosoft.public.sqlserver.notificationsvcs&p=3D1&t=
id=3D9e84b23a-238d-4c6e-b951-ba5bc5ef7d9e , but not in my outlook =
express.=20


Any reason my un-answered question got removed ?



<><><><><><><><> original post <><><><><><><><>


Recently we started stress testing our SQLNS (for SQL2K) application. =
Here are some basic background information:=20
- running SQLNS enterprise sp1 v2.0.3008.0=20
- 1 generator, 2 distributors (W2k3 boxes)=20
- custom content formatter and custom delivery component (the delivery =
component uses notificationStatusCallback to report NotificationStatus =
to distributor, as shown as =
http://msdn.microsoft.com/library/default.asp?url=3D/library/en-us/sqlnts=
v/htm/nsp_advancedtopics_4e44.asp?frame=3Dtrue)=20
- quad CPU SQL server cluster w/ 4G memory=20
- basic distributor logging (see ADF below)=20

<DistributorLogging>=20

<LogBeforeDeliveryAttempts>false</LogBeforeDeliveryAttempts>=20

<LogStatusInfo>false</LogStatusInfo>=20

<LogNotificationText>false</LogNotificationText>=20

</DistributorLogging>=20


The most serious performance issue we encoutered is average 75% CPU =
usage on SQL server while distributors were distributing. At the time =
the SQLNS application server CPU usage was at only about 10%. This is a =
totally unacceptable result.=20

After we dug more on this problem, we found out the high CPU was caused =
by NSInsertESSNotificationsDistributionLogEntries stored procedure. This =
stored procedure has "300" parameters, and "100" IF/ELSE blocks. I =
believe it is used by StatusCallback to log the results back to =
notification table and NSDistributionLog table. It was designed to log =
up to 100 notifications at a time (3 parameters per record). However, in =
our case, we only need 3 out of 300 parameters and the first IF/ELSE for =
logging just ONE notification.=20

Here are the alternatives we experimented and their results:=20

A) "Hacked" NSInsertESSNotificationsDistributionLogEntries stored =
procedure by returning to calling program (added "return" statement =
right after the first IF/ELSE. The result was the SQL server CPU usage =
dropped from average 75% to 7%. Yeah, 10 times difference by just bypass =
99 IF/ELSE blocks in the stored procedure. The SQLNS application was =
able to deliver 50 notifications/second/distributor=20

B) Skip calling notificationStatusCallback for sucessfully delivered =
notification. This will totally remove the logging, however, I believe =
Distributor also uses notificationStatusCallback to determine the the =
result of current batch (work item). Because no successful status =
reported by delivery component, the distributor marked the current work =
item as "failure" , and this work item will be picked up by distributor =
again at next retry. (We actually wrote a custom stored procedure which =
mimics the 300-parameter NSInsertESSNotificationsDistributionLogEntries =
to just update the notification status. But distributor marked the work =
item "failure" because no sucessful status reported back by custom =
delivery component. However, those "successful" delivered notification =
(deliverystatuscode =3D6) were not picked up again by distributor during =
retry)=20

C) Changed custom delivery component to "cache" the NotificationStatus =
for each notification,and submitted the NotificationStatus in batch =
(when cache size reaches 100, or in Flush() to log all the cached =
NotificationStatus). SQL server CPU dropped from 75% to 10%.=20

The alternative A resulted the best performance and very healthy CPU =
usage pattern, and option C is a close second. We don't consider to ues =
option B because I don't think it is right to skip the =
notificationStatusCallback in delivery component=20

The option A requires to "hack" the SQLNS stored procedure, which I try =
to avoid. However, the option C will cause some status cached =
notification being re-sent if distributor encouters server failure while =
distributing the notification. Since the status is not logged =
immediately, the SQLNS will treat those un-reported notifications as =
never delivered, and re-send those at retry.=20

The last thing we want to do is to modify the SQLNS stored procedures. =
However, it is also important for us not resending duplicate =
notifications.=20

Here are some questions:=20
- Has Anyone seen the similar high CPU usage from basic logging as we =
encoutered (I know from SQLNS BOL, the logging is not recommended, =
because distributor already has basic logging. But this "basic" logging =
is killing us. It doesn't make any sense to me)=20

- Did we terriblely do something wrong in our SQLNS application? We =
seperate the data files, log files and index files. The event and =
subscription are on different file groups. I don't think the hard drive =
is not an issue here. We are using production cluster to perform stress =
testing.=20

- There is really no way for a SQLNS application can deliver over 400 =
notifications/second as described in =
http://msdn.microsoft.com/library/default.asp?url=3D/library/en-us/dnsql2=
k/html/sql_nsperformance.asp . It doesn't matter how fast is the =
distributor, or how many distributors, the logging just kills the SQL =
server. I am not quite sure how can they achieve those numbers while =
running typical SQLNS configuration=20

- How come just 100 IF/ELSE blocks in a stored procedure can cause such =
high CPU usages on SQL server? I never expect that updating 100 records =
cross network via connection pooling is even more efficient then =
wrapping update in 100 IF/ELSE blocks=20

Any suggestion would be appreciated=20

Thanks=20

ILN=20





-------------------------------------------------------------------------=
-------=20

Her it comes, one of the two monster logging stored procedure which has =
300 parameters and 100 IF/ELSE blocks (the other one is =
NSInsertESSNotificationsDistributionLogEntries )=20

PROCEDURE dbo."NSSetESSNotificationsNotificationDeliveryStatus" (=20
@distributorId int,=20
@callerCurrentUtcTime datetime,=20
@notificationId0 bigint =3D NULL,=20
@workItemId0 bigint =3D NULL,=20
@deliveryStatus0 tinyint =3D NULL,=20
@callerSentTime0 datetime =3D NULL,=20
@deliveryStatusInfo0 nvarchar(2048) =3D NULL,=20
@notificationText0 nvarchar(2048) =3D NULL,=20
@notificationId1 bigint =3D NULL,=20
@workItemId1 bigint =3D NULL,=20
@deliveryStatus1 tinyint =3D NULL,=20
@callerSentTime1 datetime =3D NULL,=20
@deliveryStatusInfo1 nvarchar(2048) =3D NULL,=20
kate
7/15/2005 1:36:12 PM
I did notice your original post and would be interested to see what
Shyam's (as a member of the developer team) comments were on the
reasoning behind the distributor architecture and how to crank peak
performance out of it (without hacking the procs).

Perhaps, Shyam will see this and give us a quick whistle-stop tour of
optimising the distributor for bulk despatch applications.

It would be helpful to know the kind of through-put you are using in
your stress test in terms of how many work-items are waiting to be
processed when the distributor kicks-in , how many are arriving per
second and how many are processed into your custom delivery channel
during one cycle.

It would be interesting to try the same app on SQL server 2005 which
has better performance monitoring and see if the same problem exists in
that version.


Kate MBCS

Alert Technologies
http://www.alert-technologies.co.uk
Get your notification services implementation going in minutes not
weeks ......
AddThis Social Bookmark Button