Groups | Blog | Home
all groups > dotnet performance > october 2004 >

dotnet performance : *URGENT Reducing time/performance hit on system (ADO/XML)


Alex
10/13/2004 7:29:03 AM
Hi! I have a slight problem with my application. In short, it goes to
database (Oracle) executes a view, returns on an average less than 1000
records. Then my app has to read an xml config file that describes how each
column should be formated before outputed to a text file in a fixed length
format. The way I wrote it, I use a DataReader, do the While Reader.Read
check and in it I have a For Loop that goes through the columns list one at a
time. I previously load the list of all elements that from xml congif file
into a Nodelist object. I do a check against that list using an XPath
expression to see if the field that I got in database has a match in config
file, if it does I pass the returned Node object to my format function that
reads the attributes of that node that describes how the data should be
formated. I format it accordingly and return back to the calling procedure
the formated result. Here, I add the returned text to a StringBuilder object
so that after the row is processed I can write out the formated string using
TextWriter object to a file on HD.
I cannot identify where the application could be optimized, but it has to be
because on an average it takes over 20 minutes for olny 700 records. Any
Alex
10/13/2004 8:25:07 AM
Thanks for taking the time. here's the sample short program. Please let me
know if there's something we can do. I assume you don't care/need the
database structure or the xml config file.

Public Shared Sub Main()
Dim p as new Processor

p.Process
End Sub

Public Class Processor

Public Sub Process()

Dim OleDbConn As New OleDb.OleDbConnection(OleDbConnString)Dim OleDbComm As
New OleDb.OleDbCommand
Dim xd As New XmlDataDocument
Dim writer As TextWriter
Dim dr As OleDb.OleDbDataReader
Dim i As Integer
Dim str As String
Dim sb As System.Text.StringBuilder

xd.Load(Directory.GetCurrentDirectory & "\xmlconfig-E.txt")

OleDbComm.CommandText = "select * from VW_EODBilling_Equities_RPT"
OleDbConn.Open()
OleDbComm.Connection = OleDbConn
dr = OleDbComm.ExecuteReader
writer = New StreamWriter("c:\test.txt", False)
nl = xd.GetElementsByTagName("Field")


While dr.Read
sb = New System.Text.StringBuilder
For i = 0 To dr.FieldCount - 1

If Not IsNothing(xd.SelectSingleNode("//Field[@Name='" &
dr.GetName(i).ToString & "']")) Then
_count += 1
If dr.GetValue(i).GetType.ToString <> "System.DBNull"
Then

sb.Append(FormatField(xd.SelectSingleNode("//Field[@Name='" &
dr.GetName(i).ToString & "']"), _

CType(dr.GetValue(i), String)))
Else

sb.Append(FormatField(xd.SelectSingleNode("//Field[@Name='" &
dr.GetName(i).ToString & "']"), ""))
End If
End If
Next
writer.WriteLine(sb.ToString)
sb = Nothing
End While
writer.Close()
writer = Nothing
dr = Nothing
OleDbConn.Close()

End Sub

Public Function FormatField(ByVal node As XmlNode, ByRef value As String) As
String

Select Case node.Attributes("Type").Value

Case "char"
Str = Trim(value)
num = CType(node.Attributes("Format").Value, Integer)
If Str.Length <= num Then
Str += Space(num - Str.Length)
Else
Str = Left(Str, num)
End If
Return Str
Case "number"
'same type of processing
End Select

End Function
Alex
10/13/2004 10:05:04 AM
First of all thanks for fast replies all. I'm pretty sure that the view does
not take long to be executed. Each row has no more than 60 columns. not
sure how I can use the dr.IsDbNull since dr referes to datareader, and I"m
checking if the value of a particular column/field is dbNull.

I thought about only doing the XPath call for the first row and storing in
some form of array/collection. Will see if this helps the performance.
Another idea I had, but not 100% sure how to implement is to load the data
from view in DataSet and create an XSLT file that does the logic/formating
and outputing the file to HD. Once data loaded in Dataset it's essentially
XML so it might be possible to apply the XSLT file to it to achieve the
desired result faster. Any pointers/examples of code in this directions would
be appreciated too. At this point I'm still stuck as to why it's so
incredibly slow. I've tried going through 1 loop of all fiels and the
processing seemed to work correctly without any sidetracking or delays.
Thanks again for your help!


[quoted text, click to view]
David Browne
10/13/2004 11:23:59 AM

[quoted text, click to view]

There are some little performance things that could be done to this program,
but nothing on the order of causing a 20 minute runtime. For 700 rows, that
code should execute in seconds, not minutes.

I suspect that the time may be associated with executing
VW_EODBilling_Equities_RPT itself.

How long does

SELECT * FROM VW_EODBilling_Equities_RPT

take without any client-side processing?

You might want to get an explain plan for the view, or ask a DBA to take a
look at it.

David

Jon Skeet [C# MVP]
10/13/2004 3:38:48 PM
[quoted text, click to view]

Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Jon Skeet [C# MVP]
10/13/2004 5:38:15 PM
[quoted text, click to view]

Well, it would be nice to get an idea of it. As David said, there are
some things you can do to improve the performance of this, but it's
unlikely that the code below would really be taking 20 minutes if the
query is fast, unless there are thousands of columns.

One performance improvement would be to do the XPath queries once,
storing the results in an array, rather than doing two queries per
field, per row, and then taking the value of a node's attribute for
each field too.

Using dr.IsDBNull would be a better test than using GetType.ToString.

There's no need to pass value by reference, either.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Jon Skeet [C# MVP]
10/13/2004 10:17:41 PM
[quoted text, click to view]

Hmm. It would certainly be worth timing how long it would take to load
all the data into a dataset just using DataAdapter.Fill. As you say,
this *shouldn't* take a long time, but it would be good to rule it out.

[quoted text, click to view]

Because DataReader.IsDBNull takes a parameter which specifies which
column to check for nullity.

[quoted text, click to view]

Righto.

[quoted text, click to view]

I suspect that wouldn't help - that would involve a lot more processing
than you *should* be incurring here.

[quoted text, click to view]

If you're able to provide a sample database somehow, I'd be happy to
look at it all more closely and try to tweak it. (I'd probably change
the code to C# just to make it easier for me to analyze, but it should
be easy to apply anything learned there back to the VB.NET.)

Something else you might like to do just to make the code nicer is to
use OleDbDataReader.GetString rather than casting the result of
GetValue to String. I'd be very surprised if that made things faster,
but it would certainly be cleaner code. Unless, of course, that CType
is doing more than a cast would in C# - if it's actually doing
conversions from numbers etc, GetString would fail.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Alex
10/14/2004 6:53:06 AM
Thanks Jon. I will change the code to utilize the DbNull check as you
recommended and will try to implement the array handling of the data from xml
after the first row passes. I think I can provide you with the structure of
the database, and you can probably have a simple script populate it with test
data. The thing is though if you'll SQL server as oppose to Oracle (which is
my source) the results might be different and will not help me at all. But
honestly, i don't think it's somewhere on the database end, the view I
execute returns data real fast. I'll try to see what's taking so long in
this processing by using old trick of outputing the start/end time for each
row in the same file to see if one or series of rows lag the whole thing up
and then determine what's wrong with them. Any further code analysis and
suggsestions are appreciated as is all you previous replies.

[quoted text, click to view]
Alex
10/14/2004 1:47:03 PM
Hi Jon! It's finally done!!! I have first done all the code clean up as per
your suggestion such as using IsDBNull, etc... the performance bottleneck was
in the Xpath calls on each column. So as we discussed, i've stored the Node
object and associated key (field name from db) in a hashtable. And from
second row on, it was passing the values from hash table. Now the execution
time is about 1-3 seconds for the same app that was going for over 20 minutes
sometimes. It's really strange, but I guess using XML is not always the
answer :)
At any rate, I'm as always greatfull for all the support you and the rest of
the community showed. I will do my best to contribute to other's problems in
same way.

Cheers

[quoted text, click to view]
Jon Skeet [C# MVP]
10/14/2004 3:17:36 PM
[quoted text, click to view]

Well, I can use Oracle if that would help. It's somewhat less
convenient for me, but far from impossible.

The start/end time bit is definitely a good idea. You might also want
to try running it once in a way which does nothing but read the data,
just to check that that bit isn't where the time is going (even if the
database itself is fast looking at the data from other clients).

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Jon Skeet [C# MVP]
10/14/2004 9:58:05 PM
[quoted text, click to view]

No problem - but I'm still staggered that the query took that long.
Even with 60 fields, 2 queries per field, 700 rows, that's only 84,000
queries. Is the XML document huge?

Ah well - anyway, I'm very glad it's sorted :)

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Alex
10/15/2004 7:31:24 AM
Don't know why it was chocking, but don't think the xml file itself was at
fault, it was a very simple xml file with about 50 elements under a root
element. Each element had 2 attributes. Go figure... I'm glad it's resolved
too, so thanks again!

[quoted text, click to view]
AddThis Social Bookmark Button