dotnet performance : DataSet and best practice in 2.0


sonic
5/31/2006 8:03:31 AM
Hello,
There appear to have been numerous enhancements to the way DataSet is
serialized in 2.0, but it still appears to be a big performance hit
when used with web services. Is it still recommended to use a strongly
typed collection or array instead of a DataSet to move data across app
boundaries using SOAP?

The new binary formatter appears to be useful only when using remoting.
David Browne
6/12/2006 9:57:56 AM


Well, if you are truly moving data "across app boundaries" then you should
follow best practices for SOA and use contract-first, interoperable web
services. The problem with DataSets in that scenario is that the WSDL
they generate is not specific or clean enough to allow non-.NET clients to
meaningfully work with the data.

If you are just moving data across machine boundaries within a single
application, then I wouldn't worry about the performance hit too much. If
it becomes an issue you can later move to binary serialization without too
much disruption. This is an important difference from 1.1, where you had to
make a performance-related decision about using DataSets during the initial
application design, when you often have no idea whether the performance of
XML serialization will be an issue.

David
sonic
6/13/2006 10:45:40 AM
thanks for the reply.
i am indeed moving data across machine boundaries in a single
application. using dataset with different platforms wouldn't make much
sense.

as far as switching to binary serialization, that would mean switching
from webservices to remoting, no? that is a significant change. i was
wondering about datasets in webservices, and whether it is worth creating
strongly typed collections for every purpose instead of using datasets.

David Browne
6/13/2006 2:59:38 PM


What I was thinking about as an easy change is to add a web method that
passes byte[]'s containing binary-serialized datasets. Not quite as fast as
remoting, and a bit dirty, but certainly easy.
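
A minimal sketch of that idea in C#, assuming .NET Framework 2.0. The
DataSetBinary helper and its method names are made up for illustration; only
DataSet.RemotingFormat, SerializationFormat.Binary and BinaryFormatter come
from the framework. A web method would return the result of ToBytes, and the
client would call FromBytes on what it receives.

```csharp
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper, not part of the framework.
public static class DataSetBinary
{
    // Serialize a DataSet to a byte[] using the binary remoting format
    // that was added in .NET 2.0 (the default is SerializationFormat.Xml).
    public static byte[] ToBytes(DataSet ds)
    {
        ds.RemotingFormat = SerializationFormat.Binary;
        using (MemoryStream ms = new MemoryStream())
        {
            new BinaryFormatter().Serialize(ms, ds);
            return ms.ToArray();
        }
    }

    // Rebuild the DataSet on the receiving side.
    public static DataSet FromBytes(byte[] data)
    {
        using (MemoryStream ms = new MemoryStream(data))
        {
            return (DataSet)new BinaryFormatter().Deserialize(ms);
        }
    }
}
```

Both ends have to be .NET for this to work, which is part of why it is "a bit
dirty" as a web service contract.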


Moreover, by the time you get around to solving this problem WCF will be
here, which changes the whole game for distributed apps.


I wouldn't create typed collections for every purpose. Collections serialize
to XML as well, so there's not that much of a perf difference. Another new
feature of .NET 2.0 DataSets is the ability to exclude the schema in XML
serialization. This makes the XML format of DataSets not too different from
the XML formats for typed collections.
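
One rough way to see how much the inline schema adds is to write the same
DataSet with and without it and compare lengths. This sketch does not go
through the web service serializer; it only uses DataSet.WriteXml, and the
DataSetXmlSize helper is a made-up name. On the web service side the relevant
switch is, as far as I know, the SchemaSerializationMode property on typed
DataSets, but treat that as an assumption to verify.

```csharp
using System.Data;
using System.IO;

// Hypothetical helper for comparing XML sizes.
public static class DataSetXmlSize
{
    // Returns the XML text that WriteXml produces for the given mode.
    public static string ToXml(DataSet ds, XmlWriteMode mode)
    {
        using (StringWriter writer = new StringWriter())
        {
            ds.WriteXml(writer, mode);
            return writer.ToString();
        }
    }
}
```

Comparing ToXml(ds, XmlWriteMode.WriteSchema).Length with
ToXml(ds, XmlWriteMode.IgnoreSchema).Length on your own data shows the schema
cost directly.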


David
sonic
6/14/2006 9:27:23 AM
thanks again for your reply.
can you elaborate more about how WCF would change the game for
distributed apps ?
also,
everything over webservices serializes to xml, the question is which
objects serialize faster and produce slimmer output.. i think dataset is
at the lowest performance end of that metric.

David Browne
6/14/2006 11:43:11 AM


It is a matter of degree. DataSet XML is really not that bloated once you
exclude the inline schema. Certainly it's not so huge that it should drive
a design decision.

In general you should avoid, if possible, making design choices based on
performance differences which may not be large, and may not even matter.

David
sonic
6/15/2006 7:18:28 AM
ok, what did you mean by the following ?

"What I was thinking about as an easy change is to add a web method that
passes byte[]'s containing binary-serialized datasets. "

webservices do not support binary format. are you saying the byte[]
array will be serialized and still be so much more efficient?


David Browne
6/15/2006 11:57:24 AM


A byte[] will be serialized as a base64-encoded string. That carries a 33%
overhead over binary transmission. However, the byte array that holds the
binary-serialized dataset should be substantially smaller than the XML
version. I suspect you would see a substantial net gain over XML
transmission _if_ XML transmission turns out to be too costly.
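
The 33% figure follows from base64 emitting 4 characters for every 3 input
bytes. A small sketch of the arithmetic, with a made-up Base64Overhead helper
and a hypothetical 30,000-byte payload:

```csharp
using System;

// Hypothetical helper illustrating the base64 size arithmetic.
public static class Base64Overhead
{
    // Base64 writes 4 characters per 3 input bytes, padding the last group.
    public static int EncodedLength(int byteCount)
    {
        return ((byteCount + 2) / 3) * 4;
    }

    // Measures the real encoded length for comparison with the formula.
    public static int MeasuredLength(byte[] data)
    {
        return Convert.ToBase64String(data).Length;
    }
}
```

For 30,000 bytes both methods give 40,000 characters, which is one third more
than the binary size.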

David
sonic
6/16/2006 12:27:09 PM
i see.
have you benchmarked this personally?


john conwell
6/20/2006 2:19:02 PM
Um, I wouldn't pass a byte[] across a SOAP protocol. I tried this a while
back and the results were pretty funny actually. I binary serialized a
dataset to a byte[], which had about 35,000 elements in the array. The XML
the SOAP serializer generated went something like this:

<bunch of soap stuff>
<byte_a>12</byte_a>
<byte_a>4</byte_a>
<byte_a>7</byte_a>
...repeated 34,997 more times
<bunch of soap stuff>

bad option.

My suggestion is to create your own class to represent the data. Mark it up
with serialization attributes so that the class uses XML attributes for the
properties when serialized. Then put your class instances in a straight
array (not stuff from System.Collection...) and pass that. That should get
you the tightest XML possible.
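
A rough sketch of that approach, assuming XmlSerializer (which ASMX web
services use). The OrderRow type, its fields, and the OrderXml helper are
hypothetical names chosen for illustration.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical row type; each field is written as an XML attribute
// instead of a child element, which keeps the markup per row small.
public class OrderRow
{
    [XmlAttribute("id")]
    public int Id;

    [XmlAttribute("customer")]
    public string Customer;

    [XmlAttribute("total")]
    public decimal Total;
}

public static class OrderXml
{
    // Serializes a plain array of rows the same way a web method
    // returning OrderRow[] would shape them.
    public static string Serialize(OrderRow[] rows)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(OrderRow[]));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, rows);
            return writer.ToString();
        }
    }
}
```

Each row then comes out as a single empty element carrying id, customer and
total attributes, rather than one child element per column.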



sonic
6/21/2006 7:29:14 AM
yah..
this is why i asked this question, which was: how bad is the
performance of dataset 2.0?
i ran some tests, and found the dataset to perform pretty well, at
least up to a couple thousand rows of data.
it was slower than doing a straight up sqlreader -> MySimpleObject[],
but not by much.
when compared against a more elaborate strongly typed db object
implementation such as the "net tiers" open source codegen, dataset was
10-20 times faster at being constructed from the db. i couldnt see much
difference in the serialization part either.

so to me, dataset doesnt seem like a performance hog. maybe it is
thought of as such only for much bigger chunks of data than a couple
thousand rows ?

Ben Voigt
6/26/2006 7:45:02 AM

The suggestion was to use Base64 encoding, not xml, to serialize the
dataset. The resulting string can be embedded in xml along with your other
items.
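
A minimal sketch of the encoding step, assuming the bytes come from a
binary-serialized DataSet. The Base64Payload helper is a made-up name;
Convert.ToBase64String and Convert.FromBase64String are the framework calls.

```csharp
using System;

// Hypothetical helper for carrying binary data as text inside XML.
public static class Base64Payload
{
    // Sender side: turn the binary-serialized bytes into a string
    // that is safe to place in an XML element.
    public static string Encode(byte[] binaryDataSet)
    {
        return Convert.ToBase64String(binaryDataSet);
    }

    // Receiver side: recover the original bytes.
    public static byte[] Decode(string text)
    {
        return Convert.FromBase64String(text);
    }
}
```

As far as I know, XmlSerializer already writes a byte[] return value as
xs:base64Binary, so the explicit string is mainly useful when the serializer
in use does not do that for you.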


sonic
6/27/2006 8:09:20 AM
thanks Ben,
would you happen to have an example of how that could be done with
WebServices?
is it as simple as overriding the DataSet's serialize method? or would i
need to do something custom at the dataset's proxy level ?

