dotnet ado.net : Validating an XML file


Jack White
1/12/2007 8:00:15 PM
Hi there,

I've created a strongly-typed "DataSet" using VS. If I save the data via
"DataSet.WriteXml()" and later prompt my users for the name of the file in
order to read it back in again (using "DataSet.ReadXml()"), how do I
validate that the file they enter is valid? That is, while
"DataSet.ReadXml()" throws a "System.Xml.XmlException" if they enter a
non-XML file, a valid XML file results in no exception even though the
schema may be completely incorrect (if they enter some random XML file on
the system, that is). Is there a clean way of detecting this situation?
Thank you.

Glenn
1/13/2007 8:22:22 PM
Hi

Check out the WriteXmlSchema, ReadXmlSchema and WriteXml and ReadXml
overloads of the DataSet type. I haven't actually used these before, and I
can't check that they will do exactly what you want (I'm at a PC with no
.NET), but it's worth a look.

Glenn


Arne_Vajhøj
1/13/2007 11:59:02 PM

Try something like:

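DataSet ds = new DataSet("TestDS");
XmlReaderSettings xrs = new XmlReaderSettings();
xrs.ValidationType = ValidationType.Schema;
// ValidationEventHandler is a callback method you supply; here it only
// receives problems found while loading the schema itself.
xrs.Schemas.Add(XmlSchema.Read(new
StreamReader(@"C:\ds.xsd"), ValidationEventHandler));
// No handler is attached to xrs, so a validation error in the XML
// surfaces as an exception from the reader.
XmlReader xr = XmlReader.Create(@"C:\ds.xml", xrs);
ds.ReadXml(xr);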

Jack White
1/14/2007 2:55:14 PM

Thanks very much. That does appear to be the correct solution but it still
doesn't trap an invalid ".xml" file for some reason. I've tried different
variations including similar solutions on the web (which all basically boil
down to your own) but it never invokes the handler. Note that my experience
with XML is very limited so I'm not sure what's wrong (I'm a very
experienced developer however). I'll have to research it further but I'm
basically doing this:

1) Create a strongly-typed (wizard-generated) dataset using VS
(wizard-generated constructor creates all tables, constraints, etc.)
2) Populate it with data and write to file using "DataSet.WriteXml("ds.xml",
XmlWriteMode.IgnoreSchema)". Note that my ".xml" file actually uses another
extension but I assume that's not an issue.
3) Read it back in using your code above, passing the wizard-generated
".xsd" file from step 1. Note however that "ds" from your example will
actually be the wizard-generated "DataSet" derivative whose constructor
creates all tables, etc. I'm assuming this makes no difference (?).

If I now pass in an arbitrary (invalid) ".xml" file in step 3, the handler
is never called. Any idea what's wrong? Thanks again.

Arne_Vajhøj
1/14/2007 3:19:10 PM

If I either add non-well-formed XML to the file or add well-formed
XML that does not comply with the schema, then I get an
exception.

Arne

Jack White
1/14/2007 3:39:08 PM

Strange. The ".xml" file I'm passing doesn't conform with the ".xsd" file.
The handler isn't called however nor is any exception thrown. After the call
to "ReadXml()", everything created in the constructor remains intact as if
nothing happened. After several hours mucking with this I'm at a loss to
explain it. Anyway, I'll just have to keep probing. Thanks again though
(appreciated).

Jack White
1/15/2007 12:44:17 PM

Thanks for your feedback. I was going to respond to your first post in fact
but was working on resolving the issue which I just did moments ago (with
the help of an XML MVP elsewhere though I'm still testing things). I had to
turn on the "XmlSchemaValidationFlags.ReportValidationWarnings" in
"XmlReaderSettings" and then Arne's example works (well, I changed it
slightly). Note that not even MSFT's examples touch this flag however so I
don't understand this (it makes me a little nervous in fact). Problems are
also reported as warnings and not errors which I find counter-intuitive. In
any case, perhaps one of the "ReadXml()" overloads you suggested does throw
an exception but not the version I've been using all along (it doesn't). If
one of them does however then it's not documented and therefore unreliable.
I think Arne's way is really the "official" way in 2.0 anyway (there was
another technique in 1.X that's now obsolete) so I'm probably safer relying
on it. I don't like having to send my ".xsd" file out just for validation
however (it's an internal detail) but I'm hoping it can be avoided somehow.
I'm still looking into it but I'm fairly new to XML and so it's a learning
process. Any advice you can offer (on having to ship my ".xsd" file) would
be welcome however. Thanks again.

Glenn
1/15/2007 1:43:11 PM
Using ReadXmlSchema and ReadXml should throw an exception if the XML doesn't
match the schema, although it'll probably be something vague like a
constraint failure.


Jack White
1/16/2007 8:26:08 AM

Since none of the examples I've seen turn this flag on it leads me to
believe I have a problem somewhere. I shouldn't have to turn it on IOW if
nobody else has to.


It always reports it as a warning. I originally thought it would be reported
as an error but apparently not. The problem is that you can't distinguish
between "acceptable" warnings generated while reading a conforming ".xml"
file (warnings I can safely ignore), and those that really need to be
treated as errors (normally because you're dealing with a non-conforming
".xml" file). My testing shows that a conforming ".xml" file generates no
warnings however so I'll have to assume that any warning is really an error
and treat it that way. That may not be true however so I may actually reject
a conforming ".xml" file which will be a problem. I can't seem to resolve
the issue any other way however.


Even when I pass "XmlWriteMode.WriteSchema" to "WriteXml()" and later read
it back in, no errors are generated.


Tampering isn't an issue in my case but it's really an implementation detail
so I wanted to avoid having to install it merely for this purpose. I'm not
sure what the accepted protocol is however. To validate an ".xml" file, do
you normally install its ".xsd" file for this purpose, assuming you don't
need it for anything else? In any case, my overall experience with this
situation has been very frustrating. I just want to validate an ".xml" file
but apparently I have to become an XML expert to do it. Thanks again for
your help though.

Glenn
1/16/2007 11:21:06 AM
Responses inline...


Interesting, I can't remember ever having to do the
"XmlSchemaValidationFlags.ReportValidationWarnings" thing. The method I
use is XmlDocument.Validate(), which is almost definitely slower given that
it doesn't use a reader (or if it does, it's internal).


Whenever a problem gets reported you can check ValidationEventArgs.Severity
property to determine what action to take, if any.
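As a rough sketch of that Severity check: the helper below sorts validation events into errors and warnings. The class and method names are made up for illustration, and the schema and XML are passed in as strings only to keep the example self-contained; in the real application they would come from files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of the framework.
public static class SeverityCheck
{
    // Validates xmlText against xsdText and appends each reported
    // problem to either the errors list or the warnings list.
    public static void Validate(string xsdText, string xmlText,
        List<string> errors, List<string> warnings)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // Without this flag, warnings never reach the handler.
        settings.ValidationFlags |=
            XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsdText)));
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args)
            {
                if (args.Severity == XmlSeverityType.Error)
                    errors.Add(args.Message);
                else
                    warnings.Add(args.Message);
            };

        using (XmlReader reader =
            XmlReader.Create(new StringReader(xmlText), settings))
        {
            // Walk the whole document so every node gets validated.
            while (reader.Read()) { }
        }
    }
}
```

Whether a warning should reject the file is a policy decision for the caller. Treating any warning as fatal, as Jack describes, is the stricter of the two options.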


Was the schema inlined with the XML? If not, it'll infer the schema from
the XML and won't throw an exception.

http://msdn2.microsoft.com/en-us/library/360dye2a.aspx
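A related point, sketched under assumptions: when the DataSet already has a schema (as a typed DataSet does) and the file is read with XmlReadMode.IgnoreSchema, the documentation says data that does not match the schema is discarded rather than rejected. That would explain a silent "nothing happened" result. The class, table and column names below are invented for the example.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical demo class; "TestDS", "MyTable" and "Name" are made-up names.
public static class ReadModeDemo
{
    // Builds a DataSet with a fixed schema, loads the XML text with
    // XmlReadMode.IgnoreSchema and returns how many rows arrived.
    public static int LoadRowCount(string xmlText)
    {
        DataSet ds = new DataSet("TestDS");
        DataTable table = ds.Tables.Add("MyTable");
        table.Columns.Add("Name", typeof(string));

        // Elements that match no table or column are dropped silently.
        ds.ReadXml(new StringReader(xmlText), XmlReadMode.IgnoreSchema);
        return table.Rows.Count;
    }
}
```

Checking for rows afterwards is only a cheap sanity test. A legitimately empty file also yields zero rows, so it doesn't replace schema validation.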


That involved using XmlValidatingReader, which is indeed obsolete.



I've distributed schemas with application code before now as a plain .xsd
file, although this was to a well constrained user population. If you're
worried about people tampering with it, you could store it in a .resx file
in your application.

Glenn

Glenn
1/16/2007 3:55:33 PM


Don't worry about distributing an XSD with your application, it's just
another file. And it's there for the purpose of making your application
inputs more robust.

Anyway, back to your validation problem...

Before calling the myDataSet.ReadXml( reader ), you need to call while (
reader.Read() ); Doing so will cause the reader to walk over the document
and pick up any problems.

I promise you, this version works, honest ;-)

using System;
using System.Data;
using System.Xml;
using System.Xml.Schema;

//MyDataSet is the wizard-generated typed DataSet (not shown here).
class Program
{
    static void Main()
    {
        try
        {
            //WriteData();

            //Mangle the data by hand...

            ReadData();
        }
        catch ( Exception exception )
        {
            Console.WriteLine( exception.Message );
        }
        finally
        {
            Console.WriteLine( "\r\nFinished" );
            Console.ReadLine();
        }
    }

    static void WriteData()
    {
        MyDataSet testDataSet = new MyDataSet();

        testDataSet.MyTable.AddMyTableRow( "Abraham Lincoln" );

        testDataSet.WriteXml( "MyFile.xml", XmlWriteMode.IgnoreSchema );
    }

    /// <summary>
    /// Validates MyFile.xml against MyDataSet.xsd, then loads it.
    /// </summary>
    static void ReadData()
    {
        MyDataSet myDataSet = new MyDataSet();

        XmlReaderSettings settings = new XmlReaderSettings();

        settings.Schemas.Add( null, "..\\..\\MyDataSet.xsd" );
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationFlags |=
            XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += new ValidationEventHandler(
            delegate( object sender, ValidationEventArgs args )
            {
                Console.WriteLine( "Severity : {0}\r\nMessage :{1}",
                    args.Severity.ToString(), args.Message );
            } );

        //this line is important!!!!
        //Walk the whole document once so every problem gets reported.
        using ( XmlReader reader = XmlReader.Create( "MyFile.xml", settings ) )
        {
            while ( reader.Read() ) ;
        }

        //XmlReader is forward-only, so the reader above is now at the end
        //of the file. Open a fresh one to actually load the data.
        using ( XmlReader reader = XmlReader.Create( "MyFile.xml", settings ) )
        {
            myDataSet.ReadXml( reader );
        }
    }
}



Glenn
1/16/2007 6:51:36 PM


Firstly, add a while (reader.Read() ); after you create the XmlReader. That
will cause the reader to walk over the document.

And, if you haven't already done so, add some way of capturing and displaying
the results of the validation from the ValidationEventArgs of the
ValidationEventHandler.

I did a test, for instance changing the data type of the Key element
InnerText, which resulted in an XmlSeverityType.Error and produced a message
indicating the exact problem.

HTH

Glenn


Jack White
1/17/2007 10:11:36 AM
I appreciate your on-going assistance ...


Ok (thanks)


Ok, I'll look into it but why is "reader.Read()" required? I assume (maybe
naively) that any action on the reader will be validated. And in fact, I've
conducted a number of tests now (not exhaustive but so far so good) and the
handler seems to be trapping everything. That is, the call to
"myDataSet.ReadXml(reader)" attempts to read the entire ".xml" file into
"myDataSet" and so validation occurs no differently than if you read it one
node at a time by calling "reader.Read()" first. Doing that would double the
processing time in fact so it doesn't seem to make any sense (on the surface
anyway). Are you sure it's really required? In any case, I do have two
related questions that maybe you know something about:

1) I noticed that if I add a new field, column, etc. to my "DataSet", VS
updates the ".xsd" file of course but I can still read previous ".xml" files
error/warning-free (i.e., older ".xml" files that don't have these new
elements). I'll experiment with your "reader.Read()" scenario to see what it
does (I'm guessing there won't be any difference) but since the schema of
the older ".xml" file isn't an exact match for the updated ".xsd" file, I'm
not sure why it passes validation (probably because it's still a valid
subset I'm guessing).

2) The first arg to "settings.Schemas.Add" (i.e., the targetNamespace) is
null. I'm not too familiar with this argument yet (working on it) but is
this safe? Passing null causes the function to pull
"targetNamespace" from the ".xsd" file itself, which I assume is typically
the correct way to go (since I'd presumably pass the same value anyway if I
were to explicitly pass the first arg).

Thanks again.

Jack White
1/17/2007 8:06:08 PM
Ok, thanks for all your help (appreciated). Everything seems to be working now
but I'm sure I'll be revisiting it once I gain more experience.

glenn
1/18/2007 12:16:31 AM

Thinking about it, you're right, why call it twice? The reader.Read()
should get called by the ReadXml(). The strange thing is, when I didn't
call reader.Read() explicitly I didn't get the validation messages. Could
ReadXml() suppress validation warnings? Could it only care that the XML can
create a "valid" DataSet and to hell with everything else?

Sorry, don't know the answer to this one. Where's an MVP when you need one?


If you're adding columns, the behaviour is absolutely correct, it's forward
compatibility, or maybe backward compatibility. One of the two.


In this instance it's quite safe.

Glenn

Thomas T. Veldhouse
1/22/2007 8:01:11 PM

Also, I find that it is worthwhile to embed the XSD as a resource in the
assembly. It can then be retrieved and used without worrying about filesystem
access or other issues [like an assembly being GAC'd or building a custom
directory structure].
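
A minimal sketch of that approach, assuming the .xsd has been added to the project with its Build Action set to "Embedded Resource". The helper class is hypothetical, and the resource name depends on your project's default namespace and folder layout, so the caller passes it in.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of the framework.
public static class EmbeddedSchema
{
    // Loads and compiles a schema stored as a manifest resource.
    // resourceName is usually "<default namespace>.<file name>".
    public static XmlSchemaSet Load(Assembly assembly, string resourceName)
    {
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null when the name is wrong.
            if (stream == null)
                throw new FileNotFoundException(
                    "Embedded schema not found: " + resourceName);

            XmlSchemaSet schemas = new XmlSchemaSet();
            using (XmlReader reader = XmlReader.Create(stream))
            {
                // null means "use the targetNamespace declared in the schema".
                schemas.Add(null, reader);
            }
            schemas.Compile();
            return schemas;
        }
    }
}
```

The result can be handed to the reader settings with settings.Schemas.Add(schemaSet) in place of the file path. If the name doesn't resolve, Assembly.GetManifestResourceNames() lists what was actually embedded.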

--
Thomas T. Veldhouse
Key Fingerprint: D281 77A5 63EE 82C5 5E68 00E4 7868 0ADC 4EFB 39F0

Joe Monnin
2/12/2007 1:21:00 PM
I wish I could say it worked for me.

This still isn't working for me.

Here's my XML file ("details.xml"):
<?xml version="1.0" encoding="UTF-8" ?>
<details/>

Here's my XSD file ("Test.xsd")
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="Test" targetNamespace="http://tempuri.org/Test.xsd"
elementFormDefault="qualified" xmlns="http://tempuri.org/Test.xsd"
xmlns:mstns="http://tempuri.org/Test.xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="details" type="xs:string" />
</xs:schema>

Here's my code:
System.Xml.XmlReaderSettings settings = new System.Xml.XmlReaderSettings();
settings.Schemas.Add(null, "Test.xsd");
settings.Schemas.Compile();
settings.ValidationType = System.Xml.ValidationType.Schema;
settings.ValidationFlags |=
    System.Xml.Schema.XmlSchemaValidationFlags.ReportValidationWarnings;

settings.ValidationEventHandler += new
    System.Xml.Schema.ValidationEventHandler(delegate(object s,
    System.Xml.Schema.ValidationEventArgs args)
    {
        Console.WriteLine("Severity : {0}\r\nMessage :{1}",
            args.Severity.ToString(), args.Message);
    });

System.Xml.XmlReader reader = System.Xml.XmlReader.Create("details.xml",
    settings);

while (reader.Read()) ;

The XML is compliant, but it's still reporting an error:
Could not find schema information for the element 'details'. I've been
fighting with this for hours. This is ridiculous. Any ideas?

usenet NO[at]SPAM tech-know-ware.com
2/13/2007 3:39:53 AM
On 12 Feb, 21:21, Joe Monnin <JoeMon...@discussions.microsoft.com>
wrote:

I haven't checked all of this thread, so apologies if I've missed the
intended point. But your XML instance isn't valid against your
schema. Either the XML instance needs to be (i.e. add in namespace
declaration):

<?xml version="1.0" encoding="UTF-8" ?>
<details xmlns="http://tempuri.org/Test.xsd"/>

Or your schema needs to be (i.e. remove targetNamespace etc.):
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="Test"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="details" type="xs:string" />
</xs:schema>
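
To see the difference in isolation, here is a small sketch that runs both forms of the instance against Joe's original schema. The class and method names are made up for the example, and the schema is held in a string (without the XML declaration) rather than read from Test.xsd.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical demo class, not part of the framework.
public static class NamespaceDemo
{
    // The schema from the post, with its targetNamespace left in place.
    const string Xsd =
        "<xs:schema id='Test' targetNamespace='http://tempuri.org/Test.xsd' " +
        "elementFormDefault='qualified' xmlns='http://tempuri.org/Test.xsd' " +
        "xmlns:mstns='http://tempuri.org/Test.xsd' " +
        "xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='details' type='xs:string' />" +
        "</xs:schema>";

    // Returns one "Severity: message" line per reported problem.
    public static List<string> Validate(string xml)
    {
        List<string> issues = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationFlags |=
            XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args)
            {
                issues.Add(args.Severity + ": " + args.Message);
            };

        using (XmlReader reader =
            XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return issues;
    }
}
```

Calling Validate("<details/>") should report the "Could not find schema information" message as a warning, since no schema is loaded for the empty namespace. The namespaced form should come back with nothing reported. The same mechanism is likely why an arbitrary XML file only shows up when ReportValidationWarnings is switched on.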

HTH,

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx
(or http://www.xml2cpp.com)
=============================================