Nick Wong wrote:
> i am reading in an xml stream and validating it against a given schema.
> the objective is to "mark" "invalid" nodes (according to the xsd type
> defined, or some rules) with an attribute, and then pass this modified
> stream to another process in the pipeline.
>
> as an example,
> <bk:book publisher="Addison Wesley">
> <bk:title>Mythical Man Month</bk:title>
> <bk:author>Frederick Brooks</bk:author>
> <bk:quantity>AAAA</bk:quantity>
> </bk:book>
>
> should become:
> <bk:book publisher="Addison Wesley" valid="false">
> <bk:title>Mythical Man Month</bk:title>
> <bk:author>Frederick Brooks</bk:author>
> <bk:quantity valid="false">AAAA</bk:quantity>
> </bk:book>
>
> note that the <bk:quantity ...> element now has a new 'valid' attribute set
> to 'false'.
I am not sure that is a good idea, as the newly inserted attribute
(e.g. valid="false") obviously makes the document invalid in itself.
Anyway, I think one way to solve that is to make use of two different
event mechanisms in .NET: the ValidationEventHandler and the DOM
mutation events that XmlDocument exposes. Here is a simple example that
manages to "mark" invalid element nodes, at least as long as they have
simple content that is not valid:
using System;
using System.Xml;
using System.Xml.Schema;

public class Test2005032802 {
  private XmlElement lastInsertedElement;
  private XmlDocument xmlDocument;
  private string xmlURL;
  private bool valid;
  // Name and validity of the element the validator last reported on
  private string lastElementName;
  private bool lastElementValid;
  private XmlValidatingReader xmlValidator;

  public static void Main (string[] args) {
    Test2005032802 test = new Test2005032802(args[0]);
    test.Load();
  }

  public Test2005032802 (string url) {
    xmlURL = url;
  }

  public void Load () {
    xmlDocument = new XmlDocument();
    xmlDocument.PreserveWhitespace = true;
    // Mutation event: fires for every node inserted while loading
    xmlDocument.NodeInserted +=
      new XmlNodeChangedEventHandler(NodeInsertedHandler);
    xmlValidator = new XmlValidatingReader(new XmlTextReader(xmlURL));
    // Validation event: fires for every schema warning or error
    xmlValidator.ValidationEventHandler +=
      new ValidationEventHandler(ValidationHandler);
    valid = true;
    lastElementValid = true;
    lastElementName = "";
    Console.WriteLine("Beginning validation:");
    xmlDocument.Load(xmlValidator);
    Console.WriteLine("Validation finished: XML document is {0}.",
      valid ? "valid" : "not valid");
    Console.WriteLine("Final OuterXml:");
    xmlDocument.Save(Console.Out);
  }

  void NodeInsertedHandler (object sender, XmlNodeChangedEventArgs args) {
    XmlNode currentlyInserted = args.Node;
    Console.WriteLine(
      "Node changed with action {0} and node {1} with type {2} " +
      "and name {3} and value {4}.",
      args.Action, currentlyInserted, currentlyInserted.NodeType,
      currentlyInserted.Name, currentlyInserted.Value);
    // Mark the element if the validator has just flagged it as invalid
    if (currentlyInserted.NodeType == XmlNodeType.Element &&
        lastElementName == currentlyInserted.Name &&
        !lastElementValid)
    {
      lastInsertedElement = (XmlElement) currentlyInserted;
      lastInsertedElement.SetAttribute("valid", "false");
      lastElementName = "";
      lastElementValid = true;
    }
  }

  void ValidationHandler (object sender, ValidationEventArgs args) {
    Console.WriteLine("Validation {0}: {1}.", args.Severity, args.Message);
    if (args.Severity == XmlSeverityType.Error) {
      valid = false;
      // Remember the element so that NodeInsertedHandler can mark it
      if (xmlValidator.NodeType == XmlNodeType.EndElement) {
        lastElementName = xmlValidator.Name;
        lastElementValid = false;
      }
    }
  }
}
For instance with the schema being
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" version="1.0">
  <xs:element name="gods">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="god" type="xs:NCName" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
and the XML instance being
<?xml version="1.0" encoding="UTF-8"?>
<gods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="test2005032801Xsd.xml">
  <god>Kibo</god>
  <god>1Xibo</god>
  <god>Jaffo</god>
  <god>-Maho</god>
</gods>
the output of the XML at the end is
<gods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="test2005032801Xsd.xml">
  <god>Kibo</god>
  <god valid="false">1Xibo</god>
  <god>Jaffo</god>
  <god valid="false">-Maho</god>
</gods>
But be aware that the code makes certain assumptions about the flow of
events in .NET, based on a few test examples. There is no documentation
saying how validation and mutation events interact, so you have to look
at the current event flow and then write code based on what you observe
in the current .NET version.
The main observation I have made is that the XmlValidatingReader is
positioned on an XmlNodeType.EndElement node when it reports a
validation error, and that this element is the next one inserted into
the document tree. That way you are able to "mark" the node with an
attribute.
I have not, however, tried any schema with elements having complex
types and validation errors relating to the defined structure of the
element; more logic would need to be added to handle that.
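For errors that relate to element structure, one possible direction is to load the document first and validate the finished tree afterwards, so that nothing depends on the order of reader and mutation events. The following is only a minimal sketch, not a tested replacement for the code above. It assumes .NET 2.0 or later, where XmlDocument.Validate and XmlSchemaValidationException.SourceObject are available. The class name ValidateThenMark and the method LoadAndMark are made up for illustration, and which node is reported for a structural error may well differ between schemas and framework versions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of the original example.
public static class ValidateThenMark
{
    // Loads 'xml', validates the tree against 'xsd', and sets
    // valid="false" on each element that an error is reported for.
    public static XmlDocument LoadAndMark(string xml, string xsd)
    {
        XmlDocument doc = new XmlDocument();
        doc.PreserveWhitespace = true;
        doc.LoadXml(xml);
        // A null namespace means "use the schema's own targetNamespace".
        doc.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));

        List<XmlElement> invalid = new List<XmlElement>();
        doc.Validate((sender, args) =>
        {
            if (args.Severity != XmlSeverityType.Error) return;
            XmlSchemaValidationException ex =
                args.Exception as XmlSchemaValidationException;
            if (ex == null) return;
            // Assumption: SourceObject is the DOM node the error is about.
            XmlNode node = ex.SourceObject as XmlNode;
            // Walk up from attributes or text nodes to the owning element.
            while (node != null && !(node is XmlElement))
            {
                XmlAttribute attr = node as XmlAttribute;
                node = attr != null ? attr.OwnerElement : node.ParentNode;
            }
            if (node != null) invalid.Add((XmlElement)node);
        });

        // Mark only after validation has finished, so that the tree is
        // not modified while the validator is still walking it.
        foreach (XmlElement element in invalid)
        {
            element.SetAttribute("valid", "false");
        }
        return doc;
    }
}
```

Collecting the elements first and marking them afterwards is a deliberate choice here: it keeps the new attribute from being validated itself, which is the concern raised at the top of this reply.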
--
Martin Honnen