I was hoping that someone could point me in the right direction. I'm looking to develop a tool that validates an XML file against an XSD schema and, if a node doesn't conform to the schema, removes that node from the XML (or outputs a new XML file without that node), continuing through the whole document until it is "clean" (valid). The code to validate against the schema is straightforward, but how do I use the exceptions thrown by the XmlValidatingReader to clean the XML file? Thanks!
First, you shouldn't let the validating reader throw exceptions; instead, provide a delegate for its ValidationEventHandler event. Even so, as I explained in http://weblogs.asp.net/cazzu/archive/2004/03/24/95588.aspx, you can't just call Skip() on sender because, due to a bug in v1.x, sender is not set to the reader raising the event. Therefore, you will need to keep a reference to the reader in a class-level field and, in the validation handler method, skip the current node:

    private void OnValidationError(object sender, ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Error)
        {
            // Accumulate error, set flag.
            _thereader.Skip();
        }
    }

That should do the job.

--
Daniel Cazzulino [MVP XML]
Clarius Consulting SA
http://weblogs.asp.net/cazzu
http://aspnet2.com
I don't think I was clear. I am calling a delegate to handle the event. The question is: once I have accumulated all of the errors, how do I then go and programmatically edit the XML document based on where those errors are? The errors give a line number, but how do I map that to a node in the XML document that I can then remove? Thanks!
Thanks for Daniel's quick response.

Hi Matthew,

First of all, I would like to confirm my understanding of your issue. From your description, I understand that you need to remove the nodes of an XmlDocument that are invalid against an XSD. If I have misunderstood anything, please feel free to let me know.

Based on my experience, this is very hard to achieve. As the validator only returns the line number and line position of the invalid node, we have to write our own code to map the file position to an XML node. However, the errors reported by the validator refer only to certain key nodes, while the underlying problem may involve other nodes. So removing a single node might not make the document valid.

HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no rights."
You have the issue correct. As you say: "As the validator only returns the line number and line position of the invalid node, we have to write our own code to map the file position to an XML node." This is the code I would like to write. Can you advise how I would do this? Thanks!
Hi Matthew,

Generally, I think we have to write code that first finds the invalid node in the XmlDocument according to its line and position, and then removes that node. After removing all the nodes in the list, validate the XmlDocument again. We can repeat this until no errors are found. This is just my suggestion. Let's see if any other community members have better advice.

HTH.

Kevin Yu

"write code that can find the invalid node in the XmlDocument according to the line and position first" - can you help with this part? Thanks, -Matthew
Have you tried my approach? That is, keeping the reader variable at the class level and skipping invalid nodes using the Skip() method?

--
Daniel Cazzulino [MVP XML]
Clarius Consulting SA
http://weblogs.asp.net/cazzu
http://aspnet2.com
Unless I misunderstood your post, all that will allow me to do is accumulate a list of the errors. The problem is that the XmlValidatingReader exception just gives a line number, and I don't see how to translate that into a node. If I misunderstood or you have a solution, please let me know. Thanks!
Here's my "solution":

    using System.IO;
    using System.Xml;
    using System.Xml.Schema;
    using System.Xml.XPath;

    public class MyLoader
    {
        // Kept in a field because sender is not the reader inside the handler (v1.x).
        XmlValidatingReader _reader;

        public XPathDocument LoadFilteredDocument(Stream theDoc)
        {
            _reader = new XmlValidatingReader(new XmlTextReader(theDoc));
            // Add your schemas
            _reader.ValidationEventHandler += new ValidationEventHandler(OnValidate);
            return new XPathDocument(_reader);
        }

        private void OnValidate(object sender, ValidationEventArgs e)
        {
            // Just skip the failing node.
            _reader.Skip();
        }
    }

HTH,

--
Daniel Cazzulino [MVP XML]
Clarius Consulting SA
http://weblogs.asp.net/cazzu
http://aspnet2.com
Hi Matthew,

The only approach I can think of is to go through each line of the XML file up to the position where the validation error occurs. Along the way, we count how many tags we have passed and so find the node that causes the error. This is quite complicated, and I can only provide a general idea. It seems that Daniel has provided an example; I think his way is better than mine.

HTH.

Kevin Yu
Thanks - I understand now. The drawback to this route is that it doesn't allow me to display a list of the validation errors to the user and have the user tell me which ones to fix; I must remove the nodes as I find them. I think that the user may want us to take care of certain validation errors, but may want to fix other errors manually, or may not want to fix certain errors at all. Is there some way to maintain a list of the validation errors and then iterate through the list, fixing as we go? Thanks, -Matthew
Thanks, that was very helpful and similar to what I need. I believe the best way to proceed is to run the validator and maintain a list of the bad element names with their line numbers. Then I would get the elements matching each name from the document and iterate through them until I find the one with the matching line number. Once I find it, I can do my repair work. Thanks!

Oleg Tkachenko [MVP] wrote:
> Matthew Wieder wrote:
>> "write code that can find the invalid node in the XmlDocument
>> according to the line and position first" - can you help with this part?
>
> Take a look at "Extending the DOM"
> http://msdn.microsoft.com/library/en-us/cpguide/html/cpconextendingdom.asp?frame=true,
> it shows how to extend XmlDocument to support the IXmlLineInfo interface.
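A minimal sketch of the plan described above: collect the errors with their line info first, then look the nodes up in a line-aware DOM. This is not code from the thread. It assumes the newer XmlReaderSettings and LINQ to XML APIs (XDocument loaded with LoadOptions.SetLineInfo) instead of the v1.x XmlValidatingReader and the extended XmlDocument, and the names LineInfoLookup, CollectErrors, FindElementAt and RemoveInvalidElements are made up for illustration. It also assumes that each error is reported at the start tag of the offending element; an error reported anywhere else (for example at a parent's end tag) finds no match and is left alone.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class LineInfoLookup
{
    // Pass 1: validate and record every error; nothing is modified yet.
    public static List<XmlSchemaException> CollectErrors(string xml, string xsd)
    {
        var errors = new List<XmlSchemaException>();
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        settings.ValidationEventHandler += (sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error)
                errors.Add(e.Exception); // carries LineNumber and LinePosition
        };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // reading drives the validation
        }
        return errors;
    }

    // Pass 2 helper: the element whose recorded start position is exactly
    // (line, position), or null when no element starts there.
    public static XElement FindElementAt(XDocument doc, int line, int position)
    {
        return doc.Descendants().FirstOrDefault(el =>
        {
            IXmlLineInfo info = el;
            return info.HasLineInfo()
                && info.LineNumber == line
                && info.LinePosition == position;
        });
    }

    // Pass 2: remove every element that an error points at directly.
    public static XDocument RemoveInvalidElements(string xml, string xsd)
    {
        XDocument doc = XDocument.Parse(xml, LoadOptions.SetLineInfo);
        foreach (XmlSchemaException error in CollectErrors(xml, xsd))
        {
            XElement bad = FindElementAt(doc, error.LineNumber, error.LinePosition);
            if (bad != null && bad.Parent != null) // never remove the root element
                bad.Remove();
        }
        return doc;
    }
}
```

Because the error list is kept separately from the document, the loop in RemoveInvalidElements is also the place where a user's choice (fix, skip, fix by hand) could be consulted before calling Remove(). Whether the two sets of positions line up exactly should be checked against your own documents, since a mismatch between the two sources of line info is exactly what is reported later in this thread.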
Using the implementation of the LineInfo XmlDocument class (thanks Oleg!), I have a process which captures the errors in an array, then goes back through the LineInfoDocument and compares the line and position info of each node to the entries in my array. For some reason, the line and position information is not aligned properly: an error in an element which the XmlValidatingReader reports at line number 100 and line position 100 matches a node in the LineInfoDocument at line number 99 and line position 99. The LineInfo implementation is available here: http://msdn.microsoft.com/library/en-us/cpguide/html/cpconextendingdom.asp?frame=true Thanks!
Hi Matthew,

I'd like to know if this issue has been resolved yet. Is there anything I can help with? I'm still monitoring this thread. If you have any questions, please feel free to post them in the community.

Kevin Yu
Not bad, but what about the following scenario: the XML document has in it an element of type X which, according to the schema, must contain a type Y element, but does not. Let's say that, instead of removing the X element and thereby losing all the information contained in it, all we need to do is add an empty element of type Y. I understand this is a little different from the initial problem I proposed, but I'm trying to cover all the scenarios as they come up. How would we handle that? Thanks!
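For the missing-child scenario, the repair is an insertion rather than a removal. A minimal sketch, assuming plain XmlDocument, elements in no namespace, and a hypothetical helper named AddMissingChild (nothing here comes from the thread):

```csharp
using System.Collections.Generic;
using System.Xml;

public static class MissingChildFixer
{
    // Appends an empty <childName/> to every <parentName> element that has
    // no child element of that name. Returns how many elements were patched.
    public static int AddMissingChild(XmlDocument doc, string parentName, string childName)
    {
        // Copy the matches first so the tree is not modified while it is
        // being enumerated.
        var parents = new List<XmlNode>();
        foreach (XmlNode node in doc.GetElementsByTagName(parentName))
            parents.Add(node);

        int patched = 0;
        foreach (XmlNode parent in parents)
        {
            if (parent[childName] == null) // no child element with that name
            {
                parent.AppendChild(doc.CreateElement(childName));
                patched++;
            }
        }
        return patched;
    }
}
```

Calling MissingChildFixer.AddMissingChild(doc, "X", "Y") patches every X that lacks a Y. Note that AppendChild puts the new element last; if the schema declares the children of X as an ordered sequence, the Y element may have to go in with InsertBefore or PrependChild instead, and an empty Y only validates if its own type allows empty content.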
Hi Matthew,

If you need to maintain a list of validation errors, I think we can go through the whole document twice. The first time, we get the list of errors and their positions. The second time, we let the user choose which ones to fix. We can keep the errors in document order so that each error will be found at the correct position.

Kevin Yu
I have the following situation: I loop through the document using the XmlValidatingReader to get a list of all the validation errors (line number and position). I then loop through the XML document with the XmlTextReader class, calling reader.Read() until reader.LineNumber equals the line number of the first error in the list. I then execute:

    XmlNode node = xmlLIDoc.ReadNode(reader);
    node.ParentNode.RemoveChild(node);

(where xmlLIDoc is an instance of the XmlDocument class extended with the IXmlLineInfo interface), but the value of node.ParentNode is undefined. It appears that the node got "removed" from the XML document and hence is orphaned. How can I get hold of the node in the XmlDocument so I can delete it? Thanks!
Hi Matthew,

Generally, if a node's ParentNode property is null, it means either that the node hasn't been added to the DOM tree or that it is the root node, which doesn't have a parent. So please check whether this node has already been removed from the tree.

Kevin Yu
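A small sketch of the first case, under the assumption that this is what happens in the code above: XmlDocument.ReadNode builds a new node from whatever the reader is positioned on. The document owns that node, but it was never inserted into the tree, so its ParentNode is null. To delete something from the loaded document, you need the node that is already in the tree. The class and method names are hypothetical, and the XPath lookup stands in for whatever lookup you use (for example, comparing line info on the extended XmlDocument's nodes).

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReadNodeDemo
{
    // Builds a node from the first child of the root as seen by a separate
    // reader. The result belongs to doc but is not attached to its tree.
    public static XmlNode ReadFirstChildDetached(XmlDocument doc, string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // positioned on the root element
            reader.Read();          // positioned on the root's first child
            return doc.ReadNode(reader);
        }
    }

    // Removes the node that is actually part of the document's tree.
    public static bool RemoveAttached(XmlDocument doc, string xpath)
    {
        XmlNode attached = doc.SelectSingleNode(xpath);
        if (attached == null || attached.ParentNode == null)
            return false;
        attached.ParentNode.RemoveChild(attached);
        return true;
    }

    public static void Main()
    {
        const string xml = "<root><a/><b/></root>";
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode copy = ReadFirstChildDetached(doc, xml);
        Console.WriteLine(copy.ParentNode == null);               // True
        Console.WriteLine(RemoveAttached(doc, "/root/a"));         // True
        Console.WriteLine(doc.DocumentElement.ChildNodes.Count);   // 1
    }
}
```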