Phil Hobgen wrote:
> I think this is probably a
> result of the fact that in the Unicode recommendation it says that "\w"
> should allow underscores because of its common use in programming languages.
Interesting. Someone recently ran into the problem that \w includes the
"_" in the regular expression dialects of some programming languages and
libraries (e.g. JavaScript/ECMAScript, or the .NET framework Regex
class) but not in the XSD schema regular expression language. I did not
know about the Unicode recommendation. Do you happen to have a link to
that part?
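To illustrate the programming-language side of that difference, here is a minimal sketch. It uses Python's re module purely as a stand-in (Python is not mentioned in the original post); its \w also accepts the underscore. The helper name regex_w_matches is made up for this example.

```python
import re

def regex_w_matches(value: str) -> bool:
    # Hypothetical helper: tests the whole value against \w{6}
    # using Python's re module, where \w includes "_".
    return re.fullmatch(r"\w{6}", value) is not None

print(regex_w_matches("abc_de"))  # True: "_" counts as a word character here
print(regex_w_matches("abc-de"))  # False: "-" is not a word character
```

The same value is what the XSD pattern facet rejects in the test further down.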
> Could someone tell me, if I change to use dotNet v2.0 will this behave in
> the way recommended by the W3C or is the behaviour the same as in dotNet v1.1?
With .NET 2.0, the following does not validate, either with the new
XmlReader (created with the proper XmlReaderSettings to validate) or
with the (obsolete) XmlValidatingReader:
<value>abc_de</value>
Schema excerpt:
<xs:element name="value" maxOccurs="unbounded">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="\w{6}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Validation error message:
"Error: The 'value' element is invalid - The value 'abc_de' is
invalid according to its datatype 'String' - The Pattern constraint failed."
So with .NET 2.0, \w in a pattern follows the W3C XML Schema
specification (at least as far as not including "_" in \w).
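For reference, XML Schema Part 2 defines \w as every character except those in the Unicode general categories "punctuation" (P), "separator" (Z) and "other" (C), and the underscore is in category Pc (connector punctuation). The following is only a rough Python emulation of that definition, not what the .NET validator actually runs; the function names are invented for the example.

```python
import unicodedata

def is_xsd_word_char(ch: str) -> bool:
    # XSD \w: any character whose general category does not
    # start with P (punctuation), Z (separator) or C (other).
    return unicodedata.category(ch)[0] not in ("P", "Z", "C")

def matches_xsd_w(value: str, count: int) -> bool:
    # Emulates the pattern \w{count}. XSD patterns always apply
    # to the whole value, so the length must match exactly.
    return len(value) == count and all(is_xsd_word_char(c) for c in value)

print(unicodedata.category("_"))   # Pc
print(matches_xsd_w("abc_de", 6))  # False, like the validation error above
print(matches_xsd_w("abcdef", 6))  # True
```

If the underscore should be allowed in the schema, a pattern such as [\w_]{6} names it explicitly.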
--
Martin Honnen --- MVP XML