CDATA delimiter within CDATA Section


CDATA delimiter within CDATA Section Cade Perkins
12/4/2003 5:20:55 PM
dotnet xml:
How can the CDATA ending delimiter "]]>" be represented within a CDATA
section itself?

Consider an XML document that is intended to contain an embedded,
uninterpreted XML example. Generally, the easiest way to represent it would
be to put the embedded XML example inside a CDATA section. But if the
example has a CDATA section itself, then the ending delimiter will be
interpreted as the end of the "real" CDATA section. Here's an even simpler
example with a script's conditional statement:

<test>
<![CDATA[
...
if g[a[i]]>f ...
...
]]>
</test>

This element returns errors when viewed and/or validated in VS.Net and IE 6.
They both interpret the "]]>" within the "if g[a[i]]>f ..." statement as the
end of the CDATA section. Multiple SGML sources on the web indicate that
the above XML should process correctly without change. Others say that the
">" (greater than) should be escaped inside the CDATA section using the
entity reference &gt;. For example:

<test>
<![CDATA[
...
if g[a[i]]&gt;f ...
...
]]>
</test>

But this doesn't work either because entity references are not interpreted
within a CDATA section. That's the whole point of the CDATA section... to
represent literal character data without escapes.

Is it even possible then? Or is this just a bug in all MS XML parsers?
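
For what it's worth, the behavior described above is not specific to the MS parsers. Here is a minimal sketch in Python (chosen only for illustration, since the thread is about .NET) using the standard library's expat-based xml.etree.ElementTree. The first document should be rejected, because the section ends at the first "]]>" and the later "]]>" is left as stray character data. The second parses, but "&gt;" stays literal. The variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# First attempt: the section ends at the first "]]>", so the later "]]>"
# is stray character data, which makes the document not well-formed.
raw = "<test><![CDATA[if g[a[i]]>f ...]]></test>"
try:
    ET.fromstring(raw)
    raw_parses = True
except ET.ParseError as err:
    raw_parses = False
    print("rejected:", err)

# Second attempt: well-formed, but entity references are not expanded
# inside a CDATA section, so the text contains "&gt;" literally.
escaped = "<test><![CDATA[if g[a[i]]&gt;f ...]]></test>"
escaped_text = ET.fromstring(escaped).text
print(escaped_text)
```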

Re: CDATA delimiter within CDATA Section Bjoern Hoehrmann
12/5/2003 2:07:09 AM
* Cade Perkins wrote in microsoft.public.dotnet.xml:

Re: CDATA delimiter within CDATA Section Rowland Shaw
12/5/2003 8:27:15 AM
Two options:
<test>
<![CDATA[
...
if g[a[i]]>]]&gt;<![CDATA[f ...
...
]]>
</test>

And the more readable:
<test>
<![CDATA[
...
if g[a[i]] > f ...
...
]]>
</test>
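
A quick way to compare the two options is to parse each one and look at the text that comes back. This is only a sketch in Python with the standard library's xml.etree.ElementTree, with the "..." filler lines trimmed from the examples. The first option should give back the original statement unchanged, while the second gives back a statement with two extra spaces in it.

```python
import xml.etree.ElementTree as ET

original = "if g[a[i]]>f ..."

# Option 1: close the section early, write "]]&gt;" as ordinary character
# data, then reopen a new section for the rest of the text.
split = "<test><![CDATA[if g[a[i]]>]]&gt;<![CDATA[f ...]]></test>"
split_text = ET.fromstring(split).text

# Option 2: put spaces around ">" so that "]]>" never appears.
spaced = "<test><![CDATA[if g[a[i]] > f ...]]></test>"
spaced_text = ET.fromstring(spaced).text

print(split_text == original)   # option 1 preserves the data
print(spaced_text == original)  # option 2 changes the data
```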


"Cade Perkins" <msnews@perkcan.net> wrote...

Re: CDATA delimiter within CDATA Section Julian F. Reschke
12/5/2003 10:11:05 AM

Just don't use CDATA at all. Simply escape all "<" as "&lt;" and all "&" as
"&amp;", and you're done.

Re: CDATA delimiter within CDATA Section Cade Perkins
12/5/2003 11:50:20 AM
In reply to Julian F. Reschke:

Thanks for the help! This doesn't work for me in my case, but other
responses helped me find a reasonable solution. For a few characters, this
fix would be fine, but if I am automating the process and simply want to
store a block of unknown text data, I don't want to have to consider and
replace every possible meta character that I need to escape using entity
references. The CDATA solution would work great except for the inability to
escape the end delimiter. I'm surprised that it is not part of the
specification. I don't understand all the roots of SGML and I imagine it
has an explanation, but every other language/specification I've worked with
allows some way of escaping the delimiter.
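
On the automation concern: when the XML is generated through a library's serializer rather than by string concatenation, the escaping is usually done for you. Here is a hedged sketch in Python with the standard library's xml.etree.ElementTree. The sample text and names are invented for illustration, and the equivalent writer API in .NET is not shown here.

```python
import xml.etree.ElementTree as ET

# An arbitrary block of text, including markup characters and "]]>".
block = 'if g[a[i]]>f && x < y: print("<![CDATA[ ... ]]>")'

elem = ET.Element("test")
elem.text = block  # no manual escaping; the serializer takes care of it
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)

# Parsing the generated XML should give back the original block.
round_trip = ET.fromstring(xml_text).text
print(round_trip == block)
```

The output uses entity references instead of a CDATA section, so it is less readable than CDATA, but no character needs special-case handling by the caller.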

Re: CDATA delimiter within CDATA Section Cade Perkins
12/5/2003 11:55:20 AM
Thanks for the help! I'm trying to automate some XML generation, but want
to store literal blocks of text without worrying about having to replace all
possible meta characters with entity references. Although it's not
"pretty", I guess I'll just have to replace all instances of "]]>" with
"]]>]]&gt;<![CDATA[" before I place the data in an XML element.

As for the second option (just adding a space), it would probably be great
for the simple example, but other times I wouldn't be able to arbitrarily
add spaces to the data just to make it work.


Re: CDATA delimiter within CDATA Section Ayende Rahien
12/5/2003 8:58:04 PM


Do this:
replace all the < and > in your inner XML document with &lt; and &gt; (and
any & with &amp;, doing that one first), and that would solve it.
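
As a sketch of that approach for the original use case (an embedded XML example that has a CDATA section of its own), here is a Python version using xml.sax.saxutils.escape from the standard library. The element names and the inner document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The embedded example, itself containing a CDATA section.
inner = "<script><![CDATA[if a < b && c ...]]></script>"

# escape() replaces "&", "<" and ">" with entity references, so the inner
# document becomes plain character data in the outer one.
outer = "<example>" + escape(inner) + "</example>"
print(outer)

# Parsing the outer document gives back the inner document as text,
# which can then be parsed on its own.
recovered = ET.fromstring(outer).text
inner_text = ET.fromstring(recovered).text
print(recovered == inner)
```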
