XmlTextReader URL Limitation???


XmlTextReader URL Limitation??? Q
9/28/2005 1:25:45 PM
dotnet xml:
I am feeding XmlTextReader a URL that returns the XML that then gets parsed.
The URL forms a query that affects how much data is returned in XML but not
the format of the data.

The problem is that when the URL string exceeds about 163 characters
(strange number), XmlTextReader seems to choke on it. No XML appears to be
returned, and the code throws an exception on reading the first element.
Shorter URL strings work fine. URLs longer than ~163 characters also work
fine when pasted into a browser, so the server is returning good XML.

Sorry for the long post.....
Code, XML, error msg follow:

Imports System
Imports System.Xml
Imports System.Xml.XPath
Imports System.Xml.Xsl
Imports System.Xml.Schema
Imports System.Collections

Module Module1
    Sub FPO_XmlQuery()
        Dim urlA As String
        Dim urlB As String
        Dim strMatches As String
        Dim intMatchCount As Integer
        Dim i As Integer

        ' This URL works.
        urlA = "http://www.freepatentsonline.com/xml-search.pl?type=uspatent&query=ttl/cryogenic&sort=chron&date_range=all&start=51&session_id=ABCD123EFGHIJKLMNOPQRSTUV"
        ' This URL fails.
        urlB = "http://www.freepatentsonline.com/xml-search.pl?type=uspatent&query=ttl/cryogenic&stemming=yes&sort=chron&date_range=all&start=51&session_id=ABCD123EFGHIJKLMNOPQRSTUV"
        ' Both URLs return the exact same XML, and both work fine if you
        ' paste them into a browser and get the XML back.
        ' It appears that the length of the URL string is an issue for
        ' XmlTextReader.

        Dim r As XmlTextReader
        ' Sub in urlA or urlB here, or "xm1.txt" to read the XML from a local file.
        r = New XmlTextReader(urlA)
        ' This is where the exception occurs with the "long" URL.
        r.ReadStartElement("results")
        strMatches = r.ReadElementString("matches")
        Console.WriteLine("matches: {0}", strMatches)
        Console.WriteLine("query: {0}", r.ReadElementString("query"))

        intMatchCount = 50 ' CInt(strMatches)

        For i = 1 To intMatchCount
            r.ReadStartElement("uspatent")
            Console.WriteLine("match: {0}", r.ReadElementString("match"))
            Console.WriteLine("document: {0}", r.ReadElementString("document"))
            Console.WriteLine("title: {0}", r.ReadElementString("title"))
            Console.WriteLine("link: {0}", r.ReadElementString("link"))
            r.ReadEndElement() ' </uspatent>
        Next

        r.ReadEndElement() ' </results>
        r.Close()

    End Sub

    Sub Main()
        Console.WriteLine("XML tests...")

        Console.WriteLine("Free Patent Read *********************")
        FPO_XmlQuery()

        ' Keep the console window open until Enter is pressed.
        Console.ReadLine()
    End Sub

End Module
----------------------------------------------------------------------------
sample of xml returned:

<?xml version="1.0"?>
<results>
<matches>1826</matches>
<query>TTL:cryogen^4.0</query>
<uspatent>
<match>51 </match>
<document>6722140</document>
<title><![CDATA[ Cascade cryogenic thermoelectric cooler for cryogenic and room temperature applications]]></title>
<link>http://www.freepatentsonline.com/us6722140.html</link>
</uspatent>
<uspatent>
<match>52 </match>
<document>6718775</document>
<title><![CDATA[ Dual chamber cooling system with cryogenic and non-cryogenic chambers for ultra high vacuum system]]></title>
<link>http://www.freepatentsonline.com/us6718775.html</link>
</uspatent>
<uspatent>
<match>53 </match>
<document>6712880</document>
<title><![CDATA[ Cryogenic process utilizing high pressure absorber column]]></title>
<link>http://www.freepatentsonline.com/us6712880.html</link>
</uspatent>
....
....
....
----------------------------------------------------------------------------
Error message text:

Unhandled Exception: System.Net.WebException: The underlying connection was closed: The server committed an HTTP protocol violation.
at System.Net.HttpWebRequest.CheckFinalStatus()
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at System.Net.HttpWebRequest.GetResponse()
at System.Xml.XmlDownloadManager.GetNonFileStream(Uri uri, ICredentials credentials)
at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials)
at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
at System.Xml.XmlTextReader.CreateScanner()
at System.Xml.XmlTextReader.Init()
at System.Xml.XmlTextReader.Read()
at System.Xml.XmlReader.MoveToContent()
at System.Xml.XmlReader.ReadStartElement(String name)
at XMLtest2.Module1.FPO_XmlQuery() in C:\XMLtest\XMLtest2\Module1.vb:line 68
at XMLtest2.Module1.Main() in C:\XMLtest\XMLtest2\Module1.vb:line 94


Re: XmlTextReader URL Limitation??? Alex Krawarik [MSFT]
9/29/2005 9:41:05 AM
What build are you using, please? I transposed your code into C# and I
cannot reproduce the problem on Whidbey RTM builds.

C:\>csc FPO_XmlQuery.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

FPO_XmlQuery.cs(17,16): warning CS0219: The variable 'urla' is assigned but its value is never used

C:\Documents and Settings\alexkr\My Documents\My Tests\Managed>FPO_XmlQuery.exe
XML tests...
Free Patent Read *********************
matches: 1827
query: TTL:cryogen^4.0
match: 51
document: 6722866
title: Pump system for delivering cryogenic liquids
link: http://www.freepatentsonline.com/us6722866.html

.... <snipped for length> ...

match: 100
document: 6622758
title: Interlock for cryogenic liquid off-loading systems
link: http://www.freepatentsonline.com/us6622758.html

C:\>




Re: XmlTextReader URL Limitation??? Q
9/29/2005 5:36:46 PM
I'm using Visual Studio .NET 2003 and .NET v1.1. As it turns out, the error
has nothing to do with XmlTextReader at all. The problem is with the
parsing of HTTP headers in .NET. .NET 1.0 SP3 and .NET 1.1 SP1 tightened up
the parsing of HTTP headers for security reasons (to block malformed
headers). KB 888527 has info about this. The workaround is to set
useUnsafeHeaderParsing = true in the app.config file, as the knowledge-base
article states. The "fix" is to have the XML server send clean HTTP
headers. It would be interesting to see whether the default behavior for
Whidbey has useUnsafeHeaderParsing = true. That would be a potential
security hole (albeit maybe a minor one).
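
For anyone who lands here with the same exception, the workaround described
above looks roughly like the following. This is a minimal sketch, assuming an
otherwise empty application configuration file; if your app.config already
has a system.net section, only the httpWebRequest element needs to be merged
into its settings element.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.net>
    <settings>
      <!-- Relaxes the strict HTTP header validation for this application only. -->
      <httpWebRequest useUnsafeHeaderParsing="true" />
    </settings>
  </system.net>
</configuration>
```

At run time the file has to sit next to the executable under the name of the
executable plus ".config". Visual Studio copies the project's app.config
there on build. Since the setting loosens header validation for every HTTP
request the application makes, it is best treated as a stopgap until the
server sends well-formed headers.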

For me, the problem is solved.... Thanks for your response.
