trying to parse poorly formed xml in C#

Hi All -

I've run into a problem trying to parse an XML stream.

I've consumed a web service that returns the following string:

<?xml version="1.0" encoding="utf-8"?>
<!-- Elevation Values of -1.79769313486231E+308 (Negative Exponential Value) may mean the data source does not have values at that point. -->
<USGS_Elevation_Web_Service_Query>
<Elevation_Query x="-104.8913446" y="39.93402002">
<Data_Source>NED Contiguous U. S. 1/3W arc second elevation data</Data_Source>
<Data_ID>NED.CONUS_NED_13W</Data_ID>
<Elevation>5081.4971523335</Elevation>
<Units>FEET</Units>
</Elevation_Query>

I've worked through some different test XML files and successfully selected the nodes' inner text (using an XPath query) with the code below:

using System.Xml;

XmlDocument doc = new XmlDocument();
doc.Load(<path to test xml doc>);  // placeholder: path to the test XML file
string xpath = "Customers/Customer[@id='5']/LastName";
XmlNode node = doc.SelectSingleNode(xpath);  // null if no node matches

My questions:
1. Why is this technique not working on the string returned by the web service? I gather the XML string is poorly formed (the tag "<USGS_Elevation_Web_Service_Query>" is never closed).

2. Any suggestions for a workaround, or is there something I'm missing? Perhaps an XPath syntax problem?

Thanks -

Matt
[1467 byte] By [MattSeitz] at [2007-11-20 9:10:07]
# 1 Re: trying to parse poorly formed xml in C#
If the XML document is ill-formed you shouldn't expect any XPath query to work: the parser will normally reject the document before a query can run.
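
To see this for yourself, here is a minimal sketch that checks whether a string loads as XML at all. The class and method names (WellFormedCheck, IsWellFormed) are made up for illustration; the only library behaviour relied on is that XmlDocument.LoadXml throws an XmlException when the input is not well-formed.

```csharp
using System;
using System.Xml;

class WellFormedCheck
{
    // Hypothetical helper: returns true if the string parses as XML.
    // On failure, 'error' receives the parser's message.
    public static bool IsWellFormed(string xml, out string error)
    {
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);   // throws XmlException on ill-formed input
            error = string.Empty;
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }

    static void Main()
    {
        // The root element is never closed, as in the web service response.
        string broken = "<root><child>text</child>";
        string error;

        Console.WriteLine(IsWellFormed(broken, out error));             // False
        Console.WriteLine(error);                                       // the parser's explanation
        Console.WriteLine(IsWellFormed(broken + "</root>", out error)); // True
    }
}
```

The error message usually names the unclosed element, which makes it a quick way to find out what the service left out.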

AFAIK the best approach is to track down the bug (or the developer who introduced it), or to fix the XML yourself so that it is well-formed, and then extract the data. I'm 99.9% sure there are open-source utilities out there that will help you fix ill-formed XML documents.

One that might help you is the HTML Tidy Project (http://tidy.sourceforge.net/). Anyway, in your case it might be as easy as just adding the missing end tag.
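
Adding the missing end tag could look roughly like the sketch below. It assumes the only defect is the unclosed <USGS_Elevation_Web_Service_Query> root, as in the response you posted; the class and method names (ElevationParser, Repair, ReadElevation) are hypothetical, and the sample string is copied from your post rather than fetched from the service.

```csharp
using System;
using System.Globalization;
using System.Xml;

class ElevationParser
{
    const string RootName = "USGS_Elevation_Web_Service_Query";

    // Appends the root end tag if the response does not already end with it.
    // Assumption: this is the only thing wrong with the response.
    public static string Repair(string response)
    {
        string endTag = "</" + RootName + ">";
        string trimmed = response.TrimEnd();
        return trimmed.EndsWith(endTag, StringComparison.Ordinal)
            ? trimmed
            : trimmed + endTag;
    }

    // Loads the repaired response and reads the Elevation element via XPath.
    public static double ReadElevation(string response)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Repair(response));

        XmlNode node = doc.SelectSingleNode(
            "/" + RootName + "/Elevation_Query/Elevation");
        if (node == null)
            throw new InvalidOperationException("Elevation element not found.");

        // Invariant culture so "5081.49..." parses the same on every machine.
        return double.Parse(node.InnerText, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        // Sample response from the original post; the root tag is never closed.
        string response =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<!-- Elevation Values of -1.79769313486231E+308 (Negative Exponential Value) " +
            "may mean the data source does not have values at that point. -->" +
            "<USGS_Elevation_Web_Service_Query>" +
            "<Elevation_Query x=\"-104.8913446\" y=\"39.93402002\">" +
            "<Data_Source>NED Contiguous U. S. 1/3W arc second elevation data</Data_Source>" +
            "<Data_ID>NED.CONUS_NED_13W</Data_ID>" +
            "<Elevation>5081.4971523335</Elevation>" +
            "<Units>FEET</Units>" +
            "</Elevation_Query>";

        double elevation = ReadElevation(response);
        Console.WriteLine(elevation.ToString(CultureInfo.InvariantCulture));
    }
}
```

Note that the XPath has to start from the actual root element of the response; a query written for the Customers test file will not match anything here even once the document loads.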

- petter
wildfrog at 2007-11-9 11:53:26 >
# 2 Re: trying to parse poorly formed xml in C#
Thanks - I added the missing tag manually and that did the trick!
MattSeitz at 2007-11-9 11:54:28 >