<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Parsing Atom with libxml2</title>
	<atom:link href="http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/feed" rel="self" type="application/rss+xml" />
	<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2</link>
	<description>a geek commodity</description>
	<lastBuildDate>Mon, 23 Jan 2012 20:27:23 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
	<item>
		<title>By: Phil</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1182</link>
		<dc:creator>Phil</dc:creator>
		<pubDate>Mon, 26 Nov 2007 22:06:19 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1182</guid>
		<description>I really must have made a terrible typo first time around as setting the namespace context really does work. How very embarrassing and frustrating. Thanks all!</description>
		<content:encoded><![CDATA[<p>I really must have made a terrible typo first time around as setting the namespace context really does work. How very embarrassing and frustrating. Thanks all!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1181</link>
		<dc:creator>Phil</dc:creator>
		<pubDate>Mon, 26 Nov 2007 10:38:58 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1181</guid>
		<description>Sam, Adam and Aristotle, I was strongly under the impression that I had tried that. It is possible that I made a typo. Your comments certainly suggest it, although the interpreter didn&#039;t report any problems with my syntax.

I used &quot;//feed&quot; in my example simply to demonstrate that regardless of the base document the query should return something. My actual XPath query was &quot;/feed&quot;.

Asbjørn, you&#039;re right but given that I thought I&#039;d tried the single-line sprinkling suggested by Sam and Aristotle, I was hoping to draw out the XPath experts and it seems to have worked ;)

I did actually look for an equivalent of xpath.flipBozoBit() which would allow me to query the default namespace directly.

Edward - I had libxml2 to hand and wanted to run some very specific atom-based queries. Plus I wanted to increase my understanding of the library.

Jeni and anon, thanks very much for that, it&#039;s a useful selector I&#039;d forgotten about.</description>
		<content:encoded><![CDATA[<p>Sam, Adam and Aristotle, I was strongly under the impression that I had tried that. It is possible that I made a typo. Your comments certainly suggest it, although the interpreter didn&#8217;t report any problems with my syntax.</p>
<p>I used &#8220;//feed&#8221; in my example simply to demonstrate that regardless of the base document the query should return something. My actual XPath query was &#8220;/feed&#8221;.</p>
<p>Asbjørn, you&#8217;re right but given that I thought I&#8217;d tried the single-line sprinkling suggested by Sam and Aristotle, I was hoping to draw out the XPath experts and it seems to have worked ;)</p>
<p>I did actually look for an equivalent of xpath.flipBozoBit() which would allow me to query the default namespace directly.</p>
<p>Edward &#8211; I had libxml2 to hand and wanted to run some very specific atom-based queries. Plus I wanted to increase my understanding of the library.</p>
<p>Jeni and anon, thanks very much for that, it&#8217;s a useful selector I&#8217;d forgotten about.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alf</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1180</link>
		<dc:creator>alf</dc:creator>
		<pubDate>Mon, 26 Nov 2007 10:14:11 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1180</guid>
		<description>PHP5&#039;s SimpleXML (based on libxml2) has a registerXPathNamespace function - maybe Python has an equivalent?

Otherwise yes, in the past I&#039;ve just mangled the &quot;xmlns=&quot; bit of the default namespace declaration so that it doesn&#039;t apply any more.</description>
		<content:encoded><![CDATA[<p>PHP5&#8217;s SimpleXML (based on libxml2) has a registerXPathNamespace function &#8211; maybe Python has an equivalent?</p>
<p>Otherwise yes, in the past I&#8217;ve just mangled the &#8220;xmlns=&#8221; bit of the default namespace declaration so that it doesn&#8217;t apply any more.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Asbjørn Ulsberg</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1179</link>
		<dc:creator>Asbjørn Ulsberg</dc:creator>
		<pubDate>Mon, 26 Nov 2007 09:12:05 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1179</guid>
		<description>What would you do with XML nodes in the empty namespace (xmlns=&quot;&quot;), then? I kind of agree with you, though. However, I don&#039;t think it&#039;s worth making so much fuss about; it only requires one more line of code to define the Atom namespace with a prefix and then sprinkle the prefix through the XPath statements.

Making the default namespace equivalent of the empty namespace should probably be done explicitly anyhow, with an optional parameter to parseFile() or something similar. It has to be under the author&#039;s control whether he wants to access empty namespaced elements or not, and I think the default behaviour should be as it currently is.</description>
		<content:encoded><![CDATA[<p>What would you do with XML nodes in the empty namespace (xmlns=""), then? I kind of agree with you, though. However, I don&#8217;t think it&#8217;s worth making so much fuss about; it only requires one more line of code to define the Atom namespace with a prefix and then sprinkle the prefix through the XPath statements.</p>
<p>Making the default namespace equivalent of the empty namespace should probably be done explicitly anyhow, with an optional parameter to parseFile() or something similar. It has to be under the author&#8217;s control whether he wants to access empty namespaced elements or not, and I think the default behaviour should be as it currently is.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeni Tennison</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1178</link>
		<dc:creator>Jeni Tennison</dc:creator>
		<pubDate>Mon, 26 Nov 2007 08:47:01 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1178</guid>
		<description>It&#039;s hardly ideal, but you could use paths like &quot;//*[name() = &#039;feed&#039;]&quot;. Really there should be a way of binding a prefix (eg atom) to the namespace before you evaluate any XPaths, so you can do &quot;//atom:feed&quot;.</description>
		<content:encoded><![CDATA[<p>It&#8217;s hardly ideal, but you could use paths like &#8220;//*[name() = 'feed']&#8221;. Really there should be a way of binding a prefix (eg atom) to the namespace before you evaluate any XPaths, so you can do &#8220;//atom:feed&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aristotle Pagaltzis</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1177</link>
		<dc:creator>Aristotle Pagaltzis</dc:creator>
		<pubDate>Mon, 26 Nov 2007 06:21:00 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1177</guid>
		<description>&lt;code&gt;&gt;&gt;&gt; import libxml2
&gt;&gt;&gt; doc = libxml2.parseFile(&quot;/tmp/feed.atom&quot;)
&gt;&gt;&gt; xc = doc.xpathNewContext()
&gt;&gt;&gt; xc.xpathRegisterNs(&quot;atom&quot;,&quot;http://www.w3.org/2005/Atom&quot;)
0
&gt;&gt;&gt; results = xc.xpathEval(&quot;//atom:feed&quot;)
&gt;&gt;&gt; len(results)
1&lt;/code&gt;</description>
		<content:encoded><![CDATA[<p><code>&gt;&gt;&gt; import libxml2<br />
&gt;&gt;&gt; doc = libxml2.parseFile("/tmp/feed.atom")<br />
&gt;&gt;&gt; xc = doc.xpathNewContext()<br />
&gt;&gt;&gt; xc.xpathRegisterNs("atom","http://www.w3.org/2005/Atom")<br />
0<br />
&gt;&gt;&gt; results = xc.xpathEval("//atom:feed")<br />
&gt;&gt;&gt; len(results)<br />
1</code></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Edward O'Connor</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1176</link>
		<dc:creator>Edward O'Connor</dc:creator>
		<pubDate>Mon, 26 Nov 2007 05:18:37 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1176</guid>
		<description>Why not use the Universal Feed Parser?</description>
		<content:encoded><![CDATA[<p>Why not use the Universal Feed Parser?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: anon</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1175</link>
		<dc:creator>anon</dc:creator>
		<pubDate>Mon, 26 Nov 2007 03:49:01 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1175</guid>
		<description>For the lazy one-shot jobs where you don&#039;t want to write the extra few lines for your own xpathContext to resolve namespaces correctly, you can use the lazy pretend-there-aren&#039;t-any idiom:
results = doc.xpathEval(&quot;//*[local-name()=&#039;feed&#039;]&quot;)</description>
		<content:encoded><![CDATA[<p>For the lazy one-shot jobs where you don&#8217;t want to write the extra few lines for your own xpathContext to resolve namespaces correctly, you can use the lazy pretend-there-aren&#8217;t-any idiom:<br />
results = doc.xpathEval("//*[local-name()='feed']")</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adam Fitzpatrick</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1174</link>
		<dc:creator>Adam Fitzpatrick</dc:creator>
		<pubDate>Mon, 26 Nov 2007 02:24:47 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1174</guid>
		<description>You only need to make a small change.

&gt;&gt;&gt; import libxml2
&gt;&gt;&gt; doc = libxml2.parseFile(&quot;/home/pip/allposts.xml&quot;)
&gt;&gt;&gt; ctxt = doc.xpathNewContext()
&gt;&gt;&gt; ctxt.xpathRegisterNs(&quot;a&quot;, &quot;http://www.w3.org/2005/Atom&quot;)
&gt;&gt;&gt; results = ctxt.xpathEval(&quot;//a:feed&quot;)

You can reuse the XPath context object for other XPath queries on the same document.

There are two subtle things to note. First, prefix:localname in XPath matches an element with that local name in the namespace referred to by that prefix, but a name without a prefix in an XPath expression always means that name in &quot;the namespace you have when you don&#039;t have a namespace&quot; (or &quot;the null namespace&quot; as Daniel Veillard less whimsically describes it in the email Aristotle Pagaltzis quotes in the blog post you refer to). Like Veillard says, XPath just doesn&#039;t have the &quot;default namespace&quot; concept like XML itself does.

It doesn&#039;t help that the Namespaces in XML specification doesn&#039;t define a practical term for &quot;the null namespace&quot;; it uses cumbersome language like &quot;the namespace name has no value&quot; (see the definition of &quot;expanded name&quot;, or section 6.2 (Namespace Defaulting) for example).

Incidentally, though this characteristic of XPath is very inconvenient for element names, *attribute* names with no prefix in XML are also in the null namespace, so XPath&#039;s behaviour is obviously a much better fit for matching attribute names.

The other issue is that XPath implementations basically never use the document&#039;s namespace prefix bindings (quite reasonably so, for two reasons: those bindings can differ on every element in the document; and, more commonly, different documents can and do use different prefixes, and you basically never want to discriminate between documents on the basis of the prefix).

This means that option 1 won&#039;t work (because the lack of prefix in the source document isn&#039;t the problem), option 2 won&#039;t be necessary, and option 3 won&#039;t be a problem if there turns out to be a next time after all.</description>
		<content:encoded><![CDATA[<p>You only need to make a small change.</p>
<p>&gt;&gt;&gt; import libxml2<br />
&gt;&gt;&gt; doc = libxml2.parseFile("/home/pip/allposts.xml")<br />
&gt;&gt;&gt; ctxt = doc.xpathNewContext()<br />
&gt;&gt;&gt; ctxt.xpathRegisterNs("a", "http://www.w3.org/2005/Atom")<br />
&gt;&gt;&gt; results = ctxt.xpathEval("//a:feed")</p>
<p>You can reuse the XPath context object for other XPath queries on the same document.</p>
<p>There are two subtle things to note. First, prefix:localname in XPath matches an element with that local name in the namespace referred to by that prefix, but a name without a prefix in an XPath expression always means that name in &#8220;the namespace you have when you don&#8217;t have a namespace&#8221; (or &#8220;the null namespace&#8221; as Daniel Veillard less whimsically describes it in the email Aristotle Pagaltzis quotes in the blog post you refer to). Like Veillard says, XPath just doesn&#8217;t have the &#8220;default namespace&#8221; concept like XML itself does.</p>
<p>It doesn&#8217;t help that the Namespaces in XML specification doesn&#8217;t define a practical term for &#8220;the null namespace&#8221;; it uses cumbersome language like &#8220;the namespace name has no value&#8221; (see the definition of &#8220;expanded name&#8221;, or section 6.2 (Namespace Defaulting) for example).</p>
<p>Incidentally, though this characteristic of XPath is very inconvenient for element names, *attribute* names with no prefix in XML are also in the null namespace, so XPath&#8217;s behaviour is obviously a much better fit for matching attribute names.</p>
<p>The other issue is that XPath implementations basically never use the document&#8217;s namespace prefix bindings (quite reasonably so, for two reasons: those bindings can differ on every element in the document; and, more commonly, different documents can and do use different prefixes, and you basically never want to discriminate between documents on the basis of the prefix).</p>
<p>This means that option 1 won&#8217;t work (because the lack of prefix in the source document isn&#8217;t the problem), option 2 won&#8217;t be necessary, and option 3 won&#8217;t be a problem if there turns out to be a next time after all.</p>
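<p>A minimal sketch of the two points above, assuming nothing about libxml2 itself: it swaps in Python&#8217;s standard-library xml.etree.ElementTree so it runs without extra packages. The two documents, the prefix &#8220;a&#8221; and the helper count_titles are hypothetical and only there for illustration. It shows that an unprefixed name in a path matches nothing in a namespaced document, while a prefix bound at query time matches whatever prefix (or default namespace) the document happened to use.</p>

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Two hypothetical documents: same namespace, different serializations.
DEFAULT_NS_DOC = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'
PREFIXED_DOC = '<x:feed xmlns:x="http://www.w3.org/2005/Atom"><x:title>t</x:title></x:feed>'

# The prefix "a" is bound here, at query time; neither document uses it.
QUERY_NS = {"a": ATOM_NS}


def count_titles(xml_text, path, namespaces=None):
    """Parse xml_text and count the children of the root matched by path."""
    root = ET.fromstring(xml_text)
    return len(root.findall(path, namespaces))


for doc in (DEFAULT_NS_DOC, PREFIXED_DOC):
    unprefixed = count_titles(doc, "title")            # name in no namespace
    prefixed = count_titles(doc, "a:title", QUERY_NS)  # name in the Atom namespace
    print(unprefixed, prefixed)
```

<p>Both documents give the same counts, which is the sense in which the prefix in the source document does not matter to the query.</p>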
]]></content:encoded>
	</item>
	<item>
		<title>By: Sam Ruby</title>
		<link>http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2/comment-page-1#comment-1173</link>
		<dc:creator>Sam Ruby</dc:creator>
		<pubDate>Mon, 26 Nov 2007 01:45:39 +0000</pubDate>
		<guid isPermaLink="false">http://philwilson.org/blog/2007/11/parsing-atom-with-libxml2#comment-1173</guid>
		<description>Default namespaces are a serialization artifact.  Once read into memory, whether the namespace was a default, or even what prefix was used, doesn&#039;t much matter.  So, what you need to do is register a prefix for you to use at runtime, and use it.

xp = doc.xpathNewContext()

xp.xpathRegisterNs(&quot;atom&quot;, &quot;http://www.w3.org/2005/Atom&quot;)

results = xp.xpathEval(&quot;/atom:feed&quot;)

Note: the above works even if somebody uses the default prefix, or a prefix of atom or even a prefix of a.  Also note that it is faster not to use // if you know the path.

A more complete example:

http://www.intertwingly.net/code/venus/filters/mememe.plugin</description>
		<content:encoded><![CDATA[<p>Default namespaces are a serialization artifact.  Once read into memory, whether the namespace was a default, or even what prefix was used, doesn&#8217;t much matter.  So, what you need to do is register a prefix for you to use at runtime, and use it.</p>
<p>xp = doc.xpathNewContext()</p>
<p>xp.xpathRegisterNs("atom", "http://www.w3.org/2005/Atom")</p>
<p>results = xp.xpathEval("/atom:feed")</p>
<p>Note: the above works even if somebody uses the default prefix, or a prefix of atom or even a prefix of a.  Also note that it is faster not to use // if you know the path.</p>
<p>A more complete example:</p>
<p><a href="http://www.intertwingly.net/code/venus/filters/mememe.plugin" rel="nofollow">http://www.intertwingly.net/code/venus/filters/mememe.plugin</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>

