Whilst trying to parse some Atom (my Blogger backup) with libxml2, I appear to have run into the same problem that Aristotle hit two years ago in “XPath vs the default namespace: easy things should be easy”. To wit: “The story is that you can’t match on the default namespace in XPath.”
>>> import libxml2
>>> doc = libxml2.parseFile("/home/pip/allposts.xml")
>>> results = doc.xpathEval("//feed")
>>> len(results)
0
Unbelievable.
Immediate potential solutions:
- XSLT my Atom document to add “atom:” to all my default-namespaced elements
- use an entirely different method of parsing
- remove the atom namespace declaration from the top of the file
- something else
The third option (removing the namespace declaration) looks like the only sane route to take in this one-off job, but I’m quite surprised that I have to do it at all.
Actually, this turned out to be my fault – I was parsing two documents at the same time, one with a namespace declaration set correctly (for parsing my Atom file), and one with no namespaces set. I used the latter for my XPath query, which clearly didn’t work. Many thanks to everyone who left a comment!
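For anyone who lands here with the same symptom, here is a minimal sketch of why an unprefixed query comes back empty and how binding a prefix fixes it. It uses the standard library's ElementTree rather than libxml2 so that it runs anywhere, and the tiny inline feed and the names ATOM_SAMPLE and ATOM_NS are made up for illustration; they are not my actual backup. With the libxml2 bindings, the equivalent step should be to create a context with doc.xpathNewContext(), call xpathRegisterNs("atom", "http://www.w3.org/2005/Atom") on it, and then evaluate "//atom:feed" on that context.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an Atom feed; the real backup is not reproduced here.
ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""

# The prefix "atom" is arbitrary; only the namespace URI has to match the document.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

root = ET.fromstring(ATOM_SAMPLE)

# An unprefixed name only matches elements in no namespace, so this finds nothing.
unqualified = root.findall(".//entry")

# Binding a prefix to the Atom namespace URI makes the same query match.
qualified = root.findall(".//atom:entry", ATOM_NS)

print(len(unqualified), len(qualified))
```

The point is that the prefix in the query does not need to appear anywhere in the document: the match is made on the namespace URI, so a default-namespaced feed can be queried with whatever prefix you choose to register.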