Pretty Print XML in Python

February 3, 2011

The xmlpp library can be used to pretty-print XML. Unfortunately, it still puts text nodes on their own lines:

<tag>
  <tag>
    Text
  </tag>
</tag>

when what we really want is:

<tag>
  <tag>Text</tag>
</tag>

Older versions of xmlpp also didn't support namespace prefixes; the March 2010 version does.

It's pure Python, so download it from http://xmlpp.codeplex.com/ and then include it in your source.

Import it, then run:

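import re
try:
    from StringIO import StringIO  # Python 2, as in the original snippet
except ImportError:
    from io import StringIO  # Python 3

import xmlpp

def pretty_xml(lXml):
    # xmlpp writes to a file-like object, so collect its output in a buffer.
    lStringIO = StringIO()
    xmlpp.pprint(lXml, output=lStringIO, indent=2, width=160)
    lUglyXml = lStringIO.getvalue()
    # Match a text node sitting on its own line between two tags.
    text_re = re.compile(r'>\n\s+([^<>\s].*?)\n\s+</', re.DOTALL)
    # Pull the text back onto the same line as its tags.
    return text_re.sub(r'>\g<1></', lUglyXml)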

This outputs nicely indented XML, with each text node kept on the same line as its enclosing tags, as we expect.
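
To see what the regular expression does on its own, here is a minimal sketch that applies it to a hand-written string standing in for xmlpp's output. The sample string and the helper name collapse_text_nodes are assumptions made for illustration; xmlpp itself is not needed to run it.

```python
import re

# Same pattern as above: a '>' followed by a newline and indentation, then text
# that does not start with '<', '>' or whitespace, then a newline, indentation
# and a closing tag.
TEXT_RE = re.compile(r'>\n\s+([^<>\s].*?)\n\s+</', re.DOTALL)


def collapse_text_nodes(ugly_xml):
    """Pull text nodes back onto the same line as their enclosing tags."""
    return TEXT_RE.sub(r'>\g<1></', ugly_xml)


# Hand-written stand-in for the kind of output shown at the top of this post.
UGLY_XML = "<tag>\n  <tag>\n    Text\n  </tag>\n</tag>\n"

print(collapse_text_nodes(UGLY_XML))
```

The outer <tag> is left alone because the first thing after its newline and indentation is another '<', which the character class [^<>\s] rejects. Only the inner element, whose content is plain text, gets collapsed.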
