So I'm flipping through some old weblog entries, and I run across this one:
"So when is it a good idea to use XML for your data? The easy answer is that you should use XML when it is likely to be easier (in the long run) than creating your own parser. Using XML carries some cost. XML is verbose, and parsing is guaranteed to be slower than a custom parser. Why is XML such the rage then? Aside from the hype there is one very good reason to use XML. For many types of data, it is easier to just load the data into a DOM and extract the information from that, than it is to write a custom parser. That means less time spent debugging code, more time spent focusing on the problem at hand."
Gotta say, no way. XML isn't just a format designed to avoid having to write a parser; going down this path may seem like a good idea in the beginning, but eventually it's going to bite you in the ass in a big way--just ask James Duncan Davidson, the original creator of Ant, who later came to admit that:
- He originally chose to use XML as the format for Ant scripts because he didn't want to write a parser, and
- He really regrets it and apologizes to the Java community at large for it.
The problem with using XML, in the case of Ant--and its successors, like MSBuild--is that it is a strictly hierarchical format, and not everything fits a strict hierarchy (even though it might seem to at first). More importantly, XML is a hideously verbose format, and the "self-descriptive" tags that everybody blathers on about are only self-descriptive to carbon-based life forms (and then only if semantically rich terms are used for the tag names). For example, does this "self-descriptive" XML have any meaning to you?
<p><a>34</a><s>046604143</s><ph>42049941499</ph></p>
It obviously avoids the verbosity that frequently plagues XML, but clearly surrenders a lot of the self-descriptiveness as a result.
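To make the trade-off concrete, here's a minimal Python sketch that parses the terse snippet alongside a more descriptive version. The descriptive tag names (person, age, ssn, phone) are pure guesses made up for illustration; nothing in the terse markup tells you what the fields actually mean, which is rather the point.

```python
import xml.etree.ElementTree as ET

# The terse document from the text, with the closing </ph> tag written out.
TERSE = "<p><a>34</a><s>046604143</s><ph>42049941499</ph></p>"

# A hypothetical descriptive version. These tag names are assumptions,
# not a schema taken from the original snippet.
DESCRIPTIVE = (
    "<person><age>34</age><ssn>046604143</ssn>"
    "<phone>42049941499</phone></person>"
)


def to_pairs(xml_text):
    """Return (tag, text) pairs for each direct child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


terse_pairs = to_pairs(TERSE)
descriptive_pairs = to_pairs(DESCRIPTIVE)

# The parser pulls out identical values either way; only the names differ.
same_values = [v for _, v in terse_pairs] == [v for _, v in descriptive_pairs]

print(terse_pairs)
print(descriptive_pairs)
print("same values:", same_values)
print("characters:", len(TERSE), "vs", len(DESCRIPTIVE))
```

The parser is equally happy with both documents. The extra characters in the second one buy nothing for the machine; they only help the human reading it, and only if the names were chosen well.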
So when is it a good idea to use XML for your data? My criteria are a bit more stringent:
- When your data is naturally hierarchical to begin with
- When exchange with foreign platforms (which is to say, platforms not native to what you're currently authoring in) is important
- When pre-existing tool support (XSLT, XML viewers, import/export utilities, etc.) is of paramount importance
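As a rough illustration of the first criterion, here's a Python sketch of data that is not naturally hierarchical. Ant-style build targets name their prerequisites in a `depends` attribute, so the dependency graph lives in comma-separated strings rather than in the element nesting. The build file and its target names below are invented for the example, and the traversal assumes there are no dependency cycles.

```python
import xml.etree.ElementTree as ET

# A made-up, Ant-style build file. The targets are flat siblings; the real
# structure (who depends on whom) is hidden inside attribute strings.
BUILD = """
<project name="demo" default="dist">
  <target name="init"/>
  <target name="compile" depends="init"/>
  <target name="test" depends="compile"/>
  <target name="dist" depends="compile,test"/>
</project>
""".strip()


def dependency_graph(xml_text):
    """Map each target name to the list of targets named in its depends attribute."""
    root = ET.fromstring(xml_text)
    graph = {}
    for target in root.findall("target"):
        depends = target.get("depends", "")
        graph[target.get("name")] = [d.strip() for d in depends.split(",") if d.strip()]
    return graph


def build_order(graph, goal):
    """Depth-first walk that lists prerequisites before the targets needing them.

    Assumes the graph is acyclic and every dependency names a known target.
    """
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in graph[name]:
            visit(dep)
        order.append(name)

    visit(goal)
    return order


graph = dependency_graph(BUILD)
print(build_order(graph, "dist"))
```

The XML parser hands back a flat list of elements; everything interesting has to be recovered by splitting strings and walking a graph by hand, which is a decent sign the data wasn't a tree to begin with.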