XML and RDF

As you probably already know, my PhD topic is related to these two W3C technologies and how they interact with each other. Unfortunately, their relationship is often misunderstood. While it is true that they overlap to a certain extent, it is also true that they have distinct roles. In my humble opinion, several factors contribute to the confusion. The first one is that the XML specification is all about syntax. As Erik Wilde and Robert J. Glushko point out in a funny and very interesting article titled 'XML Fever', the XML Information Set (Infoset), which is XML's data model, is not widely known. It is difficult to understand XML technologies (such as XPath, XQuery and XSLT) if you don't go beyond the XML syntax. It would have been great if W3C had published (and promoted) the Infoset together with the XML syntax specification. Recognizing this duality, with a clearer separation between the data model (a tree) and the serialization (a character sequence with the characteristic angle brackets), would have been very valuable, from my point of view.
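To make the tree/serialization duality concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The two documents and the `shape` helper are made up for this illustration, and ElementTree's tree is only an approximation of the Infoset, not a full implementation of it. The point is that two different character sequences can carry the same tree.

```python
import xml.etree.ElementTree as ET

# Two different serializations (character sequences). They differ in
# attribute order, quote style and whitespace inside the tags.
doc_a = '<book lang="en" id="b1"><title>XML Fever</title></book>'
doc_b = "<book id='b1'   lang='en' ><title>XML Fever</title   ></book >"


def shape(element):
    """Reduce an element to the parts of the tree compared here (hypothetical helper)."""
    return (
        element.tag,
        sorted(element.attrib.items()),  # attribute order is not significant
        element.text or "",
        [shape(child) for child in element],
    )


tree_a = shape(ET.fromstring(doc_a))
tree_b = shape(ET.fromstring(doc_b))

print(doc_a == doc_b)    # False: the serializations differ
print(tree_a == tree_b)  # True: the parsed trees are the same
```

Tools such as XPath, XQuery and XSLT operate on the second kind of object, which is why reasoning about them in terms of angle brackets quickly becomes confusing.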

Another common pitfall is to assume that RDF is an XML application. With RDF, W3C clearly specified the data model and a normative serialization from the beginning. The only problem is that this serialization (RDF/XML) is based on XML, and consequently too many people assume they can access the RDF data model with XML tools. I'm convinced that an XML-based serialization of RDF is a cornerstone for the Semantic Web... however, I'm sad that it has led to so much confusion. One of the authors of the article I cited above recently asked for an RDF/XML parser written in XSLT, but I'm not sure whether such a thing is useful or even feasible.
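A small sketch of why XML tools are a poor fit for the RDF data model, again with Python's standard `xml.etree.ElementTree`. The resource `http://example.org/post` and the helper `titles_via_xml_path` are assumptions made up for this example. Both documents are RDF/XML and encode the same single triple: the first writes the property as a child element, the second uses the property-attribute abbreviation that RDF/XML allows for plain literals. A path query written against one form silently misses the other.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Same triple, property written as a child element.
rdf_a = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post">
    <dc:title>XML and RDF</dc:title>
  </rdf:Description>
</rdf:RDF>"""

# Same triple, property written as an attribute (RDF/XML abbreviation).
rdf_b = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post" dc:title="XML and RDF"/>
</rdf:RDF>"""


def titles_via_xml_path(document):
    """Look up dc:title with a tree path, as an XML-only tool would (hypothetical helper)."""
    root = ET.fromstring(document)
    return [el.text for el in root.findall("rdf:Description/dc:title", NS)]


print(titles_via_xml_path(rdf_a))  # ['XML and RDF']
print(titles_via_xml_path(rdf_b))  # []
```

An RDF parser would report the same triple for both inputs. The XML tree is different, so anything that works at the XML level has to handle every permitted variation of the syntax by itself.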

This week there was some discussion on a W3C mailing list about the relationship between the RDF and XML data models. I wish they were more similar to each other. For instance, I think that the RDF policy for identifiers (i.e., the use of URIs) is better than the QNames of XML+Namespaces. Is it possible to reformulate XML with a generalization of QNames to URIs? Are CURIEs an intermediate step?
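The difference between the two naming policies can be sketched in a few lines of Python. The helper `expand_curie` and the choice of the Dublin Core namespace are assumptions for illustration only. In XML+Namespaces a name is a pair (namespace name, local name), which ElementTree happens to write as `{namespace}local`. In RDF, and with CURIEs, the prefix mapping and the rest of the name are simply concatenated into one URI.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# XML+Namespaces: the QName dc:title resolves to a (namespace, local name) pair.
root = ET.fromstring('<dc:title xmlns:dc="%s">XML and RDF</dc:title>' % DC)
xml_name = root.tag  # ElementTree's notation for the pair


def expand_curie(curie, prefixes):
    """Expand prefix:reference by concatenation (hypothetical helper, no error handling)."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference


# RDF and CURIEs: the same abbreviation resolves to a single URI.
rdf_name = expand_curie("dc:title", {"dc": DC})

print(xml_name)  # {http://purl.org/dc/elements/1.1/}title
print(rdf_name)  # http://purl.org/dc/elements/1.1/title
```

The pair and the URI look almost identical here, but only the second one is an identifier that can be used as is anywhere a URI is expected.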

And what about the trendy JSON? Well, I've already discussed JSON in a previous entry.