Meta Tags: The Poor Man’s RDF?
I’ve always thought that what makes del.icio.us so successful despite a lot of recent competition is that it exemplifies the same kind of thinking Tim Berners-Lee described in his “Axioms of Web Architecture” as the “Principle of Least Power,” or what the designers of the Internet called “end-to-end architecture.” Instead of trying to build a powerful, heavyweight system that would anticipate a user’s every need, Joshua and company have been building an extremely simple, general service that can be endlessly adapted and easily plugged into other systems.
The variety of “meta” tagging schemes del.icio.us users have evolved is great evidence of this. Probably the best known example, the “for:” tag prefix, has become common enough that Joshua and company have given it formal recognition within the system, but there are lots of other schemes in use. I’m a frequent user of “cite:” (to attribute links I get from other people), for example, and now that I’ve been using the new del.icio.us media support to run an ad-hoc podcast, I’ve been exploring the use of “meta” tags for music metadata.
It was this last usage that led me to an epiphany. As I started tagging songs I posted with metadata like “artist:thenational” and “soundslike:afghanwhigs,” I realized that I was, in effect, creating triples and using tags as a sort of poor man’s RDF.
For those who aren’t familiar with RDF (which is probably almost everyone, since RDF is rarely explained well), it might be helpful to pause for a moment to explain the concept of triples. The first thing you need to understand is that RDF is an attempt to standardize the way we represent the relationships between data, just as XML standardizes the way we represent that data’s structure. Information in RDF is expressed as a series of statements, called triples, that each have a subject, a predicate (think of it as the verb), and an object. For example: The National [subject] sounds like [predicate] the Afghan Whigs [object].
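To make the tag-to-triple idea concrete, here is a minimal sketch in Python. It assumes a “meta” tag is split at its first colon into a property and a value, and that the subject of the resulting triple is the tagged post itself (so, strictly, the statement is “this post has artist thenational”). The names `Triple` and `tag_to_triple` and the example URL are made up for illustration; nothing here is part of del.icio.us.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def tag_to_triple(subject: str, tag: str) -> Optional[Triple]:
    """Split a "meta" tag such as 'artist:thenational' at its first colon."""
    predicate, sep, obj = tag.partition(":")
    if not sep or not predicate or not obj:
        # A plain tag like 'music' names no explicit property, so skip it.
        return None
    return Triple(subject, predicate, obj)


# Hypothetical post: the URL and the tag list are illustrative only.
post_url = "http://example.com/song.mp3"
tags = ["music", "artist:thenational", "soundslike:afghanwhigs"]

triples = [
    triple
    for triple in (tag_to_triple(post_url, tag) for tag in tags)
    if triple is not None
]

for triple in triples:
    print(triple)
```

Splitting at the first colon only is a design choice: it leaves any later colons inside the value untouched.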
A collection of these statements is referred to as a “triple store,” which can be understood as a sort of ad-hoc database. While traditional relational databases require that new properties be formally added to the model either by adding new columns to an existing table or by adding additional tables, adding a new property in RDF is as simple as asserting in a triple that value X is a certain property of entity Y.
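As a rough illustration of that difference, the sketch below models a triple store as nothing more than a set of three-part tuples in Python. The `TripleStore` class and the “mood” property are invented for the example, and a real RDF engine does far more than this. The point is only that a brand-new property needs no schema change.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """A toy in-memory triple store: just a set of (subject, predicate, object)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return triples matching the given parts; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("thenational", "soundslike", "afghanwhigs")
# A property nobody planned for: no ALTER TABLE, just another statement.
store.add("thenational", "mood", "gloomy")  # illustrative value only

print(store.match(subject="thenational"))
print(store.match(predicate="mood"))
```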
So, then, a collection of del.icio.us links containing “meta” tags can be thought of as a triple store. “Interesting,” you may say, “but why should I care?” Which is a very reasonable question given RDF’s current track record of real-world usefulness.
Well, as it happens, my del.icio.us/triple store epiphany has given me an idea for a Cocoalicious feature (yes, I do plan to release a new version of Cocoalicious some day). It occurred to me that I could turn Cocoalicious into a sort of free-form database by treating “meta” tags as triples, compiling a list of the implicit properties contained in these triples, and allowing the user to add columns displaying the values of those properties for each post to the main table view (probably by way of a context menu like the one you get by right-clicking on the iTunes column headers). That way the user could implicitly add his or her own metadata to the Cocoalicious model, which could be used to sort and query the post list.
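Here is a minimal sketch of that idea, in Python purely for illustration (it is not Cocoalicious code). It assumes each post is just a title with a list of tags, collects every property name that appears in a “meta” tag as a candidate column, and keeps all values when a post repeats a property. The post titles, tags, and helper names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def properties_for(tags: List[str]) -> Dict[str, List[str]]:
    """Group the values of 'property:value' tags by property name."""
    props: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        name, sep, value = tag.partition(":")
        if sep and name and value:
            props[name].append(value)  # a repeated property keeps every value
    return dict(props)


# Hypothetical posts, keyed by title.
posts = {
    "Song A": ["music", "artist:thenational", "soundslike:afghanwhigs"],
    "Song B": ["music", "artist:afghanwhigs", "cite:bob", "cite:alice"],
}

rows = {title: properties_for(tags) for title, tags in posts.items()}

# The implicit properties: every name that could become a table column.
columns = sorted({name for props in rows.values() for name in props})


def cell(title: str, column: str) -> str:
    """Text to show in one table cell; multiple values are comma-joined."""
    return ", ".join(rows[title].get(column, []))


# Sorting the post list by a user-chosen column.
by_artist = sorted(rows, key=lambda title: cell(title, "artist"))

print(columns)
for title in by_artist:
    print(title, [cell(title, column) for column in columns])
```

Joining multiple values into a single cell is only one way to deal with a post that has two values for the same property; showing just the first value, or one row per value, would be equally defensible.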
I suppose this feature might be too esoteric to be useful to that many people, and it has some problems (for example, how do we handle the situation where a single post has two values for the same property?), but the Cocoalicious project is all about experimentation, so I think I’m going to implement it anyway. Anybody have any interesting thoughts on the subject?
August 7th, 2005 at 1:07 am
That is a fantastic idea.
This also lends itself to an approach I have been thinking of for representing RDF data in Lisp cons cells. RDF uses a triple as its model, but Lisp uses a pair. The most obvious way to use a pair for an RDF triple is to use two of them, with one pointing to the other. When I examined all the combinations for mapping the triple to a pair of cons cells on paper, the one you identified from del.icio.us tags jumped out at me.
I’m just amazed to see the connection with tags and RDF laid out, even though I thought it must be there somewhere. I think a regular tag is an RDF statement with a blank predicate; what do you think?
But the idea of how to integrate this with Cocoalicious UI takes the biscuit. Go do it immediately!
August 7th, 2005 at 10:41 am
That’s a good idea. I’d suggest going ahead and using an RDF engine. (I like Redland with Python bindings.) The del.icio.us feeds are RSS 1.0, so the del.icio.us/for/xyz feeds already have the RDF for the post, link, description, author and date. You just need to add the triple for:xyz.
As for “too esoteric” maybe “leading edge”. The notion of pure tags is just not expressive enough. There’s always a presumed triple something like isabout:python. “for:bob” is an ad hoc response to the need for more expressiveness. I wouldn’t be too surprised if the developers generalized and allowed posting any triple in the feeds. The obstacle of course is how to encourage the use of standard predicates, which some would reject as being unfolksonomic. But as an option why not?
(If you go for full RDF then Cocoalicious could serve as a specialized aggregator and produce RDF as well as being a client.)
February 7th, 2006 at 12:46 pm
In case you haven’t seen it, further down the spiral of metadata enriched tags:
http://geobloggers.blogspot.com/2006/01/advanced-tagging-and-tripletags.html