philwilson.org

Microformats and aggregation

19 June, 2005

I always hit a mental wall when I think about microformats.

I’ve just been re-reading Charlene Li’s “Structured blogging – an introduction and the implications” and Alf Eaton’s “About reviews and microformats”, and I just can’t get over one thing. As useful as it would be for specialised search engines if everyone used some kind of marvellous microformat to mark up the content of their reviews etc., it’s just not good enough for aggregation. I don’t want to have to parse your content to see if it’s got a review stuck in the middle somewhere. I want all the review metadata in separate elements in the feed itself, preferably using the RVW namespace, so that I can aggregate and store data sensibly and separately from, but related to, the main content.
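To make the difference concrete, here is a rough Python sketch of the two approaches from an aggregator's point of view. It is only an illustration under assumptions: the namespace URI is a placeholder and not the real RVW one, the `rvw:rating` element name and the sample item are made up for the example, and the embedded markup is a stripped-down hReview-style fragment rather than a complete one.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Placeholder namespace URI (assumption): stands in for the real RVW namespace.
RVW = "http://example.org/rvw-placeholder"

# Hypothetical feed item carrying the rating twice: once buried in the
# escaped HTML content, once as a separate feed-level element.
FEED_ITEM = f"""
<item xmlns:rvw="{RVW}">
  <title>Review: Some Album</title>
  <description>&lt;div class="hreview"&gt;Great record.
    &lt;span class="rating"&gt;4&lt;/span&gt;&lt;/div&gt;</description>
  <rvw:rating>4</rvw:rating>
</item>
"""


def rating_from_feed_elements(item):
    """Read the rating from a dedicated element that sits next to the content."""
    node = item.find(f"{{{RVW}}}rating")
    return None if node is None else node.text.strip()


class RatingFinder(HTMLParser):
    """Scan HTML content for the first text inside an element with class 'rating'."""

    def __init__(self):
        super().__init__()
        self.in_rating = False
        self.rating = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "rating" in classes:
            self.in_rating = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the rating span.
        self.in_rating = False

    def handle_data(self, data):
        if self.in_rating and self.rating is None and data.strip():
            self.rating = data.strip()


def rating_from_content(item):
    """Dig the rating out of the item's HTML content instead."""
    finder = RatingFinder()
    finder.feed(item.findtext("description") or "")
    return finder.rating


item = ET.fromstring(FEED_ITEM)
print(rating_from_feed_elements(item))
print(rating_from_content(item))
```

Both functions find the same value here, but the first is a single lookup on the feed item, while the second has to parse every item's content on the off-chance that a review is in there somewhere.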

The question Alf asks in his post is really good (I’ve only really just got around to reading it properly): is XHTML a good enough storage system?

For storage of the marked-up data as-is, alongside the content, it’s probably fine, but as soon as you want to do something useful or practical with that data, it’s not.


Comments

ryan king
19 June, 2005 at 22:36

You say: “so that I can aggregate and store data sensibly and separately from, but related to, the main content.”

But the review is the content. There’s no need to separate parts of it from the others.

Pip
19 June, 2005 at 22:57

I knew someone would say that 😉

A review is not a single entity, it is made up of all those parts that hReview and RVW take into account. From the perspective of syndication (which is where I was coming from), the ‘content’ is the main review text.

Things like the rating, review type and so on should be stored separately from the review body, not in the same field.