Categorising Blogs

Ben Hammersley and Azeem Azhar are debating how to create a decentralised categorisation service for blogs, to support a “More Like This” feature…


The debate goes on – Ben responds to Azeem with:

It’s a nice idea, but I have an odd feeling about it. It just seems a bit un-emergent, not-very-decentralized, vendor-specific.

to which Azeem has this:

The idea he has is to use trackbacks between blogs. Without needing to know what the labels are, we can “create planes of meaning”.

One dramatic problem is this: there is no sensible relationship between categories. I may place a post in a category called “tech” and include within it news about PDAs, XML and cool MacOS software. Ben may TB back to a post of mine and place it in the very specific category of “FOAF news”. The plane of meaning is lost, or it has gone through a messy wormhole.

Azeem envisages:

[…] a public service categorisation engine. These would have a number of requirements:

  • creation of many, multi-faceted taxonomies–to ensure they are decentralised, vendor-independent […]
  • a way of looking at a blog posting (with its cues and clues) and spitting out a set of suggested categories

The first (taxonomies) is tough. I am not a taxonomy expert. That is the area of IAs. However, the design of standards-based taxonomies is well-understood and almost a mature business. There are a number of publicly available ones out there. Let’s assume we can find a decentralised way to create many taxonomies. (Big assumption).

The second element is the service itself. It is simply a remote services call. I send the blog post with whatever clues I can (such as posts I am tracking-back to or URLs in the post) and it figures out several candidate categories the post could be in. It could also figure out sites which might want to receive a track-back or a ping about that particular posting.
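To make the shape of such a call concrete, here is a minimal sketch of what the server side might do. Everything in it is an assumption for illustration: the tiny `TAXONOMY`, the `suggest_categories` function and the keyword-counting score are hypothetical stand-ins, not anything Azeem or Ben has specified. A real service would use a far richer analysis.

```python
import re
from collections import Counter

# Hypothetical toy taxonomy: category name -> keywords that hint at it.
TAXONOMY = {
    "Semantic Web": {"rdf", "foaf", "rss", "xml"},
    "Handhelds": {"pda", "palm", "handheld"},
    "Mac Software": {"macos", "mac", "cocoa"},
}


def tokens(text):
    """Split text (or a URL) into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def suggest_categories(post_text, clue_urls=(), taxonomy=TAXONOMY, limit=3):
    """Return up to `limit` candidate categories, best match first.

    The post body and any clue URLs (quoted links, trackback targets)
    are pooled into one bag of words; each category scores the number
    of times its keywords occur in that bag.
    """
    words = Counter(tokens(post_text))
    for url in clue_urls:
        words.update(tokens(url))

    scores = {}
    for category, keywords in taxonomy.items():
        score = sum(words[k] for k in keywords)
        if score:
            scores[category] = score

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:limit]]


suggestions = suggest_categories(
    "New FOAF and RDF tools for my PDA",
    clue_urls=["http://example.org/foaf/news"],
)
```

The author would then pick from the returned suggestions rather than inventing a label from scratch.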

I have a few more thoughts on this. In particular, how we can create a market for taxonomy definition allowing multiple taxonomies to emerge, what cues you could use to generate category (and “others who might like to see this” information) …


Ben Hammersley wrote a couple of days ago about Trackback, RDF, and the LazyWeb

Here’s the thing: I want to make a More Like This From Others button for each of the entries below. Clicking on it would bring a list of entries, formatted just like the blog, with excerpts of entries on a similar subject from other people. […] So here’s what I’d like. Movable Type blogs now automatically create trackbacks when they can. These trackbacks contain RDF, denoting the category the MT blog has that entry within. MT produces RDF indexes too (in the flavour of RSS 1.0). So, what I want is a little app that takes the trackback, follows it back to the originating site, finds the RDF snippet, takes the index.rdf, and gives back all the entries within the index.rdf that are on the same subject as the trackback one.
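The last step of that little app, picking the matching entries out of an index.rdf, could look roughly like the sketch below. It assumes the index is RSS 1.0 and that each item carries its category in a Dublin Core `dc:subject` element; the sample feed, the example.org URLs and the `entries_on_subject` function are made up for illustration, and fetching the trackback and the feed over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# Namespaces used by RSS 1.0 feeds with Dublin Core metadata.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented stand-in for a blog's index.rdf.
SAMPLE_INDEX = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/archives/001.html">
    <title>FOAF explained</title>
    <link>http://example.org/archives/001.html</link>
    <dc:subject>FOAF news</dc:subject>
  </item>
  <item rdf:about="http://example.org/archives/002.html">
    <title>A new PDA</title>
    <link>http://example.org/archives/002.html</link>
    <dc:subject>tech</dc:subject>
  </item>
  <item rdf:about="http://example.org/archives/003.html">
    <title>More FOAF tools</title>
    <link>http://example.org/archives/003.html</link>
    <dc:subject>FOAF news</dc:subject>
  </item>
</rdf:RDF>"""


def entries_on_subject(index_rdf, subject):
    """Return title/link pairs for items whose dc:subject matches `subject`.

    Matching is a case-insensitive exact comparison, so it only finds
    entries that the same blog filed under the same label.
    """
    root = ET.fromstring(index_rdf)
    wanted = subject.strip().lower()
    matches = []
    for item in root.findall("rss:item", NS):
        item_subject = item.findtext("dc:subject", default="", namespaces=NS)
        if item_subject.strip().lower() == wanted:
            matches.append({
                "title": item.findtext("rss:title", default="", namespaces=NS),
                "link": item.findtext("rss:link", default="", namespaces=NS),
            })
    return matches


related = entries_on_subject(SAMPLE_INDEX, "foaf news")
```

Because the comparison is on the raw label, this only works within one blog's own vocabulary, which is exactly the weakness the rest of the debate is about.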

Azeem Azhar responds:

There are two problems to overcome:

  • People are notoriously bad at categorising things accurately or consistently
  • Predefined controlled vocabularies ensure consistency but laziness or uncertainty by authors means these CVs are rarely used well, unless financial incentives are attached.


So what would I like? When I author a blog post, to be able to submit it to a categorisation server. This server would perform analysis on the content, analysis on my context (what it already knows about me), and analysis on the context of the blog post (what URLs am I quoting, what am I tracking back to, and analyses of those posts) to provide suggested categories which I can select.

The categories would need to come from an agreed set of taxonomies. DMoz might be one taxonomy but you would need very many more (geographical ones, directed, standardised efforts like the MeSH, or perhaps they could be created by examining and analysing all the categories contained in RSS feeds collected by the method Ben suggests.)
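That last suggestion, building a vocabulary out of the categories already present in collected feeds, can be sketched very simply. The `shared_categories` function, the `min_blogs` threshold and the sample data below are all hypothetical; the only idea taken from the text is that labels used independently by several blogs are candidates for a shared taxonomy.

```python
from collections import Counter


def shared_categories(feeds, min_blogs=2):
    """Return category labels used by at least `min_blogs` different blogs.

    `feeds` maps a blog identifier to the list of category labels found
    in its feed. Labels are normalised (trimmed, lowercased) and counted
    once per blog, so one prolific blog cannot dominate the result.
    """
    counts = Counter()
    for labels in feeds.values():
        counts.update({label.strip().lower() for label in labels})
    return sorted(label for label, n in counts.items() if n >= min_blogs)


# Invented sample: category labels harvested from three blogs' feeds.
collected = {
    "blog-a": ["Tech", "FOAF news"],
    "blog-b": ["tech", "Politics"],
    "blog-c": ["Tech ", "politics"],
}
candidates = shared_categories(collected)
```

Note that “FOAF news” drops out here because only one blog uses it, which illustrates the mismatch between broad and very specific categories mentioned earlier.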

There are further useful ideas in the comments to Ben’s post, including contributions from Burningbird, Phil Ringnalda and Sam Ruby
