Archive for July, 2005

Jul 31 2005

Microformat Uses?

Published by Ian Davis under Uncategorized

It’s fun to see all these microformats being specified. I’m looking for some examples of where they’re being consumed – event aggregators, social networks, licensed works search engines? Got any good ones?

One response so far

Jul 30 2005

Fit For Purpose

Published by Ian Davis under Uncategorized

A cool idea from Dan Brickley, in a must-read post for any RDF practitioner:

The cultural shift we need, and the toolset to accompany it, is a shift to application-oriented validation. Instead of an absolute, universal ‘yes’ or ‘no’ for some RDF document, we need a more nuanced approach. An RDF document might be well-suited for use in a photo metadata application, but missing some data that is needed for an addressbook.

I think Dan is onto something really interesting here. He’s contrasting the XML view of a document suitable for a single purpose with the RDF view of documents suitable for many purposes, whether alone or in combination. The traditional way has been for each application to define its own format, forcing the author to duplicate information across multiple incompatible formats. This isn’t the RDF way: the RDF data model allows a document to contain data suitable for multiple applications without duplication. Dan’s suggesting that rather than checking for conformance to a specific schema, applications could query the RDF file presented to determine whether it’s fit for purpose. He points to Sparql as the tool for the job, perhaps with an online catalogue.

Rather than a catalogue, why not a simple RDF vocabulary that allows people to associate queries with classes of applications?
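
To make that concrete, here is a minimal sketch of application-oriented validation. It is in Python rather than Sparql so that it runs without an RDF library, and it stands in a set of required properties for each query. The application names, the property lists and the sample triples are all invented for illustration; a real version would associate Sparql queries with each class of application instead.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
DOCUMENT = {
    ("ex:photo1", "dc:creator", "ex:alice"),
    ("ex:photo1", "dc:date", "2005-07-30"),
    ("ex:photo1", "foaf:depicts", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
}

# Hypothetical stand-ins for queries, keyed by class of application: each
# lists the properties that at least one resource must carry in full.
REQUIREMENTS = {
    "photo-metadata": {"dc:creator", "dc:date", "foaf:depicts"},
    "addressbook": {"foaf:name", "foaf:mbox"},
}


def fit_for(triples, required):
    """Return True if some subject has every required property."""
    props_by_subject = {}
    for s, p, _ in triples:
        props_by_subject.setdefault(s, set()).add(p)
    return any(required <= props for props in props_by_subject.values())


def report(triples):
    """Map each application class to a yes/no fitness answer."""
    return {app: fit_for(triples, req) for app, req in REQUIREMENTS.items()}


if __name__ == "__main__":
    for app, ok in report(DOCUMENT).items():
        print(app, "->", "fit" if ok else "not fit")
```

The same document gets a different answer for each application, which is the point: the sample data is fit for the photo metadata case but lacks the mailbox an addressbook would need.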

2 responses so far

Jul 30 2005

FRBR and RDF

Published by Ian Davis under Uncategorized

Yesterday I published an RDF schema for FRBR. It is a collaboration with Richard Newman, based on work he started last year, and it incorporates ideas from Leigh Dodds and Bruce D’Arcus.

Now you might be thinking that we’ve just written a schema describing those cute furry creatures that whistle and toot when you stroke them. Sadly no. FRBR stands for Functional Requirements for Bibliographic Records and is a report issued by the International Federation of Library Associations and Institutions. It describes a high level conceptual model of creative works and how they are represented in the real world. Bet you wish we had done the furry alien thing.

In the library and cataloguing world, FRBR is a big deal. It standardises a set of terms and relationships that are essential to any cataloguer. The central concept is that of the Work, which is a distinct intellectual or artistic creation. A Work is an abstract notion and is completely intangible. You can talk about a Work but you can’t physically touch it. When someone conceives of a new Work they typically try to express its ideas in some way. A composer might have the notion of a musical composition in mind, and they could express it by writing notes on a stave, humming into a microphone or playing it on a piano. These are Expressions, another core concept in FRBR. A Work is realized through Expressions. An Expression comprises the specific words, symbols or notes that are used to express the Work. Sometimes the same Work is expressed in different words: think of all the different variations of the story of Rumpelstiltskin that you might have read. These are all different Expressions of the same Work. An Expression is usually published and may have several different but related editions. These are called Manifestations. Whenever a particular Expression of a Work is reprinted or reissued without materially changing its content, it becomes a new Manifestation. The final core concept of FRBR is that of the Item. An Item is an actual physical copy of a particular Manifestation. That Shrek DVD on my shelf is a different physical Item to the one on your shelf, but they are both examples of a specific Manifestation – they look identical when we compare them but they are physically distinct.

That’s not all there is to FRBR. It also describes people, places, objects and events, but these play a supporting role to the central concepts of Work, Expression, Manifestation and Item. The FRBR report is freely available and quite readable. Our RDF expression of that work is less readable at the moment – it has very little prose and a lot of raw data – but we’re working on it and we’d love to have feedback on all aspects of it.

Here are some examples that I’ve come up with based on my understanding of FRBR that might help:

Lord of the Rings is a Work conceived by J.R.R. Tolkien. The primary Expression of that Work is the book that he wrote. This had several Manifestations – the original publication, a single volume edition, multiple reprints. I have a tattered copy in my book collection – this is one Item. A different Expression of the same Work is the screenplay co-written by Peter Jackson. This had several Manifestations – scripts for the actors, production instructions etc. The films themselves were further Expressions of the same Work. The theatre version and the extended versions were different Expressions with their own Manifestations onto VHS and DVD. I have several of these Items on my shelf.
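
For the programmers, the same example can be sketched as a small containment hierarchy. This is only an illustration in Python of how the four core concepts nest; the class and field names are my own and are not the terms used in the RDF schema, and which edition the tattered copy belongs to is an assumption, since I didn’t say above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One physical copy of a Manifestation."""
    owner: str


@dataclass
class Manifestation:
    """A published edition or release of an Expression."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """The specific words, images or notes that realize a Work."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual or artistic creation."""
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self):
        """Walk Work -> Expression -> Manifestation -> Item."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


lotr = Work(
    title="Lord of the Rings",
    creator="J.R.R. Tolkien",
    expressions=[
        Expression("the book", [
            Manifestation("original publication"),
            Manifestation("single volume edition"),
            # Assumed: the post does not say which edition the copy is.
            Manifestation("a reprint", [Item(owner="Ian")]),
        ]),
        Expression("the screenplay", [Manifestation("scripts for the actors")]),
        Expression("the extended film", [Manifestation("DVD", [Item(owner="Ian")])]),
    ],
)

print(len(lotr.expressions), "expressions,", len(lotr.all_items()), "items")
```

Notice that an Item is only ever reached through a Manifestation of an Expression; nothing physical hangs directly off the Work.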

This one I’m less sure about but I think it could be significant. In the Web Architecture what we call a Resource, FRBR would call a Work. Each Representation of that Resource is an Expression of that Work. In other words the HTML and XML versions of a particular page are different Expressions of that page. When a Web browser requests a particular Expression it gets a snapshot of it at a point in time, and that snapshot is a Manifestation. The Web Architecture doesn’t name this explicitly but it is implicit in some of the HTTP negotiation that goes on around character sets and ranges. The actual bytes that are transmitted and end up on my hard disk are the Item relating to this Manifestation.

This is why I’m pretty excited to have the opportunity to work on something like FRBR. I think it’s going to be a core referent for many other schemas and will enable a base level of common vocabulary between disparate systems. I want to see MusicBrainz, AudioScrobbler, IMDB, Creative Commons, Amazon and so many others using it to describe their catalogues and metadata in a freely interchangeable fashion!

Finally, if you managed to get through all that, treat yourself to some pictures of cute furry aliens. All together now, ahhhhh.

4 responses so far

Jul 28 2005

Querying Collections With Sparql

Published by Ian Davis under Uncategorized

Chatlog of danbri working out how to query RDF collections in Sparql:

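# Prefix declarations for :, foaf:, gml: and wikipedia: are omitted in the chatlog.
SELECT ?feature ?a ?b WHERE
{
  ?x :name ?feature .
  ?x foaf:isPrimaryTopicOf wikipedia:Clifton_Suspension_Bridge .
  # The ( ... ) form matches an RDF collection with exactly two members.
  ?x :centerLine ( [gml:pos ?a] [gml:pos ?b] ) .
}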
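
The ( ... ) in that query is shorthand for an RDF collection, which underneath is a linked list of rdf:first and rdf:rest nodes ending in rdf:nil. Here is a rough Python sketch of that expansion, using plain tuples rather than a real RDF library; the function name and the blank node labels are made up for illustration.

```python
from itertools import count


def expand_collection(members, labels=None):
    """Expand a list of members into rdf:first/rdf:rest triples.

    Returns (head, triples), where head is the node that stands for
    the whole list. An empty list is simply rdf:nil.
    """
    labels = labels or (f"_:list{n}" for n in count(1))
    if not members:
        return "rdf:nil", []
    # One list node per member, chained together by rdf:rest.
    nodes = [next(labels) for _ in members]
    triples = []
    for i, (node, member) in enumerate(zip(nodes, members)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((node, "rdf:first", member))
        triples.append((node, "rdf:rest", rest))
    return nodes[0], triples


# The two-member collection from the query, with placeholder member nodes.
head, triples = expand_collection(["_:posA", "_:posB"])
for triple in triples:
    print(triple)
```

A two-member collection therefore costs four triples, which is why the bracketed shorthand is so much easier to write than the patterns it stands for.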


Jul 28 2005

Web Principles

Published by Ian Davis under Uncategorized

Words close to my heart from lesscode.org’s Motherhood and Apple Pie:

The good news in all this is that there is a resurgence of interest in the web’s basic principles that is somewhat oriented toward the business community. I believe this renewal of interest to be the result of increased communication through weblogs and other web-friendly collaboration tools combined with massive adoption of free and open source development methods. The bad news is that a huge portion of the software industry isn’t involved and are in many ways blocking progress using techniques that are hard to describe as anything other than dishonest.


Jul 12 2005

Upgrading

Published by Ian Davis under Uncategorized

Pardon the dust while I upgrade to WordPress 1.5.1 – it may take some time :)


Jul 08 2005

RSS Issues Wiki

Published by Ian Davis under Uncategorized

Once again I’ve cleaned up the RSS Issues Wiki. It was completely clobbered by wikispam so now I’ve been forced to add an edit password. Just email me and I’ll send it to you. I had to reconstruct each page individually and I hope I got it all right; please let me know if not. To get a sense of the amount of spam that is being added to wikis around the world, consider that the spammed version of this wiki, including change history, was over 300MB while the cleaned-up version is 65KB!

P.S. I made a terrible mistake with this wiki that I’ll not repeat: I started it without getting a critical mass of interested parties first. I wrote about the survival of the wiki on the Silkworm blog a few days ago.


Jul 04 2005

Talis, Web 2.0 and All That

Published by Ian Davis under Uncategorized

I’ve recently started working for a company called Talis as the technical lead for the research group. Talis is a pretty mature company. It was established as a co-operative providing library systems in 1969, so it’s a year older than me and the same age as the Internet. It embraced the web with the first web-based OPAC in the mid nineties. A few years later it changed its structure from a co-operative to a private company and more recently it’s been undergoing an internal revolution, driven by the new CEO Dave Errington and his vision for Talis 2.0 as a modern software company. Being somewhat of an outsider looking in I can see that the changes have been significant and painful but the results are showing, and they’re exciting. Dave’s building a new company focussed on innovation, using the existing business almost as an incubator. As the technical lead for the research group I’m responsible for those incubated projects. Number one on the list is the Silkworm platform. Although the name’s a working title, and will almost certainly change, it’s very suggestive of threads being woven to invoke a transformation. Silkworm is going to be a platform built on Web 2.0 principles: participation, openness and communication. You can read more about it in the white paper (PDF), but the best way to follow development is on the blog.

Web 2.0 is a cunning moniker. As Danny pointed out, it’s pretty hard to find a concrete definition of what it actually is. Some think it’s about AJAX or cool applications such as Flickr and Google Maps. Others believe that it’s about web services and that finally all those specs are going to be used.

Here’s my take on it: Web 2.0 is an attitude not a technology. It’s about enabling and encouraging participation through open applications and services. By open I mean technically open with appropriate APIs but also, more importantly, socially open, with rights granted to use the content in new and exciting contexts. Of course the web has always been about participation, and would be nothing without it. Its single greatest achievement, the networked hyperlink, encouraged participation from the start. Somehow, through the late nineties, the web lost contact with its roots and selfish interests took hold. This is why I think the Web 2.0 label is cunning: semantically it links us back to that original web and the ideals it championed, but at the same time it implies regeneration with a new version. Technology has moved on and it’s important that the social face of the web keeps pace.

Web 2.0 isn’t the Semantic Web. Some might say it’s the semantic web (lower case) or that it’s a stepping stone to the Semantic Web. I don’t hold either of those views. I believe that the Semantic Web is actually a part of Web 2.0, which is to say not only that Web 2.0 is more important than the Semantic Web but that Web 2.0 requires the Semantic Web. For Web 2.0 to function as a social enabler it requires remixable data available via accessible APIs. XML is hailed as the lingua franca of web applications but, as I’ve written before, XML isn’t enough and I think the RDF model is necessary to provide readily remixable data. Just think smushing. I’m in the minority though: most archetypical Web 2.0 applications are producing XML in incompatible dialects. I hope to demonstrate that there is real value buried behind RDF/XML, and with the current activity around Sparql as a query language it’s going to be easier than ever to access all that data in a uniform manner.
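
For anyone who hasn’t met smushing: it means merging descriptions of the same thing from different sources by spotting a shared value for an inverse functional property, such as foaf:mbox. A minimal Python sketch of the idea follows, again with plain tuples instead of an RDF library; the function name and the sample data are invented for illustration.

```python
def smush(triples, ifps):
    """Merge nodes that share a value for an inverse functional property."""
    canonical = {}  # maps a merged-away node to the node that replaced it

    def find(node):
        # Follow the chain of merges to the surviving node.
        while node in canonical:
            node = canonical[node]
        return node

    seen = {}  # (property, value) -> first node seen with that pair
    for s, p, o in triples:
        if p in ifps:
            key = (p, o)
            root = find(s)
            if key in seen:
                other = find(seen[key])
                if other != root:
                    canonical[root] = other
            else:
                seen[key] = root
    # Rewrite every triple in terms of the surviving nodes.
    return {(find(s), p, find(o)) for s, p, o in triples}


# Two sources describe the same person using different blank nodes.
triples = [
    ("_:a", "foaf:mbox", "mailto:dan@example.org"),
    ("_:a", "foaf:name", "Dan"),
    ("_:b", "foaf:mbox", "mailto:dan@example.org"),
    ("_:b", "foaf:weblog", "http://example.org/blog"),
]

merged = smush(triples, {"foaf:mbox"})
for triple in sorted(merged):
    print(triple)
```

After the merge the name and the weblog hang off the same node, with no agreement needed in advance between the two sources beyond using the same property.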

So, I’m pretty excited to be at Talis right now. We’re working on a new platform building on core web technologies. We’re going to dogfood it, and plan to be running core Talis applications on the platform as soon as possible. But better still the goal is to make this platform open and available to others on the web both as users and developers as we work towards the architecture of participation.
