Jun 13 2006

Towards Copy and Paste eRDF

Published by Ian Davis under Uncategorized

Some recent discussion on the W3C’s RDF-in-XHTML list has got me thinking about how to enhance Embedded RDF to enable copy-and-paste metadata. Being able to copy some markup from one document to another is a key requirement of the RDF-in-XHTML taskforce. One strong use case is for embedded Creative Commons licences. Here the Creative Commons people would like to be able to generate a snippet of XHTML to be copied into the user’s document without them having to change markup in multiple places. Currently eRDF requires at least three changes to the XHTML document to enable the embedding:

  1. Addition of profile attribute on head
  2. Schema prefix declarations in head
  3. Embedded markup in body

I think there’s a small change that could be made to the eRDF parsing rules to eliminate number two on that list. Currently schema prefixes are declared in link tags in the head of the document:

<link rel="schema.foaf" href="http://xmlns.com/foaf/0.1/" />

My idea is to relax the constraint that the schema declaration must be in a link tag and allow it to appear anywhere in the document for any element that has a rel attribute. So this would allow:

<a rel="schema.foaf" href="http://xmlns.com/foaf/0.1/">foaf</a>

The current specification already allows duplicate schema prefixes to be declared:

The order in which the schemas are declared is not significant. Where two identical schema prefixes are declared the first takes precedence. Any subsequent declarations are ignored.

It would require the prefix name “schema” to be reserved throughout the document but I don’t think that would cause any serious problems.
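
As a rough sketch of how the relaxed rule might be implemented, here is a small collector built on Python's standard html.parser. It accepts a schema declaration on any element that carries both rel and href, and keeps the first declaration of each prefix. The class and function names are my own inventions for illustration, and the example.org URL is a made-up duplicate; none of this is taken from the eRDF extractor itself.

```python
from html.parser import HTMLParser


class SchemaPrefixCollector(HTMLParser):
    """Collect rel="schema.<prefix>" declarations from any element."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}  # prefix -> namespace URI

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        for token in (attributes.get("rel") or "").split():
            if token.startswith("schema."):
                prefix = token[len("schema."):]
                # The first declaration takes precedence; later ones are ignored.
                self.prefixes.setdefault(prefix, href)


def collect_schema_prefixes(markup):
    """Return a dict of schema prefixes declared anywhere in the markup."""
    collector = SchemaPrefixCollector()
    collector.feed(markup)
    collector.close()
    return collector.prefixes


# Hypothetical input: one declaration on an anchor, a duplicate, and a link tag.
snippet = """
<div>
  <a rel="schema.cc" href="http://web.resource.org/cc/">Creative Commons</a>
  <a rel="schema.cc" href="http://example.org/ignored/">duplicate</a>
  <link rel="schema.foaf" href="http://xmlns.com/foaf/0.1/" />
</div>
"""
prefixes = collect_schema_prefixes(snippet)
print(prefixes)
```

Because the collector never looks at the tag name, link and a elements are treated identically, which is the whole of the proposed change.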

With this change it would be possible to write licence links like this:

<div>
  This work is licensed under a
  <a href="http://web.resource.org/cc/"
     rel="schema.cc">Creative Commons</a>
  <a rel="license cc-license"
     href="http://creativecommons.org/licenses/by/2.0/uk/">
     Attribution 2.0 England &amp; Wales License
  </a>
</div>

Thoughts and comments welcome as always.


Nov 14 2005

Classes In Embedded RDF

Published by Ian Davis under Uncategorized

I updated the embedded RDF extractor stylesheet to support the two constructs I wrote about a couple of days ago, namely using cite to generate subject URIs and hyphen prefixed class names to denote RDF classes.

I’ll update the main documentation shortly but in the meantime here’s some info on the embedded Class support.

For tags with an id attribute, and for anchors, any token in the class attribute that begins with a hyphen is considered to be an RDF class name. So the following XHTML:

<p id="ian" class="-foaf-Person">I am a person</p>

generates the following triple:

<#ian> rdf:type foaf:Person .
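
A minimal sketch of the token rule in Python, assuming that any hyphenated token is a prefixed name and skipping the check that the prefix has actually been declared in the document. The function names are hypothetical, not part of the extractor stylesheet.

```python
RDF_TYPE = "rdf:type"


def to_qname(token):
    """Turn 'foaf-Person' into 'foaf:Person' by splitting at the first hyphen."""
    prefix, _, local = token.partition("-")
    return f"{prefix}:{local}"


def split_class_tokens(class_value):
    """Return (types, properties) found in a class attribute value."""
    types, properties = [], []
    for token in class_value.split():
        if token.startswith("-"):
            # A leading hyphen marks an RDF class name.
            types.append(to_qname(token[1:]))
        elif "-" in token:
            # Assumption: every other hyphenated token is a property name.
            properties.append(to_qname(token))
        # Tokens without a hyphen are ordinary CSS classes and are skipped.
    return types, properties


types, properties = split_class_tokens("foaf-knows -foaf-Person highlight")
triples = [("<#ian>", RDF_TYPE, t) for t in split_class_tokens("-foaf-Person")[0]]
print(types, properties, triples)
```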

A more complex example:


  <p id="ian" class="-foaf-Person">
    <span class="foaf-name">Ian Davis</span> has a homepage
    <a href="http://purl.org/NET/iand" rel="foaf-homepage" class="-foaf-Document">here</a>
  </p>

Generates these triples:


<#ian> rdf:type foaf:Person .
<#ian> foaf:name "Ian Davis" .
<#ian> foaf:homepage <http://purl.org/NET/iand> .
<http://purl.org/NET/iand> rdf:type foaf:Document .

Or one with mixed properties and classes in the class attribute:


  <p id="ian" class="-foaf-Person">
   Ian is owed $1 by <span class="foaf-knows -foaf-Person" id="eric">Eric Miller</span>
  </p>

Which embeds the following triples:


<#ian> rdf:type foaf:Person .
<#ian> foaf:knows <#eric> .
<#eric> rdf:type foaf:Person .
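
To show how the nesting plays out, here is a sketch of an extractor that handles only the part of the rules this example exercises: elements with an id become subjects, hyphen-prefixed tokens become types, and a property token on an element with an id links the enclosing subject to it. It is written in Python against the standard html.parser rather than in XSLT, it ignores literals and anchors entirely, and it assumes well-formed XHTML in which every start tag has a matching end tag. The class name is hypothetical.

```python
from html.parser import HTMLParser


def qname(token):
    """Turn 'foaf-knows' into 'foaf:knows' by splitting at the first hyphen."""
    prefix, _, local = token.partition("-")
    return f"{prefix}:{local}"


class IdTripleExtractor(HTMLParser):
    """Emit triples for elements that carry an id attribute (sketch only)."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subjects = ["<>"]  # stack of subjects, one entry per open element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        parent = self.subjects[-1]
        element_id = attributes.get("id")
        subject = f"<#{element_id}>" if element_id else parent
        if element_id:
            for token in (attributes.get("class") or "").split():
                if token.startswith("-"):
                    self.triples.append((subject, "rdf:type", qname(token[1:])))
                elif "-" in token:
                    # The enclosing subject is related to this element.
                    self.triples.append((parent, qname(token), subject))
        self.subjects.append(subject)

    def handle_endtag(self, tag):
        if len(self.subjects) > 1:
            self.subjects.pop()


extractor = IdTripleExtractor()
extractor.feed(
    '<p id="ian" class="-foaf-Person">'
    'Ian is owed $1 by '
    '<span class="foaf-knows -foaf-Person" id="eric">Eric Miller</span>'
    '</p>'
)
extractor.close()
for triple in extractor.triples:
    print(" ".join(triple), ".")
```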

As a final test, I’ve augmented my homepage with classes, added a meta link to the embedded RDF extraction service and confirmed that Piggy Bank picks up the types. Works a treat.


Nov 09 2005

Embedded Trackbacks

Published by Ian Davis under Uncategorized

Given the cite extension I mentioned earlier today, it becomes possible to embed another well-known schema: trackbacks. The trackback RDF is notorious for being embedded in an HTML comment, which hides it from everything except screen scrapers. Here’s the trackback RDF from this posting:


<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	    xmlns:dc="http://purl.org/dc/elements/1.1/"
	    xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">

  <rdf:Description rdf:about="http://internetalchemy.org/2005/11/embedded-trackbacks"
    dc:identifier="http://internetalchemy.org/2005/11/embedded-trackbacks"
    dc:title="Embedded Trackbacks"
    trackback:ping="http://internetalchemy.org/2005/11/embedded-trackbacks.tb" />

</rdf:RDF>
-->
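
For comparison, this is roughly what a screen scraper has to do to get at the commented-out RDF. The sketch below uses only the Python standard library: it collects every HTML comment, tries to parse any that contain an rdf:RDF element, and pulls out the ping URL. The function and class names are hypothetical, and the page string is a cut-down stand-in for a real post.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TRACKBACK_NS = "http://madskills.com/public/xml/rss/module/trackback/"


class CommentCollector(HTMLParser):
    """Gather the text of every HTML comment in a page."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data)


def find_trackback_pings(page):
    """Map each described URL to its trackback ping URL."""
    collector = CommentCollector()
    collector.feed(page)
    collector.close()
    pings = {}
    for comment in collector.comments:
        if "<rdf:RDF" not in comment:
            continue
        try:
            root = ET.fromstring(comment.strip())
        except ET.ParseError:
            continue  # not well-formed XML, so skip it
        for description in root.iter(f"{{{RDF_NS}}}Description"):
            about = description.get(f"{{{RDF_NS}}}about")
            ping = description.get(f"{{{TRACKBACK_NS}}}ping")
            if about and ping:
                pings[about] = ping
    return pings


page = """<html><body>
<p>Post body</p>
<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <rdf:Description rdf:about="http://internetalchemy.org/2005/11/embedded-trackbacks"
    dc:title="Embedded Trackbacks"
    trackback:ping="http://internetalchemy.org/2005/11/embedded-trackbacks.tb" />
</rdf:RDF>
-->
</body></html>"""
pings = find_trackback_pings(page)
print(pings)
```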

Here’s one way of embedding that information in an XHTML document.


<html  xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://purl.org/NET/erdf/profile">
    <title>Embedded Trackbacks</title>
    <link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
    <link rel="schema.trackback" href="http://madskills.com/public/xml/rss/module/trackback/" />
  </head>
  <body>
    <blockquote cite="http://internetalchemy.org/2005/11/embedded-trackbacks">
      <h1 class="dc-title">Embedded Trackbacks</h1>
      <p>
         Link to this post at this permanent URL:
        <a href="http://internetalchemy.org/2005/11/embedded-trackbacks"
           class="dc-identifier">http://internetalchemy.org/2005/11/embedded-trackbacks</a>
        or ping it with a track back using this URL:
        <a href="http://internetalchemy.org/2005/11/embedded-trackbacks.tb"
           class="trackback-ping">http://internetalchemy.org/2005/11/embedded-trackbacks.tb</a>
      </p>
    </blockquote>
  </body>
</html>

I’ve worked on an enhanced embedded RDF extractor that understands the cite usage and here’s its output. Hopefully I should be able to release the cite-enabled version later tonight.


<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:ns211="http://madskills.com/public/xml/rss/module/trackback/">
  <rdf:Description rdf:about="">
    <admin:generatorAgent rdf:resource="http://purl.org/NET/erdf/extract"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://internetalchemy.org/2005/11/embedded-trackbacks">
    <dc:title>Embedded Trackbacks</dc:title>
    <dc:identifier>http://internetalchemy.org/2005/11/embedded-trackbacks</dc:identifier>

    <ns211:ping>http://internetalchemy.org/2005/11/embedded-trackbacks.tb</ns211:ping>
  </rdf:Description>
</rdf:RDF>


Nov 09 2005

Enhancing Embedded RDF

Published by Ian Davis under Uncategorized

I’ve been thinking more about embedded RDF, prompted by the many people at the conference who have been grilling me about it. There’s a real desire to get more of RDF embeddable and there are two key areas I’m going to focus on: (1) assertions about external URIs and (2) classes (at Eric Miller’s very vocal insistence!).

Issue one is easiest I think. The embedded RDF rules state that class attributes on tags inside anchor elements represent properties of the target of that anchor. Because you can’t nest anchor elements in HTML, this limits the embedded triples to having literal values.

As an illustration, the following HTML shows an embedded assertion that the Dublin Core title of http://example.com/physics is the literal “My physics page”:

<a href="http://example.com/physics"><span class="dc-title">My physics page</span></a>

It would be useful to be able to embed assertions with URI values, e.g. the FOAF topic of the page. For example:

<http://example.com/physics> foaf:topic <http://en.wikipedia.org/wiki/Physics> .

At the moment you’d have to nest anchors to achieve this in embedded RDF. A way around this would be to utilise the cite attribute. The HTML 4.01 specification says this about cite: “The value of this attribute is a URI that designates a source document or message. This attribute is intended to give information about the source from which the quotation was borrowed.”

So, possibly, it would be reasonable to use cite attributes to specify the subject of a series of triples almost as though we were quoting some metadata from that URI. So, my example could be rewritten as:

<blockquote cite="http://example.com/physics">
  <p>
    <span class="dc-title">My physics page</span> which is about
    <a href="http://en.wikipedia.org/wiki/Physics" rel="foaf-topic">physics</a>
  </p>
</blockquote>

which would represent the following two triples:

<http://example.com/physics> dc:title "My physics page" .
<http://example.com/physics> foaf:topic <http://en.wikipedia.org/wiki/Physics> .
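
Here is a minimal sketch of how an extractor might treat cite as the subject, written in Python with the standard xml.etree.ElementTree and assuming the fragment is well-formed XHTML. Class tokens on non-anchor elements yield literals and rel tokens on anchors yield URIs. It ignores nested cite attributes and prefix declarations, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def qname(token):
    """Turn 'dc-title' into 'dc:title' by splitting at the first hyphen."""
    prefix, _, local = token.partition("-")
    return f"{prefix}:{local}"


def extract_cited_triples(xhtml):
    """Return triples whose subject is the cite URI of an enclosing element."""
    root = ET.fromstring(xhtml)
    triples = []
    for quoted in root.iter():
        subject = quoted.get("cite")
        if subject is None:
            continue
        for element in quoted.iter():
            if element is quoted:
                continue
            href = element.get("href")
            if href is not None:
                # Anchors: rel tokens give triples with URI values.
                for token in element.get("rel", "").split():
                    if "-" in token:
                        triples.append((f"<{subject}>", qname(token), f"<{href}>"))
            else:
                # Other elements: class tokens give triples with literal values.
                text = "".join(element.itertext()).strip()
                for token in element.get("class", "").split():
                    if "-" in token and not token.startswith("-"):
                        triples.append((f"<{subject}>", qname(token), f'"{text}"'))
    return triples


fragment = """<blockquote cite="http://example.com/physics">
  <p>
    <span class="dc-title">My physics page</span> which is about
    <a href="http://en.wikipedia.org/wiki/Physics" rel="foaf-topic">physics</a>
  </p>
</blockquote>"""
cited = extract_cited_triples(fragment)
for triple in cited:
    print(" ".join(triple), ".")
```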

The cite attribute can be used on q, blockquote, ins and del tags which collectively allow all kinds of markup to be contained within them, certainly enough for our needs.

The second problem, that of embedded classes, is harder. Eric wants this for better Piggy Bank integration and has promised me mucho lucre as an incentive (well mucho more than zero, mucho less than $2).

There are a number of approaches. The first is actually already supported. You can just embed an rdf:type link in the HTML:

<p id="ian">I am a <a rel="rdf-type" href="http://xmlns.com/foaf/0.1/Person">person</a></p>

I don’t think this is what Eric needs because the usage is likely to be low - it’s extraneous information for humans. Besides, who’s going to be linking to RDF property definitions in their prose? (RDF geeks excluded of course).

An alternative is to do something funky in the XSLT that extracts the RDF from the HTML. It could dereference each schema referenced in the head of the document and extract all the domains and ranges. This is possible but how would it deal with domains or ranges that have blank nodes as their value? Probably simply by ignoring them. A more serious problem is that often the author knows more than the schema. I might be using the foaf:depiction property to relate me to my picture. The FOAF schema declares the domain of foaf:depiction to be owl:Thing but obviously I know that I’m describing the property of a foaf:Person and it would be nice to be able to explicitly state this.

Another approach is to continue using the class attribute but lexically separate types, perhaps by prefixing with a special character. The CSS grammar doesn’t allow much wiggle room: basically it’s a hyphen or nothing! Perhaps a class name prefixed with a hyphen should be interpreted as the name of a type like this:

<p id="ian" class="-foaf-Person">I am a person</p>

which would infer an additional triple:

<#ian> rdf:type foaf:Person .

I could live with this since I believe the number of type declarations in a document will be small compared to the use of properties. What do you think? I’m casting around for other suggestions here too. If you have a good idea let me know and help me win Eric’s dollar.


Nov 09 2005

Naked Metadata Using Embedded RDF

Published by Ian Davis under Uncategorized

Jonathan O’Donnell has implemented embedded RDF in his page on Naked Metadata. His reasoning echoes exactly the microformats principles of visible metadata and don’t repeat yourself:

When I first learned to put Dublin Core into Web pages, I often found myself replicating data. I would place a DC.creator tag in the head, even though the name of the author was on the Web page. This annoyed me, because I knew that it is bad practice to replicate data like that. When I mentioned this to a workmate at the time, he said that I could probably make a link from the metadata field to the data in XML. At that stage, I didn’t understand enough XML to even understand the concept, much less make it work.

Fast forward eight years to DC-ANZ 2005, where Eve Young and Baden Hughes made the point that people updating Web pages often don’t update the metadata. One of the problems that they talked about was that metadata in the header is essentially invisible to people editing the page (when, for example, using some wysiwyg editors).

In general, data (including metadata) should be stored in one place only. This prevents drift: if it is only stored in one place, it can only be updated in that place.
