Archive for December, 2006

Dec 16 2006

Search Assistance at eBay

Published by Ian Davis under Uncategorized

I spotted an interesting assistive search technique at eBay today:

Image of eBay's search results page showing alternate searches using fewer search terms

It’s interesting to me because of its simplicity, and it could be pretty effective at teaching people how to refine searches, which is something that doesn’t come naturally. It’s enlightening to watch an inexperienced person use a search engine and compare them with a pro who magically seems to know the right search terms to type in to get results.
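A minimal sketch of how "alternate searches using fewer search terms" might be generated. This is only a guess at the general idea, not eBay's implementation; the function name `relaxed_queries` and the `min_terms` parameter are made up for illustration.

```python
from itertools import combinations


def relaxed_queries(query: str, min_terms: int = 1) -> list[str]:
    """Return alternate queries made by dropping terms, longest first."""
    terms = query.split()
    alternates = []
    # Drop one term, then two, and so on, keeping the original word order.
    for size in range(len(terms) - 1, min_terms - 1, -1):
        for kept in combinations(terms, size):
            alternates.append(" ".join(kept))
    return alternates


print(relaxed_queries("red leather sofa", min_terms=2))
```

A real system would presumably only show the alternates that return results, alongside the number of matches for each.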

Comments Off

Dec 15 2006

Shit Happens

Published by Ian Davis under Uncategorized

I was amused to find the following when I went to read Mark Pilgrim’s cynical jibe at some broken encoding on a web page. Yes, amazingly, software doesn’t always work the way we want it to. And serving up an internal error as 200 rather than 500? The shame of it.

Screenshot of a webpage informing the visitor that the site's database is unavailable
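For the 200 versus 500 point, here is a minimal sketch of a WSGI application that reports a database failure with an error status instead of wrapping it in a success response. The `fetch_posts` function and the `db_available` flag are hypothetical stand-ins for a real data layer, not anything taken from the site in the screenshot.

```python
def fetch_posts(db_available: bool) -> list[str]:
    """Hypothetical data access function that fails when the database is down."""
    if not db_available:
        raise ConnectionError("database unavailable")
    return ["first post"]


def make_app(db_available: bool):
    """Build a tiny WSGI app; db_available stands in for a real connection check."""
    def app(environ, start_response):
        try:
            body = "\n".join(fetch_posts(db_available))
            status = "200 OK"
        except ConnectionError:
            # Report the failure honestly instead of hiding it behind a 200.
            body = "Sorry, the database is unavailable."
            status = "500 Internal Server Error"
        payload = body.encode("utf-8")
        start_response(status, [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(payload))),
        ])
        return [payload]
    return app
```

The visitor sees the same apology either way, but crawlers, caches and monitoring tools only learn that something is wrong from the status line.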

2 responses so far

Dec 15 2006

Set the Controls for the Heart of the Web

Published by Ian Davis under Uncategorized

Jim Hendler continues his exploration of the dark side of the semantic web with a must-read editorial for IEEE Intelligent Systems, the well-respected AI journal:

A key realization that Berners-Lee had with respect to the design of RDF is having unique names for different terms, with a social convention for precisely differentiating them, could in and of itself be an important addition to the Web. If you and I decide that we will use the term “http://www.cs.rpi.edu/~hendler/elephant” to designate some particular entity, then it really doesn’t matter what the other blind men think it is, they won’t be confused when they use the natural language term “elephant” which is not even close, lexigraphically, to the longer term you and I are using. And if they choose to use their own URI, “http://www.other.blind.guys.org/elephant” it won’t get confused with ours.

Too right! I paraphrase this in my RDF tutorial as “Don’t use my names for your ideas, unless you mean to refer to the exact same thing. Then please do!” This Wittgenstein avoidance technique was a stroke of genius: the person who creates the name gets to define it, and the definitions can be in terms of other names. Anyone wanting to talk about the same thing can check the definition and, if they agree with it, they should feel free to use that name; if not, they can just create their own. Because names can be defined in terms of one another, every new name created can potentially be related to any other. This can even be done for private interpretations of concepts that one person may consider to be close enough for their own purposes. If I want to say “cricket” and “chess” are both “games” then I can do so for my own purposes, even if other people consider one to be a “sport” and the other a “pastime”.

The second key innovation that TBL made, which Jim doesn’t refer to, is the selection of URI syntax for the names. Arguably this is actually the more important innovation, since without it the unique naming technique cannot work in practice. URIs have the important property of dereferenceability, which means that they can be used to fetch information about the thing the name represents. We see this every day when we type URIs starting with http:// into our web browsers. For names that someone has created to represent concepts like cricket it’s useful to send the definition when the URI is accessed. That way the user can decide whether they want to use that name for their own conversations. The use of URIs for naming is what makes the Semantic Web the Semantic Web.
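The cricket and chess example can be sketched with plain tuples standing in for RDF statements. Everything here is an assumption for illustration: the two namespaces and the `isA` and `sameThingAs` names are invented, and a real application would use an RDF library and published vocabularies.

```python
# All URIs below are made up for illustration; none is a published vocabulary.
MY = "http://example.org/ian/terms#"
THEIR = "http://example.com/other/terms#"

# A statement is a (subject, predicate, object) triple of names.
statements = {
    (MY + "cricket", MY + "isA", MY + "game"),
    (MY + "chess", MY + "isA", MY + "game"),
    (THEIR + "cricket", THEIR + "isA", THEIR + "sport"),
    (THEIR + "chess", THEIR + "isA", THEIR + "pastime"),
    # Relating two independently created names is just one more statement.
    (MY + "cricket", MY + "sameThingAs", THEIR + "cricket"),
}


def things_of_kind(kind: str, is_a: str) -> set[str]:
    """Return every subject stated to be of the given kind via the given predicate."""
    return {s for (s, p, o) in statements if p == is_a and o == kind}


print(sorted(things_of_kind(MY + "game", MY + "isA")))
```

Because each party's names live under their own URI prefix, both classifications sit in the same data set without colliding.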

One response so far

Dec 14 2006

Mummy Google

Published by Ian Davis under Uncategorized

Aaron Swartz:

I was talking with a friend the other day about that perennial subject of conversation in the Valley, Google. And finally she gave me the clue that made the whole place make sense. “It’s about infantilizing people,” she explained. “Give them free food, do their laundry, let them sit on bouncy brightly-colored balls. Do everything so that they never have to grow up and learn how to live life on their own.”

My mission is to convince Talis to give me a big bouncy ball to sit on.

Comments Off

Dec 14 2006

Equilibrium Points

Published by Ian Davis under Uncategorized

Over on the Long Tail blog I found a fascinating fact about the render times of Pixar movies: In 1995 each frame of Toy Story took two hours to render, yet in 2005 an average frame of Cars took 15 hours! One commenter remarks:

If I remember right, [Tony De Rose's] first project at Pixar was to work on a system to simplify/accelerate the “rigging” of control points for facial features in 3d models. What he found was that every improvement he made in simplifying facial animation was being used by the animators to adding more precision and detail, rather than to reduce the amount of time required to do the facial animation.

And another:

I’ve also heard that the time to get from midtown to downtown Manhattan has remained constant (~45 minutes) for the last one or two hundred years. The increases in transportation technology just allow the city to pack in more density.

Which reminds me of the traffic speed in London:

Despite the congestion charge, traffic in central London moves at just 10mph – the same speed as horse-drawn carriages a century ago.

Average traffic speed has improved by only 1.5mph since the toll’s introduction in 2003, mayor Ken Livingstone has admitted before the London Assembly.

That means cars in central London now travel at the speed of a running chicken, instead of a running house mouse.

It seems to me that there’s a general principle at work here which creates an equilibrium between forces. In the Pixar case it’s the opposition of cost, quality and time that makes the render times rise rather than fall. With traffic it’s time versus convenience. In physics we have the ideal gas laws that constrain the temperature, pressure and volume of a gas (which explains why aerosol sprays come out cold and hot air balloons rise).

The gas laws define a strongly constrained system (change the volume and the pressure or temperature has to change to compensate) but the Pixar and traffic systems are weakly constrained – there’s some give in the system.
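To make "strongly constrained" concrete, here is a small sketch of the ideal gas law, P = nRT / V. The function name and the example quantities are chosen purely for illustration.

```python
R = 8.314  # molar gas constant in J/(mol*K), rounded


def pressure(moles: float, temperature_k: float, volume_m3: float) -> float:
    """Pressure in pascals from the ideal gas law, P = nRT / V."""
    return moles * R * temperature_k / volume_m3


before = pressure(1.0, 300.0, 0.024)
after = pressure(1.0, 300.0, 0.012)  # same gas, same temperature, half the volume
print(after / before)
```

Halving the volume at a fixed temperature doubles the pressure, with no slack anywhere in the system.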

With limited capacity or capability, people naturally make compromises in their usage of any particular resource. Back in 1995 the Pixar artists pushed their cost and time resources to the limit to get the best quality, and the same is true today, but the budgets have increased. Each system seems to have a natural equilibrium point: the traffic in London will always move at the speed of a running chicken because that’s the minimum speed commuters will tolerate. I wonder if there are any equilibrium systems in the web?

Comments Off

Dec 14 2006

AI it aint

Published by Ian Davis under Uncategorized

It’s great to see someone like Jim Hendler get it:

…you use a small amount of Sem Web (think Foaf or Skos) to add a bit of organizational knowledge (and to webize with URIs) to tagging sites, microformats, and etc. It is the realization that the REST approach to the world is a wonderful way to use RDF and it is enpowered by the emerging standards of SPARQL, GRDDL, RDF/A and the like.

And a final flourish…

And to my AI buddies holed up in your Ivy covered towers, it’s true, I have sold out to the Dark Side — get over it!

I find all the deep OWL reasoning talk at conferences like ISWC very tiring. I mean, I find it fascinating but it’s unrealistic to expect that it’s going to be useful on any kind of web scale any time in the next 20 years. What does work today and is truly unique to the semantic web is the universal data model that RDF provides and the decentralisation of that through URIs. Yes, you can build an AI system on top of that but it was hard enough when the researchers had free rein over the data representation and execution context. I’m not sure why some think it’s going to be any easier on the Web.

What will make a difference is the volume of data available that could be accessible to the eventual AI applications. But I never got the impression that interpreting the data was the hard bit about AI.

RDF puts the web of data within our grasp and a light touch with some weak semantics will help organise this better for applications and humans to deal with. But, as I wrote a couple of days ago, the role of the human is inseparable from the web.

Don’t get me wrong, I’m a believer in strong AI and I expect a machine will one day be capable of independent thought, but I don’t think it will happen in my lifetime. Unless, that is, the Singularity occurs in the next 20 years and then I’ll have all the time in the world ;-)

3 responses so far

Dec 13 2006

Google Map Image Maker

Published by Ian Davis under Uncategorized

This looks useful:

With P2K*P2K, you can create (up to) 2000*2000px high resolution maps, and they are download-able in GIF format. All you need is type in the latitude, longitude of the location, choose the size and zoom level. Lightening fast.

Here’s one of my local area with the mislabelling of the village of Thorpe Malsor as Loddington. Loddington is actually the unnamed village at the far left of the map.

Comments Off

Dec 11 2006

Redefining Web 2.0

Published by Ian Davis under Uncategorized

Tim O'Reilly at Web 2.0 conference

Tim O’Reilly has posted yet another definition of Web 2.0. This time he has synthesised several earlier definitions into something that he hopes is clearer:

Web 2.0 is the business revolution in the computer industry caused by the move to the internet as platform, and an attempt to understand the rules for success on that new platform. Chief among those rules is this: Build applications that harness network effects to get better the more people use them. (This is what I’ve elsewhere called “harnessing collective intelligence.”)

While I think this is still too woolly, I thought I’d pick up on the “chief” rule of “applications that harness network effects to get better the more people use them”. Now, I actually believe this is more a definition of social software than Web 2.0 in general. It’s very similar to the definition Tom Coates wrote almost two years ago:

Social Software can be loosely defined as software which supports, extends, or derives added value from, human social behaviour – message-boards, musical taste-sharing, photo-sharing, instant messaging, mailing lists, social networking.

However, O’Reilly’s definition is slightly stronger. It distinctly says that applications get “better” the more people use them. In the comments to that post, a visitor asks whether that final phrase means “to get better as more people use them” or “to get better as people use them more”. O’Reilly’s answer is:

…”both”. More people, or more usage, or both. A system with lots of people doing very little might be less powerful than one with a mid-sized group doing a lot. Most participatory systems involve both large, little-involved groups and smaller, more committed core groups

So the definition is applications that harness network effects to get better as more people use them or as people use them more. O’Reilly states his definition as a set of rules that will help inform businesses how to be successful using the Internet as a platform. I thought it would be interesting to explore a few Web 2.0 applications and see how this rule applies in each case. I often find that taking things to extremes helps me better understand the underlying problem so I imagined each candidate application having only a single user and asked if the experience were any different from having many users.

For starters, an easy one: eBay. My guess is that eBay must have tens of millions of users buying and selling all manner of goods. What if there were only a single user? Clearly in that case the single user can’t sell to herself and there’s nothing new to buy either. eBay is definitely better with more people using it so under O’Reilly’s definition eBay is doing a lot of the right sort of thing to be successful on the Internet.

last.fm? With only a single user this becomes simply a place to record your favourite tracks. There’s some utility in knowing your all time favourite artist and album, but it’s so much better when your tastes can be correlated with thousands of other people’s.
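The single-user thought experiment is easy to play out in a toy form. This sketch assumes a deliberately naive notion of "correlated tastes" (any shared favourite counts); it is not how last.fm works, and the users and their favourites are invented.

```python
def recommend(me: str, favourites: dict[str, set[str]]) -> set[str]:
    """Suggest artists liked by users who share at least one favourite with me."""
    mine = favourites[me]
    suggestions = set()
    for user, theirs in favourites.items():
        if user != me and mine & theirs:
            # Overlapping taste: offer whatever they like that I have not listed.
            suggestions |= theirs - mine
    return suggestions


alone = {"ian": {"Pink Floyd", "Genesis"}}
crowd = {
    "ian": {"Pink Floyd", "Genesis"},
    "ann": {"Pink Floyd", "Yes"},
    "bob": {"Genesis", "King Crimson"},
    "cat": {"Madonna"},
}
print(recommend("ian", alone))  # a single user has nobody to be correlated with
print(recommend("ian", crowd))
```

With one user the function has nothing to offer; every extra listener with overlapping tastes adds to what it can suggest.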

What about Basecamp, the lightweight project management application built by 37signals? If there were only one user then it seems to me that she could use Basecamp as effectively as when there are dozens or hundreds. Each user has their own project area which is entirely isolated from other people’s. This is by design but it means that Basecamp doesn’t get better the more people use it.

Stikkit was one of the startups launched at the recent Web 2.0 Summit. It’s an application that uses some smart pattern recognition software to overlay structure on your random notes. You can invite other people to contribute to or comment on your notes. However, although Stikkit enables you to collaborate with others, their participation doesn’t significantly improve the experience. The vast majority of the utility that comes from Stikkit is available whether there is a single user or many.

What about something that features heavily in the Web 2.0 mashup scene: Google Maps? Clearly the availability of the APIs coupled with some very interesting data makes this application a great poster-child for the Web 2.0 movement. However, adding more users makes it no better than having a single user. The reason for this is pretty easy to see: Google Maps is purely a spectator sport. There’s no participation allowed or encouraged and, as far as I know, Google doesn’t even use incidental usage data to improve the application. So, perhaps this is a different kind of Web 2.0 application, one that doesn’t follow the chief rule of getting better the more people use it.

To be successful in any medium you need to exploit the advantages that medium gives you. You wouldn’t expect televised news reporting to be successful if it were presented like a newspaper, would you? The intrinsic advantage that television gives you is the capability to instantaneously broadcast moving pictures and sounds to millions of people. Any news reporting that doesn’t use that to the full or treats the medium like an extension of print is doomed to failure. O’Reilly is telling us that to be successful on the Internet you need to exploit the innate advantages of the Internet as a medium.

What are those advantages? There are a few, but the single most important one is the capability to enable almost zero cost communication and exchange of information between any number of people. In Tom Coates’ terms it enables human social behaviour on unparalleled scales. In the same way that television encompasses the advantages of print and adds new capabilities, the Internet enables the same broadcast mode as television but adds the capability to incorporate multiway communication between the parties. Building applications that function as though the Internet were a cheaper way to broadcast moving pictures and sound is akin to newscasters simply reading out copies of that morning’s newspaper on TV!

This communication advantage is the underlying reason for O’Reilly’s chief rule which could be expanded to:

Web 2.0 is the business revolution in the computer industry caused by the move to the internet as platform, and an attempt to understand the rules for success on that new platform. Because the primary advantage of the Internet as a medium is that it enables almost zero cost communication and exchange of information between any number of people the chief rule for success is to exploit this and build applications that harness network effects to get better the more people use them.

Basecamp and Stikkit are taking advantage of another capability that the Internet provides: location transparency. By storing data centrally and using the Internet to access that data these applications eliminate the need for their users to worry about where their data is. It’s available wherever they are. However, although this is a capability the Internet provides, it’s not unique to the Internet and hence isn’t a significant advantage of the medium. Both applications would work as well over a private dial-in network. Less convenient, but still perfectly workable. The same is true to a lesser extent of Google Maps.

Because Basecamp and Stikkit aren’t exploiting the primary advantage of the Internet as a medium, I predict that unless they embrace supporting, extending, or deriving added value from human social behaviour they will ultimately be left behind and fail.

The idea that the Internet uniquely enables this multiway communication isn’t new. There’s a long history from Tim Berners-Lee’s original web browser that let the user edit the page, through Dave Winer’s prescient Two-Way-Web ideas right up to Richard MacManus’ Read/Write Web. It’s no mystery why the Internet’s killer app has been email.

It’s also no surprise that the Bubble generation of Internet applications failed so spectacularly. They failed to realise the real advantage that the Internet brought and treated it like a cheaper broadcast medium or, worse still, a better newspaper.

Web 2.0 is about using the Internet for what it’s good at: building web applications that enable and benefit from human social behaviour on a massive scale.

3 responses so far

Dec 08 2006

We’re Blogging

Published by Ian Davis under Uncategorized

Talis has a bunch of company blogs that a lot of us contribute to. However, many Talisians have their own personal blogs too: (hope I didn’t miss anyone – let me know!)

I make that over 10% of the company!

Comments Off

Dec 08 2006

Ed Parsons Moving On

Published by Ian Davis under Uncategorized

With interest I note that Ed Parsons is stepping down as CTO of the Ordnance Survey. Some in the industry are puzzled but, having watched the interaction between Ed and projects like OpenStreetMap, I’d like to hazard a guess. Ed has shown some interest in the open data movement (here, here and especially here for example). I think his guarded support should be read as very encouraging given his position as an executive officer of a government organisation that depends on commercialising its data to fund itself. In fact, I like to think that perhaps Ed would have liked the Ordnance Survey to be rather more open than it currently is. I’ve never spoken to him so this is purely speculation, but I think the key part of his post is this quote from the OS intranet:

Ed is keen at this stage of his career to help develop more innovative areas of the GI industry. His decision comes as Ordnance Survey is focusing on a period of a consolidation in its strategic IT development and direction.

I read that as saying “Ed wanted to innovate, we didn’t”.

Comments Off
