Nov 09 2006

The Great Database in the Sky

Published by Ian Davis at 7:42 pm under Uncategorized

I’m baffled.

I’ve just watched Mårten Mickos from MySQL give a 10-minute talk on what he terms the “Great Database in the Sky”, almost exactly describing our community’s vision of a “web of data” while remaining completely ignorant of the semantic web.

To start, he characterised Google as giving unstructured people access to unstructured data whereas MySQL gives structured people access to structured data, meaning that MySQL is targeted towards developers who understand how to structure data “properly”. A strange polarisation in my view, but I guess he’s trying to put clear blue water between the Google approach and the traditional database approach. At Talis, we don’t see this distinction at all and our core platform technology, Bigfoot, unifies structured and unstructured data.

He went on to describe his vision of a Skype for database access, combining my data, your data and public data into the next-generation OLAP, running a trillion transactions per day. He gave weather data as an example and asked: what if you could run a SQL statement across all the data sources in the world, something like SELECT CurrentWindDirection, CurrentWindSpeed FROM AllTheWorldsWeatherStations, MyOwnWeatherStation, MyFriendsWeatherStation?

It’s a noble goal, but he’s not the first to suggest it. It’s also not a future vision, because you can do it today with SPARQL. It’s at the heart of Bigfoot, and there are many other public services that can be used to learn and experiment. You can even query across HTML pages containing embedded structured data.
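To make the idea concrete, here is a minimal sketch in plain Python of what querying across merged data sources amounts to. It is not SPARQL and not how Bigfoot works; it only illustrates that once every source publishes triples using shared URIs, combining them is a simple union and the weather question becomes a join over that union. Every URI, property name and reading below is invented for illustration.

```python
# Hypothetical data: each source publishes (subject, predicate, object) triples.
# All URIs and values below are invented for illustration.
WX = "http://example.org/weather#"

my_station = [
    ("http://example.org/me/station", WX + "windDirection", "SW"),
    ("http://example.org/me/station", WX + "windSpeed", 12),
]
friends_station = [
    ("http://example.com/friend/station", WX + "windDirection", "N"),
    ("http://example.com/friend/station", WX + "windSpeed", 7),
]
public_stations = [
    ("http://example.net/stations/42", WX + "windDirection", "E"),
    ("http://example.net/stations/42", WX + "windSpeed", 20),
    # This station reports a direction but no speed.
    ("http://example.net/stations/43", WX + "windDirection", "W"),
]


def merge(*sources):
    """Merge sources into one graph: a plain union, since URIs are global names."""
    merged = set()
    for source in sources:
        merged.update(source)
    return merged


def current_wind(graph):
    """Return {station: (direction, speed)} for stations reporting both values."""
    directions = {s: o for s, p, o in graph if p == WX + "windDirection"}
    speeds = {s: o for s, p, o in graph if p == WX + "windSpeed"}
    return {s: (directions[s], speeds[s]) for s in directions if s in speeds}


if __name__ == "__main__":
    graph = merge(my_station, friends_station, public_stations)
    for station, (direction, speed) in sorted(current_wind(graph).items()):
        print(station, direction, speed)
```

A real SPARQL engine would express the same join declaratively and could fetch the sources over HTTP, but the shape of the problem is the same: shared identifiers first, then a query over the merged graph.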

He followed it up by saying that if this were achievable then a whole new generation of Web 2.0 applications could be possible. Nothing controversial there: we share the same vision! But we think it’s closer than he does.

What else? Oh yes, he said “we may need a DNS of SQL servers” and that “routing may be an issue”. Another point of agreement: that’s why we built a directory of data collections and services, and built web services to route straight into that content.

Then, “how do you make data definitions understandable to others?”. That’s almost like a problem statement for RDF! And yet he didn’t mention it in his list of technologies that might be candidates for the solution: RSS, Atom, Jabber, HTML, HTTP, XML, SQL and SMS.

He concluded his talk with the tagline “The data is the platform” and then took a question from the audience: “How is this different from the semantic web?”.

This is where it became evident that there is a deep disconnect between the traditional database community and the semantic web community. Mårten’s response was rather vague: that his vision wasn’t as broad as the semantic web, and that the semweb includes unstructured data and so wasn’t appropriate.

What a shame, and what a failure of the semantic web community, if the CEO of MySQL AB cannot see how his vision for an interconnected web of data is the same as ours! We must try harder and demonstrate at all levels the value of the semantic web approach to people like Mårten. SWEO and SWIG will help, but the convincing arguments will come from the practical applications of the semantic web being developed to solve real-world problems.

Which is why I’m at Talis.

10 Responses to “The Great Database in the Sky”

  1. Alex James on 09 Nov 2006 at 8:48 pm

    Ian,

    I couldn’t agree more… over the last 6 months I have been talking about my vision for the future of data (Data2.0, corny tagline I know but hey…)
    Now I come from a practical background in O/R mapping, with my own WinFS-like product Base4.

    Part of the problem, I think, for us non-semweb people is the dismissive tone of some of the comments you get when you start getting excited about this stuff. Statements like “this is nothing new, it’s just RDF” are common catch cries that we often hear when a semweb advocate has only taken a cursory glance at what we are discussing.

    It seems the focus is always on ‘what is the same’ about the idea, rather than ‘what is different’, when perhaps ‘what is different’ is much more interesting because of the very fact that it originated somewhere else, i.e. RDBMS or O/R mapping. And often that somewhere else is a whole lot more practical and pragmatic too.

    BTW, I am not saying ALL or even most semweb guys are dismissive; there are plenty who aren’t. But it only takes 2 or 3 dismissive comments, and newbies are likely to bury their heads in the sand even further. No one likes hearing they are unoriginal!

    Cheers
    Alex

  2. iand on 09 Nov 2006 at 9:14 pm

    Alex,

    I agree with you. It’s all too easy to point and say “we already thought of that” but much much harder to show concrete benefits of an alternate approach. The semweb community needs to spend more time understanding other domains and the motivations of the users in those domains.

    For example, lots of people use databases and understand them pretty well. However most databases don’t let you easily reference data in other databases. Within a single vendor’s product there are usually schemes for addressing other tablespaces on the same machine and sometimes on different machines. But in general you can’t reference data in another vendor’s software or in a very remote location. That’s a problem that the semantic web solves very simply by giving all important data elements a URI which can refer to local things or remote things across network and vendor boundaries.

    So, when a user encounters the problem of referencing data in different domains then the semantic web offers one possible solution to that. I’d like to see more effort spent on this kind of benefit evaluation from both sides.

    Ian

  3. Marten Mickos on 09 Nov 2006 at 10:56 pm

    Ian,

    Thanks for commenting on my presentation, and sorry if I did not give proper credit to the semantic web, which I should have. I certainly had not come up with those ideas myself - but I hope I gave them a flavour which lends itself to SQL databases. I can also gladly admit that I am no expert on the semantic web.

    But I would also wonder if it isn’t actually a strength of the concepts of the semantic web that even an amateur like myself can present something so that it sounds like the semantic web.

    Marten Mickos, MySQL AB

  4. iand on 09 Nov 2006 at 11:15 pm

    Hi Mårten,

    I was certainly struck by how similar your vision is to the semantic web one. I think we’re all striving for the same goal using different technologies. It would be great if the different communities could work together more closely toward this common goal. I would love to see MySQL be a pioneer for the Semantic Web but I don’t know how to catalyse this. What do you suggest?

    Ian

  5. Kingsley Idehen on 09 Nov 2006 at 11:15 pm

    Ian,

    As I am sure you know, this matter comes down to the ability to expose existing data (SQL, XML, Free Text) as RDF Instance Data.

    This is why we built a Virtual DBMS using an ORDBMS many years ago (circa 1998, see the story at: http://virtuoso.openlinksw.com/wiki/main/Main/VOSHistory/ ).

    Virtuoso provides the ability to expose native or 3rd party data as RDF Instance Data via our RDF VIEWs capability. There’s lots of material on this subject at: http://virtuoso.openlinksw.com/wiki/main/Main/VOSRDF/ .

    I expect other DBMS vendors to follow :-)

  6. iand on 09 Nov 2006 at 11:22 pm

    I’d love to see MySQL support production of RDF directly from the database. Even better, I’d love to see them support the SPARQL protocol!

  7. Andrew Tetlaw on 10 Nov 2006 at 6:12 am

    A DB with an HTTP REST API built in: http://couchdb.org

  8. Alex James on 10 Nov 2006 at 12:32 pm

    FYI

  9. [...] The most interesting part for me was what he said about the relationship to the Database community. Some new things I will have to learn, that is for sure (e.g., he referred to the term “dataspaces”, which seems to be the new buzzword in that community). And, clearly, there is some extra outreach to be done in that space; I just read the blogs of Ian and Danny on the keynote of Mårten Mickos at the Web 2.0 Summit, which does not look very good. We already had hallway discussions on trying to organize an event around the relationship between SW and databases in 2007 (e.g., at W3C); maybe the topic should be a bit larger than I originally thought. To be followed up… [...]

  10. Internet Alchemy » Bridging Two Worlds on 21 Nov 2006 at 1:39 pm
