Semantics are important. They always have been and they always will be, because they are fundamental to the process of communication. What’s been changing recently is the means that we use to record and communicate information, especially with the advent of the internet.
Currently the web is mainly based on HTML documents, but the markup in these documents mostly tells browsers how to display the content. Other machines find it hard to interpret what the text actually means, because it comes with no machine-readable context.
The semantic web tries to solve this by allowing people and machines to publish data in formats designed for storing, describing and transmitting data, such as RDF, alongside ontologies: a kind of standardised, machine-readable way to describe data models. This also helps with the interlinking of data, because different publishers can easily share standard terminology.
This might sound complicated but it's quite simple really, and it doesn't necessarily mean that you need to publish everything twice: once for humans and once for machines. Admittedly, RDF in XML form is not very readable to humans but its concept makes a lot of sense.
RDF is based around the concept of the triple: every piece of information can be expressed as a combination of a subject, a predicate and an object (or a thing, a property and a value, if you will). For example, "Ric has a mass of 82 kg" (Ric being the thing, mass the property and 82 kg its value). Furthermore, the value in an RDF statement can itself be another thing: "Ric's home town is Manchester", where Manchester is a thing that can have statements of its own. Using triples as the basic building blocks, you can represent complex data and concepts.
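To make the idea of triples concrete, here is a minimal sketch in plain Python. It is not RDF proper (real RDF identifies things and properties with URIs), and the `Triple` type, the `match` helper and the third statement about Manchester are all made up for illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: a thing, a property of it, and the property's value."""
    subject: str
    predicate: str
    obj: str


# The first two statements come from the text above. The third is an
# illustrative extra showing that a value (Manchester) can itself be a subject.
triples = [
    Triple("Ric", "mass", "82 kg"),
    Triple("Ric", "homeTown", "Manchester"),
    Triple("Manchester", "country", "United Kingdom"),
]


def match(store, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return every triple that agrees with the parts that were supplied."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything we know about Ric:
for t in match(triples, subject="Ric"):
    print(t.subject, t.predicate, t.obj)
```

Because every statement has the same three-part shape, one generic query function is enough to ask "what do we know about Ric?" or "who has Manchester as a home town?".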
With a bit of thought, software can be designed with semantics and RDF in mind (it's not too far removed from the principle of object-oriented design), and it is then simple to provide the data in the appropriate format when another person or computer requests it. Modern web frameworks make this very easy (for example Rails, with its support for RESTful routing and its respond_to method).
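As a rough, framework-free sketch of that idea: the same resource can be rendered as HTML for people or as N-Triples (a simple line-based RDF format) for machines, depending on what the client asks for. The `example.org` URIs, the `respond` function and the renderers are hypothetical, and the Accept-header handling is deliberately naive (it ignores q-values and does not escape special characters in literals).

```python
from html import escape

VOCAB = "http://example.org/vocab/"  # hypothetical vocabulary namespace

# Hypothetical resource: one thing and some of its properties.
RIC = {
    "uri": "http://example.org/people/ric",
    "properties": {"mass": "82 kg", "homeTown": "Manchester"},
}


def to_html(resource):
    """Human-oriented view: a simple HTML list."""
    rows = "".join(
        f"<li>{escape(name)}: {escape(value)}</li>"
        for name, value in resource["properties"].items()
    )
    return f"<ul>{rows}</ul>"


def to_ntriples(resource):
    """Machine-oriented view: one N-Triples statement per property."""
    return "\n".join(
        f'<{resource["uri"]}> <{VOCAB}{name}> "{value}" .'
        for name, value in resource["properties"].items()
    )


RENDERERS = {"text/html": to_html, "application/n-triples": to_ntriples}


def respond(resource, accept_header):
    """Return (content_type, body) for the first media type we can serve."""
    requested = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in requested:
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](resource)
    return "text/html", to_html(resource)  # fall back to the human view


content_type, body = respond(RIC, "application/n-triples")
print(content_type)
print(body)
```

The data is stored once; only the final rendering step differs, which is why publishing for machines need not mean publishing everything twice.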
If the things being referred to in your RDF statements have URIs that you can look up over HTTP, then browsing to them can provide the user or machine with information about those resources, in the form of a web page or an XML document (which might itself include links or references to other related resources). This is what is meant by Linked Data. It's about making data linkable and browsable in a web environment. Combined with RDF, it lets users flexibly create data models so that data can be combined in interesting ways.
Surely at the moment, most data is stored in a database and if we want to present this as an HTML document, an RSS/XML feed, or something else, it's just a matter of creating a new "view" for it. Doesn't that allow for the same situations?
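The link-following part can be sketched without any network access. Below, a dictionary stands in for the web: looking up a URI in it plays the role of an HTTP request. All URIs, the `WEB` dictionary and the `dereference` and `follow` helpers are invented for this illustration.

```python
# Hypothetical mini "web": each URI maps to (predicate, value) statements
# about that resource. A value may itself be the URI of another resource.
WEB = {
    "http://example.org/people/ric": [
        ("http://example.org/vocab/mass", "82 kg"),
        ("http://example.org/vocab/homeTown", "http://example.org/places/manchester"),
    ],
    "http://example.org/places/manchester": [
        ("http://example.org/vocab/name", "Manchester"),
    ],
}


def dereference(uri):
    """Stand-in for an HTTP GET: return the statements published at a URI."""
    return WEB.get(uri, [])


def follow(uri, *predicates):
    """Follow a chain of predicates from uri; return the final value or None."""
    current = uri
    for predicate in predicates:
        values = [value for p, value in dereference(current) if p == predicate]
        if not values:
            return None
        current = values[0]
    return current


home_town_name = follow(
    "http://example.org/people/ric",
    "http://example.org/vocab/homeTown",
    "http://example.org/vocab/name",
)
print(home_town_name)  # prints "Manchester"
```

The client only needed a starting URI and some predicate names; the data about Manchester could just as well have been published by somebody else on a different server.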
Hi Spode. Good question.
The data itself can still be stored in a database, and I imagine that in many cases that is still the best option.
The semantic web is all about exposing the data in a way that other systems can understand, without having to write bespoke software to prepare that data (or to consume it). It comes down to giving the data a machine-understandable context. Presenting a feed as RSS rather than HTML doesn't add much value to the data: it's just a different format.
So are you suggesting a unified interaction method then? Surely SQL is pretty unified? Not entirely, obviously - but it's close.
Is this a replacement for complex APIs that involve writing a new class every time a new service comes out?
Yes, but unfortunately it's not my idea! Tim Berners-Lee (Mr. Internet) explains his vision here and here.
What I'm talking about is a level of abstraction above databases: you can (and probably should) still store the data in a way that makes sense to your app. The semantic web and linked data are about giving context to what you expose to the rest of the world, in a way that other humans and computers can understand. You wouldn't want to give every Tom, Dick and Harry access to your full SQL database anyway.
It will certainly simplify interaction with APIs, by more closely aligning the human and machine versions of the data. In many cases, it will actually remove the need for a separate API, as you'll just be able to dereference the appropriate URI, asking for the data in the format you're interested in.