Linked Data Generation from Digital Libraries

Anastasia Dimou (anastasia.dimou@ugent.be – @natadimou)

Pieter Heyvaert (pheyvaer.heyvaert@ugent.be – @PHaDventure)

Ben Demeester (ben.demeester@ugent.be – @Ben__DM)

imec – Ghent University – IDLab (imec.be – ugent.be – idlab.technology – rml.io)

Knowledge acquisition, modeling and publishing are important in digital libraries with large heterogeneous data sources. The process of extracting, structuring, and organizing knowledge from one or multiple data sources is required to construct knowledge-intensive systems and services for the Semantic Web. This way, the processing of large and originally semantically heterogeneous data sources is enabled and new knowledge is captured. Thus, offering existing data as Linked Data increases its shareability, extensibility and reusability. However, using Linked Data as a means to represent knowledge has proven to be easier said than done!

During this tutorial, we will elaborate on the importance of semantically annotating data and on how existing technologies facilitate the generation of the corresponding Linked Data. We will (i) introduce the [R2]RML (1, 2) language(s) for generating Linked Data from heterogeneous data sources, e.g., tabular data in databases, hierarchical data in XML published as Open Data, or JSON derived from a Web API; and (ii) show non-Semantic Web experts how to annotate their data with the RMLEditor (3) which, thanks to its innovative user interface, keeps all underlying Semantic Web technologies invisible to end users. By the end, participants, regardless of their background, will have modeled, annotated and published some Linked Data on their own!
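To give a feel for what generating Linked Data from a tabular source involves, the following is a minimal sketch in plain Python. It is not RML or R2RML syntax and does not use any RML tooling: the `MAPPING` dictionary, the sample CSV, the `http://example.org/` IRIs and the helper names are all hypothetical and chosen for illustration only. The sketch only mirrors the general idea of declarative mapping rules: a subject IRI template, a class, and predicate-to-column pairs that are applied to every row.

```python
import csv
import io

# Hypothetical sample data standing in for one tabular source.
CSV_DATA = """id,title,creator
1,Linked Data Basics,Alice
2,Mapping with Rules,Bob
"""

# Hypothetical mapping rules (an assumption for illustration, NOT RML syntax):
# a subject IRI template, a class, and predicate/column pairs.
MAPPING = {
    "subject_template": "http://example.org/book/{id}",
    "class": "http://purl.org/dc/dcmitype/Text",
    "predicate_object": [
        ("http://purl.org/dc/terms/title", "title"),
        ("http://purl.org/dc/terms/creator", "creator"),
    ],
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(rows, mapping):
    """Apply the mapping rules to each row and return N-Triples lines."""
    triples = []
    for row in rows:
        # Fill the subject template with the values of the current row.
        subject = mapping["subject_template"].format(**row)
        triples.append(f"<{subject}> <{RDF_TYPE}> <{mapping['class']}> .")
        for predicate, column in mapping["predicate_object"]:
            value = row.get(column)
            if value:  # skip empty or missing cells
                triples.append(
                    f'<{subject}> <{predicate}> "{escape_literal(value)}" .'
                )
    return triples


rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
for line in rows_to_ntriples(rows, MAPPING):
    print(line)
```

Keeping the rules in a data structure separate from the code that applies them is the point of the sketch: the same rules could, in principle, be reused for rows coming from XML or JSON once those are flattened into records, which is the kind of source independence a mapping language aims for.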

The goal is to show that domain experts can easily model knowledge as Linked Data without being aware of Semantic Web technologies or dependent on Semantic Web experts. By the end of this tutorial, knowledge management or domain experts, data specialists and publishers should know how to model the knowledge that appears in their data as Linked Data, as well as how to annotate their data to generate and publish them as Linked Data.

The tutorial is organized as follows:

In the first session, the participants follow an introduction to Linked Data and the Semantic Web and a presentation of exemplary tools that allow them to semantically annotate and publish Linked Data. In the second session, the participants follow the tutorial organizers as they introduce the tools to semantically annotate some sample data and publish them. Thus, there is less time to experiment on their own with the tool chain and data.

1 http://rml.io/

2 https://www.w3.org/TR/r2rml/

3 http://rml.io/RMLeditor.html
