Lots of data from different domains is published as Linked Open Data. While there are quite a few browsers for that data, as well as intelligent tools for particular purposes, a versatile tool for deriving additional knowledge by mining the Web of Linked Data is still missing. In this challenge entry, we introduce the RapidMiner Linked Open Data extension. The extension hooks into the powerful data mining platform RapidMiner, and offers operators for accessing Linked Open Data in RapidMiner, allowing for using it in sophisticated data analysis workflows without the need to know SPARQL or RDF. As an example, we show how statistical data on scientific publications, published as an RDF data cube, can be linked to further datasets and analyzed using additional background knowledge from various LOD datasets.
Keywords: RapidMiner, Data Mining, Linked Open Data, Scalability, Predictive analytics
Author: Ristoski, Petar
Date created: 2014-12-19 05:00:00.000
Time required: P15M