Summarizing Linked Data RDF Graphs Using Approximate Graph Pattern Mining

Mussab Zneika, ETIS Lab, ENSEA - University of Cergy-Pontoise - CNRS, Cergy, France
Claudio Lucchese, HPC Lab., ISTI-CNR, Pisa, Italy
Dan Vodislav, ETIS Lab, ENSEA - University of Cergy-Pontoise - CNRS, Cergy, France
Dimitris Kotzinos, ETIS Lab, ENSEA - University of Cergy-Pontoise - CNRS, Cergy, France

Dec. 24 2015

Accepted at EDBT ’16: the 19th International Conference on Extending Database Technology [1].

Abstract. The Linked Open Data (LOD) cloud brings together information described in RDF and stored on the web in (possibly distributed) RDF Knowledge Bases (KBs). The data in these KBs are not necessarily described by a known schema, and querying all the interlinked KBs to acquire the necessary information is often extremely time-consuming. To tackle this problem, we propose a method for summarizing large RDF KBs using top-K approximate RDF graph patterns, which we transform into an RDF schema that describes the contents of the KB. We also add information on the number of instances of the patterns. The RDF graph summary can then be queried to identify whether the necessary information is present and, if it is present in significant numbers, whether it should be included in a federated query result.

References

[1] Mussab Zneika, Claudio Lucchese, Dan Vodislav, and Dimitris Kotzinos. Summarizing linked data RDF graphs using approximate graph pattern mining. In EDBT ’16: Proceedings of the 19th International Conference on Extending Database Technology, 2016.
