S2X: graph-parallel querying of RDF with GraphX

Abstract: RDF has steadily gained attention for data publishing due to its flexible data model, raising the need for distributed querying. However, existing approaches built on general-purpose cluster frameworks take a record-oriented view of RDF and ignore its inherent graph-like structure. Recently, GraphX was published as a graph abstraction on top of Spark, an in-memory cluster computing system. It makes it possible to seamlessly combine graph-parallel and data-parallel computation in a single system, a unique feature not available in other systems. In this paper we introduce S2X, a SPARQL query processor for Hadoop that leverages this unified abstraction by implementing basic graph pattern matching of SPARQL as a graph-parallel task, while other operators are implemented in a data-parallel manner. To the best of our knowledge, this is the first approach to combine graph-parallel and data-parallel computation for SPARQL querying of RDF data based on Hadoop.
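
The division of labor described in the abstract can be illustrated with a small sketch. It is plain Python, not GraphX and not the S2X implementation: the matcher is a naive nested-loop join rather than a vertex-centric algorithm, and all names (`is_var`, `match_pattern`, `match_bgp`) and the sample triples are hypothetical. It only shows the two stages: basic graph pattern matching over the triples viewed as graph edges, followed by another operator (here a FILTER) applied record by record to the resulting solution mappings.

```python
# Illustrative sketch only: plain Python, not GraphX or the S2X code base.
# All function names and the sample data are hypothetical.

def is_var(term):
    """SPARQL variables are written with a leading '?'."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Extend one binding so that the triple pattern matches the triple.

    Returns the extended binding, or None if the triple does not match.
    """
    new_binding = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check consistency with an earlier binding.
            if new_binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None  # a constant in the pattern does not match
    return new_binding

def match_bgp(triples, patterns):
    """Graph stage: match each triple pattern against the edges (triples)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings

# Each triple is an edge: subject --predicate--> object.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "age", "42"),
    ("carol", "age", "17"),
]

# Basic graph pattern: ?a knows ?b . ?b age ?n
bgp = [("?a", "knows", "?b"), ("?b", "age", "?n")]
solutions = match_bgp(triples, bgp)

# Collection stage: a FILTER applied independently to each solution mapping.
adults = [s for s in solutions if int(s["?n"]) >= 18]
```

In a system of the kind the abstract describes, the first stage would run as a graph-parallel computation and the second as a data-parallel operation on a distributed collection; here both run sequentially on in-memory lists.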

Location
Deutsche Nationalbibliothek Frankfurt am Main
Extent
Online resource
Edition
Postprint
Language
English
Notes
Wang F., Luo G., Weng C., Khan A., Mitra P., Yu C. (eds) Biomedical Data Management and Graph Online Querying. Big-O(Q) 2015, DMAH 2015. Lecture Notes in Computer Science, vol 9579, isbn: 978-3-319-41575-8
CC BY-NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/deed.de)

Classification
Computer science
Keyword
Hadoop
RDF
SPARQL
Semantic Web

Event
Publication
(where)
Freiburg
(who)
University
(when)
2016
Creator

DOI
10.1007/978-3-319-41576-5_12
URN
urn:nbn:de:bsz:25-freidok-122783
Rights
Access to the object is unrestricted.
Last update
25.03.2025, 1:55 PM CET

Data provider

This object is provided by:
Deutsche Nationalbibliothek. If you have any questions about the object, please contact the data provider.

Time of origin

  • 2016
