Cascading map-side joins over HBase for scalable join processing: [technical report]
Abstract: One of the major challenges in large-scale data processing with MapReduce is the efficient computation of joins. Since Semantic Web datasets published in RDF have grown rapidly over the last few years, scalable join techniques have become an important issue for SPARQL query processing as well. In this paper, we introduce the Map-Side Index Nested Loop Join (MAPSIN join), which combines the scalable indexing capabilities of NoSQL storage systems like HBase, which suffer from an insufficient distributed processing layer, with MapReduce, which in turn does not provide appropriate storage structures for efficient large-scale join processing. While retaining the flexibility of commonly used reduce-side joins, we leverage the effectiveness of map-side joins without any changes to the underlying framework. We demonstrate the significant benefits of MAPSIN joins for the processing of SPARQL basic graph patterns on large RDF datasets by an evaluation with the LUBM and SP2Bench benchmarks. For most queries, MAPSIN join based query execution outperforms reduce-side join based execution by an order of magnitude.
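The general idea of an index nested loop join named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a plain Python dictionary stands in for an HBase-style index, the triples are made-up toy data, and the function names (`build_subject_index`, `scan`, `index_nested_loop_join`) are hypothetical. The sketch joins the two triple patterns `?x advisor ?y` and `?y worksFor ?z` by looking up each outer binding in the index instead of shuffling both inputs to a reducer.

```python
from collections import defaultdict

# Toy RDF triples (subject, predicate, object); the data is invented for illustration.
TRIPLES = [
    ("alice", "advisor", "bob"),
    ("carol", "advisor", "bob"),
    ("bob", "worksFor", "dept1"),
    ("dave", "worksFor", "dept2"),
]

def build_subject_index(triples):
    """Assumed stand-in for a store index keyed by subject: s -> [(p, o), ...]."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))
    return index

def scan(triples, predicate):
    """Outer input: bindings for the pattern ?x <predicate> ?y."""
    return [{"x": s, "y": o} for s, p, o in triples if p == predicate]

def index_nested_loop_join(bindings, index, predicate):
    """For each outer binding, look up ?y <predicate> ?z in the index."""
    results = []
    for b in bindings:
        for p, o in index.get(b["y"], []):
            if p == predicate:
                results.append({**b, "z": o})
    return results

index = build_subject_index(TRIPLES)
outer = scan(TRIPLES, "advisor")
joined = index_nested_loop_join(outer, index, "worksFor")
```

In a map-side setting, each map task would presumably run the loop in `index_nested_loop_join` over its own split of the outer bindings, so no shuffle or reduce phase is needed for the join itself.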
- Location: Deutsche Nationalbibliothek Frankfurt am Main
- Extent: Online resource
- Language: English
- Notes: CoRR (arXiv:1206.6293), url: http://arxiv.org/corr/home; license: CC BY-NC-ND 4.0, http://creativecommons.org/licenses/by-nc-nd/4.0/deed.de
- Classification: Computer science
- Keywords: Hadoop; RDF; SPARQL; Semantic Web
- Event: Publication
- (where): Freiburg
- (who): Universität
- (when): 2012
- Creators: Schätzle, Alexander; Przyjaciel-Zablocki, Martin; Hornung, Thomas Daniel; Dorner, Christopher; Lausen, Georg
- Contributing persons and organizations: Technische Fakultät; Institut für Informatik; Albert-Ludwigs-Universität Freiburg
- DOI: 10.6094/UNIFR/12281
- URN: urn:nbn:de:bsz:25-freidok-122812
- Rights information: Access to the object is unrestricted.
- Last updated: 25.03.2025, 13:49 CET
Data partner
Deutsche Nationalbibliothek. If you have questions about the object, please contact the data partner.