A Survey and Experimental Comparison of Distributed SPARQL Engines for Very Large RDF Data
2017-11-28 15:09
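This paper, published at VLDB 2017, runs extensive experiments to compare the current mainstream distributed SPARQL query processing engines.
Original paper: http://www.vldb.org/pvldb/vol10/p2049-abdelaziz.pdf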
The systems covered in the experiments are: AdPart [46], AdPart-NA [46], CliqueSquare [25], DREAM [38], EAGRE [56], gStoreD [45], H-RDF-3X [29], H2RDF+ [41], HadoopRDF [30], Partout [36], PigSparql [14], S2RDF [15], S2X [51], Sedge [57], Sempala [50], SHAPE [32], SHARD [47], TriAD [48], TriAD-SG [48], Trinity.RDF [33], WARP [28].
The datasets used are LUBM, WatDiv, and Bio2RDF; the main evaluation metrics are startup cost and query performance.
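First, some brief background. RDF is a model for representing knowledge and one of the forms a knowledge graph can take. Its basic unit of data is the triple, written <subject, predicate, object>, for example <apple, type, fruit>. An RDF graph is the graph view of a set of RDF triples: subjects and objects become vertices and predicates become edges, together forming one large graph. SPARQL is a structured query language for retrieving RDF data; a query consists of a set of triple patterns plus constraints. Because the RDF structure is so flexible, more and more knowledge is being published in RDF, for example DBpedia, YAGO, and Freebase. A simple RDF graph and a SPARQL query graph are shown below: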
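To make the triple and triple-pattern idea concrete, here is a minimal Python sketch. It is not taken from the paper or from any of the surveyed engines: the toy triples, the `?`-prefixed variable convention, and the helper names `match_pattern` and `evaluate` are all assumptions made for illustration. It evaluates a list of triple patterns against an in-memory list of triples with naive nested loops, which is the single-machine baseline that distributed engines try to scale up.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Binding = Dict[str, str]        # variable name -> bound value

# Toy RDF graph; these triples are made up for illustration.
GRAPH: List[Triple] = [
    ("apple", "type", "fruit"),
    ("banana", "type", "fruit"),
    ("apple", "color", "red"),
    ("banana", "color", "yellow"),
]


def is_var(term: str) -> bool:
    """Assumed convention: terms starting with '?' are query variables."""
    return term.startswith("?")


def match_pattern(pattern: Triple, triple: Triple,
                  binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or reject if it is already bound differently.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None  # constant term does not match
    return extended


def evaluate(patterns: List[Triple], graph: List[Triple]) -> List[Binding]:
    """Join the patterns one by one, carrying variable bindings along."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for b in bindings:
            for triple in graph:
                extended = match_pattern(pattern, triple, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Roughly: SELECT ?x ?c WHERE { ?x type fruit . ?x color ?c }
QUERY: List[Triple] = [("?x", "type", "fruit"), ("?x", "color", "?c")]
results = evaluate(QUERY, GRAPH)
for row in results:
    print(row)
```

The shared variable `?x` is what turns the two patterns into a join. In a distributed setting the triples matching each pattern may live on different machines, so how the data is partitioned determines how much of this join requires network communication.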