Comparison between Hive, Impala, Drill and SparkSQL
2018-01-16 18:12
| | Hive | Impala | Drill | SparkSQL |
| Project Goal | Offline batch processing: long-running jobs that perform data-heavy operations, such as joins on huge data sets | Run real-time queries on top of an existing Hadoop warehouse | Provide distributed query capability across multiple big data platforms; query any or all of those data sources at the same time and push operations down into the underlying storage system | Execute SQL queries, then process the result sets |
| Similarity | Shares its metadata with Impala; designed for Hadoop environments | Designed on the basis of Hive and uses the same metadata; designed for Hadoop environments | Queries a variety of data sources (RDBMS, NoSQL, files, JSON, ...); supports JDBC/ODBC drivers | Queries a variety of data sources (RDBMS, NoSQL, files, JSON, ...); supports JDBC/ODBC drivers |
| Difference | Suited to offline data processing | Focused on online, real-time data processing | Not only a Hadoop project | |
| | | | Schema-free: all data is internally represented as either a simple or complex JSON data structure | |
| | | | Full SQL query support (ANSI SQL:2003) | SQL query capabilities only; a subset of SQL (SQL-like) |
| | | | Supported by many BI tools | |
| | | | | Better security support for data access |
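To make the "schema free" entry more concrete, here is a minimal sketch of the schema-on-read idea in plain Python. It is not Drill's API, only an illustration: the sample records and the helper names `read_records`, `get_path` and `select` are hypothetical. Records with different shapes are parsed as JSON and queried by path, without declaring a schema first.

```python
import json

# Hypothetical sample records: they do not share one fixed schema.
RAW = """
{"name": "alice", "address": {"city": "Paris"}, "tags": ["a", "b"]}
{"name": "bob", "age": 31}
{"name": "carol", "address": {"city": "Oslo"}}
"""


def read_records(text):
    """Parse one JSON document per non-empty line; no schema is declared up front."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def get_path(record, path, default=None):
    """Follow a dotted path such as 'address.city'; return default if any part is missing."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def select(records, paths):
    """Project the requested paths from every record, filling missing fields with None."""
    return [{p: get_path(r, p) for p in paths} for r in records]


rows = select(read_records(RAW), ["name", "address.city"])
for row in rows:
    print(row)
```

A record that lacks a field simply yields `None` for that path, so heterogeneous data can be queried without a prior table definition.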
https://www.javacodegeeks.com/2015/12/apache-spark-vs-apache-drill.html