您的位置:首页 > 数据库 > Redis

各种nosql数据库的比较Cassandra,MongoDB,CouchDB,Redis,Riak,HBase

2011-12-01 21:32 651 查看
CouchDB
开发语言:: Erlang

主要优点: 数据一致性,易用

许可: Apache

Protocol: HTTP/REST

适用: 积累性的、较少改变的数据。或者是需要版本比较多的

举例: CRM, CMS systems. 允许多站部署.

Redis

开发语言:: C/C++

主要优点: 一个字 快

许可: BSD

Protocol: Telnet-like

适用: 总数据集快速变化且总量可预测.内存需求较高

举例: 股票价格、实时分析、实时数据收集、实时通信.

MongoDB

开发语言:: C++

主要优点: 类似SQL. (查询, 索引)

许可: AGPL (Drivers: Apache)

Protocol: Custom, binary (BSON)

适用: 动态查询; 索引比map/reduce方式更合适时; 跟CouchDB一样,但数据变动更多.

举例: 任何用Mysql/PostgreSQL的场合,但是无法使用预先定义好所有列的时候

Cassandra

开发语言:: Java

主要优点: 最好的BigTable和Dynamo

许可: Apache

Protocol: Custom, binary (Thrift)

适用: 写入比查询多,只支持Java

举例: 银行,金融行业.

Riak

开发语言:: Erlang & C, some Javascript

主要优点: 容错性高

许可: Apache

Protocol: HTTP/REST

适用: 类似 Cassandra,但比较简单. 如果你需要非常好的可扩展性,可用性和容错性,但你要多站点部署必须付费。

举例: 销售数据的收集。 工厂控制系统。 几秒钟的停机就会有伤害的地方。.

HBase

开发语言:: Java

主要优点: 支持数十亿的列

许可: Apache

适用: 类似 BigTable.gae上就是BigTable

举例: Facebook

CouchDB

Written in: Erlang
Main point: DB consistency, ease of use
License: Apache
Protocol: HTTP/REST
Bi-directional (!) replication,
continuous or ad-hoc,
with conflict detection,
thus, master-master replication. (!)
MVCC – write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via _changes (!)
Attachment handling
thus, CouchApps (standalone js apps)
jQuery library included

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.

Redis

Written in: C/C++
Main point: Blazing fast
License: BSD
Protocol: Telnet-like
Disk-backed in-memory database,
but since 2.0, it can swap to disk.
Master-slave replication
Simple keys and values,
but complex operations like ZREVRANGEBYSCORE

INCR & co (good for rate limiting or statistics)
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Of all these databases, only Redis does transactions (!)
Values can be set to expire (as in a cache)
Sorted sets (high score table, good for range queries)
Pub/Sub and WATCH on data changes (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication.

MongoDB

Written in: C++
Main point: Retains some friendly properties of SQL. (Query, index)

License: AGPL (Drivers: Apache)
Protocol: Custom, binary (BSON)
Master/slave replication
Queries are javascript expressions
Run arbitrary javascript functions server-side
Better update-in-place than CouchDB
Sharding built-in
Uses memory mapped files for data storage
Performance over features
After crash, it needs to repair tables
Better durablity coming in V1.8

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

Cassandra

Written in: Java
Main point: Best of BigTable and Dynamo
License: Apache
Protocol: Custom, binary (Thrift)
Tunable trade-offs for distribution and replication (N, R, W)
Querying by column, range of keys
BigTable-like features: columns, column families
Writes are much faster than reads (!)
Map/reduce possible with Apache Hadoop
I admit being a bit biased against it, because of the bloat and complexity it has partly because of Java (configuration, seeing exceptions, etc)

Best used: When you write more than you read (logging). If every component of the system must be in Java. (“No one gets fired for choosing Apache’s stuff.”)

For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.

Riak

Written in: Erlang & C, some Javascript
Main point: Fault tolerance
License: Apache
Protocol: HTTP/REST
Tunable trade-offs for distribution and replication (N, R, W)
Pre- and post-commit hooks,
for validation and security.
Built-in full-text search
Map/reduce in javascript or Erlang
Comes in “open source” and “enterprise” editions

Best used: If you want something Cassandra-like (Dynamo-like), but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site
replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.

HBase

Written in: Java
Main point: Billions of rows X millions of columns
License: Apache
Protocol: HTTP/REST (also Thrift)
Modeled after BigTable
Map/reduce with Hadoop
Query predicate push down via server side scan and get filters
Optimizations for real time queries
A high performance Thrift gateway
HTTP supports XML, Protobuf, and binary
Cascading, hive, and pig source and sink modules
Jruby-based (JIRB) shell
No single point of failure
Rolling restart for configuration changes and minor upgrades
Random access performance is like MySQL

Best used: If you’re in love with BigTable.

And when you need random, realtime read/write access to your Big Data.

For example: Facebook Messaging Database (more general example coming soon)

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: 
相关文章推荐