
MongoDB vs Hadoop

2013-07-18
MongoDB ships with its own MapReduce framework, while Hadoop pairs its MapReduce engine with HBase, a scalable database that fills a role similar to MongoDB's.
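
As a rough illustration, here is how MongoDB's built-in mapReduce command (deprecated in recent server versions in favor of the aggregation pipeline) can be invoked from Python with pymongo; the "orders" collection and "category" field are hypothetical:

```python
from pymongo import MongoClient
from bson.code import Code

client = MongoClient("mongodb://localhost:27017")
db = client.test

# Map and reduce functions are plain JavaScript, executed by the server.
mapper = Code("function () { emit(this.category, 1); }")
reducer = Code("function (key, values) { return Array.sum(values); }")

# Count documents per category; results are returned inline in the reply.
result = db.command("mapReduce", "orders",
                    map=mapper, reduce=reducer, out={"inline": 1})
for doc in result["results"]:
    print(doc["_id"], doc["value"])
```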

The main flaw in Hadoop is that it has a single point of failure, namely the “NameNode”. If the NameNode goes down, the entire system becomes unavailable.

MongoDB has no such single point of failure. If a primary, config server, or other node goes down at any point, a replicated resource can take over its responsibilities automatically.
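
For example, a driver connected to a replica set keeps working through a primary failover. A minimal pymongo sketch, assuming a hypothetical three-member replica set named rs0:

```python
from pymongo import MongoClient

# Hypothetical three-member replica set named "rs0". The driver discovers
# the current primary from any reachable member and re-routes operations
# to a newly elected primary after a failover.
client = MongoClient(
    "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0"
)

# Print each member's name and current state (PRIMARY, SECONDARY, ...).
status = client.admin.command("replSetGetStatus")
for member in status["members"]:
    print(member["name"], member["stateStr"])
```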

MongoDB supports rich queries comparable to a traditional RDBMS, and queries are written in JavaScript through its standard shell.
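
For instance, a filtered, projected, and sorted query, roughly the equivalent of a SQL SELECT ... WHERE ... ORDER BY, might look like this in pymongo (collection and field names are hypothetical):

```python
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017").test

# Roughly equivalent to:
#   SELECT name, price FROM products
#   WHERE price > 10 AND 'sale' is among the tags
#   ORDER BY price DESC LIMIT 5
cursor = (db.products
            .find({"price": {"$gt": 10}, "tags": "sale"},
                  {"_id": 0, "name": 1, "price": 1})
            .sort("price", -1)
            .limit(5))
for doc in cursor:
    print(doc)
```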

Hadoop has two different components for writing MapReduce (MR) code: Pig and Hive. Pig is a scripting language (similar to Python or Perl) that generates MR code, while Hive is a more SQL-like language. Hive is mainly used to structure data and provides a rich set of queries; a small example follows.
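
As a sketch of the Hive side, a SQL-like query can be issued from Python through the third-party PyHive package, assuming a reachable HiveServer2 instance; the host, table, and column names here are hypothetical:

```python
from pyhive import hive  # third-party package, assumed installed

# Hypothetical HiveServer2 endpoint, table, and column names.
conn = hive.connect(host="localhost", port=10000)
cur = conn.cursor()

# HiveQL reads like SQL but is compiled into MapReduce jobs by Hive.
cur.execute("SELECT category, COUNT(*) FROM orders GROUP BY category")
for row in cur.fetchall():
    print(row)
```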

Data has to be in JSON or CSV format to be imported into MongoDB. Hadoop, on the other hand, can accept data in almost any format.
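
For illustration, a CSV import along the lines of mongoimport can be sketched in a few lines of Python (the file and collection names are hypothetical):

```python
import csv
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017").test.products

# Roughly what `mongoimport --type csv --headerline` does: take field names
# from the header row and insert one document per record.
with open("products.csv", newline="") as f:
    coll.insert_many(dict(row) for row in csv.DictReader(f))
```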

Hadoop structures data using Hive, but can handle unstructured data easily using Pig. With the help of Apache Sqoop, data can even be transferred between an RDBMS and Hadoop.

MongoDB (written in C++) manages memory more cost-efficiently than Hadoop’s HBase (written in Java).

Both systems also take a different approach to space utilization. MongoDB pre-allocates space for storage, which improves write performance at the cost of wasted space. Hadoop optimizes space usage, but ends up with lower write performance than MongoDB as a result.

Typically, MongoDB is used for systems holding less than approximately 5 TB of data. Hadoop, on the other hand, has been used for systems larger than 100 TB, including systems containing petabytes of data.

http://osintegrators.com/MongoAndHadoop