
HBase (NoSQL) Architecture and Basic Operations: Notes, Part 8

2015-12-07 12:59
5 HBase (NoSQL) architecture and basic operations, Flume, Pig

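An open-source implementation of Google Bigtable

A column-oriented database

Can be deployed as a cluster

Accessible through the shell, the web UI, the API and other interfaces

Suited to high-speed read/write workloads

HQL query language

A typical representative of NoSQL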

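Logical model

Data is stored in tables

A table consists of rows and columns; every column belongs to a column family, and the storage unit identified by a row and a column is called a cell

Each cell keeps multiple versions of the same piece of data, distinguished by timestamp

Row key

The unique identifier of a row within a table, used as the primary key for retrieving records

Ways to access the rows of a table:

By a single row key

By a range of row keys

By a full table scan

A row key can be any string of at most 64 KB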
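The three access paths above can be pictured with a sorted map standing in for a table. This is only a plain-Java sketch, not HBase code: the class name, the row contents and the extra row "wangwu" are made up for illustration, and the range is start-inclusive and stop-exclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a table: rows are kept sorted by row key.
class RowKeyAccessSketch {
    // Row key -> row content (a plain string here, for brevity).
    static final TreeMap<String, String> TABLE = new TreeMap<>();
    static {
        TABLE.put("retacn", "age=32");
        TABLE.put("wangwu", "age=28");   // made-up row, not in the original notes
        TABLE.put("zhansan", "age=30");
    }

    // 1) Access by a single row key.
    static String get(String rowKey) {
        return TABLE.get(rowKey);
    }

    // 2) Access by a row key range: startRow inclusive, stopRow exclusive.
    static List<String> rangeScan(String startRow, String stopRow) {
        return new ArrayList<>(TABLE.subMap(startRow, stopRow).keySet());
    }

    // 3) Full table scan: every row, in row key order.
    static List<String> fullScan() {
        return new ArrayList<>(TABLE.keySet());
    }

    public static void main(String[] args) {
        System.out.println(get("retacn"));        // single row
        System.out.println(rangeScan("r", "x"));  // rows with keys in ["r", "x")
        System.out.println(fullScan());           // all rows
    }
}
```

Because rows are sorted by key, a range read touches only neighbouring rows, which is why row key design matters so much for scan performance.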

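Column families and columns

Column families must be specified when the table is defined

Columns are created dynamically when records are inserted

Column notation: <column family>:<qualifier>

HBase stores data on disk by column family

Timestamp

Corresponds to the time of each data operation

HBase supports two ways of reclaiming old data versions:

Keep only a specified number of the newest versions of each cell

Keep versions for a specified length of time

Queries by time: latest data / all versions

A cell is identified by row key, column family:qualifier and timestamp

Cells are stored as raw bytes and carry no type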
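As a rough illustration of versioned cells and the two reclamation policies, the sketch below models one cell as a map from timestamp to value bytes. It is a plain-Java toy under stated assumptions, not how HBase implements storage: the class and method names are hypothetical, and the timestamps are borrowed from the shell session later in these notes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory model of the version history of a single cell.
class CellVersionsSketch {
    private final int maxVersions;
    // timestamp -> value bytes, ordered newest timestamp first
    private final TreeMap<Long, byte[]> versions = new TreeMap<>(Comparator.reverseOrder());

    CellVersionsSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Policy 1: keep only the newest maxVersions versions.
    void put(long timestamp, String value) {
        versions.put(timestamp, value.getBytes(StandardCharsets.UTF_8));
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry = oldest timestamp
        }
    }

    // Policy 2: drop every version older than the cutoff timestamp.
    void expireOlderThan(long cutoffTimestamp) {
        versions.keySet().removeIf(ts -> ts < cutoffTimestamp);
    }

    // Latest data: the value with the newest timestamp, or null if empty.
    String latest() {
        return versions.isEmpty()
                ? null
                : new String(versions.firstEntry().getValue(), StandardCharsets.UTF_8);
    }

    // All versions, newest first.
    List<String> allVersions() {
        List<String> out = new ArrayList<>();
        for (byte[] v : versions.values()) {
            out.add(new String(v, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        CellVersionsSketch age = new CellVersionsSketch(3);
        age.put(1449211197012L, "32");
        age.put(1449211528248L, "31");
        System.out.println(age.latest());
        System.out.println(age.allVersions());
    }
}
```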

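Physical model

Suited to second-level queries over massive data sets

Records in a table are split by row key into regions, each covering a (startkey, endkey) range

Regions are stored on region servers (separate physical machines)

hbase-default.xml sets the maximum size a column family's store may reach; these notes record it as 10 GB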
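Finding the region that owns a row key amounts to finding the greatest region start key that is not larger than the row key. A minimal sketch with a sorted map, assuming three made-up regions and hypothetical server names:

```java
import java.util.TreeMap;

// Hypothetical region directory: region start key -> region server name.
class RegionLookupSketch {
    static final TreeMap<String, String> REGIONS = new TreeMap<>();
    static {
        REGIONS.put("", "regionserver-1");   // covers ["", "h")
        REGIONS.put("h", "regionserver-2");  // covers ["h", "s")
        REGIONS.put("s", "regionserver-3");  // covers ["s", end of key space)
    }

    // The owning region is the one with the largest start key <= rowKey.
    // The empty start key sorts before every string, so a match always exists.
    static String locate(String rowKey) {
        return REGIONS.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        System.out.println(locate("retacn"));
        System.out.println(locate("zhansan"));
    }
}
```

Each region's end key is implied by the next region's start key, so storing start keys alone is enough for the lookup.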

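Architecture

Master/slave structure made up of HMaster and HRegionServer

ZooKeeper's master election mechanism keeps an HMaster running

HBase has two special tables

-ROOT- records the region information of the .META. table

.META. records the region information of user tables

To reach data, a client first asks ZooKeeper for -ROOT-, then consults .META. to find where the user data lives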
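The lookup chain described above can be traced with a toy model. Everything here is an assumption made for illustration: the server names are invented, the catalog keys are simplified to "table,startKey" (real catalog rows carry more fields), and the code assumes the table is present in the toy .META. map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Simplified, hypothetical model of ZooKeeper -> -ROOT- -> .META. -> user region.
class CatalogLookupSketch {
    // Hop 1: ZooKeeper holds the location of the -ROOT- region.
    static final String ROOT_LOCATION_IN_ZOOKEEPER = "rs-root";

    // Hop 2: -ROOT- maps .META. region start keys to the servers hosting them.
    static final TreeMap<String, String> ROOT_TABLE = new TreeMap<>();

    // Hop 3: .META. maps "table,startKey" to the server hosting that user region.
    static final TreeMap<String, String> META_TABLE = new TreeMap<>();

    static {
        ROOT_TABLE.put(".META.,", "rs-meta");
        META_TABLE.put("employee,", "rs-1");
        META_TABLE.put("users,", "rs-1");
        META_TABLE.put("users,m", "rs-2");
    }

    // Returns a description of every hop taken to find the server for (table, rowKey).
    static List<String> locate(String table, String rowKey) {
        List<String> hops = new ArrayList<>();
        hops.add("zookeeper -> -ROOT- on " + ROOT_LOCATION_IN_ZOOKEEPER);

        String metaKey = table + "," + rowKey;
        String metaServer = ROOT_TABLE.floorEntry(".META.," + metaKey).getValue();
        hops.add("-ROOT- -> .META. on " + metaServer);

        String regionServer = META_TABLE.floorEntry(metaKey).getValue();
        hops.add(".META. -> " + table + " region on " + regionServer);
        return hops;
    }

    public static void main(String[] args) {
        for (String hop : locate("users", "retacn")) {
            System.out.println(hop);
        }
    }
}
```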

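Installing HBase in pseudo-distributed mode

HBase installation and configuration: find the HBase version that matches Hadoop 0.20.2

Standalone installation

Download from: http://mirror.bjtu.edu.cn/apache/hbase/hbase-0.90.5

Extract to the target directory

Optionally add HBase to the environment variables in /etc/profile

export HBASE_HOME=<extraction path>

Apply the change: source /etc/profile

Edit /software/hbase/hbase-0.90.5/conf/hbase-env.sh and set JAVA_HOME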

# The java implementation to use. Java 1.6 required.

export JAVA_HOME=/sdk/jdk1.6.0_34

Configure hbase-site.xml by adding the following:

<configuration>

<property>

<name>hbase.rootdir</name>

<value>file:///software/hbase/hbase-0.90.5/data</value>

</property>

</configuration>

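Start HBase and verify

Run bin/start-hbase.sh from the installation directory

Check the result with jps from the JDK bin directory; output like the following should appear: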

root@vm:/sdk/jdk1.6.0_34/bin# jps

4131 DataNode

5761 TaskTracker

3375 NameNode

4894 SecondaryNameNode

4955 JobTracker

6079 Jps

5973 HMaster

Run bin/hbase shell from the installation directory; it shows:

root@vm:/software/hbase/hbase-0.90.5# bin/hbase shell

HBase Shell; enter 'help<RETURN>' for list of supported commands.

Type "exit<RETURN>" to leave the HBase Shell

Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011

hbase(main):001:0> quit

root@vm:/software/hbase/hbase-0.90.5#

Pseudo-distributed mode

Building on the standalone setup

1 Edit hbase-env.sh and add the HBASE_CLASSPATH environment variable as follows

# Extra Java CLASSPATH elements. Optional.

export HBASE_CLASSPATH=/software/hadoop/hadoop-0.20.2/conf

# Enable the last setting in the file so that HBase manages its own ZooKeeper instance

export HBASE_MANAGES_ZK=true

2 Edit hbase-site.xml to turn on distributed mode

<configuration>

<property>

<name>hbase.rootdir</name>

<!--

<value>file:///software/hbase/hbase-0.90.5/data</value>

-->

<value>hdfs://localhost:9000/hbase</value>

</property>

<property>

<name>hbase.cluster.distributed</name>

<value>true</value>

</property>

<property>

<name>hbase.zookeeper.quorum</name>

<value>vm</value>

</property>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>

Optional: in the regionservers file, list the host names of the nodes that run region servers

Replace the Hadoop core jar

//First disable the bundled jar

root@vm:/software/hbase/hbase-0.90.5/lib# mv hadoop-core-0.20-append-r1056497.jar hadoop-core-0.20-append-r1056497.old

//Copy the core jar from the Hadoop installation directory into lib

root@vm:/software/hbase/hbase-0.90.5/lib# cp /software/hadoop/hadoop-0.20.2/hadoop-0.20.2-core.jar .

root@vm:/software/hbase/hbase-0.90.5/lib# ls

...

guava-r06.jar log4j-1.2.16.jar

hadoop-0.20.2-core.jar protobuf-java-2.3.0.jar

hadoop-core-0.20-append-r1056497.old ruby

jackson-core-asl-1.5.5.jar servlet-api-2.5-6.1.14.jar

...

Start HBase as before; jps now shows:

root@vm:/sdk/jdk1.6.0_34/bin# jps

10477 Jps

5145 JobTracker

5049 SecondaryNameNode

5898 TaskTracker

4285 DataNode

9562 HMaster

10320 HRegionServer

3496 NameNode

9509 HQuorumPeer

root@vm:/sdk/jdk1.6.0_34/bin#

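Verify the startup

Fully distributed mode

Configure hosts so that every host name resolves to an IP address

Edit hbase-env.sh

Edit hbase-site.xml

Edit the regionservers file

Copy HBase to the other nodes

Start HBase

Verify the startup

The master status page can also be opened in a browser at http://localhost:60010/master.jsp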

Shell operations

Fixing NotAllMetaRegionsOnlineException

Edit /etc/hosts and add the following:

127.0.0.1 localhost

127.0.0.1 vm
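A quick way to confirm that a host name resolves after editing /etc/hosts is to ask the JVM directly. This is a small stdlib-only check, not part of HBase; the class name is hypothetical and the default host is just an example (pass "vm" or any other name as the first argument).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: reports the IP address a host name resolves to.
class HostResolutionCheck {
    // Returns the resolved IP address as text, or null if the name is unknown.
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        String address = resolve(host);
        System.out.println(address == null
                ? host + " does not resolve; check /etc/hosts"
                : host + " resolves to " + address);
    }
}
```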

help: show help

status: show the database status

hbase(main):003:0> status

1 servers, 0 dead, 0.0000 average load

version: show the database version

hbase(main):004:0> version

0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011

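Create a table

create 'table name','column family 1','column family 2'

Add a record

put 'table name','row key','column family:column','value'

View a record

get 'table name','row key'

Count the records in a table

count 'table name'

Delete a record

delete 'table name','row key','column family:column'

Drop a table (disable it first)

drop 'table name'

View all records

scan 'table name'

View all data in one column of a table

scan 'table name',{COLUMNS=>'column family:column'}

Update a record

Write the value again to overwrite it

A worked session follows:

Create a table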

hbase(main):001:0> create 'user','user_id','address','info'

0 row(s) in 2.2520 seconds

List tables

hbase(main):002:0> list

TABLE

user

1 row(s) in 0.0300 seconds

Show the table's column family descriptions

hbase(main):003:0> describe 'user'

DESCRIPTION ENABLED

{NAME => 'user', FAMILIES => [{NAME => 'address', B true

LOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COM

PRESSION => 'NONE', VERSIONS => '3', TTL => '214748

3647', BLOCKSIZE => '65536', IN_MEMORY => 'false',

BLOCKCACHE => 'true'}, {NAME => 'info', BLOOMFILTER

=> 'NONE', REPLICATION_SCOPE => '0', COMPRESSION =

> 'NONE', VERSIONS => '3', TTL => '2147483647', BLO

CKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE

=> 'true'}, {NAME => 'user_id', BLOOMFILTER => 'NO

NE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE

', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE

=> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'tr

ue'}]}

1 row(s) in 0.0340 seconds

Drop the table

disable 'user'

drop 'user'

Add records

put 'users','retacn','info:age','32';

put 'users','retacn','info:birthday','1984-0*-0*';

put 'users','retacn','info:company','none';

put 'users','retacn','address:contry','china';

put 'users','retacn','address:province','shandong';

put 'users','retacn','address:city','zibo';

put 'users','zhansan','info:age','30';

put 'users','zhansan','info:birthday','1984-08-08';

put 'users','zhansan','info:company','hony';

put 'users','zhansan','address:contry','china';

put 'users','zhansan','address:province','shandong';

put 'users','zhansan','address:city','zibo'

Get one row by row key

hbase(main):020:0> get 'users','retacn'

COLUMN CELL

address:city timestamp=1449211197068, value=zibo

address:contry timestamp=1449211197048, value=china

address:province timestamp=1449211197061, value=shandong

info:age timestamp=1449211197012, value=32

info:birthday timestamp=1449211197025, value=1984-09-04

info:company timestamp=1449211197038, value=none

Get all data of one column family in a row

hbase(main):022:0> get 'users','retacn','address'

COLUMN CELL

address:city timestamp=1449211197068, value=zibo

address:contry timestamp=1449211197048, value=china

address:province timestamp=1449211197061, value=shandong

Get all data of one column of a column family in a row

hbase(main):023:0> get 'users','retacn','info:age'

COLUMN CELL

info:age timestamp=1449211197012, value=32

Update a record

Putting the same cell again overwrites it

hbase(main):024:0> put 'users','retacn','info:age','31'

0 row(s) in 0.0220 seconds

hbase(main):025:0> get 'users','retacn','info:age'

COLUMN CELL

info:age timestamp=1449211528248, value=31

Get multiple versions of a cell

hbase(main):026:0> get 'users','retacn',{COLUMN=>'info:age',VERSIONS=>1}

COLUMN CELL

info:age timestamp=1449211528248, value=31

1 row(s) in 0.0360 seconds

hbase(main):027:0> get 'users','retacn',{COLUMN=>'info:age',VERSIONS=>2}

COLUMN CELL

info:age timestamp=1449211528248, value=31

info:age timestamp=1449211197012, value=32

Number of versions kept (VERSIONS in the table description)

hbase(main):030:0> describe 'users'

DESCRIPTION ENABLED

{NAME => 'users', FAMILIES => [{NAME => 'address', true

BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', CO

MPRESSION => 'NONE', VERSIONS => '3', TTL => '21474

83647', BLOCKSIZE => '65536', IN_MEMORY => 'false',

BLOCKCACHE => 'true'}, {NAME => 'info', BLOOMFILTE

R => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION

=> 'NONE', VERSIONS => '3', TTL => '2147483647', BL

OCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACH

E => 'true'}, {NAME => 'user_id', BLOOMFILTER => 'N

ONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NON

E', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE

=> '65536', IN_MEMORY => 'false', BLOCKCACHE => 't

rue'}]}

Get one specific version of a cell

hbase(main):029:0> get 'users','retacn',{COLUMN=>'info:age',TIMESTAMP=>1449211528248}

COLUMN CELL

info:age timestamp=1449211528248, value=31

Full table scan

hbase(main):031:0> scan 'users'

ROW COLUMN+CELL

retacn column=address:city, timestamp=1449211197068, value=zibo

retacn column=address:contry, timestamp=1449211197048, value=chin

a

retacn column=address:province, timestamp=1449211197061, value=sh

andong

retacn column=info:age, timestamp=1449211528248, value=31

retacn column=info:birthday, timestamp=1449211197025, value=1984-

09-04

retacn column=info:company, timestamp=1449211197038, value=none

zhansan column=address:city, timestamp=1449211208677, value=zibo

zhansan column=address:contry, timestamp=1449211208664, value=chin

a

zhansan column=address:province, timestamp=1449211208670, value=sh

andong

zhansan column=info:age, timestamp=1449211208579, value=30

zhansan column=info:birthday, timestamp=1449211208596, value=1984-

09-03

zhansan column=info:company, timestamp=1449211208605, value=hony

2 row(s) in 0.0720 seconds

Delete the 'info:age' column of a row

delete 'users','retacn','info:age'

Delete an entire row

deleteall 'users','retacn'

Count the rows in a table

count 'users'

Empty the table

truncate 'users'

Exit the HBase shell

quit

HBase Java API operations

Sample code:

/**
 * Copyright (C) 2015
 *
 * FileName: HbaseApiTest.java
 *
 * Author: <a href="mailto:zhenhuayue@sina.com">Retacn</a>
 *
 * CreateTime: 2015-12-4
 */
// Package Information
package cn.yue.hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

/**
 * Tests the HBase Java API.
 *
 * @author <a href="mailto:zhenhuayue@sina.com">Retacn</a>
 * @since 2015-12-4
 */
public class HbaseApiTest {
    public static final String TABLE_NAME = "employee";
    public static final String FAMILY_NAME = "id";
    public static final String ROW_KEY = "retacn";

    public static void main(String[] args) throws IOException {
        // HBaseConfiguration.create() also picks up hbase-default.xml and hbase-site.xml
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        conf.set("hbase.zookeeper.quorum", "127.0.0.1");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        // Used to create and delete tables
        final HBaseAdmin hbaseAdmin = new HBaseAdmin(conf);
        // Create the table
        createTable(hbaseAdmin);
    }

    /**
     * Creates the table if it does not exist yet.
     *
     * @param hbaseAdmin admin handle
     * @throws IOException on communication failure
     */
    private static void createTable(final HBaseAdmin hbaseAdmin) throws IOException {
        // Check existence with tableExists; isTableEnabled is meant for tables that already exist
        if (!hbaseAdmin.tableExists(TABLE_NAME)) {
            HTableDescriptor tableDescriptor = new HTableDescriptor(TABLE_NAME);
            HColumnDescriptor family = new HColumnDescriptor(FAMILY_NAME);
            tableDescriptor.addFamily(family);
            hbaseAdmin.createTable(tableDescriptor);
        }
    }
}

After the table is created, the shell shows:

hbase(main):001:0> list

TABLE

employee

users

2 row(s) in 0.7510 seconds

hbase(main):002:0> describe 'employee'

DESCRIPTION ENABLED

{NAME => 'employee', FAMILIES => [{NAME => 'id', BL true

OOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMP

RESSION => 'NONE', VERSIONS => '3', TTL => '2147483

647', BLOCKSIZE => '65536', IN_MEMORY => 'false', B

LOCKCACHE => 'true'}]}

Add a record

/**
 * Copyright (C) 2015
 *
 * FileName: HbaseApiTest.java
 *
 * Author: <a href="mailto:zhenhuayue@sina.com">Retacn</a>
 *
 * CreateTime: 2015-12-4
 */
// Package Information
package cn.yue.hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

/**
 * Tests the HBase Java API.
 *
 * @author <a href="mailto:zhenhuayue@sina.com">Retacn</a>
 * @since 2015-12-4
 */
public class HbaseApiTest {
    public static final String TABLE_NAME = "employee";
    public static final String FAMILY_NAME = "id";
    public static final String ROW_KEY = "retacn";

    public static void main(String[] args) throws IOException {
        // HBaseConfiguration.create() also picks up hbase-default.xml and hbase-site.xml
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        conf.set("hbase.zookeeper.quorum", "127.0.0.1");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        // Used to create and delete tables
        final HBaseAdmin hbaseAdmin = new HBaseAdmin(conf);
        // Create the table
        createTable(hbaseAdmin);
        // Add one record: row "retacn", column id:age, value "32"
        final HTable hTable = new HTable(conf, TABLE_NAME);
        try {
            Put put = new Put(Bytes.toBytes(ROW_KEY));
            put.add(Bytes.toBytes(FAMILY_NAME), Bytes.toBytes("age"), Bytes.toBytes("32"));
            hTable.put(put);
        } finally {
            hTable.close();
        }
        // Delete the table (it must be disabled first)
        // hbaseAdmin.disableTable(TABLE_NAME);
        // hbaseAdmin.deleteTable(TABLE_NAME);
    }

    /**
     * Creates the table if it does not exist yet.
     *
     * @param hbaseAdmin admin handle
     * @throws IOException on communication failure
     */
    private static void createTable(final HBaseAdmin hbaseAdmin) throws IOException {
        // Check existence with tableExists; isTableEnabled is meant for tables that already exist
        if (!hbaseAdmin.tableExists(TABLE_NAME)) {
            HTableDescriptor tableDescriptor = new HTableDescriptor(TABLE_NAME);
            HColumnDescriptor family = new HColumnDescriptor(FAMILY_NAME);
            tableDescriptor.addFamily(family);
            hbaseAdmin.createTable(tableDescriptor);
        }
    }
}

After the insert, the shell shows:

hbase(main):003:0> get 'employee','retacn'

COLUMN CELL

id:age timestamp=1449369554428, value=32

If the following error appears, add the setting shown below

15/12/06 10:37:31 ERROR zookeeper.ZKConfig: no clientPort found in zoo.cfg

Sample code:

conf.set("hbase.zookeeper.property.clientPort","2181");

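Query one record; sample code:

/**
 * Queries one record.
 * Needs imports: org.apache.hadoop.hbase.client.Get,
 * org.apache.hadoop.hbase.client.Result, org.apache.hadoop.hbase.util.Bytes
 *
 * @param hTable table handle
 * @throws IOException on communication failure
 */
private static void getRecord(final HTable hTable) throws IOException {
    Get get = new Get(Bytes.toBytes(ROW_KEY));
    final Result result = hTable.get(get);
    final byte[] value = result.getValue(Bytes.toBytes(FAMILY_NAME), Bytes.toBytes("age"));
    // Bytes.toString returns null for a missing value instead of throwing
    System.out.println(result + "\t" + Bytes.toString(value));
}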

Query result:

keyvalues={retacn/id:age/1449369554428/Put/vlen=2} 32
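The vlen=2 in that output is the length of the stored value in bytes: the sample wrote the text "32", which is two bytes, and HBase keeps it as untyped bytes. The stdlib-only sketch below (class and method names are hypothetical) contrasts storing the age as text with storing it as a 4-byte integer; whichever encoding the writer picks, the reader has to decode the same way.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helpers showing two encodings of the same logical value.
class CellBytesSketch {
    // Text encoding: "32" becomes the two bytes '3' and '2'.
    static byte[] encodeString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Binary encoding: an int always takes four bytes (big-endian here).
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    static String decodeString(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    static int decodeInt(byte[] b) {
        return ByteBuffer.wrap(b).getInt();
    }

    public static void main(String[] args) {
        System.out.println(encodeString("32").length);   // length of the text form
        System.out.println(encodeInt(32).length);        // length of the binary form
        System.out.println(decodeInt(encodeInt(32)));    // round trip
    }
}
```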

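Query all records; sample code:

/**
 * Queries all records.
 * Needs imports: org.apache.hadoop.hbase.client.Scan,
 * org.apache.hadoop.hbase.client.Result,
 * org.apache.hadoop.hbase.client.ResultScanner, org.apache.hadoop.hbase.util.Bytes
 *
 * @param hTable table handle
 * @throws IOException on communication failure
 */
private static void getAll(final HTable hTable) throws IOException {
    Scan scan = new Scan();
    final ResultScanner scanner = hTable.getScanner(scan);
    try {
        for (Result result : scanner) {
            final byte[] value = result.getValue(Bytes.toBytes(FAMILY_NAME), Bytes.toBytes("age"));
            System.out.println(result + "\t" + Bytes.toString(value));
        }
    } finally {
        // Release the scanner's server-side resources
        scanner.close();
    }
}

Query result: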

keyvalues={retacn/id:age/1449369554428/Put/vlen=2} 32