Wednesday, 27 November 2013

HBASE TABLE SNAPSHOT


Step 1 :-

Configuration

a)  Add a property.

[ nitin@nitin-ubuntu:~ # ] sudo vim /etc/hbase/conf/hbase-site.xml


  <property>
    <name>hbase.snapshot.enabled</name>
    <value>true</value>
  </property>

b) Restart HBase
 

[ nitin@nitin-ubuntu:~ # ] sudo /usr/lib/hbase/bin/stop-hbase.sh
[ nitin@nitin-ubuntu:~ # ] sudo /usr/lib/hbase/bin/start-hbase.sh


Step 2 :-

Take a Snapshot

[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> snapshot 'MY_TABLE', 'SNAP_MYTABLE'

Step 3 :-

Listing Snapshots
[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> list_snapshots

Step 4 :-

Deleting Snapshots
[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> delete_snapshot 'SNAP_MYTABLE'

Step 5 :-

Clone a table from a snapshot
[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> clone_snapshot 'SNAP_MYTABLE', 'NEW_TABLE'

Step 6 :-
Export a snapshot to another cluster :-
[ nitin@nitin-ubuntu:~ # ] hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot SNAP_MYTABLE -copy-to hdfs://CLUSTER2:8020/hbase
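The interactive shell steps above can also be driven as one scripted session by piping commands into `hbase shell` on stdin. A minimal Python sketch of that pattern, using the same placeholder table and snapshot names as the steps above (the helper names are illustrative, and the `hbase` binary must be on the PATH for the run function to work):

```python
import subprocess

def hbase_shell_script(*commands):
    """Join HBase shell commands into one script, one command per line."""
    return "\n".join(commands) + "\n"

def run_hbase_shell(*commands):
    """Pipe the script into an `hbase shell` session on stdin."""
    return subprocess.run(
        ["hbase", "shell"],
        input=hbase_shell_script(*commands),
        capture_output=True,
        text=True,
    )

# The snapshot lifecycle from the steps above as one scripted session
# (table and snapshot names are the same placeholders used in the post):
script = hbase_shell_script(
    "snapshot 'MY_TABLE', 'SNAP_MYTABLE'",
    "list_snapshots",
    "clone_snapshot 'SNAP_MYTABLE', 'NEW_TABLE'",
    "delete_snapshot 'SNAP_MYTABLE'",
)
print(script)
```

Piping on stdin keeps the whole lifecycle in a cron-able script instead of an interactive session.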








HBASE TABLE ROW COUNT



STEP 1:-

RowCounter is a MapReduce job that counts all the rows of a table. It is a good sanity check to ensure that HBase can read all the blocks of a table when there are concerns about metadata inconsistency. By default the job runs in a single process, but it will run faster if there is a MapReduce cluster in place for it to exploit.



hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>
[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.RowCounter TARGET_TBL_NAME
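RowCounter reports its result as a MapReduce counter named ROWS in the job's console output rather than on stdout. A small sketch of pulling that number out of the captured job output (the sample counter block is an assumption about the log format, which varies by Hadoop/HBase version):

```python
import re

def parse_row_count(job_output):
    """Pull the ROWS counter out of RowCounter's MapReduce job output.

    Returns the count as an int, or None if no counter line is present.
    """
    match = re.search(r"\bROWS=(\d+)", job_output)
    return int(match.group(1)) if match else None

# Counter block roughly as it appears in the job's console output
# (exact formatting is version-dependent):
sample = """
org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
        ROWS=42137
"""
print(parse_row_count(sample))  # 42137
```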

Monday, 25 November 2013

QUERYING JSON RECORDS VIA HIVE



Step 1 :- 

JSON file



{
  "Foo": "ABC",
  "Bar": "20090101100000",
  "Quux": {
    "QuuxId": 1234,
    "QuuxName": "Sam"
  }
}


Step 2 :- 

Create Table 

CREATE TABLE json_table ( json string );


Step 3 :-

Load the data into the Hive table

LOAD DATA LOCAL INPATH '/tmp/example.json'  INTO TABLE `json_table`;

Step 4 :- 

Retrieve data 

select get_json_object(json_table.json, '$') from json_table;

Step 5 :-

Retrieve nested data

select get_json_object(json_table.json, '$.Foo') as foo,
       get_json_object(json_table.json, '$.Bar') as bar,
       get_json_object(json_table.json, '$.Quux.QuuxId') as qid,
       get_json_object(json_table.json, '$.Quux.QuuxName') as qname
from json_table;
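Each JSONPath expression in the query is just a chain of key lookups into the record, so the expected results can be sanity-checked locally with Python's json module before running the Hive query:

```python
import json

# The sample record from Step 1:
record = """
{
  "Foo": "ABC",
  "Bar": "20090101100000",
  "Quux": {"QuuxId": 1234, "QuuxName": "Sam"}
}
"""

doc = json.loads(record)

# Each get_json_object path maps onto a plain key lookup:
row = {
    "foo": doc["Foo"],                 # '$.Foo'
    "bar": doc["Bar"],                 # '$.Bar'
    "qid": doc["Quux"]["QuuxId"],      # '$.Quux.QuuxId'
    "qname": doc["Quux"]["QuuxName"],  # '$.Quux.QuuxName'
}
print(row)  # {'foo': 'ABC', 'bar': '20090101100000', 'qid': 1234, 'qname': 'Sam'}
```

Note that `'$.Bar'` comes back as a string, not a timestamp; Hive's get_json_object likewise returns strings, so any casting happens in the query.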






Friday, 15 November 2013

HBASE BACKUP AND RESTORE TABLE


STEP 1

EXPORT  :-

Export is a utility that dumps the contents of a table to HDFS as a sequence file. Invoke via:



[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir>

[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Export HBASEEXPORTTABLE DUMP

STEP 2

IMPORT :- 

Import is a utility that loads data previously written by Export back into HBase; the target table must already exist. Invoke via:



[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>

[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Import HBASEIMPORTTABLE DUMP
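A backup/restore round trip is just these two invocations sharing the same dump directory. A sketch that builds the command lines (the helper names are illustrative, the table and directory names are placeholders, and Export additionally accepts optional trailing arguments for number of versions and start/end timestamps):

```python
def export_cmd(table, outdir, versions=None):
    """Build the Export command line; an optional versions count can
    follow the output directory."""
    cmd = ["hbase", "org.apache.hadoop.hbase.mapreduce.Export", table, outdir]
    if versions is not None:
        cmd.append(str(versions))
    return cmd

def import_cmd(table, indir):
    """Build the Import command line; the target table must already
    exist, since Import does not create it."""
    return ["hbase", "org.apache.hadoop.hbase.mapreduce.Import", table, indir]

# A backup/restore round trip through the same HDFS dump directory:
print(" ".join(export_cmd("MY_TABLE", "DUMP")))
print(" ".join(import_cmd("MY_TABLE", "DUMP")))
```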





