Wednesday, 27 November 2013

HBASE TABLE SNAPSHOT


Step 1 :-

Configuration

a)  Add a property.

[ nitin@nitin-ubuntu:~ # ] sudo vim /etc/hbase/conf/hbase-site.xml


  <property>
    <name>hbase.snapshot.enabled</name>
    <value>true</value>
  </property>

b) Restart HBase
 

[ nitin@nitin-ubuntu:~ # ] sudo /usr/lib/hbase/bin/stop-hbase.sh
[ nitin@nitin-ubuntu:~ # ] sudo /usr/lib/hbase/bin/start-hbase.sh


Step 2 :-

Take a Snapshot

[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> snapshot 'MY_TABLE', 'SNAP_MYTABLE'

Step 3 :-

Listing Snapshots
[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> list_snapshots

Step 4 :-

Deleting Snapshots
[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> delete_snapshot 'SNAP_MYTABLE'

Step 5 :-

Clone a table from a snapshot
[ nitin@nitin-ubuntu:~ # ] hbase shell
hbase> clone_snapshot 'SNAP_MYTABLE', 'NEW_TABLE'

Step 6 :-
Export a snapshot to another cluster :-
[ nitin@nitin-ubuntu:~ # ] hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot SNAP_MYTABLE -copy-to hdfs://CLUSTER2:8020/hbase
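The interactive shell steps above can also be driven as one scripted session by piping commands into `hbase shell` on stdin. A minimal Python sketch of that pattern, using the same placeholder table and snapshot names as the steps above (the helper names are illustrative, and the `hbase` binary must be on the PATH for the run function to work):

```python
import subprocess

def hbase_shell_script(*commands):
    """Join HBase shell commands into one script, one command per line."""
    return "\n".join(commands) + "\n"

def run_hbase_shell(*commands):
    """Pipe the script into an `hbase shell` session on stdin."""
    return subprocess.run(
        ["hbase", "shell"],
        input=hbase_shell_script(*commands),
        capture_output=True,
        text=True,
    )

# The snapshot lifecycle from the steps above as one scripted session
# (table and snapshot names are the same placeholders used in the post):
script = hbase_shell_script(
    "snapshot 'MY_TABLE', 'SNAP_MYTABLE'",
    "list_snapshots",
    "clone_snapshot 'SNAP_MYTABLE', 'NEW_TABLE'",
    "delete_snapshot 'SNAP_MYTABLE'",
)
print(script)
```

Piping on stdin keeps the whole lifecycle in a cron-able script instead of an interactive session.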








HBASE TABLE ROW COUNT



STEP 1:-

RowCounter is a MapReduce job that counts all the rows of a table. It is a good sanity check to ensure that HBase can read all the blocks of a table when there are concerns about metadata inconsistency. By default the job runs in a single process, but it will run faster if there is a MapReduce cluster in place for it to exploit.



hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>
[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.RowCounter TARGET_TBL_NAME
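RowCounter reports its result as a MapReduce counter named ROWS in the job's console output rather than on stdout. A small sketch of pulling that number out of the captured job output (the sample counter block is an assumption about the log format, which varies by Hadoop/HBase version):

```python
import re

def parse_row_count(job_output):
    """Pull the ROWS counter out of RowCounter's MapReduce job output.

    Returns the count as an int, or None if no counter line is present.
    """
    match = re.search(r"\bROWS=(\d+)", job_output)
    return int(match.group(1)) if match else None

# Counter block roughly as it appears in the job's console output
# (exact formatting is version-dependent):
sample = """
org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
        ROWS=42137
"""
print(parse_row_count(sample))  # 42137
```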

Monday, 25 November 2013

QUERYING JSON RECORDS VIA HIVE



Step 1 :- 

JSON file



{
  "Foo": "ABC",
  "Bar": "20090101100000",
  "Quux": {
    "QuuxId": 1234,
    "QuuxName": "Sam"
  }
}


Step 2 :- 

Create Table 

CREATE TABLE json_table ( json string );


Step 3 :-

Load the data into the Hive table

LOAD DATA LOCAL INPATH '/tmp/example.json'  INTO TABLE `json_table`;

Step 4 :- 

Retrieve data 

select get_json_object(json_table.json, '$') from json_table;

Step 5 :-

Retrieve nested data

select get_json_object(json_table.json, '$.Foo') as foo,
       get_json_object(json_table.json, '$.Bar') as bar,
       get_json_object(json_table.json, '$.Quux.QuuxId') as qid,
       get_json_object(json_table.json, '$.Quux.QuuxName') as qname
from json_table;
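Each JSONPath expression in the query is just a chain of key lookups into the record, so the expected results can be sanity-checked locally with Python's json module before running the Hive query:

```python
import json

# The sample record from Step 1:
record = """
{
  "Foo": "ABC",
  "Bar": "20090101100000",
  "Quux": {"QuuxId": 1234, "QuuxName": "Sam"}
}
"""

doc = json.loads(record)

# Each get_json_object path maps onto a plain key lookup:
row = {
    "foo": doc["Foo"],                 # '$.Foo'
    "bar": doc["Bar"],                 # '$.Bar'
    "qid": doc["Quux"]["QuuxId"],      # '$.Quux.QuuxId'
    "qname": doc["Quux"]["QuuxName"],  # '$.Quux.QuuxName'
}
print(row)  # {'foo': 'ABC', 'bar': '20090101100000', 'qid': 1234, 'qname': 'Sam'}
```

Note that `'$.Bar'` comes back as a string, not a timestamp; Hive's get_json_object likewise returns strings, so any casting happens in the query.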






Friday, 15 November 2013

HBASE BACKUP AND RESTORE TABLE


STEP 1

EXPORT  :-

Export is a utility that dumps the contents of a table to HDFS as a sequence file. Invoke via:



[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir>

[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Export HBASEEXPORTTABLE DUMP

STEP 2

IMPORT :- 

Import is a utility that loads data previously written by Export back into HBase; the target table must already exist. Invoke via:



[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>

[ nitin@nitin-ubuntu:~ ]# hbase org.apache.hadoop.hbase.mapreduce.Import HBASEIMPORTTABLE DUMP
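A backup/restore round trip is just these two invocations sharing the same dump directory. A sketch that builds the command lines (the helper names are illustrative, the table and directory names are placeholders, and Export additionally accepts optional trailing arguments for number of versions and start/end timestamps):

```python
def export_cmd(table, outdir, versions=None):
    """Build the Export command line; an optional versions count can
    follow the output directory."""
    cmd = ["hbase", "org.apache.hadoop.hbase.mapreduce.Export", table, outdir]
    if versions is not None:
        cmd.append(str(versions))
    return cmd

def import_cmd(table, indir):
    """Build the Import command line; the target table must already
    exist, since Import does not create it."""
    return ["hbase", "org.apache.hadoop.hbase.mapreduce.Import", table, indir]

# A backup/restore round trip through the same HDFS dump directory:
print(" ".join(export_cmd("MY_TABLE", "DUMP")))
print(" ".join(import_cmd("MY_TABLE", "DUMP")))
```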





