Monday, 7 April 2014

PUPPET INSTALLATION AND CONFIGURATIONS



Step 1 :-

Install the EPEL and Remi repositories on CentOS 6.x on each system


wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
wget http://rpms.famillecollet.com/enterprise/remi-release-6.rpm
sudo rpm -Uvh remi-release-6*.rpm epel-release-6*.rpm
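To confirm both repositories registered correctly, a quick check (note the Remi repo ships disabled by default, so list all repos, not just enabled ones):

```shell
# Both epel and remi should show up in the repo list after the rpm -Uvh above
yum repolist all | grep -Ei 'epel|remi'
```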

Step 2 :- 
Install Puppet on each system 
On Master server :- 
[root@PUPPET-MASTER ] # yum install puppet-server
[root@PUPPET-MASTER ] # /etc/init.d/puppetmaster start
On all client agent systems :- 
[root@PUPPET-CLIENT1 ~] # yum install puppet
[root@PUPPET-CLIENT1 ~] # /etc/init.d/puppet restart

Step 3 :-

Configuration


On all client agent systems :-

[root@PUPPET-CLIENT1 ~] # vim /etc/puppet/puppet.conf

Add the below parameters into the [agent] section

server = PUPPET-MASTER
runinterval = 120


[root@PUPPET-CLIENT1 ~] # /etc/init.d/puppet restart
Step 4 :-
On Master server :-
List all certificates from the clients connected to the server.
[root@PUPPET-MASTER ] # puppetca -l
  "PUPPET-CLIENT1" (E3:2B:85:FD:56:2E:34:A3:E0:FF:7A:33:3A:36:33:8C)
Sign each certificate using the below command 
[root@PUPPET-MASTER ] # puppetca -s PUPPET-CLIENT1
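Note: on newer Puppet 3.x releases the `puppetca` command is deprecated in favour of the `puppet cert` subcommand; the equivalents would be:

```shell
# List pending certificate requests on the master
puppet cert list
# Sign a specific client's certificate
puppet cert sign PUPPET-CLIENT1
```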

Step 5 :-
On all client agent system
[root@PUPPET-CLIENT1 ~] # puppet agent --test
info: Caching catalog for ug-th-0215-nn
notice: Finished catalog run in 0.10 seconds
[root@PUPPET-CLIENT1 ~] #

The above output confirms the agent connected to the server successfully with a signed certificate.
Step 6 :-
Push customized conf files from the master to the clients.
[root@PUPPET-MASTER ] # vim  /etc/puppet/manifests/site.pp
Add below entry into site.pp
import 'nodes.pp'
Step 7 :-
Create nodes.pp file 
[root@PUPPET-MASTER ] # vim  /etc/puppet/manifests/nodes.pp 
node 'PUPPET-CLIENT1' {
include nginx
}

node 'PUPPET-CLIENT2' {
include nginx
}
Step 8 :-

[root@PUPPET-MASTER ] # mkdir -p /etc/puppet/modules/nginx/{manifests,files}

Let's create the nginx class file, which lives in /etc/puppet/modules/nginx/manifests/init.pp

[root@PUPPET-MASTER ] # vim /etc/puppet/modules/nginx/manifests/init.pp

# Manage nginx webserver
class nginx {
  package { 'nginx':
    ensure => installed,
  }

  service { 'nginx':
    ensure  => running,
    require => Package['nginx'],
  }

  file { 'nginxconfig':
    name   => '/etc/nginx/nginx.conf',
    source => 'puppet:///modules/nginx/nginx.conf',
    notify => Service['nginx'],
  }
}
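Before relying on the agents to pull this down, the class can be syntax-checked on the master using the puppet CLI installed above:

```shell
# Validate manifest syntax without applying anything;
# exit status 0 means the manifest parsed cleanly
puppet parser validate /etc/puppet/modules/nginx/manifests/init.pp
```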


Add your customized nginx.conf into /etc/puppet/modules/nginx/files/

Step 9 :-

[root@PUPPET-MASTER ] # puppet apply /etc/puppet/manifests/site.pp

Step 10 :-

Log in to a client system and check /etc/nginx/nginx.conf; you will see your customized file. With runinterval = 120, it can take up to 2 minutes for the client to update. 


Step 11(Optional) :-

Hadoop Manifests

# Make sure /etc/puppet/modules/ is owned by puppet:puppet, and unzip your Hadoop archive into /etc/puppet/modules/hadoop/files/
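The comment above can be carried out roughly like this (the archive name hadoop-1.2.1.tar.gz is only an example; substitute your own file):

```shell
# Create the module layout and fix ownership so the master can serve the files
mkdir -p /etc/puppet/modules/hadoop/{manifests,files}
chown -R puppet:puppet /etc/puppet/modules
# Unpack your Hadoop archive into the module's files directory
tar -xzf hadoop-1.2.1.tar.gz -C /etc/puppet/modules/hadoop/files/
```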


class hadoop {

  group { 'hadoop':
    ensure => present,
    gid    => 1000,
  }

  user { 'hadoop':
    ensure     => present,
    shell      => '/bin/bash',
    managehome => true,
    home       => '/home/hadoop',
    password   => '$1$jE8T0sFs$PMKB4bfP21IRqzZ14mwTR/',
    require    => Group['hadoop'],
  }

  file { 'Hadoophome':
    name    => '/home/hadoop',
    ensure  => directory,
    owner   => 'hadoop',
    group   => 'hadoop',
    mode    => '0700',
    require => User['hadoop'],
  }

  file { 'hadoopconf':
    name    => '/usr/local/hadoop',
    ensure  => directory,
    recurse => true,
    owner   => 'hadoop',
    group   => 'hadoop',
    mode    => '0755',
    source  => 'puppet:///modules/hadoop/hadoop/',
  }
}

Thursday, 13 March 2014

STANDALONE HBASE INSTALLATION CENTOS 6.X




In standalone mode, HBase does not use HDFS -- it uses the local filesystem instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM. ZooKeeper binds to a well-known port so clients can talk to HBase.

Step 1: -

Configure Cloudera repo

[nitin@nitin-ubuntu ~]# cat /etc/yum.repos.d/cdh.repo
[cloudera-cdh4]
name = Cloudera CDH, Version 4.4.0
baseurl = http://archive.cloudera.com/cdh4/redhat/5/x86_64/cdh/4.4.0/
gpgkey = http://archive.cloudera.com/redhat/cdh/RPM-GPG-KEY-cloudera
gpgcheck = 1
[nitin@nitin-ubuntu ~]#

Step 2: -

Install Hbase-master

[nitin@nitin-ubuntu ~]# yum clean all
[nitin@nitin-ubuntu ~]# yum install hbase-master

Step 3 :- 

Add the following lines into hbase-site.xml

[nitin@nitin-ubuntu ~]# cat /etc/hbase/conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///BIG_DATA/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/BIG_DATA/zookeeper</value>
</property>
</configuration>



Step 4 :-

Edit datadir for zookeeper

[nitin@nitin-ubuntu ~]# cat /etc/zookeeper/conf/zoo.cfg
dataDir=/BIG_DATA/zookeeper/


Step 5:- 

Create level one directory and change permission

[nitin@nitin-ubuntu ~]# mkdir /BIG_DATA/
[nitin@nitin-ubuntu ~]# chown -R hbase:hbase /BIG_DATA/

Step 6:-
Configure /etc/hosts on the standalone HBase system as well as on any client connecting to it.

[nitin@nitin-ubuntu ~]# cat /etc/hosts
10.10.10.110 nitin-ubuntu


[nitin@nitin-CLIENT1 ~]# cat /etc/hosts
10.10.10.110 nitin-ubuntu

Step 7 :-
Restart Hbase

[nitin@nitin-ubuntu ~]# /etc/init.d/hbase-master restart
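Once the master restarts cleanly, a quick smoke test from the HBase shell (the table and column family names below are just examples):

```shell
# Create a test table, write one cell, and read it back
echo "create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'" | hbase shell
```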

Monday, 17 February 2014

HDFS TO AWS S3





Step 1 :-

Login to Cloudera manager

Go to services and Click on hdfs

Go to configuration and click on view and edit roles

Click on service-wide configuration

Click on Advanced

Step 2 :-

Add the below details in the Cluster-wide Configuration Safety Valve for core-site.xml

<property>
<name>fs.s3n.awsAccessKeyId</name>
<value>XXXXXXXXXXXXXXXXXXX</value>
</property>

<property>
<name>fs.s3n.awsSecretAccessKey</name>
<value>XXXXXXXXXXXXXXXXXXXXXXX</value>
</property>


Step 3 :-

Save configuration. 

Click on Action and Deploy Configuration 

Restart hdfs once 


Step 4:-

[nitin@nitin-ubuntu ~]# sudo -u hdfs hadoop distcp s3n://big-store/getting_price.sh hdfs://<CLUSTER1>:8020/user/nitin/

[nitin@nitin-ubuntu ~]# sudo -u hdfs hadoop distcp s3n://big-store/getting_price.sh hdfs://10.10.10.216:8020/user/nitin/
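The commands above copy from S3 into HDFS; the title direction (HDFS to S3) uses the same tool with the arguments reversed (the bucket path below is an example):

```shell
# Copy a file from HDFS out to an S3 bucket
sudo -u hdfs hadoop distcp hdfs://10.10.10.216:8020/user/nitin/getting_price.sh s3n://big-store/backup/
```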

Friday, 7 February 2014

BACKUP HIVE META-STORE WITH POSTGRESQL IN CDH4.X




1) Go to the SCM server database directory

[nitin@nitin-ubuntu:~] # cd /var/lib/cloudera-scm-server-db/data

2) Check the file generated_password.txt. This file is created by Cloudera Manager.


[nitin@nitin-ubuntu:~] # cat generated_password.txt

8UlBunj0MM

The password above was generated by /usr/share/cmf/bin/initialize_embedded_db.sh (part of the cloudera-manager-server-db package)
and is the password for the user 'cloudera-scm' for the database in the current directory.

Generated at 20140128-230553.


3) Log in to PostgreSQL with the password from the above file, i.e. 8UlBunj0MM

[nitin@nitin-ubuntu:~] # psql --user cloudera-scm --port=7432 --dbname=postgres
Password for user cloudera-scm:************
postgres=# \q

Note :- Make sure you give exact username and password

4) Take a dump of the hive database

[nitin@nitin-ubuntu:~] # pg_dump hive -U cloudera-scm --port=7432 > hive.sql
Password:********
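To restore this dump later (assuming the same embedded PostgreSQL instance and an empty hive database), standard psql input redirection works:

```shell
# Restore the hive metastore from the dump
psql hive -U cloudera-scm --port=7432 < hive.sql
```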

5) It's done.

Tuesday, 28 January 2014

ACCESS HBASE TABLE WITH TABLEAU DESKTOP 8.0




This assumes you already have Tableau installed on your system.

Concept :-

You can't connect Tableau directly to an HBase table; you need to connect to a Hive table that is internally mapped to the HBase table.

Please check below link for more explanation :

http://nosql.mypopescu.com/post/17262685876/visualizing-hadoop-data-with-tableau-software-and

Step 1 :- 


Download Tableau driver for hive


Step 2 :- (Driver installation)

Install Above downloaded driver.

Step 3 :- (Configure ODBC driver)

Click on start go to Data Source (ODBC).

Click on System DSN.

Select Cloudera ODBC driver for Apache Hive.

Fill the details.

Save Setting.

Step 4 :- (Run Hive as Thrift service)

[nitin@nitin-ubuntu ~]$ sudo hive --service hiveserver --config /etc/hive/conf

Make sure hive.aux.jars.path is set in the above hive-site.xml and that all the jars are present.

The below jars are needed by the Hive client to talk to HBase and fetch data from it.

For example :- 

<property>
    <name>hive.aux.jars.path</name>
    <value>file:///usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.4.0.jar,file:///usr/lib/hbase/lib/hbase-0.94.6-cdh4.4.0.jar,file:///usr/lib/zookeeper/zookeeper-3.4.5-cdh4.4.0.jar,file:///usr/share/cmf/lib/guava-14.0.jar
   </value>
</property>
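For reference, the Hive-to-HBase mapping itself is created with the HBase storage handler; a sketch (the table, column family, and column names here are hypothetical):

```shell
# Create a Hive external table backed by an existing HBase table 'prices'
hive -e "CREATE EXTERNAL TABLE hbase_prices(key string, price float)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:price')
TBLPROPERTIES ('hbase.table.name' = 'prices');"
```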


Step 5 :- (Connect Tableau to Hive tables)

Select tableau from start menu.

Go to Data, click on Connect to Data, then click on the Cloudera database connector.

It will ask you to make connections.

Give your hive thrift  server IP and port as 10000.

Click on connect.

If it's connected properly, then you will see "default" in the schema section.

Select table where you want to make computation.

Click OK.







