Monday, 17 February 2014

HDFS TO AWS S3





Step 1 :-

Log in to Cloudera Manager.

Go to Services and click on hdfs.

Go to Configuration and click on View and Edit roles.

Click on Service-Wide configuration.

Click on Advanced.

Step 2 :-

Add the properties below to the Cluster-wide Configuration Safety Valve for core-site.xml:

<property>
<name>fs.s3n.awsAccessKeyId</name>
<value>XXXXXXXXXXXXXXXXXXX</value>
</property>

<property>
<name>fs.s3n.awsSecretAccessKey</name>
<value>XXXXXXXXXXXXXXXXXXXXXXX</value>
</property>


Step 3 :-

Save the configuration.

Click on Actions and select Deploy Configuration.

Restart HDFS once.
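
After the restart, a quick way to confirm that the new credentials are being picked up is to list the bucket with hadoop fs. The sketch below only prints the command (a dry run); the bucket name is the one from the examples in this post, so replace it with your own:

```shell
# Smoke test after the restart: list the S3 bucket to confirm the
# credentials in core-site.xml took effect. Printed as a dry run here;
# drop the echo to actually run it against the cluster.
BUCKET="s3n://big-store/"
CHECK_CMD="hadoop fs -ls $BUCKET"
echo "$CHECK_CMD"
```

If the credentials are wrong or not deployed yet, the real command fails with an access-denied error from S3.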


Step 4:-

[nitin@nitin-ubuntu ~]# sudo -u hdfs hadoop distcp s3n://big-store/getting_price.sh hdfs://<CLUSTER1>:8020/user/nitin/

[nitin@nitin-ubuntu ~]# sudo -u hdfs hadoop distcp s3n://big-store/getting_price.sh hdfs://10.10.10.216:8020/user/nitin/
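The two commands above pull a file from S3 into HDFS; the opposite direction (HDFS to S3, as in the title) uses the same distcp syntax with source and destination swapped. A sketch, shown as a dry run; the NameNode address, path, and bucket are the ones from the examples above:

```shell
# Reverse copy: push an HDFS path to S3 with distcp.
# Shown as a dry run (echo); remove the echo to execute for real.
SRC="hdfs://10.10.10.216:8020/user/nitin/getting_price.sh"
DST="s3n://big-store/getting_price.sh"
echo sudo -u hdfs hadoop distcp "$SRC" "$DST"
```

distcp runs as a MapReduce job either way, so it works for whole directories as well as single files.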

Friday, 7 February 2014

BACKUP HIVE META-STORE WITH POSTGRESQL IN CDH4.X




1) Go to the scm server database directory

[nitin@nitin-ubuntu:~] # cd /var/lib/cloudera-scm-server-db/data

2) Check the file generated_password.txt. This file is created by Cloudera Manager.


[nitin@nitin-ubuntu:~] # cat generated_password.txt

8UlBunj0MM

The password above was generated by /usr/share/cmf/bin/initialize_embedded_db.sh (part of the cloudera-manager-server-db package)
and is the password for the user 'cloudera-scm' for the database in the current directory.

Generated at 20140128-230553.


3) Log in to PostgreSQL with the password from the file above, that is 8UlBunj0MM

[nitin@nitin-ubuntu:~] # psql --user cloudera-scm --port=7432 --dbname=postgres
Password for user cloudera-scm:************
postgres=# \q

Note :- Make sure you use the exact username and password.

4) Take a dump of the hive database

[nitin@nitin-ubuntu:~] # pg_dump hive -U cloudera-scm --port=7432 > hive.sql
Password:********
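
The one-off pg_dump above can be wrapped in a small script that writes a dated file, which makes it easy to schedule from cron. A sketch under the assumption that the cloudera-scm password is supplied via PGPASSWORD or a ~/.pgpass file; the filename format is my own choice, not something Cloudera mandates:

```shell
# Dated backup of the Hive metastore from the embedded Cloudera database.
# Assumes the cloudera-scm password is available via PGPASSWORD or ~/.pgpass,
# so pg_dump does not prompt interactively.
STAMP=$(date +%Y%m%d-%H%M%S)
OUTFILE="hive-metastore-${STAMP}.sql"
DUMP_CMD="pg_dump hive -U cloudera-scm --port=7432 -f $OUTFILE"
echo "$DUMP_CMD"   # dry run; remove the echo to actually take the dump
```

To restore, feed the dump back through psql, for example: psql hive -U cloudera-scm --port=7432 < hive.sql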

5) It's done.
