Friday, June 17, 2016

Hadoop Installation Procedure (Pseudo-Distributed Mode on Ubuntu)

Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is a project of the Apache Software Foundation.

You can install Hadoop in three modes:

 a) Standalone (local) mode
 b) Pseudo-distributed mode
 c) Fully distributed (cluster) mode

This article covers pseudo-distributed mode installation on Ubuntu only. For the other two types of installation, refer to subsequent articles.

1. To install Hadoop you need the following software (both can be downloaded free of charge):

  • hadoop-2.7.2.tar.gz
  • jdk-8u77-linux-i586.tar.gz

Download both archives and extract them. This article assumes they are kept in the Downloads folder (you can place them anywhere, but then adjust the paths used in the later steps).
For simplicity, I have renamed the extracted hadoop-2.7.2 folder to hadoop. The JDK archive extracts to a folder named jdk1.8.0_77.
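
The extract-and-rename step can also be done from the terminal. The following is a minimal sketch, not part of the original procedure: the function name extract_and_rename is made up for illustration, and it assumes the archive unpacks into a single top-level folder.

```shell
# extract_and_rename ARCHIVE OLD_NAME NEW_NAME
# Unpacks a .tar.gz archive next to itself, then renames the extracted folder.
extract_and_rename() {
    archive="$1"
    old_name="$2"
    new_name="$3"
    dir=$(dirname "$archive")
    tar -xzf "$archive" -C "$dir" && mv "$dir/$old_name" "$dir/$new_name"
}

# For this article the call would be:
#   extract_and_rename "$HOME/Downloads/hadoop-2.7.2.tar.gz" hadoop-2.7.2 hadoop
# The JDK archive only needs extracting:
#   tar -xzf "$HOME/Downloads/jdk-8u77-linux-i586.tar.gz" -C "$HOME/Downloads"
```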

2. Update the Ubuntu package lists.

user@user-Thinkceter-E73:~$ sudo apt update

3. Now install the OpenSSH server on your system.
    Use the following command and press Enter (enter your password if asked).

user@user-Thinkceter-E73:~$ sudo apt-get install openssh-server
   
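4. Set up passwordless SSH to localhost. Generate an RSA key pair, pressing Enter at each prompt to accept the default file location and an empty passphrase.

user@user-Thinkceter-E73:~$ ssh-keygen -t rsa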

  


5. Add the public key to authorized_keys. The ssh-copy-id command takes care of this step automatically and assigns appropriate permissions to the files.

user@user-Thinkceter-E73:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
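
If ssh-copy-id is not available, the same result can be reached by hand: append the public key to authorized_keys and tighten the permissions. A rough sketch follows; the function name authorize_key is hypothetical, and it assumes the key pair uses the default id_rsa file name.

```shell
# authorize_key SSH_DIR
# Adds SSH_DIR/id_rsa.pub to SSH_DIR/authorized_keys (once) and fixes permissions.
authorize_key() {
    ssh_dir="$1"
    pub="$ssh_dir/id_rsa.pub"
    auth="$ssh_dir/authorized_keys"
    [ -f "$pub" ] || return 1
    touch "$auth"
    # Append the key only if that exact line is not already listed.
    grep -qxFf "$pub" "$auth" || cat "$pub" >> "$auth"
    # sshd ignores key files that other users can write to.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth"
}

# For this article the call would be:
#   authorize_key "$HOME/.ssh"
```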
 
6. To check the localhost connection, type ssh localhost in the terminal. It should log in without asking for a password.

user@user-Thinkceter-E73:~$ ssh localhost

7. After connecting to localhost, type exit.

8. Open the .bashrc file in your home folder. sudo is not needed, because the file belongs to your own user.


user@user-Thinkceter-E73:~$ gedit ~/.bashrc


9. Add these two lines to the end of the file, then save and exit.

export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77

export PATH=$PATH:$JAVA_HOME/bin

10. Now apply the changes to the current session. Either close the terminal and open it again, or run:

user@user-Thinkceter-E73:~$ source ~/.bashrc

11. To verify the Java path, type echo $JAVA_HOME in the terminal. It should print /home/user/Downloads/jdk1.8.0_77.
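
Printing the variable only shows that it is set, not that it points at a working JDK. As a small extra check (a sketch, with a made-up function name check_java_home), you can test that the folder really contains an executable bin/java:

```shell
# check_java_home DIR
# Succeeds when DIR looks like a JDK folder, i.e. it contains an executable bin/java.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java found"
    else
        echo "Problem: no executable at $1/bin/java"
        return 1
    fi
}

# For this article the call would be:
#   check_java_home "$JAVA_HOME"
```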


12. Open the hadoop/etc/hadoop/hadoop-env.sh file and add this line at the end of the file.


export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77


Save and exit.


13. Open the .bashrc file again.


user@user-Thinkceter-E73:~$ gedit ~/.bashrc

At the end of the file add these two lines:

export PATH=$PATH:/home/user/Downloads/hadoop/bin

export PATH=$PATH:/home/user/Downloads/hadoop/sbin

Save and exit.

14. Now close the terminal and open it again.

To verify the Hadoop installation, run:

user@user-Thinkceter-E73:~$ hadoop version

It will display the installed Hadoop version.
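
If the command is not found, the PATH lines from step 13 are the usual suspect. The sketch below (check_commands is an illustrative name, not a Hadoop tool) reports which commands the shell can currently find. hadoop and hdfs live in the bin folder and start-dfs.sh in sbin, so together they cover both PATH entries.

```shell
# check_commands NAME...
# Reports whether each NAME is on PATH; returns non-zero if any is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# For this article the call would be:
#   check_commands hadoop hdfs start-dfs.sh
```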

15. Configure the files in etc/hadoop/.

Open Downloads/hadoop/etc/hadoop/core-site.xml with gedit.

Add the required properties inside the <configuration> element at the end of the file.
Save and exit the file.
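
The property values are not reproduced in this post. For orientation only, the Apache Hadoop single-node setup guide sets fs.defaultFS to hdfs://localhost:9000 for pseudo-distributed mode. The sketch below writes such a file from the terminal. The function name write_core_site is made up, and port 9000 is that guide's value, which may differ from what this article originally used.

```shell
# write_core_site CONF_DIR
# Writes a minimal core-site.xml into CONF_DIR (overwrites an existing file).
write_core_site() {
    cat > "$1/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <!-- Address of the default file system (assumed value, see note above) -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}

# For this article the call would be:
#   write_core_site "$HOME/Downloads/hadoop/etc/hadoop"
```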

Open Downloads/hadoop/etc/hadoop/hdfs-site.xml.
Add the required properties inside the <configuration> element at the end of the file.
Save and exit the file.
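
Again, the exact properties are not shown here. On a single machine there is only one DataNode, so the Apache single-node guide lowers the block replication factor to 1. A sketch under that assumption (write_hdfs_site is an illustrative name):

```shell
# write_hdfs_site CONF_DIR
# Writes a minimal hdfs-site.xml into CONF_DIR (overwrites an existing file).
write_hdfs_site() {
    cat > "$1/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <!-- One copy of each block, since there is a single DataNode -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}

# For this article the call would be:
#   write_hdfs_site "$HOME/Downloads/hadoop/etc/hadoop"
```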
Rename the file Downloads/hadoop/etc/hadoop/mapred-site.xml.template to mapred-site.xml.

Open the file and add the required properties inside the <configuration> element at the end of the file.
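
The rename and the edit can be combined in one terminal step. The sketch below assumes MapReduce jobs should run on YARN, which is the value the Apache single-node guide uses. That guide also configures yarn-site.xml for this case, which this article does not cover. The function name setup_mapred_site is hypothetical.

```shell
# setup_mapred_site CONF_DIR
# Renames the shipped template (if still present) and writes a minimal mapred-site.xml.
setup_mapred_site() {
    conf_dir="$1"
    if [ -f "$conf_dir/mapred-site.xml.template" ]; then
        mv "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
    cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <!-- Run MapReduce jobs on YARN (assumed value, see note above) -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}

# For this article the call would be:
#   setup_mapred_site "$HOME/Downloads/hadoop/etc/hadoop"
```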

Open Downloads/hadoop/etc/hadoop/slaves.
Erase its contents and type localhost (without quotes).
Save and exit the file.
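
The same edit as a one-line terminal command, wrapped in a small function for reuse (set_single_worker is a made-up name):

```shell
# set_single_worker CONF_DIR
# Replaces the contents of CONF_DIR/slaves with the single entry "localhost".
set_single_worker() {
    printf 'localhost\n' > "$1/slaves"
}

# For this article the call would be:
#   set_single_worker "$HOME/Downloads/hadoop/etc/hadoop"
```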



 
