Wednesday, June 15, 2016

Hadoop Installation Procedure (simple mode installation) in UBUNTU


Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.

Hadoop can be installed in three modes:

  a) Simple (standalone) mode
  b) Pseudo-distributed mode
  c) Distributed cluster mode

This article covers the simple mode installation on Ubuntu only. For the other two modes, refer to subsequent articles.

1. To install Hadoop you need the following software (both can be downloaded free of charge):

  • hadoop-2.7.2.tar.gz
  • jdk-8u77-linux-i586.tar.gz

Download both compressed archives and extract them. This article assumes they are kept in the Downloads folder, but you can place them anywhere as long as you adjust the paths in the later steps. For simplicity, I have renamed the extracted hadoop-2.7.2 folder to hadoop.
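The extract-and-rename step can also be done from the terminal. The following is a minimal sketch, not the only way to do it: unpack_and_rename is a hypothetical helper name, and it assumes each archive unpacks into a single top-level folder.

```shell
# Hypothetical helper: extract a .tar.gz archive into a folder and
# optionally rename the extracted top-level folder.
#   $1 = archive, $2 = destination folder,
#   $3 = folder name inside the archive, $4 = new name (may be empty)
unpack_and_rename() {
    archive="$1"; dest_dir="$2"; old_name="$3"; new_name="$4"
    # -x extract, -z gunzip, -f archive file, -C change into destination first
    tar -xzf "$archive" -C "$dest_dir" || return 1
    if [ -n "$new_name" ] && [ "$old_name" != "$new_name" ]; then
        mv "$dest_dir/$old_name" "$dest_dir/$new_name"
    fi
}

# Example call, assuming the archive sits in ~/Downloads:
#   unpack_and_rename ~/Downloads/hadoop-2.7.2.tar.gz ~/Downloads hadoop-2.7.2 hadoop
```

For the JDK archive, pass an empty fourth argument so the extracted folder keeps its original name.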
2. Now install the OpenSSH server on your system.
    Use the following command and press Enter (enter your password if asked):
    sudo apt-get install openssh-server

   During installation you may be asked for your password and for a yes/no confirmation a couple of times. Enter the password and answer yes or y as applicable. The OpenSSH server is then installed automatically, and the prompt returns once the installation is over.
   3. To check the connection to the local host, type ssh localhost in the terminal.
        After connecting to the local host, type exit.
   4. Now use the following command to edit the .bashrc file in your home folder.
        gedit ~/.bashrc
      The .bashrc file opens in an editor. Append the following four statements (to set the path) at the end of the file. Replace /home/user with your own home folder.
      export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
      export PATH=$PATH:$JAVA_HOME/bin
      export PATH=$PATH:/home/user/Downloads/hadoop/bin
      export PATH=$PATH:/home/user/Downloads/hadoop/sbin
     Save these modifications and close the .bashrc file. Close the terminal and open it again so that the new settings take effect.
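If you prefer not to open an editor, the same lines can be appended from the terminal. This is only a sketch: append_once is a hypothetical helper name, and it adds a line only when the exact same line is not already in the file, so running it twice does not duplicate entries.

```shell
# Hypothetical helper: append a line to a file unless that exact line
# is already present.
#   $1 = line to add, $2 = file to modify
append_once() {
    line="$1"; file="$2"
    touch "$file"
    # -q quiet, -x match the whole line, -F treat the pattern as plain text
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls (single quotes keep $PATH and $JAVA_HOME unexpanded):
#   append_once 'export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77' ~/.bashrc
#   append_once 'export PATH=$PATH:$JAVA_HOME/bin' ~/.bashrc
```

The single quotes matter: with double quotes the shell would expand $PATH immediately and write its current value into the file.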
   5. To check whether the Java path is set correctly, try the following command.
      echo $JAVA_HOME
     If the path is set correctly, it displays the JDK folder set above (/home/user/Downloads/jdk1.8.0_77). To see the Java version itself, run java -version.
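A slightly stricter check than echo is to confirm that the folder really contains a java executable. The sketch below assumes the usual JDK layout with the binary at bin/java; check_java_home is a hypothetical helper name.

```shell
# Hypothetical helper: report whether JAVA_HOME points at a usable JDK.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No java executable found under $JAVA_HOME/bin"
        return 1
    fi
    echo "JAVA_HOME looks valid: $JAVA_HOME"
}

# Example call:
#   check_java_home
```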
     

  6. Now install rsync on your system. Use the following command and press Enter.
     sudo apt-get install rsync
     Enter your password and answer yes/no if asked.
    

  
7. Open the hadoop/etc/hadoop/hadoop-env.sh file (under Downloads) and add this line at the end of the file:
     export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
     Save and exit.
   8. To verify the Hadoop installation, type
       hadoop version
      It displays the installed Hadoop version.
    
   

 The Hadoop installation is complete.
 Run a program (e.g. wordcount) to check that the installation works correctly.
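One way to run wordcount is to use the examples jar that ships inside the Hadoop 2.7.2 archive. The following is a sketch under assumptions: run_wordcount is a hypothetical helper name, the input and output folders are your own choice, and the jar is expected at its usual location under share/hadoop/mapreduce. In simple mode with the default configuration, both folders are ordinary folders on the local disk.

```shell
# Hypothetical helper: run the bundled wordcount example.
#   $1 = Hadoop folder, $2 = input folder with text files,
#   $3 = output folder (must not exist yet)
run_wordcount() {
    hadoop_home="$1"; input_dir="$2"; output_dir="$3"
    command -v hadoop >/dev/null 2>&1 || {
        echo "hadoop is not on PATH" >&2; return 1; }
    jar="$hadoop_home/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar"
    [ -f "$jar" ] || { echo "examples jar not found: $jar" >&2; return 1; }
    # Hadoop refuses to overwrite an existing output folder, so stop early.
    [ ! -e "$output_dir" ] || {
        echo "output folder already exists: $output_dir" >&2; return 1; }
    hadoop jar "$jar" wordcount "$input_dir" "$output_dir"
}

# Example call, assuming ~/wc-input holds a few text files:
#   run_wordcount /home/user/Downloads/hadoop ~/wc-input ~/wc-output
```

If the job succeeds, the word counts are written to files inside the output folder.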

Enjoy...



