Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
You can install Hadoop in three modes:
a) Standalone (local) mode
b) Pseudo-distributed mode
c) Distributed (cluster) mode
This article covers distributed-mode installation (on Ubuntu) only.
1. To install Hadoop you need the following software (both can be downloaded free of charge):
- hadoop-2.7.2.tar.gz
- jdk-8u77-linux-i586.tar.gz
Download and extract both archives (they are compressed folders). This article assumes the extracted folders are kept in Downloads (you can also place them anywhere).
For simplicity, I have renamed the hadoop-2.7.2 folder to hadoop.
2. Updating Ubuntu
user@user-ThinkCentre-E73:~$ sudo apt update
3. Now, install the OpenSSH server on your system.
Use the following command and press Enter (enter your password if asked):
$ sudo apt-get install openssh-server
4. Creation of master and slave nodes. Use the following command:
$ sudo gedit /etc/hosts
When the hosts file opens, add the master and slave IP addresses:
192.168.1.52 master
192.168.1.55 slave1
Note: your IP addresses may differ, but both machines must be on the same network. Save and exit.
5. To change the hostname:
$ sudo gedit /etc/hostname
The terminal will ask for your password (enter it), and the hostname file will open.
Remove the old name and type the new name, master. Save and exit.
6. Use the following command:
$ sudo service hostname restart
(On newer Ubuntu releases, `sudo hostnamectl set-hostname master` does the same.)
Now close the terminal and reopen it. The prompt now shows the new name, i.e. master.
7. Use the following command to generate an SSH key:
$ ssh-keygen -t rsa
Press Enter three times. The terminal then displays the newly generated key.
Now use the following command to copy the key to the slave:
$ ssh-copy-id -i /home/user/.ssh/id_rsa.pub user@slave1
8. Open a new terminal on the slave system and make sure the SSH server is installed and running there:
$ sudo apt-get install openssh-server
$ sudo service ssh start
9. You can check whether the key was copied to the slave correctly by typing the following command in a new terminal on the master system:
$ ssh slave1
If it logs you in without asking for a password, passwordless SSH is working.
10. Configuration of Hadoop on the master
Go to Downloads/hadoop/etc/hadoop/ (the hadoop folder on your system).
Open the core-site.xml file with gedit and add the following lines at the end of the file (inside the configuration tag, set the name and value tags).
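The original post does not show the XML itself. A minimal core-site.xml for this kind of two-node setup might look like the sketch below; the port 9000 is a common default choice, not something specified in this article:

```xml
<configuration>
  <!-- Point all HDFS clients at the NameNode running on the master host -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```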
11. Now open hdfs-site.xml in the same directory and add these lines at the end of the file, as property tags inside the configuration tag.
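The article omits the property values. A typical hdfs-site.xml sketch for this setup is shown below; the replication factor of 1 matches a single-slave cluster, and the storage paths are assumptions based on the Downloads location used in this article:

```xml
<configuration>
  <!-- One DataNode (slave1), so keep one replica per block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Assumed local directories for NameNode and DataNode storage -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/user/Downloads/hadoop/hadoop_data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/user/Downloads/hadoop/hadoop_data/datanode</value>
  </property>
</configuration>
```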
12. Now copy mapred-site.xml.template, paste it in the same directory, and rename the copy mapred-site.xml.
Open the renamed file with gedit and make the following changes at the end of the file.
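Again the post leaves out the actual change. For Hadoop 2.x the usual entry in mapred-site.xml tells MapReduce to run on YARN; a minimal sketch:

```xml
<configuration>
  <!-- Run MapReduce jobs on the YARN framework (standard for Hadoop 2.x) -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```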
13. Open the hadoop-env.sh file, which is in the same directory, and add this line at the end of the file:
export HADOOP_CONF_DIR=/home/user/Downloads/hadoop/etc/hadoop
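hadoop-env.sh usually also needs JAVA_HOME pointing at the JDK extracted in step 1. The exact path below is an assumption based on the Downloads location used in this article (jdk-8u77 extracts to a jdk1.8.0_77 folder); adjust it to your system:

```shell
# Assumed path to the extracted jdk-8u77 folder
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
```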
14. Copy the files to the slave.
Open a terminal on the master system and change the directory:
$ cd Downloads
Now type the command to copy:
$ scp -r hadoop slave1:/home/user/Downloads/hadoop
(The above command copies all the configured Hadoop files to the slave system.)
15. Go to the Hadoop configuration directory and open the slaves file with gedit:
Downloads/hadoop/etc/hadoop/
Remove localhost from the file and type slave1. Save and exit.
16. In the same directory, copy the slaves file, paste it there, and rename the copy masters.
Now open the masters file and type master (replacing slave1). Save and exit.
17. NameNode formatting. Use the following command:
$ hadoop namenode -format
(In Hadoop 2.x, `hdfs namenode -format` is the preferred equivalent.)
18. In the terminal, type the following command to start all Hadoop services:
$ start-all.sh
It may ask for your password (up to two times); enter it.
19. In the terminal of the master system, try the following command:
$ jps
If the installation is correct, it must show a list like the following:
NameNode
SecondaryNameNode
Jps
(Since the slaves file lists only slave1, the DataNode process runs on the slave rather than on the master.)
20. In the terminal of the slave system, if we type
$ jps
it displays the following:
DataNode
Jps
Now you can execute Java JAR files in the Hadoop distributed environment.
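As a quick smoke test, you can run one of the example jobs bundled with the 2.7.2 release. The file names and HDFS paths below are illustrative, not from the original article:

```shell
# Put some input text into HDFS (paths here are illustrative)
$ hdfs dfs -mkdir -p input
$ hdfs dfs -put sample.txt input
# Run the wordcount example shipped inside the release tarball
$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount input output
# Inspect the result
$ hdfs dfs -cat output/part-r-00000
```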