Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
You can install Hadoop in three modes:
a) Standalone (local) mode
b) Pseudo-distributed mode
c) Distributed (cluster) mode
This article covers distributed-mode installation (on Ubuntu) only.
1. To install Hadoop you need the following software (both can be downloaded free of charge):
- hadoop-2.7.2.tar.gz
- jdk-8u77-linux-i586.tar.gz
Download and extract both archives (they are compressed folders). This article assumes the extracted folders are kept in Downloads (you can also place them anywhere).
For simplicity, I have renamed the hadoop-2.7.2 folder to hadoop.
2. Updating Ubuntu
user@user-ThinkCentre-E73:~$ sudo apt update
3. Now, install the OpenSSH server on your system.
Use the following command and press Enter (enter your password if asked):
$ sudo apt-get install openssh-server
4. Creation of master and slave nodes. Use the following command:
$ sudo gedit /etc/hosts
When the hosts file opens, add the master and slave IP addresses:
192.168.1.52 master
192.168.1.55 slave1
Note: your IP addresses may differ, but both machines must be on the same network. Save and exit.
5. To change the hostname:
$ sudo gedit /etc/hostname
The terminal will ask for your password (enter it), and the hostname file will open.
Remove the old name and type the new name, master. Save and exit.
6. Use the following command:
$ sudo service hostname restart
(On newer Ubuntu releases, `sudo hostnamectl set-hostname master` does the same.)
Now close the terminal and reopen it. The prompt now shows the new name, i.e. master.
7. Use the following command to generate an SSH key:
$ ssh-keygen -t rsa
Press Enter three times. The terminal then displays the newly generated key.
Now use the following command to copy the key to the slave:
$ ssh-copy-id -i /home/user/.ssh/id_rsa.pub user@slave1
8. Open a new terminal on the slave system and make sure the SSH server is installed and running there:
$ sudo apt-get install openssh-server
$ sudo service ssh start
9. You can check whether the key was copied to the slave correctly by typing the following command in a new terminal on the master system:
$ ssh slave1
If it logs you in without asking for a password, passwordless SSH is working.
10. Configuration of Hadoop on the master
Go to Downloads/hadoop/etc/hadoop/ (the hadoop folder on your system).
Open the core-site.xml file with gedit and add the following lines at the end of the file (inside the configuration tag, set the name and value tags).
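The original post does not show the XML itself. A minimal core-site.xml for this kind of two-node setup might look like the sketch below; the port 9000 is a common default choice, not something specified in this article:

```xml
<configuration>
  <!-- Point all HDFS clients at the NameNode running on the master host -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```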
11. Now open hdfs-site.xml in the same directory and add these lines at the end of the file, as property tags inside the configuration tag.
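The article omits the property values. A typical hdfs-site.xml sketch for this setup is shown below; the replication factor of 1 matches a single-slave cluster, and the storage paths are assumptions based on the Downloads location used in this article:

```xml
<configuration>
  <!-- One DataNode (slave1), so keep one replica per block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Assumed local directories for NameNode and DataNode storage -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/user/Downloads/hadoop/hadoop_data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/user/Downloads/hadoop/hadoop_data/datanode</value>
  </property>
</configuration>
```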
12. Now copy mapred-site.xml.template, paste it in the same directory, and rename the copy mapred-site.xml.
Open the renamed file with gedit and make the following changes at the end of the file.
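Again the post leaves out the actual change. For Hadoop 2.x the usual entry in mapred-site.xml tells MapReduce to run on YARN; a minimal sketch:

```xml
<configuration>
  <!-- Run MapReduce jobs on the YARN framework (standard for Hadoop 2.x) -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```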
13. Open the hadoop-env.sh file, which is in the same directory, and add this line at the end of the file:
export HADOOP_CONF_DIR=/home/user/Downloads/hadoop/etc/hadoop
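hadoop-env.sh usually also needs JAVA_HOME pointing at the JDK extracted in step 1. The exact path below is an assumption based on the Downloads location used in this article (jdk-8u77 extracts to a jdk1.8.0_77 folder); adjust it to your system:

```shell
# Assumed path to the extracted jdk-8u77 folder
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
```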
14. Copy the files to the slave.
Open a terminal on the master system and change the directory:
$ cd Downloads
Now type the command to copy:
$ scp -r hadoop slave1:/home/user/Downloads/hadoop
(The above command copies all the configured Hadoop files to the slave system.)
15. Go to the Hadoop configuration directory and open the slaves file with gedit:
Downloads/hadoop/etc/hadoop/
Remove localhost from the file and type slave1. Save and exit.
16. In the same directory, copy the slaves file, paste it there, and rename the copy masters.
Now open the masters file and type master (replacing slave1). Save and exit.
17. NameNode formatting. Use the following command:
$ hadoop namenode -format
(In Hadoop 2.x, `hdfs namenode -format` is the preferred equivalent.)
18. In the terminal, type the following command to start all Hadoop services:
$ start-all.sh
It may ask for your password (up to two times); enter it.
19. In the terminal of the master system, try the following command:
$ jps
If the installation is correct, it must show a list like the following:
NameNode
SecondaryNameNode
Jps
(Since the slaves file lists only slave1, the DataNode process runs on the slave rather than on the master.)
20. In the terminal of the slave system, if we type
$ jps
it displays the following:
DataNode
Jps
Now you can execute Java JAR files in the Hadoop distributed environment.
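As a quick smoke test, you can run one of the example jobs bundled with the 2.7.2 release. The file names and HDFS paths below are illustrative, not from the original article:

```shell
# Put some input text into HDFS (paths here are illustrative)
$ hdfs dfs -mkdir -p input
$ hdfs dfs -put sample.txt input
# Run the wordcount example shipped inside the release tarball
$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount input output
# Inspect the result
$ hdfs dfs -cat output/part-r-00000
```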