Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
You can install Hadoop in three modes:
a) Simple (standalone) mode
b) Pseudo-distributed mode
c) Fully distributed (cluster) mode
This article covers only the simple (standalone) mode installation on Ubuntu. For the other two modes, refer to the subsequent articles.
1. To install Hadoop you need the following software (both can be downloaded free of charge):
- hadoop-2.7.2.tar.gz
- jdk-8u77-linux-i586.tar.gz
Download these two files and extract them, since they are compressed archives. This article assumes the files are kept in the Downloads folder, but you can place them anywhere.
For simplicity, I have renamed the extracted hadoop-2.7.2 folder to hadoop.
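The extraction and renaming in this step can also be done from the terminal. The following is a minimal sketch, assuming both archives sit in the Downloads folder of your home directory and unpack to folders named hadoop-2.7.2 and jdk1.8.0_77. The extract_archive helper and the DOWNLOADS variable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: unpack a .tar.gz archive ($1) into a directory ($2).
extract_archive() {
    if [ ! -f "$1" ]; then
        echo "archive not found: $1" >&2
        return 1
    fi
    tar -xzf "$1" -C "$2"
}

# Assumed download location; change it if your files are elsewhere.
DOWNLOADS="${DOWNLOADS:-$HOME/Downloads}"

# Unpack Hadoop and rename the folder to "hadoop", as done in this article.
if extract_archive "$DOWNLOADS/hadoop-2.7.2.tar.gz" "$DOWNLOADS"; then
    mv "$DOWNLOADS/hadoop-2.7.2" "$DOWNLOADS/hadoop"
fi

# Unpack the JDK; a missing archive is reported but does not stop the script.
extract_archive "$DOWNLOADS/jdk-8u77-linux-i586.tar.gz" "$DOWNLOADS" || true
```

If an archive is missing, the helper prints a message instead of failing silently, so you can see which download still needs to be done.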
2. Now install the OpenSSH server on your system.
Use the following command and press Enter (enter your password if asked):
sudo apt-get install openssh-server
During installation you may be asked to confirm (yes/no) and/or to enter your password a couple of times. Type yes or y (as applicable) and give the password. The OpenSSH server is then installed automatically. Once the installation is over, you get the prompt back.
3. To check the localhost connection, type the following in the terminal:
ssh localhost
After connecting to localhost, type
exit
4. Now use the following command to edit the .bashrc file in your home directory.
gedit ~/.bashrc
The .bashrc file opens in an editor. Append the following four statements (to set the paths) at the end of the file.
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:/home/user/Downloads/hadoop/bin
export PATH=$PATH:/home/user/Downloads/hadoop/sbin
Save these modifications and close the .bashrc file. Then close the terminal and open it again so that the new settings take effect.
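If you prefer not to edit the file by hand, the same four lines can be appended from the terminal. This is only a sketch under this article's assumptions (everything lives under /home/user/Downloads). The names append_once and add_hadoop_paths are hypothetical helpers, and a line that is already present is skipped so the script can be rerun safely.

```shell
# Hypothetical helper: append line $2 to file $1 unless it is already there.
append_once() {
    touch "$1"
    grep -qxF -e "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Hypothetical wrapper: add the four lines from this step to file $1.
# Single quotes keep $PATH and $JAVA_HOME unexpanded inside the file.
add_hadoop_paths() {
    append_once "$1" 'export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77'
    append_once "$1" 'export PATH=$PATH:$JAVA_HOME/bin'
    append_once "$1" 'export PATH=$PATH:/home/user/Downloads/hadoop/bin'
    append_once "$1" 'export PATH=$PATH:/home/user/Downloads/hadoop/sbin'
}

# Usage (not run here): add_hadoop_paths "$HOME/.bashrc"
```

After calling add_hadoop_paths on your .bashrc, open a new terminal as described above so that the exports are read.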
5. To check whether the Java path is set correctly, try the following command.
echo $JAVA_HOME
It displays the Java installation path (here /home/user/Downloads/jdk1.8.0_77) if the variable is set correctly.
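The echo command only prints the value of the variable. To also confirm that the folder really contains a Java launcher, a slightly stronger check can be sketched as follows, assuming the usual JDK layout with an executable at bin/java. The name check_java_home is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if directory $1 holds an executable bin/java.
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable java found under $1/bin"
        return 1
    fi
    echo "JAVA_HOME looks valid: $1"
}

# Check the current value; ${JAVA_HOME:-} avoids an error if it is unset.
check_java_home "${JAVA_HOME:-}" || true

# Once the check passes, this prints the actual Java version:
# "$JAVA_HOME/bin/java" -version
```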
6. Now install rsync on your system. Use the following command and press Enter.
sudo apt-get install rsync
Enter your password and confirm with yes or y if asked.
7. Open the hadoop/etc/hadoop/hadoop-env.sh file (under Downloads) and add this line at the end of the file:
export JAVA_HOME=/home/user/Downloads/jdk1.8.0_77
Save and exit.
8. Finally, verify the Hadoop installation from the terminal.
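One common way to do this check, assuming the PATH entries from step 4 are active in the current terminal, is the hadoop version command, which prints the installed release. The sketch below first confirms that the hadoop launcher can be found. The name verify_hadoop is a hypothetical helper.

```shell
# Hypothetical helper: report whether the hadoop launcher is on the PATH,
# and if so ask it for its version.
verify_hadoop() {
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop not found on PATH; recheck the .bashrc entries"
        return 1
    fi
    hadoop version
}

verify_hadoop || true
```

If the launcher is not found, reopen the terminal or recheck the export lines added to .bashrc before trying again.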