How to Run Hadoop on CentOS

Updated: 04.07.2024

We often deploy various clustered systems, so a good set of instructions is worth its weight in gold. Today we offer a solid guide to deploying a Hadoop cluster suitable for development and for small clusters with no high-availability requirements.

Introduction

This article provides step-by-step instructions for installing a Hadoop cluster on CentOS 7. It is aimed at readers who are already familiar with Hadoop and Linux.

Deployment Topology

Preparing CentOS

Unless stated otherwise, all steps in this section are performed on every node of the configuration.

We will start from a minimal installation of CentOS 7. After installing the system, a few software packages need to be added:
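
The package list itself is not reproduced above; as a sketch, a typical preparation of a minimal install might look like this (the package selection is an assumption):

$ yum -y install wget rsync openssh-clients net-tools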

Set a hostname for each node. This step is optional, but it makes identifying the nodes much easier.

For example, on the master node the command is:
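
A sketch using hostnamectl (the exact node name is an assumption):

$ hostnamectl set-hostname master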

Log in again to see the result. Repeat this operation on every node, specifying the correct hostname for that node.

We will use OpenJDK 1.8, since this package is included in the standard CentOS 7 repository.

Create the file /etc/profile.d/java.sh with the following content:
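
A sketch of the profile script, assuming OpenJDK 1.8 was installed from the base repository (for example with yum -y install java-1.8.0-openjdk); the exact JRE path depends on the installed build:

export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export PATH=$PATH:$JAVA_HOME/bin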

To make sure the configuration is correct, log out and log back in. The env command should show the new environment variables, and java -version should report the expected Java version.

Create a user and a group for Hadoop:
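
For example:

$ groupadd hadoop
$ useradd -g hadoop -m hadoop
$ passwd hadoop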

Edit the hosts file so that the nodes can resolve each other by name:
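
A sketch of /etc/hosts; the master address matches the web UI addresses used later in this article, while the slave addresses and node names are assumptions:

192.168.171.132  master
192.168.171.133  slave1
192.168.171.134  slave2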

Check that the nodes are resolved correctly:

Configure passwordless SSH access from every node to every node:
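
For example, as the hadoop user (node names are assumptions):

$ ssh-keygen -t rsa
$ ssh-copy-id hadoop@master
$ ssh-copy-id hadoop@slave1
$ ssh-copy-id hadoop@slave2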

Verify that all nodes can reach each other over SSH keys without being prompted for a password.

Stop and disable the firewall:
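
On CentOS 7 this is done with systemd:

$ systemctl stop firewalld
$ systemctl disable firewalld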

Installing Hadoop

All steps in this section are performed on the master node, as the hadoop user.

Download and unpack the distribution:
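
A sketch; the article targets a Hadoop 2.x release (it uses the slaves/masters files and port 50070), so the exact version, mirror URL and install path below are assumptions:

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz
$ tar -xzf hadoop-2.7.7.tar.gz
$ mv hadoop-2.7.7 /home/hadoop/hadoop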

Add the Hadoop environment variables to the bash session initialization script:
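
A minimal sketch, assuming ~/.bashrc is used as the initialization script and the install path chosen above:

export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin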

Apply the environment variables so that they take effect:
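
$ source ~/.bashrc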

Now let's edit the Hadoop configuration files for our three-node topology.

Add the slave node names to the file $HADOOP_HOME/etc/hadoop/slaves :

Add the name of the secondary node to the file $HADOOP_HOME/etc/hadoop/masters :

If you need to disable security checks in Hadoop, which is common during development, add the following section to the file:

Create the directories Hadoop needs:
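
For example (the layout is an assumption and must match the dfs.namenode.name.dir, dfs.datanode.data.dir and hadoop.tmp.dir values in your configuration files):

$ mkdir -p /home/hadoop/hdfs/namenode /home/hadoop/hdfs/datanode /home/hadoop/tmp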

Copy the Hadoop tree and the environment files to the slave nodes:
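
For example (node names as above):

$ scp -r /home/hadoop/hadoop ~/.bashrc hadoop@slave1:/home/hadoop/
$ scp -r /home/hadoop/hadoop ~/.bashrc hadoop@slave2:/home/hadoop/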

Starting the Hadoop Cluster

Start the distributed file system (DFS):
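
Using the standard Hadoop script (on the PATH configured earlier):

$ start-dfs.sh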

Start the distributed compute layer (YARN):
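
$ start-yarn.sh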

To stop the Hadoop cluster, run:
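
$ stop-yarn.sh
$ stop-dfs.sh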

Checking the Cluster State

Run the jps command on each node and make sure it returns a healthy set of processes.

A successful jps response on the master node:

For detailed monitoring of the cluster state, use the Hadoop web interfaces:

  • 192.168.171.132:50070 — HDFS storage status.
  • 192.168.171.132:8088 — YARN resources and application status.

Conclusion

That is all you need to deploy a basic Hadoop cluster with data replication across three nodes.

This deployment uses Hadoop with the NameNode as a single point of failure. Even though a Secondary NameNode is used, the cluster is not fault-tolerant and should be used only for development or small installations. Large installations require a more complex deployment with highly available NameNodes. We will cover this in future articles.

If you find a mistake, some instructions are unclear, or you have suggestions for improving the article, we would be glad to hear from you. Good luck with Hadoop.

Hadoop is a free, open-source, Java-based software framework used for storing and processing large datasets on clusters of machines. It uses HDFS to store data and MapReduce to process it. It is an ecosystem of Big Data tools that are primarily used for data mining and machine learning. Its four major components are Hadoop Common, HDFS, YARN, and MapReduce.

In this guide, we will explain how to install Apache Hadoop on RHEL/CentOS 8.

Before starting, it is a good idea to disable SELinux on your system.

To disable SELinux, open the /etc/selinux/config file:

Change the following line:
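
The change the article refers to is setting the SELinux mode to disabled:

SELINUX=disabled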

Save the file when you are finished. Next, restart your system to apply the SELinux changes.

Hadoop is written in Java and supports only Java version 8. You can install OpenJDK 8 and ant using the DNF command as shown below:
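
For example:

$ sudo dnf install java-1.8.0-openjdk ant -y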

Once installed, verify the installed version of Java with the following command:

You should get the following output:

It is a good idea to create a separate user to run Hadoop for security reasons.

Run the following command to create a new user with name hadoop:

Next, set the password for this user with the following command:
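
A sketch of both steps:

$ useradd hadoop
$ passwd hadoop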

Provide and confirm the new password as shown below:

Next, you will need to configure passwordless SSH authentication for the local system.

First, change the user to hadoop with the following command:

Next, run the following command to generate Public and Private Key Pairs:

You will be asked to enter the filename. Just press Enter to complete the process:

Next, append the generated public keys from id_rsa.pub to authorized_keys and set proper permission:
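
A sketch of the key setup, run as the hadoop user:

$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 640 ~/.ssh/authorized_keys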

Next, verify the passwordless SSH authentication with the following command:

You will be asked to authenticate hosts by adding RSA keys to known hosts. Type yes and hit Enter to authenticate the localhost:

First, change the user to hadoop with the following command:

Next, download the latest version of Hadoop using the wget command:

Once downloaded, extract the downloaded file:

Next, rename the extracted directory to hadoop:
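
A sketch of the download, extraction and rename; the exact release is not shown above, so the version (3.2.1) and the archive URL are assumptions:

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
$ tar -xzf hadoop-3.2.1.tar.gz
$ mv hadoop-3.2.1 hadoop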

Next, you will need to configure Hadoop and Java Environment Variables on your system.

Next, open the ~/.bashrc file in your favorite text editor:

Append the following lines:
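
A typical set of variables for this kind of single-node setup; the original listing is not reproduced above, and the HADOOP_HOME path assumes the directory created in the previous step:

export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"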

Save and close the file. Then, activate the environment variables with the following command:

Next, open the Hadoop environment variable file:

Update the JAVA_HOME variable as per your Java installation path:
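
For example (the path depends on the installed OpenJDK build; you can check it with readlink -f $(which java)):

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk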

Save and close the file when you are finished.

First, you will need to create the namenode and datanode directories inside the Hadoop home directory:

Run the following command to create both directories:
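
For example (the directory layout is an assumption and must match hdfs-site.xml below):

$ mkdir -p ~/hadoopdata/hdfs/namenode
$ mkdir -p ~/hadoopdata/hdfs/datanode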

Next, edit the core-site.xml file and update with your system hostname:

Change the following name as per your system hostname:
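
A minimal sketch of core-site.xml; the hostname is a placeholder to be replaced with your own:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop.example.com:9000</value>
  </property>
</configuration>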

Save and close the file. Then, edit the hdfs-site.xml file:

Change the NameNode and DataNode directory path as shown below:
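
A minimal sketch, assuming the directories created earlier:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>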

Save and close the file. Then, edit the mapred-site.xml file:

Make the following changes:
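
The key setting here is to run MapReduce on YARN:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>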

Save and close the file. Then, edit the yarn-site.xml file:

Make the following changes:
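
The key setting here is the shuffle auxiliary service:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>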

Save and close the file when you are finished.

Before starting the Hadoop cluster, you will need to format the Namenode as the hadoop user.

Run the following command to format the hadoop Namenode:
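
For example:

$ hdfs namenode -format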

You should get the following output:

After formatting the Namenode, run the following command to start the Hadoop cluster:
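
$ start-dfs.sh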

Once HDFS has started successfully, you should get the following output:

Next, start the YARN service as shown below:
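
$ start-yarn.sh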

You should get the following output:

You can now check the status of all Hadoop services using the jps command:

You should see all the running services in the following output:

Hadoop is now started and listening on ports 9870 and 8088. Next, you will need to allow these ports through the firewall.

Run the following command to allow Hadoop connections through the firewall:
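
For example, as root (or via sudo):

$ firewall-cmd --permanent --add-port=9870/tcp
$ firewall-cmd --permanent --add-port=8088/tcp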

Next, reload the firewalld service to apply the changes:
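
$ firewall-cmd --reload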

At this point, the Hadoop cluster is installed and configured. Next, we will create some directories in the HDFS filesystem to test Hadoop.

Next, run the following command to list the above directory:
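
The directory names used in the original are not shown; as a hypothetical example:

$ hdfs dfs -mkdir /test1
$ hdfs dfs -ls /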

You should get the following output:

You can also verify the above directory in the Hadoop Namenode web interface.

Go to the Namenode web interface and click Utilities => Browse the file system. You should see the directories you created earlier on the following screen:

You can also stop the Hadoop Namenode and YARN services at any time by running the stop-dfs.sh and stop-yarn.sh scripts as the hadoop user.

To stop the Hadoop Namenode service, run the following command as a hadoop user:

To stop the Hadoop Resource Manager service, run the following command:

Conclusion

In the above tutorial, you learned how to set up a Hadoop single-node cluster on CentOS 8. I hope you now have enough knowledge to install Hadoop in a production environment.


By Rahul, January 10, 2015. Updated: June 8, 2017

Hadoop on Linux

Step 1: Installing Java

Java is the primary requirement for setting up Hadoop on any system, so make sure Java is installed using the following command.

Step 2: Creating Hadoop User

We recommend creating a normal (not root) account for running Hadoop. Create a system account using the following command.

After creating the account, you also need to set up key-based ssh for it. Use the following commands to do this.

Step 3. Downloading Hadoop 2.6.5

Now download the Hadoop 2.6.0 source archive using the command below. You can also select an alternate download mirror to increase download speed.
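
For example, from the Apache archive (the mirror URL and target directory are assumptions):

$ cd ~
$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
$ tar -xzf hadoop-2.6.0.tar.gz
$ mv hadoop-2.6.0 hadoop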

Step 4. Configure Hadoop Pseudo-Distributed Mode

4.1. Setup Hadoop Environment Variables

First, we need to set the environment variables used by Hadoop. Edit the

~/.bashrc file and append the following values at the end of the file.

Now apply the changes in the current running environment:

Now edit the $HADOOP_HOME/etc/hadoop/hadoop-env.sh file and set the JAVA_HOME environment variable. Change the Java path as per the installation on your system.

4.2. Edit Configuration Files

Edit core-site.xml

Edit hdfs-site.xml

Edit mapred-site.xml

Edit yarn-site.xml
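
The property listings for these four files are not reproduced above; a minimal pseudo-distributed sketch (paths and values are assumptions, and each entry goes inside a <property> block with <name> and <value> tags) would be along these lines:

core-site.xml:   fs.default.name = hdfs://localhost:9000
hdfs-site.xml:   dfs.replication = 1
                 dfs.name.dir = file:///home/hadoop/hadoopdata/hdfs/namenode
                 dfs.data.dir = file:///home/hadoop/hadoopdata/hdfs/datanode
mapred-site.xml: mapreduce.framework.name = yarn
yarn-site.xml:   yarn.nodemanager.aux-services = mapreduce_shuffle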

4.3. Format Namenode

Now format the namenode using the following command; make sure that the storage directory is correct:

Step 5. Start Hadoop Cluster

Now start your Hadoop cluster using the scripts provided by Hadoop. Navigate to your Hadoop sbin directory and execute the scripts one by one.

Now run start-dfs.sh script.

Now run start-yarn.sh script.

Step 6. Access Hadoop Services in Browser

The Hadoop NameNode starts on port 50070 by default. Access your server on port 50070 in your favorite web browser.

(Screenshot: Hadoop single node NameNode web interface)

Now access port 8088 to get information about the cluster and all applications.

(Screenshot: Hadoop single node applications overview)

Access port 50090 to get details about the secondary namenode.

(Screenshot: Hadoop single node secondary namenode)

Access port 50075 to get details about the DataNode.


Step 7. Test Hadoop Single Node Setup


You can also check this tutorial to run a wordcount MapReduce job example using the command line.


Comments

Dear Mr Rahul
Could you kindly help me, how we can deploy the services of Single Node Cluster to multiple clients in a Lab environment.

Dear Mr. Rahul,
I am very thankful for your installation guide but could not understand how we can
Edit

the ~/.bashrc file to set up the Hadoop Environment Variables, could you kindly help us with more screen shots please

I was stuck up at these area
4.1. Setup Hadoop Environment Variables
4.3 Resolution of host name with IP, where should we set this IP please help me

Regards
Dinakar NK


Hi Dinakar, Simply edit the

~/.bashrc configuration file and copy the settings at the end of the file.


Thanks Ruchira for pointing this. I have updated tutorial accordingly.


Plz provide results of following commands.

$ telnet localhost 22
$ netstat -tulpn | grep 22


It looks OpenSSH server is not running on your system.

The programs included with the Kali GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

I am able to setup the hadoop multi node in ubuntu. but not able to setup the multi node in centos 6.6.
Where can i set the interfaces and hosts and hostname?
Can you please share the video link?

I am not able to follow step 2 because executing passwd hadoop, does ask for password.
I tried hitting just enter but then when I run ssh localhost, it keeps on asking password.

One think I noticed which is also different is the message (shows ssh2 instead of ssh):-
Public key saved to /home/hadoop/.ssh2/id_rsa_2048_a.pub

It just comes back saying -bash: yum: command not found


It looks /usr/bin and /usr/local/bin is not added in PATH environment variable. Please use below command to add it.

I want to change my CentOs code that means I want to add hadoop single node cluster to this and I need to share some other?
How can I do this ??

hdfs file not found

all went well except the last step: step 7:

I cannot fix error. Please help me
Thank you all !

Hey friend,
i am newbie about hadoop i configured hadoop on vagrant ubuntu machine.i wants access hadoop web ui on browser but i unable to do so.i tried changing the core-site.xml file for hadoop ui on browser by my machine ip and different ports for ui like 9000/8020 and 50075,50070 but nothing happens.
plz help so.
Thanks in advance.

have a question, could you please explain me, why have we created a user after installation of java, and how could the password less ssh work for ssh the localhost when i am already in the same system, or am i missing something in here.

Thank you so much for precise instructions which makes it simple and perfect !
Great help 🙂

can i create cluster with two different os (ubuntu and cygwin on windows ) in which hadoop (same version)is installed ?

great article.. works fine

It works on Centos 7 , JDK 8 & Hadoop 2.6
Thanks! a great tutorial.

Thanks. It worked with Centos7 and Hadoop 2.7.1

Thanks. It worked with fedora 22 and Hadoop 2.7.
The only Warning i get is below. I am not sure what it means.

I am trying to run my workflow on a new Yarn cluster via oozie. The job submits fine and as part of the workflow creates a scanner; the scanner is serialised and written to disk. Then during deserialising the string to a scan object I encounter the following error

I googled and checked for all kinds of config errors but all my configurations such as nodename, jobtracker, etc are correctly configured. Also, the google protobuf jar is consistent across all YARN components and my code. Wondering whats going wrong?
-Shashank

Hi,
I need to install Hadoop 2.6.0 multi node cluster with different os configurations. I am already having a master node and one slave node both at Ubuntu 12.04. I want to add one more slave node with CentOS.
I wanted to ask is it fine?

Thanks in advance!

First of all let me say THANK YOU for this tutorial. This is a very big help especially to a person like me who just started learning Hadoop / Bigdata.

2. If I rebooted my machine, do I need to run the start-dfs.sh and start-yarn.sh again?

Hi will you please help me to solve out this


Hi this is really a great post, I followed it and it works! I have a follow-up question: can you post another blog for how to install Spark on this single YARN cluster which can work with the the data on hdfs on this single machine?

Please could you suggest me some solution

export HADOOP_HOME=/home/hadoop/hadoop
I didnt understand the above statement. When you create a user hadoop, a folder will be created in home directory. What the purpose of second hadoop?

cd $HADOOP_HOME/etc/hadoop
There is no hadoop folder in etc directory.
I am confused because etc comes under the supervision of root user rather than hadoop user.

Hi, I am trying to run wordcount example. But it is getting stuck at ACCEPTED state.
It is not going into RUNNING state.
Any help appreciated. I have followed the tutorial exact. But using 2.6.0 instead of 2.4.0

Once it is done, just run `stop-dfs.sh` followed by & `start-dfs.sh`

Dear
How I change command from
-Old 64 bit
-1.7.0.51
-rhel-2.4.5.5.el7-x86_64 u51-b31
-(build 24.51-b03,mixed mode)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.65-2.5.1.2.el7_0.x86_64/jre
What command change dorectory
My VM ware java version
-1.7.0.45
-rhel-2.4.3.3.el6-i386 u45-b15
-build 24.45-b08,mixed mode,sharing
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6_0.i386/jre
please help advisor because I am beginner CENT OS and HODOOP
Best regards

Dear Rahul
How I set again cloud you advisor


Please make sure you have configured JAVA_HOME correctly.

Hi, This one is a great article. I followed many other blogs for this problem. But none of them worked. This one simply worked with no error.

But i have a little problem.
I have installed hbase standalone mode.
Now i want hbase to use hdfs. So in hbase-site.xml file i added this:

But its giving error and not working. Any reason why its not working? I copied the same configuration of yours during my hadoop installation.

Thank for this article.

What could be the problem or what mistake I might have done? Please help.

However this got me a long way further.

Thanks for taking the time to write up this guide, it was very helpful.

Hi Rahul
Thank you for sharing this with us. I followed your steps in installing hadoop 2.4 on Centos Vritual Machine. I have an Hbase running on my mac machine, I get connection refused error when hbase tries to connect.
Here is my setting in core-site.xml

and on Hbase: hbase-site.xml

I can telnet onto the port 54310 from the VM but not from a remote machine, i.e my local macbook which is running the virtual machine. Looks like the port is closed to remote client. I have disabled firewall but it didn’t help.

thanks for the tutorial, why

cant be written?


Please check if hadoop user has proper privileges on this file

Thanks for all the steps. Please update the mapred-site.xml to mapred-site.xml.template.

Also, please update the testing the setup.

Very good article,
Two issues,
First exit; ssh localhost will not work for public/private key
Should be ssh localhost; exit


We have updated article accordingly.

Rahul, wonderful article.

Need to know few things and appreciate your feedback on this;

1. Used RHAT 6.3 with Java 1.7/Hadoop 2.6.0
2. Able to run the Name Node and Data Node

you can check your iptables and allow that port (54310, 9000, 50070). i try it and it works well.

Great article but here is a script that also install hbase, hdfs, and a number of other resources

When are you publishing next part of this article? I loved this and I am waiting to see how will you test your setup by running some example map reduce job.


By Rahul, November 12, 2015. Updated: April 3, 2019

Apache Hadoop 3.1 has noticeable improvements and many bug fixes over the previous stable 3.0 releases. This version has many improvements in HDFS and MapReduce. This how-to guide will help you to set up a Hadoop 3.1.0 Single-Node Cluster on CentOS/RHEL 7/6 and Fedora 29/28/27 systems. This article has been tested with CentOS 7.

This tutorial is for configuring a Hadoop Single-Node Cluster. You may be interested in Hadoop Multi-Node Cluster Setup on Linux systems.

Setup Hadoop on Linux

1. Prerequisites

2. Create Hadoop User

We recommend creating a normal (not root) account for running Hadoop. Create an account using the following command.

After creating the account, you also need to set up key-based ssh for it. Use the following commands to do this.

3. Download Hadoop 3.1 Archive

In this step, download the Hadoop 3.1 source archive using the command below. You can also select an alternate download mirror to increase download speed.

4. Setup Hadoop Pseudo-Distributed Mode

4.1. Setup Hadoop Environment Variables

First, we need to set the environment variables used by Hadoop. Edit the

~/.bashrc file and append the following values at the end of the file.

Now apply the changes in the current running environment:

Now edit the $HADOOP_HOME/etc/hadoop/hadoop-env.sh file and set the JAVA_HOME environment variable. Change the Java path as per the installation on your system. This path may vary depending on your operating system version and installation source, so make sure you are using the correct path.

4.2. Setup Hadoop Configuration Files

Edit core-site.xml

Edit hdfs-site.xml

Edit mapred-site.xml

Edit yarn-site.xml

4.3. Format Namenode

Now format the namenode using the following command; make sure that the storage directory is correct:

5. Start Hadoop Cluster

Now run start-dfs.sh script.

Now run start-yarn.sh script.

6. Access Hadoop Services in Browser

The Hadoop NameNode starts on port 9870 by default. Access your server on port 9870 in your favorite web browser.


Now access port 8042 to get information about the cluster and all applications.


Access port 9864 to get details about your Hadoop node.


7. Test Hadoop Single Node Setup

7.1. Make the required HDFS directories using the following commands.
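
For example, the standard per-user HDFS layout:

$ hdfs dfs -mkdir /user
$ hdfs dfs -mkdir /user/hadoop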

7.3. Browse the Hadoop distributed file system by opening the URL below in your browser. You will see an apache2 folder in the list. Click on the folder name to open it, and you will find all the log files there.


You can also check this tutorial to run a wordcount MapReduce job example using the command line.


Comments

Out of so many i tried this one worked .. thanks a ton ..

i m getting this error : Error: Cannot find configuration directory: /etc/hadoop . any ideas ?

Starting resourcemanager
Starting nodemanagers

Thanks. This post is helpful in the datanode and namenode configuration which is missing in the guide of hadoop.

I have received an error
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
Could you please help out me

!
How can I deal this under the error ?

WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
ERROR: Invalid HADOOP_YARN_HOME

bin/hdfs dfs -mkdir /user
When I run this, it says
-su: bin/hdfs: No such file or directory
Please help

For the steps showing here, does it work on Red Hat Linux 7.5?
I have tried. But not able to make it work. Here is the error:

Thank you in advance.



Hi Hema, Thanks for pointing out.

This tutorial has been updated to the latest version of Hadoop installation.

Regards
Dinakar N K

The information given is incomplete need to export more variables.

This is just awesome. I have had some issues during installation but your guide to installing a hadoop cluster is just great !!
I still used the vi editor for editing the .bashrc file and all other xml files. Just the lack of linux editor knowledge I did it. I used the 127.0.0.1 instead of localhost as the alias names are not getting resolved.

Do you have some idea ?


$ sudo apt-get install openssh-client

hi rahul i followed your setup here but unfortunately i ended up some errors like the one below.

Can anybody can tell me where is bin/hdfs in Step 7 please ?

This can be misleading. What it is saying is from the bin folder within the hadoop system folder.

$ cd $HADOOP_HOME/bin
$ hdfs dfs -mkdir /user
$ hdfs dfs -mkdir/user/hadoop

bash: /home/hadoop/.bashrc: Permission denied

I guess that the CentOS 7/Hadoop 2.7.1 tutorial was very helpful until step 4, when by some reason the instructions just explained what to do with .bashrc without explaining how to get to it and how to edit it in first place. Thanks anyway, I just need to find a tutorial that explains with detail how to set up Hadoop.

Thanks. I read your article have successfully deployed my first single node hadoop deployment despite series of unsuccessful attempts in the past. thanks

Hi Rahul,
Please help me! I installed follow your guide. when i run jps that result below:
18118 Jps
18068 TaskTracker
17948 JobTracker
17861 SecondaryNameNode
17746 DataNode

however when I run stop-all.sh command that
no jobtracker to stop
localhost: no tasktracker to stop
no namenode to stop
localhost: no datanode to stop
localhost: no secondarynamenode to stop
Can you explan for me? Thanks so much!

Can u plz tell me how to configure multiple datanodes on a single machine.I am using hadoop2.5

I am not able to start the hadoop services..getting an error like>

please help me to overcome this error.

Thanks a lot.
I made a few modifications but the the instructions are on the money!

Your artcile is simply super, I followed each and every step and installed hadoop.

I guess.. these export JAVA_HOME also need to include in some where else.

Thansk alot for your article buddy


Plz help me for following case
[FATAL ERROR] core-site.xml:10:2: The markup in the document following the root element must be well-formed.


Please provide your core-site.xml file content

localhost: [fatal Error] core-site.xml:10:2: The markup in the document following the root element must be well-formed.


Please make sure you have setup JAVA_HOME environment variable. Please provide output of following commands.

Hi Rahul,
I have successfully done the installation of single node cluster and able to see all daemons running.
But i am not able to run hadoop fs commands , for this should i install any thing else like jars??

How do we add data to the single node cluster.

Do you know what is the best site to download Pig and hive? I realized that I am unable to run and pig and hive. I thought it comes with the package just like while setting-up under cloudera.

Hi Rahul,
I just restarted the hadoop and JPS is working fine.

Hi Rahul,
jps is not showing under $JAVA_HOME/bin

It comes out with an error no such file or directory

Also, once I complete my tasks and comes out of linux, do I need to restart the hadoop?


Hi Raj,
Try following command for jps.

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
RHEL


Your systems hosts file entry looks incorrect. Please add entry like below

Hi Rahul!
Just to say your instructions worked like a dream.
In my hadoop-env.sh I used.
export JAVA_HOME=/usr/lib/jvm/jre-1.6.0 <- might help others its a vanilla Centos 6.5 install.
Cheers Paul

Thanks alot buddy 🙂

Keep doing such good works 🙂


i follow all your guide step by steps but encounter some problems while executing
this command:

Fixed these errors. Checked the logs and got the help from websites.

-bash-3.2$ jps
21984 NameNode
27080 DataNode
1638 ResourceManager
1929 NodeManager
5718 JobHistoryServer
6278 Jps

-bash-3.2$ jps
21984 NameNode
27080 DataNode
11037 Jps
1638 ResourceManager
1929 NodeManager

Where can I see the information as to why this stopped? Can you please suggest. Sorry for multiple posts. But I had to update and dont seem to find any help googling.
Thanks

Any ideas? I will test before I shut and restart
Thanks.

Hi
I downloaded and installed Apache hadoop 2.2 latest. Followed the above setup for single node ( First time setup.) RHEL 5.5
Name node, DataNode, ResourceManager, NodeManager started fine. Had some issue with datanode & had to update the IPtables for opening ports.
when I run
-bash-3.2$ sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /hadoop/hadoop-2.2.0/logs/mapred-hduser-historyserver-server.out
when I run jps, I dont see the JobHistoryServer listed . There are no errors in the out file above.

Can someone please assist?
Thanks
JKH

Hello Rahul,
I need your help , my cluster does not work because it must have something wrong in the configuration .

because when I put start-all.sh it does not initialize the package secondarynamenode .

shows this error

starting secondarynamenode , logging to / opt/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-lbad012.out

lbad012 : at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress ( NameNode.java : 212 )

lbad012 : at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress ( NameNode.java : 244 )

lbad012 : at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress ( NameNode.java : 236 )

lbad012 : at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize ( SecondaryNameNode.java : 194 )

lbad012 : at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode . ( SecondaryNameNode.java : 150 )

lbad012 : at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main ( SecondaryNameNode.java : 676 )

heyy
please help.
wen i am doing the step 6 its asking ffor some password even though i havent set any password.please tell wat to do.


and my hue_safety_valve.ini looks as below

[hadoop]
[[mapred_clusters]]
[[[default]]]
jobtracker_host=servername
thrift_port=9290
jobtracker_port=8021
submit_to=True
hadoop_mapred_home=>
hadoop_bin=>
hadoop_conf_dir=>
security_enabled=false

Can you please suggest the steps to uninstall the apache hadoop . I am planning to test cloudera as well , if you can share steps for cloudera as well that would be awesome ! .


Are you configuring hadoop multinode cluster ?

I have found solutions for my issues, thanks rahul for your post it helped me a lot . .


Core xml file for reference


Hi Rahul,
sorry for my delay response.
I was able to resolve the above error ,By mistake i deleted the wrapper in the xml which caused the error , I have now kept the right data in xml and found below output after format.

16467 Resource Manager
15966 NameNode
16960 Jps
16255 SecondaryNameNode

Would appreciate any comment from your side on my queries here.


Yes, Please post the output of start-all.sh command with log files. But first plz empty your log files and them run start-all.sh after that post all outputs.

Also I prefer, if you post your question on our new forum, so it will be better to communicate.

Here is the output of start-all.sh script.

hello..
i also faced the same issue as mr Rakesh
while executing the format command


Install Apache Hadoop on CentOS 8

Step 1. First, let's start by making sure your system is up to date.

Step 2. Installing Java.

Apache Hadoop is written in Java and supports only Java version 8. You can install OpenJDK 8 with the following command:

Check the Java version:

Step 3. Installing Apache Hadoop on CentOS 8.

It is recommended to create a regular user to run Apache Hadoop; create the user with the following command:

Next, we need to configure passwordless SSH authentication for the local system:

Verify the passwordless ssh configuration with the command:

Next, download the latest stable version of Apache Hadoop. At the time of writing this article, it is version 3.2.1:
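
A sketch of the download and unpacking (the archive URL and target directory are assumptions):

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
$ tar -xzf hadoop-3.2.1.tar.gz
$ mv hadoop-3.2.1 hadoop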

Then you will need to configure the Hadoop and Java environment variables on your system:

Now activate the environment variables with the following command:

Then open the Hadoop environment variable file:

Hadoop has many configuration files, which need to be adjusted to the requirements of your Hadoop infrastructure. Let's start with a basic single-node Hadoop cluster configuration:

Edit hdfs-site.xml:

Edit mapred-site.xml:

Now format the namenode with the following command; do not forget to check the storage directory:

Start the NameNode and DataNode daemons using the scripts provided by Hadoop:

Step 4. Configure the firewall.

Run the following command to allow Apache Hadoop connections through the firewall:

Step 5. Access Apache Hadoop.

Congratulations! You have successfully installed Apache Hadoop. Thank you for using this guide to install Hadoop on a CentOS 8 system. For additional help or useful information, we recommend visiting the official Apache Hadoop website.
