Permission issue in starting Hadoop. #4 (Open)

Jeeva-Ganesan opened this issue Jan 4, 2018 · 2 comments

@Jeeva-Ganesan

Hi, I tried to use this Makefile. Everything installed fine and the keys were added successfully, but when I try to start Hadoop I get this error:

/home/ubuntu/Workspace/hadoop-spark-hive/tools/hadoop-2.8.1/sbin/start-dfs.sh
Starting namenodes on [localhost]
localhost: Permission denied (publickey).
localhost: Permission denied (publickey).
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Permission denied (publickey).

Can you please assist with this?

Thanks.
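A generic OpenSSH check (not part of the original report) that usually shows why publickey authentication is rejected:

# verbose client output lists which keys are offered and why they are refused
ssh -v localhost exit

# confirm an agent is running and actually holds a key
ssh-add -l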

@earthquakesan (Owner)

Hi @Jeeva-Ganesan!

It seems like the keys were not added properly; check the "configure_hadoop" make target. You should be able to run:

ssh localhost
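If that still fails, the key setup from the "configure_hadoop" target can be re-run by hand. A minimal sketch, assuming the default ~/.ssh/id_rsa path the Makefile uses:

# regenerate the key pair and authorize it for password-less logins to localhost
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0700 ~/.ssh                    # sshd refuses keys if these permissions are too open
chmod 0600 ~/.ssh/authorized_keys
ssh-add
ssh localhost                        # should now log in without a password prompt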

P.S. There is another Hive setup in Docker, which I have created. You can find it here.

@anomaly122 commented Feb 26, 2018

I have the same problem!
My solution follows (Ubuntu):

Step by step:

  1. sudo su

  2. mkdir -p ~/Workspace/hadoop-spark-hive && cd ~/Workspace/hadoop-spark-hive

  3. sudo apt-get install git

  4. git clone https://github.com/earthquakesan/hdfs-spark-hive-dev-setup ./
     After this you can see the Makefile.

  5. Before running "make download", you must change the download URLs in the Makefile (the original URLs no longer work).

  6. Also change spark_home (the name of the downloaded directory changed); a one-line sed for this is sketched after this list.
     • old:    spark_home := $(addsuffix tools/spark-2.0.0-bin, $(current_dir))
     • change: spark_home := $(addsuffix tools/spark-2.0.0, $(current_dir))

  7. make download

  8. After the download succeeds, run "make configure".
     Before that, you must set up the SSH command and the Java environment (steps 9-18 below).
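The spark_home change from step 6 can be applied with a one-liner; a sketch that only rewrites the value quoted above:

# point spark_home at the directory the new tarball actually extracts to
sed -i 's#tools/spark-2.0.0-bin#tools/spark-2.0.0#' Makefile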

Install Java and set JAVA_HOME:

9) apt-get install openjdk-8-jdk
10) which javac
    /usr/bin/javac
11) readlink -f /usr/bin/javac
    /usr/lib/jvm/java-8-openjdk-amd64/bin/javac   <- this is the real path
12) gedit ~/.bashrc
13) add: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
14) source ~/.bashrc
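A quick sanity check (not part of the original steps) that JAVA_HOME is picked up before continuing:

# both commands should report the OpenJDK 8 install configured above
echo "$JAVA_HOME"
"$JAVA_HOME/bin/java" -version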

Then add the following to your profile and source it (otherwise "ssh-add" fails because no agent is running):

15) gedit ~/.profile
16) add the following command:
    if ! pgrep -U "$(whoami)" -x ssh-agent > /dev/null; then ssh-agent -s > ~/.ssh-agent.sh; fi
17) source ~/.profile
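Note that the command in step 16 only writes the agent variables to ~/.ssh-agent.sh; before "ssh-add" can reach the agent, that file has to be loaded into the current shell, for example:

# load SSH_AUTH_SOCK / SSH_AGENT_PID written by "ssh-agent -s"
. ~/.ssh-agent.sh
ssh-add        # should now succeed
ssh-add -l     # lists the loaded key(s)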

Then change the ssh key type from "dsa" to "rsa":

18) Edit the Makefile so the configure_hadoop target uses:

  • ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  • cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • chmod 0600 ~/.ssh/authorized_keys
  • ssh-add
  19. Run "make configure".
      If "make configure" fails, remove the ssh files and the Hadoop settings it has already written, then try again:
      • gedit ~/Workspace/hadoop-spark-hive/tools/hadoop-2.7.2/etc/hadoop/hdfs-site.xml
        (remove the added <property> entries, but do not remove the <configuration> tags)
      • gedit ~/Workspace/hadoop-spark-hive/tools/hadoop-2.7.2/etc/hadoop/core-site.xml
        (remove the added <property> entries, but do not remove the <configuration> tags)
      • rm -rf ~/.ssh

  20. Test with "ssh localhost" whether it asks for a password.
      Success: it does not ask for a password or show a warning message. (A scripted check is sketched after this list.)

  21. Run "make start_hadoop"

  22. Open a browser at "http://localhost:50070"
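The checks from steps 20-22 can be scripted roughly like this (a sketch; 50070 is the default NameNode web UI port in Hadoop 2.x):

# fails instead of prompting if key-based login is not set up
ssh -o BatchMode=yes localhost 'echo ssh ok'

# after "make start_hadoop", the NameNode web UI should return 200 on port 50070
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:50070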

....

Success!!! >0<

Modified Makefile

  • Copy and use it!

mkfile_path := $(abspath $(lastword $(MAKEFILE_LIST)))
current_dir := $(dir $(mkfile_path))
hive_home := $(addsuffix tools/apache-hive-2.1.0-bin, $(current_dir))
hadoop_home := $(addsuffix tools/hadoop-2.7.2, $(current_dir))
spark_home := $(addsuffix tools/spark-2.0.0, $(current_dir))

download: download_hadoop download_spark download_hive

download_hadoop:
	mkdir -p ${current_dir}tools
	cd ${current_dir}tools; wget --no-check-certificate https://archive.apache.org/dist/hadoop/core/hadoop-2.7.2/hadoop-2.7.2.tar.gz && tar -xvf hadoop-2.7.2.tar.gz && rm -rf hadoop-2.7.2.tar.gz

download_spark:
	mkdir -p ${current_dir}tools
	cd ${current_dir}tools; wget --no-check-certificate https://archive.apache.org/dist/spark/spark-2.0.0/spark-2.0.0.tgz && tar -xvf spark-2.0.0.tgz && rm -rf spark-2.0.0.tgz

download_hive:
	mkdir -p ${current_dir}tools
	cd ${current_dir}tools; wget --no-check-certificate https://archive.apache.org/dist/hive/hive-2.1.0/apache-hive-2.1.0-bin.tar.gz && tar -xvf apache-hive-2.1.0-bin.tar.gz && rm -rf apache-hive-2.1.0-bin.tar.gz

configure: configure_hadoop configure_spark

configure_hadoop:
	#install Ubuntu dependencies
	sudo apt-get install -y ssh rsync
	#Set JAVA_HOME explicitly
	sed -i "s#.*export JAVA_HOME.*#export JAVA_HOME=${JAVA_HOME}#g" ${hadoop_home}/etc/hadoop/hadoop-env.sh
	#Set HADOOP_CONF_DIR explicitly
	sed -i "s#.*export HADOOP_CONF_DIR.*#export HADOOP_CONF_DIR=${hadoop_home}/etc/hadoop#" ${hadoop_home}/etc/hadoop/hadoop-env.sh
	#define fs.default.name in core-site.xml
	sed -i '/<\/configuration>/i <property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>' ${hadoop_home}/etc/hadoop/core-site.xml
	sed -i '/<\/configuration>/i <property><name>hadoop.tmp.dir</name><value>file://${current_dir}data/hadoop-tmp</value></property>' ${hadoop_home}/etc/hadoop/core-site.xml
	#set dfs.replication and dfs.namenode.name.dir
	mkdir -p ${current_dir}data/hadoop
	sed -i '/<\/configuration>/i <property><name>dfs.replication</name><value>1</value></property>' ${hadoop_home}/etc/hadoop/hdfs-site.xml
	sed -i '/<\/configuration>/i <property><name>dfs.namenode.name.dir</name><value>file://${current_dir}data/hadoop</value></property>' ${hadoop_home}/etc/hadoop/hdfs-site.xml
	${hadoop_home}/bin/hdfs namenode -format
	ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
	cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
	chmod 0600 ~/.ssh/authorized_keys
	ssh-add

start_hadoop:
	${hadoop_home}/sbin/start-dfs.sh
stop_hadoop:
	${hadoop_home}/sbin/stop-dfs.sh

configure_spark:
	# Change logging level from INFO to WARN
	cp ${spark_home}/conf/log4j.properties.template ${spark_home}/conf/log4j.properties
	sed -i "s#log4j.rootCategory=INFO, console#log4j.rootCategory=WARN, console#g" ${spark_home}/conf/log4j.properties
	# Set up Spark environment variables
	echo 'export SPARK_LOCAL_IP=127.0.0.1' >> ${spark_home}/conf/spark-env.sh
	echo 'export HADOOP_CONF_DIR="${hadoop_home}/etc/hadoop"' >> ${spark_home}/conf/spark-env.sh
	echo 'export SPARK_DIST_CLASSPATH="$(shell ${hadoop_home}/bin/hadoop classpath)"' >> ${spark_home}/conf/spark-env.sh
	echo 'export SPARK_MASTER_IP=127.0.0.1' >> ${spark_home}/conf/spark-env.sh
	mkdir -p ${current_dir}data/spark-rdd
	echo 'export SPARK_LOCAL_DIRS=${current_dir}data/spark-rdd' >> ${spark_home}/conf/spark-env.sh

start_spark:
	${spark_home}/sbin/start-all.sh
stop_spark:
	${spark_home}/sbin/stop-all.sh

configure_hive:
	echo "Installing JDBC for Java 8. If you use other Java version see: https://jdbc.postgresql.org/download.html#current"
	wget https://jdbc.postgresql.org/download/postgresql-9.4.1209.jar
	mv postgresql-9.4.1209.jar ${hive_home}/lib/
	#enable JDBC connection
	echo '<?xml version="1.0"?>' >> ${hive_home}/conf/hive-site.xml
	echo '<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>' >> ${hive_home}/conf/hive-site.xml
	echo '<configuration>' >> ${hive_home}/conf/hive-site.xml
	#echo '<property><name>javax.jdo.option.ConnectionURL</name><value>jdbc:derby:;databaseName=${current_dir}metastore_db;create=true</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '<property><name>javax.jdo.option.ConnectionURL</name><value>jdbc:postgresql://localhost/metastore</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '<property><name>javax.jdo.option.ConnectionDriverName</name><value>org.postgresql.Driver</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '<property><name>javax.jdo.option.ConnectionUserName</name><value>hive</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '<property><name>javax.jdo.option.ConnectionPassword</name><value>hive</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '<property><name>datanucleus.autoCreateSchema</name><value>false</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '<property><name>hive.metastore.uris</name><value>thrift://localhost:9083</value></property>' >> ${hive_home}/conf/hive-site.xml
	echo '</configuration>' >> ${hive_home}/conf/hive-site.xml
	#Copy hive-site.xml to Spark -- necessary to run Spark apps with configured metastore
	cp ${hive_home}/conf/hive-site.xml ${spark_home}/conf/
	#export environment variables
	echo 'export HADOOP_HOME="${hadoop_home}"' >> ${hive_home}/conf/hive-env.sh
	echo 'export HIVE_HOME="${hive_home}"' >> ${hive_home}/conf/hive-env.sh
	#Create hdfs folders
	${hadoop_home}/bin/hadoop fs -mkdir -p /tmp
	${hadoop_home}/bin/hadoop fs -mkdir -p /user/hive/warehouse
	${hadoop_home}/bin/hadoop fs -chmod g+w /tmp
	${hadoop_home}/bin/hadoop fs -chmod g+w /user/hive/warehouse

start_hive:
	${hive_home}/bin/hive
start_hive_server:
	${hive_home}/bin/hiveserver2 --hiveconf hive.server2.enable.doAs=false
start_hive_beeline_client:
	${hive_home}/bin/beeline -u jdbc:hive2://localhost:10000
start_hive_postgres_metastore:
	echo "Starting postgres docker container"
	docker run -d --name hive-metastore -p 5432:5432 earthquakesan/hive-metastore-postgresql:2.1.0
	sleep 5;
	echo "Running Hive Metastore service"
	${hive_home}/bin/hive --service metastore

pyspark:
	IPYTHON=1 ${spark_home}/bin/pyspark
spark_shell:
	${spark_home}/bin/spark-shell

activate:
	echo "export PATH=${PATH}:${spark_home}/bin:${hadoop_home}/bin:${hive_home}/bin" >> activate
	chmod a+x activate
	echo "Run the following command in your terminal:"
	echo "source activate"
