Spark/Hadoop Cluster
= Getting Started =
This assumes the spark/hadoop cluster was configured in a particular way. You can see the general configuration from the Foreman page, but in general, Spark was installed to /usr/local/spark and Hadoop was installed to /usr/local/hadoop.
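For reference, a minimal sketch of the shell environment that layout implies. The variable names below are the conventional ones, not taken from the Foreman configuration, so treat them as an assumption:

<pre>
# Assumed environment, based on the install paths above
export SPARK_HOME=/usr/local/spark
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$SPARK_HOME/bin:$HADOOP_HOME/bin
</pre>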
= Passwordless SSH from Master =
To allow the spark master user to ssh to itself (for a local worker) and also to the workers, passwordless ssh must be enabled. This can be done by logging into the spark user on the master server and running:
<pre>ssh-keygen -t rsa -P ""</pre>

Once the key has been generated, it will be in /home/spark/.ssh/id_rsa (by default). Copy the public key to the authorized_keys file (to allow spark to ssh to itself):

<pre>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys</pre>

Or, for each worker, do something like:
<pre>ssh-copy-id -i ~/.ssh/id_rsa.pub spark@localhost
ssh-copy-id -i ~/.ssh/id_rsa.pub spark@spark2.lab.bpopp.net</pre>
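Before starting the cluster, it is worth confirming that passwordless ssh actually works. A quick check against the hosts above (each command should print the remote hostname without prompting for a password):

<pre>
ssh spark@localhost hostname
ssh spark@spark2.lab.bpopp.net hostname
</pre>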
= Starting Spark =
<pre>
su spark
cd /usr/local/spark/sbin
./start-all.sh
</pre>
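To verify the cluster came up, jps should show a Master process on this host (and a Worker on each worker node), and the master web UI answers on Spark's default port of 8080 unless it was overridden in this cluster's config:

<pre>
jps                                   # expect Master (and Worker) processes
curl -s http://localhost:8080 | head  # Spark master web UI, default port
</pre>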
= Starting Hadoop =
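A minimal sketch, by analogy with the Spark steps above and assuming a standard Hadoop layout under /usr/local/hadoop. The start-dfs.sh and start-yarn.sh scripts are Hadoop's stock startup scripts; the dedicated hadoop user is an assumption, and whether HDFS needs formatting first depends on the install:

<pre>
su hadoop                 # assumes a dedicated hadoop user, like the spark user above
cd /usr/local/hadoop/sbin
./start-dfs.sh            # starts the NameNode and DataNodes
./start-yarn.sh           # starts the ResourceManager and NodeManagers
</pre>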