Installing Apache Spark via Puppet: Difference between revisions

No edit summary
No edit summary
Line 37: Line 37:
   include spark::frontend
   include spark::frontend
}</pre>
}</pre>
== Spark Config ==
/usr/local/spark/conf/slaves
<pre>
# A Spark Worker will be started on each of the machines listed below.
spark1
spark2
spark3
#spark4
</pre>

Revision as of 04:18, 9 January 2020

Install Java Module into Puppet

/etc/puppetlabs/code/environments/production$ sudo /opt/puppetlabs/bin/puppet module install puppetlabs/java

Install Spark Module to /etc/puppetlabs/code/environments/production/manifests/spark.pp Note that this hard codes server names. Not ideal, but it's a starting point.

$master_hostname='spark-master.bpopp.net'

class{'hadoop':
  realm         => '',
  hdfs_hostname => $master_hostname,
  slaves        => ['spark1.bpopp.net', 'spark2.bpopp.net'],
}

class{'spark':
  master_hostname        => $master_hostname,
  hdfs_hostname          => $master_hostname,
  historyserver_hostname => $master_hostname,
  yarn_enable            => false,
}

node 'spark-master.bpopp.net' {
  include spark::master
  include spark::historyserver
  include hadoop::namenode
  include spark::hdfs
}

node /spark(1|2).bpopp.net/ {
  include spark::worker
  include hadoop::datanode
}

node 'client.bpopp.net' {
  include hadoop::frontend
  include spark::frontend
}


Spark Config

/usr/local/spark/conf/slaves

# A Spark Worker will be started on each of the machines listed below.
spark1
spark2
spark3
#spark4