Spark on YARN - datacouch-io/spark-java GitHub Wiki
Understanding the difference between YARN Client and Cluster Modes
Objective: To test the consequences of making changes to container sizes, Maximum Application settings, and queue states; configure application preemption and create a set of queues and leaf queues which logically represent the organizations you support and their SLAs
- Open a terminal.
- Change to the ~/spark3/examples/jars directory.
cd spark3/examples/jars
- Open a second terminal window.
- Once again, change to the ~/spark3/examples/jars directory.
cd ~/spark3/examples/jars
- Position the two terminal windows side by side so that both are visible on your screen.
- Import the sherlock.txt data into HDFS.
hdfs dfs -mkdir data/
hdfs dfs -put ~/data/sherlock.txt data/
- Type the Spark submit command for wordcount in both terminal windows, but do not press Enter to execute it yet.
Terminal One
spark3-submit --class org.apache.spark.examples.JavaWordCount --master yarn --deploy-mode client spark-examples_2.12-3.1.2.jar data/sherlock.txt
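The job reads data/sherlock.txt from HDFS, so the import step has to succeed first. Run a second time, `hdfs dfs -mkdir data/` fails because the directory already exists, and `hdfs dfs -put` fails because the file is already there. A minimal sketch of a re-runnable variant follows; the function name `import_to_hdfs` is hypothetical and not part of the lab, and it assumes the `hdfs` CLI is on the PATH.

```shell
# Hypothetical helper (not part of the lab): re-runnable HDFS import.
# $1 = local source file, $2 = HDFS destination directory
import_to_hdfs() {
    # -p creates missing parent directories and does not fail if the directory exists
    hdfs dfs -mkdir -p "$2" || return 1
    # -f overwrites the destination file if it is already present
    hdfs dfs -put -f "$1" "$2" || return 1
    # list the directory so the upload can be confirmed by eye
    hdfs dfs -ls "$2"
}

# Usage in the lab environment:
#   import_to_hdfs ~/data/sherlock.txt data/
```

A relative path such as data/ resolves under the user's HDFS home directory, so the listing should show the file there.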
List Application
- While the application is running, open another terminal window and execute the command below:
yarn application -list
To view the logs of the application, execute the command below:
yarn logs -applicationId {your application id}
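The `{your application id}` placeholder can be filled in without copying the ID by hand. YARN application IDs have the form `application_<clusterTimestamp>_<sequenceNumber>`, so they can be picked out of the list output with a pattern match. The sketch below is one possible way to do that; `extract_app_ids` is a hypothetical helper name, and the usage lines assume a running cluster.

```shell
# Hypothetical helper: pull YARN application IDs out of any text on stdin.
# IDs look like application_<clusterTimestamp>_<sequenceNumber>.
extract_app_ids() {
    # -o prints only the matching part of each line; sort -u drops duplicates
    grep -oE 'application_[0-9]+_[0-9]+' | sort -u
}

# Usage in the lab environment (requires a running cluster):
#   app_id=$(yarn application -list -appStates RUNNING 2>/dev/null | extract_app_ids | head -n 1)
#   yarn logs -applicationId "$app_id"
```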
Kill an Application
- The command below kills a running application:
yarn application -kill {your application id}
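Because killing an application cannot be undone, it can help to check that the argument looks like an application ID before passing it to YARN. The following is a sketch only; `kill_app` is a hypothetical wrapper, not a YARN command, and it assumes the `yarn` CLI is on the PATH.

```shell
# Hypothetical wrapper: call `yarn application -kill` only when the argument
# has the shape application_<clusterTimestamp>_<sequenceNumber>.
kill_app() {
    if ! printf '%s\n' "$1" | grep -Eq '^application_[0-9]+_[0-9]+$'; then
        echo "not a YARN application ID: $1" >&2
        return 2
    fi
    yarn application -kill "$1"
}

# Usage in the lab environment:
#   kill_app {your application id}
```

The check only validates the format of the ID; whether the application exists and is still running is left to YARN.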
Exploring the YARN Cluster
- Open the browser and connect to Cloudera Manager (CM) at the URL http://localhost:7180
- Log in to the CM UI using the username admin and the password admin.
- Click Services in the CM UI.
- Select the YARN service on the Services page.
- Click Web UI and select ResourceManager UI.
The ResourceManager UI Web interface opens in another browser tab.
NOTE: If simply clicking the quick link fails to open the ResourceManager UI, replace the default URL in the browser tab with datacouch.training.io:8088/cluster
- This page will be refreshed in a moment once applications are running. Leave this tab open.
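The information shown in the ResourceManager UI is also exposed through the ResourceManager REST API under `/ws/v1/cluster/apps`, which can be convenient when working from a terminal. A minimal sketch, assuming the ResourceManager is reachable at the host and port given in the note above and that `curl` is installed; `rm_apps_url` is a hypothetical helper name.

```shell
# Hypothetical helper: build the ResourceManager REST URL that lists
# applications in a given state.
# $1 = ResourceManager host:port, $2 = application state (for example RUNNING)
rm_apps_url() {
    echo "http://$1/ws/v1/cluster/apps?states=$2"
}

# Usage in the lab environment (assumes curl and a reachable ResourceManager):
#   curl -s "$(rm_apps_url datacouch.training.io:8088 RUNNING)"
```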
Terminal Two
spark3-submit --class org.apache.spark.examples.JavaWordCount --master yarn --deploy-mode cluster spark-examples_2.12-3.1.2.jar data/sherlock.txt
Open the ResourceManager UI web interface in another browser tab and click the application ID.
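The two commands in this lab differ only in the value of `--deploy-mode`. In client mode the Spark driver runs inside the spark3-submit process, so the word counts print directly in Terminal One. In cluster mode the driver runs inside the YARN ApplicationMaster, so its output lands in the container logs, which is why the application ID is needed to find it. The sketch below prints both commands from one template to make the difference visible; `wordcount_cmd` is a hypothetical helper and only prints the command, it does not submit anything.

```shell
# Hypothetical helper: print the lab's wordcount command for a given deploy mode.
# $1 = deploy mode, either client or cluster
wordcount_cmd() {
    case "$1" in
        client|cluster) ;;
        *) echo "deploy mode must be client or cluster" >&2; return 2 ;;
    esac
    echo "spark3-submit --class org.apache.spark.examples.JavaWordCount" \
         "--master yarn --deploy-mode $1" \
         "spark-examples_2.12-3.1.2.jar data/sherlock.txt"
}

wordcount_cmd client    # driver runs in the local terminal
wordcount_cmd cluster   # driver runs in the ApplicationMaster on YARN
```

For the cluster-mode run, the driver output can be retrieved with `yarn logs -applicationId {your application id}` once the ID is known.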