Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_0366"></a><a name="mrs_01_0366"></a>
<h1 class="topictitle1">Getting Started with Spark</h1>
<div id="body1589421619606"><p id="mrs_01_0366__af103364aa976495d8b047395830ae973">This section describes how to use Spark to submit a SparkPi job. SparkPi, a typical Spark sample job, estimates the value of pi (π).</p>
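<p>To make the computation concrete, the following is a minimal plain-Python sketch of the Monte Carlo estimate that the upstream SparkPi example performs. It does not use Spark. The function name <strong>estimate_pi</strong>, the per-slice sample count, and the seed parameter are illustrative assumptions, not part of MRS or of the sample package.</p>

```python
import random
from typing import Optional


def estimate_pi(slices: int = 2,
                samples_per_slice: int = 100_000,
                seed: Optional[int] = None) -> float:
    """Estimate pi by sampling random points in the square [-1, 1] x [-1, 1].

    The fraction of points that land inside the unit circle approaches pi / 4.
    'slices' stands in for the partition count passed to SparkPi; here it only
    scales the total number of samples, because nothing runs in parallel.
    """
    rng = random.Random(seed)
    total = slices * samples_per_slice
    inside = 0
    for _ in range(total):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / total


if __name__ == "__main__":
    # 10 mirrors the argument used for the job in this section.
    print(f"Pi is roughly {estimate_pi(slices=10)}")
```

<p>Because the estimate is random, the printed value differs slightly between runs. More samples give a tighter estimate.</p>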
<div class="section" id="mrs_01_0366__sd2a57331ab704780bbe437bd73c7c8de"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_0366__oa18a68d0bcc74b30b9a9a9f2137b6ab4"><li id="mrs_01_0366__l73a12356415f4048981acfcdd6579007"><a name="mrs_01_0366__l73a12356415f4048981acfcdd6579007"></a><a name="l73a12356415f4048981acfcdd6579007"></a><span>Prepare the SparkPi program.</span><p><p id="mrs_01_0366__a939d4b45686245ba82388b6d86cf0aea">Multiple open-source Spark sample programs are provided, including SparkPi. Click <a href="https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz" target="_blank" rel="noopener noreferrer">https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz</a> to download the software package.</p>
<p id="mrs_01_0366__ae222ebce0fe84c0fa967d802a595c8b5">Decompress the software package and obtain the sample program package <strong id="mrs_01_0366__b12603245132217">spark-examples_2.11-2.1.0.jar</strong> from the <strong id="mrs_01_0366__b17822935152520">spark-2.1.0-bin-hadoop2.7/examples/jars</strong> directory. The <strong id="mrs_01_0366__b62236712615">spark-examples_2.11-2.1.0.jar</strong> package contains the SparkPi program.</p>
</p></li><li id="mrs_01_0366__l9c355cfcb7e94337b4acf23ef5b3b2f7"><span>Upload data to OBS.</span><p><ol type="a" id="mrs_01_0366__ofa3aa0f60b424b6188d2c3b541cea204"><li id="mrs_01_0366__l6cb10bb327fa4b1da979d39927f4531d">Log in to OBS Console.</li><li id="mrs_01_0366__lfe9f6a2f8ec648848198d9674251b441">Choose <strong id="mrs_01_0366__b7642930141810">Parallel File System</strong> > <strong id="mrs_01_0366__b1864219305181">Create Parallel File System</strong> to create a file system named <strong id="mrs_01_0366__b364243015183">sparkpi</strong>.<p id="mrs_01_0366__a9c4aca37fc8f427d91f1d90853772a22"><strong id="mrs_01_0366__b15921133918188">sparkpi</strong> is only an example. The file system name must be globally unique. Otherwise, the parallel file system fails to be created. Use the default values for other parameters.</p>
</li><li id="mrs_01_0366__li3404164419718">Click the file system name <strong id="mrs_01_0366__b32082241019">sparkpi</strong> and click <strong id="mrs_01_0366__b320122181016">Files</strong>.</li><li id="mrs_01_0366__li130626173717">Click <strong id="mrs_01_0366__b141865427468">Create Folder</strong> to create the <strong id="mrs_01_0366__b8186194224610">program</strong> folder.</li><li id="mrs_01_0366__lea54ef8d2b7d4c81aff203c177d26c0b">Go to the <strong id="mrs_01_0366__b5236855161720">program</strong> folder, click <strong id="mrs_01_0366__b172475971817">Upload Object</strong>, select the sample program package obtained in <a href="#mrs_01_0366__l73a12356415f4048981acfcdd6579007">1</a> from the local PC, and set <strong id="mrs_01_0366__b64611924111913">Storage Class</strong> to <strong id="mrs_01_0366__b45171326121913">Standard</strong>.</li></ol>
</p></li><li id="mrs_01_0366__l386a3fc1418f4b3c8bb1168fc216736b"><span>Log in to the MRS console. In the left navigation pane, choose <strong id="mrs_01_0366__b1266113014199">Clusters</strong> > <strong id="mrs_01_0366__b1066743014194">Active Clusters</strong>, and click a cluster name.</span></li><li id="mrs_01_0366__lbdc27dca47484c2e901c956bd4e2494e"><span>Submit the SparkPi job.</span><p><div class="p" id="mrs_01_0366__p19986101711388">On the MRS console, click the <span class="uicontrol" id="mrs_01_0366__uicontrol17219193234715"><b>Jobs</b></span> tab and click <span class="uicontrol" id="mrs_01_0366__uicontrol12224532194713"><b>Create</b></span>. The <strong id="mrs_01_0366__b102256323474">Create Job</strong> page is displayed. For details about how to submit the job, see <a href="https://docs.otc.t-systems.com/en-us/usermanual/mrs/mrs_01_0524.html" target="_blank" rel="noopener noreferrer">Running a SparkSubmit or Spark Job</a>.<ul id="mrs_01_0366__ul13890947104010"><li id="mrs_01_0366__li18890647154019">Set <strong id="mrs_01_0366__b52199535472">Type</strong> to <strong id="mrs_01_0366__b1822414531474">SparkSubmit</strong>.</li><li id="mrs_01_0366__li16890194718403">Set <strong id="mrs_01_0366__b786825515477">Name</strong> to <strong id="mrs_01_0366__b8415222480">sparkPi</strong>.</li><li id="mrs_01_0366__li1890647184016">Set <strong id="mrs_01_0366__b842352706213513">Program Path</strong> to the path where programs are stored on OBS, for example, <strong id="mrs_01_0366__b1066519014498">obs://sparkpi/program/spark-examples_2.11-2.1.0.jar</strong>.</li><li id="mrs_01_0366__li25639312316">In <strong id="mrs_01_0366__b37691029145911">Program Parameter</strong>, select <strong id="mrs_01_0366__b5220118114911">--class</strong> for <strong id="mrs_01_0366__b3129713114918">Parameter</strong> and set <strong id="mrs_01_0366__b1291514035813">Value</strong> to <strong id="mrs_01_0366__b1354992418493">org.apache.spark.examples.SparkPi</strong>.</li><li 
id="mrs_01_0366__li889074715404">Set <strong id="mrs_01_0366__b6948175916583">Parameters</strong> to <strong id="mrs_01_0366__b12890730134917">10</strong>.</li><li id="mrs_01_0366__li0891204774012">Leave <strong id="mrs_01_0366__b1772312327493">Service Parameter</strong> blank.</li></ul>
</div>
<p id="mrs_01_0366__a60330685f3e44364ad3c99053c4a8426">A job can be submitted only when the cluster is in the <span class="parmvalue" id="mrs_01_0366__parmvalue189481346193815"><b>Running</b></span> state.</p>
<p id="mrs_01_0366__a018e2d53b455420f8baf4e424dd5a5a6">After a job is submitted successfully, it is in the <span class="parmvalue" id="mrs_01_0366__parmvalue810384714394"><b>Accepted</b></span> state by default. You do not need to manually execute the job.</p>
</p></li><li id="mrs_01_0366__l4ecfa9cfd1554bfdbae3338a5dd5088c"><span>View the job execution result.</span><p><ol type="a" id="mrs_01_0366__o388a703f4415455cabdf93230509ad76"><li id="mrs_01_0366__l902a2434e61048f2a0a6d7c527e49ea4">Go to the <span class="wintitle" id="mrs_01_0366__w503243d1e5f24011b8620cf69d824c86"><b>Jobs</b></span> tab page and view the job execution status.<p id="mrs_01_0366__afaa09a0cdccd4e518d8c4b5e4219df68">The job execution takes a while. After the job is complete, refresh the job list.</p>
<p id="mrs_01_0366__ae749b06ed57f49ddb599015f43188007">Once a job has succeeded or failed, you cannot execute it again. However, you can add or copy a job and set its parameters to submit it again.</p>
</li><li id="mrs_01_0366__l2f38fd1aa370493ab76090418b8f2395">Go to the native Yarn page and view the job output information.<ol class="substepthirdol" id="mrs_01_0366__ol23664351516"><li id="mrs_01_0366__li685693719208">On the <strong id="mrs_01_0366__b48825116501">Jobs</strong> tab page, locate the row that contains the target job and click <strong id="mrs_01_0366__b11784813205015">View Details</strong> in the <strong id="mrs_01_0366__b730781625014">Operation</strong> column to obtain the actual job ID.</li><li id="mrs_01_0366__li1846915481913">Log in to Manager and choose <strong id="mrs_01_0366__b149231035105012">Services</strong> > <strong id="mrs_01_0366__b79031371501">Yarn</strong> > <strong id="mrs_01_0366__b15336124111505">ResourceManager WebUI</strong> > <strong id="mrs_01_0366__b186131435502">ResourceManager (Active)</strong>. The Yarn page is displayed.</li><li id="mrs_01_0366__li853132713173">On the Yarn page, click the ID that matches the actual job ID.<div class="fignone" id="mrs_01_0366__fig5276182312596"><span class="figcap"><b>Figure 1 </b>Yarn <span id="mrs_01_0366__ph4109141014246">W</span>eb UI</span><br><span><img id="mrs_01_0366__image17352244175815" src="en-us_image_0000001388362146.png"></span></div>
</li><li id="mrs_01_0366__li11569174472514">Click <strong id="mrs_01_0366__b232118417513">Logs</strong> in the job log area.<div class="fignone" id="mrs_01_0366__fig1673515263410"><span class="figcap"><b>Figure 2 </b>SparkPi job logs</span><br><span><img id="mrs_01_0366__image13405815941" src="en-us_image_0000001388203690.png"></span></div>
</li><li id="mrs_01_0366__li38491509287">Click <strong id="mrs_01_0366__b1013284413516">here</strong> to obtain more detailed logs.<div class="fignone" id="mrs_01_0366__fig47917335298"><span class="figcap"><b>Figure 3 </b>More detailed logs of the SparkPi job</span><br><span><img id="mrs_01_0366__image1747972882910" src="en-us_image_0000001296090200.png"></span></div>
</li><li id="mrs_01_0366__li10755957162918">Obtain the job execution result.<div class="fignone" id="mrs_01_0366__fig10583338317"><span class="figcap"><b>Figure 4 </b>SparkPi job execution result</span><br><span><img id="mrs_01_0366__image11154155383013" src="en-us_image_0000001349169941.png"></span></div>
</li></ol>
</li></ol>
</p></li></ol>
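<p>For readers who know the Spark command line, the job settings in the procedure above map roughly onto a <strong>spark-submit</strong> invocation. The sketch below only assembles and prints such a command. It does not run anything, and this section does not state how the MRS console invokes Spark. The helper name <strong>build_spark_submit_args</strong> and the <strong>--master yarn</strong> and <strong>--deploy-mode cluster</strong> values are assumptions made for illustration.</p>

```python
import shlex
from typing import List, Sequence


def build_spark_submit_args(program_path: str,
                            main_class: str,
                            app_args: Sequence[object],
                            master: str = "yarn",           # assumption, not stated in this section
                            deploy_mode: str = "cluster"    # assumption, not stated in this section
                            ) -> List[str]:
    """Assemble a spark-submit argument list from the job settings.

    Hypothetical helper: it mirrors the Program Path, --class, and Parameters
    fields of the Create Job page but is not an MRS API.
    """
    return [
        "spark-submit",
        "--class", main_class,
        "--master", master,
        "--deploy-mode", deploy_mode,
        program_path,
        *[str(arg) for arg in app_args],
    ]


# Values taken from the procedure in this section.
cmd = build_spark_submit_args(
    program_path="obs://sparkpi/program/spark-examples_2.11-2.1.0.jar",
    main_class="org.apache.spark.examples.SparkPi",
    app_args=[10],
)

if __name__ == "__main__":
    # Print a shell-quoted form of the command for inspection only.
    print(shlex.join(cmd))
```

<p>In the upstream SparkPi example, the trailing argument (<strong>10</strong> here) sets the number of partitions used for the estimate.</p>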
</div>
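<p>The upstream SparkPi example prints its result as a line that begins with <strong>Pi is roughly</strong>. If you save the job log as text, a small script can pull the value out. This is a minimal sketch: the function name <strong>extract_pi</strong> and the sample log text are hypothetical, and real Yarn logs contain many other lines.</p>

```python
import re
from typing import Optional

# Matches the result line printed by the upstream SparkPi example.
PI_LINE = re.compile(r"Pi is roughly ([0-9]+(?:\.[0-9]+)?)")


def extract_pi(log_text: str) -> Optional[float]:
    """Return the first 'Pi is roughly <value>' number found, or None."""
    match = PI_LINE.search(log_text)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical log excerpt, for illustration only.
    sample_log = "INFO job started\nPi is roughly 3.1412\nINFO job finished\n"
    print(extract_pi(sample_log))
```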
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0589.html">Using Spark</a></div>
</div>
</div>