<a name="mrs_01_1700"></a>
<h1 class="topictitle1">Why Does Array Border-crossing Occur During FileInputFormat Split?</h1>
<div id="body8662426"><div class="section" id="mrs_01_1700__en-us_topic_0000001173949232_sd6cfe94a8277481e9cebd708c59d735f"><h4 class="sectiontitle">Question</h4><p id="mrs_01_1700__en-us_topic_0000001173949232_ac1d0f85c88c549a19f990308df4feef7">When HDFS calls the getSplits method of FileInputFormat, an ArrayIndexOutOfBoundsException: 0 error is reported, as shown in the following log:</p>
<pre class="screen" id="mrs_01_1700__en-us_topic_0000001173949232_s7645825abdf242d89795e426b4163991">java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:708)
at org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:675)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:359)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:210)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)</pre>
</div>
<div class="section" id="mrs_01_1700__en-us_topic_0000001173949232_sbb1b6b67f3d8491f8579feeb968fbd5c"><h4 class="sectiontitle">Answer</h4><p id="mrs_01_1700__en-us_topic_0000001173949232_a159ffeeaa3a84a83b3d03bac23ce2660">The location information of each block consists of elements in the following format: /default/rack0/:, /default/rack0/datanodeip:port.</p>
<p id="mrs_01_1700__en-us_topic_0000001173949232_a3f193adfb98e4d8ba2e63a645109fb87">This problem occurs when a block is damaged or lost, which causes the IP address and port of the machine corresponding to the block to become null. When the problem occurs, run <strong id="mrs_01_1700__en-us_topic_0000001173949232_b13503171315181">hdfs fsck</strong> to check the health of the file blocks, then delete the damaged blocks or restore the missing blocks and recompute the task, as shown in the example below.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1690.html">FAQ</a></div>
</div>
</div>