Progress is not always measurable, but nevertheless it tells Hadoop that a task is doing something. For example, a task writing output records is making progress, even though it cannot be expressed as a percentage of the total number that will be written, since the latter figure may not be known, even by the task producing the output. Progress reporting is important, as it means Hadoop will not fail a task that’s making progress. All of the following operations constitute progress:

  • Reading an input record (in a mapper or reducer)
  • Writing an output record (in a mapper or reducer)
  • Setting the status description on a reporter (using Reporter’s setStatus() method)
  • Incrementing a counter (using Reporter’s incrCounter() method)
  • Calling Reporter’s progress() method
Tag

Related Post

Leave a Reply