Fair Scheduler
·
The core idea behind the fair share scheduler is to assign resources to
jobs such that, on average over time, each job gets an equal share of the
available resources. As a result, short jobs can access the CPU and finish
while longer jobs are still executing. This behaviour allows for some
interactivity among Hadoop jobs and makes the cluster more responsive to the
variety of job types submitted. The fair scheduler was developed by Facebook.
·
The Hadoop implementation creates a set of pools into which jobs are placed
for selection by the scheduler. Each pool can be assigned a number of shares
to balance resources across the pools (more shares means a greater claim on
cluster resources for that pool's jobs). By default, all pools have equal
shares, but a pool can be configured with more or fewer shares depending upon
the job type. The number of jobs active at one time can also be constrained,
if desired, to minimize congestion and allow work to finish in a timely
manner.
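Concretely, both the shares and the job limits are declared in the fair
scheduler's allocations file. The sketch below is a minimal example of that
XML format; the pool name and the specific values are made up for
illustration.

    <?xml version="1.0"?>
    <allocations>
      <!-- A pool with twice the default share of the cluster. -->
      <pool name="reporting">
        <weight>2.0</weight>
        <!-- Cap the pool's concurrent jobs to limit congestion. -->
        <maxRunningJobs>5</maxRunningJobs>
      </pool>
      <!-- Each user may run at most three jobs unless overridden. -->
      <userMaxJobsDefault>3</userMaxJobsDefault>
    </allocations>

Pools omitted from the file fall back to the default weight of 1.0, which is
what gives all pools equal shares out of the box.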
·
To ensure fairness, each user is assigned to a pool. In this way, a user who
submits many jobs receives the same share of cluster resources as every other
user, independent of how much work each has submitted. Regardless of the
shares assigned to pools, if the system is not fully loaded, shares that
would otherwise go unused are split among the jobs that can use them.
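To make the redistribution concrete, here is a small, self-contained sketch
of the idea in Java. It is not the Hadoop implementation; the method name
fairShares and the demand-based model are assumptions made for illustration.

    import java.util.Arrays;

    class SurplusDemo {
        // Split `capacity` task slots across pools with the given demands:
        // each pool starts from an equal share, and anything a pool cannot
        // use is redistributed among the pools that still want more.
        static double[] fairShares(double capacity, double[] demands) {
            double[] shares = new double[demands.length];
            boolean[] satisfied = new boolean[demands.length];
            int hungry = demands.length;
            double remaining = capacity;
            boolean progress = true;
            while (progress && hungry > 0) {
                progress = false;
                double equal = remaining / hungry;
                for (int i = 0; i < demands.length; i++) {
                    if (!satisfied[i] && demands[i] <= equal) {
                        shares[i] = demands[i];   // takes only what it needs
                        remaining -= demands[i];
                        satisfied[i] = true;
                        hungry--;
                        progress = true;
                    }
                }
            }
            // Pools that still want more split the leftover equally.
            for (int i = 0; i < demands.length; i++) {
                if (!satisfied[i]) shares[i] = remaining / hungry;
            }
            return shares;
        }

        public static void main(String[] args) {
            // 100 slots, three pools: the light pool only needs 10 slots,
            // so the two busy pools split the surplus (45 each).
            double[] shares = fairShares(100, new double[]{10, 80, 80});
            System.out.println(Arrays.toString(shares));  // [10.0, 45.0, 45.0]
        }
    }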
·
The scheduler implementation keeps track of the compute time received by
each job in the system. Periodically, the scheduler inspects each job to
compute the difference between the compute time the job received and the
time it would have received under an ideal fair scheduler. This difference
is the job's deficit. The scheduler then ensures that the job with the
highest deficit is scheduled next.
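
The loop below sketches that deficit bookkeeping in Java. It is a simplified
model of the technique described above, not Hadoop's source; the class and
method names (DeficitScheduler, updateDeficits, pickNextJob) and the
slot-based accounting are illustrative assumptions.

    import java.util.*;

    class JobInfo {
        final String name;
        double fairShare;     // slots the job deserves under ideal fair sharing
        double runningTasks;  // slots the job is actually using right now
        double deficit;       // fair time owed minus compute time received

        JobInfo(String name, double fairShare) {
            this.name = name;
            this.fairShare = fairShare;
        }
    }

    class DeficitScheduler {
        private final List<JobInfo> jobs = new ArrayList<>();

        void addJob(JobInfo job) { jobs.add(job); }

        // Called periodically: accumulate each job's deficit over the interval.
        void updateDeficits(double intervalMs) {
            for (JobInfo job : jobs) {
                job.deficit += (job.fairShare - job.runningTasks) * intervalMs;
            }
        }

        // When a task slot frees up, run a task from the job owed the most time.
        JobInfo pickNextJob() {
            return jobs.stream()
                       .max(Comparator.comparingDouble(j -> j.deficit))
                       .orElse(null);
        }
    }

A job running below its fair share accumulates a positive deficit and rises
to the front of the queue; a job running above it pays the deficit back.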