Tardis scheduling policy
Revision as of 15:57, 16 March 2007
Now is a good time to review tardis's scheduling policy as we have had the machine in service for a few months.
=== Current policy ===
Every individual user has a fairshare target of 20% of the machine. If you go over that you accrue penalties; if you stay under it you accrue bonuses. There are also two QOS (quality of service) groups, which have fairshare targets of their own, based on how the machine was funded: the 'stuart' group (Stuart Althorpe's research group) get 52% and the 'portfolio' group (everyone else) get 48%.
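If these targets are set with Maui's per-credential FSTARGET parameter (an assumption about how our configuration actually expresses them), the relevant maui.cfg lines would look something like this:

<pre>
# maui.cfg sketch: the fairshare targets described above
USERCFG[DEFAULT]   FSTARGET=20
QOSCFG[stuart]     FSTARGET=52
QOSCFG[portfolio]  FSTARGET=48
</pre>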
Fairshare targets and usage can be seen by running 'diagnose -f'.
The fairshare calculation takes the last six weeks of usage into account, decaying it at a rate of 0.8 per week.
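To make the decay concrete, here is a minimal Python sketch of the effective-usage calculation, assuming the usual decay-weighted average over the six windows (the weekly percentages are invented for illustration):

<pre>
# Effective fairshare usage: six weekly windows, each older window
# weighted by a further factor of 0.8, then normalised back to a
# percentage.  The weekly figures below are invented.
weekly_usage = [30.0, 25.0, 10.0, 40.0, 5.0, 20.0]  # % of machine, newest first

weights = [0.8 ** age for age in range(len(weekly_usage))]
effective = sum(u * w for u, w in zip(weekly_usage, weights)) / sum(weights)

print(f"effective usage {effective:.1f}% against the 20% personal target")
</pre>

Because windows five weeks back still carry real weight, heavy past usage takes a long time to wash out.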
Priority is currently: 20 * (personal fairshare bonus/penalty + group fairshare bonus/penalty) + job expansion factor.
Job expansion factor rises with time spent on the queue, but rises faster for short jobs. The reason for using that and not basic queue time is that it helps the very short (30 min) test jobs to run. It makes practically no difference when compared to the fairshare numbers, but ensures that every job eventually runs.
Priority calculations for all queued jobs can be seen by running 'diagnose -p'.
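As a rough illustration of how the two components interact, here is a Python sketch; the sign convention for the fairshare term (target minus recent usage) and the expansion factor definition ((queue time + wallclock limit) / wallclock limit, Maui's usual form) are assumptions, not readings of our actual config:

<pre>
# Sketch of the priority formula above.  Both the fairshare deltas
# and the expansion factor definition are assumed, not taken from
# the real maui.cfg.

def priority(personal_target, personal_usage,
             group_target, group_usage,
             queued_secs, wallclock_secs):
    personal_fs = personal_target - personal_usage  # positive = bonus
    group_fs = group_target - group_usage
    xfactor = (queued_secs + wallclock_secs) / wallclock_secs
    return 20 * (personal_fs + group_fs) + xfactor

# A 30-minute test job queued for two hours gets xfactor 5.0, while a
# 36-hour job queued just as long gets about 1.06 -- tiny next to the
# fairshare term, but enough to nudge short jobs forward:
print(priority(20, 5, 48, 40, 7200, 1800))    # short test job
print(priority(20, 5, 48, 40, 7200, 129600))  # long production job
</pre>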
=== Throttling policies ===
There is one throttling policy in use: any user may only have four jobs in the 'Idle' state at any given time. This avoids queue stuffing. However, it does not help when one person has a very big fairshare bonus and submits a lot of jobs, because every job of theirs that gets to run is immediately replaced in the queue by another.
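In configuration terms this is presumably a per-user idle-job cap; a maui.cfg sketch, assuming it is implemented with Maui's MAXIJOB parameter:

<pre>
# maui.cfg sketch: at most four Idle jobs per user at any one time
USERCFG[DEFAULT]  MAXIJOB=4
</pre>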
=== Reservation and Backfill ===
This policy isn't making much difference at the moment; I don't think it needs changing, but I mention it for completeness. We make a reservation for the top queued job and then backfill other jobs around that (i.e. let them jump the queue if and only if they will not have any effect on the start time of the top queued job). This stops big parallel jobs being crowded out by small ones, but only once they have got to the top of the queue. Without it the 32-processor jobs would almost never run.
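The decision rule amounts to something like the following simplified Python sketch, which collapses per-node reservations down to plain processor counts (real Maui is more careful than this):

<pre>
# Simplified reservation-and-backfill rule.  The top queued job holds
# a reservation; another job may jump the queue only if it provably
# cannot delay that reservation's start time.

def can_backfill(job_procs, job_walltime, now, free_procs,
                 resv_start, resv_procs):
    if job_procs > free_procs:
        return False  # cannot start now anyway
    if now + job_walltime <= resv_start:
        return True   # finishes before the reservation begins
    # Job would still be running when the reservation starts, so it
    # may only use processors the reserved job does not need.
    return free_procs - job_procs >= resv_procs
</pre>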
=== The current problem ===
The Althorpe group currently have only one active user and little historical usage, so that account gets a huge fairshare bonus and can crowd everyone else out of the machine. This is taking considerable time to correct because the fairshare memory is long and the deficit is large.
=== Things we could change and their likely effects ===
Maui is amazingly configurable; any policy you can come up with, we can probably find a way to make Maui do. Here are a few possibilities (a maui.cfg sketch of the corresponding parameters follows the list):
- Shorter or fewer fairshare windows, so the machine has a shorter memory
- Dilute group fairshare (i.e. give personal fairshare a bigger multiplier than 20)
- Drop group fairshare, and possibly give Stuart's group a bigger personal fairshare instead
- Max processors per person limit
- Max outstanding processor-seconds per person limit
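Since the 'group' fairshare here hangs off QOS, the knobs for most of these are probably the following; a maui.cfg sketch with illustrative values only, not proposals:

<pre>
# maui.cfg sketch: one possible knob per option above (values invented)
FSINTERVAL        3:00:00:00       # shorter windows: 3 days rather than 7
FSDEPTH           6                # fewer windows kept in the memory
FSDECAY           0.80
FSUSERWEIGHT      40               # dilute group fairshare: boost the
FSQOSWEIGHT       10               #   personal term relative to the QOS term
USERCFG[DEFAULT]  MAXPROC=16       # max processors per person
USERCFG[DEFAULT]  MAXPS=5000000    # max outstanding processor-seconds
</pre>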