General queueing problems


This page is for general queueing discussion. There are specific pages for discussing mek-quake and tardis.

I have moved the paragraph below from the mek-quake page, as it is more appropriate here. --Catherine 09:55, 22 June 2006 (BST)

I don't know a lot about queueing systems, so I can only really talk about front-end problems. 
What has annoyed me about nimbus is that it is supposed to be for running parallel jobs, yet 
whenever I submit something (usually 2-10 nodes, but sometimes up to 20, which I have to run 
as 10 nodes with 2 jobs per node, which is already not ideal) it can spend literally weeks in the 
queue while hundreds of serial jobs run. This leads me to use rama because I can get a job 
running sooner. However, rama is not optimised for parallel work, so although my jobs start, they 
run very slowly, and I no doubt annoy all the people wanting to run serial jobs there. Is there 
any way around this? 
Jane

I was asked to set up nimbus to run a mixture of serial and parallel work, not just parallel jobs. If this has changed then Michele, Ard and Jon need to let me know and I'll give serial jobs on nimbus some kind of penalty. Anyway, on to the actual problem.

Having a mixture of job types is always problematic. The system works by taking the top queued job (usually parallel) and working out the earliest time that job can start, assuming that all running jobs take all the time they asked for. It then reserves the nodes that the parallel job will run on, and will not allow anything else to run that could delay the parallel job. It then fills in the gaps in the system with other jobs, and these tend to be serial jobs because the gaps are usually small. The serial jobs are not actually delaying the parallel job's reserved start, because they are not started on any node that the system has reserved for the parallel job. The problem arises when jobs regularly finish early: you can then get a situation where, if a certain serial job that looked safe at the time had not been started, the parallel job could have started earlier. However, the queueing system can't see into the future, so it can't avoid this.
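
As a rough illustration of the reservation-and-gap-filling behaviour described above, here is a minimal Python sketch. It is not the scheduler's actual code: the names (Job, earliest_start, backfill) are made up for the example, times are in hours from "now", and it only counts free nodes rather than tracking which specific nodes are reserved.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    nodes: int      # number of nodes requested
    walltime: int   # requested walltime in hours

def earliest_start(running, total_nodes, needed):
    """Earliest time at which `needed` nodes are free, assuming every
    running job uses all of its requested walltime.
    `running` is a list of (end_time, nodes) pairs."""
    free = total_nodes - sum(n for _, n in running)
    if free >= needed:
        return 0
    for end, n in sorted(running):
        free += n
        if free >= needed:
            return end
    raise ValueError("job needs more nodes than the machine has")

def backfill(running, queue, total_nodes):
    """Reserve a start time for the top queued job, then start any later
    job that cannot delay that reservation.
    Returns (reserved_start, names_of_jobs_started_now)."""
    top, rest = queue[0], queue[1:]
    free_now = total_nodes - sum(n for _, n in running)
    if top.nodes <= free_now:
        return 0, [top.name]          # top job can simply start
    start = earliest_start(running, total_nodes, top.nodes)
    # Nodes free at the reserved start beyond what the top job needs.
    free_at_start = free_now + sum(n for end, n in running if end <= start)
    spare = free_at_start - top.nodes
    started = []
    for job in rest:
        if job.nodes > free_now:
            continue                  # does not fit in the current gap
        if job.walltime <= start:
            free_now -= job.nodes     # finishes before the reservation
            started.append(job.name)
        elif job.nodes <= spare:
            free_now -= job.nodes     # runs past it, but only on spare nodes
            spare -= job.nodes
            started.append(job.name)
    return start, started

# Hypothetical 10-node machine: 8 nodes busy, 2 idle.
running = [(8, 6), (4, 2)]
queue = [Job("parallel", 10, 24), Job("s1", 1, 3),
         Job("s2", 1, 12), Job("s3", 1, 2)]
result = backfill(running, queue, total_nodes=10)
```

In this made-up case the parallel job is reserved to start at hour 8, s1 and s3 fill the gap, and s2 is held back because its 12-hour request would overrun the reservation. If the 6-node job actually finished at hour 1, the parallel job would still have to wait for s1 to finish, which is the "finished early" effect described above.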

One potential fix is to make the system strictly FIFO. However, this dramatically reduces overall usage. A better fix is for all users to make an effort to estimate how long their jobs will take to run, and to set the walltime on each job to that estimate plus 10% (or something) instead of just using the longest time available. This means the scheduler can predict the future state of the system much more accurately. A third possibility is to increase the number of reservations made so that we get less aggressive filling of gaps. This reduces overall usage but is kinder to parallel jobs.
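
For the "estimate plus 10%" suggestion, the arithmetic could look something like the sketch below. The function name padded_walltime and the HH:MM:SS output format are assumptions made for the example; check what format the queueing system on your machine actually expects.

```python
def padded_walltime(estimate_minutes, margin_percent=10):
    """Add a safety margin to a runtime estimate (whole minutes) and
    format it as HH:MM:SS. Integer arithmetic avoids rounding surprises;
    the result is rounded up to the next whole minute."""
    scaled = estimate_minutes * (100 + margin_percent)
    padded = -(-scaled // 100)        # ceiling division
    hours, minutes = divmod(padded, 60)
    return f"{hours:02d}:{minutes:02d}:00"

# A job expected to take about 10 hours:
request = padded_walltime(600)
```

Here a 10-hour estimate becomes an 11-hour request, which gives the scheduler a far more useful figure than the queue maximum.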

--Catherine 10:15, 22 June 2006 (BST)