Difference between revisions of "New mek-quake"

From CUC3
import>Cen1001

Revision as of 17:04, 11 April 2006

New mek-quake

This is a large cluster system to be used by the Wales and Vendruscolo groups. There are some decisions to be made about setting it up.

On clust (the other Wales group cluster) the nodes are dual-CPU, but this is deliberately hidden from the queueing system, so the smallest unit you can get is a single node. This was to avoid the situation we get on nimbus, a similar machine, where parallel jobs are sometimes assigned one task on certain nodes instead of two. This happens even though the queues are configured to ask for two tasks per node on parallel jobs. However, users can always reset this, so I do not know whether it is a Maui bug or user error.


This clearly will not scale to 4-way nodes on the new machine, so we shall have to allow mixed jobs on nodes. I am not sure how to stop users resetting the tasks per node, though. A qsub wrapper would be complex, as it would also have to process the job script.
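
To give a feel for what "processing the job script" would involve, here is a minimal sketch in Python. It is not an existing tool on any of our machines: the function name enforce_ppn and the policy of forcing one fixed ppn value are assumptions made for illustration. It only looks at "#PBS -l" directive lines and rewrites any ppn=N it finds there.

```python
import re

# Matches "#PBS" directive lines that carry a -l resource list.
PBS_L_DIRECTIVE = re.compile(r"^\s*#PBS\s+.*-l\s")
# Matches a ppn=<number> setting inside a nodes= specification.
PPN_SETTING = re.compile(r"ppn=\d+")


def enforce_ppn(script_text, required_ppn):
    """Return script_text with every ppn=N on a '#PBS -l' line set to required_ppn.

    Hypothetical helper: lines that are not '#PBS -l' directives are
    passed through unchanged.
    """
    out = []
    for line in script_text.splitlines(keepends=True):
        if PBS_L_DIRECTIVE.match(line):
            line = PPN_SETTING.sub(f"ppn={required_ppn}", line)
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    example = "#!/bin/sh\n#PBS -l nodes=4:ppn=1\n./run_job\n"
    print(enforce_ppn(example, 2), end="")
```

Even this leaves gaps, which is the complexity referred to above: it does nothing for a nodes= request that has no ppn at all, it does not handle "-lnodes=..." written without a space, and it does not see resource requests given on the qsub command line, which a real wrapper would have to parse as well.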

Assuming we do not need variable fairshare for different groups, it could be done with QOS, but this is a bit of a bodge.

Are we going to need variable fairshare?