New mek-quake

From CUC3
This is a large cluster system to be used by the Wales and Vendruscolo groups. There are some decisions to be made about setting it up.
[[Differences from Clust]]
On clust (the other Wales group cluster) there are dual-CPU nodes, but these are deliberately hidden from the queueing system: the smallest unit you can get is a single node. This was to avoid the situation we get on nimbus, a similar machine, where parallel jobs sometimes get assigned one task on certain nodes instead of two. This seems to happen despite the queues being configured to ask for two tasks per node on parallel jobs, but it is always possible for users to reset this, so I do not know whether it is a Maui bug or user error.
 
 
This clearly will not scale to 4-way nodes on the new machine, so we shall have to allow mixed jobs on nodes. I am not sure how to avoid users resetting the tasks per node, though. A qsub wrapper would be complex, as it would also have to process the job script.
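To illustrate why the wrapper has to process the job script, here is a minimal sketch (everything in it - the function name, the ppn=2 policy, the assumption that only simple Torque-style nodespecs appear - is invented for illustration):

```python
#!/usr/bin/env python
"""Hypothetical qsub wrapper fragment that pins tasks per node.

Assumes simple Torque-style '#PBS -l nodes=N:ppn=M' directives; real
nodespecs can carry node properties and '+'-joined clauses, which is
exactly what makes a production wrapper complex.
"""
import re
import sys

NODESPEC_RE = re.compile(r'(#PBS\s+-l\s+nodes=\d+)(?::ppn=\d+)?')

def force_ppn(script_text, ppn=2):
    # Rewrite every nodes= directive so it always requests 'ppn'
    # tasks per node, overriding whatever the user asked for.
    return NODESPEC_RE.sub(lambda m: '%s:ppn=%d' % (m.group(1), ppn),
                           script_text)

if __name__ == '__main__':
    sys.stdout.write(force_ppn(sys.stdin.read()))
```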
 
 
Assuming we do not need variable fairshare for different groups, it could be done with QOS, but this is a bit of a bodge.
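The QOS bodge might look something like the following maui.cfg fragment. The parameter names are from the Maui documentation, but the group names, priorities, and targets are all invented placeholders:

```
# Illustrative maui.cfg fragment (values invented)
QOSCFG[wales]          PRIORITY=10
QOSCFG[vendruscolo]    PRIORITY=10
GROUPCFG[wales]        QDEF=wales
GROUPCFG[vendruscolo]  QDEF=vendruscolo

# If variable fairshare does turn out to be needed, it would instead
# use per-group targets, along the lines of:
# FSPOLICY  DEDICATEDPS
# FSWEIGHT  1
# GROUPCFG[wales]  FSTARGET=60
```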
 
 
Are we going to need variable fairshare?
 
 
Need to do the same access policy on nodes as on clust (i.e. if you do not have a job running on a node, you do not get to log in to it). This discourages people from using /scratch, which is bad, so we then have to arrange for /scratch to be visible on the head node of the cluster by automounting.
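The automounting could be done with autofs along these lines; the map name and the assumption that each node exports its local /scratch over NFS are placeholders, not a tested configuration:

```
# /etc/auto.master (sketch)
/scratch  /etc/auto.scratch

# /etc/auto.scratch - wildcard map, so that /scratch/node01 on the
# head node mounts node01:/scratch on demand
*  -fstype=nfs,rw  &:/scratch
```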
 
 
The epilogue scripts (and the MPI wrapper scripts) are going to be complicated. It is only safe to slay a user's processes if you are certain that the user has no other jobs running on that machine. The script is therefore going to have to ask the queueing system about this and handle the answer. I can see a race condition here: what if you ask, are told it is safe to slay, and then another job starts on the node, so you accidentally slay both? Is this even possible? Does pbs_mom fork to run epilogue scripts?
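The decision the epilogue has to make can be sketched as below. The query mechanism is abstracted away into a plain list (how the queueing system would actually be asked is the open question above), and the function name and job IDs are invented:

```python
#!/usr/bin/env python
"""Sketch of the epilogue's 'is it safe to slay?' decision.

'jobs_on_node' stands in for whatever the queueing system reports as
running on this node; the point is the logic, and the race it cannot
close by itself.
"""

def safe_to_slay(jobs_on_node, user, finishing_jobid):
    # Safe only if the job that is finishing is the user's sole job
    # on this node.
    others = [(jid, u) for (jid, u) in jobs_on_node
              if u == user and jid != finishing_jobid]
    return not others

# Note the race: even if this returns True, a new job for the same
# user could start on the node between the query and the kill.  The
# check and the slay would need to happen under some lock held by
# pbs_mom to be airtight.
```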
 
 
The MPI wrappers will have to parse the PBS nodefile halfway intelligently and write a machinefile accordingly. It is not safe to make assumptions about the task geometry of the jobs. I will also need to figure out exactly how Maui interprets PBS's nodespecs (I have some idea, but not enough to predict the behaviour in all cases) and document that, so that people stand a chance of getting the geometry they intended.
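The nodefile-to-machinefile step might look like this. It assumes the usual Torque convention of one line per allocated task in $PBS_NODEFILE, and the MPICH-style 'host:count' machinefile format; other MPI implementations would need a different output format:

```python
#!/usr/bin/env python
"""Sketch of the nodefile -> machinefile step of an MPI wrapper.

Counts tasks per host rather than assuming any particular task
geometry, since mixed jobs mean a node may contribute anywhere from
one to four tasks.
"""

def machinefile_lines(nodefile_lines):
    # Tally tasks per host, keeping hosts in first-seen order.
    order, counts = [], {}
    for line in nodefile_lines:
        host = line.strip()
        if not host:
            continue
        if host not in counts:
            order.append(host)
            counts[host] = 0
        counts[host] += 1
    return ['%s:%d' % (h, counts[h]) for h in order]
```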
 
   
 
[[Mek-quake initial setup notes]]

Revision as of 13:40, 12 June 2006

[[Maui compilation]]