
2. Core Scheduler Components

In this section we describe some of the core components in more detail, and provide pointers to the Javadoc API documentation.

2.1 Scheduler

The scheduler is the central object responsible for scheduling jobs. The scheduler object runs in its own thread, and has a very simple main loop: (1) invoke each one of its modules in order and (2) sleep for the specified interval.
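The main loop described above can be sketched roughly as follows. This is only an illustration: the `SchedulerModule` interface, the `SchedulerLoopSketch` class, and the `maxIterations` bound are hypothetical names introduced here, not part of the Maui Scheduler API (see the Javadoc for the real classes).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical module interface; the real scheduler's module API may differ. */
interface SchedulerModule {
    void run();
}

/** Minimal sketch of the loop: (1) invoke each module in order, (2) sleep. */
class SchedulerLoopSketch implements Runnable {
    private final List<SchedulerModule> modules;
    private final long intervalMillis;
    private final int maxIterations; // assumption: bounded so the sketch terminates

    SchedulerLoopSketch(List<SchedulerModule> modules, long intervalMillis, int maxIterations) {
        this.modules = new ArrayList<>(modules);
        this.intervalMillis = intervalMillis;
        this.maxIterations = maxIterations;
    }

    @Override
    public void run() {
        for (int i = 0; i < maxIterations; i++) {
            // Step 1: invoke each module in its configured order.
            for (SchedulerModule module : modules) {
                module.run();
            }
            // Step 2: sleep for the specified interval.
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop the loop.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

To run such a loop in its own thread, as the scheduler does, it could be started with `new Thread(new SchedulerLoopSketch(modules, 1000L, 10)).start()`.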

2.2 Schedulable

A schedulable is an abstract object holding the attributes common to all schedulables (such as submission time, start time, duration, and task geometry). One type of schedulable is the job object, which represents a job that a user submits. Another type is the sys object, which is used when an administrator sets a reservation.

2.3 Job

A job is an implementation of the schedulable interface with added job-specific characteristics.
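The relationship between a schedulable and a job might look something like the sketch below. The method names, the `owner` field, and the use of a plain task count in place of task geometry are assumptions made for illustration; the actual interface is described in the Javadoc.

```java
/** Hypothetical view of the attributes shared by all schedulables. */
interface Schedulable {
    long getSubmitTime();  // submission time, seconds since the epoch
    long getStartTime();   // planned or actual start time, seconds since the epoch
    long getDuration();    // requested run time in seconds
    int getTaskCount();    // simplified stand-in for the task geometry
}

/** Sketch of a job: the common attributes plus a job-specific one. */
class Job implements Schedulable {
    private final String owner; // job-specific: the submitting user (assumed attribute)
    private final long submitTime;
    private final long startTime;
    private final long duration;
    private final int taskCount;

    Job(String owner, long submitTime, long startTime, long duration, int taskCount) {
        this.owner = owner;
        this.submitTime = submitTime;
        this.startTime = startTime;
        this.duration = duration;
        this.taskCount = taskCount;
    }

    String getOwner() { return owner; }

    @Override public long getSubmitTime() { return submitTime; }
    @Override public long getStartTime() { return startTime; }
    @Override public long getDuration() { return duration; }
    @Override public int getTaskCount() { return taskCount; }
}
```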

Jobs should be submitted to the scheduler as .cmd files, using the mauisubmit client command. Currently, Maui Scheduler parses .cmd files in MauiME format. For more information, see the mauisubmit command documentation, specifically the MPI-over-Myrinet, MPI-over-Ethernet, and default job sections.

2.4 Reservation

A reservation is a fundamental data structure used by the scheduler. A reservation locks resources (nodes and slots) for use by a job, by a group of jobs, or for some other use. There are two fundamental types of reservations: job reservations and sys reservations.

Job Reservation

A job reservation is made for a single job after a successful attempt at scheduling or backfilling. Each job reservation has an exclusive lock on certain nodes and slots, at a certain starting time, for a certain duration. Jobs that can be run will have a job reservation associated with them.

If a job reservation is active, the job is actually running (on one or more of the resource managers in the system). Active job reservations are locked in and immutable unless the job or its reservation is cancelled. If the job reservation is inactive, the job is waiting in the job queue to run at some future point. Inactive job reservations may change several times before becoming active, because other job cancellations or submissions change the priority ordering.
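The active/inactive distinction can be illustrated with a small state sketch. The class below is hypothetical (including the `State` enum and all method names) and only models the rules stated above: an inactive reservation may be moved, an active one may not, and either can be cancelled.

```java
/** Hypothetical sketch of a job reservation's life cycle. */
class JobReservation {
    enum State { INACTIVE, ACTIVE, CANCELLED }

    private State state = State.INACTIVE; // job is waiting in the queue
    private long startTime;               // seconds since the epoch
    private final long duration;          // seconds

    JobReservation(long startTime, long duration) {
        this.startTime = startTime;
        this.duration = duration;
    }

    /** Moves an inactive reservation, e.g. after priorities change. */
    void reschedule(long newStartTime) {
        if (state != State.INACTIVE) {
            throw new IllegalStateException("only inactive reservations can be changed");
        }
        startTime = newStartTime;
    }

    /** Marks the reservation active: the job is now running. */
    void activate() {
        if (state != State.INACTIVE) {
            throw new IllegalStateException("only inactive reservations can be activated");
        }
        state = State.ACTIVE;
    }

    /** Cancellation is the one change allowed on an active reservation. */
    void cancel() {
        state = State.CANCELLED;
    }

    State getState() { return state; }
    long getStartTime() { return startTime; }
    long getEndTime() { return startTime + duration; }
}
```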

Sys Reservation

A sys reservation is a lock on system resources for a specific time and duration created by an administrator. A sys reservation can be used to dedicate certain nodes/slots for a specific user, group, or account to run jobs "inside", or to lock specific nodes for administrator access and maintenance while keeping the rest of the cluster running.
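As a rough sketch of how a sys reservation restricts access, the hypothetical class below holds a set of nodes, a time window, and a list of permitted users. Treating an empty user list as a maintenance lock is an assumption made for this example, and groups and accounts are omitted for brevity; none of these names come from the Maui Scheduler API.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of an administrator-created sys reservation. */
class SysReservation {
    private final Set<String> nodes;        // names of the reserved nodes
    private final Set<String> allowedUsers; // assumption: empty set = maintenance lock
    private final long startTime;           // seconds since the epoch
    private final long duration;            // seconds

    SysReservation(Set<String> nodes, Set<String> allowedUsers, long startTime, long duration) {
        this.nodes = new HashSet<>(nodes);
        this.allowedUsers = new HashSet<>(allowedUsers);
        this.startTime = startTime;
        this.duration = duration;
    }

    /** True if this reservation locks the given node at the given time. */
    boolean covers(String node, long time) {
        return nodes.contains(node) && time >= startTime && time < startTime + duration;
    }

    /** True if the user's job may run on the node at that time. */
    boolean mayRun(String user, String node, long time) {
        // Nodes and times outside the reservation are unaffected by it.
        return !covers(node, time) || allowedUsers.contains(user);
    }
}
```

With this model, a reservation created with an empty user set keeps all jobs off the listed nodes during its window, while the rest of the cluster remains available.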

2.5 Nodes and Slots

Nodes and slots are resources that allow parallel job tasks (or a single serial job) to run. A node is the "beige box" with associated local disk, memory, network interfaces, etc. A slot is a computation slot on a node allowing for one parallel job task (or one serial job) to execute. There is always at least one slot per node. Usually the number of slots corresponds directly to the number of CPUs on the node, but it doesn't have to.
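The node/slot relationship can be pictured with the following sketch. The `Node` class and its methods are hypothetical; the example only encodes the two rules given above, namely that a node has at least one slot and that the slot count usually, but not necessarily, equals the CPU count.

```java
/** Hypothetical sketch of a node and its computation slots. */
class Node {
    private final String name;
    private final int slots; // total slots, always at least one
    private int busySlots;   // slots currently running a task

    Node(String name, int slots) {
        if (slots < 1) {
            throw new IllegalArgumentException("a node needs at least one slot");
        }
        this.name = name;
        this.slots = slots;
    }

    /** Common case: one slot per CPU. Other slot counts use the constructor. */
    static Node withSlotPerCpu(String name, int cpus) {
        return new Node(name, cpus);
    }

    /** Claims a slot for one parallel task or one serial job, if any is free. */
    boolean claimSlot() {
        if (busySlots == slots) {
            return false;
        }
        busySlots++;
        return true;
    }

    /** Frees a slot when its task finishes. */
    void releaseSlot() {
        if (busySlots > 0) {
            busySlots--;
        }
    }

    String getName() { return name; }
    int freeSlots() { return slots - busySlots; }
}
```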

