Getting Access to the High-Performance Computing Clusters
Table of Contents
- How do I get access?
- For College/Dept/Allocation Managers: How do I grant access to my Deepthought/Deepthought2 allocation?
- How do I grant/get access to my MARCC/Bluecrab allocation?
- How do I get an Allocation from the AAC?
- How do I get access for a class I am teaching?
The Deepthought High Performance Computing (HPC) Clusters are a joint effort between the Division of Information Technology (DIT) and a number of units within the University of Maryland. Funding for the clusters came from both DIT and various colleges, departments, and research groups. MARCC and the Bluecrab cluster were constructed with State of Maryland funds, and a percentage of the resources has been reserved for use by the University of Maryland.
Units that contributed to one of the Deepthought clusters receive standard and high priority allocations on the respective cluster which replenish on a monthly and quarterly schedule. Basically, they receive one month's worth of the value of the CPU cycles from the hardware purchased with their contribution as high priority every month, but can borrow from the other two months in the quarter at standard priority.
The CPU cycles from nodes on the Deepthought clusters purchased out of DIT funds are made available to University members through a proposal process. This same proposal process is used for all access to the University's portion of the MARCC/Bluecrab resources. The proposals are reviewed by the Allocations and Advisory Committee (AAC). These faculty members with extensive HPC experience will evaluate the merits of each proposal in order to ensure the best use of these valuable resources.
DIT will also allow the use of the Deepthought cluster by classes in some circumstances. If you are teaching a course and wish to make use of this research tool in the class, see the section on requesting access for use by a class below.
How do I get access?
To get access to one of the HPC clusters, you need to be granted access to an allocation on that cluster. As described above, there are basically two types of allocations:
- those for members of units which contributed towards the purchase of the cluster
- those for people who are not members of units which contributed
NOTE: Since the funding for the construction, procurement, and implementation of the MARCC/Bluecrab cluster came from the state, there are no contributing units for that cluster. For access to the MARCC/Bluecrab cluster, you must go through the application process.
The following is a list of contributing units to the Deepthought clusters. The table has been simplified, omitting most contributors at the research group level (it is assumed that at that level, if you belong to a contributing research group they have informed you of such). And there might be additional restrictions and constraints on access applied at the college, departmental, or research group level. But this should give you some idea of whether you might be able to get contributor access.
(This list is currently maintained manually, so apologies if it gets out-of-date. If you notice errors or omissions, please contact us and we will update it).
|A. James Clark School of Engineering||Jim Zahniser||-||X|
|College of Math & Natural Sciences||Mike Landavere||X||X|
| ||Astronomy||Prof. Derek Richardson||X||X|
| ||Atmospheric & Oceanic Sciences||Prof James Carton, Prof Kayo Ide|
| ||Institute for Physical Sciences & Technology||Dr. Alfredo Nava-Tudela||-||X|
| ||Physics||Jeff McKinney||-||X|
So if you want access to one of the Deepthought HPC clusters, you should:
- If your research group already has an allocation, contact the person in your research group responsible for the allocation and ask them to grant you access to that allocation. NOTE: the request for access must come from one of the points of contact for the allocation.
- If you are unsure about whether your research group has an allocation, you should probably ask your colleagues before proceeding further. It is easier all around to get added to an existing allocation than to get a new one. If you discover your research group has an allocation, go back to the first step.
- If your group does NOT have an allocation already, check the table above and see if your college or department has any allocations. If so, contact the contacts listed in the table and see if you and/or your group can get access to an allocation. You are probably best off going from smaller (e.g. departmental) units to larger (e.g. college). Procedures vary by unit; some might grant you access to an existing allocation, others might carve out a suballocation for your group. You might or might not need to have the request come from the PI for the group or from your faculty advisor if you are a student.
- If you are unable to obtain access to an allocation following the steps above, you can make a request for an allocation from the AAC. This procedure is more involved, and discussed further below.
If you want access to the MARCC/Bluecrab HPC cluster, you must request an allocation from the AAC.
For College/Dept/Allocation Managers: How do I grant access to my Deepthought/Deepthought2 allocation?
This section is for designated points of contact for the allocations on one or both of the Deepthought HPC clusters. If you are NOT a designated point of contact for the allocation, i.e., if you are only a member, or not even a member of the allocation, do NOT follow these steps. We will not honor requests made from people who are not points of contact for the allocation in question. If you are not the point of contact for the allocation, find the point of contact and have them make the request.
If you are trying to add someone to your MARCC/Bluecrab allocation, or are trying to get access to your colleague's or advisor's MARCC/Bluecrab allocation, that is discussed in the section below on granting access to MARCC/Bluecrab allocations.
To add/delete users from an existing Deepthought/Deepthought2 allocation
Basically, one of the points of contact for the allocation on one or both of the Deepthought HPC clusters just needs to send email to firstname.lastname@example.org requesting that the user be added to the allocation. The point of contact should identify themselves (ideally email should come from their @umd.edu email address), and also specify:
- which HPC cluster this is for (e.g. Deepthought, Deepthought2)
- the name of the allocation
- the Glue/TerpConnect username of the user to be added (i.e. campus LDAP directory ID, or the left part of their @umd.edu email address)
NOTE: To add a user to your MARCC/Bluecrab allocation, please see the section for granting access to Bluecrab allocations below.
Note that certain subdomains of umd.edu (e.g. cs.umd.edu, astro.umd.edu) are NOT part of the unified campus username space, and as those subdomains are NOT maintained by DIT, are not usable by us to uniquely identify people. E.g., email@example.com might or might not be firstname.lastname@example.org, so we cannot reliably map email@example.com to a specific person.
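The username rule above can be sketched in a few lines of Python. This is only an illustration, not a tool used by DIT: the function name `directory_id_from_email` and the sample addresses are hypothetical. It takes the left part of an address only when the domain is exactly umd.edu, and refuses subdomains such as cs.umd.edu because they are outside the unified campus username space.

```python
from typing import Optional

# The unified campus domain named in the text above.
CAMPUS_DOMAIN = "umd.edu"


def directory_id_from_email(address: str) -> Optional[str]:
    """Return the part left of '@' for an exact @umd.edu address, else None.

    Hypothetical helper: subdomains (cs.umd.edu, astro.umd.edu, ...) are
    rejected because they do not map reliably to a campus username.
    """
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local:
        # No '@' at all, or nothing to the left of it.
        return None
    if domain.lower() != CAMPUS_DOMAIN:
        # Departmental subdomain or an unrelated domain.
        return None
    return local


if __name__ == "__main__":
    # Sample addresses are made up for illustration.
    for addr in ("jdoe@umd.edu", "jdoe@cs.umd.edu", "jdoe"):
        print(addr, "->", directory_id_from_email(addr))
```

A point of contact could run a check like this over a list of addresses before sending a request, so that only usernames from the unified campus space are submitted.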
Note: For projects which have both standard and high priority allocations (e.g. have both foo and foo-hi allocations), users either have access to both allocations or to neither. Because of the replenishment process, anything else does not make sense. E.g., someone charging against the foo allocation at the beginning of a quarter is really charging against the next month's foo-hi allocation (but without the priority boost).
The DIT maintained HPC clusters are part of the campus Glue environment, and as such all users need to have active Glue/TerpConnect accounts. "Adding a user" to your HPC allocation basically is just enabling the user's existing Glue account access to the cluster and giving it the ability to charge against your allocation(s). Therefore the users you specify MUST already have TerpConnect accounts before access can be granted, and since the process to activate a TerpConnect account must begin with the user (and can take a day or so), it is best to ensure the user has a TerpConnect account before submitting the email request for access. See here for information and instructions on activating TerpConnect accounts. If you submit a request for users without a TerpConnect account, you will just get email back telling you they need to get a TerpConnect account first.
Requests to delete users from the allocation can be handled similarly. Here it does not matter whether the user's TerpConnect account is still active. If the user is not associated with any allocations other than yours, their access to the cluster (as well as to charge against your allocation(s)) will be revoked, and all access to their HPC home directory and any directories on lustre or data volumes will be revoked and those directories slated for deletion. If there is data which should be retained, you should mention that in the email so we can look into reassigning ownership. If the user has access to other allocations, only their ability to charge against your allocation will be revoked, and we will by default not do anything with respect to their home or data files. You should contact the user about any transfer of data that is required (and you can contact us if assistance is needed).
To create suballocations on the Deepthought HPC clusters
Certain contributors (e.g. CMNS, Engineering) have not allocated all of the resources they are entitled to from their contribution to the Deepthought clusters, and are instead periodically creating suballocations carved from these unallocated resources.
In general, we recommend avoiding the subdivision of allocations if at all possible. If your users can peacefully coexist in a single allocation without abusing resources, that will generally result in the most efficient use of resources. E.g., if there are two groups, groupA and groupB with access to the foo allocation of 100 kSU/month, and both groups average about 50 kSU/month in usage, if one month groupA only uses 30 kSU and groupB needs to use 65 kSU, there is no problem. But if you subdivide into fooA and fooB suballocations of 50 kSU each, one for each group, neither group can use more than 50 kSU in any month, even if the other group used none of its allocation. And even if both groups were granted access to both allocations, all jobs can only be charged against a single allocation, so if at end of the month both allocations had 1 kSU free each, a 1.5 kSU job could not be started even though there is a combined 2 kSU free.
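The arithmetic in the example above can be checked with a short Python sketch. It is a simplified model and not the scheduler's actual accounting: the function names are hypothetical, and the only rules it encodes are the ones stated in the text (a shared allocation pools usage, and each job is charged against exactly one allocation).

```python
from typing import Dict


def fits_pooled(pool_ksu: float, usage_ksu: Dict[str, float]) -> bool:
    """True if the combined usage of all groups fits in one shared allocation."""
    return sum(usage_ksu.values()) <= pool_ksu


def fits_split(limits_ksu: Dict[str, float], usage_ksu: Dict[str, float]) -> bool:
    """True only if every group stays within its own suballocation."""
    return all(usage_ksu[group] <= limits_ksu[group] for group in usage_ksu)


def job_can_start(free_ksu: Dict[str, float], job_ksu: float) -> bool:
    """A job is charged to a single allocation, so free balances cannot be combined."""
    return any(free >= job_ksu for free in free_ksu.values())


if __name__ == "__main__":
    usage = {"groupA": 30, "groupB": 65}  # kSU used in one month
    print(fits_pooled(100, usage))                            # shared foo allocation
    print(fits_split({"groupA": 50, "groupB": 50}, usage))    # fooA / fooB split
    print(job_can_start({"fooA": 1.0, "fooB": 1.0}, 1.5))     # 2 kSU free, but split
    print(job_can_start({"foo": 2.0}, 1.5))                   # 2 kSU free, pooled
```

With the numbers from the text, the pooled case succeeds (95 kSU of 100 kSU), the split case fails because groupB exceeds its 50 kSU, and the 1.5 kSU job starts only when the 2 kSU of free time sits in a single allocation.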
But we recognize that suballocations are sometimes needed. The points of contact for the contributions with unassigned allocations should send email to firstname.lastname@example.org including the following information:
- Name of the point of contact (ideally should be sent from your @umd.edu account)
- Name of the contribution the suballocation should be carved from
- Proposed name of the suballocation
- Quarterly size of the allocation in kSU (1000 core-hours)
- TerpConnect/Glue username of point of contact for suballocation (i.e. the left part of their @umd.edu email address)
- Name of department that the research will be done in
- You can also give additional users to be made members (or additional points of contact) for the suballocation (please specify members vs points of contact)
- If the suballocation should have a shorter lifetime than the parent contribution, you must specify its expiration date. Note that since our accounting is essentially quarterly, expiration dates on quarter boundaries (Jan 1, Apr 1, Jul 1, Oct 1) probably make the most sense.
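When choosing an expiration date, it can help to snap a target date to the next quarter boundary. The following Python sketch does that for the boundaries listed above (Jan 1, Apr 1, Jul 1, Oct 1). The function names are hypothetical and this is not part of any DIT tooling.

```python
from datetime import date

# Quarter boundaries named in the text: Jan 1, Apr 1, Jul 1, Oct 1.
QUARTER_START_MONTHS = (1, 4, 7, 10)


def is_quarter_boundary(d: date) -> bool:
    """True if d is the first day of a quarter."""
    return d.day == 1 and d.month in QUARTER_START_MONTHS


def next_quarter_boundary(d: date) -> date:
    """Return the first quarter boundary on or after d."""
    if is_quarter_boundary(d):
        return d
    for month in QUARTER_START_MONTHS:
        candidate = date(d.year, month, 1)
        if candidate > d:
            return candidate
    # Past Oct 1: the next boundary is Jan 1 of the following year.
    return date(d.year + 1, 1, 1)


if __name__ == "__main__":
    print(next_quarter_boundary(date(2024, 5, 17)))
    print(next_quarter_boundary(date(2024, 11, 2)))
```

For example, a project expected to finish in mid-May would get a suggested expiration of July 1 of the same year.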
Again, all points of contact and members of the suballocation MUST already have active Glue/TerpConnect accounts before submitting the request. See here for information and instructions on activating TerpConnect accounts.
Also, all such requests MUST come from a designated point of contact for the parent contribution.
Adjustments to sizes of suballocations do involve some manual labor, and we request that you try to organize your suballocations to minimize this. However, it is recognized that circumstances will require adjustments from time to time. The only information tracked for replenishing allocations is the quarterly value and the expiration date of the allocation, so we cannot really handle requests like "Give suballocation foo an extra 50 kSU for this quarter only." You will instead need to ask us to increase suballocation foo by 50 kSU, and at the start of the next quarter ask us to reduce it back down again.
How do I grant/get access to my MARCC/Bluecrab allocation?
Because MARCC is jointly operated by both Johns Hopkins University and the University of Maryland, accounts on the Bluecrab cluster cannot be directly tied in with either university's systems. So users on the Bluecrab cluster need to establish a new account expressly for the Bluecrab cluster.
This section assumes that you already have an allocation, or that you are seeking to get access to an existing allocation belonging to someone else. If you do not already have an allocation on Bluecrab, please see the section on requesting an allocation from the AAC.
Therefore, to gain access to the MARCC/Bluecrab cluster, the person requesting access must visit and fill in the MARCC/Bluecrab user account request form.
- Enter your University Directory ID (i.e., the part to the left of the @terpmail.umd.edu email address) in the field labelled such.
- Be sure to specify UMD as your University.
- For the phone number, please enter a number at which you can receive text/SMS messages. The MARCC/Bluecrab staff will send text messages when they have urgent need to contact you.
- You must enter the name and @umd.edu email address of the PI for the allocation you are requesting access to.
NOTE: If you just received notice that the allocation you requested was
created, you need to go through the above procedure to create an
account on the system. In this case, you are your own "sponsor,"
so enter your own name and
@umd.edu email address in
the fields for the sponsor. Do not do the steps
above before you receive an email from MARCC staff informing
you that the allocation has been created. (The email from UMD informing
you that your allocation was approved is NOT sufficient).
Your login id on the MARCC/Bluecrab cluster will be DIRID@umd.edu, where DIRID is your University of Maryland Directory ID.
To remove someone from your allocation, please email email@example.com with your request. Such requests should come from the PI for the allocation.
How do I get an Allocation from the AAC?
The CPU cycles from the compute nodes of the two Deepthought clusters purchased from Division of Information Technology (DIT) funds are made available to researchers at the University of Maryland via the HPC Allocations and Advisory Committee (AAC) through a proposal process. In addition, all of the CPU cycles allotted for use by the University of Maryland on the MARCC/Bluecrab cluster are made available to researchers through the same process. The AAC's role is to foster the use of these HPC resources for research which can make appropriate use of the resources.
Allocations of compute time are provided as service units (SUs), each of which represents one hour of wall clock time on one CPU core. Different categories of allocations provide cycles for newcomers (development: 20K SUs), for moderately demanding jobs (small: 60K SUs), and for compute-intensive research (large: 100K SUs). The larger allocations are naturally scrutinized more, and generally require the applicant to have shown reasonable knowledge of HPC and its issues, either from previous development grants or other experience on this or other clusters.
Allocations on the Deepthought clusters are one-time grants of SUs with a one-year (by default) expiration. SUs can be used as needed over the course of that year. Allocations on the Bluecrab cluster are also by default for one year, but SUs are allocated quarterly. E.g., if you requested 100 kSU to complete your project, you will get 25 kSU per quarter for four quarters. If you require/would prefer different timing (e.g. 50 kSU/quarter over 2 quarters, or 25 kSU for the first quarter and 75 kSU for the second of two quarters) please state such in the application. We are generally willing to accommodate such to the extent that we can.
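The quarterly split described above is simple enough to sketch in Python. This is an illustration only: the function names are hypothetical, and the default of four equal quarters is taken from the text. The second function checks that a custom schedule you plan to request still adds up to the total.

```python
import math
from typing import List, Sequence


def even_schedule(total_ksu: float, quarters: int = 4) -> List[float]:
    """Split a total evenly over the given number of quarters (default: 4)."""
    if quarters < 1:
        raise ValueError("quarters must be at least 1")
    return [total_ksu / quarters] * quarters


def schedule_matches_total(total_ksu: float, per_quarter_ksu: Sequence[float]) -> bool:
    """True if a custom per-quarter schedule sums to the requested total."""
    return math.isclose(sum(per_quarter_ksu), total_ksu)


if __name__ == "__main__":
    print(even_schedule(100))                        # default Bluecrab timing
    print(even_schedule(100, quarters=2))            # 50 kSU/quarter over 2 quarters
    print(schedule_matches_total(100, [25, 75]))     # front-light custom timing
```

The three calls correspond to the three timings mentioned in the text: the default four-quarter split, 50 kSU per quarter over two quarters, and 25 kSU followed by 75 kSU.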
If an application is approved by the AAC, the allocation will be created, by default, shortly after approval, typically within about one business day. If you would prefer a later starting date (e.g. you will not be able to start using the cluster immediately due to other priorities or because you are awaiting data), please specify such in the proposal, especially if there will be significant delay. The time between submission and approval can vary; if the application has sufficient detail that the AAC has no follow up questions, approval is typically within one or two business days. If there are follow up questions, an HPC administrator will contact you (typically via email) with the questions, and forward your replies back to the committee. Again, usually you should receive notification of approval or follow up questions within about one or two business days after a submission.
Requests submitted by students are restricted to the development (20 kSU) level on the Deepthought clusters, are non-renewable, and except in the case of proposals with special requirements are restricted to the original Deepthought cluster. Such special requirements will need to be made clear in the proposal; and the decision as to which cluster if approved is reserved to the AAC. Students must also provide the name and email address of their faculty advisor in the request. Student proposals are NOT eligible for the Bluecrab cluster.
Students are only allowed ONE allocation from the AAC for their duration with the university, and that will be a developmental allocation and not renewable. If more CPU cycles are required, their faculty advisor must apply for the allocation.
If a student requires an allocation of larger size than 20 kSU, or an allocation on the Bluecrab cluster, the proposal for the allocation must be made by their faculty advisor. If and when the proposal is granted, the advisor must request that the student be added to the new allocation.
Criteria used in making such determinations include appropriateness of the clusters for the intended computation, the specific hardware and/or software requested, a researcher's prior experience with high-performance computing, the track record of a requestor who has received HPCC allocations in the past, and the overall merits of the research itself.
The AAC will determine which of the HPC clusters is most appropriate for the request. If the requestor has a specific cluster in mind, that should be explicitly mentioned in the proposal. In addition, the proposal should provide enough information to justify the use of a specific cluster (e.g. the need for Matlab DCS or other cluster specific licenses, or GPUs, or large memory nodes). While the committee will consider requests for a specific cluster, the committee decides which cluster to grant for a proposal based on which cluster is most appropriate for the request.
Fill out this form to submit a proposal to the AAC for an allocation. The same form is used for new applications and for "renewal" applications (i.e. when requesting additional time on an existing allocation, either because more time is needed to complete the research than originally thought, or because it is desired to increase the scope of the research).
When applying to the AAC for an allocation, remember that the AAC generally prefers to award allocations for specific projects. It is best to make a proposal for specific projects, with milestones that can be achieved within one year (or whatever the time frame of the requested allocation is), and if needed make a renewal request for more time for a second set of goals. In addition, it is useful to include the following in your proposal:
- A quantitative estimate of the total number of SUs (where 1 SU = 1 CPU-core-hour) needed to complete the project. This can be as simple as an estimate of the number of jobs that need to be run times the estimated average SUs required per job (if running several different classes of jobs, maybe break the calculations down by class of job).
- If you are requesting time on the Bluecrab cluster, you should further break down the total number of SUs into SUs/quarter, as needed. If not specified, we will assume the total number of requested SUs is to be allotted over one year, 25% of requested SUs per quarter for each of 4 quarters.
- What, if any, parallelization strategies are used by your codes? Do they use distributed memory parallelization techniques (e.g. MPI) that allow the code to run over multiple nodes? Do they use shared memory parallelization techniques (e.g. OpenMP, threading) that allow for use of multiple cores on the same node? A hybrid of the two? Although the clusters are not restricted/reserved for parallel applications, it is hard to run large, parallel applications except on HPC clusters, so they do get some preference.
- If the codes are parallel, how well do they scale? Generally, as the number of processors given to a program increases, the incremental benefit from each new processor added diminishes (and at some point might even go negative). Thus there usually is an optimal value for the number of processors to use in the calculation. A discussion of how the code scales (if known), and/or how you plan to determine the optimal degree of parallelization would be of value.
- If this is a renewal of an existing application, you should discuss what has been accomplished with the original grant of SUs. Publications, etc. are always nice to include (and are useful when we need to convince budget directors of the value of investing in HPC resources), but less formal milestones can be included as well. If there was a miscalculation in the amount of compute time needed to complete the original goals, a discussion of such is worthwhile.
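The quantitative estimate and the scaling discussion requested above can be prepared with a small calculation. The Python sketch below is one possible approach, not a required format: the class and function names and the sample job mix are hypothetical. It assumes that 1 SU is one core for one hour of wall clock time, as defined earlier, and that a job is charged only for the cores it requests.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class JobClass:
    """One class of jobs in a proposal (all values are the applicant's estimates)."""
    name: str
    n_jobs: int
    cores_per_job: int
    hours_per_job: float  # wall clock hours per job

    def sus(self) -> float:
        # 1 SU = 1 CPU core for 1 hour of wall clock time.
        return self.n_jobs * self.cores_per_job * self.hours_per_job


def total_sus(classes: Iterable[JobClass]) -> float:
    """Sum the SU estimates over all job classes."""
    return sum(c.sus() for c in classes)


def parallel_efficiency(t1_hours: float, tn_hours: float, n_cores: int) -> float:
    """Speedup divided by core count: (T1 / Tn) / n, from measured run times."""
    return (t1_hours / tn_hours) / n_cores


if __name__ == "__main__":
    # Made-up job mix for illustration.
    jobs = [
        JobClass("short", n_jobs=200, cores_per_job=16, hours_per_job=2.0),
        JobClass("long", n_jobs=20, cores_per_job=64, hours_per_job=10.0),
    ]
    for c in jobs:
        print(c.name, c.sus(), "SU")
    print("total", total_sus(jobs), "SU")
    # A run taking 10 h on 1 core and 1 h on 16 cores.
    print("efficiency", parallel_efficiency(10.0, 1.0, 16))
```

An efficiency well below 1 at a given core count suggests the job would use SUs more economically with fewer cores, which is the kind of scaling observation the AAC asks for.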
The AAC is unlikely to grant large allocations without a good discussion of most of the above points. However, it is recognized that not all applicants are experienced High Performance Computing (HPC) experts. Indeed, one of the aims of the AAC with this allocation process is to allow researchers who are not even sure if HPC techniques will work for their research an opportunity to try HPC out without a large investment. So if you are unable to address all of the above points, you can still apply for an allocation. It is likely that the AAC will, at least initially, only approve your application for a developmental (20 kSU) allocation, but that should not be viewed as a setback. The 20 kSU allocation might even be enough for some small projects, but at minimum it should allow you to collect the information regarding SUs required per job, scalability of code, etc. to address the above points when you request additional time in a renewal allocation.
If you have questions regarding the application process, or with what information is requested or how to obtain such, or any other issues, please feel free to contact us.
Several samples of approved applications have been made available, with the kind consent of the applicants, to assist others who wish to apply.
How do I get access for a class I am teaching?
As befits an institute of learning, DIT is willing to make the
Because needs vary from class to class, we do not have a standard form for this request. Please state what class this is for (include the semester) and why you wish to make use of the HPC cluster. Also, include your estimates of the number of students, the number of jobs/student, and the size (number of cores) and length (walltime) of the jobs. Estimates of memory and disk usage are also useful if you have them.
We ask that such requests be made at least a month before the start of the semester, and that you review well before the semester starts whether the needed software is already installed on the cluster. It also behooves you to verify that the software is working properly and is the correct version BEFORE the semester starts. While we are willing to try to install any software you need on the cluster, please note that DIT will NOT purchase software for courses. If licensed software is required, you will have to provide the licenses (DO NOT PURCHASE software for the HPC cluster without contacting us first; not all software and/or licenses are compatible with an HPC cluster, and we do not want you to spend money on an incompatible product).
Typically, we will create a bunch of Glue temporary class accounts for
the students and TAs in the course, and will provide you with a list of the
initial passwords. You can then distribute these accounts to your students.
NOTE: you must keep track of which student was given what
account and provide it to us upon request, just in the unlikely event we need
to track a misbehaving student. Students can change the initial passwords
with the standard Unix password command. Should they forget their passwords,
the instructor can reset the password through
the campus Special Identity Management System (SIMS), or
contact the Deepthought systems staff at firstname.lastname@example.org to request that we
do so. NOTE: the password reset request MUST
come from the registered instructor of the class, as systems staff do not know
which accounts belong to which students.