HPC Communications
- Overview
- Data Flow, step 1
- Data Flow, step 2
- Data Flow, step 3
- Data Flow, step 4
- Data Flow, Asynchronous Step
UltraScan's communication with the High Performance Computer (HPC) or Grid Cluster is implemented according to the above drawings. The tasks are accomplished as described below. The original OpenOffice document is attached to this page.
Laboratory Information Management System (LIMS)
The purpose of this system is to interface with the user, who specifies an analysis type, such as the Genetic Algorithm (GA) or Two Dimensional Spectrum Analysis (2DSA), along with the parameters the HPC system needs for the analysis. After the user specifies the needed data, it is packaged into a control.xml file. The command-line program grid-submit is then invoked.
The contents of the control.xml file will include a generated AnalysisGroupGUID and all needed child HPCAnalysisRequest records.
Specific data in the control.xml file will be specified here once we agree on this top level design.
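Until the file layout is agreed, the following is only a minimal sketch of how a control.xml with a generated group GUID and child request records could be assembled. The element names (AnalysisGroup, HPCAnalysisRequest), the attribute names (guid, type), and the function build_control_xml are placeholders invented for illustration; they are not the agreed schema.

```python
import uuid
import xml.etree.ElementTree as ET


def build_control_xml(analysis_types):
    """Return (xml_text, group_guid) for a hypothetical control.xml layout."""
    group_guid = str(uuid.uuid4())
    # Element and attribute names are placeholders, not the agreed schema.
    root = ET.Element("AnalysisGroup", {"guid": group_guid})
    for analysis_type in analysis_types:
        # One child record per requested analysis, each with its own GUID.
        ET.SubElement(
            root,
            "HPCAnalysisRequest",
            {"guid": str(uuid.uuid4()), "type": analysis_type},
        )
    return ET.tostring(root, encoding="unicode"), group_guid


if __name__ == "__main__":
    xml_text, guid = build_control_xml(["2DSA", "GA"])
    print(xml_text)
```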
The database tables HPCAnalysisGroup, HPCAnalysisRequest, and appropriate Settings tables are populated by LIMS before calling grid-submit.php.
LIMS is currently a Web interface. In the future, its functionality may be ported to the UltraScan client.
grid-submit
grid-submit.php is a command line tool that creates the initial HPCAnalysisResult table entry with a queue status of 'Submitted'. It copies control.xml and all files that it specifies to the HPC system using the gsiscp utility.
It then uses the submission technique needed for the specified supercomputer cluster to queue the job.
Supercomputer Queue
This task is controlled by the Supercomputer system. It is responsible for controlling the jobs running on that system and for communicating with clients.
Communication tasks include receiving tasks, returning job status, and informing the client when a task has been completed or aborted.
MPI_Analysis (UltraScan HPC Analysis Program)
The MPI_Analysis program reads the control.xml file and uses that as a guide to read other data files as needed to populate internal data structures. It then performs the analysis, writing any needed output to disk.
At the beginning of the program, periodically during execution, and at the end of processing, MPI_Analysis writes a UDP status datagram to a listener on the host and port specified in the control.xml file. Each datagram will consist of the analysisRequestGUID and a status (e.g. started, iteration number, finished). This is not a reliable two-way communication, and it is the responsibility of the listener to follow up and manage any missed messages.
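The fire-and-forget nature of these status messages can be illustrated with a short sketch. It is written in Python for brevity and is not the MPI_Analysis code. The wire format (GUID, a space, then the status, as UTF-8 text) and the names format_status and send_status are assumptions, since the page does not define the datagram layout.

```python
import socket


def format_status(request_guid, status):
    # Hypothetical wire format: "<analysisRequestGUID> <status>" as UTF-8 text.
    return f"{request_guid} {status}".encode("utf-8")


def send_status(host, port, request_guid, status):
    """Send one status datagram; no acknowledgement is expected or awaited."""
    payload = format_status(request_guid, status)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

Because sendto returns as soon as the datagram is handed to the network stack, the sender cannot tell whether the listener received it, which is why missed messages must be handled on the listening side.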
grid-timeout
This program will either be scheduled periodically via cron or run as a daemon. It will check the status of jobs in the MySQL database and initiate a status query for jobs that have overdue status updates. If a job has been aborted, it will notify the grid-listen program of that status.
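The overdue check could look roughly like the sketch below. A plain dictionary stands in for the database query, and the status strings, the max_silence threshold, and the function name overdue_jobs are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def overdue_jobs(jobs, now, max_silence):
    """Return GUIDs of unfinished jobs with no status update within max_silence.

    `jobs` maps a request GUID to (status, last_update); it stands in for
    rows read from the database.
    """
    finished = {"finished", "aborted"}
    return [
        guid
        for guid, (status, last_update) in jobs.items()
        if status not in finished and now - last_update > max_silence
    ]


if __name__ == "__main__":
    now = datetime.now()
    jobs = {"job-1": ("running", now - timedelta(minutes=30))}
    print(overdue_jobs(jobs, now, timedelta(minutes=10)))
```

Each GUID returned would then be passed to a status query against the Supercomputer Queue.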
grid-query
This is a command line program that submits a status query to the Supercomputer Queue and returns the result.
grid-listen
This program runs as a daemon, receiving UDP packets from the MPI_Analysis program or the grid-timeout program. It is responsible for updating the MySQL database table HPCAnalysisResult with the current status. Upon completion or abort of an analysis, it fetches the needed files from the supercomputer cluster, sends an email to the user, and does any other cleanup necessary.
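A minimal sketch of the receive loop follows, assuming each datagram is UTF-8 text of the form GUID, space, status. That format, the terminal status names, and the functions parse_status and listen are hypothetical. A dictionary stands in for the HPCAnalysisResult table, and the follow-up work (file retrieval, email, cleanup) is left to the caller.

```python
import socket


def parse_status(datagram):
    """Split a datagram into (guid, status); assumes '<guid> <status>' text."""
    guid, _, status = datagram.decode("utf-8").partition(" ")
    return guid, status


def listen(sock, results, max_packets):
    """Receive up to max_packets datagrams and record the latest status per GUID.

    `results` is a plain dict standing in for the HPCAnalysisResult table.
    Returns the GUIDs that reached a terminal state so the caller can fetch
    files, notify the user, and clean up.
    """
    terminal = {"finished", "aborted"}
    done = []
    for _ in range(max_packets):
        datagram, _addr = sock.recvfrom(4096)
        guid, status = parse_status(datagram)
        results[guid] = status
        if status in terminal:
            done.append(guid)
    return done
```

A real daemon would loop indefinitely rather than stop after max_packets; the bound is there only to keep the sketch easy to exercise.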
Attachments (3)
- sc-comm.png (50.4 KB)
- LIMS3-schema.odp (20.5 KB): Original schema document
- LIMS3-schema.pdf (128.3 KB): LIMS3/GFAC/Globus job submission schema