Version 2 (modified by dzollars, 13 years ago)

HPC Communications

UltraScan's communication with the High Performance Computer (HPC) or Grid Cluster is implemented according to the above drawings. The tasks are accomplished as described below. The original OpenOffice document is attached to this page.

Laboratory Information Management System (LIMS)

The purpose of this system is to interface with the user, who specifies an analysis type, such as the Genetic Algorithm (GA) or Two Dimensional Spectrum Analysis (2DSA), along with the parameters the HPC system needs for that analysis. After the user specifies the needed data, it is packaged into a control.xml file. The command-line program grid-submit is then invoked.

The control.xml file will include a generated AnalysisGroupGUID and all needed child HPCAnalysisRequest records.

Specific data in the control.xml file will be specified here once we agree on this top level design.

The database tables HPCAnalysisGroup, HPCAnalysisRequest, and appropriate Settings tables are populated by LIMS before calling grid-submit.php.

US3 HPC Database Tables

LIMS is currently a Web interface. In the future, its functionality may be ported to the UltraScan client.

grid-submit

grid-submit.php is a command-line tool that creates the initial HPCAnalysisResult table entry with a queue status of 'Submitted'. It copies control.xml and all files that control.xml specifies to the HPC system using the gsiscp utility.

It then uses the submission technique needed for the specified supercomputer cluster to queue the job.

Supercomputer Queue

This task is controlled by the Supercomputer system. It is responsible for controlling the jobs running on that system and for communicating with clients.

Communication tasks include receiving tasks, returning job status, and informing the client when a task has been completed or aborted.

NNLS (UltraScan HPC Analysis Program)

The NNLS program reads the control.xml file and uses that as a guide to read other data files as needed to populate internal data structures. It then performs the analysis, writing any needed output to disk.

At the beginning of the program, periodically during execution, and at the end of processing, NNLS writes a UDP status datagram to a listener on the host and port specified in the control.xml file. Each datagram will consist of the analysisRequestGUID and a status (e.g. started, iteration number, finished). This is not reliable two-way communication: UDP gives no delivery guarantee, so it is the responsibility of the listener to follow up on and manage any missed messages.
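The status message described above can be sketched as follows. This is a minimal illustration, not the NNLS implementation: the wire layout (the GUID and the status as UTF-8 text separated by a single space) and the function names format_status and send_status are assumptions, since this page does not specify the datagram encoding.

```python
import socket


def format_status(request_guid: str, status: str) -> bytes:
    """Encode one status message.

    The "GUID<space>status" text layout is an assumption made for
    this sketch; the page only says the datagram carries both values.
    """
    return f"{request_guid} {status}".encode("utf-8")


def send_status(host: str, port: int, request_guid: str, status: str) -> None:
    """Send a single fire-and-forget UDP datagram to the listener.

    No acknowledgement is awaited, so a lost datagram goes unnoticed
    by the sender; recovery is left to the listener side.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_status(request_guid, status), (host, port))
```

A call such as send_status(host, port, guid, "started") would be made at startup, with host and port taken from control.xml, and repeated with other status values during and at the end of processing.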

grid-timeout

This program will either be scheduled periodically via cron or run as a daemon. It will check the status of jobs in the MySQL database and initiate a status query for jobs that have overdue status updates. If a job has been aborted, it will notify the grid-listen program of that status.
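The overdue check could look roughly like the sketch below. It is only an illustration under stated assumptions: an in-memory dictionary stands in for the rows read from the MySQL database, and the function name overdue_jobs, the job identifiers, the timestamps, and the 30-minute threshold are all hypothetical.

```python
from datetime import datetime, timedelta


def overdue_jobs(last_updates, now, max_silence):
    """Return the GUIDs of jobs whose latest status update is older than max_silence.

    last_updates maps a request GUID to the time of its most recent
    status update (a stand-in for a database query in this sketch).
    """
    return sorted(
        guid for guid, seen in last_updates.items() if now - seen > max_silence
    )


# Hypothetical data: job-b has been silent for 45 minutes.
now = datetime(2011, 1, 1, 12, 0, 0)
jobs = {
    "job-a": now - timedelta(minutes=2),
    "job-b": now - timedelta(minutes=45),
}
print(overdue_jobs(jobs, now, timedelta(minutes=30)))  # prints ['job-b']
```

Each GUID returned would then be passed to a status query against the Supercomputer Queue.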

grid-query

This is a command line program that submits a status query to the Supercomputer Queue and returns the result.

grid-listen

This program runs as a daemon, receiving UDP packets from the NNLS program or the grid-timeout program. It is responsible for updating the MySQL database table HPCAnalysisResult with the current status. Upon completion or abort of an analysis, it fetches needed files from the supercomputer cluster, sends an email to the user, and does any other necessary cleanup.
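A minimal sketch of such a listener loop is shown below. Several things here are assumptions rather than the actual design: the datagram is taken to be UTF-8 text of the form "GUID status", the status names in FINAL_STATES are hypothetical, a plain dictionary stands in for the HPCAnalysisResult table, and the post-processing step is left as a comment.

```python
import socket

# Assumed names for the statuses that end a job; the real values are not given here.
FINAL_STATES = {"finished", "aborted"}


def parse_status(datagram: bytes):
    """Split an assumed 'GUID status' text datagram into its two parts."""
    guid, _, status = datagram.decode("utf-8").strip().partition(" ")
    if not guid or not status:
        raise ValueError("malformed status datagram")
    return guid, status


def handle_datagram(datagram: bytes, results: dict) -> bool:
    """Record the latest status for a job; return True if the job has ended."""
    guid, status = parse_status(datagram)
    results[guid] = status  # stand-in for updating HPCAnalysisResult
    return status in FINAL_STATES


def serve(host: str, port: int, results: dict) -> None:
    """Receive status datagrams forever, ignoring any that cannot be parsed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            datagram, _sender = sock.recvfrom(4096)
            try:
                done = handle_datagram(datagram, results)
            except ValueError:
                continue  # also covers undecodable bytes (UnicodeDecodeError)
            if done:
                pass  # fetch result files, email the user, clean up
```

Keeping the parsing and bookkeeping in handle_datagram, separate from the socket loop, lets that logic be exercised without opening a network port.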
