Showing posts with label hacks. Show all posts

Tuesday, September 15, 2009

Hex-editing your GPFS Terabytes

I remember the old days when you could always find one of those geeky guys who would use a hex editor to "patch" your favorite game so that you became bullet-proof, or had 99...9 (still counting) lives, or ... whatever you wanted in order to win. Back then I was sure that hex editors were powerful enough to save you from a "disaster", but I couldn't imagine what such a disaster might be.

We use GPFS as the network file system for our clusters. Apart from serving as dumb scratch space for MPI jobs, it also holds some local users' home directories. A local team needed to expand their GPFS filesystem, so we had to add a few disks to our array. The procedure sounded trivial: add the new disks to the array, create a new logical volume, and finally add the new raw device to the GPFS filesystem.

But of course something went wrong. The new volume was about 10TB in size, which due to a GPFS limitation we had to split into at least 2 partitions. Easy work with parted, but what happens when parted (for a reason still unknown) "modifies" the partition table of another logical volume, one that belongs to the GPFS filesystem as a whole device (with no partition table)?

Well, parted simply ruins the first sectors of a GPFS NSD, which means it ruins all the valuable information on that disk (disk IDs and NSD IDs as well as the filesystem definition). The users report "We are receiving 'Input/Output error' when using the X file" and everything gets worse and worse.

Fortunately there IS a solution to this disaster. Although we couldn't find any official IBM documentation on it (apart from some posts in the GPFS forum), there is a way to recover from this situation. What you need is a hex editor, the famous "dd" and a lot of patience.

First, copy sector 8 from each disk in the GPFS filesystem. This sector is the File System Descriptor and it is common to all disks. Next we have to recover sector 2 and sector 1. Sector 2 is the GPFS disk identifier (also known as the NSD-ID). Finally, sector 1 contains information about the disk, which is called the disk descriptor.
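Before touching anything with a hex editor, it helps to save the sectors mentioned above to files, both from the damaged disk and from the healthy ones, so they can be compared side by side. Below is a minimal Python sketch of that backup step. It only reads, never writes to the device. It assumes 512-byte sectors and that sector numbers count from 0 (the same convention as dd's skip= option); both are assumptions, not something confirmed by IBM documentation. The function names and the device path in the usage note are made up for illustration.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def read_sector(device_path, sector, sector_size=SECTOR_SIZE):
    """Return the raw bytes of one sector. The device is opened read-only."""
    with open(device_path, "rb") as dev:
        dev.seek(sector * sector_size)  # assumption: sectors are numbered from 0
        data = dev.read(sector_size)
    if len(data) != sector_size:
        raise OSError("short read on sector %d of %s" % (sector, device_path))
    return data


def backup_sectors(device_path, out_dir, sectors=(1, 2, 8)):
    """Save each listed sector to its own file and return {sector: file path}."""
    os.makedirs(out_dir, exist_ok=True)
    saved = {}
    base = os.path.basename(device_path)
    for sector in sectors:
        out_path = os.path.join(out_dir, "%s.sector%d.bin" % (base, sector))
        with open(out_path, "wb") as out:
            out.write(read_sector(device_path, sector))
        saved[sector] = out_path
    return saved
```

Calling backup_sectors("/dev/sdX", "nsd-backup") (hypothetical device name, run as root) would leave three small files that can be opened in any hex editor. The same single-sector copy can be done with dd using bs=512, skip=8 and count=1.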

For legal reasons I'm not sure whether I'm allowed to reveal more about how to do this, but by carefully studying the sectors, starting from 8 and moving on to 2 and then 1, you should be able to recover your FS.

Saturday, June 20, 2009

OpenMP jobs on Grid? (The LCG-CE - PBS approach)

There was a user support request for OpenMP jobs on the Grid. OpenMP is a shared-memory programming model, which means that all threads of a job must run on the same box.

Well, this can easily be achieved on the PBS side by using the directive:
#PBS -l nodes=1:ppn=X

where "X" is the number of processors requested on that single node. But the main issue is HOW can we derive this requirement from what the WMS gives us on submission?

After googling this, it seems the "correct" solution can only be achieved with a CREAM CE, where users can specify a number of requirements that are not only used in the job matching process at the WMS but are also passed on to the CE. You can find more info on this here.

LCG CEs, on the other hand, only get a poor RSL which carries almost none of the user's requirements. So let's get into the LCG CE's internals...

First, a job reaches the globus-gatekeeper. At this phase the user's proxy is mapped to a pool account. The gatekeeper's task is to authenticate the user and the job and pass the job on to the globus job manager.

The globus job manager uses the GRAM protocol to report the job state and submits the job to the globus-job-manager-marshal, which uses a Perl module to talk to the relevant queuing system.

This Perl module is responsible for creating the job (a shell script) that will be submitted to the PBS server. In this module the CpuNumber requirement is translated by default to:
#PBS -l nodes=X

So this is the part we need to change in order to create OpenMP jobs. The next issue is how to find out whether the user has asked for an OpenMP job. I've noticed that the JDL option "Environment" is passed to the job executable that will be submitted, so a definition like the following:
Environment = {"OPENMP=true"};
can do the trick.
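To make the idea concrete, here is a rough sketch of the decision the job manager module would have to make. The real module is written in Perl and I'm not reproducing its code here; this is just the logic in Python, and the function names are made up for illustration. It assumes the Environment entries are available as a list of "NAME=value" strings and that CpuNumber arrives as an integer.

```python
def parse_environment(pairs):
    """Turn a list like ["OPENMP=true", "FOO=bar"] into a dict."""
    env = {}
    for pair in pairs:
        name, sep, value = pair.partition("=")
        if sep:  # ignore entries that have no "=" at all
            env[name.strip()] = value.strip()
    return env


def pbs_resource_line(cpu_number, environment):
    """Build the '#PBS -l' line for a job.

    With OPENMP=true in the environment, all processors are requested on a
    single node; otherwise the default nodes=X translation is kept.
    """
    if cpu_number < 1:
        raise ValueError("cpu_number must be at least 1")
    if environment.get("OPENMP", "").lower() == "true":
        return "#PBS -l nodes=1:ppn=%d" % cpu_number
    return "#PBS -l nodes=%d" % cpu_number
```

For a job with CpuNumber = 4, pbs_resource_line(4, parse_environment(["OPENMP=true"])) gives "#PBS -l nodes=1:ppn=4", while the same call without the OPENMP variable keeps the default "#PBS -l nodes=4".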

The whole approach above works and is more than OK as a proof of concept, but it certainly needs a lot more work...
In the (near) future I would like to test the CREAM CE which, as I said before, has a cleaner way to support requirements from JDLs using the CeForwardParameters definition.