1. SmallData Production

How to configure and generate the small data

The small data generation builds on local development of the PSANA framework. It can be customized and run both online against the shared memory and offline against the .xtc files. Parallel computing is built in. For a typical experiment, at the moment, Silke will help set up the small data processing/generation in the directory /reg/d/psdm/<hutch>/<expname>/results/smalldata_tools.

Here, <hutch> will be e.g. xpp and the experiment name <expname> could be e.g. xpptut15.

A "driver" python file (typically called SmallDataProducer.py) can then be edited to, e.g., choose and optimize a different ROI on area detectors, define beam center for radial integration, define delay time range and bin sizes, etc. This driver file lives in the examples directory, along with the run scripts.

The default output will be saved to

/reg/d/psdm/xpp/xpp*****/hdf5/smalldata

To run a single DAQ run, you can use (in the results directory or in your own private release):

./examples/smallDataRun -r <#> -e xpp....

During data taking, you can omit the "-e experiment_name" parameter and the jobs will be sent to a special queue with priority access.
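For example, to process run 42 of the tutorial experiment offline (both the run number and the experiment name here are placeholders):

./examples/smallDataRun -r 42 -e xpptut15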

Configuring the small data - point detectors only

If the data streams you care about are "point detectors" only (e.g. PIPS diodes, the FEE gas detector, analogue voltages), please start with SmallDataProducer.py; the only section you need to pay attention to is commented in the file itself:

Default SmallData, UserInput

##########################################################
##
## User Input start -->
##
##########################################################
# run-independent parameters
##########################################################
# aliases for experiment-specific PVs go here
epicsPV = ['s1h_w']
# tt calibration parameters (if None is passed, init will use the values used during recording)
ttCalibPars = None # or [] or [p0, p1, p2]
##########################################################
##
## <-- User Input end
##
##########################################################

You can leave the event codes alone unless your point of contact tells you otherwise. If you have user motors, e.g. for sample motion, whose positions you would like to save, or if you are using a Lakeshore for temperature control or have other remotely controlled devices whose readings you would like to save, you will need to add them to the epicsPV line. For the exact names, ask the beam line staff. Please note that 's1h_w' is only an example/placeholder.
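As an illustration, if you also wanted to save two sample motors and a Lakeshore temperature readout, the line might look like the following (the aliases samx, samy and lakeshore_temp are made-up placeholders; get the real PV names from the beam line staff):

epicsPV = ['s1h_w', 'samx', 'samy', 'lakeshore_temp']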

As soon as you have taken and analyzed a time tool calibration run, you should put the obtained parameters into ttCalibPars. They will be used for any new production (e.g. for reprocessing of runs taken before the calibration was in place). If it is set to None or an empty list, the calibration saved in the data is used.
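For example, if your calibration fit returned the polynomial coefficients below (entirely made-up placeholder values), you would set:

ttCalibPars = [1.04, 0.0031, -1.2e-6] # p0, p1, p2 from your time tool calibration run (placeholder values)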

Configuring the small data - addition of the time tool traces

If you have been told that you might have to reanalyze the time tool data, or you have past experience and feel you can gain by doing so, you can add the event-by-event trace of the OPAL camera used for the time tool like this:

##########################################################
## User Input start -->
##########################################################
## pass along a user-selected ROI
#ttROI = [[v11,v12],[v21,v22]]
#ldr.set_ttRaw(ROIPars=ttROI)
## no argument will use the ROI as used in the DAQ
ldr.set_ttRaw()
#
# other user code here.
#
##########################################################
## <-- User Input end
##########################################################
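As a sketch of the first option, with placeholder pixel ranges (use your camera's actual geometry; the axis order follows the [[v11,v12],[v21,v22]] template above), you would uncomment and fill in the ROI lines:

ttROI = [[0, 1024], [400, 460]] # placeholder pixel ranges
ldr.set_ttRaw(ROIPars=ttROI)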

Configuring the small data - addition of userData

In addition to the points made above, there are a few more sections to add if you are adding "userData" (reduced area detector data). This is described on its own subpage.

Running and checking the automatic production

During the experiment we will produce the smallData files automatically. This is done by a script which monitors the run database and submits a job to the (high priority) queue when it finds a new run for which no hdf5 file has been created yet. This script runs in a "procServ" session. You start the session by running, in smalldata_tools:

./examples/scripts/run-monitor-smallDataProduction

At this point, the script will output some error messages that can be ignored. Check whether the script is running with:

telnet localhost 40001

You should see messages scrolling by every 30 seconds or so. These messages are also written to a log file named something like /tmp/monitor-smallData.psanagpu101.1472164389.
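If you prefer not to go through telnet, you can also follow these messages directly in the log file, e.g. (using the example file name from above):

tail -f /tmp/monitor-smallData.psanagpu101.1472164389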

If you cannot remember on which machine you are running the automatic production, look in the log files:

snelson@psanagpu101:/reg/d/psdm/xpp/xppn4116/results$ grep submitted logs/smallData_xppn4116_005*

logs/smallData_xppn4116_0050_716414.out:Job <mpirun python ./examples/SmallDataProducer.py --run 0050> was submitted from host <psanagpu102.pcdsn> by user <username> in cluster <slac>.

If files stop appearing, log into the machine running the automatic generation and connect to the telnet session. In that session, type:

> ^x

to restart the process. To stop and completely restart, use:

> ^t <-- check whether auto restart is on; it needs to be off for a stop
> ^x <-- stops the process and will tell you what to do next, which is either
> ^R <-- restart
> ^Q <-- kill the session
> ^] <-- quit the procServ session; then use "quit" to exit out of the telnet session

Please make sure you are only running one session of the automatic producer; system performance goes down if multiple jobs read the same file from the ffb filesystem.

If the output files do not appear but log files are produced in /reg/d/psdm/xpp/<expname>/results/logs, take a look at the log file. If the error message makes sense to you (a typo in the driver script, ...), you can fix it yourself. If not, ask your POC to call the expert.

(Optional) Create your local copy of the small data generation module

Sometimes you will want to create your own version, for example when multiple people are working on small data for the same experiment at the same time with different settings. First, set up the psana environment by typing the following in a terminal on a psana node (assuming you are in a bash shell); then create your own copy of the analysis code as shown below. You will also need a "token" to check out the module code: if you ssh in, you will have one; if you use the nxserver, you will need to run "kinit" to get one.

source /reg/g/psdm/etc/psconda.sh

Then, check out smalldata_tools from github:

https://github.com/slac-lcls/smalldata_tools
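For example, to clone it into your current directory (standard git usage; pick whatever target directory suits you):

git clone https://github.com/slac-lcls/smalldata_tools.git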

smallDataRun contains the name of the driver script (by default SmallDataProducer.py); if you did not do exactly as written above, check that it uses the copy you are editing.