Data access

Data access nodes

The experimental data can be accessed externally through scp/sftp from

psexport.slac.stanford.edu

This node also provides access to the data from the wired and wireless visitor networks at LCLS.
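
For example, an interactive sftp session to psexport can be opened as follows (a sketch; <username> stands for your SLAC account name):

sftp <username>@psexport.slac.stanford.edu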

The data can furthermore be accessed from the analysis machines, but NOT from pslogin itself. There are Ethernet jacks in the XPP control room table (labelled "data") which allow fast access to the data.

Analysis nodes

The main pool of analysis machines is accessible over a multi-hop ssh connection, e.g. from psexport or from machines at your home institution (in the latter case, use the full machine name pslogin.slac.stanford.edu):

ssh -X pslogin

ssh -X psana

psana automatically connects you to the machine in the analysis farm with the most available resources.
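
Assuming a reasonably recent OpenSSH client, the two hops can also be combined into a single command from outside SLAC (a sketch; <username> is a placeholder for your account name):

ssh -X -J <username>@pslogin.slac.stanford.edu psana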

The analysis machines can be reached directly from the user terminals in the XPP control room (the psusr13* machines), without needing to go through pslogin first.

Data location

Experimental data are stored in:

/reg/d/psdm/<hutch>/<hutch><expid>

<hutch> is the hutch name in lowercase and <expid> is the experiment ID. The data for each experiment is stored in a subdirectory named <hutch><expid>, for example xppcom10 (the XPP commissioning experiment) or xpp34511 (an experiment from 2011).

There are subfolders underneath for the different data formats and related files.

/xtc: the raw data is kept here. You will need psana to read this data; the XPP pipeline is built upon psana.

/calib: calibration files such as pedestals and geometry information are kept here.

/hdf5/smallData: smallData hdf5 files are written here by default.

/results: small files containing the results of the analysis. The analysis code for each experiment is also kept here.

/scratch: data such as reprocessed runs can be shared here between the experiment's collaborators.
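
As an illustration of this layout, the subfolders of the example experiment xpp34511 mentioned above could be inspected as follows (a sketch; the exact set of subfolders can differ between experiments):

ls /reg/d/psdm/xpp/xpp34511

ls /reg/d/psdm/xpp/xpp34511/xtc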

Data formats

The LCLS data are recorded as files identified by the experimental RUN NUMBERs. A RUN is started and stopped by the data acquisition system; it can be a series of shots/events or can be divided into different scan steps. Scan steps within the data files are called Calibcycles. Two data formats exist at the moment: the original data format, .xtc, and a translated, more generic data format, HDF5.

Since about 2015, an HDF5 format called SMALL DATA (formerly LITTLE DATA) has been available which contains only selected information that is typically needed for the analysis. The content can be customized, and the typical size per run is sub-gigabyte. This is described in more detail on this page.

The LCLS data analysis group provides a C/C++ and Python based data analysis framework for .xtc files. Instructions and tutorials can be found at https://confluence.slac.stanford.edu/display/PCDS/Data+Analysis.

There is also the "standard" HDF5 translation, which by default is a direct translation of the .xtc file. XPP has developed a set of MATLAB based functionalities for this data format. You should expect to have to update these a little, as the underlying data format has changed over time in minor ways. The translation for a RUN typically starts after the data acquisition has ended; however, 'online' translation can be set up for runs that are broken up into Calibcycles. You can also customize the translation to add intermediate data processing results or to exclude large area detectors. Underneath, this uses an older version of the LCLS analysis group's psana code and is not developed any longer, but the code still exists and runs. Please contact the beamline staff for more details.

Data transfer

The experimental data can be copied by scp/sftp as well as by the optimized bbcp protocol. For analyzing data on your own computer, it is recommended to mount the data as an ssh drive and to connect via the wired network available at the control room conference table.
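
A minimal sketch of such an ssh mount using sshfs (assuming sshfs is installed on your computer; <username>, the experiment and the local mount point are placeholders):

mkdir -p ~/lcls_data

sshfs <username>@psexport.slac.stanford.edu:/reg/d/psdm/xpp/xpp34511 ~/lcls_data

The mount can be removed again with fusermount -u ~/lcls_data (on Linux).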

scp/sftp

Any standard scp/sftp client program can be used for the data transfer.
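
For example, a single file could be copied from psexport into the current directory on your own machine with scp (a sketch; <username> and the file name are placeholders):

scp <username>@psexport.slac.stanford.edu:/reg/d/psdm/xpp/xpp34511/xtc/<filename>.xtc .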

bbcp

bbcp is a SLAC-developed protocol which uses multiple connections to optimize the use of the available bandwidth for data transfer. It is only available for Unix systems. For details see

https://confluence.slac.stanford.edu/display/PCDS/Using+BBCP
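
A minimal bbcp call could look like the following (a sketch only; <username>, the number of streams and the file name are placeholders, and the recommended options are documented on the page above):

bbcp -s 16 <username>@psexport.slac.stanford.edu:/reg/d/psdm/xpp/xpp34511/xtc/<filename>.xtc .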

If you have a lot of data, you can also use Globus Online to transfer your data. This is described here.

Offline Monitoring System

From any of the psana machines (e.g. psana1212):

shortcut (recommended):

/reg/neh/operator/xppopr/ami_offline &

You can also go to the installation directory and look for the latest version (e.g. 7.4.1-p7.2.8):

cd /reg/g/pcds/dist/pds/ami-7.4.1-p7.2.8/build/ami/bin/x86_64-linux-opt/

and launch it:

./offline_ami &

Note: there is a shortcut at XPP for the online version: xppstartami