Import module

A separate import module has been created for two reasons:

1) to perform a DICOM to NIFTI conversion and store ancillary DICOM header information in *.mat files and

2) to restructure an existing arbitrary scan directory structure into a structure that ExploreASL is familiar with.

This import module uses regular expressions; the directory structure can be flexibly defined, as long as it is fixed per study. The search function then searches across these directories, copies all files into the ExploreASL directory structure, and performs the conversion to NIFTI where necessary.

The function ExploreASL_Import does the import using the parameters set up by ExploreASL_ImportConfig.

The simplest way is to execute:

ExploreASL_Import(ExploreASL_ImportConfig('E:\Backup\ASL_E\ASL_PipelineCourse\2D_Sleep'));


The function extracts:

ROOT = E:\Backup\ASL_E\ASL_PipelineCourse;

studyID = 2D_Sleep;

It looks in ROOT\studyID\raw, extracts the NIFTIs or DICOMs, and copies them to ROOT\studyID\analysis.
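The splitting of the input path into ROOT and studyID can be illustrated with a short Python sketch (the actual import is implemented in MATLAB; this is only a mock-up of the path logic):

```python
import ntpath  # Windows-style path handling, since the example paths use backslashes

def split_study_path(study_path):
    """Split 'ROOT\\studyID' into its two components (illustration only)."""
    root, study_id = ntpath.split(study_path.rstrip('\\'))
    return root, study_id

root, study_id = split_study_path(r'E:\Backup\ASL_E\ASL_PipelineCourse\2D_Sleep')
print(root)      # E:\Backup\ASL_E\ASL_PipelineCourse
print(study_id)  # 2D_Sleep
```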


The 2D_Sleep study data example has the following original directory structure: ROOT\raw\subject\session\scan_ID, with the scan_ID directories containing the DICOMs. See the image below:

ExploreASL creates the following directory structure, ROOT\analysis\subject\, placing structural images directly in the subject directory and ASL images in the directories ASL_1 and ASL_2, depending on the session - see below:

The import config module contains the following code (different for each study):

case '2D_Sleep'

folderHierarchy = { '^(\d{3}).*', '^Session([12])$','^(PSEUDO_10_min|T1-weighted|M0)$'};

tokenOrdering = [ 1 2 3];

tokenSessionAliases = { '^1$', 'ASL_1'; '^2$', 'ASL_2' };

tokenScanAliases = { '^PSEUDO_10_min$', 'ASL4D'; '^M0$', 'M0'; '^T1-weighted$', 'T1' };

bMatchDirectories = true;

case '2D_Sleep'

Here, the studyID is given. Each study, identified by its studyID, has its own entry.


folderHierarchy = { '^(\d{3}).*', '^Session([12])$','^(PSEUDO_10_min|T1-weighted|M0)$'};

This specifies the names of all directories at all levels. In this case, we have specified regular expressions for directories at three different levels:

    • ^(\d{3}).*
    • ^Session([12])$
    • ^(PSEUDO_10_min|T1-weighted|M0)$

This means that we will identify three directory levels, each with a name matching the given regular expression (for more, see https://de.mathworks.com/help/matlab/ref/regexp.html).

At each directory level, it first decides whether the directory matches the regular expression, and then whether to extract tokens from the string - typically, tokens are the parts in brackets. More below.

  • Extracting a token: ^(\d{3}).*
  • Not extracting a token: ^\d{3}.*

Tokens are pieces of information extracted from the directory name at each directory level. The tokens are numbered by directory level and can later be used to label patients, sequences, etc. See tokenOrdering below.
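Token extraction can be illustrated with Python's `re` module (these simple patterns behave the same in MATLAB's `regexp`):

```python
import re

# With a capturing group: the three digits are returned as a token
m = re.match(r'^(\d{3}).*', '101_subject')
print(m.group(1))  # '101'

# Without a group: the directory name still matches, but no token is captured
m = re.match(r'^\d{3}.*', '101_subject')
print(m.groups())  # ()
```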

We use normal characters and metacharacters. Expressions built from characters and metacharacters are then further combined with quantifiers, grouping operators, and conditional operators. Some basic regular expressions are described here with examples - for a full description, see the link above:

    • Metacharacters
      • . -- any character -- '..ain' == 'drain' == 'train' != 'pain' != 'rain'
      • [c1c2] -- any character within the brackets -- '[rp.]ain' == 'rain' == 'pain' == '.ain' != 'wain'
      • \d -- any numeric digit, equivalent to [0-9]
    • Quantifiers (applied on an expression)
      • expr* -- repeat 0+ times -- '.*' can give anything
      • expr? -- repeat 0 or 1 times - '.?' == '' == 'p' != 'pp'
      • expr+ -- repeat 1+ times
      • expr{n} -- repeat n times consecutively
    • Grouping operators (allow capturing tokens)
      • (expr) -- group elements and capture the matched text as a token -- '(.)ain' matches 'rain' or 'pain' and returns 'r' or 'p' as a token
      • (?:expr) -- group elements but do not capture tokens
    • Anchors
      • ^expr -- beginning of a text -- '^M.*' is any string starting with M
      • expr$ -- end of the text -- '.*M$' is any string ending with M
    • Conditional operators
      • expr1|expr2 -- matches expression 1 or 2


Some examples of strings:

    • ^(\d{3}).* -- a string starting with 3 digits and any ending, the digits are extracted as tokens.
      • 001a, 212abasd, 231absd
    • ^(P\d{2})$ -- a string starting with P and ending with two digits, the whole string is taken as a token.
      • P12, P32
    • .*(T1_MPR|pcasl|M0).*\.PAR$ -- a string with any beginning, containing T1_MPR or pcasl or M0 (this is taken as a token), and ending by .PAR.
      • test_T1_MPR_testtest.PAR, another_pcasl.PAR, M0_test_.PAR
    • .*(T1|ASL).*(PCASL|PASL) -- matches a string containing T1 or ASL, followed later by PCASL or PASL. Two tokens are extracted.

In the above example, the first level is the subject, which has three digits (e.g. 101 or 102), specified by the regular expression \d{3}. This is placed between brackets ( ) to define it as the first token.
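The example patterns above can be verified with a few lines of Python (the patterns are interchangeable with MATLAB's `regexp` here):

```python
import re

# Each (pattern, example name) pair comes from the examples above
examples = [
    (r'^(\d{3}).*', '212abasd'),
    (r'^(P\d{2})$', 'P12'),
    (r'.*(T1_MPR|pcasl|M0).*\.PAR$', 'test_T1_MPR_testtest.PAR'),
]
for pattern, name in examples:
    token = re.match(pattern, name).group(1)
    print(f'{name!r} matches {pattern!r}, token: {token!r}')
```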

tokenOrdering = [ 1 2 3];

Tokens (parts of the directory names) were extracted according to the regular expressions above. Here we decide how the tokens are used.

This is specified by tokenOrdering = [patientName, SessionName, ScanName]

  • tokenOrdering = [ 1 2 3]; = first token is used for patient name, second for session name, third for scan name
  • tokenOrdering = [ 1 0 2]; = first token is used for patient name, second for scan name, session name is not assigned
  • tokenOrdering = [ 1 3 2]; = first token is used for patient name, third for session name, second for scan name
  • tokenOrdering = [ 2 1 3]; = second token is used for patient name, first for session name, third for scan name
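The mapping rules above can be sketched in Python (a hypothetical mock-up of the selection logic; the real implementation lives in ExploreASL_Import, in MATLAB):

```python
def apply_token_ordering(tokens, token_ordering):
    """Map extracted tokens to (patient, session, scan); 0 means 'not assigned'."""
    def pick(i):
        return tokens[i - 1] if i > 0 else None
    return tuple(pick(i) for i in token_ordering)

# tokenOrdering = [1 2 3]: tokens map directly to patient, session, scan
print(apply_token_ordering(['101', '1', 'PSEUDO_10_min'], [1, 2, 3]))
# -> ('101', '1', 'PSEUDO_10_min')

# tokenOrdering = [1 0 2]: no session token; the second token names the scan
print(apply_token_ordering(['101', 'T1-weighted'], [1, 0, 2]))
# -> ('101', None, 'T1-weighted')
```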


tokenSessionAliases = { '^1$', 'ASL_1'; '^2$', 'ASL_2' };

The second token defines the name of the session. The token was extracted using a regular expression ^Session([12])$. This can represent a string Session1 or Session2. And either 1 or 2 is taken as the token.

In the session-aliases, each row represents one session name. The first column is the regular expression for the token, the second column gives the final name.

^1$ ASL_1

^2$ ASL_2

Here, the token matching ^1$ - that is, a string equal to "1" - is replaced in the analysis folder by the session name ASL_1.


tokenScanAliases = { '^PSEUDO_10_min$', 'ASL4D'; '^M0$', 'M0'; '^T1-weighted$', 'T1' };

The third token defines the name of the scan. Each row represents one scan name. The first column is the regular expression for the token, the second column gives the final name of the scan.

^PSEUDO_10_min$ ASL4D

^M0$ M0

^T1-weighted$ T1

The DICOMs in the directory PSEUDO_10_min are thus extracted to a NIFTI file called ASL4D.nii.
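Both alias tables can be read as ordered (pattern, final name) pairs; a minimal Python sketch of the lookup (illustration only, not the ExploreASL implementation):

```python
import re

# The alias tables from the config above, as (pattern, final name) pairs
token_scan_aliases = [
    (r'^PSEUDO_10_min$', 'ASL4D'),
    (r'^M0$', 'M0'),
    (r'^T1-weighted$', 'T1'),
]
token_session_aliases = [
    (r'^1$', 'ASL_1'),
    (r'^2$', 'ASL_2'),
]

def resolve_alias(token, aliases):
    """Return the final name for the first pattern the token matches, else None."""
    for pattern, final_name in aliases:
        if re.match(pattern, token):
            return final_name
    return None

print(resolve_alias('PSEUDO_10_min', token_scan_aliases))  # ASL4D
print(resolve_alias('2', token_session_aliases))           # ASL_2
```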

Note: The names ASL4D, M0, T1, FLAIR and WMH_SEGM are fixed names in the pipeline data structure. ASL4D and T1 are required, whereas M0, FLAIR and WMH_SEGM are optional. When M0 is available, the pipeline will divide the perfusion-weighted image by the M0 image in the quantification part. When FLAIR and WMH_SEGM are available, tissue misclassification in the T1 gray matter (GM) and white matter (WM) because of white matter hyperintensities (WMH) will be corrected.

bMatchDirectories = true;

Set to TRUE if it should look for directories with DICOMs inside. Set to FALSE if it should look for individual files that define the scans instead (see the tips below).


Summary of the 2D_Sleep example

The import module creates an overview CSV-file of all imported scans. It will ask whether you want to open this file after the import has been performed:

In the current example of 2D_Sleep, we have two subjects, 101 and 102, each with two ASL sessions. In ExploreASL, each subject has a single T1 scan and can have multiple ASL sessions. This is the case when the anatomy is not expected to change between the ASL sessions, e.g. when scans are repeated before and after a CO2 or treatment challenge. The data structure will be analysis\SubjectName\T1.nii for the anatomical scans (T1 or FLAIR) and analysis\SubjectName\ASL_1\ASL4D.nii and analysis\SubjectName\ASL_2\ASL4D.nii for the ASL sessions.

If it concerns a follow-up study, where scan sessions are months or years apart and brain anatomy is expected to change significantly between them, then the data should be set up as a longitudinal study design. In this case, different time points are set up as analysis\SubjectName_1\T1.nii and analysis\SubjectName_2\T1.nii for two longitudinal scans of the same subject, where _1 designates the time point. So a longitudinal study with two ASL scans per scan session (e.g. a medication challenge repeated with a 6-month interval) would look like analysis\SubjectName_1\T1.nii, analysis\SubjectName_1\ASL_1\ASL4D.nii & analysis\SubjectName_1\ASL_2\ASL4D.nii for the first time point, and analysis\SubjectName_2\T1.nii, analysis\SubjectName_2\ASL_1\ASL4D.nii & analysis\SubjectName_2\ASL_2\ASL4D.nii for the second time point.

Import tips & tricks in Windows

You can download the Bulk Rename Utility to change file names into ones that can be processed more easily by the import script.

Check the variable bMatchDirectories. This should be set to true (1) if you search for a directory that defines the scan, i.e. each scan is in a separate directory. If you set this to false (0), it will search for files that define the scan, i.e. each scan is in a separate file.

You can use the MicroDicom viewer and its shell extension to browse DICOMs quickly. To use the Windows shell extension, make sure the file extension is set to .dcm (which you can do with the Bulk Rename Utility).

Parameter files (meta-data)

For each converted NIFTI file, a parameter file is created as well. This parameter file (e.g. M0_parms.mat) contains DICOM header information that is important for the post-processing - especially the quantification - of the ASL data. For a T1 or FLAIR, this information is less relevant. For ASL or M0, the following DICOM header information is stored:

RepetitionTime (TR) is required to compute the incomplete T1 recovery; EchoTime (TE) to compute the T2* decay during readout (only for a 2D readout, not for 3D). NumberOfTemporalPositions is the number of measurement repetitions (here 60, or 30 control-label pairs, for the M0 in the 2D_Sleep study). The last three parameters are scale slopes used in Philips data to store 16-bit information from MRI scans in a 12-bit file size. The same parameters are summarized in the summary CSV file:

Excel has a feature to show these comma-separated values (CSV) in separate cells upon opening of the file.

Additional QC

One thing you could do is to quickly check whether the sizes of ASL4D.nii and M0.nii (and other files) are the same across participants. If any slices are missing, or the data ordering is strange, the data size will change.
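This check can be scripted with the Python standard library; the sketch below assumes the analysis\Subject\... layout described above and compares on-disk file sizes as a quick proxy for matrix-size differences:

```python
import os
from collections import defaultdict

def file_sizes_by_name(analysis_dir, filenames=('ASL4D.nii', 'M0.nii')):
    """Collect on-disk sizes of the given files, grouped by filename."""
    sizes = defaultdict(dict)
    for dirpath, _, files in os.walk(analysis_dir):
        for name in files:
            if name in filenames:
                sizes[name][dirpath] = os.path.getsize(os.path.join(dirpath, name))
    return sizes

def inconsistent_files(analysis_dir):
    """List filenames whose size differs between participant directories."""
    return [name for name, per_dir in file_sizes_by_name(analysis_dir).items()
            if len(set(per_dir.values())) > 1]
```

Running `inconsistent_files` on the analysis folder returns the names of scans whose file size is not identical across subjects, which are the ones worth inspecting first.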

In some rare cases with NSA>1, the ASL4D.nii gets wrongly structured. Always first try to restore the orientation (RestoreOrientation.m). In rare cases it can also help to manually reposition the NIfTI with the SPM viewer.