Running 1st and 2nd level BOLD task activation analyses#

Task activation analysis in QuNex uses a standard general linear model (GLM) based approach. Notably, QuNex leverages the same GLM 'engine' when computing regression for functional connectivity preprocessing described in the section on BOLD task-evoked and resting-state functional connectivity preprocessing.

We recommend that the initial steps of creating BOLD image brain masks, computing image statistics, and creating image statistics reports described above are performed before conducting task-based BOLD activation analyses. One reason is that they allow inspection of the quality of data and identification of sessions that might need to be excluded due to excessive movement. Another reason is that you might want to exclude bad frames from GLM computation and/or regress movement parameters to improve results.

This is a short overview of steps and commands that are available for BOLD task-based activation analysis. It is divided into two parts.

  1. The first part relates to first level single-session analysis in which one obtains session-specific GLM estimates of task-evoked responses.

  2. The second part relates to second level group analysis in which specific hypotheses are tested at the group level.

First level (single session) analysis#

Creating conc and fidl files#

Prior the first level analysis, two pieces of information need to be prepared:

  1. Which BOLD images and in what order they were acquired during task performance. This information is provided using .conc files.

  2. The second piece of information is the exact timing and duration of events that are to be modeled in the task GLM analysis. This information is provided in .fidl files.

.conc and .fidl files have to be created manually and for each session separately. For the specific format of both files see the specification of fidl and conc files.

In short, .conc files should contain a list of BOLD image files that are to be included in the analysis in the order in which they are to be concatenated.

Creating conc and fidl files - examples#

Example of a .conc file:

number_of_files: 4
    file:/Volumes/tigr/MBLab/fMRI/sWM/sessions/OP108/images/functional/bold1_Atlas.dtseries.nii
    file:/Volumes/tigr/MBLab/fMRI/sWM/sessions/OP108/images/functional/bold2_Atlas.dtseries.nii
    file:/Volumes/tigr/MBLab/fMRI/sWM/sessions/OP108/images/functional/bold3_Atlas.dtseries.nii
    file:/Volumes/tigr/MBLab/fMRI/sWM/sessions/OP108/images/functional/bold4_Atlas.dtseries.nii

This is a .conc file for one session. The file gives information that four BOLD images were acquired during task performance and wished to be used in the analysis. Moreover, complete directories of BOLD images are provided after the file: notation and they are written in the order in which they were acquired.

.fidl files start with the information on TR with which the BOLD images were acquired, and the list of events that are to be modeled. Each subsequent line then specifies the time, event type index, and event duration for each of the events to be modeled. Following event duration one can also specify the value of additional behavioral (or other) coregressors. Note, it is essential that the sequence of BOLD image files in .conc files matches the timing information in .fidl files. Specifically, the timing information in .fidl files is given as if task BOLD images would represent a single uninterrupted scan.


Example of a .fidl file:

2.5  encoding  delay  response
12.5  0  2.5
15.5  1  9.5
25    2  3
40    0  2.5
43    1  9.5
52.5  2  3
70    0  2.5
73    1  9.5
82.5  2  3
...

This is a snippet of a .fidl file for one session. The first line of the file gives information on the TR (i.e. 2.5) and the names of all events (i.e. 'encoding', 'delay', 'response'). Following lines in the file are divided into three columns, where the first column gives information on the start of the event (in seconds and aligned with TR), the second column represents the code of the event (the index of the event from the events list in the first line), and the third column gives the duration of the event (in seconds).

To make use of .conc and .fidl files it is best to put them in <study>/sessions/inbox/events folder, where they would be found by the processing commands and copied to the individual's <study>/sessions/<session id>/images/functional/concs and <study>/sessions/<session_id>/images/functional/events folders for .conc and .fidl files, respectively. To identify correctly which file belongs to which participant or session, the following naming convention needs to be used:

  • .conc files: <session_id>_<task name>.conc

  • .fidl files: <session_id>_<task name>_<modeling name>.fidl

Running GLM analysis#

The first step in task-evoked BOLD analysis is a single-session or first level statistical analysis. This is most commonly accomplished with regression analysis using the general linear modeling (GLM) framework. GLM is used to obtain activation (beta) estimates while (optionally) also accounting for nuisance signals (e.g. motion, white matter signal, ventricle signal etc.). Before computing GLMs, additional preprocessing steps can also be performed, such as spatial smoothing and temporal filtering.

These analysis steps can be done with either preprocess_bold or preprocess_conc QuNex commands. preprocess_bold can be used when task spanned only a single BOLD image. More commonly though, tasks spans a series of BOLD images specified in a .conc file in which case preprocess_conc should be used. Do note that as tasks usually span multiple BOLD images, existing functionality is enhanced and novel functionality added to preprocess_conc command first. Going forward the two commands might be combined into a single one.

preprocess_conc is a complex command that enables multiple preprocessing steps, in general GLM computation, removal of nuisance signal, spatial smoothing, temporal filtering (high-pass, low-pass) and motion scrubbing. Detailed information on the command inputs and options should be checked in the command help documentation and on BOLD Preprocessing page.

The command produces several different outputs as listed below:

  • residual image:

    <root>_res-<regressors><glm name>.<ext>

  • GLM image:

    <bold name><bold tail>_conc_<event root>_res-<regressors><glm name>_Bcoeff.<ext>.

  • text GLM regressor matrix:

    <bold name><bold tail>_GLM-X_<event root>_res-<regressors><glm name>.txt.

  • image of a regressor matrix:

    <bold name><bold tail>_GLM-X_<event root>_res-<regressors><glm name>.png.

While residual and GLM images are stored in <study folder>/sessions/<session id>/images/functional, text files and images of a GLM regressor matrix can be found in <study folder>/sessions/<session id>/images/functional/glm.

Running GLM analysis - example#

Example use of preprocess_conc:

qunex preprocess_conc \
    --batchfile="batch.txt" \
    --sessionsfolder="sessions" \
    --overwrite="yes" \
    --parsessions="6" \
    --bold_actions="sr" \
    --bold_nuisance="e" \
    --glm_name="_A" \
    --glm_results="all" \
    --event_string="encoding:boynton|delay:boynton|response:boynton" \
    --bold_preprocess="sWM" \
    --event_file="sWM_A" \
    --nprocess="0" \
    --image_target=cifti \
    --hcp_cifti_tail="_Atlas" \
    --conc_use="absolute"

The breakdown of the parameters is as follows:

  • sessions Path to batch file

  • sessionsfolder Path to study sessions folder

  • overwrite Whether to overwrite the existing files (yes) or not (no)

  • parsessions How many sessions to process in parallel

  • bold_actions Which actions to perform, e.g. 'src' for smoothing and regression

  • bold_nuisance What nuisance signal to model, at least 'e' for event modeling

  • glm_name The name of the resulting GLM file

  • glm_results Which results from GLM regression to save, e.g. 'cr' for beta coef. and GLM residuals

  • event_string A string of regressor to be used in the GLM modeling

  • bold_preprocess The root name of the conc files

  • event_file The root name of the fidl files

  • nprocess How many sessions to process, e.g. 0 for all

  • image_target What file format to process

  • hcp_cifti_tail The tail of the cifti files

  • conc_use 'relative' (extract and use only bold number), 'absolute' (use whole path)

Preparing the event string parameter#

In this section we will cover event string in more detail. The event string directly relates to events as they are reported in fidl files. Each of the event types listed in the fidl file can be modeled using the Boynton HRF, SPM double gaussian HRF, block response, unassumed modeling, or a combination of these.

Assumed HRF modeling#

Assumed modeling is specified by fidl event code, type of hemodynamic response function (HRF), and optionally length of event in the format: <fidl code>:<hrf function>[-run|-uni][:length]. Here are a few examples:

  • target:boynton ... models all target events using the Boynton HRF with duration as specified in the fidl file

  • response:spm:2 ... models all response events using SPM double gaussian HRF with assumed event duration of 2 second (irrespective of what is specified in fidl file)

For each specified event in the string a single regressor will be created by creating a time vector of zeros, placing 1 for all the timepoints of the duration of the event and convolving it with the specified HRF function.

The optional -run or -uni extension of the hrf command specifies, how the resulting regressor time series should be normalized. In case of <hrf function>-run, e.g. boynton-run or SPM-run, the resulting regressor is normalized to an amplitude of 1 within each bold run separately. This can result in different scaling of regressors with different durations, and of the same regressor across different runs. Scaling, in this case, is performed after the signal is weighted, so in effect, the scaling of weights (e.g. behavioral regressors), can differ across bold runs. In the case of <hrf function>-uni, e.g. 'boynton-uni' or 'SPM-uni', the HRF function will be normalized to have the area under the curve equal to 1. This ensures uniform and universal, scaling of the resulting regressor across all event lengths. In addition, the scaling is not impacted by weights (e.g. behavioral coregressors), which in turn ensures that the weights are not scaled. The flag can be abbreviated to '-r' and '-u'. If not specified, '-uni' will be assumed. The default has changed from -run in QuNex version 0.93.4.

The optional length parameter, if set, would override the event duration information provided in the fidl file.

Block modeling#

Blocked modeling is specified by fidl event code followed by key word block and two numbers that specify block onset and offset as duration in frames from the event onset and offset as specified in the fidl file. For example:

  • movement:block:1:2 ... would create a block regressor with onset 1 frame after the start of the movement event as recorded in fidl file and offset 2 frames after the end of the event according to the information in the fidl file

Unassumed HRF modeling#

In comparison to assumed and blocked modeling in which a single regressor is created for the specific event type of interest, unassumed HRF modeling enables one to estimate the response time course across the specified number of frames. Specifically, for each event type, N regressors are created (N equals the number of frames over which to model the event). All the regressors are set initially to zero, then the first regressor is set to 1 for all first frames of the specified event of interest, the second regressor is set to 1 for all the second frames of the specified event of interest, and so on. For example, let's assume that we would like to model event T with an unassumed regressor of length 5. In the event string, we would specify that using the format:

<fidl event code>[:u]:<number of frames to model> as T:u:5 or just T:5.

We can represent the resulting regressors in the following timeline, with Event coding the onsets of events A, B and T, and lines T1 to T5 coding the actual values of the regressors for each of the frames:

Event:  A . . . T . . T . . . . B . . A . . T . . . B . . T . . . . 
   T1:  0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
   T2:  0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0
   T3:  0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 
   T4:  0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0
   T5:  0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1

Naming of regressors#

Regressors by default take the name from the fidl event code to which they relate; this name can, however, be changed. It is especially prudent to name regressors if a regressor relates to more than one fidl code or if a behavioral (co)regressor is to be created. In the first case, let's assume that you would like to create a single Boynton assumed regressor for both events A and B. This can be specified as: A,B:boynton>AB. To parse that string, A,B specifies that events A and B are to be treated a single event type boynton specifies to model them using Boynton HRF, > denotes further information, of which the first, in this case AB is the name of the created regressor.

Behavioral regressors#

Sometimes one needs to create a behavioral regressor, a regressor whose amplitude would depend on a specific behavioral variable, e.g. accuracy or reaction time. The values for behavioral regressors first need to be encoded in the .fidl file in the form of additional columns for each specified event. For instance:

2.0 congruent incongruent
6.0   0  1  342 
10.0  1  1  425
14.0  1  1  462
20.0  0  1  367
22.0  1  1  418
26.0  0  1  332
...

This .fidl file specifies in the header a BOLD event naming with TR of 2 seconds in which congruent and incongruent trials were shown. The rest of the lines contain in the first column the time of the trial onset, in the second column the code of the event (0 referring to congruent, 1 to incongruent), in the third column the event duration is specified, and finally, the fourth column denotes the reaction time of the response.

If we would like to investigate, the activity of which brain region correlates with variability in trial to trial task performance, we can generate the following two behavioral regressors:

  • congruent:9>congruent_rt:1:within:z

  • incongruent:9>incongruent_rt:1:within:z

The first specification would create 9 regressors to model via an unassumed HRF method for congruent trials. The set of regressors would be named congruent_rt and their value would be based on the additional (4th) column in the fidl file (the one with the reaction times in the example above). To create the regressors, the reaction times would be standardized to z-scores within the congruent events and all the values 1 across the 9 unassumed regressors that pertain to a specific event would be multiplied by the z-score value for that event. The second set of unassumed regressors would be created in the same way for the incongruent trials.

Do note, that since the behavioral regressor(s) would be de-meaned (i.e. mean removed) within each event type, the estimates for other regressors would not change by adding these behavioral regressors. The latter would only help establish activation of which brain regions correlate with reaction times, in this case for each time frame separately, as we are using unassumed HRF modeling.

A full formal specification for a regressor is:

<fidl code>:<hrf>:<length>[><name>[:<column>[:<normalization span>[:<normalization method>]]]]

For a behavioral regressor normalization span can be set to:

  • within ... the behavioral values are only normalized within the specified event type

  • across ... the behavioral values are normalized across all events.

Normalization method can be set to:

  • z ... compute a Z-score

  • 01 ... normalize to fixed range 0 to 1

  • -11 ... normalize to fixed range -1 to 1

Full event string#

All the regressors that are to be specified are then joined in a single string and separated using the pipe | character. For example:

event_string="congruent:boynton|incongruent:boynton|congruent:boynton>congruent_rt:1:within:z|incongruent:boynton>incongruent_rt:1:within:z"

The string specifies a model with 2 assumed regressors modeling response to congruent and incongruent trials using Boynton HRF, and 2 additional behavioral regressors that are based on z-scored values from the first extra column in the fidl file.

Second level (group) analysis#

Whole-brain second-level statistical analysis#

The second step in task-evoked BOLD analysis is a group- or second-level statistical analysis. In this step activation estimates obtained in the previous step are subjected to statistical testing to address questions such as, which regions were activated and deactivated during task performance, what were differences in (de)activation between two conditions or two groups, and others.

Whole-brain second-level analysis is broadly organized into three steps:

  1. The relevant effect estimates are extracted from the individual GLM files and combined into group files.

  2. The desired statistics are computed. QuNex has adopted the excellent FSL PALM package that uses permutation resampling methods of complex uni- and multivariate GLM models for statistical inference. QuNex provides convenience functions that simplify generation of input files, and running analyses.

  3. The obtained results can be thresholded, masked, and combined into final results.

As steps two and three are shared with other types of whole-brain analyses, please refer to Group-level thresholding and mapping of results for more information. Again, more detailed information can be obtained by consulting each respective command help documentation and other related user guide documents.

Creating list files and extracting GLM estimates#

Before running the second-level analysis in PALM, we need to extract GLM estimates of the effects of interests, organize and save them in the desired manner. This can be achieved with the qunex command general_extract_glm_volumes.

The command uses the .list file as an input. A .list file should contain information on the session id for each session included in the analysis, followed by the directory to the file with GLM estimates for that session. Detailed description of list files is provided in the specification of list files.

Examples#

Example of a .list file:

session id: OP109
    glm: /Volumes/tigr/MBLab/fMRI/sWM/sessions/OP109/images/functional/bold_Atlas_conc_sWM_A_s_res-e_A_Bcoeff.dtseries.nii
session id: OP110
    glm: /Volumes/tigr/MBLab/fMRI/sWM/sessions/OP110/images/functional/bold_Atlas_conc_sWM_A_s_res-e_A_Bcoeff.dtseries.nii
session id: OP136
    glm: /Volumes/tigr/MBLab/fMRI/sWM/sessions/OP136/images/functional/bold_Atlas_conc_sWM_A_s_res-e_A_Bcoeff.dtseries.nii
session id: OP137
    glm: /Volumes/tigr/MBLab/fMRI/sWM/sessions/OP137/images/functional/bold_Atlas_conc_sWM_A_s_res-e_A_Bcoeff.dtseries.nii
session id: OP140
    glm: /Volumes/tigr/MBLab/fMRI/sWM/sessions/OP140/images/functional/bold_Atlas_conc_sWM_A_s_res-e_A_Bcoeff.dtseries.nii

This is a .list file for a group of five sessions. The file gives information on each session's id and the directory to the file with GLM estimates after a glm: notation.

To extract the estimates of interest use:

  • general_extract_glm_volumes

    This command, which can be run either from MATLAB or the command line, extracts GLM estimates of the effects of interests, organizes and saves them in the desired manner.


Example use of general_extract_glm_volumes:

qunex general_extract_glm_volumes \
    --flist=sWM_A_n25.list \
    --outf=sWM_A_n25 \
    --effects="encoding, delay, response" \
    --saveoption="by subject" \
    --verbose=true

Running PALM#

QuNex uses a wrapper command run_palm to run second-level statistical analysis in PALM. PALM allows various types of statistical tests, which are defined by different input parameters and should be carefully checked in the PALM User Guide.

PALM requires several inputs:

  1. Statistical testing is performed on the file with extracted group GLM estimates prepared before.

  2. Separate specification files have to be created with information on the design matrix, t-contrasts, f-contrasts and exchangeability blocks respectively. These specification files can be easily created with the create_ws_palm_design convenience command:

  • create_ws_palm_design

    This command prepares a design file, t-contrasts, f-contrasts and exchangeability block files for a single group within-session PALM designs. It supports full factorial designs with up to four factors with arbitrary number of levels each. The command assumes the data to be organized by session and the first specified factor to be the slowest varying one. It is recommended that the output files are carefully examined to ensure that information matches the data and that the design is appropriate for the type of second-level statistical analysis that you want to perform.

More details on design matrices, t-contrasts, f-contrasts and exchangeability blocks that are appropriate for specific statistical analyses (e.g. independent t-test, paired t-test, one-way ANOVA etc.) can be found on FSL wiki.

PALM examples#

Example use of create_ws_palm_design:

qunex create_ws_palm_design \
    --factors="2,3" \
    --nsubjects=25 \
    --root=sWM_A_n25

The example above would create four .csv files, separately for the design matrix, t-contrasts, f-contrasts and exchangeability blocks, named according to the parameter root. The example uses two factors to prepare specification files, the first with two levels and the second with three levels. The design matrix would then consist of numbers (usually ones and zeros) that code for different main effects of factors and their interactions. It would also contain coding for subjects as random effects, in the case above 25 subjects would be added in the design matrix. t-contrasts and f-contrasts would be prepared in the order: main effect of the first factor, main effect of the second factor, interaction between factors, subject effects. Exchangeability blocks file in the case above would contain one column with sequential numbers (starting from 1), where each number would code for one specific subject.

Once all specification files are prepared, second-level statistical testings can be run in PALM with the following command:

This command runs second level analysis using PALM permutation resampling. The command can be run on grayordinate, CIFTI images or volume, NIFTI images. The output of this command is a number of different image files that contain results for all t- or f-contrasts provided by the --design parameter and different methods for multiple testing correction defined by the palm_args parameter.


Example use of run_palm:

qunex run_palm \
    --image=sWM_A_n25.dtseries.nii \
    --design="name:sWM_A_n25|f:none" \
    --palm_args="n:500|fdr|T|T2DHEC:2:0.5:26|C:3.1|accel:tail" \
    --root=PALM/sWM_A_n25

Masking and joining maps#

Finally, output results of PALM can be thresholded or masked with the mask_map command and combined with the join_maps command. mask_map and join_maps are both convenience wrapper commands that utilize wb_command to modify CIFTI files.

  • mask_map

    This command enables easy masking of CIFTI images (e.g. ztstat image from PALM), using the provided list of mask files (e.g. p-values images from PALM) and thresholds. More than one mask can be used in which case they can be combined using a logical OR or AND operator.

  • join_maps

    This command concatenates the listed CIFTI images and names the individual volumes, if names are provided.

mask_map example#
qunex mask_map \
    --image="PALM/sWM_A_n25_reg_ztstat_C2.dscalar.nii" \
    --masks="PALM/sWM_A_n25_clustere_ztstat_fwep_C1.dscalar.nii, \
           PALM/sWM_A_n25_clustere_ztstat_fwep_C2.dscalar.nii" \
    --output="PALM/sWM_A_n25_reg_ztstat_cefwep_C1C2_0.017.dscalar.nii" \
    --maxv="0.017"
join_maps example#
qunex join_maps \
    --images="PALM/sWM_encoding_A_n25_reg_ztstat_cefwep_C1C2_0.017.dscalar.nii, \
            PALM/sWM_delay_A_n25_reg_ztstat_cefwep_C1C2_0.017.dscalar.nii, \
            PALM/sWM_response_A_n25_reg_ztstat_cefwep_C1C2_0.017.dscalar.nii" \
    --names="Encoding [CE FWE p0.017], Delay [CE FWE p0.017], Response [CE FWE p0.017]" \
    --output="Maps/sWM_n25_Z_CEFWE_p0.017.dscalar.nii"

Region of interests (ROI) second-level analysis#

Either as a way to provide more detailed ROI level results, or to conduct analyses on independently or a priori defined ROI instead of whole-brain analyses, QuNex provides a set of commands that enable identification of ROI from activation images, and extracting relevant GLM values for specified ROIs. Further analysis can then be done using other statistical and visualization tools, such as R or others suitable for your specific research focus.

Currently the following commands are provided:

  • general_find_peaks

    This command, which can be run from MATLAB or the command line, can be used to identify peaks in the provided input image and define ROIs using a watershed algorithm, which grows regions from peaks. The command can perform initial spatial smoothing and masking of input images, and allows specification of minimum and maximum ROI size. ROIs smaller than specified minimum size are either joined with neighbouring ROI or rejected when isolated. ROIs larger than specified maximum size are reflooded until maximum size is reached.

  • general_extract_roi_glm_values

    This command, run either from MATLAB or the command line, is used to extract mean activation (beta) estimates of specified effects for the specified ROIs from volume or cifti GLM files. The resulting mean beta estimates or mean percent signal change values can be saved in simple text files in either long or wide format that can be easily imported for further processing, analysis and visualization.