Quality Assurance in QuNex#

Quality Assurance is an important but highly tedious step in running QuNex preprocessing. The run_qa command helps ease this process, and supports the following:

  • Raw Data QA (--datatype=raw_data)

  • Config File QA (--datatype=config)

Quality Assurance (QA) is not to be confused with Quality Control (QC) and its command, run_qc. In short, QA concerns whether processing runs efficiently and to completion, whereas QC concerns the quality of the processing results.

Using the run_qa command#

qunex run_qa \
    --datatype=<Type of QA> \
    --sessionsfolder=<QuNex sessions folder> \
    --sessions=<Sessions to QA> \
    --configfile=<QA config file> \
    --tag=<Output identifier> \
    --overwrite=<Overwrite, yes or no>

This command runs QA on all specified sessions according to a highly customizable, user-created configuration YAML file. Usually, this entails checking that specified files exist and that parameters have their expected values.

Once complete, run_qa will output lists of sessions that have passed and failed the declared QA, as well as reports, both human and machine readable, that detail why and how these sessions failed.

The QA performed depends heavily on two flags: configfile and datatype. The configfile flag should point to a configuration file. The datatype flag must be a string naming the type of data on which you want to run QA. This page focuses primarily on these two flags.

For more precise info on other flags and actually running the command, see the command's page.
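As a concrete illustration of the template above, the following sketch fills in the flags with placeholder values. Every path, the list file name, and the tag are hypothetical and must be replaced with your own. The script only assembles and prints the command so it can be reviewed before running.

```shell
# All paths and names below are placeholders, not QuNex defaults.
sessions_folder="/data/my_study/sessions"
config_file="/data/my_study/qa_config.yaml"
sessions_list="/data/my_study/processing/lists/my_sessions.list"

# Assemble the call as positional parameters so each flag stays one argument.
set -- qunex run_qa \
    --datatype=raw_data \
    --sessionsfolder="$sessions_folder" \
    --sessions="$sessions_list" \
    --configfile="$config_file" \
    --tag=example \
    --overwrite=no

# Dry run: print the assembled command. To execute it, replace echo with: "$@"
echo "$@"
```

Keeping the values in variables makes it easier to reuse the same settings across repeated QA runs.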

Inputs#

Command inputs vary significantly depending on the QA specified, but at the bare minimum the command requires a configuration YAML file and a folder for each session, found within the --sessionsfolder directory.

Outputs#

The run_qa command generates four files: two lists containing the sessions that have passed and failed QA, and two reports (one human-readable and one machine-readable, both containing the same information). The lists are generated inside processing/lists and the reports inside processing/reports.

processing/lists/QA_pass_{datatype}{tag/config}.list#

The first output is a file containing all sessions that have passed the specified QA. It is formatted as a QuNex .list file. This means each line corresponds to an individual session, with the format:

session id: {session_1}
session id: {session_2}
...
session id: {session_N}

As a result, this file can be passed directly to QuNex commands through the --sessions parameter:

--sessions="QA_pass_raw_data.list"

The goal is that users can use this list to continue processing without including problematic sessions or those that would require different processing. These list files also have other functionality; see more info on list files here.
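Because each line of the list has the form "session id: {session}", the session ids can also be extracted with standard shell tools, for example to count or log them. The following is a minimal sketch that assumes the file contains only such lines. The session names and the temporary study folder are made up for illustration, and the list is written by the script itself as a stand-in for real run_qa output.

```shell
# Stand-in study folder; in practice this is your existing study directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/processing/lists"
list_file="$workdir/processing/lists/QA_pass_raw_data.list"

# Hypothetical contents; run_qa would normally write this file.
printf 'session id: %s\n' OP101 OP102 OP103 > "$list_file"

# Keep only "session id:" lines and strip the prefix and following whitespace.
passed_sessions=$(sed -n 's/^session id:[[:space:]]*//p' "$list_file")

# Count the non-empty lines, i.e. the number of sessions that passed.
passed_count=$(printf '%s\n' "$passed_sessions" | grep -c .)

echo "$passed_count sessions passed QA:"
echo "$passed_sessions"

# Remove the stand-in folder again.
rm -rf "$workdir"
```

The same approach applies to the fail list, since both files share the .list format.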

processing/lists/QA_fail_{datatype}{tag/config}.list#

This output file has the same .list format as above, but instead contains all sessions that have failed QA. Similarly, this file can be input directly into QuNex commands with the --sessions parameter, even into run_qa itself if you wish to investigate the data further.

processing/reports/QA_report{datatype}{tag/config}.txt#

This file will contain a human-readable report of the QA outcomes, particularly for sessions that have failed QA. What exactly is contained within the report is highly dependent on the QA run, but will typically explain why sessions failed QA and what precisely went wrong.

processing/reports/QA_report{datatype}{tag/config}.yml#

This output file has all the same information as the above, but in a machine-friendly .yml format. It also contains internal variables that may be useful to those building pipelines on the outputs of run_qa.

The Configuration file#

Because the QA needed varies greatly between datasets, run_qa is designed to be highly user-customizable, controlled through a user-created configuration YAML file. If you're unfamiliar with YAML format, see the documentation here.

In this file, you can define nested parameter-value pairs and sequences pertaining to your data. In other words, it lets you tell run_qa which things to check in your data and what values you expect them to have.

The contents will be quite different depending on the QA type you're running and your data, but it should follow this basic format:

datatypes:
    <Specified Data-type 1>:
        <param>: <value>
        <param>:
            <sub-param>: <value>

    <Specified Data-type 2>:
        - <sequence param>:
            <sub-param>: <value>
            <sub-param>:
                <sub-sub-param>: <value>

config:
    <Additional config options>

Parameters and sub-parameters must be nested within the scope of their corresponding data type or parent parameter. Depending on the data type, they are specified either directly as key-value pairs or as YAML sequences, whose items start with -. See below for data-type-specific parameters and config creation.

Data Types#

Only the below data types are currently supported.

Raw Data QA (--datatype=raw_data)#

Raw Data QA checks whether the scans found are in line with the scan protocol, which the user defines in the supplied config. Run after import_<datatype>, it performs various checks to ensure the data is valid before processing. The main goal is to identify problematic sessions before you start processing, saving time and resources. It should also spare users from manually identifying missing or misordered scans.

To specify Raw Data QA in your config, it must be added underneath datatypes as raw_data:

datatypes:
    raw_data:

Scans#

For each scan/image the user wishes to QA, they must add a corresponding scan-config in their configuration file with the tag - scan. Each scan-config must have the series_description parameter, which is used to identify which image (as labeled in the session.txt file) you are attempting to QA.

datatypes:
    raw_data:
        - scan:
            series_description: T1w

        - scan:
            series_description: BOLD1

        - scan:
            series_description: BOLD2

Note: the series_description field also accepts wildcards (*) and multiple acceptable scans separated by |.

        - scan:
            series_description: T1w run-1|T1w run-2

        - scan:
            series_description: BOLD1*

This is all you need to run a basic QA: run_qa will simply check each session has scans that explicitly match the series_description specified. One possible use-case for a config like this is in mapping file verification for create_session_info.
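To build intuition for the wildcard and | syntax, the sketch below approximates it with shell glob patterns. This is only an illustration under assumptions: it treats | as separating alternatives and * as matching any run of characters. The exact rules are defined by run_qa itself, and matches_description is a hypothetical helper, not part of QuNex.

```shell
# Illustrative only: approximates series_description matching with shell globs.
matches_description() {
    pattern_list=$1   # e.g. "T1w run-1|T1w run-2" or "BOLD1*"
    description=$2    # series description as it might appear in session.txt

    old_ifs=$IFS
    IFS='|'           # split the pattern list on | only, keeping spaces intact
    set -f            # disable filename globbing while splitting
    for alternative in $pattern_list; do
        # An unquoted $alternative in a case pattern is treated as a glob.
        case $description in
            $alternative) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}

matches_description "T1w run-1|T1w run-2" "T1w run-2" && echo "T1w run-2 matches"
matches_description "BOLD1*" "BOLD1 AP" && echo "BOLD1 AP matches"
matches_description "BOLD1*" "BOLD2" || echo "BOLD2 does not match"
```

Under these assumptions, a pattern without a wildcard must equal the description exactly, which is why "T1w" alone would not match "T1w run-1".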

To do more advanced QA, users can add a combination of parameters and sub-parameters. Aside from series_description, all are optional, though depending on the data not all are practical.

Here are all potential parameters that can be specified at the scan level:

        - scan:
            series_description: --> Scan identifier, looks in session.txt
            required:           --> Whether scan must be present for a session to pass QA
            dicoms:             --> The number of DICOM files before NIfTI conversion (from import_dicom)
            session:            --> Contains sub-parameters related to the session.txt file
                <sub-params>
            json:               --> Contains sub-parameters related to the sidecar .json file
                <sub-params>
            nii:                --> Contains sub-parameters related to the NIfTI file header
                <sub-params>