
First Level Analysis

Mandy Renfro edited this page Apr 9, 2018 · 20 revisions

First Level Analysis (within subject - within run/set)

Unlike the create_ev.ipynb notebook, the lvl1.py script is a plain Python file that is executed using a second Python "submit" script.

Go to /home/data/madlab/scripts/tools/sample_scripts to find the sample scripts wmaze_lvl1.py and wmaze_lvl1_submit.py.

  • Open wmaze_lvl1.py
  • Make sure you see #!/usr/bin/env python as the first line; if it's missing, add it!
  • The next part is the docstring -- it describes the current analysis. This is especially important for future you and other researchers in the lab: a description of what you're doing. It is also the perfect place to write special notes so you don't have to reinvent the wheel 37 times.
"""
=============================================================================
wmaze_fMRI: Mandy Thesis -- Fixed before Conditional -- Model 3 Version 1.0.0
=============================================================================
First level workflow for UM GE 750 wmaze task data
- WMAZE Model 3 Version 1.0.0
 1 - Normal general linear model (not LSS)
 0 - Does not account for the last three volumes (scanner artifacts)
 0 - No method to control or account for the last three volumes/trials
     - EV Directory (Model3)--- /home/data/madlab/data/mri/wmaze/scanner_behav/WMAZE_001/MRthesis/model3
- python wmaze_lvl1.py -s WMAZE_001
                       -o /home/data/madlab/data/mri/wmaze/frstlvl/wmaze/model3_1-0-0/lvl1
                       -w /home/data/madlab/scripts/wmaze/model3/model3_1-0-0/status/lvl1
**Note: the DOF file is written out in numpy hexadecimal format
    Example: 0x1.64p+7 
    print((1 + 6./0x10 + 4./0x100) * 2**7) = 178
"""
  • Import all of your libraries and tools (most of which come from Nipype)
import os
from nipype.pipeline.engine import Workflow, Node, MapNode
from nipype.interfaces.utility import IdentityInterface
from nipype.interfaces.utility import Function
from nipype.utils.misc import getsource
from nipype.interfaces.io import DataGrabber
from nipype.algorithms.modelgen import SpecifyModel
from nipype.interfaces.fsl.model import Level1Design
from nipype.interfaces.fsl.model import FEATModel
from nipype.interfaces.fsl.model import FILMGLS
from nipype.interfaces.fsl.model import ContrastMgr
from nipype.interfaces.fsl.utils import ImageMaths
from nipype.interfaces.io import DataSink
from nipype.interfaces.utility import Merge
  • First you will need to create the subjectinfo() function. This function creates a "Bunch" for each run/set, for each subject: it opens the EV files and organizes the onsets, durations, and amplitudes for each of the EV types, which are then fed into the Nipype pipeline.
  • Below you will see a set of conditional statements keyed to the size of each EV file: if a file does not contain at least 3 values (one trial's worth of onset, duration, and amplitude), the script will not try to create an empty condition (which would make the pipeline crash).
# Grab the first element of an array/list
pop_lambda = lambda x : x[0]

def subjectinfo(subject_id):
    import os
    from nipype.interfaces.base import Bunch
    from copy import deepcopy
    import numpy as np
    base_proj_dir = '/home/data/madlab/data/mri/wmaze/scanner_behav'
    # Empty list to hold the Bunch for each run (runs 1-6)
    output = []

    # For the current run, of which there are 6
    for curr_run in range(1,7):
        names = []
        onsets = []
        durations = []
        amplitudes = []

        # All EVs for *all before correct B* responses in the current run
        data_before_B_corr = np.genfromtxt((base_proj_dir +
                             '/{0}/MRthesis/model3/EVs/'
                             'r{1}_before_B_corr.txt').format(subject_id,
                                                              curr_run),
                             dtype = str)
        # All EVs for all before incorrect B responses in the current run
        data_before_B_incorr = np.genfromtxt((base_proj_dir +
                               '/{0}/MRthesis/model3/EVs/'
                               'r{1}_before_B_incorr.txt').format(subject_id,
                                                                  curr_run),
                               dtype = str)
        # All remaining responses in the current run
        data_all_remain = np.genfromtxt((base_proj_dir +
                          '/{0}/MRthesis/model3/EVs/'
                          'r{1}_all_remain.txt').format(subject_id,
                                                        curr_run),
                          dtype = str)
        # All non-response trials in the current run
        data_nonresponse = np.genfromtxt((base_proj_dir +
                           '/{0}/MRthesis/model3/EVs/'
                           'r{1}_nonresponse.txt').format(subject_id,
                                                          curr_run),
                           dtype = str)
        # Condition prefix; eval() below expands it to data_before_B_corr etc.
        sequence = ['before_B']
        for curr_type in sequence:
            corr_array_name = eval('data_{0}_corr'.format(curr_type))
            incorr_array_name = eval('data_{0}_incorr'.format(curr_type))
            if incorr_array_name.size > 0: #AT LEAST ONE ERROR WAS MADE
                # Array to contain name of trial:  _corr 
                curr_names = ['{0}_corr'.format(curr_type), 
                              '{0}_incorr'.format(curr_type)]
                curr_corr_onsets = map(float, corr_array_name[:,0])
                curr_corr_durations = map(float, corr_array_name[:,1])
                curr_corr_amplitudes = map(float, corr_array_name[:,2])

                # If there is only one incorrect response in the current run
                if incorr_array_name.size == 3: #ONLY ONE ERROR 
                    curr_incorr_onsets = [float(incorr_array_name[0])]
                    curr_incorr_durations = [float(incorr_array_name[1])]
                    curr_incorr_amplitudes = [float(incorr_array_name[2])]
                else: #MORE THAN ONE ERROR
                    curr_incorr_onsets = map(float, incorr_array_name[:,0])
                    curr_incorr_durations = map(float, incorr_array_name[:,1])
                    curr_incorr_amplitudes = map(float,incorr_array_name[:,2])

                # Variables to contain all onset times, durations, 
                # and amplitudes for the current run  
                curr_onsets = [curr_corr_onsets, curr_incorr_onsets]
                curr_durations = [curr_corr_durations, curr_incorr_durations]
                curr_amplitudes = [curr_corr_amplitudes, curr_incorr_amplitudes]

            else: #NO MISTAKES WERE MADE
                curr_names = ['{0}_corr'.format(curr_type)]
                curr_corr_onsets = map(float, corr_array_name[:,0])
                curr_corr_durations = map(float, corr_array_name[:,1])
                curr_corr_amplitudes = map(float, corr_array_name[:,2])
                # Variables containing all onsets, durations, 
                # and amplitudes for current stim type in the current run
                curr_onsets = [curr_corr_onsets]
                curr_durations = [curr_corr_durations]
                curr_amplitudes = [curr_corr_amplitudes]
            
            # Append the current sequence/run name to names 
            names.append(curr_names) 
            onsets.append(curr_onsets)
            durations.append(curr_durations)
            amplitudes.append(curr_amplitudes)

        ## ALL REMAINING TRIALS ## 
        curr_names = ['all_remaining']
        curr_corr_onsets = map(float, data_all_remain[:,0])
        curr_corr_durations = map(float, data_all_remain[:,1])
        curr_corr_amplitudes = map(float, data_all_remain[:,2])

        curr_onsets = [curr_corr_onsets]
        curr_durations = [curr_corr_durations]
        curr_amplitudes = [curr_corr_amplitudes] 
        
        # Append the current sequence/run name to names 
        names.append(curr_names)  
        onsets.append(curr_onsets)
        durations.append(curr_durations)
        amplitudes.append(curr_amplitudes) 

        # If any element in names is a list instead of a single value
        if any(isinstance(el, list) for el in names):
            # Unpacks subarrays into one mega array!
            names = [el for sublist in names for el in sublist]  
        if any(isinstance(el, list) for el in onsets):
            onsets = [el_o for sublist_o in onsets for el_o in sublist_o]
        if any(isinstance(el, list) for el in durations):
            durations = [el_d for sublist_d in durations for el_d in sublist_d]
        if any(isinstance(el, list) for el in amplitudes):
            amplitudes = [el_a for sublist_a in amplitudes for el_a in sublist_a]

        # Add the Bunch for the current run to the end of the output list
        output.insert(curr_run,
                      Bunch(conditions = names,
                            onsets = deepcopy(onsets),
                            durations = deepcopy(durations),
                            amplitudes = deepcopy(amplitudes),
                            tmod = None,
                            pmod = None,
                            regressor_names = None,
                            regressors = None))
    return output
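The flattening step at the end of the loop can be seen in isolation with a minimal, self-contained sketch (toy condition names, not real EV data): nested per-condition lists are unpacked into one flat list so that conditions, onsets, durations, and amplitudes line up one-to-one.

```python
# Nested per-condition sublists collected during the loop...
names = [['before_B_corr', 'before_B_incorr'], ['all_remaining']]

# ...are unpacked into one flat list before building the Bunch
if any(isinstance(el, list) for el in names):
    names = [el for sublist in names for el in sublist]

print(names)  # ['before_B_corr', 'before_B_incorr', 'all_remaining']
```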
  • Another necessary function is get_contrasts() -- this will be used to set the precise contrasts you desire in the statistical analysis.
  • The example below shows one way to flexibly create contrasts when you cannot control the number of trials that fall into each condition (such as correct vs. incorrect).
# Function to obtain and create contrasts *flexibly* 
def get_contrasts(subject_id, info):
    contrasts = []
    for i, j in enumerate(info):
        curr_run_contrasts = [] 
        cont_all = ['AllVsBase', 'T', j.conditions, 
                    [1. / len(j.conditions)] * len(j.conditions)]
        curr_run_contrasts.append(cont_all)
        for curr_cond in j.conditions:
            curr_cont = [curr_cond, 'T', [curr_cond], [1]]
            curr_run_contrasts.append(curr_cont)   
        if 'before_B_corr' in j.conditions and 'before_B_incorr' in j.conditions:
            cont_corr_vs_incorr = ['corr_minus_incorr', 'T',       
                                   ['before_B_corr','before_B_incorr'], 
                                   [1, -1]]
            cont_incorr_vs_corr = ['incorr_minus_corr', 'T', 
                                   ['before_B_corr','before_B_incorr'], 
                                   [-1, 1]]
            curr_run_contrasts.append(cont_corr_vs_incorr)
            curr_run_contrasts.append(cont_incorr_vs_corr) 
        contrasts.append(curr_run_contrasts)
    return contrasts
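Each contrast handed to FSL's Level1Design is a 4-element list: [name, stat type, condition names, weights]. Here is a minimal sketch of the three kinds of contrasts this function builds (toy condition list, not tied to real data):

```python
conditions = ['before_B_corr', 'before_B_incorr', 'all_remaining']

# Baseline contrast: equal positive weight on every condition
cont_all = ['AllVsBase', 'T', conditions,
            [1. / len(conditions)] * len(conditions)]

# Simple effect: a single condition vs. the implicit baseline
cont_corr = ['before_B_corr', 'T', ['before_B_corr'], [1]]

# Differential contrast: correct minus incorrect (weights sum to 0)
cont_diff = ['corr_minus_incorr', 'T',
             ['before_B_corr', 'before_B_incorr'], [1, -1]]

print(sum(cont_diff[3]))  # 0
```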
  • The get_subs() function is used to customize the way the output files will be named.
def get_subs(cons):
    subs = []
    # For each run
    for run_cons in cons:
        run_subs = []
        # For each contrast in the run
        for i, con in enumerate(run_cons):
            # Append a tuple containing "cope#" and "cope#+name" 
            run_subs.append(('cope%d.'%(i + 1), 
                             'cope%02d_%s.'%(i + 1, con[0])))
            run_subs.append(('varcope%d.'%(i + 1), 
                             'varcope%02d_%s.'%(i + 1, con[0])))
            run_subs.append(('zstat%d.'%(i + 1), 
                             'zstat%02d_%s.'%(i + 1, con[0])))
            run_subs.append(('tstat%d.'%(i + 1), 
                             'tstat%02d_%s.'%(i + 1, con[0])))
        subs.append(run_subs)        
    return subs
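The substitutions are (old, new) string pairs that the DataSink applies to output filenames. A small self-contained sketch of the renaming for one run with two toy contrasts:

```python
# Each entry is [contrast name, stat type, ...]; only the name is used here
run_cons = [['AllVsBase', 'T'], ['corr_minus_incorr', 'T']]
run_subs = []
for i, con in enumerate(run_cons):
    # Rename e.g. "cope1." to "cope01_AllVsBase." in the sunk files
    run_subs.append(('cope%d.' % (i + 1),
                     'cope%02d_%s.' % (i + 1, con[0])))

print(run_subs[0])  # ('cope1.', 'cope01_AllVsBase.')
print(run_subs[1])  # ('cope2.', 'cope02_corr_minus_incorr.')
```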
# Function for extracting the motion parameters from the noise files
def motion_noise(subjinfo, files):
    import numpy as np
    motion_noise_params = []
    motion_noi_par_names = []
    # If files is not already a list, make it one
    if not isinstance(files, list):
        files = [files]
    # If subjinfo is not already a list, make it one
    if not isinstance(subjinfo, list):
        subjinfo = [subjinfo]
    # For each instance of files
    for j,i in enumerate(files):
        curr_mot_noi_par_names = ['Pitch (rad)', 'Roll (rad)', 'Yaw (rad)', 
                                  'Tx (mm)', 'Ty (mm)', 'Tz (mm)',
                                  'Pitch_1d', 'Roll_1d', 'Yaw_1d', 
                                  'Tx_1d', 'Ty_1d', 'Tz_1d',
                                  'Norm (mm)', 'LG_1stOrd', 'LG_2ndOrd', 
                                  'LG_3rdOrd', 'LG_4thOrd']
        # Load the current motion noise file into a numpy array
        a = np.genfromtxt(i)
        motion_noise_params.append([[]] * a.shape[1])
        # If there are more than 17 motion noise parameters
        if a.shape[1] > 17:
            # For those additional noise parameters over 17
            for num_out in range(a.shape[1] - 17):
                # Give it a name
                out_name = 'out_{0}'.format(num_out + 1)
                curr_mot_noi_par_names.append(out_name)
        # For each column of a
        for z in range(a.shape[1]):
            motion_noise_params[j][z] = a[:, z].tolist()
        motion_noi_par_names.append(curr_mot_noi_par_names)
    # For each instance of subjinfo
    for j,i in enumerate(subjinfo):
        # If there are no regressor names yet
        if i.regressor_names is None: 
            i.regressor_names = []
        # If there are no regressors yet
        if i.regressors is None: 
            i.regressors = []
        # For each instance of motion_noise_params 
        # in the current iteration of subjinfo
        for j3, i3 in enumerate(motion_noise_params[j]):
            i.regressor_names.append(motion_noi_par_names[j][j3])
            i.regressors.append(i3)            
    return subjinfo
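To sanity-check the column handling, here is a small self-contained sketch (random toy data standing in for a real noise file): a run with 19 columns gets the 17 standard parameter names plus generic out_1 and out_2 names, and one regressor (the values across volumes) per column.

```python
import numpy as np

# Toy stand-in for np.genfromtxt on a noise file: 5 volumes, 19 columns
a = np.random.rand(5, 19)

names = ['Pitch (rad)', 'Roll (rad)', 'Yaw (rad)',
         'Tx (mm)', 'Ty (mm)', 'Tz (mm)',
         'Pitch_1d', 'Roll_1d', 'Yaw_1d',
         'Tx_1d', 'Ty_1d', 'Tz_1d',
         'Norm (mm)', 'LG_1stOrd', 'LG_2ndOrd',
         'LG_3rdOrd', 'LG_4thOrd']
# Extra columns beyond the 17 expected get generic "out_N" names
for num_out in range(a.shape[1] - len(names)):
    names.append('out_{0}'.format(num_out + 1))

# One regressor (list of values across volumes) per column
regressors = [a[:, z].tolist() for z in range(a.shape[1])]

print(len(names), len(regressors))  # 19 19
```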