Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added homework py files into branch anatomical-solution #8

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
88 changes: 88 additions & 0 deletions day2_exercises/fMRI_homework_20151107.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 7 09:02:19 2015

@author: Elissaios
"""

#%% Load nii image (after it was unzipped)
import nibabel as nib

folder = r"c:\Anaconda3\Lib\site-packages\nibabel\tests\data";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try very hard to avoid hard-coding paths like this. I guess you are debugging, to use the nibabel test image instead of the openfmri image?

#filePath = path + 'ds107_sub001_highres.nii';
filePath = folder + r"\example_nifti2.nii";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider os.path.join to join file paths. This makes sure your script will work on Windows as well as Unix.


img = nib.load(filePath);
data = img.get_data();

#%% Anatomical image
import numpy as np
#import math;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you were trying to make the pull-request mechanism work, so there was no reason to polish the script up - but for a real PR, you would want to take out unused commented statements and debugging stuff to make it more clear to the person reading, what you were trying to do.

import matplotlib.pyplot as plt

folder = r"c:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
filePath = folder +r"\anatomical.txt";
anat = np.loadtxt(filePath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest name this anat_data_1d for clarity.


zdim=32;
anat_slice = int(len(anat)/zdim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name also a bit hard to read, because you mean anat_slice_size rather than the slice from the anatomical data.

for ix in range(120,200+1):
anat_mod = anat_slice % ix;
if anat_mod == 0:
ij = anat_slice/ix;
#if ij.is_integer():
#if np.floor(ij)-ij == 0:
#if int(ij)-ij == 0:
if ij % 1 == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe (correct me if I am wrong) that if anat_mod is 0, then anat_slice / ix must be an integer, because you've shown that ix is a factor of anat_slice.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true... was not thinking


From: Matthew Brett [[email protected]]
Sent: Saturday, November 14, 2015 2:32 PM
To: practical-neuroimaging/fmri-methods-2015
Cc: Karageorgiou, Elissaios
Subject: Re: [fmri-methods-2015] added homework py files into branch anatomical-solution (#8)

In day2_exercises/fMRI_homework_20151107.pyhttps://github.com//pull/8#discussion_r44864724:

+import matplotlib.pyplot as plt
+
+folder = r"c:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
+filePath = folder +r"\anatomical.txt";
+anat = np.loadtxt(filePath);
+
+zdim=32;
+anat_slice = int(len(anat)/zdim)
+for ix in range(120,200+1):

  • anat_mod = anat_slice % ix;
  • if anat_mod == 0:
  •    ij = anat_slice/ix;
    
  •    #if ij.is_integer():
    
  •    #if np.floor(ij)-ij == 0:
    
  •    #if int(ij)-ij == 0:
    
  •    if ij % 1 == 0:
    

I believe (correct me if I am wrong) that if anat_mod is 0, then anat_slice / ix must be an integer, because you've shown that ix is a factor of anat_slice.


Reply to this email directly or view it on GitHubhttps://github.com//pull/8/files#r44864724.

print(ix, int(ij))
anat3d = anat.reshape(ix,int(ij),zdim)
plt.imshow(anat3d[:,:,16],'gray')
#answer 170 by 156


#%% Working with four dimensional images
import numpy as np
import nibabel as nib
import matplotlib.pyplot as plt

folder = r"C:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
filePath = folder + r"\ds114_sub009_t2r1.nii";
img = nib.load(filePath)
data = img.get_data()
volume = data[...,0]
#plt.imshow(volume[...,15])
var_vol = np.var(data,3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer explicit np.var(data, axis=3) - it's easier to see quickly what the command is doing.

plt.figure(1)
plt.imshow(var_vol[...,14],'gray')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might also want interpolation='nearest'.

low_vol = volume[volume<10]
plt.figure(2)
plt.hist(low_vol)
uniq = np.unique(low_vol)

#%% Into four dimensions
import numpy as np
import nibabel as nib
import matplotlib.pyplot as plt
folder = r"C:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
filePath = folder + r"\ds114_sub009_t2r1.nii";
img = nib.load(filePath)
data = img.get_data()



















113 changes: 113 additions & 0 deletions day2_exercises/into4D.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
#%% Into four dimensions
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 7 13:46:24 2015
Exercise for Fall 2015
@author: Elissaios
"""
# Compatibility with Python 3
from __future__ import print_function, division

# Standard imports of libraries
import numpy as np
import nibabel as nib
import matplotlib.pyplot as plt

# Declarations
TR = 2.5
std_vol = []
ave_task = []
ave_rest = []
outlier_task = []
outlier_rest = []

# Input data
folder = r"C:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
filePath = folder + r"\ds114_sub009_t2r1.nii";
condPath = folder + r"\ds114_sub009_t2r1_cond.txt";
img = nib.load(filePath)
data = img.get_data()
#print(img.shape)
task = np.loadtxt(condPath)
task[:,0:2] = task[:,0:2]/TR # change sec to TR for first two columns (2 is for 1+1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might hit a problem here, because task may be an array of integers, in which case this line will get truncated to integers. Better to do something like:

ons_dur = task[:,0:2] / TR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I follow.


From: Matthew Brett [[email protected]]
Sent: Saturday, November 14, 2015 2:41 PM
To: practical-neuroimaging/fmri-methods-2015
Cc: Karageorgiou, Elissaios
Subject: Re: [fmri-methods-2015] added homework py files into branch anatomical-solution (#8)

In day2_exercises/into4D.pyhttps://github.com//pull/8#discussion_r44864818:

+TR = 2.5
+std_vol = []
+ave_task = []
+ave_rest = []
+outlier_task = []
+outlier_rest = []
+
+# Input data
+folder = r"C:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
+filePath = folder + r"\ds114_sub009_t2r1.nii";
+condPath = folder + r"\ds114_sub009_t2r1_cond.txt";
+img = nib.load(filePath)
+data = img.get_data()
+#print(img.shape)
+task = np.loadtxt(condPath)
+task[:,0:2] = task[:,0:2]/TR # change sec to TR for first two columns (2 is for 1+1)

You might hit a problem here, because task may be an array of integers, in which case this line will get truncated to integers. Better to do something like:

ons_dur = task[:,0:2] / TR


Reply to this email directly or view it on GitHubhttps://github.com//pull/8/files#r44864818.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In [12]: task = np.ones((4, 3), dtype=int)

In [13]: task[:, :2] / 2.  # integer divided by float -> float
Out[13]: 
array([[ 0.5,  0.5],
       [ 0.5,  0.5],
       [ 0.5,  0.5],
       [ 0.5,  0.5]])

In [14]: task[:, :2] = task[:, :2] / 2.  # Setting into int array -> int

In [15]: task
Out[15]: 
array([[0, 0, 1],
       [0, 0, 1],
       [0, 0, 1],
       [0, 0, 1]])

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. weird implicit structure in my mind, but I think I understand.


From: Matthew Brett [[email protected]]
Sent: Saturday, November 14, 2015 4:53 PM
To: practical-neuroimaging/fmri-methods-2015
Cc: Karageorgiou, Elissaios
Subject: Re: [fmri-methods-2015] added homework py files into branch anatomical-solution (#8)

In day2_exercises/into4D.pyhttps://github.com//pull/8#discussion_r44866024:

+TR = 2.5
+std_vol = []
+ave_task = []
+ave_rest = []
+outlier_task = []
+outlier_rest = []
+
+# Input data
+folder = r"C:\Users\Elissaios\Box Sync\Projects\fMRI\analysis1";
+filePath = folder + r"\ds114_sub009_t2r1.nii";
+condPath = folder + r"\ds114_sub009_t2r1_cond.txt";
+img = nib.load(filePath)
+data = img.get_data()
+#print(img.shape)
+task = np.loadtxt(condPath)
+task[:,0:2] = task[:,0:2]/TR # change sec to TR for first two columns (2 is for 1+1)

In [12]: task = np.ones((4, 3), dtype=int)

In [13]: task[:, :2] / 2. # integer divided by float -> float
Out[13]:
array([[ 0.5, 0.5],
[ 0.5, 0.5],
[ 0.5, 0.5],
[ 0.5, 0.5]])

In [14]: task[:, :2] = task[:, :2] / 2. # Setting into int array -> int

In [15]: task
Out[15]:
array([[0, 0, 1],
[0, 0, 1],
[0, 0, 1],
[0, 0, 1]])


Reply to this email directly or view it on GitHubhttps://github.com//pull/8/files#r44866024.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same in MATLAB, except MATLAB seems to use rounding instead of just taking the integer part:

>> task = ones(4, 3, 'int32')

task =

           1           1           1
           1           1           1
           1           1           1
           1           1           1

>> task(:, 1:2) = task(:, 1:2) / 2

task =

           1           1           1
           1           1           1
           1           1           1
           1           1           1

The point is that, in order to save memory and processing time, Python / MATLAB need to be able to depend on each element of the array having the same type in memory. So, when you set part of an array, it has to become the same type of the rest of the array.

dur = list(img.shape)[3]
nTR = dur
time_course = np.zeros(nTR)
nBlock = task.shape[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general prefer variable names without capitals - e.g n_block. See https://www.python.org/dev/peps/pep-0008/#naming-conventions for a common set of guidelines.


for itr in range(0,nBlock):
durBlock = task[itr,1]
blockS = int(task[itr,0]) # block start
blockE = int(blockS+durBlock) # block end
time_course[blockS:blockE] = 1

plt.figure(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not need this line.

plt.ylim(-0.5,1.5)
plt.plot(time_course)

# Create two matrices, one for each condition; it includes first 10 s of rest
is_task_tr = [time_course==1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the list brackets around the variable. The extra brackets are why you need squeeze and np.array in the lines below. Simplify with:

is_task_tr = time_course==1
data_task = data[..., is_task_tr]

is_rest_tr = [time_course==0]
taskInd = np.squeeze(np.array([is_task_tr]))
restInd = np.squeeze(np.array([is_rest_tr]))
data_task = data[...,taskInd]
data_rest = data[...,restInd]

# Identify outlier volumes in each; use 2.5 x interquartile range as cutoff
# You could do it with variance too rather than average signal (take np.std instead of np.average)
# however, for a proper analysis of outliers you probably need both (e.g. either signal
# is high in entire slice but with little variance, or signal is high is certain voxels only
# with low average value but high std)
nBlock_task = int(data_task.shape[3])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already an integer I believe.

for t in range(0,nBlock_task):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this in class, but you might prefer something like:

n_in_slice = np.prod(data_task.shape[:-1])
task_2d = np.reshape(data_task, (n_in_slice, nBlock_task))
ave_task_vol = np.mean(task_2d, axis=0)

ave_task = ave_task + [np.average(data_task[...,t])]
ave_task_vol = np.average(ave_task)
iqRange_task = np.percentile(ave_task,75) - np.percentile(ave_task,25)
for t in range(0,nBlock_task):
if ave_task[t] > (ave_task_vol + (iqRange_task*2.5)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IQR outlier threshold is usually (https://en.wikipedia.org/wiki/Interquartile_range#Interquartile_range_and_outliers):

p25, p75 = np.percentile(ave_task, [25, 75])
iqr = p75 - p25
hi_thresh = p75 + iqr * 2.5  # well, 1.5 scaling is more typical
lo_thresh = p25 - iqr * 2.5

You would then check if the value was above hi_thresh or below lo_thresh.

You can do that in a vectorized way:

is_outlier = (ave_task_vol < lo_thresh) & (ave_task_vol > hi_thresh)

outlier_task = outlier_task + [t]

nBlock_rest = int(data_rest.shape[3])
for t in range(0,nBlock_rest):
ave_rest = ave_rest + [np.average(data_rest[...,t])]
ave_rest_vol = np.average(ave_rest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should check for outliers in the whole dataset rather than separately for the task and rest blocks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is a bimodal distribution, where one is an outlier for a specific state then you should take the individual group distributions. think of a t-test, the purpose is for the individual groups to be distributed to allow for normality, additivity and homoscedasticity, not the entire dataset. hence the outliers should be checked on individual groups. You are right though on the cutoff of the outliers, I was using an approximate recollection of the approach, though to an extent it is somewhat arbitrary.


From: Matthew Brett [[email protected]]
Sent: Saturday, November 14, 2015 2:57 PM
To: practical-neuroimaging/fmri-methods-2015
Cc: Karageorgiou, Elissaios
Subject: Re: [fmri-methods-2015] added homework py files into branch anatomical-solution (#8)

In day2_exercises/into4D.pyhttps://github.com//pull/8#discussion_r44865007:

+# however, for a proper analysis of outliers you probably need both (e.g. either signal
+# is high in entire slice but with little variance, or signal is high is certain voxels only
+# with low average value but high std)
+nBlock_task = int(data_task.shape[3])
+for t in range(0,nBlock_task):

  • ave_task = ave_task + [np.average(data_task[...,t])]
    +ave_task_vol = np.average(ave_task)
    +iqRange_task = np.percentile(ave_task,75) - np.percentile(ave_task,25)
    +for t in range(0,nBlock_task):
  • if ave_task[t] > (ave_task_vol + (iqRange_task*2.5)):
  •    outlier_task = outlier_task + [t]
    
    +nBlock_rest = int(data_rest.shape[3])
    +for t in range(0,nBlock_rest):
  • ave_rest = ave_rest + [np.average(data_rest[...,t])]
    +ave_rest_vol = np.average(ave_rest)

I think you should check for outliers in the whole dataset rather than separately for the task and rest blocks.


Reply to this email directly or view it on GitHubhttps://github.com//pull/8/files#r44865007.

iqRange_rest = np.percentile(ave_rest,75) - np.percentile(ave_rest,25)
for t in range(0,nBlock_task):
if ave_rest[t] > (ave_rest_vol + (iqRange_rest*2.5)):
outlier_rest = outlier_rest + [t]

cleanInd_task = np.arange(0,nBlock_task)
cleanInd_task = np.setdiff1d(np.array(cleanInd_task),np.array(outlier_task))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit I've never seen np.setdiff1d before. Won't it always return cleanInd_task in this case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, it is a way to do set comparisons. In this case it takes the vales in cleanInd_task that are not in outlier_task (if there was an outlier here then cleanInd_task would become a smaller array.


From: Matthew Brett [[email protected]]
Sent: Saturday, November 14, 2015 2:59 PM
To: practical-neuroimaging/fmri-methods-2015
Cc: Karageorgiou, Elissaios
Subject: Re: [fmri-methods-2015] added homework py files into branch anatomical-solution (#8)

In day2_exercises/into4D.pyhttps://github.com//pull/8#discussion_r44865024:

+iqRange_task = np.percentile(ave_task,75) - np.percentile(ave_task,25)
+for t in range(0,nBlock_task):

  • if ave_task[t] > (ave_task_vol + (iqRange_task*2.5)):
  •    outlier_task = outlier_task + [t]
    
    +nBlock_rest = int(data_rest.shape[3])
    +for t in range(0,nBlock_rest):
  • ave_rest = ave_rest + [np.average(data_rest[...,t])]
    +ave_rest_vol = np.average(ave_rest)
    +iqRange_rest = np.percentile(ave_rest,75) - np.percentile(ave_rest,25)
    +for t in range(0,nBlock_task):
  • if ave_rest[t] > (ave_rest_vol + (iqRange_rest*2.5)):
  •    outlier_rest = outlier_rest + [t]
    
    +cleanInd_task = np.arange(0,nBlock_task)
    +cleanInd_task = np.setdiff1d(np.array(cleanInd_task),np.array(outlier_task))

I must admit I've never seen np.setdiff1d before. Won't it always return cleanInd_task in this case?


Reply to this email directly or view it on GitHubhttps://github.com//pull/8/files#r44865024.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, sorry, I see what you mean with setdiff. I still think that this would be a lot cleaner using boolean arrays rather than indices, as we were discussing in class.


cleanInd_rest = np.arange(0,nBlock_rest)
cleanInd_rest = np.setdiff1d(np.array(cleanInd_rest),np.array(outlier_rest))

clean_task = data_task[...,cleanInd_task]
clean_rest = data_rest[...,cleanInd_rest]

mean_task = np.average(clean_task,axis=3) # Average voxel volumes per task
mean_rest = np.average(clean_rest,axis=3) # Average voxel volumes per task
dTask_Rest = mean_task - mean_rest
plt.figure(1)
plt.imshow(dTask_Rest[...,10],'gray')

mean_task = np.average(data_task,axis=3) # Average voxel volumes per task
mean_rest = np.average(data_rest,axis=3) # Average voxel volumes per task
dTask_Rest = mean_task - mean_rest
plt.figure(2)
plt.imshow(dTask_Rest[...,10],'gray')

# Find std of volumes
dur = list(img.shape)[3]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You just want img.shape[3] here don't you?


for t in range(0,dur):
vol = data[...,t]
#print(vol1.shape)
voxels = list(vol.shape) # find how many voxels in each dimension
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer using these.

slic = int(voxels[2]/2)-1 # define the middle slice
#plt.figure(t)
#plt.imshow(vol[...,slic],'gray')
std_vol = std_vol + [np.std(vol)]

plt.figure(3)
plt.hist(std_vol) # outline in the 800s