data_access.py
# ### Accessing NeuroVault Archived Data
# First, extract the desired archive file with `tar`:
# +
# #!tar -xvf november_2022.tar.gz
# -
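# Where a shell is not available, the extraction step above can be sketched in Python with the standard-library `tarfile` module.
# The helper name `extract_archive` and its default arguments are illustrative assumptions, not part of any NeuroVault tooling.
# Only extract archives from a source you trust.
import tarfile
from pathlib import Path


def extract_archive(archive_path='november_2022.tar.gz', dest='.'):
    """Extract a gzip-compressed tar archive into `dest` and return the member names.

    Returns an empty list if the archive does not exist, so the cell can be
    re-run safely on a machine where the file has not been downloaded yet.
    """
    archive = Path(archive_path)
    if not archive.exists():
        return []
    with tarfile.open(archive, 'r:gz') as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


# Example call (left commented out so nothing is extracted unintentionally):
# extract_archive('november_2022.tar.gz')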
# Next, we will load the files into pandas DataFrames:
import pandas as pd
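# Load collections (assuming the archive follows the same `statmaps_<table>.csv` naming as the other files, i.e. `statmaps_collection.csv`):
collections = pd.read_csv('november_2022/statmaps_collection.csv')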
collections.head()
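# Now, we load the statistical images, which requires merging several tables.
#
# The tables are linked by pointer columns: `statisticmap` refers to `image` through `image_ptr_id`, and `image` refers to `basecollectionitem` through `basecollectionitem_ptr_id`, so all three tables are needed.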
image = pd.read_csv('november_2022/statmaps_image.csv')
basecollectionitem = pd.read_csv('november_2022/statmaps_basecollectionitem.csv')
statisticmap = pd.read_csv('november_2022/statmaps_statisticmap.csv')
# The `image` table is first merged with `basecollectionitem` on `basecollectionitem_ptr_id`:
image_merged = pd.merge(image, basecollectionitem, left_on='basecollectionitem_ptr_id', right_on='id')
# Next, the `statisticmap` table is merged with the result on `image_ptr_id`, which corresponds to `basecollectionitem_ptr_id`:
statisticmap_merged = pd.merge(statisticmap, image_merged, left_on='image_ptr_id', right_on='basecollectionitem_ptr_id')
statisticmap_merged.head()
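# The merges above use the default inner join, which silently drops rows that have no match in the parent table.
# A minimal sketch of a checked merge follows, assuming each child row should map to exactly one parent row.
# The helper name `merge_with_check` and the toy columns `file` and `name` are hypothetical and exist only for this illustration.
import pandas as pd


def merge_with_check(child, parent, child_key, parent_key='id'):
    """Left-merge `child` onto `parent` and count child rows without a parent.

    `validate='one_to_one'` makes pandas raise a MergeError if either key
    column contains duplicates; `indicator=True` adds a `_merge` column that
    marks unmatched rows as 'left_only'.
    """
    merged = pd.merge(
        child,
        parent,
        how='left',
        left_on=child_key,
        right_on=parent_key,
        suffixes=('', '_parent'),
        validate='one_to_one',
        indicator=True,
    )
    unmatched = int((merged['_merge'] == 'left_only').sum())
    return merged.drop(columns='_merge'), unmatched


# Toy data standing in for the real tables: the third image has no parent row.
toy_image = pd.DataFrame({
    'basecollectionitem_ptr_id': [1, 2, 3],
    'file': ['a.nii.gz', 'b.nii.gz', 'c.nii.gz'],
})
toy_item = pd.DataFrame({'id': [1, 2], 'name': ['map A', 'map B']})
toy_merged, toy_unmatched = merge_with_check(
    toy_image, toy_item, 'basecollectionitem_ptr_id'
)
# With the real tables, the equivalent call would be
# merge_with_check(image, basecollectionitem, 'basecollectionitem_ptr_id').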