IMPORT kremin_2021 #133

alvinwmtan · 2024-08-28T21:23:20Z

Kremin et al. (2021) has two data subsets: Eng–Fra bilingual and Eng–Spa bilingual. The former is included in sander-montant_2022, but the latter is not; we will import this whole dataset separately.

alvinwmtan · 2024-08-30T00:06:45Z

IDless import complete; ready for review

alvinwmtan · 2024-08-30T00:06:46Z

Checklist for code review v2024

To start:

Git pull this repo to get the latest version
Update your peekds and peekbankr to the latest version
- Be sure to restart your R Session to apply these updates
Get the latest version of the dataset from osf (delete your raw_data, so that the script automatically downloads the data)
Run the import script
Does it run into issues due to missing libraries? During restructuring, import statements for libraries like janitor might have been lost in some datasets; re-add them if necessary
Does the validator complain about the processed data? Complain to Adrian (or fix the listed issues if you feel like it)

Common issues to check:

Trials

Are trials now unique between administrations?
Is exclusion info handled correctly? (look closely at how exclusion information is marked)

Trial Types

Check if the trial type IDs are created independently of administrations/subjects
Is vanilla_trial coded appropriately?

Stimuli

If the images are on osf, make sure the image path contains the image path on osf
Make sure each row represents a label-image association
- the labels should be the words that the participants hear. For example, "apple" is okay, "red_apple_little" is wrong and was probably erroneously extracted from the file name
Are there items in the imported dataset not mentioned in the paper?
Are distractors represented correctly?
- Special explanation for distractors: If an item only ever appeared in distractor position, it still gets its own row. The label is typically the label given to the image in the experiment description (e.g., "the distractor was an image of a chair"). If there is no obvious label provided in the experiment design, leave label blank.

Subjects

Does CDI data follow the new aux_data format?
Age rounded correctly? (decision: we do no rounding)

General

vboyce · 2024-09-10T00:23:31Z

@alvinwmtan Is the raw data for kremin_2021 on the peekbank osf? (I'm not able to download it and I'm not seeing it there.)

alvinwmtan · 2024-09-10T00:30:23Z

sorry, just uploaded. should be there now

vboyce · 2024-09-10T01:01:08Z

Thanks! Might I also have "demo_comp.Rda"?

alvinwmtan · 2024-09-10T01:51:55Z

ah yes sorry, forgot i had to copy it over from sander-montant_2022

vboyce · 2024-09-10T18:52:04Z

sorry to keep asking for files (@alvinwmtan), but I can't find
target-distractor-pairs.csv
trial_info_fr.csv
trial_info_sp.csv
and also the images if we have them

alvinwmtan · 2024-09-10T22:36:33Z

my bad; added them. there are wmv files that could be screenshotted to grab the images but i haven't done so—feel free to!

note also that trial_info_fr.csv and trial_info_sp.csv were constructed by me rather than from the original raw_data; it is possible but just somewhat annoying to programmatically pull together all the different pieces of info needed.

vboyce · 2024-09-13T19:02:34Z

Images could be pulled via screenshot
readme mentions that CDI data for the montreal subset exists (but has not been imported yet) (and DVAP -- another vocab measure)

vboyce · 2024-09-13T19:04:03Z

@alvinwmtan I'm getting a validation error because of an aoi region that is all NAs. It looks like this comes from the fact that one of the datasets has aoi coordinates and the other doesn't (it was hand coded), so NAs get added to that one in a bind_rows?
Does that sound right? / Do you have ideas for fixing?

alvinwmtan · 2024-09-14T13:41:54Z

this is correct (montreal has AOIs and princeton doesn't). somehow i didn't get a validation error when i ran it though? not sure why this is popping up
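For reference, a minimal sketch of how bind_rows produces the all-NA region. The two data frames are made-up stand-ins for the two sites; apart from l_x_max, the column names and values are assumptions and are not taken from the import script.

```r
library(dplyr)

# Hypothetical stand-ins for the two sites (values are invented)
montreal <- data.frame(trial = 1:2, l_x_min = c(0, 0), l_x_max = c(400, 400))
princeton <- data.frame(trial = 3:4)  # hand coded, so no AOI coordinate columns

# bind_rows() fills columns that are missing from one input with NA
combined <- bind_rows(montreal, princeton)
is.na(combined$l_x_max)

# Deriving region sets from the combined data then yields one all-NA row
region_sets <- distinct(combined, l_x_min, l_x_max)
nrow(region_sets)
```

The rows coming from the hand-coded site end up with NA coordinates, so any region table derived from the combined data contains one extra entry that is entirely NA.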

vboyce · 2024-09-16T23:45:22Z

so the validation issue seems to not be about the NAs and more be about that there are multiple regions in the aoi_region_set, but there's only 1 in the trial_types? which I think I've traced back to

 if(!is.na(data$l_x_max[[1]])){
    
    trial_types$aoi_region_set_id <- 0

in the digest_data function at
https://github.com/peekbank/peekbank-data-import/blob/33e750103cb1249b35daa01c6a7e9de1da5a6749/helper_functions/idless_draft.R#L284C1-L284C39.

Commenting out that line does seem to "fix" things, but I don't know what its purpose is -- @adriansteffan what is this line supposed to be doing?
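A rough sketch of the mismatch, using miniature hypothetical tables. Only the aoi_region_set_id column name comes from the code above; the helper unused_region_sets is invented for illustration and is not the peekds validator's actual check.

```r
# Hypothetical miniature tables; set 1 stands for the all-NA (hand-coded) region
aoi_region_sets <- data.frame(aoi_region_set_id = c(0, 1), l_x_max = c(400, NA))
trial_types <- data.frame(trial_type_id = 0:3, aoi_region_set_id = c(0, 0, 1, 1))

# Effect of assigning 0 to every trial type, as the quoted line does
overwritten <- trial_types
overwritten$aoi_region_set_id <- 0

# Illustrative helper (an assumption): region sets that no trial type refers to
unused_region_sets <- function(trial_types, aoi_region_sets) {
  setdiff(aoi_region_sets$aoi_region_set_id, trial_types$aoi_region_set_id)
}

unused_region_sets(trial_types, aoi_region_sets)  # nothing left over
unused_region_sets(overwritten, aoi_region_sets)  # set 1 is no longer referenced
```

With the overwrite in place, every trial type points at set 0 while the region table still holds two sets, which matches the "multiple regions but only 1 in the trial_types" symptom.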

adriansteffan · 2024-09-18T11:30:42Z

> so the validation issue seems to not be about the NAs and more be about that there are multiple regions in the aoi_region_set, but there's only 1 in the trial_types? which I think I've traced back to
>
>  if(!is.na(data$l_x_max[[1]])){
>
>     trial_types$aoi_region_set_id <- 0
>
> in the digest_data function at https://github.com/peekbank/peekbank-data-import/blob/33e750103cb1249b35daa01c6a7e9de1da5a6749/helper_functions/idless_draft.R#L284C1-L284C39.
>
> Commenting out that line does seem to "fix" things, but I don't know what its purpose is -- @adriansteffan what is this line supposed to be doing?

The line is supposed to remind me to read my code more carefully before committing. I removed it, thanks for the catch!

adriansteffan · 2024-12-16T15:51:28Z

Stimulus screenshots are on osf and image paths updated. Will look into CDI/Lang Exposure/DVAP next

adriansteffan · 2024-12-16T21:51:16Z

Imported lang exposure/dvap data, cdi data still needs some checking if the data actually matches our sample
