IMPORT sander-montant_2022 #115

Open
mzettersten opened this issue May 17, 2024 · 11 comments

@mzettersten
Contributor

No description provided.

@adriansteffan
Contributor

import not started, refer to #125 first

@alvinwmtan
Contributor

Go ahead with import; #125 has been resolved.

alvinwmtan changed the title from IMPORT montat_2022 to IMPORT sander-montant_2022 on Aug 29, 2024
@alvinwmtan
Contributor

First-pass IDless import complete. Ready for code review.

@alvinwmtan
Contributor

alvinwmtan commented Aug 29, 2024

Checklist for code review v2024

To start:

  • Git pull this repo to get the latest version
  • Update your peekds and peekbankr to the latest versions (see the setup sketch after this list)
    • Be sure to restart your R session to apply these updates
  • Get the latest version of the dataset from osf (delete your raw_data so that the script automatically downloads the data)
  • Run the import script
  • Does it run into issues due to missing libraries? During restructuring, import statements for libraries like janitor may have been lost from some datasets - re-add them if necessary
  • Does the validator complain about the processed data? Complain to Adrian (or fix the listed issues if you feel like it)
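
For convenience, a minimal setup sketch; the GitHub paths for peekds and peekbankr and the import script location are assumptions, so check the actual repos before running:

```r
# Assumed install locations; verify the actual repos before running.
# install.packages("remotes")
remotes::install_github("langcog/peekds")
remotes::install_github("langcog/peekbankr")

# Restart the R session, then run the import script for this dataset
# (path is illustrative):
source("data/sander-montant_2022/import.R")
```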

Common issues to check:

Trials

  • Are trials now unique between administrations? (see the sketch after this list)
  • Is exclusion info handled correctly? (look closely at how exclusion information is marked)
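
A minimal sketch of the uniqueness check, assuming the processed tables follow the peekbank schema (that aoi_timepoints links administration_id and trial_id is an assumption):

```r
library(dplyr)

# Each trial_id should belong to exactly one administration;
# any rows returned here mean trials are shared across administrations.
aoi_timepoints %>%
  distinct(administration_id, trial_id) %>%
  count(trial_id) %>%
  filter(n > 1)
```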

Trial Types

  • Check if the trial type IDs are created independently of administrations/subjects (see the sketch after this list)
  • Is vanilla_trial coded appropriately?
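
A minimal sketch of one way to check this, assuming trial_types follows the peekbank schema (target_id, distractor_id, target_side, and condition are assumed column names):

```r
library(dplyr)

# The same stimulus/condition combination should not get multiple
# trial_type_ids; duplicates suggest IDs were created per administration.
trial_types %>%
  count(target_id, distractor_id, target_side, condition) %>%
  filter(n > 1)
```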

Stimuli

  • If the images are on osf, make sure the image path in the dataset matches the image's path on osf
  • Make sure each row represents a label-image association
    • the labels should be the words that the participants hear. For example, "apple" is okay; "red_apple_little" is wrong and was probably erroneously extracted from the file name (see the sketch after this list)
  • Are there items in the imported dataset not mentioned in the paper?
  • Are distractors represented correctly?
    • Special explanation for distractors: If an item only ever appeared in distractor position, it still gets its own row. The label is typically the label given to the image in the experiment description (e.g., "the distractor was an image of a chair"). If there is no obvious label provided in the experiment design, leave label blank.
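
A quick heuristic for catching file-name-like labels, assuming a stimuli table with an english_stimulus_label column (the name is an assumption based on the peekbank schema):

```r
library(dplyr)

# Flag labels that look like file names rather than heard words:
# underscores, dots, or digits are all suspicious in a spoken label.
stimuli %>%
  filter(grepl("[_.]|[0-9]", english_stimulus_label)) %>%
  distinct(english_stimulus_label)
```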

Subjects

  • Does CDI data follow the new aux_data format? (see the sketch after this list)
  • Is age rounded correctly? (decision: we do no rounding)
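
A minimal sketch of both checks, assuming ages (in months) live in administrations and that aux_data is stored as a JSON string in subjects$subject_aux_data (all names are assumptions based on the peekbank schema):

```r
library(dplyr)

# If every age is integer-valued, rounding probably happened somewhere.
administrations %>%
  summarize(frac_integer_ages = mean(age == round(age), na.rm = TRUE))

# Eyeball one subject's aux_data to confirm it follows the new CDI format.
jsonlite::fromJSON(subjects$subject_aux_data[[1]])
```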

General

  • Double-check the citation, update it in the dataset table, and make sure it's consistent with the peekbank datasets google sheet
  • Are there any TODOs left in the code? Resolve/double-check them
  • Review (or add) a README (example)
    • Make sure any TODOs or other decision points in the comments of the code are documented in the README AND removed from the code to prevent ambiguity
  • General data sanity-checking (summary output helps here; see the sketch after this checklist)
    • are the general numbers (e.g. # of participants, # of stimuli, average trials per administration) in the summary consistent with the paper? aoi_timepoints are hard to gauge, but a super small number is probably bad
    • is the subject summary (age, sex distribution) approximately consistent with the paper? (note that it is not surprising if it is not identical - often we have a slightly different dataset and are not trying to reproduce the exact numbers)
    • is the target side distribution skewed towards one side?
    • any weird trial durations?
    • do the cdi rawscore numbers match the instrument and measure?
    • is the exclusion % and the exclusion reasons sensible? (bearing in mind that we only have exclusion info for some datasets)
    • Inspect the timecourse and accuracy plots/output at the end of the import:
      • Compare timecourse patterns with paper (as best as possible)
      • Does the timing seem right? (an accuracy spike later than the point of disambiguation (PoD) might be sensible; earlier is suspicious)
      • (if multiple conditions) Does the number of conditions make sense in the context of the paper?
      • (if multiple conditions) Are the overall accuracies for conditions vastly different in a way not explained by the paper?
      • Any odd item-level patterns?
      • Any odd subject-level patterns?
    • Any large (unexpected) discrepancies between data reported in paper vs. data in the imported dataset?
  • After checking everything and rerunning the script: Upload the output to osf
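
A minimal sketch of the headline sanity numbers, assuming the processed tables follow the peekbank schema (all table and column names are assumptions):

```r
library(dplyr)

n_distinct(subjects$subject_id)    # number of participants
n_distinct(stimuli$stimulus_id)    # number of stimuli

# Average trials per administration, to compare against the paper.
aoi_timepoints %>%
  distinct(administration_id, trial_id) %>%
  count(administration_id) %>%
  summarize(mean_trials = mean(n))

# Target side distribution; a strong skew toward one side is suspicious.
trials %>%
  left_join(trial_types, by = "trial_type_id") %>%
  count(target_side)
```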

@vboyce
Contributor

vboyce commented Sep 11, 2024

Exclusions look to be participant-level only, not trial-level (probably all we have).

full_phrase is missing for some of the data (presumably because we don't have it?) (Alvin confirms we don't have it)

[resolved] Looks like trial_type info is coming from a trial_info csv, possibly coded by someone (Alvin?) off of the raw stimuli. I was wondering why there are trials that are marked vanilla but have condition mispronounced; the condition column of trial_info was miscoded (very understandably) and has been corrected.

@vboyce
Contributor

vboyce commented Sep 13, 2024

I think everything is good except I couldn't track down the images.

From the readme, it sounds like the unpublished study (bh2017 but with younger kids) should have videos that could be screenshotted somewhere, but I didn't find them in a cursory look through osf repos.

The schott osf is still private, and the corresponding github repo (https://github.com/e-schott/CrossLanguagePhonologicalOverlap) doesn't seem to have stimuli. Idk if it's worth asking for the stimuli.

@alvinwmtan
Contributor

@vboyce bh2017 + unpub video files can be found here: https://osf.io/htn9j/ (it would also be good to update bh2017 if you get the images; ref #97)

@mzettersten
Contributor Author

@vboyce Should I reach out and ask? Maybe for images for all of the various projects, if that makes sense?

@vboyce
Contributor

vboyce commented Sep 16, 2024

@mzettersten reaching out for the various projects for image stimuli would be great!

and @alvinwmtan thanks, I can do the screenshotting for bh2017 and unpub and update both places!

@vboyce
Contributor

vboyce commented Sep 17, 2024

Hmm, do the bh2017 files open for you, @alvinwmtan? For me the two bouche ones do, but the others seem corrupted or to have the wrong extension, and I can't open them.

@adriansteffan
Contributor

the videos needed some convincing, but I beat them over the head with ffmpeg and they told me their secrets. I have extracted the stimuli and updated the file paths for both the unpublished sample of this dataset and for byers-heinlein_2017.
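
For future imports of this kind, a sketch of the sort of extraction this involves (pulling a still frame from each video), assuming mp4 inputs under raw_data/videos; the directory, extension, and timestamp are all illustrative:

```r
# Requires ffmpeg on the PATH; called from R via system2.
videos <- list.files("raw_data/videos", pattern = "\\.mp4$", full.names = TRUE)
for (v in videos) {
  out <- sub("\\.mp4$", ".png", basename(v))
  # Grab a single frame one second in as the stimulus image.
  system2("ffmpeg", c("-y", "-i", shQuote(v), "-ss", "00:00:01",
                      "-frames:v", "1", shQuote(out)))
}
```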

That should conclude the review! The only thing we could still do is try to get the missing images that did not overlap with bh2017.
