IMPORT fernald_totlot #143

Open · 5 tasks done
adriansteffan opened this issue Oct 23, 2024 · 5 comments

@adriansteffan (Contributor) commented Oct 23, 2024

Some things we need to discuss next meeting:

  • vanilla trials and stimuli are a bit of a mess (the groups variable might give a clue?)
  • the CDI data has columns that could be referring to something different (probably not, though)
  • should we implement a cutoff for timepoints/frames after which very little hand-coded data exists?
  • should NA values in the "aoi" column translate to "missing", NA, or should they be filtered out?
  • are there images for this?
@mzettersten (Contributor) commented Oct 31, 2024

  • don't worry about images (non-trivial to dig up)
  • trials are pretty much consistent with how we guessed in our meeting:
    • 18 months:
      • whole: vanilla (just regular trials)
      • yfill: also vanilla
      • gated: these are the "cut off" words, non-vanilla
      • zteach: this is a teaching trial in which a novel object is mapped to a novel label (nonce or kreeb) - non-vanilla
      • (a)learn: this is a test trial for the novel labels - non-vanilla
    • 21 months:
      • xfacil: these are trials in which the verb facilitates the recognition of the label ("eat the cookie") - non-vanilla
      • whole: regular vanilla trials
      • look: these are matched to the xfacil trials, e.g. "Look at the cookie" instead of the xfacil trial "eat the cookie". These are vanilla, with the caveat that I'm not entirely sure where F0 is set within the trial (beginning of the trial or onset of cookie?). We should look carefully at these specific trials to clear this up.
      • gated: as before, non-vanilla (truncated words)
      • new: these are mutual exclusivity-style trials, non-vanilla
      • ylearn: novel word trials, non-vanilla
    • 25 months: all as before, with the following changes:
      • losse: this is like "gated", non-vanilla
      • hard: vanilla - this just means that the specific words were "harder"
      • nice/super/pretty/none - all vanilla; these just mean that an additional adjective was shown (except for the none trials, which are just paired with, e.g., the pretty trials), but I think it is safe to consider these all vanilla for our purposes. Let's also keep a close eye on the timing of the curve inflection for these, just to make sure the points of disambiguation are right
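As a minimal sketch of how this mapping could be encoded when building the trial types (all condition labels here are assumptions taken from this comment, not verified against the raw files):

```r
library(dplyr)

# conditions treated as vanilla per the mapping above; everything else
# (gated, zteach, (a)learn, xfacil, new, ylearn, losse) is non-vanilla
vanilla_conditions <- c("whole", "yfill", "look", "hard",
                        "nice", "super", "pretty", "none")

# toy example standing in for the real trial_types construction
trial_types <- tibble(condition = c("whole", "gated", "xfacil", "super")) %>%
  mutate(vanilla_trial = condition %in% vanilla_conditions)
```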

@mzettersten (Contributor)

  • NA in the aoi column should typically be "missing" (a recode sketch follows after this list)
  • some kind of percentage cutoff (<1%?) would be reasonable to standardize
  • I think the CDI data should be straightforward:
    • und12new, und15new: comprehension for CDI W&G at 12 months and 15 months
    • said12, vocab15: production for CDI W&G at 12 months and 15 months
    • vocab18, vocab21, vocab25: CDI words and sentences (production) at 18, 21 and 25 months
    • vocper18/21/25: corresponding percentile scores
    • compl18, complx21, complx25: grammatical complexity measure at 18, 21, 25 months (see paper for details)
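A hedged sketch of how these decisions might translate into the import script; the data frames and the t/aoi column names are stand-ins for whatever the script actually uses (percentile and complexity columns omitted for brevity):

```r
library(dplyr)
library(tidyr)

# recode NA aoi values to "missing" and drop timepoints where fewer than 1%
# of frames carry hand-coded data (the tentative cutoff discussed above)
clean_aoi <- function(d) {
  d <- d %>% mutate(aoi = ifelse(is.na(aoi), "missing", aoi))
  coverage <- d %>%
    group_by(t) %>%
    summarise(coverage = mean(aoi != "missing"), .groups = "drop")
  d %>% semi_join(filter(coverage, coverage >= 0.01), by = "t")
}

# reshape the wide CDI columns described above into long form
cdi_to_long <- function(cdi_wide) {
  cdi_wide %>%
    pivot_longer(
      cols = c(und12new, und15new, said12, vocab15,
               vocab18, vocab21, vocab25),
      names_to = "measure", values_to = "rawscore"
    )
}
```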

@adriansteffan (Contributor, Author) commented Oct 31, 2024

  • what is a "toma"? a novel word?
  • same as with adams marchman: the timing of the looking-score time graph looks off both before AND after the resampling/rezeroing/norming, being too early and too late, respectively. A second pair of eyes could help here
  • check the outlier conditions (super, yfill): this needs another set of eyes; there does not seem to be any obvious anomaly other than the scores being abnormally high/low (shifting/stretching of the time axis was a hypothesis, but all data start at t=0, so that can probably be ruled out)
    • yfill covers only nine 18-month-olds, so that might explain something
    • super covers 58 25-month-olds, so that one is a head-scratcher
      (decision: filter out super for now until we find an explanation; yfill is likely explained by the low n/age)
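For the decision above, something like the following in the import script (the condition names and the t/aoi columns are assumptions carried over from the sketches above):

```r
library(dplyr)
library(ggplot2)

# quick visual check: accuracy over time by condition, to spot outliers
# like super/yfill (assumes columns condition, t, and aoi)
plot_condition_timecourses <- function(d) {
  d %>%
    filter(aoi %in% c("target", "distractor")) %>%
    group_by(condition, t) %>%
    summarise(accuracy = mean(aoi == "target"), .groups = "drop") %>%
    ggplot(aes(t, accuracy, colour = condition)) +
    geom_line()
}

# the decision above: drop "super" until we find an explanation
drop_super <- function(d) filter(d, condition != "super")
```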

@mcfrank (Contributor) commented Nov 1, 2024 via email

@adriansteffan (Contributor, Author)
Checklist for code review v2024

To start:

  • Git pull this repo to get the latest version
  • Update your peekds and peekbankr packages to the latest versions
    • Be sure to restart your R session to apply these updates
  • Get the latest version of the dataset from OSF (delete your raw_data folder so that the script automatically re-downloads the data)
  • Run the import script
  • Does it run into issues due to missing libraries? During restructuring, library() calls for packages like janitor may have been lost in some datasets; re-add them if necessary
  • Does the validator complain about the processed data? Complain to Adrian (or fix the listed issues if you feel like it)
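In R, the update steps might look like this (the repo paths are my assumption; check the project README if the packages have moved):

```r
# install.packages("remotes")  # if not already installed
remotes::install_github("langcog/peekds")
remotes::install_github("langcog/peekbankr")

# restart the R session, then run the import script, e.g.:
# source("data/fernald_totlot/import.R")
```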

Common issues to check:

Trials

  • Are trials now unique between administrations?
  • Is exclusion info handled correctly? (look closely at how exclusion information is marked)
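One way to check trial uniqueness, assuming the processed tables follow the standard peekbank layout and are written to processed_data/:

```r
library(dplyr)
library(readr)

aoi <- read_csv("processed_data/aoi_timepoints.csv")

# every trial_id should belong to exactly one administration; any rows
# returned here are trials shared across administrations
aoi %>%
  distinct(administration_id, trial_id) %>%
  count(trial_id) %>%
  filter(n > 1)
```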

Trial Types

  • Check if the trial type IDs are created independently of administrations/subjects
  • Is vanilla_trial coded appropriately?
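A sketch for the trial type check, under the same assumptions about the processed table layout:

```r
library(dplyr)
library(readr)

trial_types <- read_csv("processed_data/trial_types.csv")

# each design cell should map to exactly one trial_type_id; multiple IDs
# per cell suggest the IDs were generated per administration/subject
trial_types %>%
  distinct(trial_type_id, target_id, distractor_id, target_side, condition) %>%
  count(target_id, distractor_id, target_side, condition) %>%
  filter(n > 1)
```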

Stimuli

  • If the images are on osf, make sure the image path contains the image path on osf
  • Make sure each row represents a label-image association
    • the labels should be the words that the participants hear. For example, "apple" is okay, "red_apple_little" is wrong and was probably erroneously extracted from the file name
  • Are there items in the imported dataset not mentioned in the paper?
  • Are distractors represented correctly?
    • Special explanation for distractors: If an item only ever appeared in distractor position, it still gets its own row. The label is typically the label given to the image in the experiment description (e.g., "the distractor was an image of a chair"). If there is no obvious label provided in the experiment design, leave label blank.
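A quick heuristic for catching file-name-style labels (the english_stimulus_label column name is my assumption from the current schema):

```r
library(dplyr)
library(readr)
library(stringr)

stimuli <- read_csv("processed_data/stimuli.csv")

# spoken-word labels like "apple" are fine; underscores or digits usually
# mean the label was extracted from a file name and needs manual review
stimuli %>%
  filter(str_detect(english_stimulus_label, "[_0-9]")) %>%
  distinct(english_stimulus_label)
```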

Subjects

  • Does CDI data follow the new aux_data format?
  • Age rounded correctly? (decision: we do no rounding)
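To eyeball the aux_data format, something like this can help (the subject_aux_data column name is my assumption from the current schema):

```r
library(readr)
library(jsonlite)

subjects <- read_csv("processed_data/subjects.csv")

# spot-check one subject's JSON blob for the expected cdi_responses structure
str(fromJSON(subjects$subject_aux_data[[1]]))
```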

General

  • Double-check the citation, update it in the dataset table, and make sure it is consistent with the peekbank datasets Google sheet
  • Are there any TODOs left in the code? Resolve/double-check them
  • Review (or add) a README (example)
    • Make sure any TODOs or other decision points in the comments of the code are documented in the README AND removed from the code to prevent ambiguity
  • General data sanity-checking (summary output helps here)
    • are the general numbers (e.g., # of participants, # of stimuli, average trials per administration) in the summary consistent with the paper? aoi_timepoints counts are hard to gauge, but a very small number is probably bad
    • is the subject summary (age, sex distribution) approximately consistent with the paper? (note that it is not surprising if it is not identical; often we have a slightly different dataset and are not trying to reproduce the exact numbers)
    • is the target side distribution skewed towards one side?
    • any weird trial durations?
    • do the CDI rawscore numbers match the instrument and measure?
    • are the exclusion % and the exclusion reasons sensible? (bearing in mind that we only have exclusion info for some datasets)
    • Inspect the timecourse and accuracy plots/output at the end of the import:
      • Compare timecourse patterns with paper (as best as possible)
      • Does the timing seem right? (an accuracy spike later than the PoD might be sensible; earlier is suspicious)
      • (if multiple conditions) Does the number of conditions make sense in the context of the paper?
      • (if multiple conditions) Are the overall accuracies for conditions vastly different in a way not explained by the paper?
      • Any odd item-level patterns?
      • Any odd subject-level patterns?
    • Any large (unexpected) discrepancies between data reported in paper vs. data in the imported dataset?
  • After checking everything and rerunning the script: upload the output to OSF
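A few of these sanity checks, sketched under the same table-layout assumptions as above (t_norm is the rezeroed time column in aoi_timepoints):

```r
library(dplyr)
library(readr)

admins      <- read_csv("processed_data/administrations.csv")
trial_types <- read_csv("processed_data/trial_types.csv")
aoi         <- read_csv("processed_data/aoi_timepoints.csv")

# headline numbers to compare against the paper
n_distinct(admins$subject_id)                    # number of participants
aoi %>% count(administration_id) %>% summary()   # timepoints per administration

# target side balance: a strong skew toward one side is a red flag
trial_types %>% count(target_side)

# trial durations: negative or implausibly long values are suspicious
aoi %>%
  group_by(administration_id, trial_id) %>%
  summarise(duration = max(t_norm) - min(t_norm), .groups = "drop") %>%
  pull(duration) %>%
  summary()
```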
