florida_twins_behavior #734

ben-domingue · 2024-12-13T03:08:09Z

we should investigate whether this should be multiple tables.
code here: https://github.com/ben-domingue/irw/blob/main/data/florida_twins_behavior.R

saviranadela · 2024-12-19T17:57:06Z

similar to florida_twins, so i guess we need to split it, but i might be wrong...

they have panas items, cads, friends, etc. that can be split

ben-domingue · 2024-12-19T21:44:04Z

yeah i think we should split here. apologies that i messed this up in the beginning!

ben-domingue · 2024-12-19T22:20:14Z

see also the comment i made in #735. i think these are intertwined.

ben-domingue · 2025-01-08T23:24:24Z

@saviranadela same here

saviranadela · 2025-01-09T08:52:14Z

@ben-domingue i think i understand why you separated this from the other one. based on the LDbase site, they categorize the data into four types:

1. Parent survey
2. Child survey (which isn’t necessarily a twin survey?)
3. Behavior and environment survey (described on the site as 'corresponding to both the parent and twin self-report survey data')
4. Twin progress monitoring

from what you’ve worked with previously, we only have # 2 (which is the florida_twins) and # 3 (florida_twins_behavior).

my guess is that # 3 is meant to include elements of both # 1 and # 4, which could explain why you didn’t include # 1 and # 4.

so my conclusion is, this might not be redundant with florida_twins #735

please let me know if this makes sense to you! 😬

more: https://ldbase.org/datasets/1c53beea-ddc1-4efa-a88b-dc18f311f1c6

ben-domingue · 2025-01-09T16:45:54Z

i think this makes sense. the two datasets at the beginning were just not quite right; at this point, i am happy with just decomposing the big 2 into N smaller datasets but i'm not really sure what the right value for N is. it sounds like you're getting some traction on it though?
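
One rough way to get a feel for N is to tally how many columns share each name prefix before splitting. This is only a minimal sketch in R, assuming the columns follow a `scale_suffix` naming pattern; the column names below are made up for illustration and are not taken from the actual file.

```r
# Hypothetical column names standing in for the real file's headers.
cols <- c("id0", "famid", "panas_1", "panas_2", "rcads_1",
          "cadsyv_1", "cadsyv_2", "cadsyv_3", "friends_1")

# Drop the identifier columns, then keep everything before the first underscore.
item_cols <- setdiff(cols, c("id0", "famid"))
prefix <- sub("_.*$", "", item_cols)

# Count columns per prefix; each prefix is a candidate table.
prefix_counts <- table(prefix)
print(prefix_counts)

# Number of candidate tables under this naming assumption.
n_tables <- length(prefix_counts)
```

Prefixes with only one or two columns are probably scores or metadata rather than item sets, so the count is an upper bound to start the discussion from, not an answer.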

saviranadela · 2025-01-12T00:02:16Z

decomposed into 7 smaller datasets

data:
florida_twins_behavior.zip

code:

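library(tidyverse)  # attaches readr, dplyr, and tidyr

# read the raw LDbase export
df <- read_csv('multiparentandchild0311 LDBase.csv')

# lowercase all column names so the name matching below is consistent
names(df) <- tolower(names(df))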

# drop the columns that are not kept as items, then reshape to long format
df <- df |>
  select(-starts_with('panas_pa'),
         -starts_with('panas_na'),
         -starts_with('ecs_ec'),
         -starts_with('ecs_imp'),
         -starts_with('rcads_mdd'),
         -starts_with('rcads_ocd'),
         -starts_with('rcads_gad'),
         -starts_with('rcads_pda'),
         -starts_with('rcads_sad'),
         -starts_with('rcads_sp'),
         -starts_with('cadsyv_pos'),
         -starts_with('cadsyv_dar'),
         -starts_with('cadsyv_pro'),
         -starts_with('cadsyv_neg'),
         -starts_with('cadsyv_soc'),
         -starts_with('cadsyv_resp'),
         -starts_with('cadsyv_dis'),
         -starts_with('tas_autonomic'),
         -starts_with('tas_offtask'),
         -starts_with('tas_thoughts'),
         -starts_with('friends_bad'),
         -starts_with('friends_school'),
         -starts_with('friends_good'),
         -contains('hem'),
         -contains('chaos'),
         -starts_with('p_'),  # also covers the p_panas columns
         -contains('pdbd'),
         -contains('feeling'),
         -pair_gender,
         -zyg_par,
         -starts_with('bg_id'),
         -`...1`,
         -id1,
         -contains('swan'),
         -twinid,
         -starts_with('n')) |>
  pivot_longer(cols = -c(id0, famid),
               names_to = 'item',
               values_to = 'resp',
               values_drop_na = TRUE) |>
  rename(id = id0, family_id = famid)

# list the item names that remain after the select
unique(df$item)

# print response values
table(df$resp)

df_panas <- df |>
  filter(grepl("panas", item))

df_ecs <- df |>
  filter(grepl("ecs", item))

df_rcads <- df |>
  filter(grepl("rcads", item))

df_cads <- df |>
  filter(grepl("^cads_", item))

df_tas <- df |>
  filter(grepl("tas", item))

df_friends <- df |>
  filter(grepl("friends", item))

df_cadsyv <- df |>
  filter(grepl("^cadsyv", item))

# number of distinct items in each table
length(unique(df_panas$item))
length(unique(df_ecs$item))
length(unique(df_rcads$item))
length(unique(df_cads$item))
length(unique(df_tas$item))
length(unique(df_friends$item))
length(unique(df_cadsyv$item))

write.csv(df_panas, "florida_twins_behavior_panas.csv", row.names=FALSE)
write.csv(df_ecs, "florida_twins_behavior_ecs.csv", row.names=FALSE)
write.csv(df_rcads, "florida_twins_behavior_rcads.csv", row.names=FALSE)
write.csv(df_cads, "florida_twins_behavior_cads.csv", row.names=FALSE)
write.csv(df_tas, "florida_twins_behavior_tas.csv", row.names=FALSE)
write.csv(df_friends, "florida_twins_behavior_friends.csv", row.names=FALSE)
write.csv(df_cadsyv, "florida_twins_behavior_cadsyv.csv", row.names=FALSE)
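
Because the seven `grepl()` filters are applied independently, an item that matches none of them is dropped without any warning, and an item that matches two of them ends up in two files. Below is a minimal sketch of a coverage check. It uses a toy long table in place of `df`; the item names are hypothetical, and the `patterns` vector simply mirrors the filters above rather than anything verified against the real data.

```r
# Toy long-format data standing in for `df` (item names are hypothetical).
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  item = c("panas_1", "cads_3", "cadsyv_2", "tas_5", "height"),
  resp = c(3, 1, 2, 4, 150),
  stringsAsFactors = FALSE
)

# The same patterns used in the filters above, one per output table.
patterns <- c(panas = "panas", ecs = "ecs", rcads = "rcads", cads = "^cads_",
              tas = "tas", friends = "friends", cadsyv = "^cadsyv")

items <- unique(toy$item)

# Logical matrix: rows are items, columns are output tables.
hits <- sapply(patterns, function(p) grepl(p, items))
rownames(hits) <- items

# Items matched by no pattern would be silently dropped.
unassigned <- items[rowSums(hits) == 0]
# Items matched by more than one pattern would be written to several files.
duplicated_items <- items[rowSums(hits) > 1]

print(unassigned)
print(duplicated_items)
```

In this toy example `height` is reported as unassigned and no item is matched twice. Running the same check with `unique(df$item)` in place of the toy items would show whether anything is left over after the split.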

saviranadela · 2025-01-12T00:02:52Z

ben-domingue added the "data fix" label (fixing an existing dataset) Dec 13, 2024

ben-domingue mentioned this issue Dec 19, 2024

florida_twins #735

Open
