Note: phenotype / covariate extraction should be correct. #1

KangchengHou · 2022-02-23T17:26:59Z

import numpy as np
import pandas as pd

# Phenotypes: compare each extracted trait file with the copy under PRS-RESEARCH.
trait_list = ["bmi", "cholesterol", "hdl_cholesterol", "height", "ldl_direct"]

for trait in trait_list:
    df_trait1 = pd.read_csv(f"out/pheno/{trait}.tsv", sep="\t")

    df_trait2 = pd.read_csv(
        f"/u/project/sgss/UKBB/PRS-RESEARCH/00-compile-data/out/REAL-PHENO/{trait}.raw.pheno",
        sep="\t",
    )
    # Inner join on sample IDs; the shared PHENO column gets _x / _y suffixes.
    df_trait = pd.merge(df_trait1, df_trait2, on=["FID", "IID"])
    # Prints True when the two PHENO columns agree (NaNs in the same rows count as equal).
    print(np.allclose(df_trait["PHENO_x"], df_trait["PHENO_y"], equal_nan=True))

# Covariates: compare all.covar with the covariate columns read from height.tsv.
df_covar1 = pd.read_csv(
    "/u/project/sgss/UKBB/PRS-RESEARCH/00-compile-data/out/REAL-PHENO/all.covar",
    sep="\t",
)
df_covar2 = pd.read_csv("out/pheno/height.tsv", sep="\t")
df_covar = pd.merge(df_covar1, df_covar2, on=["FID", "IID"])
for col in ["SEX", "AGE"] + [f"PC{i}" for i in range(1, 11)]:
    print(np.allclose(df_covar[f"{col}_x"], df_covar[f"{col}_y"], equal_nan=True))
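
The check above uses an inner merge, so samples present in only one of the two files are dropped before comparison. A minimal sketch of a slightly more informative check follows. The helper `compare_columns` and its output column names are hypothetical, not part of the repository, and it assumes both tables carry `FID` and `IID` plus the numeric columns being compared.

```python
import numpy as np
import pandas as pd


def compare_columns(df_a, df_b, cols, keys=("FID", "IID")):
    """Compare shared numeric columns of two tables after aligning rows on `keys`.

    Hypothetical helper for illustration only.
    """
    keys = list(keys)
    cols = list(cols)
    # An outer merge with indicator=True keeps samples found in only one table,
    # so a difference in sample sets is counted instead of silently dropped.
    merged = pd.merge(
        df_a[keys + cols],
        df_b[keys + cols],
        on=keys,
        how="outer",
        suffixes=("_a", "_b"),
        indicator=True,
    )
    both = merged[merged["_merge"] == "both"]
    n_only_a = int((merged["_merge"] == "left_only").sum())
    n_only_b = int((merged["_merge"] == "right_only").sum())

    rows = []
    for col in cols:
        a = both[f"{col}_a"].to_numpy(dtype=float)
        b = both[f"{col}_b"].to_numpy(dtype=float)
        diff = np.abs(a - b)
        # Rows where either value is NaN give a NaN difference and are skipped
        # in the maximum; all_close still reports False if only one side is NaN.
        max_abs_diff = float(np.nanmax(diff)) if np.isfinite(diff).any() else 0.0
        rows.append(
            {
                "column": col,
                "n_shared": len(both),
                "n_only_a": n_only_a,
                "n_only_b": n_only_b,
                "all_close": bool(np.allclose(a, b, equal_nan=True)),
                "max_abs_diff": max_abs_diff,
            }
        )
    return pd.DataFrame(rows)
```

With the data frames from the snippet above, this could be called as `compare_columns(df_covar1, df_covar2, ["SEX", "AGE"])`. Non-zero `n_only_a` or `n_only_b` would indicate that the two extractions do not cover the same samples.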
