
Confirmatory Factor Analysis and Structural Equation Models #695

Open
NathanielF opened this issue Aug 19, 2024 · 2 comments
Labels
proposal New notebook proposal still up for discussion

Comments


NathanielF commented Aug 19, 2024

Notebook proposal

Title: Confirmatory Factor Analysis and Structural Equation Models

Why should this notebook be added to pymc-examples?

This fills a gap in our coverage of CFA and SEM models, highlighting in particular their role in the analysis of psychometric survey data. It's super interesting and closely tied to Judea Pearl-style causal inference on DAGs.

(model diagram and data preview attached as images)

Basic CFA example in PyMC

import pymc as pm
import pytensor.tensor as pt

# df_p: DataFrame of the five indicator columns (assumed loaded earlier)
coords = {
    'obs': list(range(len(df_p))),
    'indicators': ['PI', 'AD', 'IGC', 'FI', 'FC'],
    'indicators_1': ['PI', 'AD', 'IGC'],
    'indicators_2': ['FI', 'FC'],
    'latent': ['Student', 'Faculty'],
}

obs_idx = list(range(len(df_p)))
with pm.Model(coords=coords) as model:

    # Residual (unique) standard deviations of the indicators
    Psi = pm.InverseGamma('Psi', 5, 10, dims='indicators')

    # Factor loadings; the first loading of each factor is pinned to 1
    # to identify the scale of the latent variable
    lambdas_raw_1 = pm.Normal('lambdas_1', 1, 10, dims='indicators_1')
    lambdas_1 = pm.Deterministic('lambdas1', pt.set_subtensor(lambdas_raw_1[0], 1),
                                 dims='indicators_1')
    lambdas_raw_2 = pm.Normal('lambdas_2', 1, 10, dims='indicators_2')
    lambdas_2 = pm.Deterministic('lambdas2', pt.set_subtensor(lambdas_raw_2[0], 1),
                                 dims='indicators_2')

    # Indicator intercepts
    tau = pm.Normal('tau', 3, 10, dims='indicators')

    # Latent factor scores with a free correlation between the two factors
    kappa = 0
    sd_dist = pm.Exponential.dist(1.0, shape=2)
    chol, _, _ = pm.LKJCholeskyCov('chol_cov', n=2, eta=2,
                                   sd_dist=sd_dist, compute_corr=True)
    ksi = pm.MvNormal('ksi', kappa, chol=chol, dims=('obs', 'latent'))

    # Measurement model: each indicator loads on exactly one factor
    m1 = tau[0] + ksi[obs_idx, 0] * lambdas_1[0]
    m2 = tau[1] + ksi[obs_idx, 0] * lambdas_1[1]
    m3 = tau[2] + ksi[obs_idx, 0] * lambdas_1[2]
    m4 = tau[3] + ksi[obs_idx, 1] * lambdas_2[0]
    m5 = tau[4] + ksi[obs_idx, 1] * lambdas_2[1]

    mu = pm.Deterministic('mu', pm.math.stack([m1, m2, m3, m4, m5]).T)
    _ = pm.Normal('likelihood', mu, Psi, observed=df_p.values)

    idata = pm.sample(nuts_sampler='numpyro', target_accept=.95,
                      idata_kwargs={"log_likelihood": True})
    idata.extend(pm.sample_posterior_predictive(idata))
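For intuition (not part of the proposed notebook itself), the measurement model above implies the observed covariance structure Σ = ΛΦΛᵀ + Ψ, where Λ holds the loadings, Φ the latent covariance, and Ψ the residual variances. Here is a minimal NumPy sketch simulating data consistent with the two-factor structure; all loading, intercept, and variance values are purely illustrative assumptions, not taken from the proposal:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Illustrative loading matrix: 5 indicators, 2 latent factors.
# The first loading per factor is fixed to 1, mirroring the
# identification constraint in the PyMC model above.
Lambda = np.array([
    [1.0, 0.0],   # PI  <- Student
    [0.8, 0.0],   # AD  <- Student
    [0.9, 0.0],   # IGC <- Student
    [0.0, 1.0],   # FI  <- Faculty
    [0.0, 0.7],   # FC  <- Faculty
])
Phi = np.array([[1.0, 0.5], [0.5, 1.0]])   # latent covariance
Psi = np.diag([0.4, 0.5, 0.3, 0.6, 0.4])   # residual (unique) variances
tau = np.array([3.0, 3.2, 2.8, 3.1, 2.9])  # indicator intercepts

# Covariance of the observed indicators implied by the CFA model
Sigma = Lambda @ Phi @ Lambda.T + Psi

# Simulate latent scores and indicator responses
ksi = rng.multivariate_normal(np.zeros(2), Phi, size=n)
y = tau + ksi @ Lambda.T + rng.normal(0, np.sqrt(np.diag(Psi)), size=(n, 5))

print(Sigma.round(2))
```

With a large enough sample, the empirical covariance of `y` approximates the implied `Sigma`, which is the quantity the CFA is recovering from the survey data.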

Suggested categories:

  • Level: Intermediate.

Related notebooks

Perhaps this one: https://www.pymc.io/projects/examples/en/latest/case_studies/factor_analysis.html
But it seems to treat factor analysis more as a machine-learning dimensionality-reduction technique than as a means of analysis as per the psychometrics use-case.

References

Will likely adapt a (WIP) blog post I'm working on here: https://nathanielf.github.io/posts/post-with-code/CFA_AND_SEM/CFA_AND_SEM.html

The original work references the book Bayesian Psychometric Modeling by Levy and Mislevy.

NathanielF added the proposal label on Aug 19, 2024.
@cluhmann (Member):
Seems kind of cool. There have definitely been questions about this on Discourse.

NathanielF mentioned this issue on Sep 1, 2024.