Linear non-Gaussian acyclic models for causal inference of latent variables in structural equation model
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Research |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/167946 |
Institution: | Nanyang Technological University |
Summary: | In psychology and the social sciences, confirmatory data analysis and hypothesis testing are in active use, but when prior studies are not available, researchers may consider an exploratory approach to analysing the data. Existing causal discovery methods designed to explore directional relationships between variables mainly handle observed variables, not latent ones. Latent variables are equally prevalent, as some psychological constructs are not directly observable. Some current methods for determining the cause-effect sequences of latent variables (e.g., direction dependence analysis and structural equation modelling with higher-order moment structures) have been under-utilized because of their large sample size requirements and the lack of statistical software support. To close this gap in the literature on causal discovery and on structural equation modelling with latent variables, this thesis develops a data-driven latent-variable causal discovery method: linear non-Gaussian acyclic models for latent variables (LiNGAM-LV). Given raw data and a user-specified measurement model, the LiNGAM-LV algorithms output path models with latent variables in the form of directed acyclic graphs, based on the criteria of pathway directionality and the balance between model complexity and model fit. This thesis proposes three LiNGAM-LV algorithms (ICA-LiNGAM-LV, DirectLiNGAM-LV and ParceLiNGAM-LV) and provides R code for their implementation. A Monte Carlo simulation study that varies sample size, non-normality, data missingness and model complexity evaluates the performance of the three algorithms with respect to path-related fit indices, accuracy and root mean square error. The simulation results showed that the LiNGAM-LV algorithms performed promisingly in general; DirectLiNGAM-LV and ICA-LiNGAM-LV performed comparably, while ParceLiNGAM-LV performed worst. LiNGAM-LV algorithms provide insight into the causal relationships between latent variables without relying on a priori hypotheses. Recommendations on using LiNGAM-LV in practice are discussed. |