Linear non-Gaussian acyclic models for causal inference of latent variables in structural equation model

Bibliographic Details
Main Author: Luk, Chun To
Other Authors: Ho Moon-Ho Ringo
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access: https://hdl.handle.net/10356/167946
Institution: Nanyang Technological University
Description
Summary: In psychology and the social sciences, confirmatory data analysis and hypothesis testing are in active use, but prior studies are sometimes unavailable, in which case researchers may consider an exploratory approach to analysing the data. Existing causal discovery methods designed to explore directional relationships between variables mainly handle observed variables, not latent ones. Latent variables are equally prevalent, as some psychological constructs are not directly observable. Some current methods for determining the cause-effect sequences of latent variables (e.g., direction dependence analysis and structural equation modelling with higher-order moment structures) have been under-utilized because of their large sample size requirements and the lack of statistical software support. To close this gap in the literature on causal discovery and on structural equation modelling with latent variables, this thesis develops a data-driven latent-variable causal discovery method: linear non-Gaussian acyclic models for latent variables (LiNGAM-LV). Given raw data and a user-specified measurement model, the LiNGAM-LV algorithms output path models with latent variables in the form of directed acyclic graphs, based on the criteria of pathway directionality and the balance between model complexity and model fit. This thesis proposes three types of LiNGAM-LV algorithms (i.e., ICA-LiNGAM-LV, DirectLiNGAM-LV and ParceLiNGAM-LV) and provides R code for their implementation. A Monte Carlo simulation study follows that varies sample size, non-normality, data missingness and model complexity. It aims to evaluate the performance of the three algorithms with respect to path-related fit indices, accuracy and root mean square error. The simulation results show that the LiNGAM-LV algorithms performed promisingly in general; the performance of DirectLiNGAM-LV and ICA-LiNGAM-LV was comparable, while ParceLiNGAM-LV performed worst.
LiNGAM-LV algorithms provide insight into the causal relationships between latent variables without relying on a priori hypotheses. Recommendations on using LiNGAM-LV in practice are discussed.
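
The general idea behind linear non-Gaussian direction finding can be illustrated with a small sketch. This is not the thesis's R implementation and it does not handle latent variables or a measurement model; it works on two observed variables only. The function names (`ols_residual`, `dependence`, `infer_direction`), the simulated data, and the dependence proxy (absolute correlation between squared regressor and squared residual) are assumptions made for illustration. DirectLiNGAM-style methods use more principled independence measures. The sketch relies on the fact that, with non-Gaussian noise, the regression residual is independent of the regressor only in the true causal direction.

```python
import random


def centre(values):
    """Subtract the mean so the regression needs no intercept."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def ols_residual(target, regressor):
    """Residual of a least-squares regression of target on regressor (both centred)."""
    slope = sum(r * t for r, t in zip(regressor, target)) / sum(r * r for r in regressor)
    return [t - slope * r for r, t in zip(regressor, target)]


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    a, b = centre(a), centre(b)
    cov = sum(p * q for p, q in zip(a, b))
    return cov / (sum(p * p for p in a) * sum(q * q for q in b)) ** 0.5


def dependence(regressor, resid):
    """Crude dependence proxy (an assumption of this sketch, not a formal test):
    |correlation| between the squared regressor and the squared residual."""
    return abs(correlation([r * r for r in regressor], [e * e for e in resid]))


def infer_direction(x, y):
    """Pick the direction whose residual looks less dependent on its regressor."""
    x, y = centre(x), centre(y)
    dep_xy = dependence(x, ols_residual(y, x))  # candidate model x -> y
    dep_yx = dependence(y, ols_residual(x, y))  # candidate model y -> x
    direction = "x -> y" if dep_xy < dep_yx else "y -> x"
    return direction, dep_xy, dep_yx


# Simulated data (hypothetical): x causes y, with uniform (non-Gaussian) noise.
rng = random.Random(0)
n = 5000
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [0.8 * xi + rng.uniform(-1.0, 1.0) for xi in x]

direction, dep_xy, dep_yx = infer_direction(x, y)
print(direction, round(dep_xy, 3), round(dep_yx, 3))
```

With uniform noise, the reverse regression leaves a residual whose square is clearly correlated with the squared regressor, so the sketch should favour `x -> y`. With Gaussian noise both proxies would be close to zero and the direction would not be identifiable, which is why the non-Gaussianity assumption matters.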