STRUCTURE OF CONTINGENCY TABLE USING CARDANO AND CARDANO-FERRARI FORMULAS ON CORRESPONDENCE ANALYSIS

Bibliographic Details
Main Author: Eka Lestari, Karunia
Format: Dissertations
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/52898
Institution: Institut Teknologi Bandung
Description
Summary: Correspondence analysis (CA) is a categorical data analysis technique that represents the dependencies between two categorical variables and visualizes them on a correspondence plot. This research focuses on the role of eigenvalues in determining the quality of the correspondence plot, which motivates the effort to obtain the eigenvalues analytically using the Cardano and Cardano-Ferrari formulas. In conventional CA, eigenvalues are commonly obtained by numerical processes. In this dissertation, the Cardano and Cardano-Ferrari formulas are developed to obtain eigenvalues in the context of CA, each formulated as a lemma. The left and right singular vectors associated with each formula are likewise formulated in several lemmas. The correspondence plot is constructed from the principal coordinates of the rows and columns, which are formulated in two theorems. Furthermore, the dissertation investigates the structure, or class, of contingency tables that can be solved using the Cardano and Cardano-Ferrari formulas; this structure is formulated in two theorems. The study therefore opens opportunities to develop CA in terms of both linear algebra and statistics. The eigenvalues obtained from the analytic approach are compared with those obtained from the numerical approach. The results show that the analytic approach using the Cardano and Cardano-Ferrari formulas has several advantages over a numerical approach using the bisection method: (1) it produces the same eigenvalues (roots) as the numerical approach, but with greater precision because no approximation error is involved; (2) it does not require an initial guess; (3) it does not involve iteration, so computation is faster; (4) manual calculation is easy because it uses a formula; and (5) the algorithm is simpler.
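As an illustration of the analytic route described above, the following is a minimal sketch and not the dissertation's algorithm. It assumes a contingency table with four columns, so that the 4 x 4 matrix built from the standardized residuals has one trivial zero eigenvalue and the remaining eigenvalues are the roots of a cubic. The cubic is solved with the trigonometric form of Cardano's formula, which applies when all three roots are real. The function names and the example table are hypothetical.

```python
import numpy as np

def standardized_residuals(table):
    """Matrix S = D_r^(-1/2) (P - r c^T) D_c^(-1/2) of a contingency table."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    E = np.outer(r, c)                   # expected proportions under independence
    return (P - E) / np.sqrt(E)

def cubic_real_roots(a, b, c):
    """Roots of x^3 + a x^2 + b x + c, assuming all three roots are real.

    Uses the trigonometric form of Cardano's formula on the depressed
    cubic t^3 + p t + q with x = t - a/3.
    """
    p = b - a * a / 3.0
    q = 2.0 * a**3 / 27.0 - a * b / 3.0 + c
    if p >= -1e-14:                      # p ~ 0 with real roots means a triple root
        t = np.zeros(3)
    else:
        m = 2.0 * np.sqrt(-p / 3.0)
        arg = np.clip(3.0 * q / (p * m), -1.0, 1.0)  # guard against rounding
        theta = np.arccos(arg) / 3.0
        t = m * np.cos(theta - 2.0 * np.pi * np.arange(3) / 3.0)
    return np.sort(t - a / 3.0)[::-1]    # descending order

def ca_inertias_cardano(table):
    """Nontrivial eigenvalues of S^T S for a table with four columns (assumption)."""
    S = standardized_residuals(table)
    A = S.T @ S
    if A.shape != (4, 4):
        raise ValueError("this sketch assumes a table with exactly four columns")
    A2 = A @ A
    t1, t2, t3 = np.trace(A), np.trace(A2), np.trace(A2 @ A)
    # Elementary symmetric functions of the eigenvalues; det(A) = 0 here,
    # so the characteristic polynomial is x * (x^3 - e1 x^2 + e2 x - e3).
    e1 = t1
    e2 = (t1**2 - t2) / 2.0
    e3 = (t1**3 - 3.0 * t1 * t2 + 2.0 * t3) / 6.0
    return cubic_real_roots(-e1, e2, -e3)

# Hypothetical 4 x 4 contingency table, for illustration only.
table = np.array([[20, 5, 3, 2],
                  [4, 18, 6, 2],
                  [3, 5, 15, 7],
                  [1, 4, 6, 19]])
inertias = ca_inertias_cardano(table)
numeric = np.linalg.svd(standardized_residuals(table), compute_uv=False)[:3] ** 2
print("Cardano :", inertias)
print("SVD     :", numeric)
```

The comparison against `numpy.linalg.svd` is only a sanity check for the sketch; it is not the bisection comparison reported in the abstract.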
Next, the study identifies how the eigenvalues change when the contingency table changes in a patterned way, for example when the sample size changes. To this end, the eigenvalues were computed for several sample sizes drawn by systematic sampling. The results show that a change in sample size that produces only a small change in the matrix ????? does not always produce a small change in its eigenvalues. In other words, a larger sample size (closer to the size of the original data) does not guarantee that the eigenvalues will approach the eigenvalues of the matrix ????? for the original data (100% of the data). Therefore, an elliptical confidence region is constructed to determine how close (or far) each coordinate at a given sample size is to the corresponding coordinate on the original data plot, and hence whether the plot can be said to be similar to, or to significantly represent, the original data plot. The study shows that the larger the sample size, the smaller the confidence ellipses around the coordinate points on the plot. The coordinate points thus move closer to their positions in the original data, and the plot increasingly represents the configuration of points in the original data. The Cardano and Cardano-Ferrari formulas in the context of CA are applied to the 2018 ITB tracer study, regarding the relevance of the study program and its contribution to future careers. The resulting correspondence plot displays an interesting pattern in low dimensions, which makes the data easier to interpret. Algorithms and programs were built in Python to help CA users. This study contributes to CA users from a practical point of view and offers interesting incremental advances in the computational aspect.
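The sample-size simulation described above can be sketched roughly as follows. This is an assumption-laden illustration, not the procedure used in the dissertation: the table is expanded into one record per observation (ordered cell by cell, which is an assumption about record order), every `step`-th record is kept, and the eigenvalues are recomputed numerically with `numpy.linalg.svd`. The function names and the example table are hypothetical.

```python
import numpy as np

def principal_inertias(table):
    """Squared singular values of the standardized residual matrix."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    E = np.outer(r, c)
    S = (P - E) / np.sqrt(E)
    return np.linalg.svd(S, compute_uv=False) ** 2

def systematic_subtable(table, step, start=0):
    """Contingency table rebuilt from every `step`-th observation.

    Assumption: observations are listed cell by cell in row-major order.
    """
    N = np.asarray(table, dtype=int)
    rows, cols = np.indices(N.shape)
    # One (row, column) record per observation.
    rec_rows = np.repeat(rows.ravel(), N.ravel())
    rec_cols = np.repeat(cols.ravel(), N.ravel())
    picked = np.arange(start, N.sum(), step)   # systematic selection
    sub = np.zeros_like(N)
    np.add.at(sub, (rec_rows[picked], rec_cols[picked]), 1)
    return sub

# Hypothetical table, for illustration only.
table = np.array([[200, 50, 30, 20],
                  [40, 180, 60, 20],
                  [30, 50, 150, 70],
                  [10, 40, 60, 190]])

for step in (1, 2, 4):
    sub = systematic_subtable(table, step)
    share = sub.sum() / table.sum()
    print(f"{share:.0%} of data:", principal_inertias(sub)[:3])
```

With `step=1` the subtable equals the original table, which gives the 100% reference against which the smaller samples can be compared.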