CLUSTER ANALYSIS USING MINKOWSKI DISTANCE WITH POWERS BETWEEN [1, 4] BETWEEN TWO OBJECTS/INDIVIDUALS AND THE UTILIZATION OF PRINCIPAL COMPONENT ANALYSIS AND LINEAR DISCRIMINANT ANALYSIS CASE STUDY: GROUNDWATER QUALITY DATA FROM SUBDISTRICTS IN BANDUNG REGENCY, 2023

Bibliographic Details
Main Author: Arbi Wijaya, Mohammad
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/84068
Institution: Institut Teknologi Bandung
Description
Summary: This study investigates how the choice of power in the Minkowski distance metric affects cluster analysis, and how dimensionality reduction techniques such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) can make clustering results more robust. Groundwater quality data covering 10 metal contaminants from sub-districts in Bandung Regency in 2023 serve as the case study. Multivariate analysis is critical for decision-making in this context but becomes challenging with large datasets. Cluster analysis is an essential tool for identifying patterns: it groups objects by similarity, as determined by a distance measure. The Minkowski distance, which has a power parameter r, generalizes the Euclidean distance (the case r = 2), and varying r adjusts its sensitivity to variation in the data. PCA reduces the dimensionality of the data while preserving variance, using the eigenvalues and eigenvectors of the covariance matrix to select the relevant principal components. LDA is then applied to enhance class separability by maximizing the ratio of between-cluster scatter to within-cluster scatter. This transformation yields more stable data for cluster analysis. The results show that combining PCA and LDA can enhance the stability and interpretability of clustering outcomes even when the Minkowski distance parameter changes. The approach provides a stronger framework for analyzing groundwater quality data, enabling a better understanding of the relationships among metal contaminants while keeping the clustering robust to different distance definitions.
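To make the role of the power parameter concrete, the following is a minimal sketch of the standard Minkowski distance, d_r(x, y) = (sum_i |x_i - y_i|^r)^(1/r), evaluated for r = 1, 2, 3, 4. It is not the thesis's own code: the function name `minkowski_distance` and the two sample vectors are hypothetical, and the values are invented for illustration rather than taken from the Bandung Regency data.

```python
def minkowski_distance(x, y, r):
    """Minkowski distance of order r between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if r < 1:
        raise ValueError("r must be at least 1 for the result to be a metric")
    # Sum the r-th powers of the absolute coordinate differences, then take the r-th root.
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


# Hypothetical contaminant profiles for two sites (made-up values, not study data).
site_a = [0.0, 0.0, 0.0]
site_b = [3.0, 4.0, 0.0]

# Distance between the same two sites under each power considered in the study.
distances = {r: minkowski_distance(site_a, site_b, r) for r in (1, 2, 3, 4)}

for r, d in distances.items():
    print(f"r = {r}: distance = {d:.4f}")
```

With r = 1 the function gives the Manhattan distance and with r = 2 the Euclidean distance. For a fixed pair of points the distance never increases as r grows, because larger powers give more weight to the largest coordinate difference. This is one reason cluster assignments can shift when r changes.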