CLUSTER ANALYSIS USING MINKOWSKI DISTANCE WITH POWERS BETWEEN [1, 4] BETWEEN TWO OBJECTS/INDIVIDUALS AND THE UTILIZATION OF PRINCIPAL COMPONENT ANALYSIS AND LINEAR DISCRIMINANT ANALYSIS CASE STUDY: GROUNDWATER QUALITY DATA FROM SUBDISTRICTS IN BANDUNG REGENCY, 2023
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/84068 |
Institution: | Institut Teknologi Bandung |
Summary: | This study investigates the impact of cluster analysis using the Minkowski distance
metric with varying powers, as well as the utilization of dimensionality reduction
techniques such as Principal Component Analysis (PCA) and Linear Discriminant
Analysis (LDA) to enhance the robustness of clustering results. Groundwater
quality data, which includes information on 10 metal contaminants from various
sub-districts in Bandung Regency in 2023, is used as an analysis example. In this
context, multivariate analysis is critical for decision-making but poses challenges
when applied to large datasets. Cluster analysis becomes an essential tool for
identifying patterns by grouping objects based on similarities determined by
distance measures. The Minkowski distance metric, which includes a power parameter r, offers a more
general approach than Euclidean distance (the special case r = 2), as it allows for adjusting
sensitivity to data variations. PCA is used to reduce the dimensionality of data while
preserving variance using eigenvalues and eigenvectors from the covariance
matrix, allowing for the selection of relevant principal components. Subsequently,
LDA is applied to enhance class separability by maximizing the ratio of between-cluster
scatter to within-cluster scatter. This transformation results in more stable
data for cluster analysis. The results show that the combination of PCA and LDA
can enhance the stability and interpretability of clustering outcomes, even when
changes occur in the Minkowski distance metric parameters. This approach
provides a stronger framework for analyzing groundwater quality data, enabling
a better understanding of the relationships among metal contaminants while ensuring
clustering robustness against various distance definitions.
|
---|
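The summary describes computing Minkowski distances for several powers r after reducing dimensionality with PCA on the covariance matrix. A minimal sketch of those two steps follows, using NumPy only. It is an illustration under stated assumptions, not the author's implementation: the function names `minkowski_distance` and `pca_project` are hypothetical, the data are randomly generated stand-ins (not the Bandung Regency measurements), and the choice of two retained components is arbitrary. The LDA step is omitted because it requires cluster labels that the record does not provide.

```python
import numpy as np


def minkowski_distance(x, y, r):
    """Minkowski distance of order r between two 1-D vectors.

    r = 1 gives the Manhattan distance, r = 2 the Euclidean distance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y) ** r) ** (1.0 / r))


def pca_project(data, n_components):
    """Project rows of `data` onto the leading covariance eigenvectors.

    Returns the projected scores and the eigenvalues (variances) of the
    retained components, largest first.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order], eigvals[order]


# ASSUMPTION: synthetic stand-in for 6 sub-districts x 10 metal contaminants.
rng = np.random.default_rng(0)
samples = rng.normal(size=(6, 10))

scores, variances = pca_project(samples, n_components=2)

# Distance between the first two objects for each power r in [1, 4].
distances = {r: minkowski_distance(scores[0], scores[1], r) for r in (1, 2, 3, 4)}
for r, d in distances.items():
    print(f"r={r}: distance between objects 0 and 1 = {d:.4f}")
```

For a fixed pair of objects the Minkowski distance does not increase as r grows, which is one way to see how the power parameter changes sensitivity to large coordinate differences. A resulting distance matrix could then be passed to any distance-based clustering routine.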