INTEGRATION OF AERIAL GEOPHYSICS, OPTIC, AND RADAR SATELLITE IMAGERIES FOR IDENTIFICATION OF LITHOLOGY DISTRIBUTION USING REMOTE PREDICTIVE MAPPING APPROACH, CASE STUDY: KOMOPA AREA, PAPUA PROVINCE

Bibliographic Details
Main Author: Nugroho, Hary
Format: Dissertations
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/72637
Institution: Institut Teknologi Bandung
Description
Summary: Collecting training points is one of the most challenging tasks in lithology mapping where the terrain is hard to reach and densely vegetated. If no outcrops are found, training points must be obtained by drilling through the soil. As a result, the number of training points is often too small to support lithological interpretation, and a map produced under these conditions cannot be considered accurate. Such areas call for a methodology that helps geologists produce lithology maps quickly and efficiently from limited training points and other supporting data. The Remote Predictive Mapping (RPM) method suits this setting. RPM is a geological mapping technique that processes geoscientific data, such as satellite imagery and airborne geophysical data (magnetic, radiometric, and electromagnetic), with machine learning to produce predictive lithology maps. The machine learning step classifies the geoscience data against reference training points, an approach known as supervised classification. The training points link the satellite imagery and airborne geophysical data to the lithology data. As in conventional lithology mapping, the success of RPM depends on the number and completeness of the training points that represent the lithology types in the field. In RPM, however, the balance among the numbers of training points per lithology type also affects the classification results and their accuracy. This is because machine learning algorithms generally assume that training points are evenly distributed across classes, so when the data are imbalanced the classifier becomes biased towards the class with the most training points (the majority class). This study aims to develop an efficient lithology mapping methodology by applying RPM.
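The majority-class bias described above can be illustrated with a minimal sketch. It scores a trivial classifier that always predicts the most frequent class. The class names (`unit_A`, `unit_B`, `unit_C`), the counts, and the function name `majority_baseline_report` are hypothetical and are not taken from the dissertation.

```python
from collections import Counter

def majority_baseline_report(labels):
    """Score a trivial classifier that always predicts the most frequent class."""
    counts = Counter(labels)
    majority, majority_count = counts.most_common(1)[0]
    # Overall accuracy equals the share of the majority class.
    accuracy = majority_count / len(labels)
    # Per-class recall: 1.0 for the majority class, 0.0 for every other class.
    recall = {cls: (1.0 if cls == majority else 0.0) for cls in counts}
    return majority, accuracy, recall

# Hypothetical training labels: one dominant lithology unit and two rare ones.
labels = ["unit_A"] * 90 + ["unit_B"] * 7 + ["unit_C"] * 3
majority, accuracy, recall = majority_baseline_report(labels)
print(majority, accuracy, recall)
```

In this made-up case the overall accuracy looks high even though the two rare units are never predicted, which is why accuracy alone can hide the effect of imbalance.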
The research sought the optimal combination of data and methods through (1) a study of data usage, (2) a study of the number and distribution of training points (field data), (3) the application of machine learning algorithms combined with several improvement methods at the data and algorithm levels, and (4) the application of smoothing to improve the accuracy of the classification results (post-classification). The study area is located in Komopa Village, Aweida District, Paniai Regency, Papua Province, and covers approximately 84 km². Its land surface is covered with dense vegetation and a thick layer of humus. The data used are Sentinel-2A satellite imagery, ALOS PALSAR radar imagery, digital elevation models (DEM), and magnetic, electromagnetic, and radiometric geophysical data. Training sets of 25, 50, 100, 200, 300, 400, and 500 points were drawn by simple random sampling (unbalanced), and sets of 25 and 50 points by stratified random sampling (balanced); all were evaluated against 502 test points. The classifier is Random Forest, with improvement methods applied at the data level (oversampling) and the algorithm level (cost-sensitive learning). The Fuzzy C-Means method and the class probabilities output by Random Forest are applied for post-classification repair, or smoothing. The classification results were assessed with a confusion matrix and compared with the existing 1:25,000-scale lithology map produced by Mine Serve International in 2000. For unbalanced data, the most efficient model uses 100 training points and combines Random Forest with oversampling.
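As a rough illustration of data-level balancing, the sketch below computes an imbalance ratio and applies random oversampling with replacement. The summary does not say which oversampling variant was used, so random duplication is an assumption here, as is the definition of the ratio as the largest class count divided by the smallest. The function names, class names, and feature values are hypothetical.

```python
import random
from collections import Counter

def imbalance_ratio(labels):
    """Largest class count divided by smallest class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        # Draw with replacement from the existing samples of this class.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_x.extend(extra)
        out_y.extend([cls] * len(extra))
    return out_x, out_y

# Hypothetical training set: 100 points, three lithology units, two features each.
labels = ["unit_A"] * 60 + ["unit_B"] * 30 + ["unit_C"] * 10
features = [[float(i), float(i % 7)] for i in range(len(labels))]

X_bal, y_bal = random_oversample(features, labels)
print(imbalance_ratio(labels), imbalance_ratio(y_bal))
print(Counter(y_bal))
```

The balanced set could then be passed to a classifier. Cost-sensitive learning is a separate, algorithm-level option; in scikit-learn, for example, one way to approximate it is the `class_weight` parameter of `RandomForestClassifier`, though whether the study used that route is not stated.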
For the 100-point model, the imbalance ratio (IR) ranges from 11:1 to 30:1. The best data combination comprises Sentinel-2A imagery, DEM, RTP, and 20 kHz and 36 kHz electromagnetic data, and it increases test accuracy by 4%, precision by 20%, recall by 20%, F1 score by 20%, and Kappa score by 21%. For data with a balanced (stratified) distribution, the best model uses 50 training points with the addition of RTP data, which increases test accuracy by 11%, precision by 5%, recall by 8%, F1 score by 8%, and Kappa score by 10%. Smoothing with the Direction Magnitude and Fuzzy C-Means methods and Random Forest probabilities improves the classification results by removing some of the noise. It can increase training accuracy by up to 7% and the F1 score by 4%.
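The scores reported above (accuracy, precision, recall, F1, and Kappa) can all be derived from a confusion matrix. The sketch below shows one way to do so. Macro-averaging across classes is an assumption, since the summary does not state the averaging scheme, and the 2x2 matrix is invented for illustration, not taken from the study.

```python
def metrics_from_confusion(cm):
    """cm[i][j] = number of test points of true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(k)]                        # true counts
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    accuracy = sum(cm[i][i] for i in range(k)) / total

    precisions, recalls, f1s = [], [], []
    for i in range(k):
        p = cm[i][i] / col_sums[i] if col_sums[i] else 0.0
        r = cm[i][i] / row_sums[i] if row_sums[i] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Cohen's kappa: agreement beyond what the class marginals predict by chance.
    expected = sum(row_sums[i] * col_sums[i] for i in range(k)) / (total * total)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,  # macro average (assumed)
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }

# Invented two-class confusion matrix with 100 test points.
cm = [[40, 10],
      [5, 45]]
scores = metrics_from_confusion(cm)
print(scores)
```

The same function accepts a square matrix with any number of classes, so it would apply equally to a multi-unit lithology classification.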