Data-driven Exploratory Analysis for Raster Data Using Self-Organizing Maps Regressor

Aulia Khoirunnisa Fajri, Indra Ranggadara, Suhendra, Aries Suharso


Sugarcane is one of Indonesia's plantation commodities with high potential. Sugarcane growth consists of four phases that take place over a year. In the Grand Growth phase, sugarcane needs appropriate conditions to grow well and enter the next phase. The factors that affect the Grand Growth phase are water, temperature, and sunlight. Rainfall is one of the water sources sugarcane needs, but its intensity varies and its distribution is uneven from year to year. Uneven rainfall causes water stress in sugarcane plantations, so the water content of the plantations must be identified to maintain sugarcane quality. This study predicts the water content of sugarcane plantations so that areas showing signs of water stress can be anticipated. Raster data were collected from Landsat 8 satellite imagery and analyzed with a data-driven exploratory analysis method, PCA (Principal Component Analysis), applied to the overlay of the Landsat 8 images of the sugarcane plantation area. The raster data were then processed to calculate the water index of the plantation, NDWI (Normalized Difference Water Index). The NDWI values of the plantation area were converted into an array and used as input to the Self-Organizing Map Regressor algorithm to predict the water content of the plantation. The result is a set of predicted water index values for the sugarcane plantation with 72% accuracy.
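The raster-processing steps described above (PCA over stacked bands, NDWI per pixel, flattening to an array) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state which band pair was used for NDWI, so the sketch assumes the NIR/SWIR formulation, NDWI = (NIR - SWIR) / (NIR + SWIR), which is commonly used for vegetation water content. The function names `ndwi` and `pca_scores` and the synthetic reflectance values are hypothetical.

```python
import numpy as np


def ndwi(nir, swir, eps=1e-12):
    """Per-pixel NDWI, assuming the NIR/SWIR formulation (an assumption)."""
    nir = np.asarray(nir, dtype=np.float64)
    swir = np.asarray(swir, dtype=np.float64)
    denom = nir + swir
    out = np.full(nir.shape, np.nan)      # NaN where the index is undefined
    valid = np.abs(denom) > eps           # avoid division by zero
    out[valid] = (nir[valid] - swir[valid]) / denom[valid]
    return out


def pca_scores(bands, n_components=2):
    """PCA over a band stack of shape (n_bands, rows, cols).

    Returns per-pixel component scores and the explained variance ratio
    of the retained components.
    """
    # One row per pixel, one column per band.
    X = bands.reshape(bands.shape[0], -1).T.astype(np.float64)
    Xc = X - X.mean(axis=0)               # center each band
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T     # project onto leading components
    var = S ** 2
    ratio = var[:n_components] / var.sum()
    return scores, ratio


# Synthetic reflectance values standing in for real Landsat 8 bands.
rng = np.random.default_rng(0)
red = rng.uniform(0.02, 0.10, size=(4, 5))
nir = rng.uniform(0.20, 0.50, size=(4, 5))
swir = rng.uniform(0.05, 0.30, size=(4, 5))

index = ndwi(nir, swir)                   # 2-D NDWI raster
features = index.ravel()                  # 1-D array, e.g. as regressor input
scores, ratio = pca_scores(np.stack([red, nir, swir]))
```

The flattened `features` array corresponds to the "converted into an array" step. A regressor such as a Self-Organizing Map Regressor would be fitted on arrays of this kind, which this sketch does not attempt.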


NDWI; PCA; self-organizing map; sugarcane; water stress.








Published by INSIGHT - Indonesian Society for Knowledge and Human Development