Makine öğrenimi algoritmalarını kullanarak jeokonumsal verilerin tahmini = estimation of geospatial data by using machine learning algorithms

Erten, Gamze Erdoğan

Date: 2021

Abstract:

Geostatistics is the most prevalent method used today to estimate geospatial variables accurately. However, geostatistical methods require assumptions about the data, such as stationarity and linearity, to be applied effectively. In recent years, machine learning (ML) models have become increasingly popular because they promise efficient solutions for estimation, especially in complex cases. However, these models have two major limitations: (1) they assume that the data are independent, and (2) they do not reproduce the data at their locations. Moreover, most ML models produce visible artifacts along the coordinate directions in the resulting estimates, which is unrealistic when modeling geological deposits. This thesis presents two new geospatial estimation methods based on ML. The first method, called the ensemble super learner (ESL) model, is designed to manage artifacts in ML geospatial estimation. It uses the super learner (SL) model for ML modeling and creates numerous different training sets from the original dataset through a coordinate rotation strategy. The case studies demonstrate that the ESL model provides results comparable to traditional geostatistical methods in terms of estimation accuracy without requiring stationarity and linearity assumptions. The ESL model also incorporates anisotropy information and manages the artifacts in ML spatial estimation. The second method combines kriging and ML to mitigate the disadvantages of each method in geospatial estimation and to obtain more accurate and consistent estimates. In this method, the estimates from ML and kriging are combined by a weighting function based on the kriging variance, and the weights are optimized using sequential quadratic programming. The combined model is tested on numerous synthetic and real datasets, and the results indicate that it improves on the estimates obtained from either kriging or ML alone in all cases except the truly Gaussian dataset.
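The two ideas in the abstract, rotating coordinates to build several training sets and blending kriging with ML through a variance-based weight, can be illustrated with a minimal Python sketch. This is not the thesis implementation. The function names, the logistic form of the weighting function, its two parameters `a` and `b`, and the mean-squared-error objective are all assumptions made for illustration; the abstract only states that the weight depends on the kriging variance and is optimized with sequential quadratic programming. SciPy's `SLSQP` solver is used here as one available SQP-type method.

```python
import numpy as np
from scipy.optimize import minimize


def rotate_coordinates(xy, angle_deg):
    """Rotate 2-D coordinates counter-clockwise about the origin."""
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    return np.asarray(xy, dtype=float) @ rot.T


def rotated_training_sets(xy, angles_deg):
    """Return one rotated copy of the coordinates per angle (hypothetical helper).

    Each copy would be paired with the unchanged target values to train
    one member of an ensemble.
    """
    return [rotate_coordinates(xy, angle) for angle in angles_deg]


def kriging_weight(krig_var, a, b):
    """Assumed logistic weight for kriging: close to 1 where the variance is low."""
    return 1.0 / (1.0 + np.exp(a * (np.asarray(krig_var, dtype=float) - b)))


def combine(krig_est, ml_est, krig_var, params):
    """Blend kriging and ML estimates with the variance-based weight."""
    w = kriging_weight(krig_var, params[0], params[1])
    return w * np.asarray(krig_est) + (1.0 - w) * np.asarray(ml_est)


def fit_weights(krig_est, ml_est, krig_var, truth, x0=(1.0, 0.5)):
    """Tune (a, b) by minimizing the mean squared error with SLSQP."""
    truth = np.asarray(truth, dtype=float)

    def mse(params):
        return np.mean((combine(krig_est, ml_est, krig_var, params) - truth) ** 2)

    result = minimize(mse, x0=np.asarray(x0, dtype=float), method="SLSQP",
                      bounds=[(0.0, 50.0), (0.0, None)])
    return result.x, result.fun
```

In this sketch the weight parameters would be fitted on held-out locations where the true values are known, and the fitted `params` would then be passed to `combine` at the locations to be estimated. Because the weight lies between 0 and 1, the blended estimate always falls between the kriging and ML estimates.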
