Analysis and comparison of ensemble classification algorithms in the discovery of exoplanets

LUZ, Thiago Sales Freire; http://lattes.cnpq.br/1717282381510877

URI: https://repositorio.unifei.edu.br/jspui/handle/123456789/3927

Date: 2023-09-28

Abstract:

Exoplanets are planets located outside our solar system. They are discovered through scientific work with telescopes such as the Kepler space telescope. The data collected by Kepler are known as Kepler Objects of Interest. Machine Learning algorithms are trained to classify these data as exoplanets or non-exoplanets. An Ensemble algorithm is a Machine Learning technique that combines the predictions of two or more algorithms to obtain an improved final prediction. Current work on exoplanet identification mostly uses traditional non-Ensemble algorithms, so research that applies Ensemble algorithms to this task is scarce. This work compares several Ensemble algorithms on the exoplanet identification task. Each algorithm is implemented with a set of different parameter values and executed multiple times, and all executions use cross-validation. A confusion matrix is created for each algorithm implementation, and from each confusion matrix the following performance metrics are computed: accuracy, sensitivity, specificity, precision, and F1 score. The Ensemble algorithms achieved an average performance of more than 80% in all metrics. Changing the default values of the Ensemble algorithms' parameters improved their predictive performance. Stacking achieved the best performance among the algorithms compared, an aspect that is discussed in the text. In summary, the results show that Ensemble algorithms have great potential to improve exoplanet identification and that, given their high predictive performance, it is reasonable to increase their use.
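
The five metrics named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of that derivation, not the code used in the work: the function name `confusion_metrics`, the helper `_ratio`, the convention of returning 0.0 for an empty denominator, and the example counts are all assumptions made for illustration.

```python
def _ratio(numerator, denominator):
    """Divide safely; returning 0.0 on an empty denominator is an assumed convention."""
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, precision, and F1 score.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives, with "exoplanet" taken as the
    positive class.
    """
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts chosen for illustration only; they are not results from the work.
metrics = confusion_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under cross-validation, one reasonable approach is to compute these values per fold and then average them, although the abstract does not state how the per-fold results were aggregated.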
