Abstract:
Artificial intelligence is becoming increasingly integrated into various daily activities,
producing more robust tools and solutions, improving results, and enhancing human
capabilities. More complex supervised learning models, known as “black boxes”, such as Neural Networks, are powerful but fall short in interpretability for solutions dealing with sensitive data in contexts such as finance, healthcare, law, or academia.
In this regard, “white box” models, such as Decision Trees, prove to be robust and
more suitable solutions due to their high level of interpretability. In addition to well-established machine learning models, such as Classification and Regression Trees
(CART), recent studies have introduced new models like Optimal Classification
Tree using Mixed-Integer Optimization (OCT-MIO), which is capable of fitting
the training data even better and achieving higher accuracy in some cases. This
work presents the modeling, implementation, and comparison of these two models,
both in training and in testing with K-Fold cross-validation. It also includes an interpretability analysis of the obtained classification trees and the use of OCT-MIO
as a heuristic. The experiments use real and sensitive data for stress level diagnosis, credit approval prediction, and academic success prediction. Although
CART is a good classification model, it was observed that the OCT-MIO model is a viable alternative, achieving comparable or even better results, especially for classification trees of smaller height, which are ideal in scenarios where interpretability is required. Thus, the OCT-MIO model can classify data more accurately than CART when using trees of the minimal height sufficient to separate all classes of a problem, while maintaining interpretability.