Abstract:
Granting credit is a vital activity in the financial industry. For the success of financial institutions, as well as the stability of the credit system as a whole, it is important that credit risk management systems efficiently estimate the probability of default of potential debtors based on their historical data. Classification algorithms offer a natural approach to this problem in the form of Credit Scoring models. Since quantitative analytical methods were first applied for this purpose, statistical models have remained the most commonly chosen method, given their easier implementation and inherent interpretability. However, advances in Machine Learning (ML) have produced new and more complex algorithms capable of handling larger volumes of data, often with an increase in predictive power. These new approaches, although not always readily transferable to practical applications in the financial industry, present an opportunity for the development of credit risk modeling and have piqued the interest of researchers in the field. Nonetheless, researchers tend to focus on model performance, without establishing guidelines to optimize the modeling process or considering the current regulation for model implementation. Therefore, this dissertation establishes frameworks for consumer credit risk modeling based on classification algorithms, guided by a systematic literature review on the topic. The proposed frameworks incorporate ML techniques, data preprocessing and balancing, feature selection (FS), and hyperparameter optimization (HPO).
In addition to the bibliographic research, which introduces the main classification algorithms and appropriate modeling steps, the development of the frameworks is also based on experiments with hundreds of credit risk classification models. These experiments use Logistic Regression (LR), Decision Trees (DT), Support Vector Machines (SVM), Random Forest (RF), as well as boosting and stacking ensembles, and serve to guide the efficient construction of robust and parsimonious models for credit risk analysis in consumer lending.