Abstract:
Introduction: Decision-making support in the context of dropout in higher education institutions requires systematic monitoring of dropout rates and a description of the students who drop out in terms of personal and academic variables. The initial premise is that an institutional repository that stores historical and non-volatile data, the data warehouse (DW), is the tool that enables the Federal University of Itajubá (Unifei) to respond to this demand. Objective: The main objective is the development and implementation of a DW for decision-making support on dropout at Unifei. The secondary objectives are to identify methods for developing data warehouses, to identify the methods and metrics used to calculate dropout rates, and to identify the descriptive characteristics of students related to dropout. Methods: The object of study is Unifei and the management information system in use at the institution, the Sistema Integrado de Gestão de Atividades Acadêmicas (SIGAA). To guide the intervention, a methodology was proposed that combines soft systems methodology with Ralph Kimball's method for developing data warehouses. Bibliographic research was conducted in education journals to collect data related to dropout. Documentary research was conducted on SIGAA's database to assess the data available. Results: The bibliographic research revealed three groups of methods for calculating dropout rates: methods that consider complete generations of students, methods that consider only newcomers within a specific period, and methods that propose the monitoring of historical series. The strengths and weaknesses of the groups were contrasted, and a proposed calculation method was derived from the literature. The bibliographic research also identified 65 descriptive variables about students: 33 personal variables (50.8%) and 32 variables related to the students' academic life (49.2%).
Documentary research was conducted in the SIGAA database, and the subsequent data triangulation showed that SIGAA stores the data necessary to calculate course dropout rates from 1998 onward. Furthermore, SIGAA stores 23 (35.4%) of the descriptive variables identified in the literature. The information requirements derived from the triangulation of data sources were used to guide the development of the data warehouse. Dimensional and physical data models were produced. The steps required to extract the data from their sources, the treatments applied to them, and their loading into the data warehouse environment were described. Conclusion: The evidence brought by the results supports the statement that an institutional DW, once developed and implemented, is a tool that allows higher education institutions to monitor the phenomenon of dropout in a systematic, periodic, and constant manner.