Abstract:
Pampulha Lake is an urban reservoir located in the city of Belo Horizonte, Minas Gerais (MG), and is regarded as a postcard landmark of considerable tourist and cultural value. Because of human pressure on the lake, it is important to monitor surface water quality parameters, especially chlorophyll-a, the main biological variable indicating the trophic state and environmental integrity of aquatic ecosystems. Field campaigns are used to monitor this and other parameters; however, because of the costs involved, samples are collected at widely spaced intervals. To overcome this limitation, aquatic remote sensing (ARS) has been used as a complementary source of data, since it provides information each time the satellite passes over the area. ARS monitoring points are called Virtual Stations. They are valuable because they are inexpensive, serve remote areas, provide data going back to the satellite's launch, and deliver information consistently with each satellite pass over the study area. This research therefore aimed to explore techniques for satellite image processing, statistical analysis, and polynomial regression in order to predict and model chlorophyll-a concentrations in Pampulha Lake. For this purpose, monthly chlorophyll-a data for the period from 2016 to 2020 were used, obtained on request from the Municipality of Belo Horizonte (PBH). This dataset was the starting point for selecting 28 Sentinel-2 satellite images falling within a window of up to 7 days between the satellite's passage and field sampling, and 23 images within a window of up to 3 days. A second-order polynomial regression analysis was applied to calibrate and validate the resulting equations. The coefficient of determination (R²) and Pearson's linear correlation coefficient (R) were used to evaluate the performance of the regression model in both the calibration and the validation of the generated equations.
These statistical indices were computed for a single equation covering all points, for equations separated by season (dry and rainy), and for equations separated by monitoring point. In addition, a procedure for excluding anomalous values (outliers) in the calibration and validation stages was considered. The results indicated that the second-degree polynomial regression models provided a better fit to the set of images from the Multispectral Instrument (MSI) sensor carried aboard the Sentinel-2 satellite, and that images acquired up to 7 days from field sampling showed a lower correlation between chlorophyll-a estimates than those acquired up to 3 days from it. Furthermore, the performance of the chlorophyll-a equations was not satisfactory when calibration used all the data from the 6 monitoring points together or when the data were separated by season (dry and rainy), even after excluding anomalous values. The best statistical indicators were found for a difference of up to 3 days between field sampling and the satellite passage at points 2, 3, 4, 5 and 6, with R² values of up to 85.67% and R equal to 0.92, which demonstrates a strong correlation for the equations at these monitoring points. Therefore, integrating remote sensing data into lake mapping and applying polynomial regression to the data analysis is a very promising approach for predicting chlorophyll-a, as well as its spatial and temporal variations.
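
As an illustration of the calibration and validation workflow described above, the following minimal sketch fits a second-order polynomial and evaluates it with R² and Pearson's R. It is not the authors' code. The single predictor `index` (a hypothetical spectral index derived from MSI bands), the synthetic data, the 28/12 calibration/validation split, and the helper names `fit_quadratic`, `r_squared` and `pearson_r` are all assumptions made for the example.

```python
import numpy as np


def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    return np.polyfit(x, y, deg=2)


def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_obs, y_pred):
    """Pearson's linear correlation between observed and predicted values."""
    return np.corrcoef(y_obs, y_pred)[0, 1]


# Synthetic stand-in data (assumption): a spectral index and the matching
# field-measured chlorophyll-a values, with Gaussian noise added.
rng = np.random.default_rng(0)
index = rng.uniform(0.0, 1.0, size=40)
chl = 120.0 * index**2 + 15.0 * index + 8.0 + rng.normal(0.0, 3.0, size=40)

# Hypothetical split: first 28 samples for calibration, last 12 for validation.
cal, val = slice(0, 28), slice(28, 40)

coeffs = fit_quadratic(index[cal], chl[cal])   # calibrate the equation
pred_val = np.polyval(coeffs, index[val])      # apply it to held-out samples

r2_val = r_squared(chl[val], pred_val)
r_val = pearson_r(chl[val], pred_val)
print(f"validation R2 = {r2_val:.3f}, R = {r_val:.3f}")
```

The same three functions could be applied separately to each monitoring point or season by subsetting the arrays before calling `fit_quadratic`, which mirrors the per-point and per-season comparisons reported here.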