Spatial variation of physicochemical parameters in a constructed wetland for wastewater treatment: An example of the use of the R programming language

Introduction: The implementation of wastewater treatment systems such as constructed wetlands has a growing interest in the last decade due to its low cost and high effectiveness in treating industrial and residential wastewater. Objective: To evaluate the spatial variation of physicochemical parameters in a constructed wetland system of sub-superficial flow of Pennisetum alopecuroides (Pennisetum) and a Control (unplanted). The purpose is to provide an analysis of spatial dynamic of physicochemical parameters using R programming language. Methods: Each of the cells (Pennisetum and Control) had 12 piezometers, organized in three columns and four rows with a separation distance of 3,25m and 4,35m, respectively. The turbidity, biochemical oxygen demand (BOD), chemical oxygen demand (COD), total Kjeldahl nitrogen (TKN), ammoniacal nitrogen (N-NH4), organic nitrogen (N-org.) and phosphorous (P-PO4) were measured in water under inflow and out-flow of both conditions Control and Pennisetum (n= 8). Additionally, the oxidation-reduction potential (ORP), dissolved oxygen (DO), conductivity, pH and water temperature, were measured (n= 167) in the piezometers. Results: No statistically significant differences between cells for TKN, N-NH4, conductivity, turbidity, BOD, and COD were found; but both Control and Pennisetum cells showed a significant reduction in these parameters (P<0,05). Overall, TKN and N-NH4 removal were from 65,8 to 84,1% and 67,5 to 90,8%, respectively; and decrease in turbidity, conductivity, BOD, and COD, were between 95,1-95,4%; 15-22,4%; 65,2-77,9% and 57,4-60,3% respectively. Both cells showed ORP increasing gradient along the water-flow direction, contrary to conductivity (p<0,05). However, OD, pH and RESUMEN. “Variación espacial de los parámetros fisicoquímicos en un humedal artificial para el tratamiento de aguas residuales: un ejemplo del uso del lenguaje de programación R”. Introducción: La implementación de sistemas de tratamiento de aguas residuales como los humedales artificiales, ha tenido gran interés en la última década por su bajo costo y efectividad en el tratamiento de aguas residuales. Objetivo: Evaluar la variación espacial de algunos parámetros fisicoquímicos en un sistema de humedal construido de flujo sub-superficial de Pennisetum alopecuroides (Pennisetum) y un Control (no plantado). Se proporciona un ejemplo de análisis de dinámica espacial de parámetros fisicoquímicos mediante el uso del lenguaje de programación R. Metodología: Cada una de las celdas (Pennisetum y Control) tenía 12 piezómetros, organizados en tres columnas y cuatro filas con una separación de 3,25 m y 4,35 m, respectivamente. Se midieron en agua del flujo de entrada y salida de ambas condiciones Control y Pennisetum (n = 8): la turbidez, la demanda bioquímica de oxígeno (DBO), la demanda química de oxígeno (DQO), el nitrógeno Kjeldahl total (TKN), el nitrógeno amoniacal (N-NH4), el nitrógeno orgánico (N-org.) y el fósforo (P-PO4). Además, se midieron el potencial de oxidación-reducción (ORP), oxígeno disuelto (DO), conductividad, pH y temperatura del agua (n = 167) en los piezómetros. Resultados: No se hallaron diferencias estadísticamente significativas entre celdas para TKN, N-NH4, conductividad, turbidez, BOD y COD (p>0,05); pero ambas celdas presentaron una reducción significativa en estos parámetros (p<0,05). La eliminación de TKN y N-NH4 fue de 65,8 a 84,1% y de 67,5 a 90,8%, respectivamente. La disminución de turbidez, conductividad, BOD y COD fue entre 95,1-95,4%; 15-

temperature were inconsistent in the direction of the water flow in both cells. Conclusions: Pennisetum demonstrated pollutant removal efficiency, but presented results similar to the control cells, therefore, remains unclear if it is a superior option or not. Spatial variation analysis did not reflect any obstruction of flow along the CWs; but some preferential flow paths can be distinguished. An open-source repository of R was provided.
The lack of sanitation has become a worldwide problem. This has exacerbated with the generation of increasing amounts of wastewater from industries and the residential areas (van Puijenbroek, Beusen, & Bouwman, 2019;Ahmad, Majhi, Kothari, & Singh, 2020;Gomes da Silva & Gouveia, 2020). In Costa Rica, the average of wastewater is estimated at 966 455 m 3 d -1 (the average per person is about 0.2 m 3 d -1 (Instituto Nacional de Estadística y Censos [INEC] 2015)). Only 15% of the country's wastewater has some type of treatment (Acueductos y Alcantarillados [AyA], Ministerio de Ambiente y Energía [MINAE] y Ministerio de Salud [MS], 2016). This information is calculated from industrial companies that register the produced wastewater and companies reporting to have some treatment system; which could indicate that wastewater problem sanitation within the country is more serious than is estimated.
At present, there is a progressive explosion in the development of wastewater treatment systems (Xu & Cui, 2019). Among the wastewater treatment technologies used in Costa Rica are aerobic treatment systems such as activated extended aeration sludge and conventional activated sludge. The anaerobic systems used are septic tanks, drainage, up-flow anaerobic sludge blanket digestion, pre-conditioning systems with submarine emissary, and systems by facultative ponds (Ruiz, 2012). Although these treatment technologies are very efficient, their application within the country will largely depend on the balance between their minimum acceptable result and the minimum cost of construction and maintenance.
Constructed wetlands (CW) for wastewater treatment is an emerging ecological technology of low investment, low energy consumption, and convenient management (Tang, Huang, Scholz, & Li, 2011;Cui, Ouyang, Gu, Yang, & Xu, 2013;López-Ocaña et al., 2018). During the 1980s and 1990s, CW with horizontal subsurface flow (CW-HSF) was the most common type of CW used for wastewater treatment, due to the elimination of organic compounds and suspended solids, using filtration beds with anoxic and anaerobic conditions (Vymazal, 2019). In recent years, researchers have recognized the economic and environmental benefits for treating both industrial and residential wastewater effectively (He, Peng, Hua, Zhao, & Xiao, 2018).
CW-HSF with emerging macrophyte plants performs effectively in removing conventional pollutants (e.g.: nitrogen, phosphorus, and organic compounds) due to their excellent sediment filtration (López-Ocaña et al., 2018). Similar systems that have operated for more than 20 years, show removal levels higher than 91,7%; 82,9% and 88,3% for biochemical and chemical oxygen demand, and total suspended solids, respectively during the last five years (Vymazal, 2019).
However, despite the diversity of researches conducted in the last decade, the strategies for the spatial monitoring of these wastewater treatment systems (WTS) for proper functioning are still poorly studied. Notwithstanding CWs' main drawbacks are substrate media clogging and high area requirements (Tatoulis, Akratos, Tekerlekopoulou, Vayenas, & Stefanakis, 2017).
Overall, an analysis of spatial variation of total suspended solids (TSS) within CW-HSF could indicate the location of obstruction on the flow and/or possible reduction of the removal. However, in an extensive CW-HSF the measurement of TSS could be laborious and the sample size could be unmanageable. TSS can be estimated from turbidity measurements (Grayson, Finlayson, Gippel, & Hart, 1996;Daphne, Utomo, & Kenneth, 2011); nevertheless, this requires a linear regression model fit, which must be recalibrated at each sample location and time point (Rügner, Schwientek, Beckingham, Kuch, & Grathwohl, 2013). Therefore, we presume that the measurement of physicochemical parameters such as water electrical conductivity could be a promising indicator. Because this is strictly linked to turbidity and total dissolved solids. According to Villa, Fölster, & Kyllmar (2019), the site turbidity and conductivity-total phosphorus relationship was significant at 78% of sites (n= 84; mean r 2 = 0,62). Other physicochemical parameters such as dissolved oxygen and oxidation-reduction potential could also be tested. It is important to note that all above physicochemical parameters are easy and quick to measure with low-cost sensors (e.g., Sensor-Vernier-LabQuest).
This research evaluated the spatial variation of some conventional physicochemical parameters on CW-HSF (1,5 years) of cultivated with Pennisetum alopecuroides and control (not plants). To the best of our knowledge, this is the first study conducted on CW using spatial variability analysis with R programming language. The main R package used was "fields"; this allows to adjust of curves, surfaces, and functions with an emphasis on splines, to later be applied to the interpolation of spatial data and spatial statistics (Nychka, Furrer, Paige, & Sain, 2017). The scope of this research was to provide an example of applications and recommendations for spatial variation analysis using R open-source tools.  (2020), average monthly temperature, precipitation, and relative humidity of air during the study period were between 18,5-28°C, 0-186,5mm, and 79-97% (max.-min.), respectively. CW-HSF was built in an area of ~ 800m 2 (each cell of ~ 200m 2 ) and receives water from toilets, student dining, and research and teaching laboratories of the Escuela de Medicina Veterinaria -UNA. Therefore, there is an important presence of chemical substances (organic and inorganic) from the mentioned facilities. The system in this research consists of a set of grilles, an Inmhoff tank, four cells of constructed wetlands with a horizontal subsurface flow, a sludge drying bed, and a disinfection unit. It is important to note that only two cells out of a total of four cells were used for this study.

MATERIALS AND METHODS
All systems are operating in parallel, i.e., each of the wetlands receives the same amount of wastewater with a total hydraulic capacity of 20m 3 d -1 flow for each cell. A cell of CW-HSF is formed by emerging macrophyte plants (Pennisetum alopecuroides, hereafter referred to as Pennisetum). This has a filter filled with stone, where the plants are sown to, crossing it horizontally, through which the wastewater flows from its entry through the wetlands and to the effluent collection (see details in Fig. 1). CW has a total of 12 piezometers per cell, evenly distributed in four rows and three columns. Each piezometer has a depth of 60 cm and with lateral perforations at different depths.
The Control cells simply consisted of an unplanted system with the same structural characteristics mentioned above. The physicochemical parameters measured were (n= 8): turbidity (nephelometric turbidity units -NTU), biochemical oxygen demand (BOD, mg L -1 ), chemical oxygen demand (COD, mg L -1 ), total Kjeldahl nitrogen (TKN, mg L -1 ), ammoniacal nitrogen (N-NH4, mg L -1 ), organic nitrogen (N-org., mg L -1 ), and phosphorous (P-PO4 -3 , mg L -1 ). All the physicochemical parameters were measured strictly following the established by Standard Methods for the Examination of Water and Wastewater (Rice, Baird, & Eaton, 2017). The electrical conductivity, oxidation-reduction potential (ORP), dissolved oxygen (DO, accuracy ± 0,2 mg L -1 ), pH (accuracy ± 0,2 pH) and temperature (accuracy ± 0,02 °C) of water, were measured in each of the piezometers (Control and Pennisetum, 24 piezometers in total). Likewise, measurements at water in-flows and out-flows of the two systems (Control and Pennisetum), were made. All previous these samplings were conducted for three weeks (three days, between August and October 2019) with a frequency hourly between 9:00h and 14:00h (n= 167). Each physicochemical parameter was measured by means of an independent sensor connected and distributed in two portable consoles of the Vernier type (LabQuest ® 2, Version 2.8.5, Vernier Software & Technology, Beaverton, OR, USA). Before each sampling, all sensors were calibrated using their respective manufacturer's protocol.
Data analysis: All variables were evaluated using analysis of variance (ANOVA) or Kruskal-Wallis test. Fisher's Least significant difference (LSD) tests were applied to compare the means between the systems (water in-and out-flows of systems). Spearman (rρ) correlations were made between all the pairs physicochemical parameters per system and together. Multivariate associations among all physicochemical parameters were analyzed using a principal component analysis (PCA). Piezometers that produced similar responses were clustered using a multivariate technique of grouping analysis according to the method of Tocher, which is based on Euclidean average distances (Rencher, 2005). PCA was performed using the R package "factorextra" (Kassambara & Mundt, 2020).
Kruskal-Wallis tests were carried out with Bonferoni tests to compare between the rows of the piezometers positions for all the physicochemical parameters for each system (Control or Pennisetum), this to verify the change of the parameters along the flow of water within the systems. All statistical assumptions were checked. Likewise, bilinear interpolators between the piezometers were done, to obtain a distribution map of the levels of each physicochemical parameter to understand the dynamics within the systems to due to the direction of the water flow (see "fields" R package; Nychka et al. (2017)). All statistical analyses were performed using R programming language, version 3.6.1 (RCoreTeam, 2020) with a significance level of α=0,05. Finally, a GitHub repository was made to the access to R script, database and results of statistical analyses (see DOI: http://doi.org/10.5281/zenodo.3726063).

CW-HSF efficiency:
Water out-flow of Pennisetum and Control showed a significant reduction with respect to water in-flow, for turbidity, BOD, COD, TKN, and N-NH4 (P<0,05, Table 1). No significant difference was found between water in-and out-flow for N-org. and P-PO4 -3 (P>0,05), a high variability was registered in these parameters. In addition, statistically significant differences were found between the sampling dates for conductivity, DO and pH, and the time of measurement for ORP, pH, and temperature (P<0,05, Table 2). These parameters in the time-course did not followed a determined pattern concerning the time of day except for temperature (Fig. 2). Nevertheless, statistically significant differences between water in-and out-flows cells (Control and Pennisetum) for conductivity, ORP and DO parameters, were found (P<0,05; Table 2).
The interplay between physicochemical parameters: Conductivity of water was correlated negatively with all independent parameters of the system and for all the data together, except for temperature and ORP in Pennisetum and Control, respectively (Table 4). The variability explained by the first two axes of the principal component (PC) analysis for all combinations of physicochemical parameters was 67,3% (Fig. 3A), which, it was not possible to separate the systems (Control vs Pennisetum). A more detailed analysis by distance in the direction of the out-flow (row of piezometers position) showed a variability explained by the first two axes of PC between 65,6% and 71,7% (Fig. 3B-E), additionally, the separation between the systems from the measured parameters, was not possible.

TABLE 1
Physicochemical parameters comparison in water in-and out-flows (Control and Pennisetum) in the CW-HSF, using oneway ANOVA

TABLE 2
Physicochemical parameters comparison in water in-and out-flows (Control and Pennisetum) in the CW-HSF, using three-way ANOVA (factors: date, time, and system)

TABLE 4
Spearman's (rρ) correlation coefficient matrix between pairs of the physicochemical parameters at the system level (C: Control; or P: Pennisetum; bottom diagonal) and both systems together (top diagonal)

Flow conditions and other practical issues:
A challenge in studying these systems is to keep in-flow disturbances to a minimum to reduce variation in BOD and hydraulic load rates (Haddis, Van der Bruggen, & Smets, 2020). Two problems in this investigation caused alterations in the in-flow in the CWs: (i) large waste pieces (clothing, bags, plastic material, laboratory waste, etc.) caused the obstruction of the flow in the set of grilles in the Inmhoff tank; and (ii) around a year of Pennisetum growth, the root system clogged the pipes. Both problems were solved by periodic cleaning maintenance. The control system was the one that presented the minimum problems of changes in in-flow. The only problem observed was the natural growth of herbaceous plants within the CW, but it was solved by their removal during the maintenance.
Despite all the problems, the preliminary advances of this research evidence for both systems (Control and Pennisetum) acceptable performances in the general removal rates of organic pollutants. Even both CW-HSF from the first month of construction complied with the provisions of Costa Rican national legislation (Perez-Salazar, Alfaro-Chinchilla, & Scholz, 2019). Furthermore, preliminary data (unpublished data) shows that both systems report a reduction of ~ 29% of faecal coliform (log) and ~ 44% of Escherichia coli (log), where Pennisetum highlighted with 34% microorganism removal compared to Control. Overall, we do not have enough information to ensure that the use of Pennisetum emerging plants could have superior efficiency on an unplanted system, however, the results that will be discussed below show that both CWs have promising performances in wastewater treatment.
CW-HSF efficiency and the interplay between physicochemical parameters: The differences found between sampling dates for conductivity, DO and pH, the time of measurement for ORP, pH, and temperature, possibly indicate an irregularity in water in-flow quality (Haddis et al., 2020). The temperature (21,1min. -28,2max.) and pH (6,1min. -7,7max.). Recorded between 8:00 and 14:00 were within optimal conditions for the development of all heterotrophic microbial activities (Viessman & Hammer, 2005). CW performance is temperature dependent, as it has a strong effect on nitrogen removal, but may have less effect on BOD and phosphorous removal (Kadlec & Reddy, 2001;Tatoulis et al., 2017).
Both systems showed a significant reduction in electrical conductivity and water turbidity, explained by the rocky bed that filters most of the sediments during the horizontal flow through the system (Vymazal, 2019). In contrast, the ORP shows an increase in Pennisetum; it is a slight improvement in the ability to reduce some of the organic compounds in the medium; similar results found in Srivastava, Abbassi, Garaniya, Lewis, & Yadav (2020). Evapotranspiration is a factor influencing the redox condition, as are the flow regimes of the system (García et al., 2010). Planted wetlands produce up to 20% higher redox gradient as compared to unplanted ones (Doherty et al., 2015); where, it is in the CW depth profile, ORP less negative values near the surface have lower anaerobic conditions, while ORP more negative values in the lower layer indicate more anaerobic conditions (Srivastava et al., 2020).
It is well documented that CW-HSFs have organic compound removal rates, with average rates from 80% to 90% of nitrogen, phosphorus, and suspended solids (López-Ocaña et al., 2018;Tatoulis et al., 2017;Vymazal, 2019). Significant reductions of TKN, BOD, and COD were found for the Control and Pennisetum. These removal efficiencies of vegetated beds in CW-HSF has been reported of 75% (Vymazal, 2010); it is an unanimously agreed outcome on CW-HSFs with emerging macrophyte plants for wastewaters treatment from municipal, agriculture, industry and landfill leachate (Vymazal, 2010;Vymazal & Kröpfelová, 2009). All these results are explained by the prevailing anoxic and anaerobic conditions in the filter bed of these systems (Vymazal, 2019).
The weak but significant correlations of water electrical conductivity with the analysed variables reflect the plasticity of the conductivity and the potential use that could be made as an indicator variable of important changes within the system. Among other uses, the total dissolved solids can be estimated from conductivity (Villa et al., 2019), since it is a measure of the volume of ionized solids, such as salts and minerals. Finally, the variability explained by the PCs of all the variables measured in each of the piezometers did not allow to separate the two treatments (Control versus Pennisetum), reflecting a similar consistency of all the variables throughout the flow of water within CW.
The bilinear interpolations of the physicochemical parameters between the piezometers allowed the establishment of a gradient of the measured levels of conductivity, ORP and OD. The spatial variation of conductivity in the control showed a slight increase in the position of the second row of the piezometers, possibly an irregularity mediated by the retention times of the water within the systems. Meanwhile, in Pennisetum a gradual reduction of the average conductivity between the rows of the piezometer positions was recorded; at the same time, a reduction in its variability was observed. A change in water electrical conductivity throughout the treatment system indicates that plants` salts extraction can overcome the increase in conductivity caused by their evapotranspiration (Teixeira, Matos, Matos, Hamakawa, & Teixeira, 2020). In conclusion, the preliminary results of this research showed no spatial obstruction of the flow and/or possible reduction of the removal. However, some preferential flow paths can be distinguished throughout the system. Applications and recommendations for spatial variation analysis with R script: This research provides the GitHub repository where it is explained step by step how to use the R script (see DOI: http://doi.org/10.5281/zenodo.3726063). Likewise, the references of the main packages used are provided; all packages used are free and constantly updated. For this reason, is recommended to install the R packages directly from Github and verify updates before running the script. Other types of CWs, plant species, a greater number of variables, no restrictions on the repetition number, and treatments can be applied to the R scripts and packages, with a simple modification of the original R script. There are no restrictions on the repetition number (number of piezometers).
This research is a preliminary advance of spatial variability monitoring results. Therefore, it is recommended to establish a long-term analysis of spatial-temporal variability of conventional pollutants on CWs (e.g.: nitrogen, phosphorous, sediments, etc.) to integrate and model the longterm removal efficiency. Also, the seasonality (dry and rainy) of tropical climate could show the performance of systems subjected to flow change effect. Finally, the main recommendation of this research for CW-HSF is the periodic monitoring of water in-flow rate, to detect obstructions in the pipelines and avoid changes in the hydraulic load rate.
Conclusion: Pennisetum demonstrated pollutant removal efficiency, but presented results like to the Control, therefore, it remains in doubt if in long-term it can be demonstrated as superior option respect to the control. Other emerging macrophyte plants should be tested. However, we can conclude that both systems had a significant effect on the removal of organic pollutants within the acceptable range (80-90%). In perspective, spatial variation analysis did not reflect any obstruction of flow along the CWs; but some preferential flow paths can be distinguished. Spatial variation is a tool that can be used in developing countries which wish to carry out monitoring campaigns to their treatment systems at a very low cost. Which can be used for the possibility of making decisions at the maintenance level and improvements in the systems. All R scripts used can be applied to other plants in CWs or similar treatments.