Similarities among small watercourses based on multiparameter physico-chemical measurements

With the introduction of the Water Framework Directive, the relative importance of smaller waterways increased. This statement is particularly true for Hungary, where water-quality monitoring of most smaller rivers only began 12 years ago. Due to their large number, and the lack of historical data concerning their state, systematic monitoring is a challenge. In the current study, 101 creeks are characterized on the one hand by 13 physico-chemical quality parameters (pH, electric conductivity, chloride ion concentration, dissolved oxygen, oxygen saturation, biochemical oxygen demand, chemical oxygen demand, total organic carbon, ammonium nitrogen, total inorganic nitrogen, total nitrogen, orthophosphate and total phosphorus), on the other hand by their watershed ’ s relief, land use, and point sources ’ pollution indicators. Euclidean distance between water bodies (henceforth WBs) is calculated according to normalized physico-chemical monitoring values. They are grouped into clusters using the hierarchical clustering method. Watershed characteristics are used to explain the clustering via linear discriminant analysis. The investigation revealed that the main driver of cluster group creation is related to human impact: diffuse agricultural and point-source pollution. The first of the three clusters involved water bodies with low or no human impact; the second cluster contained those with medium-level anthropogenic disturbance, while waters with high pollution values formed the third cluster. Mean distance between heavily polluted waters was 1.5 times higher than that between those showing no or low disturbance, meaning that pristine waters are more similar to one another than polluted ones. The current number of samples per river is twice as high in cluster 1 as in cluster 3, revealing that there is room for optimization of the monitoring system. This contribution uses Hungary as a case study.


INTRODUCTION
Water quality issues are receiving rising attention worldwide. Among the many types of natural water bodies (rivers, lakes, estuaries, shallow and deep groundwater, sea, etc.), surface freshwaters are among the most exposed, and, in particular, rivers the most variable ones (Clement and Buz as 1999;Wetzel 2001). Spatially and temporally detailed data is indispensable for the adequate assessment of their status, which is a prerequisite of any management intervention. The primary source of such information is on-site water quality measurements along with in-laboratory investigation of samples taken from them Tr asy et al. 2018).
One of the primary driving forces of water quality monitoring (WQM) has always been environmental problems. First WQM efforts date back to the late 19th century (Novotny and Olem 1994;Johnson 2006). The continuous spread of water quality problems (epidemics in the 19th century, oxygen problems in large North-American rivers in the early 20th century, eutrophication problems in the 2nd half of the 20th century, "ubiquitous" emerging pollutants in the last decades (EEA 2018) meant that monitoring activities were under constant development. In the 1950s and 1960s, many regional-scale regular monitoring systems were inaugurated (F€ olster et al. 2014;Tango and Batiuk 2016;Shimada 2018). By the turn of the century, continent-wide water quality regulations fostered large-scale, standardized monitoring programs (U.S. Congress 1972;Dworak et al. 2005;L aszl o et al. 2007).
At the first design of a monitoring network, practical aspects are set against the goals of the network (Strobl and Robillard 2008;Telci et al. 2009). Later, at any redesign / optimization of the system, previously produced monitoring data provide additional information. Considerations for network re-design can be a byproduct or the objective itself of studies aiming at understanding the processes influencing water quality in a watershed (Kov acs et al. 2012a). Beside process-based models (Tsakiris and Alexakis 2012), statistical methods help understand the information contained in large datasets. They either only process the water quality data or link them to properties of the sampling location / related watershed.
Statistical models mostly consist of multivariate analysis tools (hierarchical cluster analysis [HCA], principal component analysis, correlation analysis, linear discriminant analysis [LDA] -to mention only a few), which lead to dimension reduction and thus easier handling/processing of the data (Wunderlin et al. 2001;Varol et al. 2012). HCA is usually used to group similar sampling sites Kov acs et al. 2014) with the ultimate goal of cost optimization. LDA has also been widely used to classify ambient measurement data, and in particular, water quality data (e.g. Tanos et al. 2015). This tool finds a set of prediction equations based on independent variables that are used to classify individuals. Many studies found LDA to be amongst the most effective methods in reducing the complexity of a system (e.g. Wunderlin et al. 2001;Varol et al. 2012).
More complex models investigate the statistical relation between watershed (WS) properties (primarily land use) and water quality (Azhar et al. 2015;K€ andler et al. 2017;R€ oman et al. 2018). Some of them establish a functional connection between the proportion of specific land uses and the value of one or multiple WQ parameters (Giri and Qiu 2016;Bostanmaneshrad et al. 2018). Others also take into account the fragmentation of land use (Bostanmaneshrad et al. 2018), relative location of polluting land uses (Giri et al. 2018) or the distance of each land use from the observation point (Tu and Xia 2008;Chen et al. 2016).
The number of studies in this field, as well as the studies themselves, underpin a single conclusion: while there is an unambiguous relationship between a watershed's land use and human activities and the WQ at its outflow point, due to the complex effects, its detailed description is rather complicated.
In Hungary, country-wide regular WQ monitoring began in 1968 (HSI 1993;Szab o 2008). For four decades, its functioning could be described by continuous, regular, and recursive system development with one major review of sampling locations, frequencies, and parameters every decade: new norms were introduced in 1983 and 1994 (Szab o 2008; Kerekes-Steindl 2016). The economic situation in the early 21 st century, as well as the introduction of the Water Framework Directive (WFD) with Hungary joining the EU in 2004, interrupted this progressive process. While WQ monitoring traditionally concentrated on the most important lakes and larger streams along with some additional vital sites (bathing waters, drinking water reservoirs), the WFD drew the attention on smaller creeks and ponds (Clement and Somly ody 2011). Since monitoring efforts were / could not be raised, this led to a significant rise in sampling sites along with a significant drop in frequencies (Fig. 1).
Current country-wide monitoring is driven by international as well as bilateral regulations (Transnational Monitoring Network, Nitrate Directive, Bathing Water Directive, Water Framework Directive, etc.). Sample collection, measurements, and lab analysis are carried out by one of seven central laboratories. The data are processed in the National Environmental Information System's Surface Water module (in Hungarian: Orsz agos K€ ornyezetv edelmi Inform aci os Rendszer Felsz ıni V ızv edelem Szakmodul, OKIR-FEVISz). Unfortunately, neither are the raw data freely available nor are the results submitted to any consequent or strict quality assessment / quality control processes (Somly ody 2018).
Many parts of the water quality monitoring system have been subjected to optimization by means of advanced statistical methods: the two largest rivers: Danube (e.g. Chapman et al. 2016) and Tisza (e.g. Tanos et al. 2011Tanos et al. , 2015, Lake Balaton (e.g. Kov acs et al. 2012b), and its mitigation wetland (e.g. Hatvani et al. 2014;Kov acs et al. 2015), or the smaller rivers (e.g. V arb ır o et al. 2012) or lakes (e.g. Borics et al. 2014). While studies carried out on larger rivers and lakes concluded in the optimization of the monitoring network, those referring to smaller waters stopped at establishing a suitable typology and did not draw any conclusion concerning the monitoring activities. Also, a holistic analysis of the monitoring database is absent; the yearly redesign of monitoring activities aims at filling unique information gaps rather than following a clear, long-term concept. The expression "data rich, information poor" (Behmel et al. 2016) perfectly characterizes the situation.
Rearrangement of the monitoring system between the years 2004-2007 made it possible to evaluate the status of 80% of the water bodies in 2015 based on monitoring data. However, concerning small rivers, more than half of them either could not be classified or were classified with low reliability due to lack of data (GDWM 2016a). It seems that while a major part of the nation's river network consists of small rivers, their monitoring is not well balanced.

AIMS OF THE STUDY
This study aims at optimizing the spatial distribution of monitoring locations on the small rivers of Hungary. The waters are classified according to similarities in physicochemical patterns. Inter-group and intra-group differences are quantified. Based on these measurements, answers to the following questions are sought.
1. Is it true that more monitoring efforts are concentrated on those groups whose members are more different one from the other? 2. To what extent can the formation of groups be predicted based on specific properties of the watershed?

Assessed water quality and auxiliary data
The river network of Hungary is subdivided into 1,078 water bodies, 889 of which are river water bodies, and the remaining ones are lake water bodies. This study was restricted to river water bodies whose long-term mean flow (LMQ) <1 m 3 s À1 . According to the 2nd River Basin Management Plan of Hungary (GDWM 2015), the 889 river WBs are classified into types 1 to 10 based on their catchment size, relief, soil, and grain size characteristics (Table 3). The studied water bodies belonged to types 1, 2, 3, 5, or 6 (WFD-CIS 2003). Most of them had the physico-chemical status of moderate or good, and none of them was bad (Table 3; . The watershed of each river was delineated based on EU-DEM v1.1 (Copernicus Land Monitoring Service 2016a) with the TauDEM method (Tarboton 1997 ; Fig. 2). EU-DEM also allowed for computing the basic relief properties of the watersheds (Table 1). Land use shares were calculated on the basis of the Corine Land Cover 2012 database (Copernicus Land Monitoring Service 2016b). A cadaster of Hungarian wastewater treatment plants allowed for the calculation of pointsource loads emitted to each river (GDWM 2016b). The emission values were divided by the long term mean flow (LMQ) of the river. The watershed properties served as predictor variables in the discriminant analysis models. Water quality data The subject of the analyses was WQ data from 13 parameters ( Table 2) available for 2007-2016 from Hungary, describing the salinity, acidity, oxygen budget, and nutrient conditions of the water bodies. Samples were gathered by one of the seven national laboratories, and component concentrations were either measured on-site or in the laboratory, according to Hungarian and international standards. The availability of WQ data restricted the number of creeks: only 101 of the total of 638 "small" (LMQ < 1 m 3 s À1 ) rivers were involved in the study.

Methodology
A 13-element vector of WQ values characterized each water body according to the list of parameters ( Table 2). The characteristic value of each parameter was the mean of all values measured at any location along the river. Parameter means were generated in two steps. First, a yearly mean value was calculated. Secondly, the mean of the yearly values was computed. This way, biases caused by eventual changes in sampling frequency (e.g. at different sites) could be eliminated. The values were then z-score transformed. Listing all rivers (in alphabetical order) and parameters (in the above order) in a matrix formed our initial database (101 3 13 matrix). This matrix determines the position of each river in the 13-dimensional space. The Euclidean distance between each pair of points was then calculated and indicated in the distance matrix (DM): a square matrix of size 101. In the DM, small numbers indicate high similarity, while large ones indicate dissimilarities. By definition, the main diagonal of the DM consists of zeros indicating each water's identity with itself. Instead of inserting the large matrix in this study, a novel but simple visualization technique is used: values below a certain threshold are dimmed black, while the rest of the matrix is kept white.
In order to picture the magnitude of the transformed distance units, the above transformation was applied to distances between type 3 and type 6 class boundaries (Table 3). Distances between adjacent class boundaries vary between 4.2 and 6.1 (Table 4). This means that two WBs being two units distant from one another have a great chance to be ordered into the same status class. On the other hand, if the distance between them is six transformed units, then there is a very high probability for one of them being at least an entire class worse than the other. Following this, the thresholds for a value to be dimmed black was chosen to be either 2 (Fig. 3, left; Fig. 5) or 6 ( Fig. 3, right).

Hierarchical Cluster Analysis (HCA)
Based on the DM, the hierarchical cluster dendrogram was generated (Day and Edelsbrunner 1984). During the clustering process, Ward's minimum variance method (Ward 1963) was used to determine the distance between a new group and the previously created groups / nodes. The Ward's method uses the analysis of variance approach to evaluate the distances between clusters while attempting to minimize the sum of squares of any two clusters that can be formed at each step. Finally, the dendrogram was intersected so that three groups would be distinguished.

Linear Discriminant Analysis (LDA)
Linear discriminant analysis is a classification algorithm that uses predictor variables to find the function that mostly differentiates between a nominal categorical variable (dependent variable; McLachlan 1992; Duda et al. 2000). In the present study, LDA was used to attempt to reproduce the HCA groups (the dependent variable) using two sets of watershed properties as predictor variables. In two consecutive models, the number of predictor variables was chosen to be 2 and 5 (Table 5).

Distance analysis
The visualization technique described under Methodology allows for the identification of waters with extreme behavior: those being members of a tight group, or, the other extremity: being distant from (almost) all the others. Black patches on Fig. 3 (left) indicate first-case waters. These are small creeks with low or no human impact, situated mostly in forested mountainous areas. Second-case waters are indicated by white horizontal and vertical lines on Fig. 3 (right); they are situated mostly in lowland urban or agricultural areas and thus disturbed by heavy human pollution.

Grouping of water bodies
After grouping the water bodies with HCA based on the DM, a dendrogram was obtained (Fig. 4). The dendrogram was intersected at 70% of the maximum linkage distance, creating three clusters. It can be seen that not only are the members of Cluster 1 closer to each other than to any member of Clusters 2 or 3, but distances between members of Cluster 1 are smaller than those between members of Cluster 2 or Cluster 3. In order to visualize the tendency that mean intra-group distances increase from Cluster 1 to Cluster 3, previously alphabetically ordered elements of the DM were rearranged: members of Cluster 1 were gathered into the upper left corner, members of Cluster 2 stayed in the middle, and members of Cluster 3 got into the lower right corner. The density of black patches on this new figure (Fig. 5, left) also confirms the increasing tendency.
To link cluster grouping with river basin management plan (RBMP) classes, the DM was reordered again: this time according to the RBMP class of the WB (class bad was Figure 4. Hierarchical clustering dendrogram. Dashed line: transection level. "cl 1" to "cl 3": clusters 1 to 3 (also denoted by colors blue, green, and red). Table: Intra-group and inter-group mean distances Figure 5. Distance matrix rearranged. Left: sorted according to clusters 1 to 3; right: sorted according to RBMP class. Instead of indicating the unique numbers, values <2 transformed units are dimmed black, and the rest are white. cl 3 5 cluster 3; h 5 high empty since no WB of the study fell into the worst class). Mean intra-group distances were calculated for this grouping as well, resulting in values of 3.2, 3.1, 4.0, and 5.3 for classes high, good, moderate, and poor, respectively (see Fig. 5, right). When comparing the cluster group and the RBMP class for the unique WBs, it was found that 31 out of 40 WBs in Cluster 1 came from classes high or good, 29 out of 42 WBs in Cluster 2 came from class moderate and 14 out of 19 WBs in Cluster 3 came from class poor. This finding demonstrates that the simple and straightforward method of hierarchical clustering was robust in reproducing the RBMP classes; or, the other way round: that RBMP class threshold values reflect the cluster boundaries quite well. This finding serves as an additional verification of the class boundaries .

Prediction of grouping with LDA models
In Model 1, almost two-thirds of the WBs could be classified correctly into Clusters 1 to 3. In particular, 68, 69, and 26% of WBs were correctly classified into Clusters 1, 2, and 3, respectively (Table 6). Since Model 1 only worked with two predictor variables, a two-dimensional visualization is possible (Fig. 6).
Model 2 allowed for a 12% increase in prediction accuracy compared to that of Model 1. The correctness rate increased for all three cluster groups (see the last column of Table 6). Both models affirm that cluster groupings were in strong relationship with point-source and diffuse pollution loads.

DISCUSSION
Many previous studies documented the fact that land use, along with point-source loads, determines the clustering of monitoring sites (Wang et al. 2014;Zhou et al. 2016). Zhou et al. (2016) even concluded that clusters are created according to the degree of human point-source and diffuse impacts. The novel visualization technique used in this study led to the conclusion that undisturbed WBs show higher clustering inclination than heavily polluted ones.
On the one hand, this is a direct consequence of the fact that the parameter set in this study was chosen to represent anthropogenic effects. Higher concentrations mean higher disturbance. On the other hand, this means that the quality of unpolluted water is much more predictable based on background properties than that of a highly polluted one. In other words, pollution can take many, quite varied, forms.
The share of agricultural land along with other land uses being the most critical predictor variables of water quality is in good agreement with many previous studies (Chen et al. 2016agricultural Barclay et al. 2016). According to our findings, WBs with no or meager wastewater share and arable land below 30% will likely have a good or high status. On the contrary, those with a wastewater share around or above 40% will not achieve good status. Status of WBs with wastewater share between 5 and 40%, arable land share above 30% is uncertain. These numbers are in good accordance with the findings of K€ andler et al. (2017), who concluded that settlement areas govern the chemical signature if their proportion of the land use is above 20%, arable land rules the WQ when its proportion is above 40% and that forest proportions bigger than 70% lead to WQ with low concentrations of pollutants.
Effects on the planning of future monitoring activities The results of HCA and LDA are to be used for optimizing monitoring activities. Our study established that the current number of measurements is not in proportion with the complexity of the pollution problem (Table 6). In particular, WBs falling in Cluster 3 need more measurements, which can be carried out at the expense of members of Cluster 1, the latter being more similar to one another. When planning monitoring activities on previously unmonitored small watercourses, the following guidelines must be respected: Relatively fewer measurements are needed on WBs where WW share < 10% and arable land share < 30% because they are very likely to achieve good status. More measurements are needed on waters where arable land share > 30% or 40% < WW share < 70%, to track the desired status improvement. The focus should be on WBs where 10% < WW share < 40% and / or 30% < arable land share because their status will probably be close to the good / moderate border. WBs with WW share > 70% are very rare and probably have very bad status.

Evaluation, perspectives
The methodology presented in this paper is capable of estimating water quality based on watershed properties. Nevertheless, it should not be concluded that measurements can be neglected (the method relies firmly on measurement data) but rather that measurements should be focused on water bodies at risk (i.e., those characterized with high human impact). On the other hand, pristine waters have "more to lose"; therefore, their monitoring is also important (e.g. F€ olster et al. 2014).
Further investigation is needed to determine the sensitivity of the results to the WQ parameters chosen. A possible development of the presented methodology could be the taking into account of the (hydrological) distance of a land use and point-source inlet from the monitoring site.

CONCLUSIONS
Hierarchical clustering analysis was used to group 101 water bodies into three main groups based on their normalized physico-chemical measurement data. It was found that (1) clusters emerged according to the level of pollution in the water body and that (2) water bodies with no or low disturbances show higher similarities (smaller distances) among them than those with high pollution. It has been established that current monitoring practice puts too much emphasis on undisturbed waters while it does not deal sufficiently with heavily polluted ones. Concerning unmonitored waters, future monitoring should focus on those with WW share between 10 and 40% and arable land share above 30%, because in their case, only monitoring can decide whether they will achieve good status.