Measuring the circularity of congressional districts

Shape analysis is especially important in detecting manipulated redistricting, known as gerrymandering. In most US states, redistricting is carried out by non-independent actors and often provokes debates about partisan manipulation. The somewhat ambiguous concept of compactness is a standard criterion for legislative districts. In the literature, circularity is widely used as a measure of compactness, since it is a natural requirement for a district to be as circular as possible. In this paper, we introduce a novel and parameter-free circularity measure that is based on Hu moment invariants. This new measure provides a powerful tool to detect districts with abnormal shapes. We examined some districts of Arkansas, Iowa, Kansas, and Utah over several consecutive periods and redistricting plans, and also compared the results with classical circularity indexes. We found that a fall in the average circularity value of the new measure indicates potential gerrymandering.


INTRODUCTION
Redistricting has traditionally been contentious in the United States because it may favor a certain political party. However, it has to be carried out to resolve geographic malapportionment caused by demographic changes. Gerrymandering is among the most commonly discussed practices of political manipulation. This process aims to establish a political advantage for a particular group by drawing district boundaries in a biased way. Making districts as compact as possible is a standard criterion (Polsby – Popper 1991; Webster 2013). Hence, measuring the circularity of districts can be a suitable tool to help detect gerrymandering. This study focuses on the quantification of shape circularity. We introduce a novel circularity measure based on Hu moment invariants that can be considered a further development of the concept introduced by Nagy and Szakál (2019). We also test the new measure on various US congressional districts in different periods.
The process of redistricting is usually the following. After the number of citizens belonging to one district is determined, the boundaries have to be drawn, which is critical from the perspective of proportional representation of the voters. Districting problems are examined by, for instance, Ricca et al. (2011), Chambers (2008, 2009), Oehrlein and Haunert (2017), Tasnádi (2011), and Puppe and Tasnádi (2015). Several methods have been developed to measure the shape of a district (e.g., Chambers – Miller 2010, 2013; Dusek 2015; Maceachren 1985; Young 1988). It is difficult to characterize a district with a single number because there are several correlated properties regarding the form of a planar object, e.g., geographical compactness (circularity), elongation, convexity, connectivity, or the jaggedness of the boundary. These terms have different meanings in different fields of science, so there is no common terminology for describing shapes. The standard meaning of circularity is the degree to which a shape differs from a circle, and it is the most important index from a practical point of view.
In the literature, measuring circularity is an accepted approach to cope with the compactness criterion that aims to limit gerrymandering. The classical circularity measures use the perimeter, the area, and various length measures of a shape, but a measure that is perfect in every respect does not exist. The most widespread indexes are, for example, the Reock Test (Reock 1961), the Polsby-Popper Test (Polsby – Popper 1991), and the Lee-Sallee index (Lee – Sallee 1970). Moment invariants are often used in pattern recognition to characterize shapes and have been studied in the literature, for instance, by Csetverikov (2014) and Hu (1962). These moments are invariant to similarity transformations and can be calculated effectively. Initially, a moment-based circularity measure was introduced by Zunic et al. (2010), Nayak and Stojmenovic (2007), and Zunic (2012).
Many attempts have been made to detect gerrymandering (Ansolabehere – Palmer 2016). A method for evaluating the shape of political districts based on geometric characteristics and comparison with a ranking established by human judgment can be found in Lunday (2014). Fan et al. (2015) analyzed the compactness of redistricting plans in California and North Carolina by calculating four compactness measures, including some classical indexes. Nagy and Szakál (2019) studied the shape of the congressional districts by a measure $C_\beta$ (see Definition 2 in Section 2) that depends on a parameter $\beta$. This index is invariant under similarity transformations, and its sensitivity to lacerated boundaries is adjustable by $\beta$. The measure was evaluated for several parameter values, and the results were compared with the Lee-Sallee index, the Polsby-Popper, and the Reock tests on several congressional districts. The authors concluded that $\beta = 2$ is an appropriate parameter on their sample set. Thanks to the Hu moment invariants, the evaluation of this measure is efficient compared to other indexes.

This paper addresses a shortcoming of the measure of Nagy and Szakál (2019). We found that when we compare the circularity of two districts with different $\beta$ parameters, the circularity order can change in certain cases. So, we aim to further develop the $C_\beta$ measure and introduce a more robust index $M$ (see Definition 3 in Section 2) that does not need any parameters. We found that this new measure provides a powerful method to detect districts with abnormal shapes. We examined the same districts over several consecutive periods, and found that a fall in average circularity can indicate gerrymandering.

This paper is organized as follows. Section 2 gives a brief overview of circularity measures. Section 3 reveals an undesired feature of $C_\beta$, and demonstrates the application of the new circularity measure $M$ on congressional districts of Arkansas, Iowa, Kansas, and Utah. Finally, conclusions are drawn in Section 4.

CIRCULARITY MEASURES
Measuring the compactness of congressional districts could be a helpful tool in the detection of gerrymandering. The definition of many shape descriptors is based on the degree to which a shape differs from a circle. The following requirements hold for a circularity measure $C$:
1. $C(D) \in (0, 1]$ for any planar shape $D$;
2. $C(D) = 1$ if and only if $D$ is a circle;
3. $C(D)$ is invariant with respect to similarity transformations (translations, rotations, and scaling);
4. For each $\delta > 0$ there is a shape $D$ such that $0 < C(D) < \delta$, i.e., there are shapes whose measured circularity is arbitrarily close to 0.
There are a large number of shape descriptors in the literature that can be applied as a circularity measure for a region, and there are several attempts to classify them into different categories (see Maceachren 1985 and Niemi et al. 1990).
In this section, we briefly summarize the most commonly used circularity indexes. In Definition 3, we introduce a new circularity measure that is a further developed version of the one in Definition 2, which was presented in Nagy and Szakál (2019).

Classical circularity indexes
One of the most popular contour-based indexes is the Polsby-Popper Test (PPT). It compares the area of the shape $D$ to the area of a circle $O$ that has the same perimeter as the shape:

$$\mathrm{PPT}(D) = \frac{\mathrm{area}(D)}{\mathrm{area}(O)} = \frac{4\pi\,\mathrm{area}(D)}{\mathrm{perimeter}(D)^2}. \qquad (1)$$

One of the most famous area-based methods is the Reock Test (RT), which finds the smallest circle $O$ containing the district $D$ and takes the ratio of the area of the district to that of the circle:

$$\mathrm{RT}(D) = \frac{\mathrm{area}(D)}{\mathrm{area}(O)}.$$

Another possible way to quantify the circularity of a shape is to place a reference shape $R$ on the examined shape $D$ so that the overlapping area is maximal. The following formula defines the Lee-Sallee Index (LSI), where $\Delta$ is the symmetric difference:

$$\tau(D) = \frac{\mathrm{area}(D \,\Delta\, R)}{\mathrm{area}(D \cup R)}.$$

In this case, according to the established practice, we arrange the two shapes so that their centroids coincide, and we use as a reference shape a circle $O$ whose area equals that of the examined shape $D$. Thus, we get the following circularity measure:

$$\mathrm{LSI}(D) = 1 - \frac{\mathrm{area}(D \,\Delta\, O)}{\mathrm{area}(D \cup O)}.$$
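As a small illustration of the Polsby-Popper Test, the sketch below evaluates it for a simple polygon. It relies on the fact that a circle with perimeter $P$ has area $P^2/(4\pi)$, so the ratio becomes $4\pi\,\mathrm{area}(D)/\mathrm{perimeter}(D)^2$. The function names and the assumption that the district is given as a list of planar (projected) vertex coordinates are ours, not part of the original study.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(vertices):
    """Sum of the edge lengths, closing the ring back to the first vertex."""
    return sum(math.dist(p, q)
               for p, q in zip(vertices, vertices[1:] + vertices[:1]))

def polsby_popper(vertices):
    """Area of the shape divided by the area of the equal-perimeter circle."""
    return 4.0 * math.pi * polygon_area(vertices) / polygon_perimeter(vertices) ** 2

# Hypothetical test shapes (planar coordinates, arbitrary units).
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
thin_strip = [(0, 0), (100, 0), (100, 1), (0, 1)]
near_circle = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
               for k in range(360)]

for name, shape in [("square", unit_square), ("strip", thin_strip),
                    ("360-gon", near_circle)]:
    print(name, round(polsby_popper(shape), 4))
```

Real district boundaries would first have to be projected to a planar coordinate system; geographic longitude/latitude pairs would distort both area and perimeter.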

Moment invariants as a circularity measure
Let us assume that all the examined shapes are compact in the topological sense, which does not restrict our image processing task. Furthermore, we analyze the shape of a congressional district and do not take the population into account; the brightness function can be considered as the indicator function. Before we introduce our novel circularity measure, some definitions shall be recalled here.
Definition 1. Let $\mu_{p,q}(D)$ denote the central (image) moment of order $(p+q)$ of a compact planar shape $D$, let $\eta_{p,q}(D)$ denote the normalized central moments, and let the area of the circle $O$ equal the area of $D$. Then

$$C_1(D) = \frac{\mu_{2,0}(O) + \mu_{0,2}(O)}{\mu_{2,0}(D) + \mu_{0,2}(D)} = \frac{\mu_{0,0}(D)^2}{2\pi\,\bigl(\mu_{2,0}(D) + \mu_{0,2}(D)\bigr)} = \frac{1}{2\pi\,\varphi_1(D)} \le 1$$

is a circularity measure, where $\varphi_1(D) = \eta_{2,0}(D) + \eta_{0,2}(D)$ is the first Hu moment invariant, and the last inequality is an equality if and only if $D$ is a circle; for details see Hu (1962) and Zunic et al. (2010).

It is easy to see that this circularity measure satisfies the above-mentioned requirements, see Nagy and Szakál (2019) and Hu (1962). The following circularity measure $C_\beta$ from Zunic et al. (2010) is a generalization of $C_1$, and it is applicable in special cases when we want to set the sensitivity manually for a specific purpose. This moment-based circularity measure was first applied for measuring the circularity of congressional districts by Nagy and Szakál (2019).
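To make Definition 1 concrete, the following sketch evaluates $C_1$ on a binary raster, assuming the form $C_1(D) = 1/(2\pi\varphi_1(D))$ given by Zunic et al. (2010), with $\varphi_1 = \eta_{2,0} + \eta_{0,2}$. Representing a shape as a list of pixel coordinates, and the three test shapes, are our own illustrative assumptions. For a continuous square the value is $3/\pi \approx 0.955$, which the raster approximation should reproduce closely.

```python
import math

def first_hu_invariant(pixels):
    """phi_1 = eta_20 + eta_02 for a binary shape given as (x, y) pixels."""
    n = len(pixels)                      # mu_00: area, since brightness is 0/1
    cx = sum(x for x, _ in pixels) / n   # centroid
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    # Normalised central moments of order 2: eta_pq = mu_pq / mu_00**2.
    return (mu20 + mu02) / n ** 2

def c1(pixels):
    """Moment-based circularity C_1 = 1 / (2 * pi * phi_1)."""
    return 1.0 / (2.0 * math.pi * first_hu_invariant(pixels))

# Hypothetical raster shapes used only for illustration.
disk = [(x, y) for x in range(-50, 51) for y in range(-50, 51)
        if x * x + y * y <= 50 * 50]
square = [(x, y) for x in range(100) for y in range(100)]
strip = [(x, y) for x in range(400) for y in range(25)]

for name, shape in [("disk", disk), ("square", square), ("strip", strip)]:
    print(name, round(c1(shape), 4))
```

Because pixel centres stand in for the continuous integral, the disk value is only approximately 1; the discrepancy shrinks as the raster resolution grows.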
Definition 2. Let $D$ be a planar shape whose centroid coincides with the origin and let $\beta$ be a real number with $\beta > -1$ and $\beta \neq 0$. Then $C_\beta(D)$ is the generalized moment-based measure

$$C_\beta(D) = \begin{cases} \dfrac{\mu_{0,0}(D)^{\beta+1}}{(\beta+1)\,\pi^{\beta} \iint_D (x^2+y^2)^{\beta}\,dx\,dy}, & \beta > 0, \\[2ex] \dfrac{(\beta+1)\,\pi^{\beta} \iint_D (x^2+y^2)^{\beta}\,dx\,dy}{\mu_{0,0}(D)^{\beta+1}}, & \beta \in (-1, 0). \end{cases}$$

The current study revealed an undesired feature that emerged from the examined data. The circularity order can change when we apply different $\beta$ parameters to dissimilar shapes, as highlighted in Figs 3 and 4. This is the main reason why we should take multiple $\beta$ parameters into account at the same time. Furthermore, our goal is to describe district circularity by a single value and make a comparison with other districts or with other periods. Therefore, in the next definition, we propose the normalized measure of the area under the curve of $C_\beta$ for $\beta \in (-1, 0) \cup (0, \infty)$ as a new circularity measure, and denote it by $M$.
Definition 3. Let $C_\beta(D)$ be the generalized moment-based measure. Then $M(D)$, the normalized area under the curve of $C_\beta(D)$ over $\beta \in (-1, 0) \cup (0, \infty)$, is a circularity measure.

In fact, $M$ equals the average of $C_\beta$ over $\beta$. Furthermore, it also keeps the beneficial properties of $C_\beta$, and it is more robust in some cases. Figure 1 illustrates the different characteristics of $C_\beta$ on two dissimilar shapes, and it also shows the circularity values of the classical indexes and the new measure $M$. More details on the application of the new measure will be given in the following section.
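To see how $C_\beta$ behaves as $\beta$ varies for a given shape, the sketch below samples it on a raster. It assumes the form of the generalized measure from Zunic et al. (2010) for $\beta > 0$, namely $\mu_{0,0}^{\beta+1} / \bigl((\beta+1)\pi^{\beta}\iint_D (x^2+y^2)^{\beta}\,dx\,dy\bigr)$ with the centroid at the origin. The branch $\beta \in (-1, 0)$ is deliberately left out because the integrand is singular at the centroid and a plain pixel sum is not a reliable quadrature there. The exact normalization of $M$ is not reproduced; all function names and test shapes are hypothetical.

```python
import math

def c_beta(pixels, beta):
    """Raster approximation of C_beta for beta > 0 (pixel-centre quadrature)."""
    if beta <= 0:
        raise ValueError("this sketch only covers beta > 0")
    n = len(pixels)                      # mu_00 (area in pixels)
    cx = sum(x for x, _ in pixels) / n   # shift so the centroid is the origin
    cy = sum(y for _, y in pixels) / n
    integral = sum(((x - cx) ** 2 + (y - cy) ** 2) ** beta for x, y in pixels)
    return n ** (beta + 1) / ((beta + 1) * math.pi ** beta * integral)

def c_beta_curve(pixels, betas):
    """Sample the curve beta -> C_beta(D) at the given positive beta values."""
    return [(b, c_beta(pixels, b)) for b in betas]

# Hypothetical raster shapes used only for illustration.
disk = [(x, y) for x in range(-50, 51) for y in range(-50, 51)
        if x * x + y * y <= 50 * 50]
square = [(x, y) for x in range(100) for y in range(100)]
strip = [(x, y) for x in range(400) for y in range(25)]

for name, shape in [("disk", disk), ("square", square), ("strip", strip)]:
    curve = c_beta_curve(shape, [0.5, 1, 2, 8])
    print(name, [(b, round(v, 4)) for b, v in curve])
```

Comparing two such sampled curves point by point is one simple way to check whether the circularity order of two districts depends on the chosen $\beta$.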

APPLICATION OF THE NEW CIRCULARITY MEASURE ON US CONGRESSIONAL DISTRICTS
The United States Congress consists of two chambers: the Senate and the House of Representatives. The 50 states have been divided into a total of 435 congressional districts in the House of Representatives, with each one representing approximately 711 thousand citizens. The US Congress has 535 voting members: 435 Representatives and 100 Senators. The members of the House of Representatives serve two-year terms representing the people of a congressional district. The Census Bureau within the United States Department of Commerce conducts a decennial census, and its results are used to determine the boundaries of the districts. The boundaries can be redrawn, e.g., two districts can be merged into one (see Iowa from the 113th Congress) or a district can be separated into two districts (see Utah from the 113th Congress). In most states, this process is carried out by non-independent actors and often provokes debates about partisan manipulation. According to Levitt (2019), only a few states have a fully independent commission that draws the district lines because state legislatures usually have primary control over the congressional lines in their state.
Unfortunately, the definition of compactness is not clear-cut (Chambers – Miller 2010). However, the measure of how far the points of a district lie from its center, determined by classical circularity indexes, is essential in redistricting.[1] In some cases, there is prima facie evidence that a district is manipulated; see Illinois's fourth district in the 107th Congress in Fig. 5. On the other hand, it is hard to determine the ideal shape for a congressional district. Moreover, optimal partisan redistricting is also hard, see Fleiner et al. (2017).
[1] Interestingly, Arkansas does not require congressional or legislative districting plans to be compact.

Society and Economy

In order to determine the circularity measures, some image processing tasks should be carried out on the cartographic shapefiles that were retrieved from US Census Bureau (2019). The vector maps of the selected states were rendered to gray-scale bitmap images. Thus, the different areas could be separated by the pixel intensity values. Then, the centroid of the district can be calculated with the help of the central moments. This allows us to determine the reference circle. The area of the reference shape equals the area of the examined district. Finally, we need the Hu moment invariants to determine the new circularity measure. The classical circularity indexes require only the shape area and perimeter.
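The raster steps described above (separating a district's pixels, computing its centroid, and building an equal-area reference circle around it) can be sketched as follows. The example then uses the reference circle to evaluate an overlap ratio $\mathrm{area}(D \cap O)/\mathrm{area}(D \cup O)$, which equals one minus the symmetric-difference ratio $\tau$ of the Lee-Sallee definition. Treating that ratio as the Lee-Sallee value, the 0/1 grid representation, and all helper names are our assumptions, not the authors' implementation.

```python
import math

def shape_pixels(grid):
    """Set of (x, y) coordinates whose raster value is non-zero."""
    return {(x, y) for y, row in enumerate(grid)
            for x, value in enumerate(row) if value}

def centroid(pixels):
    """Centroid of the shape from its first-order moments."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def reference_circle(pixels):
    """Pixels of the circle centred at the centroid with the same area."""
    cx, cy = centroid(pixels)
    r = math.sqrt(len(pixels) / math.pi)
    xs = range(math.floor(cx - r), math.ceil(cx + r) + 1)
    ys = range(math.floor(cy - r), math.ceil(cy + r) + 1)
    return {(x, y) for x in xs for y in ys
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r}

def lee_sallee(grid):
    """Overlap ratio area(D & O) / area(D | O) on the raster."""
    d = shape_pixels(grid)
    o = reference_circle(d)
    return len(d & o) / len(d | o)

def disk_grid(size, radius):
    """Hypothetical test raster: a filled disk in a size x size grid."""
    c = size // 2
    return [[1 if (x - c) ** 2 + (y - c) ** 2 <= radius ** 2 else 0
             for x in range(size)] for y in range(size)]

def rect_grid(width, height):
    """Hypothetical test raster: a filled width x height rectangle."""
    return [[1] * width for _ in range(height)]

for name, grid in [("disk", disk_grid(81, 30)), ("square", rect_grid(60, 60)),
                   ("strip", rect_grid(200, 10))]:
    print(name, round(lee_sallee(grid), 3))
```

The reference circle is allowed to extend beyond the raster bounds, which matters for elongated shapes whose equal-area circle is wider than the shape itself.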
An undesired feature of the moment-based circularity measure
Nagy and Szakál (2019) examined the measure $C_\beta$ as a function of $\beta$. Figure 2 shows an example for $\beta = -0.5, 1, 2, 8$. They also compared the values of $C_\beta$ with the Lee-Sallee index, the Polsby-Popper, and the Reock tests on various congressional districts. We will also use these indexes as benchmarks when we try to detect gerrymandering.
However, our research revealed that in certain cases, when we compare the circularity of two districts with different $\beta$ parameters, the order can change, which is an undesired feature of the circularity index. Figure 3 presents the case of $\beta = -0.5$ for two districts, where AR03/107 is more circular than AR01/113, while for $\beta = 1$ it is just the opposite. This can also appear in a more relevant part of the domain, between 1 and 2, as can be seen in Fig. 4. This was one of the reasons why we introduced the new index $M$, defined in Definition 3. The other beneficial property of $M$ is that it does not require any parameters.

Finally, an illustration that shows the nature of the examined circularity measures in a nutshell can be found in Fig. 5. In this example, we can see the circularity evaluation of two very differently shaped districts. Illinois's fourth district in the 107th Congress, a famous example of gerrymandering, with its tangled boundaries and small area compared to its perimeter, has much lower circularity than Arkansas's second district from the 113th Congress. The different characteristics of the two curves are distinctly visible as $\beta$ changes. In the upper instance, the $C_\beta(D)$ curve decreases much more slowly than in the lower case. Furthermore, the more irregular the shape, the faster the value converges to 0; for details see Zunic et al. (2010). This example also shows that determining an appropriate $\beta$ for $C_\beta$ is not straightforward.

Detection of gerrymandering
When we try to detect gerrymandering, we should consider the average circularity of a state through successive Congresses and seek significant anomalies. Thus, we can track the changes and reduce the impact of external conditions, e.g., geographical constraints. We have analyzed four states (Arkansas, Iowa, Kansas and Utah) in the period of the 107th (from January 3, 2001 to January 3, 2003), 108th (from January 3, 2003 to January 3, 2005) and 113th (from January 3, 2013 to January 3, 2015) US Congress. The populations of these states are similar, around 3 million, and they all have 3-5 districts. Experimental results are included in the Appendix and can also be seen on an online map (Nagy and Szakál 2020), developed with the Leaflet library for interactive maps.

All circularity indexes of Utah decreased in stages from the 107th to the 113th Congress. In the case of Iowa, the examined indexes behaved similarly in these periods: the 107th Congress showed the best results and the 108th the worst. In Arkansas, LSI and PPT decreased monotonically, while RT and $M$ had a peak at the 108th. Remarkably, $M$ was more sensitive to the change than RT. The most interesting state was Kansas, where the indexes gave completely different orders, and $M$ was the only one with a falling trend.
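The comparison of state-level averages across successive Congresses can be sketched as below. The circularity values are placeholders invented purely for illustration (they are not results from this study), and the drop threshold of 0.05 is a hypothetical choice, since no numerical cut-off for a "significant" fall is specified here.

```python
from statistics import mean

def average_by_congress(scores):
    """Map each Congress number to the mean circularity of its districts."""
    return {congress: mean(values) for congress, values in sorted(scores.items())}

def flag_drops(averages, threshold=0.05):
    """Return (earlier, later, drop) for consecutive periods whose average
    circularity fell by more than the (hypothetical) threshold."""
    congresses = sorted(averages)
    flags = []
    for earlier, later in zip(congresses, congresses[1:]):
        drop = averages[earlier] - averages[later]
        if drop > threshold:
            flags.append((earlier, later, drop))
    return flags

# Placeholder per-district values of some circularity index for one state;
# NOT data from the paper.
example_state = {
    107: [0.60, 0.70, 0.65],
    108: [0.62, 0.71, 0.68],
    113: [0.40, 0.66, 0.50],
}

averages = average_by_congress(example_state)
for earlier, later, drop in flag_drops(averages):
    print(f"average fell by {drop:.2f} between Congress {earlier} and {later}")
```

A flagged fall is only a prompt for closer inspection of the individual districts, in line with the cautious wording above; geographical constraints or a change in the number of districts can also move the average.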
An example of presumed gerrymandering is given below. Figure 6 shows Arkansas's third district alone through the 107th, 108th and 113th Congress. In the table, we can see an almost unambiguous improvement in the circularity values from the 107th to the 108th period, then a significant fall from the 108th to the 113th Congress, which gives rise to the suspicion of gerrymandering. The strange shape of the district in the last period is also visible to the naked eye.

CONCLUSION
This paper has investigated the shape circularity of congressional districts, which can be a useful weapon against gerrymandering. Circularity is a fundamental requirement of citizens, and unsurprisingly, it is included in the regulation of redistricting in many states. The measure presented by Nagy and Szakál (2019) performed well compared with classical circularity indexes, but we have found several instances where the circularity order of the districts changed when different $\beta$ parameters were applied. We have made some improvements to this measure and created a more robust method that does not depend on any parameters. Our experiments on US congressional districts confirmed that the new index is suitable for measuring circularity effectively, since, in many cases, it is more sensitive than the traditional circularity measures.