Abstract
The paper employs a cross-sectional data set comprising the main dimensions of the European Union's International Digital Economy and Society Index (I-DESI) and utilises grouping methods based on objective weights to evaluate the relative digital readiness of Hungary and other Central and Eastern European (CEE) member states of the EU. The objective was not to establish a total ordering (ranking) of the countries in the data set, but rather to identify the most appropriate means of grouping the CEE countries into homogeneous units, utilising multivariate statistical and decision-theoretical techniques (tiered DEA, partially ordered sets and clustering). Despite the disparate methodologies employed, the findings are consistent in that the CEE countries (including Hungary) exhibit a general resemblance to one another and demonstrate comparatively lower levels of digital readiness than Northern and Western European countries. The notable exception is Estonia, which exhibits a distinctive level of digital advancement.
1 Introduction
The International Digital Economy and Society Index (I-DESI) is a set of indicators and a report commissioned by the European Commission and produced by independent experts on three occasions in the period 2016 to 2020. The I-DESI is a composite indicator that aims to provide a comprehensive assessment of the European Union's progress towards a “digital society and economy” compared to selected (usually developed) countries outside the EU. The underlying principle of the methodology is that the set of indicators should reflect the results of the European Commission's original (EU-only) Digital Economy and Society Index (DESI) and be geographically extended using substitute (or proxy) indicators that are also available for non-EU countries.
The DESI and the I-DESI are reports and indicator systems that combine several individual indicators with pre-defined weights and use similar (but not identical) scoring models to assess and rank countries according to their digital performance. Nevertheless, the objective of this paper is not to rank countries; rather, it is to employ objective decision theoretical and multivariate statistical methods (tiered DEA [TDEA], partially ordered sets [poset] and k-means clustering) to form relatively homogeneous groups of countries in terms of digital development. This permits examination of the relative position of Hungary and the Central and Eastern European (CEE) countries.
The 2020 iteration of the I-DESI (European Commission 2021) evaluated digitalisation performance in 45 countries (27 EU, 18 non-EU) across five primary policy areas (dimensions), as shown in Table 1: Connectivity, Human Capital, Citizen Use of Internet, Integration of Digital Technology, and Digital Public Services (see Tarjáni et al. 2022 for a more detailed introduction and statistical analysis). In our study, we utilised data for these key dimensions from 2018 as a foundation for classifying countries into groups using the Hasse diagram technique and the TDEA and k-means clustering methods. Additionally, we sought to examine the position of CEE member states of the EU in terms of digital development within the context of the EU and their principal competitors.
Table 1. I-DESI 2020 dimensions and original weights
Dimension | Description | Weight |
Connectivity | The deployment of broadband infrastructure and its quality | 0.25 |
Human Capital | The skills needed to take advantage of the possibilities offered by a digital society | 0.25 |
Citizen Use of Internet | The variety of activities performed by citizens already online | 0.15 |
Integration of Digital Technology | The digitisation of businesses and development of the online sales channel | 0.20 |
Digital Public Services | The digitisation of public services, focusing on eGovernment | 0.15 |
Source: European Commission (2021).
While the authors would have preferred to include more recent data in the analysis, unfortunately the 2022 edition of I-DESI was not publicly available at the time of the research. Furthermore, as the underlying DESI framework has also been discontinued, it is unlikely that future editions will be published.
In the next section of our study, we provide a brief overview of the literature related to the DESI and I-DESI indicator systems and our grouping methods. Thereafter, we present the methodology of the models used in our study, namely the TDEA, the Hasse diagram technique and k-means clustering. In the final sections, we present and discuss the results obtained with these methods and draw our conclusions.
2 Literature review
A substantial and rapidly growing body of literature has emerged that describes digital transformation and development. However, the first part of this section focuses exclusively on the relatively recent literature related to DESI and I-DESI. In the second part, we briefly introduce the literature related to our decision-theoretical grouping methods, namely posets and TDEA.
Bánhidi et al. (2020) employed multivariate statistical methods to analyse the five principal dimensions of DESI. The authors initially examined the linear relationships between dimensions using simple Pearson and partial correlation analysis, as well as factor analysis. This was done with the aim of identifying any potential causal relationships. Their analysis confirms the European Commission's assertion that the five principal dimensions of the DESI are inextricably linked and can only be effectively developed through a unified and coordinated strategy. Subsequently, the EU member states were grouped using cluster analysis and multidimensional scaling (MDS), ranked using multivariate statistical methods, and the resulting rankings were compared with the country rankings based on the European Commission's original scoring model. The findings demonstrate that these disparate methods yield comparable results.
In a more recent study, Tarjáni et al. (2022) employed a similar multivariate statistical analysis on the I-DESI data set. Discriminant analysis was employed to assess the separability of EU and non-EU data, while analysis of variance was utilised to evaluate the mean values of the dimensions. The correlation between the primary dimensions was examined using Pearson's and partial correlation coefficients. Principal component analysis was employed to reduce the five dimensions to two principal components, which were then interpreted in relation to the previous year's data. Outliers were evaluated using Mahalanobis distances, and the results of this analysis also support the European Commission's view that the dimensions of the digital economy are closely correlated and that there is no significant difference in digital development between EU member states (on average) and the non-EU countries (on average) in the database.
Olczyk and Kuc-Czarnecka (2022) analysed the components of the DESI and the suitability of the index's methodology. They also investigated the correlation between the index and economic growth. The results of their analysis demonstrate that two of the five principal dimensions of the DESI (Use of Internet and Digital Public Services), or nearly half (18) of the total 37 indicators, can be excluded without a significant impact on the country ranking derived from the index. Furthermore, the authors have proposed optimising the subjective weights of the DESI. Regarding the relationship with economic development, they found that GDP per capita can be explained to a significant extent by both the original and the modified index, with digital development being an important growth driver, especially for developing countries.
Bánhidi and Dobos (2023a) employed three distinct methodologies on a data set comprising the four principal dimensions of DESI to group the countries of the European Union (EU) and identified the continent's primary geographical “fault lines” in terms of digital readiness. Their objective was to identify groups of countries that were homogeneous (closely aligned) in terms of digital maturity by grouping them together. The three methods employed in the paper were partially ordered sets (poset), Tiered Data Envelopment Analysis (TDEA), and cluster analysis. The three types of clustering demonstrated a high degree of similarity, indicating the robustness of the results.
Török (2024) employed a panel data set incorporating the 2020 edition of I-DESI to examine the links between digital development and economic growth. The findings indicated that there was a strong positive relationship between the value of the I-DESI index and the GDP per capita data. Furthermore, the results demonstrated that from 2015 to 2019, an increase in digital development was accompanied by an increase in GDP. The study's results confirmed that digital development had a positive effect on GDP; however, it was not a sufficient explanation for GDP growth on its own.
Laitsou et al. (2020) employed the DESI and its five principal dimensions to assess the digital performance of the Greek economy and utilised a Gompertz model to project Greece's trajectory towards attaining the digital development levels of leading EU countries. The authors concluded that, despite the country's numerous challenges on both the demand and supply sides of digitalisation, the implementation of appropriate government policies could enable Greece to achieve a level of digitalisation comparable to the EU average by 2030.
Kovács et al. (2022) also employed the DESI data set to examine the potential for catching up in digital development between EU countries. Their findings indicated that the digital development gap between EU member states narrowed between 2016 and 2021 (σ convergence) and that there was a significant negative correlation between initial development level and growth rate (β convergence).
Esses et al. (2021) identified significant correlations in their analysis of the relationship between sustainability and digitalisation in the Visegrád Group (V4) member countries. Their findings contributed to the examination of the linkages between the DESI and GDP, and extended the analysis to broader measures of welfare, including the Human Development Index (HDI) and the Social Progress Index (SPI).
The analysis conducted by Tokmergenova et al. (2021) concentrated on the multicollinearity between dimensions and the statistical relationships between them, utilising data from the I-DESI, which is regarded as an international extension of the DESI. Their findings illustrated the presence of pronounced multicollinearity and redundancy between dimensions.
Bánhidi et al. (2021) investigated Russia's position and performance in comparison to EU member states, utilising data from the I-DESI. Their findings, based on the DEA/CI model, indicated that Russia has made notable advancements in digital economic and social development when compared to Eastern and Southern EU member states, particularly due to its commendable performance in the human capital dimension.
Following a short review of the existing literature on the DESI and I-DESI indices, we now turn our attention to the literature pertaining to the methods we utilise, with a particular focus on the TDEA method and posets.
The basic DEA (Data Envelopment Analysis) methodology (Charnes et al. 1978) effectively categorises decision-making units (DMUs) into two distinct groups: efficient and inefficient. Inefficient DMUs can be ordered according to their efficiency index; however, this is not possible for efficient DMUs. In this context, the question arises as to whether a total ordering (ranking) is truly necessary or whether it is sufficient to divide DMUs into groups of nearly equal efficiency.
This possibility was first proposed by Barr et al. (2000) as a potential avenue for further investigation within the domain of DEA models. They developed a sequential algorithm in which efficient DMUs are separated from other DMUs in a manner analogous to the peeling of an onion. The efficient units are then redefined on the residue and subjected to a further “peeling as an onion” process until the units are exhausted. The DMUs are then grouped according to their efficiency. This method is known as Tiered Data Envelopment Analysis (TDEA) and is also frequently referred to as “onion peeling” in the literature, reflecting the essence of the method. This approach is widely used in the social sciences, particularly in management.
The most prolific application area has been port logistics. The first published application was Cheon (2009), which examined the efficiency of South Korean ports using this technique. Den et al. (2016) employed this model to assess the efficiency of South Korean and Russian ports.
Another application area is related to higher education. In their paper, Bougnol and Dulá (2006) examined 616 American universities according to their efficiency using TDEA. They concluded that this method yielded results that were consistent with those obtained through the formal measurement method employed by government administration. More recent applications include the work of Johnes (2018), who used this method to examine UK universities in university league tables. In both papers, grouping rather than ranking was the primary objective.
Following a concise overview of the utilisation of TDEA within the social sciences, we proceed to examine the application of discrete multi-criteria decision theory.
In addition to the DEA method, multi-criteria decision theory (Ehrgott 2005) provides a means of comparing the digital efficiency of countries. In contrast to classical decision problems with continuous decision spaces, the decision space in this case is discrete, because the set of countries is finite. As a result, the comparison between elements can be performed in a finite number of steps.
Ehrgott (2005) highlighted the existence of three distinct notions of optimality for identifying the best elements, which in this case are countries: efficient elements, Pareto optimal elements, and non-dominated elements. In models with a continuous decision space, the three notions may differ; in the discrete case, however, they coincide completely. Henceforth, we will employ the term “Pareto optimal elements” to cover all three notions.
Order theory (Ehrgott 2005; Radziszewski – Szadkowski 2014) addresses Pareto optimality comparisons between elements. It is a branch of mathematics concerned with the study of binary relations between the elements of a given set. Two principal categories of ordering exist: partial and complete. In a complete ordering, every pair of elements in the set is comparable, so that it can always be determined which of two elements is preferred. If there are pairs of elements for which no preference can be established, i.e., the elements are incomparable, the ordering is only partial. In economics, a complete ordering is referred to as a “ranking”, since a definitive order can be established between all elements. A partial ordering, by contrast, arranges the elements of a set by relative preference without requiring every pair to be comparable. In microeconomic consumer theory, for instance, consumer baskets are often compared using partial ordering. This methodology can also be applied to socio-economic systems.
Fattore and Maggino (2014) employed the theory of partially ordered sets (poset) to investigate the issue of sociological poverty and social inequality. The primary objective of their research was to identify potential applications that could prove beneficial in the practical application of poset theory in the construction of socio-economic statistics and social indicators. The initial challenge arises from the assessment and comparison of multidimensional poverty, which gives rise to a ranking problem that lacks a clear definition. This is a particularly salient example in the domain of socio-economic analysis.
Annoni et al. (2017) and Beycan and Suter (2017) have continued to apply poset theory to the mapping of multidimensional poverty, with a particular focus on its regional aspect.
Fattore and Arcagni (2021) enumerated eight applications of poset theory, including the relocation of refugees in the EU and the comparison of fiscal policies. The economic application differs from other, primarily sociological applications (Bachtrögler et al. 2016; Badinger – Reuter 2015). It should be noted that the application of poset theory can encompass a broad range of social sciences. The remainder of the paper by Fattore and Arcagni (2021) provides a comprehensive overview of poset theory and the Hasse diagram technique.
The Hasse diagram is visualised using the DART software, which was commissioned by the European Union and is freely available. A comprehensive description of the software is provided by Manganaro et al. (2008).
3 Introduction to tiered DEA, poset and k-means clustering
The three methods utilised have previously been employed in the field of supplier selection (Dobos – Vörösmarty 2022a; 2022b; Dobos 2023). Bánhidi and Dobos (2023a) presented a mapping of the 27 countries of the European Union to groups and simultaneously to an ordinal scale. Apart from the different data set, the methodology presented here differs from previous applications in that it employs the k-means method rather than hierarchical clustering. The k-means method is more advantageous because it yields the cluster means, which can be ranked after weighting them with the dimension weights proposed by the European Commission (2021).
Before applying the TDEA method to the I-DESI data, we briefly describe the meaning of the term “tiered”. We expect such a procedure to map countries to an ordinal scale without producing a full ranking.
The term “tiered” refers to the fact that the same mathematical operation is applied sequentially, step by step. In our case, this operation is the search for DEA-efficient countries (DMUs). In each subsequent step, the efficient countries found are removed from the set of countries and the operation is applied to the remaining countries. The procedure is summarised in Table 2.
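Table 2. Tiered DEA block diagram
Step 1. Initialise: t = 1, D[1] = D (the full set of DMUs). |
Step 2. While D[t] is not empty, repeat steps 3 to 6. |
Step 3. Apply the DEA model to the DMUs in D[t] to identify the efficient set E*[t]. |
Step 4. Set I*[t] = D[t] − E*[t]. |
Step 5. Set t = t + 1. |
Step 6. Set D[t] = I*[t−1]. |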
Where t is the index of the tier, while E*[t] and I*[t] are the sets of efficient and inefficient DMUs in step t, related to the D[t] set.
Source: Barr et al. (2000).
Table 2 illustrates the application of the TDEA. The same algorithm can be applied to identify “Pareto-optimal” countries in a stepwise fashion. This involves first identifying the Pareto-optimal (non-dominated) countries among all countries and then, after removing them from the list, identifying the Pareto-optimal countries in the remaining set. These steps are repeated until the set of countries is empty.
In the DEA model (1)–(3), the vector yj (j = 1,2,…,p) contains the scores of the jth country on the digital dimensions.
In the first step, the DEA model is solved 45 times, once for each country included in the digital development analysis. Thus, p = 45 at first, and its value then decreases by the number of DEA-efficient countries filtered out in each step. This series of steps is repeated until no more countries remain. The procedure itself is also called “onion peeling” or simply the “peeling technique”.
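For orientation, a common output-only DEA formulation of the composite-indicator type, written in the notation used above, is sketched below. The weight vector u, the implicit unit (dummy) input, and the assignment of the numbers (1)–(3) to the individual lines are assumptions made for illustration; the model actually solved may differ in detail.

```latex
\begin{align}
\max_{u}\quad & u^{\top} y_j \tag{1}\\
\text{s.t.}\quad & u^{\top} y_i \le 1, \qquad i = 1, 2, \dots, p, \tag{2}\\
& u \ge 0. \tag{3}
\end{align}
```

Under this formulation, country j would be DEA-efficient in a given tier if the optimal objective value equals 1, and the model would be re-solved on the remaining countries in the next tier.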
The above stepwise algorithm is also implemented for Pareto-optimal countries. In this case, however, we do not solve a DEA model ((1)–(3)) by linear programming, but find all countries that are not dominated by any other country. Country j is dominated if there exists another country i such that yj ≤ yi, with strict inequality in at least one dimension. We can quickly identify the non-dominated countries using the DART software (Manganaro et al. 2008), which yields a Hasse diagram, presented in the next section.
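The stepwise removal of non-dominated countries can be sketched in a few lines of Python. This is a minimal illustration of the peeling idea only, not the DART implementation; the function names and the two-dimensional toy scores are hypothetical, and the layers it produces need not coincide with the levels of a Hasse diagram drawn by DART.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as large as b in every dimension and larger in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_peels(scores: Dict[str, Sequence[float]]) -> List[List[str]]:
    """Peel off successive non-dominated layers until no unit remains."""
    remaining = dict(scores)
    peels: List[List[str]] = []
    while remaining:
        # A unit belongs to the current layer if nothing left dominates it.
        layer = sorted(
            name
            for name, y in remaining.items()
            if not any(dominates(other, y) for other in remaining.values())
        )
        peels.append(layer)
        for name in layer:
            del remaining[name]
    return peels


# Hypothetical scores, invented for illustration only.
toy = {"A": (0.9, 0.4), "B": (0.5, 0.8), "C": (0.4, 0.3), "D": (0.2, 0.2)}
layers = pareto_peels(toy)
print(layers)
```

In the toy data, A and B are incomparable and form the first layer, while C and D are removed in the second and third steps.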
Finally, k-means cluster analysis is employed to assign countries to ordinal groups in terms of digital development. We selected the k-means clustering method because it identifies each group by its centre (Scitovski et al. 2021). The SPSS 28 software package also offers hierarchical clustering, but with that method the cluster centres would have to be determined separately. Once the cluster means are known, they can be weighted with the dimension weights provided by the European Commission (2021), and the clusters can then be placed in order. The groups of countries are thus ordered by the descending dimension-weighted values of their cluster means, rather than by the group numbers returned by the k-means procedure.
After describing the three methods, the resulting country groups are shown in the next section.
4 Using methods to group countries according to their digital development
We first apply the TDEA, then the Hasse diagram technique and finally the k-means cluster analysis to the data of the main dimensions of the I-DESI 2020.
4.1 Application of peeling techniques to identify efficient countries
The model (1)–(3) must be solved for each DMU, in our case for each country, in order to determine the efficiency indicators. The linear programming problem ((1)–(3)) can be solved with commercial software such as Microsoft Excel Solver. Throughout this paper, we use this software to solve our TDEA model.
Onion peeling or tiered DEA (TDEA) is a well-known method for determining which DMUs, in our case countries, may be at which efficiency level (Radziszewski – Szadkowski 2014). This method is similar to the Hasse diagram technique but does not necessarily identify all Pareto efficient DMUs.
The peeling technique is a sequential method. The initial step is to examine each DMU and ascertain which are efficient, that is, which attain the maximum DEA efficiency score of 1. These are then removed, and a further efficiency test is performed on the remaining DMUs. The DEA efficiency test is repeated until no DMUs remain.
The peeling technique yielded seven onion peels from the data. The computational steps and the DEA efficiencies obtained in each step are presented in Table 3. Table 4 provides an overview of the onion peels. The countries in the first peel have the highest digital development, and the level of development decreases with each subsequent peel. The least digitally developed countries are found in the last, seventh peel.
Table 3. Peeling algorithm results with DEA efficiencies
Country | Peel 1 | Peel 2 | Peel 3 | Peel 4 | Peel 5 | Peel 6 | Peel 7 |
Denmark | 1.000 | ||||||
South Korea | 1.000 | ||||||
United States | 1.000 | ||||||
Finland | 1.000 | ||||||
France | 1.000 | ||||||
Netherlands | 1.000 | ||||||
Iceland | 1.000 | ||||||
Japan | 1.000 | ||||||
Switzerland | 1.000 | ||||||
Australia | 0.934 | 1.000 | |||||
Estonia | 0.911 | 1.000 | |||||
Ireland | 0.869 | 1.000 | |||||
Israel | 0.946 | 1.000 | |||||
Malta | 0.934 | 1.000 | |||||
Norway | 0.985 | 1.000 | |||||
Sweden | 0.977 | 1.000 | |||||
Canada | 0.844 | 0.909 | 1.000 | ||||
Germany | 0.886 | 0.925 | 1.000 | ||||
Luxembourg | 0.922 | 0.990 | 1.000 | ||||
New Zealand | 0.846 | 0.920 | 1.000 | ||||
Spain | 0.845 | 0.922 | 1.000 | ||||
United Kingdom | 0.927 | 0.982 | 1.000 | ||||
Austria | 0.831 | 0.883 | 0.929 | 1.000 | |||
Belgium | 0.853 | 0.911 | 0.940 | 1.000 | |||
China | 0.777 | 0.849 | 0.967 | 1.000 | |||
Cyprus | 0.855 | 0.925 | 0.971 | 1.000 | |||
Lithuania | 0.849 | 0.908 | 0.941 | 1.000 | |||
Czech Republic | 0.822 | 0.878 | 0.911 | 0.985 | 1.000 | ||
Greece | 0.799 | 0.864 | 0.901 | 0.937 | 1.000 | ||
Latvia | 0.770 | 0.823 | 0.851 | 0.905 | 1.000 | ||
Romania | 0.749 | 0.797 | 0.827 | 0.917 | 1.000 | ||
Russia | 0.728 | 0.792 | 0.889 | 0.992 | 1.000 | ||
Slovenia | 0.801 | 0.857 | 0.885 | 0.963 | 1.000 | ||
Brazil | 0.663 | 0.727 | 0.821 | 0.875 | 0.945 | 1.000 | |
Bulgaria | 0.805 | 0.860 | 0.896 | 0.952 | 0.988 | 1.000 | |
Hungary | 0.739 | 0.793 | 0.821 | 0.874 | 0.935 | 1.000 | |
Italy | 0.792 | 0.850 | 0.881 | 0.937 | 0.982 | 1.000 | |
Mexico | 0.679 | 0.753 | 0.830 | 0.910 | 0.954 | 1.000 | |
Portugal | 0.774 | 0.833 | 0.866 | 0.928 | 0.957 | 1.000 | |
Serbia | 0.686 | 0.730 | 0.757 | 0.826 | 0.952 | 1.000 | |
Slovakia | 0.727 | 0.779 | 0.806 | 0.857 | 0.940 | 1.000 | |
Poland | 0.730 | 0.787 | 0.809 | 0.857 | 0.922 | 0.998 | 1.000 |
Croatia | 0.760 | 0.815 | 0.851 | 0.905 | 0.934 | 0.963 | 1.000 |
Turkey | 0.585 | 0.633 | 0.672 | 0.751 | 0.809 | 0.940 | 1.000 |
Chile | 0.707 | 0.759 | 0.791 | 0.841 | 0.869 | 0.916 | 1.000 |
Source: authors.
Table 4. Peeling of countries
Peel 1 (9 countries) | Denmark, Finland, France, Iceland, Japan, South Korea, Netherlands, Switzerland, United States |
Peel 2 (7 countries) | Australia, Estonia, Ireland, Israel, Malta, Norway, Sweden |
Peel 3 (6 countries) | Canada, Germany, Luxembourg, New Zealand, Spain, United Kingdom |
Peel 4 (5 countries) | Austria, Belgium, China, Cyprus, Lithuania |
Peel 5 (6 countries) | Czech Republic, Greece, Latvia, Romania, Russia, Slovenia |
Peel 6 (8 countries) | Brazil, Bulgaria, Hungary, Italy, Mexico, Portugal, Serbia, Slovakia |
Peel 7 (4 countries) | Chile, Croatia, Poland, Turkey |
Source: authors.
Among the Central and Eastern European countries, Estonia occupies a particularly advantageous position, being grouped together with more developed Western and Northern countries. Estonia has implemented an ambitious strategy and digital policy that has facilitated its development into a “digital nation”. One area in which Estonia has demonstrated particular strength is in the domain of digital public services, or e-government. The country is frequently cited as a model and potential source of best practice in this regard (Adeodato – Pournouri 2020; Espinosa – Pino 2024). The remaining Central and Eastern European countries are situated at the lower end of the scale, grouped together with Southern European and Latin American countries. This is an unsurprising outcome, reflecting their respective levels of economic development. However, it serves to illustrate that these countries may be perceived as laggards in terms of digital advancement, with scope for improvement.
4.2 Grouping of countries according to Pareto optimality (poset)
In order theory, the concept of a partially ordered set (poset) formalises the intuitive idea of ordering, sequencing, or arranging a set of DMUs. A poset consists of a set of DMUs and a binary ordering relation. For some pairs of DMUs the relation holds, so that one DMU precedes the other; for other pairs it does not hold in either direction, and the two DMUs are then not comparable. The concept of a partial order is thus a generalisation of the more familiar total (full) order, in which all pairs are related (Radziszewski – Szadkowski 2014).
The Hasse diagram obtained with the I-DESI data is shown in Fig. 1. The countries at Level 7 are those that are not dominated by any other country on the digital dimensions and are therefore (truly) “Pareto optimal”. Nine countries are Pareto optimal, the same number as were efficient in the first peel of the TDEA. Interestingly, the number of levels is also seven, as in the case of the TDEA. The Hasse diagram shows the level at which each country is placed. Note that the levels are numbered in the reverse order to the peels: in the peeling technique the most efficient countries form Peel 1, whereas in the Hasse diagram they occupy the highest level (Level 7).
Fig. 1. Pareto efficiency diagram between countries (Hasse diagram)
Source: authors, using the DART application.
Citation: Society and Economy 47, 1; 10.1556/204.2024.00012
The levels read from the Hasse diagram in Fig. 1 are listed in Table 5, after reversing the order of the levels and expanding the country codes.
Table 5. Levels of the countries in the Hasse diagram
Level 1 (9 countries) | Denmark, Finland, France, Iceland, Japan, South Korea, Netherlands, Switzerland, United States |
Level 2 (8 countries) | Australia, Ireland, Israel, Malta, Norway, Luxembourg, Sweden, United Kingdom |
Level 3 (5 countries) | Austria, Belgium, Estonia, Germany, Canada |
Level 4 (5 countries) | Cyprus, Lithuania, Spain, China, New Zealand |
Level 5 (7 countries) | Bulgaria, Czech Republic, Greece, Latvia, Romania, Slovenia, Russia |
Level 6 (9 countries) | Croatia, Hungary, Italy, Poland, Portugal, Slovakia, Brazil, Mexico, Serbia |
Level 7 (2 countries) | Chile, Turkey |
Source: authors.
Table 6. Cluster analysis (country groupings)
Cluster | Countries |
1 (5 countries) | Denmark, Finland, Netherlands, Norway, United States |
5 (7 countries) | Austria, Belgium, Czech Republic, Hungary, Latvia, Portugal, Slovenia |
6 (14 countries) | Brazil, Bulgaria, China, Cyprus, Greece, Italy, Lithuania, Malta, Mexico, Poland, Romania, Russia, Serbia, Spain |
2 (3 countries) | Iceland, Sweden, Switzerland |
3 (7 countries) | Canada, Germany, Ireland, Israel, Japan, Luxembourg, United Kingdom |
4 (5 countries) | Australia, Estonia, France, New Zealand, South Korea |
7 (4 countries) | Chile, Croatia, Slovakia, Turkey |
Source: authors.
The maximal elements, which are not dominated by any other country, are the top-level countries (i.e., Level 7 in Fig. 1 and Level 1 in Table 5) in Northern and Western Europe, the Far East, and the United States. At the lowest level are Chile and Turkey, which do not dominate any other country in the main dimensions. Even at the penultimate level, among the countries with lower digital development (including CEE countries), there are countries that dominate Chile and Turkey.
In comparison with the outcomes of the peeling algorithm, Estonia remains a prominent performer, situated alongside Western countries, but its distinction is less pronounced, given that the four levels on which the CEE countries are positioned are contiguous, from Level 3 to Level 6. This suggests that these countries demonstrate a comparable level of development, which is notably lower than that of the leading countries (exclusively Western, Nordic and Far Eastern countries) in terms of digitalisation. This likely reflects the disparities in the respective strengths of their economies.
4.3 K-means clustering
K-means clustering is a multivariate statistical method that allows for the clustering of objects. In this case, the number of clusters was set to equal the number of onion skins and Pareto efficiency levels, which resulted in the formation of seven clusters.
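The principle of k-means clustering can be illustrated with a plain Lloyd-type iteration. The sketch below is not the SPSS 28 procedure, whose choice of initial centres and stopping rule may differ; the function name, the fixed initial centres and the toy points are assumptions made purely for illustration.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_means(points: Sequence[Point], centres: Sequence[Point],
            max_iter: int = 100) -> Tuple[List[int], List[Point]]:
    """Alternate between assigning points to the nearest centre and
    recomputing each centre as the mean of its members."""
    centres = [tuple(c) for c in centres]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for every point.
        labels = [min(range(len(centres)), key=lambda k: dist(p, centres[k]))
                  for p in points]
        # Update step: mean of the members; an empty cluster keeps its centre.
        new_centres = []
        for k, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(old)
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Hypothetical two-dimensional data with two obvious groups.
toy_points = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
toy_labels, toy_centres = k_means(toy_points, [toy_points[0], toy_points[2]])
print(toy_labels, toy_centres)
```

With seven clusters and the five I-DESI dimensions, the same two steps would be applied to 45 five-dimensional points; the resulting cluster numbers are arbitrary labels and carry no ordinal meaning by themselves.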
The k-means algorithm of the SPSS 28 statistical software yielded the means of the seven clusters along the five dimensions presented in Table 7. The weights proposed by the European Commission for each digital dimension were incorporated into the table, thereby weighting the group means to obtain the digital development indicator (level) associated with each cluster. This indicator can be employed to order (rank) the clusters.
Table 7. Group means of clusters (digital dimensions)
Cluster | Connectivity | Human Capital | Citizen Use of Internet | Integration of Digital Technology | Digital Public Services | Weighted mean (digital readiness) |
1 | 0.69 | 0.58 | 0.68 | 0.73 | 0.78 | 0.683 |
2 | 0.66 | 0.48 | 0.50 | 0.45 | 0.81 | 0.572 |
3 | 0.60 | 0.41 | 0.46 | 0.32 | 0.55 | 0.468 |
4 | 0.54 | 0.35 | 0.35 | 0.16 | 0.53 | 0.387 |
5 | 0.66 | 0.54 | 0.67 | 0.77 | 0.50 | 0.630 |
6 | 0.65 | 0.47 | 0.56 | 0.60 | 0.63 | 0.579 |
7 | 0.54 | 0.27 | 0.38 | 0.32 | 0.38 | 0.381 |
Weights | 0.25 | 0.25 | 0.15 | 0.20 | 0.15 |
Source: authors.
Table 7 shows the weighted cluster means, on the basis of which the clusters can be arranged in descending order on the ordinal scale presented in Table 8. In Table 6, the clusters are still identified by the cluster numbers assigned by the k-means procedure.
Table 8. Group means of clusters
Cluster | Weighted group mean | Rank |
1 | 0.683 | 1 |
5 | 0.630 | 2 |
6 | 0.579 | 3 |
2 | 0.572 | 4 |
3 | 0.468 | 5 |
4 | 0.387 | 6 |
7 | 0.381 | 7 |
Source: authors.
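The step from Table 7 to Table 8 can be retraced with a short sketch. It assumes that the digital readiness level of a cluster is the weighted sum of its five dimension means, using the rounded values printed in Table 7, so the recomputed levels may differ from the published ones in the third decimal. The variable names are illustrative only.

```python
# Weights proposed by the European Commission, in the column order of Table 7.
WEIGHTS = [0.25, 0.25, 0.15, 0.20, 0.15]

# Cluster means as printed in Table 7 (rounded to two decimals).
CLUSTER_MEANS = {
    1: [0.69, 0.58, 0.68, 0.73, 0.78],
    2: [0.66, 0.48, 0.50, 0.45, 0.81],
    3: [0.60, 0.41, 0.46, 0.32, 0.55],
    4: [0.54, 0.35, 0.35, 0.16, 0.53],
    5: [0.66, 0.54, 0.67, 0.77, 0.50],
    6: [0.65, 0.47, 0.56, 0.60, 0.63],
    7: [0.54, 0.27, 0.38, 0.32, 0.38],
}

# Weighted mean per cluster: sum of weight * dimension mean.
weighted = {
    cluster: sum(w * x for w, x in zip(WEIGHTS, means))
    for cluster, means in CLUSTER_MEANS.items()
}

# Order the clusters from the highest to the lowest weighted mean.
ranking = sorted(weighted, key=weighted.get, reverse=True)

for rank, cluster in enumerate(ranking, start=1):
    print(f"rank {rank}: cluster {cluster} ({weighted[cluster]:.4f})")

# ranking -> [1, 5, 6, 2, 3, 4, 7], the order shown in Table 8
```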
As can be observed, the highest-performing countries, which constitute the first cluster, are exclusively Western and Nordic. Two additional groups consist exclusively of Western and Northern European or Asian countries. As was the case in the previous groupings, Estonia is grouped with relatively advanced non-CEE countries. The other three clusters include relatively less advanced CEE, Southern European, and Latin American countries.
The clustering result can be compared with that of Bánhidi and Dobos (2023a); only EU countries can be included in this comparison. Since group membership is measured on a nominal scale, Cramér's V can be used as the measure of association. Its value in our case is 0.469, which indicates a moderate association. The chi-square value is 0.474, which indicates that there is indeed a connection between the two groupings obtained by clustering.
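For reference, Cramér's V is derived from the chi-square statistic of the contingency table of the two groupings, V = sqrt(chi² / (n · (min(r, c) − 1))), where n is the number of objects and r and c are the numbers of rows and columns. The sketch below illustrates the calculation; the contingency table in it is hypothetical and is not the cross-tabulation underlying the value reported above, and `cramers_v` is an illustrative helper name.

```python
import math


def cramers_v(table):
    """Return (V, chi2) for a contingency table given as a list of rows.

    Assumes every row and every column contains at least one observation.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square: squared deviations from the expected counts
    # under independence, scaled by the expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)), chi2


# Hypothetical cross-tabulation of two groupings of the same 27 objects
# (rows: groups in grouping A, columns: groups in grouping B).
example = [
    [6, 2, 1],
    [2, 5, 2],
    [1, 2, 6],
]
v, chi2 = cramers_v(example)
print(f"chi-square = {chi2:.3f}, Cramér's V = {v:.3f}")
```

V ranges from 0 (no association) to 1 (one grouping fully determines the other), which is what makes it usable for nominal group memberships whose labels carry no order.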
Following the presentation of the results, a correlation analysis is employed to compare the results obtained with the three methods.
4.4 Comparison of onion skins, Pareto-efficient levels, and ordered clusters
Seven groups were established based on the digital development of countries using all three methods. The objective was to determine the extent to which the three methods yielded disparate solutions. To this end, Kendall's tau-b and Spearman's rho rank correlation coefficients, which are suitable for comparing ordinal variables, were calculated (Table 9).
Table 9. Kendall's tau-b and Spearman's rho correlations between the results of the methods
Coefficient | Method | Statistic | TDEA | Hasse diagram | K-means |
Kendall's tau-b | TDEA | Correlation coefficient | 1.000 | 0.939** | 0.703** |
Kendall's tau-b | TDEA | Significance (2-sided) | | 0.000 | 0.000 |
Kendall's tau-b | TDEA | N | 45 | 45 | 45 |
Kendall's tau-b | Hasse diagram | Correlation coefficient | 0.939** | 1.000 | 0.716** |
Kendall's tau-b | Hasse diagram | Significance (2-sided) | 0.000 | | 0.000 |
Kendall's tau-b | Hasse diagram | N | 45 | 45 | 45 |
Kendall's tau-b | K-means | Correlation coefficient | 0.703** | 0.716** | 1.000 |
Kendall's tau-b | K-means | Significance (2-sided) | 0.000 | 0.000 | |
Kendall's tau-b | K-means | N | 45 | 45 | 45 |
Spearman's rho | TDEA | Correlation coefficient | 1.000 | 0.977** | 0.821** |
Spearman's rho | TDEA | Significance (2-sided) | | 0.000 | 0.000 |
Spearman's rho | TDEA | N | 45 | 45 | 45 |
Spearman's rho | Hasse diagram | Correlation coefficient | 0.977** | 1.000 | 0.831** |
Spearman's rho | Hasse diagram | Significance (2-sided) | 0.000 | | 0.000 |
Spearman's rho | Hasse diagram | N | 45 | 45 | 45 |
Spearman's rho | K-means | Correlation coefficient | 0.821** | 0.831** | 1.000 |
Spearman's rho | K-means | Significance (2-sided) | 0.000 | 0.000 | |
Spearman's rho | K-means | N | 45 | 45 | 45 |
** Correlation is significant at the 0.01 level (2-sided).
Source: authors.
The tiered DEA and Hasse diagram levels are very strongly associated under both rank correlation coefficients (Kendall's tau-b = 0.939, Spearman's rho = 0.977), indicating that either method alone is sufficient, as they yield nearly identical groupings. K-means clustering with weighted group means yields somewhat different results (tau-b between 0.703 and 0.716, rho between 0.821 and 0.831), yet its association with the other two orderings remains strong and significant at the 1% level.
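How the two coefficients treat the many ties that arise when 45 countries are assigned to seven levels can be seen in the following sketch. It is a plain-Python illustration under stated assumptions, not the SPSS routine used for Table 9: tau-b is computed from pair counts with the usual tie correction, rho as the Pearson correlation of average ranks, and the two level vectors are hypothetical rather than the paper's data.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty))."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pairs tied on both variables drop out entirely
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = math.sqrt((pairs + tied_x_only) * (pairs + tied_y_only))
    return (concordant - discordant) / denom


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical level assignments of ten objects by two grouping methods.
levels_a = [1, 1, 2, 3, 3, 4, 5, 6, 7, 7]
levels_b = [1, 2, 2, 3, 4, 4, 5, 6, 6, 7]
print(f"tau-b = {kendall_tau_b(levels_a, levels_b):.3f}")
print(f"rho   = {spearman_rho(levels_a, levels_b):.3f}")
```

Both coefficients depend only on the order of the levels, not on their numerical spacing, so they measure monotonic rather than linear agreement between the groupings.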
5 Conclusions
In our study, we grouped the countries of the European Union and their main international competitors along the five principal dimensions of the 2020 edition of the International Digital Economy and Society Index (I-DESI), using objective methods of decision theory and multivariate statistics (tiered DEA, poset, k-means clustering). With each of the three methods, the 45 countries in the database were divided into seven groups according to their digital development. The similarity of the country groups was then compared using Spearman's rho and Kendall's tau-b correlation coefficients. The results demonstrated that, for the I-DESI database, all three grouping methods, and especially stepwise Pareto efficiency and TDEA, resulted in very similar country groups. The most developed countries of Northern and Western Europe, the United States, and the Far East are clearly distinguished from, and ahead of, the less developed countries of Central and Eastern Europe, Southern Europe and Latin America in all groupings.
Hungary's position is most favourable in the k-means clustering, which places Hungary in a cluster with the relatively advanced countries of Western, Central-Eastern, and Southern Europe. In contrast, the other two methods place Hungary in the second least developed group, which includes Eastern and Southern European countries and Central and South American countries. The latter classification appears more realistic in that Hungary scored below the average level of development of the countries in the database on all I-DESI dimensions and performed particularly poorly on the digital public services dimension.
Hungary's performance is neither unexpected nor particularly poor when benchmarked against other Central and Eastern European countries with analogous historical and economic backgrounds, such as the other three members of the V4 group (Slovakia, the Czech Republic and Poland). Nevertheless, the performance of select CEE countries, particularly Estonia, illustrates that, with an effective digital development strategy and policies, Hungary could potentially rank among the top performers or at least in the second tier. When the three grouping methods are considered together, the Czech Republic exhibits the highest level of digital development among the V4 countries. However, this advantage is not particularly pronounced: in the k-means clustering, the Czech Republic is on a par with Hungary, while in the other two methods it is positioned one level above Hungary and Slovakia.
References
Adeodato, R. – Pournouri, S. (2020): Secure Implementation of E-Governance: A Case Study about Estonia. In: Jahankhani, H. – Kendzierskyj, S. – Chelvachandran, N. – Ibarra, J. (eds): Cyber Defence in the Age of AI, Smart Societies and Augmented Humanity. Cham: Springer, pp. 397–429. https://doi.org/10.1007/978-3-030-35746-7_18.
Annoni, P. – Bruggemann, R. – Carlsen, L. (2017): Peculiarities in Multidimensional Regional Poverty. In: Fattore, M. – Bruggemann, R. (eds): Partial Order Concepts in Applied Sciences. Cham: Springer. https://doi.org/10.1007/978-3-319-45421-4_8.
Bachtrögler, J. – Badinger, H. – de Clairfontaine, A. F. – Reuter, W. H. (2016): Summarizing Data Using Partially Ordered Set Theory: An Application to Fiscal Frameworks in 97 Countries. Statistical Journal of the IAOS 32(3): 383–402. https://doi.org/10.3233/SJI-160973.
Badinger, H. – Reuter, W. H. (2015): Measurement of Fiscal Rules: Introducing the Application of Partially Ordered Set (Poset) Theory. Journal of Macroeconomics 43: 108–123. https://doi.org/10.1016/j.jmacro.2014.09.005.
Bánhidi, Z. – Dobos, I. (2023a): Measurement of Digital Development with Partial Orders, Tiered DEA, and Cluster Analysis for the European Union. International Review of Applied Sciences and Engineering 14(3): 392–401. https://doi.org/10.1556/1848.2023.00612.
Bánhidi, Z. – Dobos, I. (2023b): Országrangsorolás a nemzetközi digitális gazdaság és társadalom index 2020-as adatai alapján, DEA- és TOPSIS-módszerrel. Területi Statisztika 63(4): 515–532. https://doi.org/10.15196/TS630405.
Bánhidi, Z. – Dobos, I. – Nemeslaki, A. (2020): What the Overall Digital Economy and Society Index Reveals: A Statistical Analysis of the DESI EU28 Dimensions. Regional Statistics 10(2): 42–62. https://doi.org/10.15196/RS100209.
Bánhidi, Z. – Dobos, I. – Tokmergenova, M. (2021): Russia’s Place vis-à-vis the EU28 Countries in Digital Development: A Ranking Using DEA-type Composite Indicators and the TOPSIS Method. In: Herberger, T. A. – Dötsch, J. J. (eds): Digitalization, Digital Transformation and Sustainability in the Global Economy. Cham: Springer. https://doi.org/10.1007/978-3-030-77340-3_11.
Barr, R. S. – Durchholz, M. L. – Seiford, L. (2000): Peeling the DEA Onion: Layering and Rank-Ordering DMUs Using Tiered DEA. Southern Methodist University Technical Report 5(1-24).
Beycan, T. – Suter, C. (2017): Application of Partial Order Theory to Multidimensional Poverty Analysis in Switzerland. In: Fattore, M. – Bruggemann, R. (eds): Partial Order Concepts in Applied Sciences. Cham: Springer. https://doi.org/10.1007/978-3-319-45421-4_9.
Bougnol, M. L. – Dulá, J. H. (2006): Validating DEA as a Ranking Tool: An Application of DEA to Assess Performance in Higher Education. Annals of Operations Research 145: 339–365. https://doi.org/10.1007/s10479-006-0039-2.
Charnes, A. – Cooper, W. W. – Rhodes, E. (1978): Measuring the Efficiency of Decision Making Units. European Journal of Operational Research 2(6): 429–444.
Cheon, S. (2009): Impact of Global Terminal Operators on Port Efficiency: A Tiered Data Envelopment Analysis Approach. International Journal of Logistics: Research and Applications 12(2): 85–101. https://doi.org/10.1080/13675560902749324.
Den, M. – Nah, H. S. – Shin, C. H. (2016): An Empirical Study on the Efficiency of Container Terminals in Russian and Korean Ports Using DEA Models. Journal of Navigation and Port Research 40(5): 317–328. https://doi.org/10.5394/KINPR.2016.40.5.317.
Dobos, I. (2023): Tiered Data Envelopment as a Method for Clustering Suppliers. Croatian Review of Operational Research 14(2): 99–110. https://doi.org/10.17535/crorr.2023.0009.
Dobos, I. – Vörösmarty, G. (2022a): Módszerek a beszállítói csoportképzéshez. SZIGMA Matematikai-közgazdasági folyóirat 53(2): 183–197. https://journals.lib.pte.hu/index.php/szigma/article/view/6030.
Dobos, I. – Vörösmarty, G. (2022b): Supplier Segmentation with Partial Orders, Tiered DEA and Cluster Analysis. SSRN Working Paper, https://doi.org/10.2139/ssrn.4059392.
Ehrgott, M. (2005): Multicriteria Optimization. Springer Science & Business Media.
Espinosa, V. I. – Pino, A. (2024): E-Government as a Development Strategy: The Case of Estonia. International Journal of Public Administration 1–14. https://doi.org/10.1080/01900692.2024.2316128.
Esses, D. – Csete, M. S. – Németh, B. (2021): Sustainability and Digital Transformation in the Visegrad Group of Central European Countries. Sustainability 13(11): 5833. https://doi.org/10.3390/su13115833.
European Commission (2021): International Digital Economy and Society Index 2020. SMART 2019/0087. A study prepared for the European Commission DG Communications Networks, Content & Technology by Tech4i2. https://ec.europa.eu/newsroom/dae/redirection/document/72352.
Fattore, M. – Arcagni, A. (2021): Posetic Tools in the Social Sciences: A Tutorial Exposition. In: Bruggemann, R. – Carlsen, L. – Beycan, T. – Suter, C. – Maggino, F. (eds): Measuring and Understanding Complex Phenomena: Indicators and Their Analysis in Different Scientific Fields. Cham: Springer. https://doi.org/10.1007/978-3-030-59683-5_15.
Fattore, M. – Maggino, F. (2014): Partial Orders in Socio-Economics: A Practical Challenge for Poset Theorists or a Cultural Challenge for Social Scientists? In: Brüggemann, R. – Carlsen, L. – Wittmann, J. (eds): Multi-indicator Systems and Modelling in Partial Order. New York: Springer. https://doi.org/10.1007/978-1-4614-8223-9_9.
Johnes, J. (2018): University Rankings: What Do They Really Show? Scientometrics 115(1): 585–606. https://doi.org/10.1007/s11192-018-2666-1.
Kovács, T. Z. – Bittner, B. – Huzsvai, L. – Nábrádi, A. (2022): Convergence and the Matthew Effect in the European Union Based on the DESI Index. Mathematics 10(4): 613. https://doi.org/10.3390/math10040613.
Laitsou, E. – Kargas, A. – Varoutas, D. (2020): Digital Competitiveness in the European Union Era: The Greek Case. Economies 8(4): 85. https://doi.org/10.3390/economies8040085.
Manganaro, A. – Ballabio, D. – Consonni, V. – Mauri, A. – Pavan, M. – Todeschini, R. (2008): The DART (Decision Analysis by Ranking Techniques) Software. Data Handling in Science and Technology 27: 193–207. https://doi.org/10.1016/S0922-3487(08)10009-0.
Olczyk, M. – Kuc-Czarnecka, M. (2022): Digital Transformation and Economic Growth-DESI Improvement and Implementation. Technological and Economic Development of Economy 28: 775–803. https://doi.org/10.3846/tede.2022.16766.
Radziszewski, B. – Szadkowski, A. (2014): Ranking with Data Envelopment Analysis vs. Partial Order. Open Access Library PrePrints 1: e078.
Scitovski, R. – Sabo, K. – Martínez-Álvarez, F. – Ungar, Š. (2021): Cluster Analysis and Applications. Cham: Springer.
Tarjáni, A. J. – Kalló, N. – Dobos, I. (2022): A nemzetközi digitális gazdaság és társadalom index 2020. évi adatainak statisztikai elemzése. Statisztikai Szemle 100(3): 266–284. https://doi.org/10.20311/stat2022.3.hu0266.
Tokmergenova, M. – Bánhidi, Z. – Dobos, I. (2021): Analysis of I-DESI Dimensions of the Digital Economy Development of the Russian Federation and EU-28 Using Multivariate Statistics. Вестник Санкт-Петербургского университета. Экономика 37(2): 189–204. https://doi.org/10.21638/spbu05.2021.201.
Török, L. (2024): The Relationship between Digital Development and Economic Growth in the European Union. International Review of Applied Sciences and Engineering, Online First. https://doi.org/10.1556/1848.2024.00797.
Appendix
Cross-sectional data set (I-DESI 2020 dimensions, for the year 2018)
Country | Connectivity | Human Capital | Citizen Use of Internet | Integration of Digital Technology | Digital Public Services | I-DESI |
Weights | 0.25 | 0.25 | 0.15 | 0.20 | 0.15 | |
EU average | 0.62 | 0.42 | 0.47 | 0.41 | 0.56 | 0.50 |
Austria | 0.60 | 0.50 | 0.48 | 0.43 | 0.57 | 0.52 |
Belgium | 0.63 | 0.33 | 0.55 | 0.51 | 0.43 | 0.49 |
Bulgaria | 0.60 | 0.37 | 0.27 | 0.22 | 0.49 | 0.40 |
Cyprus | 0.63 | 0.41 | 0.50 | 0.20 | 0.64 | 0.47 |
Czech Republic | 0.61 | 0.40 | 0.45 | 0.42 | 0.48 | 0.47 |
Denmark | 0.73 | 0.58 | 0.74 | 0.66 | 0.83 | 0.70 |
Estonia | 0.63 | 0.49 | 0.52 | 0.49 | 0.77 | 0.57 |
Finland | 0.70 | 0.60 | 0.58 | 0.80 | 0.74 | 0.68 |
France | 0.67 | 0.50 | 0.41 | 0.46 | 0.86 | 0.57 |
Greece | 0.59 | 0.35 | 0.36 | 0.13 | 0.59 | 0.40 |
Netherlands | 0.64 | 0.57 | 0.65 | 0.83 | 0.77 | 0.68 |
Croatia | 0.57 | 0.27 | 0.30 | 0.27 | 0.26 | 0.35 |
Ireland | 0.61 | 0.57 | 0.51 | 0.61 | 0.69 | 0.60 |
Poland | 0.54 | 0.30 | 0.36 | 0.11 | 0.52 | 0.36 |
Latvia | 0.57 | 0.27 | 0.48 | 0.38 | 0.36 | 0.41 |
Lithuania | 0.63 | 0.41 | 0.49 | 0.23 | 0.38 | 0.44 |
Luxembourg | 0.66 | 0.57 | 0.65 | 0.63 | 0.59 | 0.62 |
Hungary | 0.55 | 0.31 | 0.43 | 0.38 | 0.37 | 0.41 |
Malta | 0.70 | 0.39 | 0.39 | 0.31 | 0.57 | 0.48 |
Germany | 0.63 | 0.50 | 0.54 | 0.67 | 0.54 | 0.58 |
Italy | 0.59 | 0.27 | 0.34 | 0.19 | 0.52 | 0.38 |
Portugal | 0.58 | 0.24 | 0.37 | 0.39 | 0.47 | 0.41 |
Romania | 0.55 | 0.41 | 0.46 | 0.18 | 0.48 | 0.42 |
Spain | 0.60 | 0.39 | 0.43 | 0.24 | 0.71 | 0.47 |
Sweden | 0.69 | 0.60 | 0.64 | 0.73 | 0.57 | 0.65 |
Slovakia | 0.54 | 0.29 | 0.44 | 0.27 | 0.41 | 0.39 |
Slovenia | 0.59 | 0.42 | 0.39 | 0.39 | 0.53 | 0.47 |
Non-EU average | 0.59 | 0.43 | 0.52 | 0.46 | 0.60 | 0.52 |
Australia | 0.65 | 0.57 | 0.52 | 0.50 | 0.77 | 0.60 |
Brazil | 0.46 | 0.36 | 0.37 | 0.10 | 0.56 | 0.37 |
Chile | 0.53 | 0.29 | 0.25 | 0.29 | 0.35 | 0.35 |
South Korea | 0.69 | 0.37 | 0.54 | 0.35 | 0.85 | 0.54 |
United States | 0.70 | 0.66 | 0.68 | 0.73 | 0.81 | 0.71 |
United Kingdom | 0.67 | 0.43 | 0.61 | 0.65 | 0.64 | 0.59 |
Iceland | 0.72 | 0.51 | 0.75 | 0.71 | 0.38 | 0.62 |
Israel | 0.55 | 0.47 | 0.64 | 0.76 | 0.54 | 0.58 |
Japan | 0.75 | 0.42 | 0.52 | 0.58 | 0.60 | 0.57 |
Canada | 0.60 | 0.37 | 0.62 | 0.56 | 0.70 | 0.55 |
China | 0.56 | 0.47 | 0.46 | 0.21 | 0.63 | 0.46 |
Mexico | 0.45 | 0.34 | 0.32 | 0.19 | 0.58 | 0.37 |
Norway | 0.67 | 0.47 | 0.73 | 0.64 | 0.77 | 0.64 |
Russia | 0.46 | 0.37 | 0.48 | 0.28 | 0.61 | 0.43 |
Switzerland | 0.69 | 0.56 | 0.64 | 0.86 | 0.50 | 0.66 |
Serbia | 0.50 | 0.40 | 0.32 | 0.18 | 0.46 | 0.38 |
Turkey | 0.43 | 0.23 | 0.37 | 0.24 | 0.45 | 0.34 |
New Zealand | 0.62 | 0.46 | 0.49 | 0.49 | 0.67 | 0.54 |
Source: European Commission (2021).