Where is victory most certain? The level of luck-based noise factor in Summer Olympic Games

We describe a statistical approach for measuring the newly defined luck-based noise factor in sports. It is defined as the difference between the actual outcome and the expected outcome based on model predictions. We ask whether some sports exhibit a higher noise factor than others, making investments in those sports riskier. Data from 14 individual sports in six Summer Olympic Games between 1996 and 2016 were included in the analysis. Market shares are predicted by autoregressive linear and zero-inflated beta regression models with exogenous variables, where a higher Normalised Mean Squared Error indicates a higher noise factor. Modern pentathlon, tennis and cycling showed the highest noise factors, whereas swimming, table tennis and athletics were the least noisy. Possible reasons are discussed in the paper. Our analysis indicates that countries with suitable resources for producing leading elite Olympic athletes are predicted to achieve higher success in sports with a lower noise factor, such as swimming. In contrast, investments in noisy sports, such as modern pentathlon, are associated with a higher risk.


INTRODUCTION
Sports, and especially elite sports, have come within the scope of economic thinking in recent decades. The increasing attention is mainly due to the growing amount of money flowing into sports; this financial boost has enabled researchers to analyse sporting success from an economic point of view as well. Hosting a mega sport event, such as the Olympic Games, can be analysed not only from the perspective of professional advantage but also from direct and indirect financial perspectives. The phenomenon of home advantage, also known as the 'host effect', is a long-researched area, with extensive empirical evidence that athletes of the organising country tend to perform better in some sports (Bernard – Busse 2004; Forrest et al. 2010, 2017). Sport-related economic research has also established empirically the direct economic impacts of hosting a mega sport event (Sterken 2006; Langer et al. 2018) and its indirect impacts, such as an improved country image or other economic benefits beyond the professional advantage (Gibson et al. 2008; Hahm et al. 2018; Tasci et al. 2019).
Sport event hosting is not the only field where sport and economy are linked; there are also rational considerations in investing in elite sports from a governmental point of view. Elite sports, defined as the level of sport that meets the International Olympic Committee (IOC) criteria, tend to generate international prestige for the nation, increase participation in leisure sport and promote a 'feel-good' factor in society (Grix – Carmichael 2012). As a result, more and more countries are investing vast amounts of money in sports, especially elite sports, with the goal of making themselves more competitive than their counterparts, that is, of winning as many medals at international sports competitions as possible (De Bosscher et al. 2006). The boost in elite sport funding has increased the difficulty of winning medals as well as the 'price of success' at the Olympics, because competition for the medals keeps getting fiercer, while the number of medals has barely changed (De Bosscher et al. 2008). This phenomenon has led to a rearrangement in the systems of sport funding, because a country that aims to remain successful needs to increase its elite sport funding continually, as merely maintaining it at the same level has become insufficient (Gulyás 2016).
This phenomenon is widening the gap between successful and unsuccessful countries even further, even though competition at the Olympics has already been very unbalanced (Bernard – Busse 2004; Andreff – Andreff 2015). As many studies have shown, countries that are poorer or have worse socio-economic indicators have extremely slim chances of increasing their success (Forrest et al. 2017; Kovács et al. 2017), and the pressure to increase funding makes their situation even more dire; this is mostly due to social and economic inequalities, as GDP per capita and population determine success by more than 50% (Bernard – Busse 2004; De Bosscher et al. 2008). However, in the past few decades the correlation between the two most important socio-economic indicators (population and GDP) and Olympic success has become less prominent, which means that other factors must have become more important (De Bosscher et al. 2015; Kovács et al. 2017). According to Shibli et al. (2012), macro-level variables have become less important because 'elite sport systems' took their place: through these systems the policymakers of a country can have a real impact on winning medals, whereas macroeconomic variables cannot be controlled in the short term. The fact that sport policies can be modified by policymakers, even in the short term, highlights the importance of meso-level factors (Gulyás et al. 2016). Therefore, the success of an athlete or a team depends more and more on the efficiency of the elite sport system of its country, that is, on how efficiently that country utilises the resources at its disposal (De Bosscher et al. 2006; Kovács et al. 2017; Weber et al. 2018).
Practically speaking, increasing resource use efficiency means maximising the number of medals won (or market share, in economic terms) in proportion to the resources at a country's disposal (Sterbenz et al. 2017). Countries with more modest economic backgrounds need to find the sports in which they have a realistic chance of winning medals. One way of achieving this is by segmenting sports based on their economic weight and identifying the costs involved (Forrest et al. 2017). However, if we consider elite sport funding as a form of investment, weighing the expenses of different sports is insufficient in itself: in order to maximise success, it is crucial to be aware of the risks involved in these sports. We should compare these risks, as this best shows the differences between the returns on funding each sport, which essentially reflect how difficult it is to win medals in different sports (Csurilla – Sterbenz 2018).
In sports, the most important source of risk is luck, which is the sum of those external factors that have an impact on the outcome of the competition, yet over which the athlete has no control (Sterbenz et al. 2014; Csurilla – Sterbenz 2018). According to Mauboussin (2012), luck, skill and the combination of the two determine almost everything in almost every aspect of life; therefore, sports can be compared based on how much of a role luck plays in them. However, the methods used to quantify luck (Mauboussin 2012; Getty et al. 2018; Gilbert – Wells 2019) are hard to apply to Olympic sports. Most of these studies use the expected outcomes of a league to predict the probability of winning for a team in a season, and quantify luck based on the standard deviation of the winning probability. The Olympics, however, offer far fewer observations than a league season, which makes these approaches hard to apply. Furthermore, due to the different number of medals and the different quantification systems and tournament formats at the Olympic Games, luck in itself cannot be measured in a standardised way across various sports.
Instead of luck, we use the definition of noise, which is also connected to risk and can be measured better than luck (Sterbenz et al. 2014; Csurilla et al. 2019). According to Lazear (2007), noise in the economic sense manifests itself for the most part in uncertainty in connection with production and with measurement errors of performance. During the production process, noise occurs when highly skilled workers produce low output despite their reasonable efforts. In this case, the noise has a negative effect on performance. Measurement errors of performance, on the other hand, occur when workers put in considerable effort yet are judged as mediocre or, on the contrary, when their performance is low yet is perceived to be much higher. In sports, noise comprises all the factors that make the outcome of competitions unpredictable; in essence, it is the difference between the results expected from athletes based on their skills and the outcome of the competition (Sterbenz et al. 2014). This means that noise includes not only luck but also other factors that athletes have no control over; these differ from sport to sport. A good example is the tournament format or the number of medals obtainable by an athlete in a given sport. While a round-robin tournament tends to reduce the level of noise, as all contestants have the chance to meet all other contestants, a straight knockout tournament increases the level of noise, as a single defeat can end the tournament for an athlete. The case is similar for the number of medals obtainable by a single athlete: the more medals one person can win, the lower the level of noise. The explanation is simple in the language of statistics: increasing the sample size reduces the standard error. Sports where the probable winner always wins against their 'underdog' opponent have no noise, as results align perfectly with the expected results. In every other case, the level of noise depends on how often, and by how much, an underdog can prevail over their probable-winner opponent.
When it comes to the Olympics, it is hard to find any athlete who was able to compete at the highest level for more than two Olympic Games, which makes studying noise at an individual level extremely difficult. Therefore, we based our study on the hypotheses that competition at the Olympics is very unbalanced (Andreff – Andreff 2015) and that the less successful countries have little leverage to increase their success (Kovács et al. 2017). We also know that the successful countries have 'elite sport systems' that produce the supply of athletes in given sports (De Bosscher et al. 2008). This means that noise can be measured not only at the individual level but at the country level as well: those sports where a given country can remain successful will have a low level of noise, as knowledge and skills correlate with performance. Those sports where no one can perform steadily will have a high level of noise. Our main concept is to form expectations on the results of each country based on its long-term Olympic performance. The extent to which the actual results deviate from the expectations can serve as a measure of noise. This concept can be formalised as the residual variance of a regression model, where the results of the previous Olympics are included as explanatory variables.
Our research serves multiple purposes. First, we want to develop a method which determines the level of the luck-based noise factor at the Summer Olympic Games in individual sports, as no studies have been published on this topic so far. The competitive balance at the Olympics has a broad literature background; however, it has been examined from the perspective of the uncertainty of outcome hypothesis (Truyens et al. 2016; Weber et al. 2016; Zheng et al. 2019). We estimated several regression model specifications to find the best model for long-term country performance, then predicted the market shares of each country and calculated the noise factor for each sport as the prediction error of the corresponding model. Second, we aimed to identify those sports that could be feasible fields for sport investments for those countries that are less successful in the Olympics. At high noise levels, external factors can have a significant impact on the outcomes of competitions; therefore, because of the level of risk, increasing the amount of funds does not necessarily result in more medals won. In accordance with the international literature, the concept of 'elite sport' has been used in the paper.

Description of the data
We use the results of 14 Summer Olympic sports for the Olympic Games between 1996 and 2016, that is, a total of six Olympic Games are analysed. Performance data are collected from the Gracenote database¹. These years are selected because there has been a rearrangement in the market of the Olympic Games over the last few decades due to increased competition and the effects of political changes, for example, the dissolution of the Soviet Union. The athletes from the former Soviet republics competed under the name Unified Team at the 1992 Olympics; as we could not clearly distinguish to which country their results belonged, the data from 1992 would have had a significant distorting effect.
¹ Gracenote is a data provider company that, in part, collects data about the Olympic Games. Access to these data is limited. The authors are grateful to the Hungarian University of Physical Education for providing access to the data.
Over the course of the data cleaning, the results of Yugoslavia and its successors, Serbia and Montenegro, are counted in the results of Serbia, because in individual sports the good results were mainly delivered by the Serbian athletes, and not by the Montenegrin ones. The Independent Olympic Athletes team at the 2016 Olympics consisted of the Kuwaiti athletes; therefore, their results are assigned to the Kuwaiti team.
In order to be able to compare different types of sports, we calculate the market share (MS) of the results. Market share is a widely used performance indicator in the field of sport economic research, because it is the only way to compare the performance of the countries in certain sports despite the different number of medals (Bernard – Busse 2004; De Bosscher et al. 2008; Forrest et al. 2010; Kovács et al. 2017). Market share is a percentage which is calculated as the sum of points earned by a country in a given sport divided by the total obtainable points in the sport.
Since only a few medals are awarded in some sports, a market share calculated from the number of medals would vary greatly, resulting in a high standard deviation. The top 8 places appear to be much more stable and, owing to the lower deviation, provide a better basis for analysis, so we apply the following weights to places 1 to 8, respectively: 6; 3; 2; 1.5; 1.2; 1; 0.86; 0.75 (Csurilla et al. 2019). Market share shows what percentage of the total points a given country won. The following formula has been used for calculating market shares:

$$MS_{i,j,t} = \frac{P_{i,j,t}}{\sum_{k} P_{k,j,t}}$$

where $MS_{i,j,t}$ is the share of 'Olympic performance' of country $i$ in sport $j$ at time $t$, $P_{i,j,t}$ indicates the points gained by country $i$ in sport $j$ at time $t$, and the sum in the denominator runs over all countries $k$. This way, we get an observation for every country in each sport for each Summer Olympic Games between 1996 and 2016. This creates a panel dataset, where the individual units are countries in specific sports (e.g., United States in athletics), and the time dimension is associated with the successive Olympic Games. All country-sport units that obtained a positive market share at any Olympics in the sample are included in the dataset. We consider these countries as the relevant competitors for points in the particular sports. Descriptive statistics of the MS variable are shown in Table 1. The number of countries that earned a positive market share was highest in athletics, while in table tennis and modern pentathlon there were fewer than 30 countries in the dataset. This is mainly caused by the fact that there are many different athletics events at the Olympics, while there are currently only a few for table tennis and only two (men's and women's) for modern pentathlon. The mean market share is generally higher in sports that are dominated by fewer countries. There is a high proportion of 0 market share values in the dataset, with more than 50% for table tennis, tennis and modern pentathlon.
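The point-weighting scheme can be illustrated with a minimal Python sketch. It assumes that all eight places are awarded in every event, so that the points actually distributed equal the total obtainable points; the function name, the country codes and the results are hypothetical and serve only as an illustration.

```python
# Placement weights for ranks 1..8, as listed in the text (Csurilla et al. 2019).
WEIGHTS = [6, 3, 2, 1.5, 1.2, 1, 0.86, 0.75]


def market_shares(top8_by_event):
    """Return each country's market share in one sport at one Games.

    top8_by_event: list of events, each a list of country codes ordered
    by final rank (first to eighth place).
    """
    points = {}
    for ranking in top8_by_event:
        for weight, country in zip(WEIGHTS, ranking):
            points[country] = points.get(country, 0.0) + weight
    total = sum(points.values())  # total points distributed in the sport
    return {country: p / total for country, p in points.items()}


# Hypothetical results for a sport with two events (made-up country codes).
events = [
    ["AAA", "BBB", "CCC", "AAA", "DDD", "BBB", "EEE", "CCC"],
    ["BBB", "AAA", "AAA", "DDD", "CCC", "EEE", "DDD", "BBB"],
]
shares = market_shares(events)
```

By construction the shares of all countries in a sport sum to 1 at each Games, which is what makes sports with different numbers of events comparable.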
Several factors might influence the Olympic performance of a country, so other variables were included in the dataset. GDP per capita (GDP_PC) and population (POP) have the most remarkable explanatory power for performance. The correlation makes sense: the wealthier and the more populous a nation is, the greater its chance of producing better qualified athletes and finding medal-winning talents. Consequently, the size of the Olympic team (ATHLETE) is usually highly correlated with these two indicators at the country level (Vagenas – Vlachokyriakou 2012; Trivedi – Zimmer 2014; Kovács et al. 2017). However, at the sport level the number of athletes gives more detailed information about a nation's sport funding strategy than the macro-level indicators alone. Furthermore, the organising country tends to gain extra medals from hosting. The 'host effect' can be captured with a binary variable (HOST) indicating whether a country was the host nation of the Olympic Games in a given year (Bernard – Busse 2004; Forrest et al. 2010, 2017; Trivedi – Zimmer 2014; Duráczky – Bozsonyi 2020). Following the previous studies, GDP per capita and population have been applied in logarithmic form (Bernard – Busse 2004; Vagenas – Vlachokyriakou 2012; Trivedi – Zimmer 2014; Forrest et al. 2017). The descriptive statistics of the explanatory variables used are presented in Table 2.

Econometric model
Our primary method to measure noise was to determine the expected market shares of every country in each sport, using the results of the previous Olympic Games and other exogenous variables. The next step is to calculate how far the actual market shares are from that prediction. This was done in an econometric model framework, using the prediction error as a measure of noise. We model the long-term Olympic performance of each country as an autoregressive process supplemented with exogenous variables that are calculated uniformly regardless of sport, treating every other sport-specific factor as a part of the noise factor. Including other sport-specific variables would make the results incomparable across sports, whereas comparing sports is the fundamental purpose of this study. A simple equation to predict the MS of country $i$ in sport $j$ at time $t$, using the MS of the previous $p$ Olympic Games, takes the following form:

$$MS_{i,j,t} = \beta_0 + \sum_{k=1}^{p} \beta_k \, MS_{i,j,t-k} + \gamma' z_{i,j,t} + \varepsilon_{i,j,t}$$

It is essentially an autoregressive model of order $p$ regarding all country-sport units.
In this model, $z_{i,j,t}$ is the vector of exogenous variables (lnGDP_PC, lnPOP, HOST, ATHLETE) and the error term $\varepsilon_{i,j,t}$ is associated with noise. Observations with missing data for any of the explanatory variables are omitted. The study of Csurilla et al. (2019) is considered a pilot analysis, where the level of noise was measured by estimating a simpler equation via the OLS method. The dependent variable was the MS of a given Olympics, and the independent variables were the results of the previous three Olympic Games. In this paper, however, several other model specifications were considered. First, OLS estimations of the linear autoregressive model with various orders $p$ were conducted. Then, a more sophisticated modelling process was followed to take into account the particular characteristics of our data.
The prediction error is most frequently measured by the mean squared error (MSE). To compare models estimated on data from different sports, the normalised MSE (NMSE) is applied, which is equal to $1 - R^2$ in the case of the linear model. To obtain the NMSE, the mean of the squared prediction errors is normalised by the variance of the observed values.
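The NMSE and its relation to $R^2$ can be checked with a short Python sketch. It assumes that the normalising variance is the population variance of the observed market shares, and it uses a one-regressor OLS fit on made-up numbers purely for illustration; the helper names are hypothetical and not part of the original analysis.

```python
from statistics import fmean, pvariance


def nmse(actual, predicted):
    """Mean squared prediction error divided by the variance of the actual values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return fmean(squared_errors) / pvariance(actual)


def simple_ols_fitted(x, y):
    """Fitted values of a one-regressor OLS model with an intercept."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * xi for xi in x]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up market shares: previous Games (regressor) and current Games (outcome).
previous = [0.30, 0.20, 0.15, 0.10, 0.05, 0.00]
current = [0.28, 0.22, 0.10, 0.12, 0.03, 0.02]

fitted = simple_ols_fitted(previous, current)
noise = nmse(current, fitted)  # equals 1 - R^2 for this linear fit
```

Because the NMSE is scale-free, it can be compared across sports whose market shares have very different dispersion, which a raw MSE would not allow.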
We use this measure as an indicator of noise: the higher the NMSE, the less useful the model is for prediction, so more noise is expected. Sports where previous results can explain later results are expected to have less noise, as the competitive advantage has an effect in the short and medium term as well, so probable winners usually win as expected.
The linear model might yield high explanatory power, but it has shortcomings for a proportion-type dependent variable (Ferrari – Cribari-Neto 2004). Market shares are proportions, so their values are bounded between 0 and 1. The OLS method comes with the problem of predicting market shares outside the 0-1 interval. Furthermore, in each sport, there are many countries with low market shares, while only a few take a considerable proportion of the market. Therefore, the distribution of market shares is asymmetric, which calls for a non-linear model. Finally, a considerable number of countries did not obtain any points in a given sport at most Olympic Games, resulting in a vast number of 0 market share observations. These characteristics of the dependent variable make the parameter estimations of an OLS predictor biased. Our goal is not to interpret the model coefficients, but to analyse the goodness-of-fit of the models. However, a model that solves the above issues needs to be considered, as it might yield noise levels that are very different from the OLS estimations. Previous models that try to predict the number of Olympic medals won by each country use either Tobit regression models (Bernard – Busse 2004; Trivedi – Zimmer 2014; Kovács et al. 2017) or Poisson/Negative Binomial regression models (Lui – Suen 2008; Duráczky – Bozsonyi 2020). However, our data are different: the number of medals is count-type data, while the market share is proportion data, for which the Poisson distribution-based models are not applicable. Ferrari – Cribari-Neto (2004) propose beta regression to model proportion data in the 0-1 interval. It is based on the assumption that the dependent variable follows a beta distribution. The beta distribution can be flexibly fitted to proportion data to address non-linearity, with two parameters that modify the shape of the distribution. The beta distribution looks as follows, with parameters $\mu$ and $\sigma$, where $\Gamma(\cdot)$ is the gamma function.
$$f(y \mid \mu, \sigma) = \frac{\Gamma(\sigma)}{\Gamma(\mu\sigma)\,\Gamma((1-\mu)\sigma)}\, y^{\mu\sigma-1}(1-y)^{(1-\mu)\sigma-1}, \quad 0 < y < 1$$

This parameterisation is convenient for regression purposes, as $\mu$ corresponds to the mean of the variable and $\sigma$ is a precision parameter that affects the variance of the distribution. The larger the $\sigma$, the smaller the variance for a fixed $\mu$. A shortcoming of the pure beta regression is that it cannot fit values exactly at 0 and 1. Our market share data do not contain the value 1 (no country obtained all the points at any Olympic Games), but they contain a considerable number of 0 values. To deal with the issue of zero-inflated data, a zero-inflated beta (ZIB) regression is applied, as suggested by Ospina – Ferrari (2012). The zero-inflated beta distribution takes the following form:

$$f_{ZIB}(y \mid \mu, \sigma, p_0) = \begin{cases} p_0, & y = 0 \\ (1 - p_0)\, f(y \mid \mu, \sigma), & 0 < y < 1 \end{cases}$$

This is essentially a beta distribution mixed with a positive constant probability at $y = 0$, which means that the probability of a zero observation equals $p_0$. For the regression model the parameter $\nu$ is introduced, where $\nu = p_0 / (1 - p_0)$. Link functions need to be defined to estimate the three distribution parameters ($\mu$, $\sigma$, $\nu$). The commonly used link functions are the logit link for $\mu$ and $\nu$, and the log link for $\sigma$. The model equations for the three parameters are the following, with $p$ lags of MS included in the model.
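One way to write these three equations, given the link functions named above and the coefficient names used for the per-sport estimates ($\beta$, $\sigma_0$, $\gamma$, $\rho$), is sketched here. The symbol $\delta$ for the coefficients of the exogenous variables in the $\nu$ equation, the intercepts $\beta_0$ and $\rho_0$, and the exact set of regressors in each equation are assumptions made for illustration.

$$\operatorname{logit}(\mu_{i,j,t}) = \beta_0 + \sum_{k=1}^{p} \beta_k \, MS_{i,j,t-k} + \gamma' z_{i,j,t}$$

$$\log(\sigma) = \sigma_0$$

$$\operatorname{logit}(\nu_{i,j,t}) = \rho_0 + \sum_{k=1}^{p} \rho_k \, MS_{i,j,t-k} + \delta' z_{i,j,t}$$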
The parameter $\mu$ of the beta distribution indicates the mean of the outcome variable $MS_t$. The precision parameter $\sigma$ is assumed to be constant for all observations, as there is no theoretical reason to believe that past results affect this parameter. Moreover, this assumption makes the parameters easier to interpret and keeps the model simpler, which helps to avoid overfitting when it is applied to sports with fewer observations. $\nu$ affects the probability of an observation being 0, so it theoretically depends on the given country's previous market shares. The ZIB regression is estimated using the R package gamlss, which uses maximum likelihood estimation, explained in detail in Rigby et al. (2019).
In the case of the ZIB models, the predicted market shares are given as $\hat{\mu}^{*} = E(\widehat{MS}_{i,j,t})$. This expected value takes the form of the average of the parameter $\mu$ and 0, weighted by the probability of observing 0 (Ospina – Ferrari 2012).
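As a small worked illustration of this weighted average, the following Python sketch takes the text's definition $\nu = p_0 / (1 - p_0)$ at face value, so that $p_0 = \nu / (1 + \nu)$ and the expected share is $(1 - p_0)\mu$. The function name and the input values are hypothetical.

```python
def zib_expected_share(mu, nu):
    """Expected market share under a zero-inflated beta model.

    mu: mean of the beta part of the mixture (0 < mu < 1)
    nu: odds of a zero observation, assumed here to be p0 / (1 - p0)
    """
    p0 = nu / (1.0 + nu)  # probability of observing a zero market share
    # Weighted average of 0 (weight p0) and mu (weight 1 - p0).
    return p0 * 0.0 + (1.0 - p0) * mu


# A country with a beta-part mean of 20% and even odds of scoring no points.
expected = zib_expected_share(mu=0.20, nu=1.0)
```

The prediction is always pulled below $\mu$ when zeros are possible, which is how the model accommodates countries that frequently obtain no points.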
First the pooled linear autoregressive models and then the ZIB regression models were estimated, both with various numbers of lags $p$. We argue that the ZIB regression is the correct model choice for country market shares, as it solves all the issues arising from the particular characteristics of the data. The OLS regression results are reported as a benchmark because their parameters are more straightforward to interpret. We apply these specifications to find the correct lag structure, which is later used to compare different sports in terms of their level of noise. It is important to note that, although we have a panel dataset, no fixed effects are included in the models. The Olympic Games fixed effect is not needed because the market shares always sum to 1, so there are no differences between the average expected market shares of different Olympic Games. The country fixed effects are not included because, with many coefficients to estimate, the degrees of freedom would fall drastically and the problem of overfitting would appear.
Table 3 displays the parameter estimations of the different model specifications. All the coefficients of the OLS estimation with 1 lag are significantly positive at the 1% level. Our findings confirm the results of the previous studies: a larger domestic population, higher per capita GDP and more participating athletes are associated with better Olympic performance, and the host country of a given Olympics is found to perform better. The coefficients of the autoregressive terms show, as expected, that the results of the previous Olympic Games have a positive impact on the current market shares, while the second model shows that the second to last Olympics has a smaller effect. The $\mu$ coefficients of the ZIB models tell the same story. Most of the coefficients in the $\nu$ equation were negative, but among the exogenous variables only the number of athletes proved to be statistically significant. The negative coefficients indicate that the probability of a nation having zero market share at the Olympics is lower if it had better results at the previous Olympics and if it delegates more athletes.
There is a trade-off between the number of observations and the number of lags included in the model. We select the model with the lowest possible number of lags for which no first-order autocorrelation of the residuals is present. The Breusch-Godfrey/Wooldridge test is applicable to test for serial correlation in the idiosyncratic (country-sport specific) errors in panel data (Wooldridge 2010). The null hypothesis of no serial correlation is rejected in the linear model with 1 lag but cannot be rejected at the 1% level in the case of the model with 2 lags. The second lag of MS needs to be included to avoid first-order serial correlation, so models with two lags are preferred. Further lags were also considered, but they do not significantly increase the explanatory power of the model and leave fewer than 70 observations for certain sports.
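The trade-off between lags and observations follows directly from how lagged observations are built from the panel: each additional lag removes one usable Games per country-sport unit. A minimal Python sketch, using a hypothetical two-unit panel with six Games and made-up market shares, shows the effect; the function name and data layout are assumptions for illustration.

```python
def lagged_rows(panel, p):
    """Build (unit, current MS, [MS at t-1, ..., MS at t-p]) rows from a panel.

    panel: dict mapping a (country, sport) unit to its market shares in
    chronological order, one value per Olympic Games.
    """
    rows = []
    for unit, series in panel.items():
        # The first p Games of each unit have no complete lag history.
        for t in range(p, len(series)):
            lags = [series[t - k] for k in range(1, p + 1)]
            rows.append((unit, series[t], lags))
    return rows


# Hypothetical panel: two country-sport units observed at six Games.
panel = {
    ("AAA", "swimming"): [0.30, 0.28, 0.31, 0.27, 0.29, 0.30],
    ("BBB", "swimming"): [0.00, 0.02, 0.00, 0.05, 0.04, 0.00],
}

# Number of usable observations for 1, 2 and 3 lags.
counts = {p: len(lagged_rows(panel, p)) for p in (1, 2, 3)}
```

With only six Games in the sample, every extra lag costs one sixth of the original observations per unit, which is why the lowest lag order that removes serial correlation is attractive.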
To compare the sports in terms of their level of noise, the OLS and ZIB regressions are performed separately for each sport, using the market shares of the previous two Olympic Games along with the discussed exogenous variables as explanatory variables. This process yields separate $\beta$, $\sigma_0$, $\gamma$ and $\rho$ coefficients for each sport. After estimating the models, we calculate the NMSE separately for each sport to obtain the level of the noise factor. The results are discussed in the next section.

RESULTS AND DISCUSSION
Estimating the ZIB regression with two lags yields NMSE values separately for sports. These are compared to the NMSE of the OLS model with the same lags. The results are presented in Table 4, along with the number of observations used for each sport.
Notes to Table 4: NMSE values are extracted from the zero-inflated beta regression and the OLS regression with 2 lags of MS, described in the Data and Methodology section. The models are estimated separately for each sport.
Comparing individual sports by the level of noise shows that the unexplained variance is lowest in swimming, followed by table tennis and athletics, independent of the model choice. In these sports, the results are strongly related to the past results, as some countries are able to generate a competitive advantage that has a visible effect on the outcome of the competition and that can be maintained by those nations through several cycles of the Olympic Games. The NMSE is the highest in the case of modern pentathlon, tennis and cycling.
There can be various explanations for this phenomenon. In tennis and modern pentathlon, there are currently only 4 and 2 gold medals awarded at the Olympics, respectively. Therefore, the number of points from which market shares are calculated in these sports is very low, which might inflate the observed variability of market shares. Athletes in these sports do not have the opportunity to compensate for their failures: a small mistake can cost them a medal, and there are no other events that offer a second chance. Swimming, on the other hand, shows low noise partly because a lot of medals and points are distributed at the Olympics. If athletes of a successful country fail to win medals in one competition, there are plenty of other events in which to catch up.
There could be other fundamental reasons for the high noise in certain sports. The high variability in tennis could be due to the fact that, as opposed to most individual sports, the Olympics are not the most prestigious competition for its athletes, so the top players do not time their peak performance to coincide with the Olympic Games. The extraordinary unpredictability in modern pentathlon might be explained by the fact that this sport is made up of five different sports; therefore, the value of noise is multiplied as well, making the results of the competitions highly volatile. Furthermore, competitors are paired with horses in a draw, so luck plays a crucial role in equestrian show jumping.
The ZIB regression predictions are also performed separately for different years to reveal how constant the level of noise remains in various sports over time. The market shares of the last three Olympic Games (2016, 2012, 2008) are included as the dependent variable in separate models, with explanatory variables up to two lags of MS and exogenous variables, similarly to the previous models. The NMSE is calculated separately for each sport and year to present the dynamics of noise (Fig. 1). The unexplained variation shows similar results in the least noisy sports for all years, meaning that our results are quite robust over time. Cycling showed the highest standard deviation between years, which is presumably due to the different nature (road, track, MTB, BMX) and organisation of its disciplines, and also because BMX racing has only been part of the Olympic Games since 2008. Some sports, such as rowing or canoe sprint, presented a slightly decreasing trend in noise, while other sports, including gymnastics, judo, fencing and shooting, show an increasing trend. A more extended sample period is required to derive more general conclusions about the dynamics of noise.
It is essential to confirm whether the measured noise is robust to the applied methodology. We compare noise in each sport, measured by the OLS and ZIB regression methods with various lags.
Table 5 presents the pairwise correlations between the indicators. All the Pearson correlation coefficients are above 0.97, which suggests that our results are robust to the applied methodology. The correlations between different lag orders of the same method are particularly high. Although the ZIB regression method uses a very different functional form, it does not lead to very different conclusions from the linear regression.
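The robustness check amounts to correlating the per-sport NMSE values obtained from two specifications. A minimal Python sketch of such a comparison is given here; the NMSE values are made up for illustration and are not the values behind Table 5, and the function name is hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-sport NMSE values from two model specifications.
nmse_ols = [0.10, 0.25, 0.40, 0.55, 0.80]
nmse_zib = [0.12, 0.24, 0.43, 0.57, 0.78]

r = pearson(nmse_ols, nmse_zib)
```

A coefficient close to 1 indicates that the two specifications rank and space the sports in nearly the same way, even if the absolute NMSE levels differ.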

CONCLUSION AND LIMITATIONS
It is hard to measure the level of the luck-based noise factor in different sports directly. The main purpose of our study is to develop a method that can be applied to individual sports. We estimate noise in each sport with model predictions through the use of nations' market shares instead of the results of individual athletes. The level of noise is identified by the extent of the normalised prediction errors of the models. If the market shares of the countries in a particular sport can be anticipated from the previous results and exogenous factors, then the level of noise is expected to be low, as the competitive advantage of certain nations can lead to their domination over several cycles of the Olympic Games, and 'elite sport systems' can operate effectively.
The results show the lowest level of noise in swimming, followed by table tennis and athletics, in the predictions of both the OLS and ZIB analyses. Based on the results, we can claim that the elite sport system can work optimally in the case of swimming, as many countries can provide an appropriate supply of athletes from Olympics to Olympics. Accordingly, nations that would like to achieve continuous success should primarily focus on swimming. In the case of table tennis, only a few countries currently have a professional player base capable of acquiring the level of skill needed to be competitive at the highest level. This is mainly based on the popularity of the sport in those countries, which enables an effective, rigorous training system to develop the required specialised skills. Therefore, in the 'market' of table tennis, the particular skill is an entry barrier, which tends to cause the low level of unexplained variation. Among the sports involved in the research, modern pentathlon, tennis and cycling had the highest levels of noise with all models. These are the sports where, in addition to a high level of luck, other external factors also contribute to the fact that few countries can continuously achieve good results. Such external factors may be, for example, the low number of medals to be obtained (modern pentathlon and tennis), the intensifying competition (cycling), or changes in the competition format or in a discipline of the sport (modern pentathlon and cycling).
The results of the research may also be of interest to the IOC from a financial point of view. Based on our findings, the least noisy sports are table tennis, swimming and athletics. Swimming and athletics are among the sports that generate the most interest and, as a result, some of the highest event revenues. It might be the case that the ability of sports to generate revenue is linked to the level of noise, although this statement requires further analysis. If one of the IOC's aims is to improve the profitability of the Games, then changes that tend to reduce the level of noise in the less popular sports should also be considered. A case in point would be to change the tournament format or to increase the number of medals available to an athlete, as we have discussed before.
There are several limitations to our research. First, none of the methods managed to identify the extent to which luck plays a role; only noise, understood as unpredictability, could be measured. Noise includes several factors other than luck, which also influence results but are not dependent on luck. The best example of this is the noise resulting from different qualification systems, as there are many sports in which a country can only qualify one athlete. Therefore, it can easily happen that the world's second-best athlete in a particular sport cannot enter the Olympic Games. Second, the poorer countries do not always have systems that generate elite athletes, so they cannot provide a continuous and appropriate supply of athletes in sports where they would like to achieve success. In sports where the poorer countries attempt to become successful, we may measure a higher level of noise than the actual long-term equilibrium. However, the difference between the actual and measured noise cannot be significant and long-lasting in these sports, because a more prosperous country with effective sport management would undoubtedly attempt to take advantage of its dominant position in the hope of obtaining 'easy' medals. Third, an expected medal winner could be injured or banned for a doping rule violation just before the Olympics. The influence of such a situation on noise is considerable in sports and in countries where the number of medals available or the number of expected medal winners is relatively low. Moreover, the distinct climatic conditions in the host countries may also be responsible for the fluctuation in the performance of some nations and, thus, for the differences in the noise levels. Some countries, especially those with limited resources, do not yet possess the knowledge needed to prepare their athletes for different climatic conditions.
Furthermore, a limitation of our method, namely the use of country-level market shares based on an artificial point system, should also be mentioned. Even if a country's market shares in a particular sport remain stable, it could be the case that the individual athletes achieving the results are continually changing; therefore, the country-level estimated noise will be much lower than the individual-level noise. The exact level of measured noise might vary with different point systems; however, we ran tests with different point systems, and the results are fundamentally the same in each case. Finally, the issue of using Olympic data for sports that are not focused on the Olympics should be highlighted as well. In such sports, e.g., tennis or road cycling, where the Olympics is not the most prestigious event, the results will be biased, because the best athletes will not time their peak performances to this event or will not even appear at the Games at all.
We aim to continue our research in the future. We would like to determine the level of noise in more types of sports so that we can compare all individual Olympic sports. Another goal is to further develop our methodology in order to create a method for measuring luck alone in the Olympic data; however, this will require more detailed performance data so that sports can be analysed not only as a whole, but also by discipline and event.

Fig. 1. The trend of the noise-factor (measured by NMSE of the ZIB 2 lag model) for different Olympic sports and years

Table 1. Characteristics of the market share variable by Olympic sports

Table 2. Summary statistics of explanatory variables

Table 3. Zero-inflated beta regression parameter estimates, without separating the sports

Table 4. The level of noise for different sports measured by NMSE

Table 5. Pairwise correlation among the different indicators of noise