Forecaster | Rank | Tau |
---|---|---|
John Edwards | 1 | 0.599 |
mohamed | 2 | 0.564 |
OlymPicks | 3 | 0.548 |
Hammers_O’Callaghan | 4 | 0.542 |
ALEC | 5 | 0.538 |
fa2410 | 6 | 0.536 |
AJM | 7 | 0.533 |
ChristopherWharton | 8 | 0.531 |
JKH | 9 | 0.512 |
Orla_S | 10 | 0.510 |
KaitoGoto | 11 | 0.508 |
JoePenn | 13 | 0.507 |
JohnDSouza | 13 | 0.507 |
JPN2020 | 14 | 0.506 |
HarrySnart | 15 | 0.499 |
WilliamBowers | 16 | 0.494 |
LexieB | 17 | 0.493 |
ANIK | 18 | 0.485 |
Rank_Deficient | 19 | 0.479 |
WeatherQuant | 20 | 0.458 |
Kizmet24 | 21 | 0.426 |
tghaynes | 22 | 0.395 |
Everyone_is_(equally)_awesome | 23 | 0.000 |
RSS Olympic prediction competition 2024
Context
Each summer the Royal Statistical Society’s Statistics in Sport Section organizes a prediction competition. For 2024 the competition, which is being sponsored by Amelco, involves predicting the medals table for the Paris Olympics.
This year’s competition is novel in several ways. Firstly, forecasters will be asked to provide a single predicted ranking whose distance from the true ranking at the end of the games will determine their score. This stands in contrast to previous forecasting competitions, in which forecasters have been required to specify probabilities for the outcomes of individual sports matches in a tournament. Secondly, the ‘true ranking’ for the Olympics is not as well defined or officially endorsed as it is for other events.
For the purposes of this competition the true final ranking will be determined by the number of gold medals won by each country. Silver medals will be used only to break ties between countries with equal numbers of gold medals. Bronze medals will be used to break ties between countries after silver medals are accounted for. Countries still tied after accounting for bronze medals will remain tied.
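As a rough illustration of the tie-breaking rule above, the following Python sketch orders NOCs by gold, then silver, then bronze, and leaves NOCs with identical counts tied. The function name `medal_ranks` is hypothetical, and the choice to give tied NOCs the largest position in their group is an assumption based on how the medals table on this page appears to be numbered; any convention that keeps ties tied produces the same Kendall's tau.

```python
from itertools import groupby


def medal_ranks(medals):
    """Map each NOC code to a rank, given {noc: (gold, silver, bronze)}.

    NOCs with identical medal counts share a rank. The shared rank is the
    largest position in the tied group (an assumed convention).
    """
    # Tuples compare lexicographically: gold first, then silver, then bronze.
    ordered = sorted(medals.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    position = 0
    for _, group in groupby(ordered, key=lambda item: item[1]):
        tied = [noc for noc, _counts in group]
        position += len(tied)  # last position occupied by this tied group
        for noc in tied:
            ranks[noc] = position
    return ranks


# A few rows taken from the final medals table on this page.
example = {
    "USA": (40, 44, 42),
    "CHN": (40, 27, 24),
    "DEN": (2, 2, 5),
    "AZE": (2, 2, 3),
    "CRO": (2, 2, 3),
}
print(medal_ranks(example))
```

Within this five-NOC subset, the United States and China are separated on silver medals, Denmark is placed above Azerbaijan and Croatia on bronze medals, and Azerbaijan and Croatia remain tied.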
Scoring
Predicted rankings will be scored using a statistic called Kendall’s tau, or the Kendall rank correlation coefficient. There are (at least) two ways to compute this statistic, which shed light on its meaning. To understand what the statistic quantifies it is first necessary to think about all the possible pairwise comparisons between countries. With \(n\) participating countries there are \(n(n-1)/2=\binom{n}{2}\) of these. A forecaster will gain a point for every pair which they have placed in the correct order, and they will lose a point for every pair which they placed in the wrong order. If a pair of countries is tied, either in the predicted ranking or the true ranking, this pair does not contribute to the score. The score is then scaled by the total number of pairs so that it falls between -1 and 1.
We can write this down as
\[\begin{align} \tau & = \frac{\text{number of concordant pairs}-\text{number of discordant pairs}}{\text{number of pairs}} \\ & = \frac{2}{n(n-1)} \sum_{i<j}\text{sign}(R_i-R_j)\text{sign}(S_i-S_j) \end{align}\]
where ‘concordant’ is a word used in the academic literature to mean ‘in agreement’. Similarly, ‘discordant’ means ‘in disagreement’. For example, if a forecaster ranks Kiribati higher than Tanzania and, indeed, Kiribati is higher than Tanzania in the final true ranking, then we say that the forecaster has a concordant pair with the true ranking. As alluded to in the equation above, it is also possible to show that Kendall’s tau is the correlation between the signs of the differences in ranks. In this expression \(R_i\) denotes the numerical rank assigned by a forecaster to country \(i\) (e.g. if a forecaster predicted country \(i\) to be second from the top in the medals table they would set \(R_i=2\)). The quantity \(S_i\) denotes the true rank for country \(i\). The \(\text{sign}\) function used here returns the value one when its argument is positive, minus one when its argument is negative and zero when its argument is zero.
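The formula above can be sketched directly in Python. This is a minimal illustration rather than the organizers' scoring code; the names `sign` and `kendall_tau` are chosen here for convenience, and both rank lists are assumed to list the NOCs in the same order.

```python
from itertools import combinations


def sign(x):
    """Return 1 for positive x, -1 for negative x and 0 for zero."""
    return (x > 0) - (x < 0)


def kendall_tau(forecast, truth):
    """Kendall's tau as written above: (concordant - discordant) / all pairs.

    `forecast` holds the ranks R_i and `truth` the ranks S_i, one per NOC,
    in the same NOC order. Tied pairs contribute zero to the sum.
    """
    n = len(forecast)
    if n != len(truth) or n < 2:
        raise ValueError("need two rank lists of equal length, at least 2")
    total = sum(
        sign(forecast[i] - forecast[j]) * sign(truth[i] - truth[j])
        for i, j in combinations(range(n), 2)
    )
    return 2 * total / (n * (n - 1))


print(kendall_tau([1, 2, 3], [1, 2, 3]))  # perfect agreement
print(kendall_tau([3, 2, 1], [1, 2, 3]))  # complete disagreement
```

Note that library routines may use a different convention for ties. For instance, SciPy's `scipy.stats.kendalltau` computes the tau-b variant by default, which rescales for ties, so its output may not match this score when ranks are tied.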
There are 206 countries participating in the 2024 Olympic games, although officially they are referred to as National Olympic Committees (NOCs) rather than countries. In the submission template you will find a list of the NOCs and, as a default, they are all ranked jointly at 206th. This default entry will receive a Kendall’s tau score of zero because, as mentioned above, tied pairs contribute nothing to the score - they are considered neither concordant nor discordant. Forecasters are invited to modify the ranks in any way they see fit. You could, for example,
- cluster countries into groups all with the same rank,
- use non-integer ranks,
- use ranks that go below 1!
The only thing that matters is that your ranks allow for the pairwise comparisons that contribute to Kendall’s tau.
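To see why only the pairwise comparisons matter, the sketch below scores several differently written submissions against a made-up true ranking for five NOCs. All of the rank lists are hypothetical, and the helper `tau` is just a compact restatement of the formula from the scoring section.

```python
from itertools import combinations


def tau(forecast, truth):
    """(concordant - discordant) / number of pairs; tied pairs score zero."""
    def sign(x):
        return (x > 0) - (x < 0)

    pairs = list(combinations(range(len(truth)), 2))
    total = sum(
        sign(forecast[i] - forecast[j]) * sign(truth[i] - truth[j])
        for i, j in pairs
    )
    return total / len(pairs)


truth = [1, 2, 3, 5, 5]                  # hypothetical true ranks, last two tied

default = [206] * 5                      # template default: everything tied
integer = [1, 3, 2, 4, 5]                # ordinary integer ranks
fractional = [0.5, 2.7, 1.1, 3.0, 9.9]   # non-integer ranks, same ordering
below_one = [r - 10 for r in integer]    # ranks below 1, same ordering
clustered = [1, 1, 1, 2, 2]              # two groups of tied NOCs

for name, ranks in [("default", default), ("integer", integer),
                    ("fractional", fractional), ("below_one", below_one),
                    ("clustered", clustered)]:
    print(name, tau(ranks, truth))
```

The three submissions that order the NOCs identically receive the same score regardless of the numbers used, and the all-tied default scores zero.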
Submission template
You can download the submission template here. Only modify the numbers in the Rank column. In particular do not permute the rows of the NOCs.
Email completed submissions to ben.powell@york.ac.uk with the subject line SiS forecasting competition 2024. This email should contain an attached .csv file called “RSS_pred_comp_submission_NAME.csv” where NAME is the name you would like to appear on the leader board.
If you would like to be considered for the methodology prize, please provide a brief description of how you made your forecast predictions.
Please make your submissions before Monday 22nd July. The games start on Friday 26th July but it will be necessary to have submission files a bit earlier in order to initialize the forecast scoring procedure and the leader board.
Prizes
The principal and most valuable prize for winning the forecasting competition is prestige (and a certificate)! Special attention and corresponding certificates will be awarded for
- Best overall score
- Best student score
- Most innovative methodology
The winners of these three awards will also be invited to the Royal Statistical Society conference to present their methods. This year’s conference will be held in Brighton in early September. Conference fees and travel/accommodation costs are being provided by sponsors Amelco.
Leader board
The competition is now over and our winner is John Edwards. Well done John and everyone who took part!
Medals table
Last updated: 2024-08-11 21:13:10
Code | NOC | Rank | Gold | Silver | Bronze |
---|---|---|---|---|---|
USA | United States | 1 | 40 | 44 | 42 |
CHN | China | 2 | 40 | 27 | 24 |
JPN | Japan | 3 | 20 | 12 | 13 |
AUS | Australia | 4 | 18 | 19 | 16 |
FRA | France | 5 | 16 | 26 | 22 |
NED | Netherlands | 6 | 15 | 7 | 12 |
GBR | Great Britain | 7 | 14 | 22 | 29 |
KOR | South Korea | 8 | 13 | 9 | 10 |
ITA | Italy | 9 | 12 | 13 | 15 |
GER | Germany | 10 | 12 | 13 | 8 |
NZL | New Zealand | 11 | 10 | 7 | 3 |
CAN | Canada | 12 | 9 | 7 | 11 |
UZB | Uzbekistan | 13 | 8 | 2 | 3 |
HUN | Hungary | 14 | 6 | 7 | 6 |
ESP | Spain | 15 | 5 | 4 | 9 |
SWE | Sweden | 16 | 4 | 4 | 3 |
KEN | Kenya | 17 | 4 | 2 | 5 |
NOR | Norway | 18 | 4 | 1 | 3 |
IRL | Ireland | 19 | 4 | 0 | 3 |
BRA | Brazil | 20 | 3 | 7 | 10 |
IRI | Iran | 21 | 3 | 6 | 3 |
UKR | Ukraine | 22 | 3 | 5 | 4 |
ROU | Romania | 23 | 3 | 4 | 2 |
GEO | Georgia | 24 | 3 | 3 | 1 |
BEL | Belgium | 25 | 3 | 1 | 6 |
BUL | Bulgaria | 26 | 3 | 1 | 3 |
SRB | Serbia | 27 | 3 | 1 | 1 |
CZE | Czechia | 28 | 3 | 0 | 2 |
DEN | Denmark | 29 | 2 | 2 | 5 |
AZE | Azerbaijan | 31 | 2 | 2 | 3 |
CRO | Croatia | 31 | 2 | 2 | 3 |
CUB | Cuba | 32 | 2 | 1 | 6 |
BRN | Bahrain | 33 | 2 | 1 | 1 |
SLO | Slovenia | 34 | 2 | 1 | 0 |
TPE | Chinese Taipei | 35 | 2 | 0 | 5 |
AUT | Austria | 36 | 2 | 0 | 3 |
HKG | Hong Kong, China | 38 | 2 | 0 | 2 |
PHI | Philippines | 38 | 2 | 0 | 2 |
ALG | Algeria | 40 | 2 | 0 | 1 |
INA | Indonesia | 40 | 2 | 0 | 1 |
ISR | Israel | 41 | 1 | 5 | 1 |
POL | Poland | 42 | 1 | 4 | 5 |
KAZ | Kazakhstan | 43 | 1 | 3 | 3 |
JAM | Jamaica | 46 | 1 | 3 | 2 |
RSA | South Africa | 46 | 1 | 3 | 2 |
THA | Thailand | 46 | 1 | 3 | 2 |
ETH | Ethiopia | 47 | 1 | 3 | 0 |
SUI | Switzerland | 48 | 1 | 2 | 5 |
ECU | Ecuador | 49 | 1 | 2 | 2 |
POR | Portugal | 50 | 1 | 2 | 1 |
GRE | Greece | 51 | 1 | 1 | 6 |
ARG | Argentina | 54 | 1 | 1 | 1 |
EGY | Egypt | 54 | 1 | 1 | 1 |
TUN | Tunisia | 54 | 1 | 1 | 1 |
BOT | Botswana | 58 | 1 | 1 | 0 |
CHI | Chile | 58 | 1 | 1 | 0 |
LCA | Saint Lucia | 58 | 1 | 1 | 0 |
UGA | Uganda | 58 | 1 | 1 | 0 |
DOM | Dominican Republic | 59 | 1 | 0 | 2 |
GUA | Guatemala | 61 | 1 | 0 | 1 |
MAR | Morocco | 61 | 1 | 0 | 1 |
DMA | Dominica | 63 | 1 | 0 | 0 |
PAK | Pakistan | 63 | 1 | 0 | 0 |
TUR | Turkey | 64 | 0 | 3 | 5 |
MEX | Mexico | 65 | 0 | 3 | 2 |
ARM | Armenia | 67 | 0 | 3 | 1 |
COL | Colombia | 67 | 0 | 3 | 1 |
KGZ | Kyrgyzstan | 69 | 0 | 2 | 4 |
PRK | North Korea | 69 | 0 | 2 | 4 |
LTU | Lithuania | 70 | 0 | 2 | 2 |
IND | India | 71 | 0 | 1 | 5 |
MDA | Moldova | 72 | 0 | 1 | 3 |
KOS | Kosovo | 73 | 0 | 1 | 1 |
CYP | Cyprus | 78 | 0 | 1 | 0 |
FIJ | Fiji | 78 | 0 | 1 | 0 |
JOR | Jordan | 78 | 0 | 1 | 0 |
MGL | Mongolia | 78 | 0 | 1 | 0 |
PAN | Panama | 78 | 0 | 1 | 0 |
TJK | Tajikistan | 79 | 0 | 0 | 3 |
ALB | Albania | 83 | 0 | 0 | 2 |
GRN | Grenada | 83 | 0 | 0 | 2 |
MAS | Malaysia | 83 | 0 | 0 | 2 |
PUR | Puerto Rico | 83 | 0 | 0 | 2 |
CIV | Ivory Coast | 90 | 0 | 0 | 1 |
CPV | Cape Verde | 90 | 0 | 0 | 1 |
PER | Peru | 90 | 0 | 0 | 1 |
QAT | Qatar | 90 | 0 | 0 | 1 |
SGP | Singapore | 90 | 0 | 0 | 1 |
SVK | Slovakia | 90 | 0 | 0 | 1 |
ZAM | Zambia | 90 | 0 | 0 | 1 |
AFG | Afghanistan | 204 | 0 | 0 | 0 |
AND | Andorra | 204 | 0 | 0 | 0 |
ANG | Angola | 204 | 0 | 0 | 0 |
ANT | Antigua and Barbuda | 204 | 0 | 0 | 0 |
ARU | Aruba | 204 | 0 | 0 | 0 |
ASA | American Samoa | 204 | 0 | 0 | 0 |
BAH | Bahamas | 204 | 0 | 0 | 0 |
BAN | Bangladesh | 204 | 0 | 0 | 0 |
BAR | Barbados | 204 | 0 | 0 | 0 |
BDI | Burundi | 204 | 0 | 0 | 0 |
BEN | Benin | 204 | 0 | 0 | 0 |
BER | Bermuda | 204 | 0 | 0 | 0 |
BHU | Bhutan | 204 | 0 | 0 | 0 |
BIH | Bosnia and Herzegovina | 204 | 0 | 0 | 0 |
BIZ | Belize | 204 | 0 | 0 | 0 |
BOL | Bolivia | 204 | 0 | 0 | 0 |
BRU | Brunei | 204 | 0 | 0 | 0 |
BUR | Burkina Faso | 204 | 0 | 0 | 0 |
CAF | Central African Republic | 204 | 0 | 0 | 0 |
CAM | Cambodia | 204 | 0 | 0 | 0 |
CAY | Cayman Islands | 204 | 0 | 0 | 0 |
CGO | Republic of the Congo | 204 | 0 | 0 | 0 |
CHA | Chad | 204 | 0 | 0 | 0 |
CMR | Cameroon | 204 | 0 | 0 | 0 |
COD | Democratic Republic of the Congo | 204 | 0 | 0 | 0 |
COK | Cook Islands | 204 | 0 | 0 | 0 |
COM | Comoros | 204 | 0 | 0 | 0 |
CRC | Costa Rica | 204 | 0 | 0 | 0 |
DJI | Djibouti | 204 | 0 | 0 | 0 |
ERI | Eritrea | 204 | 0 | 0 | 0 |
ESA | El Salvador | 204 | 0 | 0 | 0 |
EST | Estonia | 204 | 0 | 0 | 0 |
FIN | Finland | 204 | 0 | 0 | 0 |
FSM | Federated States of Micronesia | 204 | 0 | 0 | 0 |
GAB | Gabon | 204 | 0 | 0 | 0 |
GAM | Gambia | 204 | 0 | 0 | 0 |
GBS | Guinea-Bissau | 204 | 0 | 0 | 0 |
GEQ | Equatorial Guinea | 204 | 0 | 0 | 0 |
GHA | Ghana | 204 | 0 | 0 | 0 |
GUI | Guinea | 204 | 0 | 0 | 0 |
GUM | Guam | 204 | 0 | 0 | 0 |
GUY | Guyana | 204 | 0 | 0 | 0 |
HAI | Haiti | 204 | 0 | 0 | 0 |
HON | Honduras | 204 | 0 | 0 | 0 |
IRQ | Iraq | 204 | 0 | 0 | 0 |
ISL | Iceland | 204 | 0 | 0 | 0 |
ISV | Virgin Islands | 204 | 0 | 0 | 0 |
IVB | British Virgin Islands | 204 | 0 | 0 | 0 |
KIR | Kiribati | 204 | 0 | 0 | 0 |
KSA | Saudi Arabia | 204 | 0 | 0 | 0 |
KUW | Kuwait | 204 | 0 | 0 | 0 |
LAO | Laos | 204 | 0 | 0 | 0 |
LAT | Latvia | 204 | 0 | 0 | 0 |
LBA | Libya | 204 | 0 | 0 | 0 |
LBN | Lebanon | 204 | 0 | 0 | 0 |
LBR | Liberia | 204 | 0 | 0 | 0 |
LES | Lesotho | 204 | 0 | 0 | 0 |
LIE | Liechtenstein | 204 | 0 | 0 | 0 |
LUX | Luxembourg | 204 | 0 | 0 | 0 |
MAD | Madagascar | 204 | 0 | 0 | 0 |
MAW | Malawi | 204 | 0 | 0 | 0 |
MDV | Maldives | 204 | 0 | 0 | 0 |
MHL | Marshall Islands | 204 | 0 | 0 | 0 |
MKD | North Macedonia | 204 | 0 | 0 | 0 |
MLI | Mali | 204 | 0 | 0 | 0 |
MLT | Malta | 204 | 0 | 0 | 0 |
MNE | Montenegro | 204 | 0 | 0 | 0 |
MON | Monaco | 204 | 0 | 0 | 0 |
MOZ | Mozambique | 204 | 0 | 0 | 0 |
MRI | Mauritius | 204 | 0 | 0 | 0 |
MTN | Mauritania | 204 | 0 | 0 | 0 |
MYA | Myanmar | 204 | 0 | 0 | 0 |
NAM | Namibia | 204 | 0 | 0 | 0 |
NCA | Nicaragua | 204 | 0 | 0 | 0 |
NEP | Nepal | 204 | 0 | 0 | 0 |
NGR | Nigeria | 204 | 0 | 0 | 0 |
NIG | Niger | 204 | 0 | 0 | 0 |
NRU | Nauru | 204 | 0 | 0 | 0 |
OMA | Oman | 204 | 0 | 0 | 0 |
PAR | Paraguay | 204 | 0 | 0 | 0 |
PLE | Palestine | 204 | 0 | 0 | 0 |
PLW | Palau | 204 | 0 | 0 | 0 |
PNG | Papua New Guinea | 204 | 0 | 0 | 0 |
RWA | Rwanda | 204 | 0 | 0 | 0 |
SAM | Samoa | 204 | 0 | 0 | 0 |
SEN | Senegal | 204 | 0 | 0 | 0 |
SEY | Seychelles | 204 | 0 | 0 | 0 |
SKN | Saint Kitts and Nevis | 204 | 0 | 0 | 0 |
SLE | Sierra Leone | 204 | 0 | 0 | 0 |
SMR | San Marino | 204 | 0 | 0 | 0 |
SOL | Solomon Islands | 204 | 0 | 0 | 0 |
SOM | Somalia | 204 | 0 | 0 | 0 |
SRI | Sri Lanka | 204 | 0 | 0 | 0 |
SSD | South Sudan | 204 | 0 | 0 | 0 |
STP | São Tomé and Príncipe | 204 | 0 | 0 | 0 |
SUD | Sudan | 204 | 0 | 0 | 0 |
SUR | Suriname | 204 | 0 | 0 | 0 |
SWZ | Eswatini | 204 | 0 | 0 | 0 |
SYR | Syria | 204 | 0 | 0 | 0 |
TAN | Tanzania | 204 | 0 | 0 | 0 |
TGA | Tonga | 204 | 0 | 0 | 0 |
TKM | Turkmenistan | 204 | 0 | 0 | 0 |
TLS | East Timor | 204 | 0 | 0 | 0 |
TOG | Togo | 204 | 0 | 0 | 0 |
TTO | Trinidad and Tobago | 204 | 0 | 0 | 0 |
TUV | Tuvalu | 204 | 0 | 0 | 0 |
UAE | United Arab Emirates | 204 | 0 | 0 | 0 |
URU | Uruguay | 204 | 0 | 0 | 0 |
VAN | Vanuatu | 204 | 0 | 0 | 0 |
VEN | Venezuela | 204 | 0 | 0 | 0 |
VIE | Vietnam | 204 | 0 | 0 | 0 |
VIN | Saint Vincent and the Grenadines | 204 | 0 | 0 | 0 |
YEM | Yemen | 204 | 0 | 0 | 0 |
ZIM | Zimbabwe | 204 | 0 | 0 | 0 |
Data & resources
The International Olympic Committee maintains a suite of webpages that contain a large amount of data for all previous games. You can access them here although it is not easy to scrape the data in an automated way. Wikipedia has a useful all-time medals table here. Perhaps most useful, however, is the set of medals tables provided by Rob Wood on his website topendsports.com. Even more data is available from a generous Kaggle user (R.Griffin) here.
You are free to use any data you can find to inform your predictions.
You might also be interested in YouTube video-seminars from participants in previous prediction competitions. These include presentations from the winners of the 2020 prediction competition and the winners of the 2023 competition. Finally, if you would like to be kept up to date on the activities of the RSS’s Statistics in Sports Section you can sign up to their mailing list here.
In this YouTube video Dr. Jess Hargreaves (Univ. York) hears from Dr. Johan Rewilak (Univ. South Carolina) on his thoughts about predicting the Olympics.
Lexie Bonas, an undergraduate student at the University of York, has also provided a quick explanation of her submission.
Example scoring calculation
To help forecasters understand the scoring system we provide the following cartoon example. Below is a fictional medals table that is used to compute a true rank for each of five NOCs. The table also includes a fictional set of predicted ranks. The numbers on the far left of the table are just labels arbitrarily (alphabetically) assigned to the NOCs to help refer to them - a bit like the NOC codes in the real medals table.
Label | NOC | True_rank | Forecast_rank | Gold | Silver | Bronze |
---|---|---|---|---|---|---|
3 | Cambodia | 1 | 4 | 2 | 2 | 0 |
4 | Denmark | 2 | 4 | 0 | 1 | 0 |
2 | Bahamas | 3 | 2 | 0 | 0 | 1 |
1 | Afghanistan | 5 | 2 | 0 | 0 | 0 |
5 | Ecuador | 5 | 4 | 0 | 0 | 0 |
To score the predicted ranks we enumerate all pairs of NOCs and, for each one, check whether the true ordering and the predicted ordering match. These orderings are quantified using the \(\text{sign}\) function (which returns values -1, 0 or 1) applied to the difference between the ranks. If the orderings agree the forecaster scores a point; if they disagree the forecaster loses a point. If either the true or the predicted ranks are tied for a particular pair of NOCs then no points are scored or lost. To quantify this sort of agreement we multiply the signs of the rank differences together. The results of these calculations for this toy example are presented in the table below.
i | j | NOC_i | NOC_j | True_rank_i | True_rank_j | True_sign | Forecast_rank_i | Forecast_rank_j | Forecast_sign | Forecast_score |
---|---|---|---|---|---|---|---|---|---|---|
1 | 2 | Afghanistan | Bahamas | 5 | 3 | 1 | 2 | 2 | 0 | 0 |
1 | 3 | Afghanistan | Cambodia | 5 | 1 | 1 | 2 | 4 | -1 | -1 |
1 | 4 | Afghanistan | Denmark | 5 | 2 | 1 | 2 | 4 | -1 | -1 |
1 | 5 | Afghanistan | Ecuador | 5 | 5 | 0 | 2 | 4 | -1 | 0 |
2 | 3 | Bahamas | Cambodia | 3 | 1 | 1 | 2 | 4 | -1 | -1 |
2 | 4 | Bahamas | Denmark | 3 | 2 | 1 | 2 | 4 | -1 | -1 |
2 | 5 | Bahamas | Ecuador | 3 | 5 | -1 | 2 | 4 | -1 | 1 |
3 | 4 | Cambodia | Denmark | 1 | 2 | -1 | 4 | 4 | 0 | 0 |
3 | 5 | Cambodia | Ecuador | 1 | 5 | -1 | 4 | 4 | 0 | 0 |
4 | 5 | Denmark | Ecuador | 2 | 5 | -1 | 4 | 4 | 0 | 0 |
The forecaster’s tau score is their cumulative score (the sum of the right-most column) divided by the number of pairs, i.e. -3/10=-0.3.
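For readers who want to check the arithmetic, the toy calculation can be reproduced with a few lines of Python. The rank lists below are copied from the fictional table above, in label order; the variable names are illustrative only.

```python
from itertools import combinations

# NOCs in label order 1..5, with ranks copied from the fictional table.
nocs = ["Afghanistan", "Bahamas", "Cambodia", "Denmark", "Ecuador"]
true_rank = [5, 3, 1, 2, 5]
forecast_rank = [2, 2, 4, 4, 4]


def sign(x):
    """Return 1 for positive x, -1 for negative x and 0 for zero."""
    return (x > 0) - (x < 0)


scores = []
for i, j in combinations(range(len(nocs)), 2):
    true_sign = sign(true_rank[i] - true_rank[j])
    forecast_sign = sign(forecast_rank[i] - forecast_rank[j])
    # The product is 1 for agreement, -1 for disagreement, 0 if either is tied.
    scores.append(true_sign * forecast_sign)
    print(nocs[i], nocs[j], true_sign, forecast_sign, scores[-1])

tau = sum(scores) / len(scores)
print("tau =", tau)
```

The printed pair scores match the right-most column of the table, and their sum divided by the ten pairs gives the tau score of -0.3.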
Extra info
NOCs without state affiliation: The submission template provided did not include rows for Olympic Committees for neutral athletes or refugees. Their ranks will not contribute to the prediction competition scores. The template did include rows for the Belarusian and Russian NOCs, which are not taking part in the games. Predictions for these NOCs will not be used when calculating the prediction competition scores.
Bonus fact: Kendall’s tau was devised and analysed by Maurice Kendall, who served as President of the Institute of Statisticians, which broke away from, and then merged with, the Royal Statistical Society.
Bonus reminder: We are orienting our ranks so that smaller numbers correspond to more medals. When we talk about the medals table, countries at the top are those with the most medals and the smallest ranks.