RSS Olympic prediction competition 2024

Context

Each summer the Royal Statistical Society’s Statistics in Sport Section organizes a prediction competition. For 2024 the competition, which is being sponsored by Amelco, involves predicting the medals table for the Paris Olympics.

This year’s competition is novel in several ways. Firstly, forecasters will be asked to provide a single predicted ranking whose distance from the true ranking at the end of the games will determine their score. This stands in contrast to previous forecasting competitions, in which forecasters have been required to specify probabilities for the outcomes of individual sports matches in a tournament. Secondly, the ‘true ranking’ for the Olympics is not as well defined or officially endorsed as it is for other events.

For the purposes of this competition the true final ranking will be determined by the number of gold medals won by each country. Silver medals will be used only to break ties between countries with equal numbers of gold medals. Bronze medals will be used to break ties between countries after silver medals are accounted for. Countries still tied after accounting for bronze medals will remain tied.
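As a rough illustration of this rule, the sketch below orders NOCs by gold, then silver, then bronze, and leaves identical medal tallies tied. The function name `medal_ranks` and the medal counts are made up for illustration. Giving tied NOCs the largest position in their group is an assumption, chosen because it appears to match how ties are numbered in the tables on this page. Any consistent way of labelling ties leads to the same Kendall's tau.

```python
from collections import Counter

def medal_ranks(medals):
    """Rank NOCs by gold, then silver, then bronze.

    `medals` maps an NOC name to a (gold, silver, bronze) tuple.
    NOCs with identical tuples stay tied and share the largest
    position in their tied group (an assumed labelling convention).
    """
    counts = Counter(medals.values())
    ranks = {}
    position = 0
    # Walk through the distinct medal tallies from best to worst.
    # Tuples compare element by element, so gold is compared first.
    for tally in sorted(counts, reverse=True):
        position += counts[tally]  # last position occupied by this group
        for noc, noc_tally in medals.items():
            if noc_tally == tally:
                ranks[noc] = position
    return ranks

# Fictional medal counts, for illustration only.
toy = {
    "Afghanistan": (0, 0, 0),
    "Bahamas": (0, 0, 1),
    "Cambodia": (2, 2, 0),
    "Denmark": (0, 1, 0),
    "Ecuador": (0, 0, 0),
}
print(medal_ranks(toy))
```

Here Afghanistan and Ecuador have identical tallies, so they remain tied at the bottom of the ranking.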

Scoring

Predicted rankings will be scored using a statistic called Kendall’s tau, or the Kendall rank correlation coefficient. There are (at least) two ways to compute this statistic, which shed light on its meaning. To understand what the statistic quantifies it is first necessary to think about all the possible pairwise comparisons between countries. With $n$ participating countries there are $n(n-1)/2=\binom{n}{2}$ of these. A forecaster will gain a point for every pair which they have placed in the correct order, and they will lose a point for every pair which they placed in the wrong order. If a pair of countries is tied, either in the predicted ranking or the true ranking, this pair does not contribute to the score. The score is then scaled by the total number of pairs so that it falls between -1 and 1.

We can write this down as

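$$\tau = \frac{\text{number of concordant pairs} - \text{number of discordant pairs}}{\text{number of pairs}} = \frac{2}{n(n-1)} \sum_{i<j} \operatorname{sign}(R_i - R_j)\,\operatorname{sign}(S_i - S_j)$$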

where ‘concordant’ is a word used in the academic literature to mean ‘in agreement’. Similarly ‘discordant’ means ‘in disagreement’. For example, if a forecaster ranks Kiribati higher than Tanzania and, indeed, Kiribati is higher than Tanzania in the final true ranking then we say that the forecaster has a concordant pair with the true ranking. As alluded to in the equation above, it is also possible to show that Kendall’s tau is the correlation between the signs of the differences in ranks. In this expression $R_i$ denotes the numerical rank assigned by a forecaster to a country (e.g. if a forecaster predicted country $i$ to be second from the top in the medals table they would set $R_i=2$). The quantity $S_i$ denotes the true rank for country $i$. The sign function used here returns value one when its argument is positive, minus one when its argument is negative and zero when its argument is zero.
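To see why the two forms of the formula agree, it may help to spell out what a single term of the sum contributes. The following is a sketch of that step, in which $C$ and $D$ are shorthand introduced here (not used elsewhere on this page) for the numbers of concordant and discordant pairs.

```latex
\[
\operatorname{sign}(R_i - R_j)\,\operatorname{sign}(S_i - S_j) =
\begin{cases}
+1 & \text{if the pair } (i,j) \text{ is concordant},\\
-1 & \text{if the pair } (i,j) \text{ is discordant},\\
\phantom{+}0 & \text{if } R_i = R_j \text{ or } S_i = S_j,
\end{cases}
\qquad\text{so}\qquad
\sum_{i<j} \operatorname{sign}(R_i - R_j)\,\operatorname{sign}(S_i - S_j) = C - D .
\]
```

Dividing both sides by the number of pairs, $n(n-1)/2$, gives the score described above.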

There are 206 countries participating in the 2024 Olympic games, although officially they are referred to as National Olympic Committees (NOCs) rather than countries. In the submission template you will find a list of the NOCs and, as a default, they are all ranked jointly at 206th. This default entry will receive a Kendall’s tau score of zero because, as mentioned above, tied pairs contribute nothing to the score - they are considered neither concordant nor discordant. Forecasters are invited to modify the ranks in any way they see fit. You could, for example,

  • cluster countries into groups all with the same rank,
  • use non-integer ranks,
  • use ranks that go below 1!

The only thing that matters is that your ranks allow for the pairwise comparisons that contribute to Kendall’s tau.
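The short check below illustrates this point. It is a minimal sketch with made-up ranks for six imaginary NOCs, and the helper names `sign` and `kendall_tau` are our own. The score is computed directly from the pairwise sign products, so only the ordering of the ranks matters, not their values.

```python
from itertools import combinations

def sign(x):
    """Return 1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def kendall_tau(pred, true):
    """Sum of sign products over all pairs, divided by n(n-1)/2."""
    n = len(pred)
    total = sum(
        sign(pred[i] - pred[j]) * sign(true[i] - true[j])
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1) / 2)

true_ranks = [1, 2, 3, 4, 5, 6]      # made-up true ranking
integer_ranks = [2, 1, 3, 3, 6, 5]   # a forecast containing one tied pair
# The same ordering expressed with non-integer and negative values.
shifted_ranks = [-1.5, -7.0, 0.25, 0.25, 40.0, 3.3]
default_ranks = [206] * 6            # everything tied, as in the template

print(kendall_tau(integer_ranks, true_ranks))
print(kendall_tau(shifted_ranks, true_ranks))
print(kendall_tau(default_ranks, true_ranks))
```

The first two forecasts order the NOCs identically and so receive the same score, while the all-tied default scores zero.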

Submission template

You can download the submission template here. Only modify the numbers in the Rank column. In particular do not permute the rows of the NOCs.

Email completed submissions to ben.powell@york.ac.uk with the subject line SiS forecasting competition 2024. This email should contain an attached .csv file called “RSS_pred_comp_submission_NAME.csv” where NAME is the name you would like to appear on the leader board.
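For anyone filling in the template programmatically, the sketch below shows one way to change only the Rank values while keeping the rows in their original order. It works on an in-memory stand-in rather than the real file. The "NOC" column header, the three example rows and the `my_ranks` forecast are assumptions for illustration, so check them against the template you download.

```python
import csv
import io

# Stand-in for the downloaded template. Columns other than "Rank" are not
# described on this page, so the "NOC" header here is an assumption.
template_text = "NOC,Rank\nAfghanistan,206\nAlbania,206\nAlgeria,206\n"

# Hypothetical forecast: smaller numbers mean a higher predicted position.
my_ranks = {"Algeria": 1, "Albania": 2}

rows = list(csv.DictReader(io.StringIO(template_text)))
for row in rows:  # rows are visited, and later written, in template order
    if row["NOC"] in my_ranks:
        row["Rank"] = my_ranks[row["NOC"]]  # only the Rank value changes

name = "NAME"  # replace with the name you want on the leader board
filename = f"RSS_pred_comp_submission_{name}.csv"

# For a real file, replace io.StringIO() with open(filename, "w", newline="").
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)

print(filename)
print(out.getvalue())
```

NOCs that are not given a new rank keep the template default, so they remain tied with one another.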

If you would like to be considered for the methodology prize, please provide a brief description of how you made your predictions.

Please make your submissions before Monday 22nd July. The games start on Friday 26th July but it will be necessary to have submission files a bit earlier in order to initialize the forecast scoring procedure and the leader board.

Prizes

The principal and most valuable prize for winning the forecasting competition is prestige (and a certificate)! Special attention and corresponding certificates will be awarded for

  1. Best overall score
  2. Best student score
  3. Most innovative methodology

The winners of these three awards will also be invited to the Royal Statistical Society conference to present their methods. This year’s conference will be held in Brighton in early September. Conference fees and travel/accommodation costs are being provided by sponsors Amelco.

Leader board

The competition is now over and our winner is John Edwards. Well done John and everyone who took part!

Forecaster Rank Tau
John Edwards 1 0.596
mohamed 2 0.564
OlymPicks 3 0.544
Hammers_O’Callaghan 4 0.543
ALEC 5 0.540
AJM 6 0.534
fa2410 7 0.533
ChristopherWharton 8 0.532
JKH 9 0.513
Orla_S 10 0.511
KaitoGoto 11 0.509
JoePenn 13 0.508
JohnDSouza 13 0.508
JPN2020 14 0.507
HarrySnart 16 0.495
WilliamBowers 16 0.495
LexieB 17 0.493
ANIK 18 0.486
Rank_Deficient 19 0.480
WeatherQuant 20 0.455
Kizmet24 21 0.426
tghaynes 22 0.395
Everyone_is_(equally)_awesome 23 0.000

Medals table

Last updated: 2024-08-11 21:13:10

X Code NOC Rank Gold Silver Bronze
198 USA United States 1 40 44 42
39 CHN China 2 40 27 24
97 JPN Japan 3 20 12 13
11 AUS Australia 4 18 19 16
65 FRA France 5 16 26 22
136 NED Netherlands 6 15 7 12
69 GBR Great Britain 7 14 22 29
102 KOR South Korea 8 13 9 10
93 ITA Italy 9 12 13 15
73 GER Germany 10 12 13 8
142 NZL New Zealand 11 10 7 3
34 CAN Canada 12 9 7 11
199 UZB Uzbekistan 13 8 2 3
84 HUN Hungary 14 6 7 6
60 ESP Spain 15 5 4 9
179 SWE Sweden 16 4 4 3
99 KEN Kenya 17 4 2 5
140 NOR Norway 18 4 1 3
88 IRL Ireland 19 4 0 3
27 BRA Brazil 20 3 7 10
87 IRI Iran 21 3 6 3
196 UKR Ukraine 22 3 5 4
157 ROU Romania 23 3 4 2
71 GEO Georgia 24 3 3 1
18 BEL Belgium 25 3 1 6
30 BUL Bulgaria 26 3 1 3
171 SRB Serbia 27 3 1 1
51 CZE Czechia 28 3 0 2
52 DEN Denmark 29 2 2 5
13 AZE Azerbaijan 31 2 2 3
48 CRO Croatia 31 2 2 3
49 CUB Cuba 32 2 1 6
28 BRN Bahrain 33 2 1 1
167 SLO Slovenia 34 2 1 0
189 TPE Chinese Taipei 35 2 0 5
12 AUT Austria 36 2 0 3
82 HKG Hong Kong, China 38 2 0 2
148 PHI Philippines 38 2 0 2
3 ALG Algeria 40 2 0 1
85 INA Indonesia 40 2 0 1
91 ISR Israel 41 1 5 1
152 POL Poland 42 1 4 5
98 KAZ Kazakhstan 43 1 3 3
95 JAM Jamaica 46 1 3 2
158 RSA South Africa 46 1 3 2
184 THA Thailand 46 1 3 2
62 ETH Ethiopia 47 1 3 0
176 SUI Switzerland 48 1 2 5
56 ECU Ecuador 49 1 2 2
153 POR Portugal 50 1 2 1
75 GRE Greece 51 1 1 6
7 ARG Argentina 54 1 1 1
57 EGY Egypt 54 1 1 1
191 TUN Tunisia 54 1 1 1
26 BOT Botswana 58 1 1 0
38 CHI Chile 58 1 1 0
111 LCA Saint Lucia 58 1 1 0
195 UGA Uganda 58 1 1 0
55 DOM Dominican Republic 59 1 0 2
77 GUA Guatemala 61 1 0 1
117 MAR Morocco 61 1 0 1
54 DMA Dominica 63 1 0 0
144 PAK Pakistan 63 1 0 0
192 TUR Turkey 64 0 3 5
122 MEX Mexico 65 0 3 2
8 ARM Armenia 67 0 3 1
44 COL Colombia 67 0 3 1
100 KGZ Kyrgyzstan 69 0 2 4
154 PRK North Korea 69 0 2 4
114 LTU Lithuania 70 0 2 2
86 IND India 71 0 1 5
120 MDA Moldova 72 0 1 3
103 KOS Kosovo 73 0 1 1
50 CYP Cyprus 78 0 1 0
63 FIJ Fiji 78 0 1 0
96 JOR Jordan 78 0 1 0
123 MGL Mongolia 78 0 1 0
145 PAN Panama 78 0 1 0
185 TJK Tajikistan 79 0 0 3
2 ALB Albania 83 0 0 2
76 GRN Grenada 83 0 0 2
118 MAS Malaysia 83 0 0 2
155 PUR Puerto Rico 83 0 0 2
40 CIV Ivory Coast 89 0 0 1
46 CPV Cape Verde 89 0 0 1
147 PER Peru 89 0 0 1
156 QAT Qatar 89 0 0 1
164 SGP Singapore 89 0 0 1
178 SVK Slovakia 89 0 0 1
1 AFG Afghanistan 204 0 0 0
4 AND Andorra 204 0 0 0
5 ANG Angola 204 0 0 0
6 ANT Antigua and Barbuda 204 0 0 0
9 ARU Aruba 204 0 0 0
10 ASA American Samoa 204 0 0 0
14 BAH Bahamas 204 0 0 0
15 BAN Bangladesh 204 0 0 0
16 BAR Barbados 204 0 0 0
17 BDI Burundi 204 0 0 0
19 BEN Benin 204 0 0 0
20 BER Bermuda 204 0 0 0
21 BHU Bhutan 204 0 0 0
22 BIH Bosnia and Herzegovina 204 0 0 0
23 BIZ Belize 204 0 0 0
25 BOL Bolivia 204 0 0 0
29 BRU Brunei 204 0 0 0
31 BUR Burkina Faso 204 0 0 0
32 CAF Central African Republic 204 0 0 0
33 CAM Cambodia 204 0 0 0
35 CAY Cayman Islands 204 0 0 0
36 CGO Republic of the Congo 204 0 0 0
37 CHA Chad 204 0 0 0
41 CMR Cameroon 204 0 0 0
42 COD Democratic Republic of the Congo 204 0 0 0
43 COK Cook Islands 204 0 0 0
45 COM Comoros 204 0 0 0
47 CRC Costa Rica 204 0 0 0
53 DJI Djibouti 204 0 0 0
58 ERI Eritrea 204 0 0 0
59 ESA El Salvador 204 0 0 0
61 EST Estonia 204 0 0 0
64 FIN Finland 204 0 0 0
66 FSM Federated States of Micronesia 204 0 0 0
67 GAB Gabon 204 0 0 0
68 GAM The Gambia 204 0 0 0
70 GBS Guinea-Bissau 204 0 0 0
72 GEQ Equatorial Guinea 204 0 0 0
74 GHA Ghana 204 0 0 0
78 GUI Guinea 204 0 0 0
79 GUM Guam 204 0 0 0
80 GUY Guyana 204 0 0 0
81 HAI Haiti 204 0 0 0
83 HON Honduras 204 0 0 0
89 IRQ Iraq 204 0 0 0
90 ISL Iceland 204 0 0 0
92 ISV Virgin Islands 204 0 0 0
94 IVB British Virgin Islands 204 0 0 0
101 KIR Kiribati 204 0 0 0
104 KSA Saudi Arabia 204 0 0 0
105 KUW Kuwait 204 0 0 0
106 LAO Laos 204 0 0 0
107 LAT Latvia 204 0 0 0
108 LBA Libya 204 0 0 0
109 LBN Lebanon 204 0 0 0
110 LBR Liberia 204 0 0 0
112 LES Lesotho 204 0 0 0
113 LIE Liechtenstein 204 0 0 0
115 LUX Luxembourg 204 0 0 0
116 MAD Madagascar 204 0 0 0
119 MAW Malawi 204 0 0 0
121 MDV Maldives 204 0 0 0
124 MHL Marshall Islands 204 0 0 0
125 MKD North Macedonia 204 0 0 0
126 MLI Mali 204 0 0 0
127 MLT Malta 204 0 0 0
128 MNE Montenegro 204 0 0 0
129 MON Monaco 204 0 0 0
130 MOZ Mozambique 204 0 0 0
131 MRI Mauritius 204 0 0 0
132 MTN Mauritania 204 0 0 0
133 MYA Myanmar 204 0 0 0
134 NAM Namibia 204 0 0 0
135 NCA Nicaragua 204 0 0 0
137 NEP Nepal 204 0 0 0
138 NGR Nigeria 204 0 0 0
139 NIG Niger 204 0 0 0
141 NRU Nauru 204 0 0 0
143 OMA Oman 204 0 0 0
146 PAR Paraguay 204 0 0 0
149 PLE Palestine 204 0 0 0
150 PLW Palau 204 0 0 0
151 PNG Papua New Guinea 204 0 0 0
160 RWA Rwanda 204 0 0 0
161 SAM Samoa 204 0 0 0
162 SEN Senegal 204 0 0 0
163 SEY Seychelles 204 0 0 0
165 SKN Saint Kitts and Nevis 204 0 0 0
166 SLE Sierra Leone 204 0 0 0
168 SMR San Marino 204 0 0 0
169 SOL Solomon Islands 204 0 0 0
170 SOM Somalia 204 0 0 0
172 SRI Sri Lanka 204 0 0 0
173 SSD South Sudan 204 0 0 0
174 STP São Tomé and Príncipe 204 0 0 0
175 SUD Sudan 204 0 0 0
177 SUR Suriname 204 0 0 0
180 SWZ Eswatini 204 0 0 0
181 SYR Syria 204 0 0 0
182 TAN Tanzania 204 0 0 0
183 TGA Tonga 204 0 0 0
186 TKM Turkmenistan 204 0 0 0
187 TLS East Timor 204 0 0 0
188 TOG Togo 204 0 0 0
190 TTO Trinidad and Tobago 204 0 0 0
193 TUV Tuvalu 204 0 0 0
194 UAE United Arab Emirates 204 0 0 0
197 URU Uruguay 204 0 0 0
200 VAN Vanuatu 204 0 0 0
201 VEN Venezuela 204 0 0 0
202 VIE Vietnam 204 0 0 0
203 VIN Saint Vincent and the Grenadines 204 0 0 0
204 YEM Yemen 204 0 0 0
205 ZAM Zambia 204 0 0 0
206 ZIM Zimbabwe 204 0 0 0

Data & resources

The International Olympic Committee maintains a suite of webpages that contain a large amount of data for all previous games. You can access them here although it is not easy to scrape the data in an automated way. Wikipedia has a useful all-time medals table here. Perhaps most useful, however, is the set of medals tables provided by Rob Wood on his website topendsports.com. Even more data is available from a generous Kaggle user (R.Griffin) here.

You are free to use any data you can find to inform your predictions.

You might also be interested in YouTube video-seminars from participants in previous prediction competitions. These include presentations from the winners of the 2020 prediction competition and the winners of the 2023 competition. Finally, if you would like to be kept up to date on the activities of the RSS’s Statistics in Sports Section you can sign up to their mailing list here.

In this YouTube video Dr. Jess Hargreaves (Univ. York) hears from Dr. Johan Rewilak (Univ. South Carolina) on his thoughts about predicting the Olympics.

Lexie Bonas, an undergraduate student at the University of York, has also provided a quick explanation of her submission.

Example scoring calculation

To help forecasters understand the scoring system we provide the following cartoon example. Below is a fictional medals table that is used to compute a true rank for each of five NOCs. The table also includes a fictional set of predicted ranks. The numbers on the far left of the table are just labels arbitrarily (alphabetically) assigned to the NOCs to help refer to them - a bit like the NOC codes in the real medals table.

Label NOC True_rank Forecast_rank Gold Silver Bronze
3 Cambodia 1 4 2 2 0
4 Denmark 2 4 0 1 0
2 Bahamas 3 2 0 0 1
1 Afghanistan 5 2 0 0 0
5 Ecuador 5 4 0 0 0

To score the predicted ranks we enumerate all pairs of NOCs and, for each one, check whether the true ordering and the predicted ordering match. These orderings are quantified using the sign function (which returns values -1, 0 or 1) applied to the difference between the ranks. If the orderings agree the forecaster scores a point, if they disagree the forecaster loses a point. If either the true or predicted ranks are tied for a particular pair of NOCs then no points are scored or lost. To quantify this sort of agreement we multiply the signs of the rank differences together. The results of these calculations for this toy example are presented in the table below.

i j NOC_i NOC_j True_rank_i True_rank_j True_sign Forecast_rank_i Forecast_rank_j Forecast_sign Forecast_score
1 2 Afghanistan Bahamas 5 3 1 2 2 0 0
1 3 Afghanistan Cambodia 5 1 1 2 4 -1 -1
1 4 Afghanistan Denmark 5 2 1 2 4 -1 -1
1 5 Afghanistan Ecuador 5 5 0 2 4 -1 0
2 3 Bahamas Cambodia 3 1 1 2 4 -1 -1
2 4 Bahamas Denmark 3 2 1 2 4 -1 -1
2 5 Bahamas Ecuador 3 5 -1 2 4 -1 1
3 4 Cambodia Denmark 1 2 -1 4 4 0 0
3 5 Cambodia Ecuador 1 5 -1 4 4 0 0
4 5 Denmark Ecuador 2 5 -1 4 4 0 0

The forecaster’s tau score is their cumulative score (the sum of the right-most column) divided by the number of pairs, i.e. -3/10=-0.3.
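The calculation can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the toy example, not the code used to run the competition, and the variable names are our own.

```python
from itertools import combinations

def sign(x):
    """Return 1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

# (label, NOC, true rank, forecast rank), copied from the fictional table.
nocs = [
    (1, "Afghanistan", 5, 2),
    (2, "Bahamas", 3, 2),
    (3, "Cambodia", 1, 4),
    (4, "Denmark", 2, 4),
    (5, "Ecuador", 5, 4),
]

scores = []
for (i, noc_i, s_i, r_i), (j, noc_j, s_j, r_j) in combinations(nocs, 2):
    true_sign = sign(s_i - s_j)      # ordering of the pair in the true ranking
    forecast_sign = sign(r_i - r_j)  # ordering of the pair in the forecast
    scores.append(true_sign * forecast_sign)
    print(i, j, noc_i, noc_j, true_sign, forecast_sign, scores[-1])

tau = sum(scores) / len(scores)      # cumulative score divided by number of pairs
print(tau)
```

Of the ten pairs, one is concordant, four are discordant and five involve a tie, which gives the cumulative score of -3.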

Extra info

NOCs without state affiliation: The submission template provided did not include rows for the Olympic committees representing neutral athletes or refugees. Their ranks will not contribute to the prediction competition scores. The template did include rows for the Belarusian and Russian NOCs, which are not taking part in the games. Predictions for these NOCs will not be used when calculating the prediction competition scores.

Bonus fact: Kendall’s tau was devised and analysed by Maurice Kendall, who served as President of the Institute of Statisticians, which broke away from, and then merged with, the Royal Statistical Society.

Bonus reminder: We are orienting our ranks so that smaller numbers correspond to more medals. When we talk about the medals table, countries at the top are those with the most medals and the smallest ranks.