Patterns in performance and reward in the heptathlon and decathlon
While watching the 2024 Olympic heptathlon competition, I was reminded that the points awarded in each event always show a pattern: the first event, the 100 metres hurdles, usually produces high points totals across the board, while the shot put, the third event, tends to return a couple of hundred fewer points per athlete.
This prompted me to look at two questions: i) why do the points come out as they do, and ii) does this mean that some events matter more than others for winning the heptathlon? The same questions apply to the decathlon, of course, which is also examined here.
Data
I collected the results for the World Championship heptathlon and decathlon competitions from 2007 to 2023 from Wikipedia. These are the elite levels of performance for the two multi-event competitions, so the insights gained from this analysis apply only to this level and not necessarily to heptathlon or decathlon competitions in general. Details on the scoring systems were found on SportsCalculators, and are originally published by World Athletics.
In the analysis that follows I will use the word ‘score’ to refer to the physical performance mark recorded by the athlete in each event (height, length/distance, or time), and ‘points’ to refer to the number of heptathlon or decathlon points that are received for that score.
Results: Heptathlon
Points spreads
The average (median) number of points received for the heptathlon events in Table 1 shows clearly the pattern we’re talking about: the sprint events (200m and, in particular, 100m hurdles) provide around 200 points more, on average, than do the throwing events (javelin and shot put). This seems surprising, but is not necessarily of any importance, since all athletes are competing in all events, so it is only the points scored relative to one another that matter. The third column of Table 1 shows the interquartile range of points; that is, the difference between the 25th and 75th percentiles, or the zone in which the ‘middle half’ of athletes lie. Here we see that the running events show the lowest spreads in points, while the high jump and javelin have the largest ranges. This suggests that the difference between performing quite poorly and quite well (relative to World-Championship-level competitors) is more important, in points terms, in some events than it is in others.
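As a minimal sketch of the two summary statistics used here, the median and interquartile range of points for one event could be computed as below. The function name and the example points are hypothetical and are not the World Championship data; the 'inclusive' quantile method is also an assumption, as the method behind the table is not stated.

```python
import statistics

def median_and_iqr(points):
    """Return (median, interquartile range) for a list of event points."""
    # quantiles(n=4) returns the 25th, 50th and 75th percentiles.
    q1, q2, q3 = statistics.quantiles(points, n=4, method="inclusive")
    return q2, q3 - q1

# Hypothetical points for a single event, for illustration only.
example_points = [905, 940, 978, 1001, 1015, 1034, 1060, 1093]
median, iqr = median_and_iqr(example_points)
print(f"median = {median}, interquartile range = {iqr}")
```

Running this per event and placing the results side by side gives a table of the same shape as the one described.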
Scoring system
The reason for this effect is the scoring system. The systems for both heptathlon and decathlon have been in their current form since 1984, and use for each event an equation of the form
points = a * (difference between score and reference score b) ^ c
where ‘^’ means ‘to the power of’. Each event therefore requires three coefficients, a, b, and c, to be defined. The values of the coefficients are not comparable individually between events, but the combination of the three creates a points curve for each event, as shown in Figure 1, below. The World Athletics document on the scoring systems explains that one factor in the selection of the coefficients is the world record in each event: it is desired that a world-record performance in any event should yield the same number of points. They note that, in practice, this means that ‘the best scores set in each individual event will vary widely’, but that it is more important that ‘the differences in the scores between different athletes in one event are roughly proportional to the differences in their performances’.
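A minimal sketch of this formula is given below. The coefficient values for the heptathlon 100m hurdles and javelin are quoted as I understand them from the published World Athletics tables and should be treated as assumptions to verify against that document. Rounding down to a whole number and returning zero for a score worse than the reference are likewise assumptions of this sketch, as is the function name.

```python
import math

def event_points(score, a, b, c, lower_is_better):
    """Points for one event: a * (difference from reference b) ** c, rounded down."""
    # Timed events reward being faster than b; jumps and throws reward exceeding b.
    diff = (b - score) if lower_is_better else (score - b)
    if diff <= 0:
        return 0  # assumed: no points for a score worse than the reference
    return math.floor(a * diff ** c)

# Assumed coefficients (check against the World Athletics scoring tables).
HURDLES_100M = dict(a=9.23076, b=26.7, c=1.835, lower_is_better=True)  # seconds
JAVELIN = dict(a=15.9803, b=3.8, c=1.04, lower_is_better=False)        # metres

hurdles_points = event_points(13.85, **HURDLES_100M)
javelin_points = event_points(45.00, **JAVELIN)
print(hurdles_points, javelin_points)
```

With exponents this close to 1 for the throws, the curve is nearly a straight line, which is consistent with the near-linear shapes described for Figure 1.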
The blue lines in Figure 1 show the relationship between score and points in each event. The lines are almost straight, particularly within the green shaded score regions, which is where 80% of scores are contained, meaning that the scoring system can, practically, be regarded as a linear system. (There is a slight upward bend to some of the curves, particularly at the high-score end of the long jump and 800m, which means that exceptional performance is rewarded slightly more than would occur in a truly linear system, but these differences are small and are not the key point of this analysis.)
What is more interesting is the range of points that are realistically available from each event. This is indicated by the vertical arrows, which show the points increase obtained by moving from a score at the 10th percentile (left edge of the green area) to a score at the 90th percentile (right edge of the green area). 8 out of 10 performances occur in these ranges, and anything lower or higher is somewhat exceptional for that event. The size of this points range (the height of the arrow) is clearly larger in some events, most noticeably the javelin, than in others, most noticeably the 100m hurdles. This is almost the same information as seen in the interquartile range numbers earlier.
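The height of each arrow can be sketched as the points difference between the 10th and 90th percentile scores. The function names, the javelin coefficients, and the example distances below are assumptions for illustration, not the data behind Figure 1.

```python
import statistics

def points_range_10_90(scores, points_fn):
    """Points difference between the 10th and 90th percentile scores."""
    # quantiles(n=10) returns the nine deciles; take the first and last.
    deciles = statistics.quantiles(scores, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    # abs() keeps the result positive for timed events, where low is good.
    return abs(points_fn(p90) - points_fn(p10))

def javelin_points(distance_m):
    """Assumed heptathlon javelin curve (coefficients to be verified)."""
    return 15.9803 * max(distance_m - 3.8, 0.0) ** 1.04

# Hypothetical javelin distances in metres.
distances = [36.5, 39.0, 41.2, 42.8, 44.0, 45.5, 47.1, 48.9, 50.4, 53.0, 56.2]
arrow_height = points_range_10_90(distances, javelin_points)
print(round(arrow_height))
```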
The position of the world record in each event (dashed red lines) shows why the average number of points is lower in the throwing events: the average heptathlete can only throw about 60% as far as the world record (set by the best specialist, single-event athlete) for shot put or javelin, but the same heptathlete can attain a speed of between 85% and 90% of the world record (converted from time) in the 100m hurdles and the 200m.
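The conversion from time to speed can be sketched as follows: over a fixed distance, the ratio of speeds is the inverse of the ratio of times. The athlete scores below are hypothetical, and the world-record values are quoted from memory and should be checked; the function name is my own.

```python
def fraction_of_record(score, record, lower_is_better):
    """Performance as a fraction of the world record."""
    if lower_is_better:
        # Same distance, so speed ratio = record time / athlete time.
        return record / score
    return score / record

# Assumed world records: 12.12 s (100m hurdles) and 72.28 m (javelin).
# The athlete scores of 13.50 s and 45.00 m are hypothetical.
hurdles_fraction = fraction_of_record(13.50, 12.12, lower_is_better=True)
javelin_fraction = fraction_of_record(45.00, 72.28, lower_is_better=False)
print(f"hurdles: {hurdles_fraction:.0%}, javelin: {javelin_fraction:.0%}")
```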
There could be several reasons for this, but one important follow-on point that it seems fair to assume is that an event in which performances are far from the world record has more potential for improvements than an event in which performances are already close to their ultimate limit. To put this another way, it seems that most heptathletes are able to run fairly well compared to the specialists in those events (including the 800m), but in the javelin, their performances generally show relatively large deficiencies compared to what is possible in the event. Importantly though, as the wider score spread for javelin shows, some heptathletes can throw the javelin fairly well.
To make this clearer, I next use the distributions of scores in each event to measure the points gain that would result from an athlete improving her performance in the event from the 50th percentile (better than half of her competitors) to the 60th percentile (better than 6 in 10 competitors). The intention with this metric is that this improvement might be equally difficult, or equally achievable, in each event, as it is measured by what other heptathletes have achieved. When finding the percentile levels, to avoid problems with the discrete nature of scores in the high jump (only jumps every 3 cm are possible), the raw scores from each event are modelled as distributions (normal in most cases, and log-normal in the cases of high jump, 100m hurdles, and 800m), and the percentiles are computed from these fitted distributions.
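A minimal sketch of this percentile calculation is shown below, assuming a simple moment-based fit (sample mean and standard deviation, taken on the log of the scores for the log-normal case). The fitting method, the function names, and the example data are my assumptions for illustration rather than the exact procedure used for the figures.

```python
import math
from statistics import NormalDist

def model_percentile(scores, pct, log_normal=False):
    """Score at percentile pct (0-100) from a normal or log-normal fit."""
    values = [math.log(s) for s in scores] if log_normal else list(scores)
    fitted = NormalDist.from_samples(values)
    q = fitted.inv_cdf(pct / 100)
    return math.exp(q) if log_normal else q

def points_gain(scores, points_fn, lower_is_better=False, log_normal=False):
    """Points gained by moving from the 50th to the 60th percentile performer."""
    p50 = model_percentile(scores, 50, log_normal)
    # In timed events the better performer sits at the 40th percentile of times.
    better = 40 if lower_is_better else 60
    p60 = model_percentile(scores, better, log_normal)
    return points_fn(p60) - points_fn(p50)

# Hypothetical javelin distances (metres) and an assumed javelin points curve.
distances = [38.0, 41.5, 43.0, 45.2, 46.8, 48.1, 50.3, 53.9]
gain = points_gain(distances, lambda d: 15.9803 * (d - 3.8) ** 1.04)
print(round(gain, 1))
```

Because the percentiles come from a continuous fitted distribution, events with coarse score steps such as the high jump still give a smooth answer.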
The results (Figure 2) confirm that improving by the same amount relative to one’s peers earns more points in the field events than in the running events. Moving up by ten percentile points relative to the competition in the javelin would be the most beneficial performance gain to make, earning the heptathlete 28 points. Later we will discuss whether or not it is truly as easy to make this ten-percentile-point improvement in the throws as it is in the track events.
Results: Decathlon
The picture in the decathlon is broadly similar to that described for the heptathlon. In Table 2, we find that, again, the javelin shows the biggest interquartile range for points scored, followed by the 1500m (which sits much higher up the list than the women’s most similar event, the 800m) and the pole vault. Conversely, the sprint events (100m, 110m hurdles, and 400m) show smaller points spreads. The difference in spreads between the top and bottom events is not quite as severe as it is in the heptathlon.
The score distributions and points curves (Figure 3) show, through the length of the arrows, the events in which the typical range of scores (the width of the green areas) yields the most points difference: the javelin, pole vault, and some way behind, the discus and 1500m. Again, the sprint hurdles shows a small points spread: the worst hurdlers are not penalised much in comparison to the best hurdlers.
The 1500m sits lowest in points terms of any event, with a 10th percentile performance worth only around 600 points. This is a result of the steepness of the blue line, which dictates how much the decathlete is penalised for each additional second that they are away from the world record. The blue line does not have to look like this, but it does as a result of the choice of the coefficients a, b, and c. On the plus side, the steepness of the line creates a relatively large points difference between different scores in the 1500m, as seen below.
Using the same technique as before of modelling event scores as distributions (log-normal for 1500m, 110m hurdles, and javelin, and normal for the rest) and computing percentiles, Figure 4 shows the same pattern as the heptathlon events: the same percentile improvement in score yields the most points in javelin, and the fewest points in the sprint events.
Event correlation
The above results suggest that a few of the most technical events should be the ones that athletes focus on to gain points most easily. However, training in one event will naturally lead to improvements in some other events as well (and possibly degrade performance in others), so it is not so simple as to be able to consider each event in isolation. The value of effort spent on one event will depend on both the points gain that is possible in that event and the complementary benefits obtained in other, similar events.
The correlation plot of Figure 5 shows how scores in each of the heptathlon events are correlated with one another. The highest correlations are between the 200m, 100m hurdles, and long jump. This is not surprising, as a good sprinter will likely perform well in all of these events. There are also correlations, though smaller, between scores in long jump and high jump, and in shot put and javelin.
It is notable that the javelin shows the least correlation with the other events overall. This is consistent with it being an event requiring its own specific technique, and a heptathlete does not naturally become much better at the javelin by improving in any other events, except for (somewhat) the shot put. Javelin even shows a small negative correlation with both the 200m and 800m: the better javelin throwers in heptathlon tend to be the worse runners.
In decathlon (Figure 6), the sprint events are correlated with one another, as are shot put and discus, whereas pole vault, javelin, and 1500m show quite weak correlations with almost all of the other events.
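A correlation grid of this kind can be sketched as below, assuming the Pearson coefficient and assuming that times are negated so that a higher value is always the better performance. Neither choice is stated for Figures 5 and 6, and the function names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlation_matrix(scores_by_event):
    """Nested dict of pairwise correlations between events."""
    events = list(scores_by_event)
    return {e1: {e2: pearson(scores_by_event[e1], scores_by_event[e2])
                 for e2 in events} for e1 in events}

# Hypothetical scores for five athletes; 200m times are negated (assumption).
scores = {
    "200m": [-23.4, -24.1, -24.6, -23.9, -25.0],
    "long jump": [6.55, 6.20, 6.02, 6.31, 5.90],
    "javelin": [41.0, 52.3, 47.8, 39.5, 50.1],
}
grid = correlation_matrix(scores)
print(round(grid["200m"]["long jump"], 2), round(grid["200m"]["javelin"], 2))
```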
This changes the message from the previous section. While improving relative to the rest of the field in javelin, pole vault, or middle distance running should provide the biggest points gain per unit of improvement, the benefit of this is undercut by a possible decrease in performance in other events. On the other hand, improving skill in one of the sprint-based events tends to create gains in similar events at the same time, perhaps making this a more efficient approach to the competitions overall.
Finally, the right-most column in each of the correlation grids (Figures 5 and 6) seems to confirm this. This column shows the correlation between scores in each event and final position in the heptathlon or decathlon competition (multiplied by -1 so that a high finish becomes the largest number). The biggest correlation with position — that is, the event in which the athlete’s score most dictates where they finish in the overall competition — is found in the long jump in both heptathlon and decathlon, followed by hurdles and high jump in heptathlon, and by hurdles and 400m in decathlon. The importance of long jump is likely due to its ‘centrality’ in the competitions: its relatively high correlations with several other events. Conversely, performance in the 1500m is the least correlated with finish position in decathlon. This is probably because the best 1500m runners tend not to hold advantages in other events too, so often end up lower in the overall standings. The decathlon is more often won by a strong sprinter, because this athlete is rewarded several times over for this skill, with points in the 100m, hurdles, 400m, and long jump.
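The right-most column can be sketched in the same way: rank the athletes by total points, correlate each event's scores with those finishing positions, and flip the sign. Pearson correlation, the handling of ties (ignored here), and all names and data are assumptions of this sketch.

```python
import math
import statistics

def finishing_positions(total_points):
    """Position of each athlete, where 1 is the highest total (ties ignored)."""
    order = sorted(range(len(total_points)), key=lambda i: -total_points[i])
    positions = [0] * len(total_points)
    for rank, athlete in enumerate(order, start=1):
        positions[athlete] = rank
    return positions

def position_correlation(event_scores, total_points):
    """Correlation of event score with finish position, multiplied by -1."""
    positions = finishing_positions(total_points)
    mean_s, mean_p = statistics.mean(event_scores), statistics.mean(positions)
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(event_scores, positions))
    spread_s = math.sqrt(sum((s - mean_s) ** 2 for s in event_scores))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in positions))
    return -cov / (spread_s * spread_p)

# Hypothetical long jump distances (metres) and heptathlon totals.
long_jump = [6.50, 6.20, 6.00, 6.35]
totals = [6800, 6500, 6300, 6650]
print(round(position_correlation(long_jump, totals), 2))
```

A value near 1 means the best performers in that event tend to finish highest overall, while a value near 0 means the event says little about the final standings.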
Conclusion
The insights on how best to approach the heptathlon and decathlon turned out to be different from what I expected at the start of the analysis. Although it seems that effort to improve one’s score in the javelin is the most efficient way to increase total points, the athletes that perform best overall do not tend to do particularly well in javelin. This is likely because the skills required for javelin, and to a lesser extent, pole vault, discus, and high jump, do not transfer well to other events, so the benefit of the effort is limited to a points return in that event only.
It may be speculated that these are the most technically demanding events, in which it is perhaps possible, though difficult, for the athlete to ‘unlock’ big gains in performance through small adjustments in technique, whereas some of the other events are more controlled by strength or fitness, in which only incremental gains are feasible. Probably the difficulty in making those technical improvements, combined with the shared benefits of general fitness improvements, tips the balance in favour of improving scores in the sprint-related events.
This balance is nevertheless controlled by the scoring system. Each of the blue lines in Figures 1 and 3 is anchored at one point by the world record, but the gradient of the line appears to be a choice that could have been made differently. The bigger the gradient, the more emphasis is put on the difference in score between the best and worst athletes in that event. In fact, the large gradients for the javelin and other events may be necessary to balance these more isolated events against the shared benefits of improvements in the sprint events. If the points bars of Figures 2 and 4 were equal across all events, there would be less incentive to focus effort on the technical events in preference to the more intercorrelated sprint events.
There is no likelihood of the multi-event scoring systems changing in the near future. I would suggest that the right-most columns of Figures 5 and 6 show that the systems currently work well, as there are no events that are completely uncorrelated with finish position (which would mean that performance in those events didn’t matter to the overall result). If there were to be changes to the system, it might be a good idea to target more even correlations in this column, which would mean increasing the points gradient in the 800m and 1500m, in particular, to increase the benefit of performing well in these events, at the expense of the long jump and hurdles, which currently have a bigger bearing on the final standings.
Uneven Scoring in Multi-Event Athletics was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.