Rethinking Athletics: Theories of scoring: Harder's approach (and a proposed extension)

I have already written a short post on the theories of D. Harder. He is the author of the monograph "Apples to Oranges", in which he explains in detail his theory of scoring, one that allows comparisons between achievements in different sports. The key to this comparison is that you "... compare the number of athletes who reach any given level ..." (in proportion to the number of athletes competing in that sport, of course). The quantitative basis of the method is the following. A mark of 100 points is attributed to a performance if a fraction of 0.5 of the population can realise a performance equal to or better than it. For 200 points, only a fraction of 0.05 of the population can do as well or better. The next 100 points, i.e. 300, correspond to a performance realised by just 0.005 of the population, and so on up to 1000 points, where only a fraction of 5×10^-10 of the population (which means less than one person on earth) can realise the corresponding performance. Going in the other direction, a performance earning 50 points is one that 84 % of the population should be able to realise, while 0 points corresponds to a performance possible for 95 % of the population.
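To make the rule concrete, here is a minimal Python sketch of the points-to-fraction correspondence from 100 points upwards. The function names harder_fraction and harder_points are mine, not Harder's, and the sketch deliberately leaves out the part below 100 points, where the fractions quoted above (84 % and 95 %) do not follow the same divide-by-ten rule.

```python
import math


def harder_fraction(points: float) -> float:
    """Fraction of the population reaching the performance worth `points`.

    Assumes the rule quoted in the text: 0.5 at 100 points, divided by 10
    for every further 100 points. Only meant for points >= 100.
    """
    if points < 100:
        raise ValueError("this sketch only covers 100 points and above")
    return 0.5 * 10 ** (-(points - 100) / 100)


def harder_points(fraction: float) -> float:
    """Inverse mapping: points for a performance that `fraction` can reach."""
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    return 100 - 100 * math.log10(fraction / 0.5)


for pts in (100, 200, 300, 1000):
    print(pts, harder_fraction(pts))
```

The loop simply tabulates the fractions for the marks mentioned in the text.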

The figure below shows a comparison between the official World Athletics scoring tables and those of Harder for men's long jump. The official scoring tables attribute close to 1400 points to the performance corresponding to Harder's 1000, but this difference is not crucial.

It is remarkable that Harder had to use a different slope for his scoring curves in the part below 100 points. This is an attempt to accommodate a large part of the population in these tables. While this goes in the right direction, I believe that it is not quite sufficient. In fact, the part between 0 and 100 points is the weak one in Harder's work. In some events the scoring cannot go all the way down to 0. This is, for instance, the case for pole vault, where there is nothing below 1.20 m, corresponding to 80 points, since in high jump a 1.20 m performance already obtains 60 points. The treatment of the performances of the lower 50 % of the population definitely presents a difficulty and would necessitate specific studies. While this is not particularly interesting from a competition-oriented point of view, it is a challenge for Harder's approach, since the latter is meant to offer comparisons between any two sportsmen (the term understood in the broadest possible way).

The reasoning that led me to propose an extension of Harder's approach is that every effort deserves a reward. So a fair scoring should be able to allocate points to any performance: only zero performance should obtain zero points. This led me to the mathematical reformulation of Harder's approach, explained in what follows.

It is a well-established statistical fact that the distribution of human performance in various domains follows a bell-shaped curve (dashed curve in the figure below). The details may vary but the fact remains that most individuals perform close to the median with only a very small percentage registering exceptionally good or bad performances. Given this, the fraction of the population realising a performance better than some threshold x (which is crucial in Harder's approach) should be given by a sigmoidal curve (continuous curve below).
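The passage from the bell curve to the sigmoid can be illustrated numerically. The sketch below assumes, purely for illustration, that the bell curve is a Gaussian; this is not the analytic expression of my article, and the function name fraction_better_than as well as the standardised units (median 0, spread 1) are hypothetical choices.

```python
import math


def fraction_better_than(x: float, median: float, spread: float) -> float:
    """Fraction of a normally distributed population performing better than x.

    This is the upper tail (survival function) of the Gaussian bell curve.
    """
    return 0.5 * math.erfc((x - median) / (spread * math.sqrt(2)))


# Standardised, hypothetical units: median 0, spread 1.
for x in (-3, -2, -1, 0, 1, 2, 3):
    print(f"{x:+d}  {fraction_better_than(x, 0.0, 1.0):.4f}")
```

The printed column decreases steadily from nearly 1 to nearly 0 and passes through one half at the median, which is the sigmoidal shape referred to above.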

In my article, "The physical basis of scoring the athletic performance", published in New Studies in Athletics, 2007, volume 22:3, pages 47-53, I proposed an analytic expression for this curve:

f(x)=(k+1)/(k+exp(x))

It will be used in what follows in order to introduce a scoring system based on Harder's theory. Going back to the relation of points to performance and interpreting it in the spirit of Harder, i.e. a progress of 100 points means that the fraction of the population realising it is divided by 10, it is natural to introduce a logarithmic relation between points and performance, i.e. p is proportional to log(1/f), where log is the decimal logarithm (the points grow as the fraction f shrinks). More precisely, I propose the following scoring formula:

p=a log(1+b(exp(cx)-1))

where x is the performance (velocity for track events and length for the field ones). Notice that p is equal to 0 when x is also 0.
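The link between the expression for f and the scoring formula can be written out explicitly. The derivation below is a sketch under two assumptions that are not spelled out in the text: that the performance enters f through the rescaled variable cx, and that b is identified with 1/(k+1).

```latex
\begin{align*}
f(cx) &= \frac{k+1}{k+e^{cx}}, \qquad f(0)=1,\\
-\log_{10} f(cx) &= \log_{10}\frac{k+e^{cx}}{k+1}
                 = \log_{10}\!\left(1+\frac{e^{cx}-1}{k+1}\right),\\
p &= -a\,\log_{10} f(cx) = a\,\log_{10}\!\bigl(1+b\,(e^{cx}-1)\bigr),
\qquad b=\frac{1}{k+1}.
\end{align*}
```

Read this way, f(0) = 1 is what makes p vanish at x = 0, and if a were fixed at 100, every division of f by 10 would add exactly 100 points, as in Harder's rule.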

How does this expression compare to Harder's curve? By adjusting the parameters a,b,c it is possible to obtain an excellent fit as shown in the figure below.

So, naturally, the question arises whether the expression I propose could be used to approximate the official scoring tables. It turns out that this is indeed the case. By adjusting the parameters once more, one obtains the fit shown in the figure below.

The two curves are very close from 100 points upwards but below this mark the expression proposed above curves towards 0. The smallest performance obtaining at least 1 point is 28 cm.
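The kind of calculation behind such a threshold can be sketched in a few lines of Python. The scoring formula is the one given above and the inverse is obtained by solving it for x. The parameter values A, B, C are hypothetical placeholders, not the fitted values behind the figures, so the printed number will not reproduce the 28 cm quoted above.

```python
import math


def score(x: float, a: float, b: float, c: float) -> float:
    """Points for performance x: p = a * log10(1 + b * (exp(c * x) - 1))."""
    return a * math.log10(1 + b * math.expm1(c * x))


def performance_for(points: float, a: float, b: float, c: float) -> float:
    """Invert the formula: the performance x that earns exactly `points`.

    Assumes a, b and c are all positive.
    """
    return math.log1p((10 ** (points / a) - 1) / b) / c


# Hypothetical parameters, NOT the fitted values used for the figures.
A, B, C = 100.0, 0.01, 1.5

x_one_point = performance_for(1.0, A, B, C)
print(f"performance earning 1 point: {x_one_point:.3f}")
print(f"check: {score(x_one_point, A, B, C):.3f} points")
```

With the actual fitted parameters, performance_for(1.0, a, b, c) would give the smallest performance that earns one point.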

To put it in a nutshell, I believe that the scoring formula I introduced is a natural extension of Harder's approach. It has the advantage of a closed-form analytical expression, and its parameters can be adjusted so as to follow closely the official World Athletics scoring curve. Moreover, it provides the fairest scoring, since it attributes a non-zero score to any measurable performance.

13 February, 2021
