I want to describe how I arrived at the numbers used for the
average number of citations for papers in mathematics.
The starting point was the data published by the Times Higher
Education Supplement. You can find these on the web if you
search for "Average citation rates by field 1998--2008".
These data are also given below.
The next step is to assign an age to the average paper. It is
reasonable to assume that the average paper is published
mid-year. We can then extend the above data by adding the age
of each paper in years.
The claim now is that the above data are reproduced very
well by the following linear model, where x is the age of the
paper in years and y is the average number of citations.
Linear model: y = 0.59682 x - 0.0825
Consider R^2, which lies in [0, 1] and is a standard measure of
how well the model fits the data.
For the above model,
R^2 = 0.987, which is very high.
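
For readers who want to check a value like this themselves, here is a
minimal sketch of how R^2 can be computed from observed and predicted
values, using the usual definition 1 - SS_res / SS_tot. The function
name r_squared is my own choice for illustration. For a least-squares
straight-line fit with a constant term, the result lies in [0, 1].

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations about their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot
```

To use it, pass the observed average citations as the first argument
and the values given by the linear model at the matching ages as the
second.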
Note that the age of an average 2008 paper is put at 0.5 years.
This helps bring the constant in the linear model close to zero
(ideally the constant in the linear model would be zero, since
a paper of zero age should have zero citations).
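
The age assignment and the model can be sketched in a few lines of
Python. This is only an illustration: the names paper_age and
predicted_citations are hypothetical, and the sketch assumes that
ages are measured from the end of 2008, so that each earlier
publication year adds exactly one year to the age.

```python
SLOPE = 0.59682      # citations gained per year of age (from the model)
INTERCEPT = -0.0825  # constant term of the model, close to zero

def paper_age(pub_year, last_year=2008):
    """Age in years of a mid-year paper, seen from the end of last_year."""
    return (last_year - pub_year) + 0.5

def predicted_citations(age):
    """Average citations given by the linear model for a paper of this age."""
    return SLOPE * age + INTERCEPT

# Tabulate the model over the years covered by the data.
for year in range(1998, 2009):
    age = paper_age(year)
    print(f"{year}  age {age:4.1f}  predicted {predicted_citations(age):.3f}")
```

With these assumptions a 2008 paper has age 0.5 and a 1998 paper has
age 10.5.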
Points generated by the linear model, including extrapolation:
The final step was to assume translation invariance, which is
a very good assumption given that the data are so close to linear.
We can then say that, from the viewpoint of someone in 2010,
the average citations will be the same as those above, but shifted
forward by one year. This gives