Homogenization of climate data is the process of calibrating old meteorological records to remove spurious factors that have nothing to do with actual temperature change. Some media sources have made conspiracist claims about the validity of homogenization. To address these claims, I set out to reproduce the science for myself, from scratch.
Historical weather station records are a key source of information about temperature change over the last century. However, the records were originally collected to track the big changes in weather from day to day, rather than small and gradual changes in climate over decades. Changes to the instruments and measurement practices introduce shifts in the records that have nothing to do with climate.
On the whole, these changes have only a modest impact on global temperature estimates. However, if an accurate local record, or the best possible global record, is required, the non-climate artefacts should be removed from the weather station records. This process is called homogenization.
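To make the idea concrete, here is a minimal sketch in Python (using NumPy) of how a non-climate artefact can be found and removed by comparing a station with a neighbour. It is an illustration under stated assumptions, not the method used in the report or in the GHCN homogenization process: the data are synthetic, the single 0.5 degree cooling step, the noise levels, and the function names find_step and make_example are all invented for the example, and the break detector is a simple difference-of-means search.

```python
import numpy as np

def find_step(diff, margin=5):
    """Locate the single most likely step change in a difference series.

    For each candidate break k, compare the mean before and after k,
    weighted so that breaks near the ends are not unfairly favoured.
    Returns the break index and the estimated step (after minus before).
    """
    n = len(diff)
    best_k, best_stat = margin, -1.0
    for k in range(margin, n - margin):
        jump = diff[k:].mean() - diff[:k].mean()
        stat = abs(jump) * np.sqrt(k * (n - k) / n)
        if stat > best_stat:
            best_k, best_stat = k, stat
    step = diff[best_k:].mean() - diff[:best_k].mean()
    return best_k, step

def make_example(seed=0, n_years=100, break_year=60, step=-0.5):
    """Build two synthetic stations sharing one regional climate signal.

    All numbers here are assumptions chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    years = np.arange(n_years)
    # Shared regional signal: a slow warming trend plus year-to-year weather.
    regional = 0.01 * years + rng.normal(0.0, 0.2, n_years)
    neighbour = regional + rng.normal(0.0, 0.1, n_years)
    target_true = regional + rng.normal(0.0, 0.1, n_years)
    # Simulate an instrument change that makes later readings too cool.
    target_raw = target_true.copy()
    target_raw[break_year:] += step
    return years, neighbour, target_true, target_raw

def trend(years, series):
    """Least-squares linear trend in degrees per year."""
    return np.polyfit(years, series, 1)[0]

years, neighbour, target_true, target_raw = make_example()

# The shared climate signal cancels in the difference series,
# leaving mostly the artificial step plus local noise.
k, est = find_step(target_raw - neighbour)

# Shift the earlier segment so it is consistent with the most recent one.
adjusted = target_raw.copy()
adjusted[:k] += est

print("detected break index:", k)
print("estimated step:", round(float(est), 3))
print("raw trend:", round(float(trend(years, target_raw)), 4))
print("adjusted trend:", round(float(trend(years, adjusted)), 4))
print("true trend:", round(float(trend(years, target_true)), 4))
```

Because the artificial step in this example is a cooling one, removing it raises the trend back towards the true value. Real records contain steps of both signs and often more than one per station, so a practical method needs multiple neighbours and multiple break detection.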
The validity of this process has been questioned in the public discourse on climate change, on the basis that the adjustments cause a slight increase in the global warming signal from land-based observations. However, sea surface temperatures are more important in determining global temperature, and are subject to larger adjustments in the opposite direction (Figure 1). Furthermore, the adjustments have the biggest effect prior to 1975 and don't have much impact on recent warming trends.
Figure 1: The historic temperature record with no adjustments, with adjustments to ocean temperatures only, and with adjustments to land and ocean temperatures.
I set out to test the assumptions underlying temperature homogenization from scratch. I was able to confirm the underlying assumptions and reproduce most of the results of the existing research. I have documented the steps in the report and released all of the computer code so that others can continue the project. The following video summarizes the most important results.
It is fairly easy to establish that there are inhomogeneities in the record, that they can be corrected, and that correcting them increases the warming trend. Furthermore, tests on the GHCN homogenization process with synthetic data fail to detect any bias. However, verifying the size of the impact on the global temperature record from scratch is more difficult. The simple method presented in the report underestimates the impact on global temperatures, both in synthetic benchmark data and in the real-world data. The next challenge is to improve the recovery of the global trend in benchmark data, and then determine how this affects the results using the real data. I have made some progress on this issue, but I encourage others who are interested in the problem to take my work and build on it.