Open Mind

Brand New Hockey Sticks

September 7, 2008 · 109 Comments

There’s a new hockey stick in town, in Mann et al. 2008, Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia, PNAS, 105, no. 36. A great deal of supporting information is also available, including all the data and programs used for the reconstructions.


These reconstructions provide estimates of past temperature for the northern hemisphere, the southern hemisphere, and the globe as a whole, as well as for land only and for land+ocean. The research uses 1,209 proxy data series, a much larger network of proxy data than any which came before it.

All 1,209 series were available back to at least A.D. 1800, 460 extend back to A.D. 1600, 177 back to A.D. 1400, 59 back to A.D. 1000, 36 back to A.D. 500, and 25 back to year “0” (i.e., 1 B.C.).

As a result, it’s possible to reduce the probable errors further back in time than before. The reconstructions are also done by two different statistical methods: “cps” and “eiv.” The “cps” method is “composite-plus-scaling,” in which the available proxies are combined into a composite, which is then scaled to find the best match to the calibration data. In this method, proxies don’t have to match local temperature change; they can still be used as long as they carry information about global or hemispheric temperature change. The “eiv” method is “errors-in-variables,” in which proxies are fit to calibration data allowing for errors in both the predictor and the predictand. For this method, proxies are fit to temperature data for the locale from which they derive. The two methods agree closely, but the authors find better verification statistics for the eiv method, and generally express higher confidence in it.
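
To make the cps idea concrete, here is a minimal sketch in R. It is illustrative only, not the code used in the paper (the actual programs are among the supporting information linked above); the proxy matrix, the calibration index, and the simple unweighted compositing are all assumptions of the sketch.

    # Minimal sketch of the composite-plus-scaling (cps) idea; illustrative only,
    # not the code used in the paper. "proxies" is an assumed matrix (years x records),
    # "target" the instrumental hemispheric mean over the calibration rows "cal".
    cps_reconstruct <- function(proxies, target, cal) {
      # standardize each proxy over the calibration interval
      z <- scale(proxies,
                 center = colMeans(proxies[cal, ], na.rm = TRUE),
                 scale  = apply(proxies[cal, ], 2, sd, na.rm = TRUE))
      composite <- rowMeans(z, na.rm = TRUE)   # simple unweighted composite
      # rescale so the calibration-period mean and variance match the instrumental target
      (composite - mean(composite[cal])) * sd(target) / sd(composite[cal]) + mean(target)
    }

The published reconstructions involve considerably more than this (screening, nested proxy networks back in time, and so on), but the scale-to-match-the-calibration step is the part described above.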

Mann et al. also derived past temperature using subsets of the proxies. One such analysis omits individual proxies which don’t show statistically significant correlation with local instrumental data:

Reconstructions were performed based on both the “full” proxy data network and on a “screened” network (Table S1) consisting of only those proxies that pass a screening process for a local surface temperature signal. The screening process requires a statistically significant (P < 0.10) correlation with local instrumental surface temperature data during the calibration interval. Where the sign of the correlation could a priori be specified (positive for tree-ring data, ice-core oxygen isotopes, lake sediments, and historical documents, and negative for coral oxygen-isotope records), a one-sided significance criterion was used. Otherwise, a two-sided significance criterion was used. Further details of the screening procedure are provided in SI Text.
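
A minimal sketch of such a screening test in R might look like the following. The inputs are hypothetical, and the paper’s SI Text describes details of the actual procedure that aren’t captured here; this only shows the P < 0.10 cutoff and the one-sided versus two-sided choice.

    # Illustrative screening step: keep a proxy only if it correlates with local
    # instrumental temperature at P < 0.10 over the calibration years.
    # Hypothetical inputs; see the paper's SI Text for the real procedure.
    passes_screening <- function(proxy, local_temp, expected_sign = NA, p_cut = 0.10) {
      if (is.na(expected_sign)) {
        alt <- "two.sided"     # no a priori sign for this proxy type
      } else if (expected_sign > 0) {
        alt <- "greater"       # e.g. tree rings, ice-core isotopes, documents
      } else {
        alt <- "less"          # e.g. coral oxygen-isotope records
      }
      cor.test(proxy, local_temp, alternative = alt)$p.value < p_cut
    }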

They also created reconstructions without using any tree-ring data at all, in response to objections that tree-ring data aren’t reliable indicators of temperature change. But for every subset of data used, for the northern hemisphere, the southern hemisphere, and the globe, for land-only and for land+ocean, and for both the cps and eiv analysis methods, the results were qualitatively similar: a hockey stick.

Nominally, the recent observed decadal warmth recorded in the instrumental observations exceeds the uncertainty range of the reconstructions over at least the past 1,600 years for NH land temperatures as reconstructed by CPS (Fig. S5) and the past 1,700 years for NH land plus ocean temperatures as reconstructed by EIV (Fig. S6). Because this conclusion extends to the past 1,300 years for EIV reconstructions withholding all tree-ring data, and because non-tree-ring proxy records are generally treated in the literature as being free of limitations in recording millennial scale variability (11), the conclusion that recent NH warmth likely** exceeds that of at least the past 1,300 years thus appears reasonably robust. For the CPS (EIV) reconstructions, the instrumental warmth breaches the upper 95% confidence limits of the reconstructions beginning with the decade centered at 1997 (2001). It is intriguing to note that the removal of tree-ring data from the proxy dataset yields less, rather than greater, peak cooling during the 16th–19th centuries for both CPS and EIV methods (see Figs. S5a and S6b, respectively), contradicting the claim (33) that tree-ring data are prone to yielding a warm-biased “Little Ice Age” relative to reconstructions using other high-resolution climate proxy indicators.

For different methods, regions, and proxy subsets, there are differences in how far back in time one can draw reliable conclusions about comparisons of modern temperatures with the past:

We find that the hemispheric-scale warmth of the past decade for the NH is likely anomalous in the context of not just the past 1,000 years, as suggested in previous work, but longer. This conclusion appears to hold for at least the past 1,300 years (consistent with the recent assessment by ref. 2) from reconstructions that do not use tree-ring proxies, and are therefore not subject to the associated additional caveats. This conclusion can be extended back to at least the past 1,700 years if tree-ring data are used, but with the additional strong caveats noted. When differences in scaling between previous studies are accounted for, the various current and previous estimates of NH mean surface temperature are largely consistent within uncertainties, despite the differences in methodology and mix of proxy data back to approximately A.D. 1000. The reconstructions appear increasingly more sensitive to method and data quality and quantity before A.D. 1600 and, particularly, before approximately A.D. 1000. Conclusions are less definitive for the SH and globe, which we attribute to larger uncertainties arising from the sparser available proxy data in the SH. Given the uncertainties, the SH and global reconstructions are compatible with the possibility of warmth similar to the most recent decade during brief intervals of the past 1,500 years.

The reconstructions, as well as previously published reconstructions, are displayed in a classic “spaghetti graph”:

For a simpler view, the first graph in this post shows just the composited global land+ocean temperature reconstruction using the eiv method. Here’s the composited global land-only reconstruction by the same method:

But I won’t bother to display individual graphs for all regions for all methods. Those who are keenly interested can get the data and look for themselves.

Categories: Global Warming

109 responses so far

  • vivendi // September 7, 2008 at 9:06 pm

    “In this method, data don’t have to match local temperature change, they can still be applied as long as they give information about global or hemispheric temperature change.”
    Can you explain this please. How would tree rings - or any other proxy - be influenced by global climate? Isn’t local climate variability (= weather and local climate) the driver for plant growth?
    Are you saying that tree rings in the NH show the same characteristics as those in the SH? I have looked at some of the proxies; there doesn’t seem to be any resemblance.

  • chriscolose // September 7, 2008 at 9:23 pm

    Tamino,

    if you get the opportunity, can you comment briefly on William Briggs’ segment concerning the application of smoothing to time series, with regard to this paper?

    http://wmbriggs.com/blog/2008/09/06/do-not-smooth-times-series-you-hockey-puck/

    [Response: Clearly Briggs has a stick up his ass. A hockey stick.]

  • mathjunkie // September 7, 2008 at 11:12 pm

    OT: For those who are interested in following the comment thread on a particular post here at Open Mind via an RSS feed, just subscribe to the post’s name with “feed” stuck on the end.

    So, for this post you would subscribe to
    tamino.wordpress.com/2008/09/07/brand-new-hockey-sticks/feed

    At least, this works for me in Google Reader, and I find it is a great way to keep up with the comments without having to search through for the one I last read.

  • Lee // September 7, 2008 at 11:51 pm

    vivendi,

    Global temperature changes can cause local rainfall changes, or cloud cover changes, or season timing changes, and so on. Yes, plants respond to local conditions. But the local conditions the plants are responding to, when we see correlation with global temperatures, may be something other than local temperature.

  • bill // September 8, 2008 at 3:32 am

    Tamino,

    Do you happen to have any substantive comments on Briggs’ post? Why do you think that using smoothed data as input is a reasonable thing to do? Inquiring minds want to know.

  • nanny_govt_sucks // September 8, 2008 at 4:22 am

    Global temperature changes can cause local rainfall changes, or cloud cover changes, or season timing changes, and so on.

    I think some more explaining is required. Warming can cause more rainfall or less, from what I’ve heard on these boards. Perhaps you’d need to point to some study that shows warming always or most likely causes more rain, more sunny days, and season timing changes that always or most likely benefit growth (in the case of trees). Otherwise these “teleconnections” to global climate are just hand waving, are they not? What am I missing?

  • Glen Raphael // September 8, 2008 at 4:33 am

    “The research uses 1,209 proxy data series, a much larger network of proxy data than any which came before it.”
    I guess that’s the thing I don’t understand about Mann’s whole project. Isn’t it intuitively obvious that the more proxy series you include in this sort of study, the flatter the shaft of the hockey stick is going to be? Regardless of what actually happened with temperatures in the past, combining lots of disparate types of data is not going to give you a good sense of the historical standard deviation and also seems likely to clip a lot of the specific highs and lows you’d see if you stuck to a single proxy or at least a single type of proxy. No?

  • JM // September 8, 2008 at 4:56 am

    chriscolose

    Our host has been a little blunt in his response, but see my comment over there at Sept 7 11:48.

    To quote from it:


    Doubter: “The difference between the climate data and the stock market data is that people know better than to extrapolate the recent stock market data hundreds of years into the future.”

    Bad analogy. In climate data there *is* an underlying physical process (no matter how loudly Briggs wishes to shout about it, he’s wrong on that point). In the stock market there is no physical process - only some abstract process that models the sum of the economic behavior of many actors.

    (Tamino, this is my own thought. If I’ve misrepresented what I intuit may be your thinking, I apologize.)

  • Mark Zimmerman // September 8, 2008 at 5:20 am

    the first graph in this post shows just the composited global land+ocean temperature reconstructions using the eiv method. Here’s the composited global land-only reconstructions by the same method

    Are your graphs constructed using just the proxies, or the proxies + instrumental data of the 20th century?

  • vivendi // September 8, 2008 at 5:39 am

    Lee, “Global temperature changes can cause local rainfall changes”. Can or do? And how do you know in which way global changes (rainfall, temperature, cloudiness) influence local proxies and what part of the proxy signature is attributable to global or purely local changes? In our region we had a rather cool summer. Was that a global or a local effect?

  • bill // September 8, 2008 at 12:15 pm

    JM,
    Um, you just described the physical process behind the stock market, or any market: Individual actors reacting to local stimuli. The only difference between that and a physical process is the consciousness of the actors, which allows for all sorts of interesting interactions…

    The abstract model is something we make up in both cases, so that we can understand what’s happening.

  • kevin // September 8, 2008 at 1:03 pm

    Glen:
    If I’m reading correctly, and someone please let me know if I’m not, the authors have a ~150 year period of overlap between the proxy data and the instrumental data which demonstrates that the combined proxy data do indeed capture the highs and lows seen in the instrumental record quite well. So, I see what you’re saying, but if we’re interested in global average temperature variation rather than local extremes, it seems the aggregate proxy data work very well.

    For people worried about proxies that correlate with global temperature: A) note that it is observed, so there’s something causing it whether or not you or I understand it. Nothing wrong with trying to understand it of course, but a couple of you seem to take the attitude that if you don’t understand it, it’s not real…and that’s just stupid. B) note that the EIV method gives the same overall results. So if you can’t accept point A, feel free to ignore the results from using the CPS method. The results using the EIV method still stand.

  • kevin // September 8, 2008 at 2:01 pm

    Bill: re: “The only difference between that and a physical process is the consciousness of the actors, which allows for all sorts of interesting interactions…”

    Well, yes, but that’s the difference between modeling the path of a planet or projectile vs. modeling the path I’ll take on a stroll downtown. The one with only physical processes, and no consciousness of actors, is inherently much, much more predictable, no?

    And yes, I realize the earth’s climate system does involve humans as conscious actors; that’s why there are different emissions scenarios in the models.

  • george // September 8, 2008 at 2:41 pm

    Briggs says:

    Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses!

    I would agree that it is important to have a good idea of the noise characteristics before smoothing, but the last part of the statement is nonsense (if for some bizarre reason you do smooth it…)

    In the past, I have developed analysis software for instruments (Raman and IMS) that identify unknown compounds (eg, explosives).

    For that purpose, I used Savitzky-Golay and other smoothing techniques in the process of finding the locations of “peaks” in noisy spectra.

    The characteristics of the noise — particularly the width of the noise peaks relative to that of the spectral peaks — were used to arrive at an appropriate width for the smoothing filter. Under most circumstances, the noise peaks are significantly narrower than the spectral peaks, so it is pretty easy to come up with a good filter width that will substantially preserve the peaks of interest while filtering out the noise.

    If no smoothing had been done before performing the actual peak-finding analysis, the spectral peak locations would almost certainly have come out incorrect (easy to verify with a known compound on a calibrated instrument).

    And, suffice it to say, it is not a good thing to have explosive detectors finding peaks where there are none — ie, false alarming. (results in long lines and lots of swearing by all at airports!)

    With no smoothing, the peak locations can appear to move around significantly from one spectrum and one instrument to another depending on the noise in each spectrum.

    So to say that “you absolutely on pain of death do NOT use the smoothed series as input for other analyses!” is simply ridiculous.
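
    A rough R sketch of the smooth-then-find-peaks idea george describes, using a synthetic spectrum and the Savitzky-Golay filter from the “signal” package (not his instrument code, and the filter width here is made up):

    # Rough illustration of smoothing before peak-finding; synthetic data only.
    library(signal)                               # provides sgolayfilt()

    x    <- seq(0, 100, by = 0.1)
    spec <- 5 * exp(-(x - 40)^2 / 8) + 3 * exp(-(x - 70)^2 / 10) +   # broad peaks
            rnorm(length(x), sd = 0.5)                               # narrow noise

    smoothed <- sgolayfilt(spec, p = 3, n = 21)   # cubic fit over a 21-point window

    # naive peak finder: a point that is the maximum of its neighbourhood
    find_peaks <- function(y, w = 50, min_height = 1) {
      which(sapply(seq_along(y), function(i) {
        win <- y[max(1, i - w):min(length(y), i + w)]
        y[i] > min_height && y[i] == max(win)
      }))
    }

    x[find_peaks(smoothed)]   # should land close to the true peaks at 40 and 70
    x[find_peaks(spec)]       # unsmoothed: locations jitter and spurious "peaks" can appear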

  • JM // September 8, 2008 at 3:10 pm

    Bill, I’m sorry I don’t agree. The process underlying climate is governed by unchanging physical law.

    The process “underlying” a market is really nothing more than an abstraction and the “laws” are constantly changing. Quite different IMHO.

    For an example I’d refer to volatility, which is not directly observable, can only be inferred from the changes in market prices, and changes all the time. (I’ll stick my neck out here a little bit: when volatility is high, you could view it as market participants responding differently to the same stimuli - ie. what you attribute to consciousness - which doesn’t happen in physical systems. So what happens IMHO is that all prices contain information, and Briggs’ comment re. smoothing could have some validity.)

    Maybe ‘abstract’ was not the right word - I was using it more in the sense that it is not possible even in principle to develop an unchanging model of the market. It is (leaving aside issues of chaos etc) much more possible with physical systems.

    The second distinction I’d draw is that of measurement. In a market there is essentially no measurement error - a price is a price is a price. That is not true with physical systems, as all measurements are subject to measurement error.

    Sorry if I’m not being completely clear here; if you have any comment I’d be interested to hear it.

  • Timothy Chase // September 8, 2008 at 3:39 pm

    Bill wrote:

    Um, you just described the physical process behind the stock market, or any market: Individual actors reacting to local stimuli. The only difference between that and a physical process is the consciousness of the actors, which allows for all sorts of interesting interactions…

    You are right, of course. Even assuming the dualism of Descartes, the stock market won’t exist unless people somehow manage to communicate their intentions to buy or sell to the world at large by means of a physical medium.

    Then again, there is the fact that these actors will themselves have models of the world at large and will act not simply in response to what has taken place so far, but in response to their expectations regarding future events. They are after all pursuing a profit which they don’t have as of yet, but hope to at some time in the future. They want to be successful — much like the economist who spends so much time trying to figure them out and predict their behavior.

    In any case, climatology is concerned with climate. Globally at least, there tends to be a limit to the rate at which the balance of energy in the system changes. And climates are typically defined as being 30 years in length — with things typically taking place more glacially than the stock market. Given this, it would seem to make sense to be looking for the temperature on a decadal scale — hence the averaging.

  • elspi // September 8, 2008 at 3:45 pm

    ah.. bill
    You seem to have missed the point. There are these rules (called physics and chemistry) that govern the underlying small scale interactions in the climate.

    In the stock market there are no underlying rules. There are assumptions such as: people are rational actors. The problem with these assumptions is that they are all false.
    There is a whole garden industry devoted to hiring groups of poor students, putting them in a market, and proving yet another one of these assumptions to be false.

    Thus the situations are polar opposites of one another. In the one, you understand the small almost perfectly, and you are trying to extrapolate to the larger system. In the other you don’t understand the small at all (even from a probabilistic standpoint) and so you don’t have a prayer of understanding the large.

  • Greg // September 8, 2008 at 4:21 pm

    Kevin:
    “note that it is observed, so there’s something causing it whether or not you or I understand it. ”

    Not at all… correlation between a local proxy and global temperatures can be down to chance alone (not something causing it at all). In the screening process, amongst hundreds of proxies there will be many that correlate to global temperatures well and are retained. The hope is that the good proxies in this bunch outweigh the bad ones.

    Mark Zimmerman:
    “Are your graphs constructed using just the proxies, or the proxies + instrumental data of the 20th century?”

    The question is redundant, since only those proxies that correlate with the instrumental record in the calibration period pass the screening process. So, a priori, the right side of the graph will always resemble the instrumental period. As I understand it, the whole point of the process is to find the shape of the graph before the calibration period.

  • Glen Raphael // September 8, 2008 at 4:45 pm

    Kevin: Some of your valid temperature proxies are bound to be less accurate as you go further back in time. When you get back to periods that were a lot colder or a lot warmer than the reference period or different in some other way - drier, wetter, sunnier, whatever - there will be some drift in the accuracy. So some of what you’re seeing when you look further back is noise rather than signal and the noise part will on average have a lot flatter long-term trend than the signal part, so the average of the proxies will flatten as you go back in time.

    Another problem is if your proxies have any *temporal* uncertainty, which I gather some do have. Suppose proxy A says it was extremely warm around the year 1000 and mild before and after that, while proxy B says it was extremely warm around the year 1100 and mild before and after that. Looking at either proxy on its own correctly suggests that it was much warmer at multiple points in the past than now, because those two peaks actually reflect the same underlying event. But since the dates don’t perfectly align, any sort of averaging between A and B will suggest that it was always mild in the past - the high points will get clipped.

    (And yes, naturally there is some possibility that a few proxies are just bogus with a match during the reference period due to chance and cherrypicking. That also would be guaranteed to diminish the signal but wasn’t my primary concern.)
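
    A toy R illustration of the dating-uncertainty point (entirely synthetic numbers):

    # Two synthetic "proxies" record the same warm event, but one is dated a century late.
    years   <- 0:2000
    bump    <- function(center) exp(-(years - center)^2 / (2 * 50^2))
    proxy_a <- bump(1000)    # event dated at AD 1000
    proxy_b <- bump(1100)    # same event, dated at AD 1100

    composite <- (proxy_a + proxy_b) / 2
    max(proxy_a)             # 1.0: each proxy alone shows the full peak
    max(composite)           # about 0.61: the averaged peak is clipped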

  • Ray Ladbury // September 8, 2008 at 4:51 pm

    JM, This is off-topic, so my apologies. While there are certainly distinctions between physical systems and financial markets, the techniques of statistical mechanics can be and are being used to model them. While the field of econophysics is relatively new, brokerages and hedge funds have been hiring physicists as quantitative analysts since at least the late 70s.

    We can view a market as being “made” by measurement devices called “investors”, who are trying to measure a quantity called “value”. Each device measures “value” with its own distribution of errors, and we may not know the value of the error for any one instrument at any given time, but we can measure the spread among instruments. To make a market, the investor measures value and tries to identify entities that are “undervalued” by other investors. Each investor then bids accordingly, and “bids” and “asks” are matched to determine a price. So the spread of bids is a measure of the error on the price. The fact that the error may depend on the state of the measurement device complicates things, but it doesn’t preclude analysis–in fact, it makes it all that much more important!

  • Patrick Hadley // September 8, 2008 at 4:54 pm

    George, I am not a statistician but my reading of Briggs’ post is that temperature data does not contain noise.

    I know nothing at all about your work with explosives, but would I be right in guessing that if your data on explosives gave actual results with real peaks showing that there actually is explosive material present then you could not ignore them by simply smoothing them out as noise. If 100 people went through an explosives detector with 99 giving a score of 0 and the other person giving a score of 10 it would not make much sense to smooth the data and say that the average score was an insignificant 0.1.

    Briggs seems to be saying, as far as I understand it, that when a temperature reading is made that is the actual reading of the temperature telling us how hot the item is. To describe a reading as “noise” just because it is higher or lower than average or different from what a model predicts seems to be wrong.

  • nanny_govt_sucks // September 8, 2008 at 4:56 pm

    For people worried about proxies that correlate with global temperature: A) note that it is observed, so there’s something causing it whether or not you or I understand it.

    … or the correlation is spurious.

  • RedLogix // September 8, 2008 at 5:43 pm

    Briggs baffles me. What are the most basic of statistical tools, things like averages and means, if not “smoothings”?

    The proxies are not thermometers, they are imperfect substitutes for them. Therefore they produce data with both random and systematic errors, ie a lot of noise and mean values that are not correct. The noise we routinely deal with by an appropriate low-pass filter, but the systematic error or bias is not so readily corrected for.

    But there is no reason to suspect that ALL the proxy methods have the same systematic error, therefore over all proxies the mean bias or error should be much smaller than the error of any one of them.

    The whole idea of statistics is to smooth a LOT of noisy, inaccurate data to produce an output result that is more accurate than ANY of the input data. So what exactly is Briggs on when he says that we are NEVER allowed to smooth data? Has he stopped being a statistician?

    It seems to me that much of the highly technical debate around the hockey stick has lost sight of some very basic notions.

  • bill // September 8, 2008 at 5:53 pm

    My point in response to JM was quite simple: there are physical actions that, as an ensemble, lead to what we call a market, just as there are physical actions that lead to a “climate”. In the former, there are exchanges of goods (or promises of exchanges). Both systems are composed of local actors reacting to local information, and contrary to elspi’s assertion: they aren’t perfectly rational “Chicago men”.

    Markets are not “abstract processes” at all. There are rules (SEC anyone?) and constraints. Any macro-level smoothness is an emergent phenomenon. Your conception of them can be quite abstract, but they exist outside of your model.

    As I said above, “The abstract model is something we make up in both cases, so that we can understand what’s happening”. The climate models are also abstractions. A fully detailed climate model is beyond computing. The modeler determines what to include and exclude.

  • kevin // September 8, 2008 at 6:04 pm

    Greg and Nanny:

    Yes, of course there is some chance of spurious correlation. I.e., spurious correlations do exist, so the chance that this is the case with these proxies is non-zero. But, given that the results they yield jibe with the EIV results, that doesn’t seem terribly likely, certainly not in all cases.

  • bill // September 8, 2008 at 6:10 pm

    Patrick,
    Well I am a statistician (but not a time-series guy). Briggs’ point seems to be a special case of the general rule “don’t smooth/change the data without an error model”. All of the smoothing methods I’m familiar with assume that data are signal+noise. Using a smooth as an input to another model implies that the residual component somehow disappears in the real world.

    Changing the data is a great way to boost your apparent ability to “predict”. If the data really are “almost all signal” then smoothing removes the signal, and a model of the smooth departs from the process that generated the data. If there is measurement error, then quantify it and include it in the model.

  • dhogaza // September 8, 2008 at 6:25 pm

    … or the correlation is spurious.

    Which is why a bunch of proxies are looked for and analyzed, because the odds of all of them being a) spurious and b) spurious in the same way becomes lower with each proxy you add.

  • BoulderSolar // September 8, 2008 at 6:31 pm

    Yikes! I just read Ian Jolliffe’s comment on Open Thread #5. Looks like you owe him an apology as your whole argument about non centered PCA used him as the authority. Are you going to respond?

  • Ray Ladbury // September 8, 2008 at 6:35 pm

    Nanny posits that the correlation between temperature and proxies might be spurious.

    Really? On all ~1200 proxies, which all seem to agree relatively well, despite involving very different phenomena? Care to speculate on the odds of that?

  • george // September 8, 2008 at 6:47 pm

    Patrick says:

    my reading of Briggs’ post is that temperature data does not contain noise.

    If that is what Briggs is claiming, then I’d have to say that he is simply wrong — but I actually don’t believe that is what he is claiming.

    All data contains “noise” of one type or another. Proxy data is certainly no exception. Not only that, it is possible that one person’s “noise” may be another person’s signal. It really depends what one is focussed on in the data.

    It is often the case (as with explosives detection) that one can filter out the noise and leave the information of interest basically intact.

    Noise sometimes looks very different from the features that one is interested in and one can use this information to one’s advantage.

    In many spectra, for example, the peaks of interest are significantly broader than the noise, which may actually look something like “spikes” superimposed on more rounded hills.

    With an appropriate choice of filtering, it is possible to filter out the noise without losing much in the way of the information you are interested in. It reduces peak height a little, but if you have chosen the width of the filter properly, not by much.

    And if you do not filter out the noise, it is far more likely that you will falsely identify noise peaks as the peaks you are interested in. This is particularly true if there is a lot of noise scattered about the spectrum. The more noise, the greater the likelihood that it will appear at the location(s) of interest.

    The latter causes false alarms, which can be a major problem. If the instrument frequently false alarms, the operators lose all faith in it (and rightly so) and it becomes essentially useless.

    I am less familiar with climate data, but I do know that in that case, effects like volcanic eruptions cause narrow (short duration) “spikes” on the data. If one is interested in broader-scale “features” in the temperature data related to changes in climate (defined on a multi-decade time scale), then my guess is that one can probably safely filter out such short term spikes (essentially noise) and not lose much in the way of information about features that last (and change) over decades.

    I simply must take issue with Briggs’ blanket statement that

    “you absolutely on pain of death do NOT use the smoothed series as input for other analyses!”

    It may be true in certain cases, but is certainly not true in all, as it implies.

  • JM // September 8, 2008 at 7:16 pm

    Ray, I understand what you’re saying, but I think (and this is a personal view only) that it’s a confusion of levels.

    (I apologize for the sketchiness of the following comment - I’ve only been thinking a bit about this for the last couple of hours and haven’t fully formulated what I want to say. Consider this an outline from a lay person.)

    If I look at a temperature measure of 106F (say) there is an uncertainty due to precision and another possible error simply due to the observer misreading the thermometer.

    If, on the other hand I sell a bond at 106 there is no uncertainty, neither in precision nor observation.

    What there is (and this is your point I think) is an uncertainty in the investor’s estimate of the “true” value of the bond, which is unknowable at the time and cannot be determined until maturity. What I think you’re saying is that the true value represents the model, while the transaction price represents the observation of the market (or at least of the two participants in the transaction).

    The evolution of the market price is the stochastic process, while the end result at maturity is the model.

    Things are different in physics (or at least they are in my humble opinion). We know the model and have validated it in simple, abstract lab experiments. What we are doing in climate is trying to extract the signal of that model from noisy data confounded by an enormous number of real world factors.

    So when someone says “but you can’t backtest/hindcast/whatever, and must only evaluate against predictions you make (and in markets - monetize)” they are right in market situations because the underlying model is not known.

    But I’m not convinced that that purist approach applies in physics/climate. There is no reason I can see to prohibit hindcasting as a test of a model.

    Anyone who told me in a market that if stock X had shown 5 upticks in the previous 10 days, it would inevitably fall by over 2% within the next 10 days, well I’d regard them as nuts, as they’ve essentially bet on tails after a run of 5 heads. (This is actually a real-world example - a technical trader gave me this one a few years ago).

    But if someone in physics/chemistry/climate says the same thing - ie. that the historical data supports their model, I’d be impressed.

    Because physical laws don’t vary. Market behavior does.

  • Mark Zimmerman // September 8, 2008 at 7:38 pm

    “Are your graphs constructed using just the proxies, or the proxies + instrumental data of the 20th century?”

    The question is redundant, since only those proxies that correlate with the instrumental record in the calibration period pass the screening process

    I would like to see a graph of the proxies alone, to see if the hockey-stick shape is preserved.

  • cougar // September 8, 2008 at 8:20 pm

    [or the correlation is spurious] And thus we’ve come full-circle; the idea in the latest effort was to pull up enough observations to decide if the correlation is spurious, or no. Well, t’aint spurious, within agreed confidence limits. So long as we can evaluate the statistical method and not find fault, and the data itself is not found to be broken, then we can accept the results until new data/observations prove otherwise, or something else causes us to require even greater statistical confidence.

    This on whole allows us to *move the hell on already*, and for example wonder what the observations are trying to tell us about changes in the physical system in which we are embedded, and upon which we are 100% reliant for survival as individuals and a species. Frak, but you’d think this stuff would be obvious.

  • chopbox // September 8, 2008 at 8:42 pm

    If it seems odd that Tamino has not yet given a more substantive response to Chris Colose’s most reasonable request ( http://tamino.wordpress.com/2008/09/07/brand-new-hockey-sticks/#comment-21852 ) for a comment on Briggs’ posting that data should not be smoothed, perhaps one should keep in mind that Tamino may be a tad busy right about now. You may recall a recent post by Tamino on PCA that quotes expert Ian Jolliffe on the reasonableness of using decentered PCA? It appears that Dr. Jolliffe has finally heard of Tamino’s post, and in a post on this very blog (http://tamino.wordpress.com/2008/08/10/open-thread-5-2/#comment-21873 ) Dr. Jolliffe disavows any implication that he endorses decentred PCA and indeed asks for an apology from Tamino for suggesting that he did. I am sure it will all work out very amicably, but these things do take time to work themselves out.

  • Peter C. Webster // September 8, 2008 at 9:42 pm

    Perhaps a step back is in order.

    The fact is that well-mixed long-lived greenhouse gases like carbon dioxide “trap” the solar energy that radiates back towards space from the Earth’s surface, which causes the lower atmosphere to be warmer than it would be otherwise. Apologies to/for the paraphrased NOAA description of the process.

    There is no doubt more carbon dioxide = more of the effect. On its own. The problem is quantifying it in the system, where not all else is held equal, to determine the net effect.

    ModelE having 9% of the greenhouse effect lost when removing carbon dioxide versus carbon dioxide alone supplying 26% is interesting, but computer simulations are not reality.

    What are the facts?

    Atmospheric carbon dioxide levels as measured at Mauna Loa show a 16 ppmv rise per decade since 1980 (.43 to 2.93 ppmv per year).

    The GHCN-ERSST global mean temperature anomaly shows a .16 C per decade trend rise since 1980 (at +.46 in 2007)

    The average of the TLT, TMT, TTS and TLS RSS processed MSU readings show a combined -.022 C per decade trend fall since 1980 (+.265 low/mid troposphere, -.019 troposphere/stratosphere, -.334 low stratosphere)

    Why 1980? Satellite records start then, and one would imagine that GHCN-ERSST is exclusively temperature sensors and engine inlets after that.

    Going back to the start of direct carbon dioxide readings in 1969, the decadal trends are +17.9 ppmv and +.15 C per decade.

    Scintillating conversations about all else aside, and operating under the assumption that the anomaly trend reflects a rise in temperatures, the challenge is to show a direct causal relationship between the last 38 years of +1.8 ppmv/year in the air and +.015 C/year in the surface readings.

    As Roger Revelle put it in the 1950’s, about the accelerated burning of fossil fuels: “This {geophysical} experiment, if adequately documented, may yield a far-reaching insight into the processes determining weather and climate. It therefore becomes of prime importance to attempt to determine the way in which carbon dioxide is partitioned between the atmosphere, the oceans, the biosphere and the lithosphere.”

  • David B. Benson // September 8, 2008 at 9:57 pm

    Glen Raphael // September 8, 2008 at 4:33 am — Here is a recent paper doing a temperature reconstruction using just a few (9?) boreholes. See Figure 1:

    http://www.geo.lsa.umich.edu/~shaopeng/2008GL034187.pdf

  • Gavin's Pussycat // September 8, 2008 at 9:59 pm

    > or the correlation is spurious.

    There is this thing called statistical significance. It isn’t ‘observed’ until it passes the test.

  • Gavin's Pussycat // September 8, 2008 at 10:02 pm

    I’m not sure precisely what Briggs is trying to say as he is rambling, but one thing he seems to miss is that a smoothed proxy is a proxy too — just a different one. With its own statistical properties. The process of calibration and reconstruction using these is just as legitimate as the original process. Only different.

  • nanny_govt_sucks // September 9, 2008 at 12:04 am

    Really? On all ~1200 proxies, which all seem to agree relatively well,

    I guess you haven’t seen the proxies. Many are white noise, and there is much variability. You’d only need a few hockey-stick shaped proxies inappropriately weighted to stretch out a hockey-stick shape from the noise.

  • nanny_govt_sucks // September 9, 2008 at 12:08 am

    But, given that the results they yield jives with the EIV results, that doesn’t seem terribly likely, certainly not in all cases.

    Spurious correlations with local temperatures are just as likely as with global temperatures. I don’t see what difference it makes with one method or another when it comes to this potential problem.

  • David B. Benson // September 9, 2008 at 12:27 am

    nanny_govt_sucks // September 9, 2008 at 12:08 am — Compare the new Mann et al. results with the last 2000 years of the borehole data.

    What do you see?

  • Lazar // September 9, 2008 at 12:56 am

    Check figure S6… amazing.
    The correlation coefficient is 0.72 for EIV over 700-1850AD (my calculation).

  • bill // September 9, 2008 at 12:58 am

    gavin,
    Check Briggs’ most recent blog entry. He demonstrates the biases caused by jointly smoothing two independent series. Smoothing is a projection operator onto a lower-dimensional space. It forces certain structure on the data and that structure can come through as correlations.
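
    The effect bill describes is easy to reproduce; a quick R sketch (illustrative only, not Briggs’ code):

    # Smooth two independent white-noise series, then test their correlation
    # as if the smoothed points were independent observations.
    set.seed(1)
    n_sims <- 1000
    p_raw <- p_smooth <- numeric(n_sims)
    for (i in 1:n_sims) {
      x <- rnorm(100); y <- rnorm(100)                        # independent by construction
      xs <- as.numeric(stats::filter(x, rep(1/10, 10)))       # 10-point moving averages
      ys <- as.numeric(stats::filter(y, rep(1/10, 10)))       # (NA at the ends)
      ok <- complete.cases(xs, ys)
      p_raw[i]    <- cor.test(x, y)$p.value
      p_smooth[i] <- cor.test(xs[ok], ys[ok])$p.value
    }
    mean(p_raw < 0.05)      # near the nominal 5% false-alarm rate
    mean(p_smooth < 0.05)   # far higher: the naive test is fooled by the smoothing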

  • HankRoberts // September 9, 2008 at 1:14 am

    Has anyone tried rerunning Briggs’s (smoothed) hurricane paper data without the smoothing, to see if it changes the conclusion?

  • Glen Raphael // September 9, 2008 at 4:29 am

    Really? On all ~1200 proxies, which all seem to agree relatively well
    ClimateAudit recently posted a fantastic animated gif that browses some of the proxies to give a sense of the range of trends you can find in there. I’m not sure if this blog lets you embed gifs and this one’s wide enough that it would screw up the formatting anyway if it did, so I’ll just post the links. The gif you want to see is here:

    http://www.climateaudit.org/wp-content/uploads/2008/09/mwpproxy6.gif

    And this post gives some of the context:
    http://www.climateaudit.org/?p=3573

    Most of the hockey stick shape comes from relatively few of the proxies. For several of these few one could just as easily have selected alternative versions of the same or similar data that did not show the same trend. Since Mann didn’t state clear inclusion criteria, an anti-Mann who made slightly different choices could have done essentially the same study with the same statistical methods and the same “skill” metrics *without* producing a hockey stick. Which means the fact that Mann’s study does produce a hockey stick might only tell us that’s what he wanted it to show when he made his choices as to what to include.

    [Response: You've missed the point. The hockey stick shape comes from the proxies which correlate with the calibration data. Including only proxies which don't correlate with the calibration data would have no skill.]

  • Gavin's Pussycat // September 9, 2008 at 7:08 am

    bill, I think the problem is here:

    fit = cor.test(s0,s1)
    # store p-values and “lag”
    a[i,j-1] = fit$p.value
    d[i,j-1] = j

    Now I don’t know any R, but I am pretty sure the p levels here are computed assuming the absence of autocorrelation in s0 and s1. Obviously that doesn’t hold if these are moving averages. The R documentation seems to confirm this:

    If method is “pearson”, the test statistic is based on Pearson’s product moment correlation coefficient cor(x, y) and follows a t distribution with length(x)-2 degrees of freedom if the samples follow independent normal distributions. If there are at least 4 complete pairs of observation, an asymptotic confidence interval is given based on Fisher’s Z transform.

    So: correlation value, number of data points, but not any autocorrelation which would reduce the “effective” number of data points. Briggs’ plots look precisely what they would look like due to this error.

    This is the classical error Tamino has warned against also in connection with linear regression: doing it on autocorrelated data as if it were “white”, will give too optimistic results in testing.

    The problem is not smoothing, it’s knowing what you’re doing :-)

    [Response: You are quite correct.]
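
    One concrete way to account for that induced autocorrelation is to shrink the effective sample size using the lag-1 autocorrelations of the two series before testing. A rule-of-thumb sketch under an AR(1)-like assumption, not the only possible remedy:

    # Correlation test with an effective sample size reduced for autocorrelation,
    # using the rule of thumb n_eff = n * (1 - r1x*r1y) / (1 + r1x*r1y).
    cor_test_neff <- function(x, y) {
      ok <- complete.cases(x, y); x <- x[ok]; y <- y[ok]
      r1x <- acf(x, lag.max = 1, plot = FALSE)$acf[2]   # lag-1 autocorrelation of x
      r1y <- acf(y, lag.max = 1, plot = FALSE)$acf[2]   # lag-1 autocorrelation of y
      n_eff <- length(x) * (1 - r1x * r1y) / (1 + r1x * r1y)
      r <- cor(x, y)
      t_stat <- r * sqrt((n_eff - 2) / (1 - r^2))
      c(r = r, n_eff = n_eff, p = 2 * pt(-abs(t_stat), df = n_eff - 2))
    }

    Applied to heavily smoothed series like those in Briggs’ example, this should pull the false-alarm rate back much closer to nominal: the problem is the mismatch between the test’s assumptions and the data, not smoothing as such.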

  • Greg // September 9, 2008 at 7:26 am

    Mark:
    “I would like to see a graph of the proxies alone, to see if the hockey-stick shape is preserved.”

    SM put an animated gif of the proxies up at Climate Audit, you could look for it there. You’ll need to keep in mind that negative correlations also exist - the decrease in a proxy may correlate to the increase in temperature - and those proxies contribute as well.

  • Richard // September 9, 2008 at 8:38 am

    The way I see it is that climate is regional. If you take proxies from a large number of regions and try to integrate them into one dataset, the errors are exacerbated.

    There seems to be this slavish devotion to generating a global average temperature. However merging disparate datasets from widely dispersed regions can only increase the level of uncertainty in the average you calculate.

    The level of agreement between the proxies is very low. Climateaudit.org has a very good series of sequential images showing the proxies by site. All I can say is, that I cannot see any consistent relationship between sites.

  • Barton Paul Levenson // September 9, 2008 at 9:33 am

    nanny writes:

    Perhaps you’d need to point to some study that shows warming always or most likely causes more rain,

    google “Clausius-Clapeyron.”

  • Chris O'Neill // September 9, 2008 at 10:01 am

    nanny_govt_rescues_mortgage_lenders:

    You’d only need a few hockey-stick shaped proxies inappropriately weighted to stretch out a hockey-stick shape from the noise.

    And leaving out maligned proxies gives a hockey-stick too.

  • george // September 9, 2008 at 11:42 am

    Briggs’ point seems to be a special case of the general rule “don’t smooth/change the data without an error model”.

    Unfortunately, it is clear that Briggs was not talking about a special case or even a general rule!

    He categorically stated that

    “you absolutely on pain of death do NOT use the smoothed series as input for other analyses!”

    Of course, one should characterize the data with regard to noise and of course one should be careful about what one uses smoothed data for.

    Tamino has also pointed out the pitfalls of smoothing two independent series with a boxcar average of the same number of years and then applying correlation analysis to the smoothed series (namely, how it can introduce spurious correlation).

    But Briggs was making a blanket statement (“absolutely on pain of death do NOT use the smoothed series as input for other analyses!”).

    In other words, he was claiming that one should “never” do it.

    I learned years ago from a high school math teacher ALWAYS to mistrust claims like ALWAYS, NEVER and ABSOLUTELY.

  • andy // September 9, 2008 at 12:48 pm

    And Mann’s reference for this
    “The “cps” method is “composite-plus-scaling,” in which the available proxies are combined, then the composite is scaled to find the best match to the calibration data. In this method, data don’t have to match local temperature change, they can still be applied as long as they give information about global or hemispheric temperature change. ”
    method is, of course, some of his own earlier work. And among the so-and-so many proxies, there are included some whose quality is also so-and-so.

    Even Real Climate seems to guide the discussion about Mann’s HS to safer waters, ignoring the comments about the proxies and methods.

  • Ray Ladbury // September 9, 2008 at 1:11 pm

    Richard opines “The way I see it is that climate is regional. If you take proxies from a large number of regions and try to integrate them into one dataset, the errors are exacerbated. ”

    Yes, well, unfortunately, the way you see it is wrong. You have both regional and global climate. They interact. They interplay, but both exist. Your attempt to divert attention to regional climate is merely an attempt to sway the discussion away from global climate where models are skillful, to regional climate, where they are less so.

    And your blatant assertion that the errors of many independent proxies will compound assumes the errors will be correlated. Now that would indeed be a remarkable occurrence. Do you have any mechanism by which that would occur or evidence that it has? Didn’t think so.

  • J // September 9, 2008 at 1:16 pm

    Peter Webster writes: “The average of the TLT, TMT, TTS and TLS RSS processed MSU readings show a combined -.022 C per decade trend fall since 1980 (+.265 low/mid troposphere, -.019 troposphere/stratosphere, -.334 low stratosphere)”

    That’s a very peculiar statement.

    One of the “signatures” of greenhouse gases as a cause of climate change is the combination of warming in the lower troposphere and cooling in the stratosphere. (In other words, exactly what we observe in the RSS data).

    This is in contrast to, for example, solar forcing, which would cause warming in both the troposphere and stratosphere.

    So why on earth would you average the tropospheric and stratospheric satellite temperature trends like that?

  • Lazar // September 9, 2008 at 3:52 pm

    GP;

    … you’re correct, autocorrelation is not accounted for by cor.test (I use R).

  • Peter C. Webster // September 9, 2008 at 5:27 pm

    This may be helpful. http://www.atmosphere.mpg.de/enid/20c.html

    J, why average the satellite trends? No real reason, just the data. Simply as an illustration of the atmosphere combined. Ignoring the 15 degrees (27.5 for lower trop) of poleward atmosphere of course.

    But that is why I also listed the combined trop, trop/strat, and strat channels on their own. How one interprets the fact that, considering only the trop (TLT) and strat (TLS) channels alone (+.265 and -.334), the trend is basically flat at a change of less than .1 C per decade, or interprets it as the satellites showing the trop trending up and the strat trending down, is not anything I commented upon. I simply put up the numbers.

    On the other hand, there is also the fact that the GHCN-ERSST trend of +.16/decade over the same period is about the same as the lower trop sat of +.169/decade. This matches well, so the two enhance each other as far as reliability and correctness are concerned.

    But again, the real issue is causally linking a measured 16 ppmv to a sampled .16 C anomaly trend. Although as we know, not all of any actual warming would be due to just carbon dioxide in particular or even greenhouse gases in general.

    If one considers that the greenhouse effect provides an estimated 32 C of additional heat, and the current carbon dioxide equivalent is around 440 ppmv, it follows (ignoring any log behaviors) that this gives us about 14 ppmv per 1 C. This is versus the recent 100 ppmv per 1 C trend.

    Readers may parse that as they wish. It is simply something to think about.

    Also remember that GHCN-ERSST averages the more land-mass, more populated and more industrialized Northern Hemisphere with the less-of-each Southern, and also averages the samples of the lower-heat-capacity land surface with the higher-heat-capacity ocean surface. Why is that any more or less valid than combining all the satellite atmospheric readings?

    However, as we all know, we don’t live in water or in the stratosphere or upper atmosphere, we live in the lower troposphere on what are very often urbanized areas. Make of that what you wish.

  • Cougar // September 9, 2008 at 8:11 pm

    [we don’t live in water or in the stratosphere or upper atmosphere] Um… clue phone is ringing, and it’s for you. We live on the planet earth, third rock from the sun. You get the whole package starting at about 15 miles up when you enter the system. And all the bits “we don’t live in” interact constantly, reference any recent hurricane if you want a dramatic example. Reductionism becomes stupid, at some point, this opinion from a classicist. Learning to see the systems is harder but always more rewarding.

  • JM // September 9, 2008 at 8:32 pm

    Bill

    Perhaps I can express my earlier comments in a slightly more palatable form.

    Briggs, as I understand it, has a background in financial markets (as do I) and his point about not smoothing data (in the form of prices) has some validity in that context as the pricing of many securities depends on volatility (which is the term used in finance for variance).

    If you smooth a series you change the variance, and if you then feed that into further analysis you will end up mispricing a security and losing money.

    In climate the same thing applies; however, the variance of the raw data (daily observations) is called “weather”, and it is precisely what we want to get rid of, as we are trying to detect the longer-term signal from the physical process of the change in the planet’s heat balance - a longer-term process that we know is there from physical law, unlike financial markets where the “laws” are largely unknown and are not fixed.

    So in climate it is entirely appropriate to smooth - with due regard to statistical significance - as we are trying to separate the underlying long term change in heat balance from the day-to-day fluctuations caused by shorter term processes such as the day/night cycle.

    What is significant in financial markets (variance/volatility) is insignificant (variance/weather) in climate.

    When I drew a distinction between underlying processes grounded in physical law (climate) and underlying processes grounded in economics (market sentiment) I was trying to outline my view on the difference. Sorry if I didn’t do it very well, but I think personally that it is an important distinction.

    Sentiment changes, the laws of the universe do not.

  • JCH // September 10, 2008 at 12:57 am

    On Wall Street the sky can fall and not really have fallen. I don’t think the physical world is capable of any level of self-deception.

  • Ray Ladbury // September 10, 2008 at 1:48 am

    Peter C. Webster says: “… it follows (ignoring any log behaviors) that gives us about 14 ppmv …”

    Uh, Dude, given that it follows log(C/C0), why would we want to ignore “log behaviors”? That’s about as daft as averaging tropospheric and stratospheric temperature change.

  • Richard // September 10, 2008 at 2:58 am

    Ray Ladbury,

    “Yes, well, unfortunately, the way you see it is wrong. You have both regional and global climate. They interact. ”

    Well Ray, you are wrong. There is NO global climate. Climate by definition is regional. Check any definition of the subject you wish. None of them say that climate is global.

    My statistical advice is that unless each of the time series is properly calibrated, combining 1,200 or 1,300 series from disparate sources and methodologies and then tacking them onto the instrumental record will always produce hockey sticks with high levels of uncertainty and error.

  • bill // September 10, 2008 at 3:39 am

    JM,
    Not ignoring you. I’ve a date with the surgeon tomorrow and am busy with the prep.

  • William Briggs // September 10, 2008 at 9:32 am

    Hi all,

    Briggs here. Everybody’s welcome to cruise on over to my blog and see if you can catch me out on my statistics. Comments like those by Gav’s Pussycat and Lazar and others show that there is something most of you can learn about some pretty basic statistics. (Just a hint: the autocorrelation between the two simulated series is 0, folks. A simple glance at the code shows this.)

    I explain everything over at my place, and correct the many errors I see here.

    Your pal,

    Briggs

    [Response: The autocorrelation of the two simulated series may indeed be zero, but that of the *smoothed* series is not. That's pretty basic, and it's the reason for the result of your "demonstration" -- you failed to account for this when testing for correlation between the series. And by the way, there's no "auto"correlation *between* series -- "auto" refers to a single series.

    Were you not aware of this? Were you deliberately trying to deceive your readers?]

  • Barton Paul Levenson // September 10, 2008 at 10:36 am

    Gavin’s Pussycat writes:

    This is the classical error Tamino has warned against also in connection with linear regression: doing it on autocorrelated data as if it were “white”, will give too optimistic results in testing.

    I took the last 20 years of mean global annual Hadley CRU temperature anomalies and did time series regressions for them, from year X to 2007. I was able to show that you only get statistically significant regressions with 13 years of data or more, and that all the significant regressions indicated warming rather than cooling. (This was, of course, in response to the “global warming stopped in 1998” nonsense).

    But one of the denialists correctly pointed out that I hadn’t accounted for autocorrelation in the residuals. I challenged him to do it and he declined, but I wasn’t able to figure out the proper procedure to use either, though I had vague memories of ADF tests and rho and so on. Tamino — is there a simple way I could analyze the same data, accounting correctly for possible residual autocorrelation? I reproduce the data I used and the results I got below. Hope it’s readable. If anyone wants the Excel spreadsheet I used, I’ll email it to them.

    Year Anom Slope p
    1988 0.180 0.020 0.000 *
    1989 0.103 0.021 0.000 *
    1990 0.254 0.020 0.000 *
    1991 0.212 0.023 0.000 *
    1992 0.061 0.025 0.000 *
    1993 0.105 0.022 0.002 *
    1994 0.171 0.019 0.011 *
    1995 0.275 0.016 0.044 *
    1996 0.137 0.016 0.092
    1997 0.351 0.007 0.424
    1998 0.546 0.005 0.643
    1999 0.296 0.017 0.084
    2000 0.270 0.012 0.279
    2001 0.409 -0.003 0.618
    2002 0.464 -0.012 0.095
    2003 0.473 -0.017 0.116
    2004 0.447 -0.020 0.270
    2005 0.482 -0.040 0.179
    2006 0.422 -0.020 0.000 **
    2007 0.402

    [Response: The proper compensation for autocorrelation in a linear regression can be found here; you'll need to use the final equation of that post to get the factors f_j to use in the 5th equation of that post. You'll also need to estimate the autocorrelations \rho_j.

    The "usual" way to compensate for autocorrelation is to assume that the residuals follow an AR(1) process. In that case the autocorrelations go as \rho_j = (\rho_1)^j. But as shown here, that's not sufficient for global temperature data (see e.g. this graph).

    If you analyze monthly data, the AR(1) model just won't do. I suggest the autocorrelations are well approximated by \rho_j = \rho_1 \alpha^{j-1} (this corresponds to an ARMA(1,1) model). To estimate the parameter \alpha, you'll have to measure at least \rho_1 and \rho_2.

    Another suggestion: use all the data from 1975 on, after removing the linear trend, when estimating the autocorrelations. You'll need that much data to get good enough estimates of \rho_1,~\rho_2. Then use those values for all the time spans you test.

    If you stick with annual data, you might be able to get away with an AR(1) model.]
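
    A rough sketch of the annual-data case described in the response above, in Python. This is not Tamino's exact procedure: it uses only the simplest AR(1) effective-sample-size correction, N_eff = N (1-\rho_1)/(1+\rho_1); the function name and the synthetic example data are illustrative assumptions.

    import numpy as np
    from scipy import stats

    def ar1_corrected_trend(years, anomalies):
        years = np.asarray(years, dtype=float)
        y = np.asarray(anomalies, dtype=float)
        n = len(y)

        # Ordinary least-squares trend and its naive p-value
        slope, intercept, r, p_naive, se = stats.linregress(years, y)
        resid = y - (intercept + slope * years)

        # Lag-1 autocorrelation of the residuals
        rho1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
        rho1 = max(rho1, 0.0)  # ignore negative estimates for this correction

        # Effective sample size and inflated standard error (AR(1) assumption)
        n_eff = n * (1 - rho1) / (1 + rho1)
        se_adj = se * np.sqrt(n / n_eff)

        # Two-sided p-value with reduced degrees of freedom
        t_stat = slope / se_adj
        p_adj = 2 * stats.t.sf(abs(t_stat), df=max(n_eff - 2, 1))
        return slope, p_naive, p_adj

    # Illustrative use with synthetic anomalies (not real data):
    yrs = np.arange(1988, 2008)
    anoms = 0.02 * (yrs - 1988) + 0.1 * np.random.randn(len(yrs))
    print(ar1_corrected_trend(yrs, anoms))

    With strongly autocorrelated residuals the adjusted p-values come out noticeably larger than the naive ones, which is the point of the correction; for monthly data the response above recommends the richer ARMA(1,1) form \rho_j = \rho_1 \alpha^{j-1} instead.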

  • J // September 10, 2008 at 11:22 am

    Peter Webster writes: “If one considers that the greenhouse effect provides an estimated 32 C of additional heat, and the current carbon dioxide equivalent is around 440 ppmv, it follows (ignoring any log behaviors) that gives us about 14 ppmv per 1 C. This is versus the recent 100 ppmv to 1 C trends.”

    You can’t ignore the logarithmic relationship between CO2 and temperature, unless you want to get wildly unrealistic results.

    If you insist on simplifying things, at least take the logs (base 2) of preindustrial and current CO2 concentration, subtract one from the other, and divide your 1C temperature rise by that difference. The difference in logs should be approximately 0.43, depending on what values you use for current and preindustrial CO2. That gives you an estimate of climate sensitivity of ~2.3 C per doubling of CO2, within the IPCC’s stated range of 2-4.5 C/doubling.

    But this is insufficient, for several reasons.

    (1) Not all warming from 1750-2000 was due to CO2.

    (2) There are also cooling forcings (e.g., aerosols) that mask some of the warming.

    (3) There is a lag in the climate’s response to forcing, so we aren’t at the equilibrium T that you’d expect for 385 ppmv CO2.

    Seriously, if you want to understand the relationship between CO2 and global temperatures, you’re far better off reading some of the actual science, rather than just tinkering with numbers and expecting to get a reasonable result. The summary of the IPCC report, or a textbook like Ruddiman’s “Earth and Climate”, would be a good place to start.
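
    For anyone who wants to check J's arithmetic, here is a minimal sketch in Python. The CO2 values are illustrative round numbers (assumptions, not data); slightly different choices reproduce the ~0.43 and ~2.3 C figures quoted above.

    import math

    dT = 1.0          # assumed observed warming since preindustrial times, deg C
    co2_pre = 280.0   # assumed preindustrial CO2, ppmv
    co2_now = 385.0   # assumed current CO2, ppmv

    doublings = math.log2(co2_now / co2_pre)   # fraction of a doubling so far
    sensitivity = dT / doublings               # naive C per doubling of CO2

    print(f"{doublings:.2f} doublings -> {sensitivity:.1f} C per doubling")

    As the comment notes, this naive figure is only a starting point, since it ignores non-CO2 forcings, aerosol masking, and the lag to equilibrium.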

  • twawki // September 10, 2008 at 12:35 pm

    why are the last 10 years missing off the hockey stick - don’t they fit the belief?

  • William Briggs // September 10, 2008 at 12:40 pm

    Look here, old son, the point—the main reason—of simulating and then smoothing was to show that smoothing *induces* the autocorrelation (among other things; below). To say that I haven’t accounted for it is then to say something silly, no?

    I have talked about this at my blog. No doubt you missed it.

    *In fact, the smoothing induces a complicated structure, and not just an AR(1) series as you imply. The ARIMA structure induced depends on the smoothing method. Lag-k year smoothing induces an ARMA(k,1) or possibly ARMA(k,2) structure. The ‘k’ here does not mean the previous k-1 coefficients are in the model; just the kth autocorrelation is.

    [Response: A k-point moving average of a white-noise process induces an MA(k) structure. There's no ARMA(k,1) or ARMA(k,2) about it; to quote Gavin, "What rot!" You call yourself a statistician?

    You computed correlations of smoothed series, then applied tests for significance which were based on assuming white-noise series. That's when you FAILED to account for autocorrelation. This is either ignorance or deceptiveness on your part -- or both.

    My advice to those who want to learn about statistics: stay away from Briggs' blog.]
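
    For readers who want to see the disputed effect directly, here is a minimal sketch in Python. It does not reproduce Briggs's code; the series length, the 10-point window, and the use of a naive Pearson test are illustrative assumptions.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, k, trials = 100, 10, 2000
    false_pos = 0

    for _ in range(trials):
        x = rng.standard_normal(n)
        y = rng.standard_normal(n)  # independent of x by construction
        xs = np.convolve(x, np.ones(k) / k, mode="valid")  # k-point moving average
        ys = np.convolve(y, np.ones(k) / k, mode="valid")
        r, p = stats.pearsonr(xs, ys)  # naive test treats the series as white noise
        false_pos += (p < 0.05)

    print(f"naive false-positive rate: {false_pos / trials:.2f} (should be ~0.05)")

    The true correlation between the two series is zero, so a valid 5% test should reject about 5% of the time; treating the smoothed series as white noise rejects far more often, which is the failure to account for autocorrelation described in the response above.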

  • Ray Ladbury // September 10, 2008 at 1:42 pm

    Richard says, “Well Ray, you are wrong. There is NO global climate. Climate by definition is regional.”
    Uh, Richard, that’s about the dumbest thing I’ve ever read. OK, now, pay attention:
    1) Take your regional climate data (temperature, precipitation, wind, other energetic processes).
    2) Average it over the surface of the globe.
    3) Voila. Global climate.

  • J // September 10, 2008 at 3:24 pm

    As I see it, Briggs is just trying to make the (obvious) point that smoothing pairs of time-series data introduces autocorrelation into each smoothed data set, which in turn inflates the correlation coefficient between the two smoothed series.

    For some reason, he seems to think that’s an amazing revelation. I’m not sure why he thinks it’s so remarkable, but whatever.

    The obvious next step is to model the autocorrelation introduced by the smoothing process and compensate for the impact of that correlation on your significance tests.

    Briggs, however, draws the conclusion that you should never ever use smoothed data. That seems a bit extreme to me.

    In any case, his “demonstration” is silly, although from skimming through the comments it seems to have really, really impressed at least some of his readers!

  • Hank Roberts // September 10, 2008 at 3:46 pm

    How arcane is this question? Are there going to be people with backgrounds in statistics taking sides? How about some of the other statisticians who’ve spoken up here, care to comment?

    [Response: Smoothing series induces autocorrelation. Calculating the induced autocorrelation is straightforward. It affects correlation between the series. Testing their correlation as though they were white noise series is bogus. Arcane? Not at all.]

  • William Briggs // September 10, 2008 at 3:57 pm

    I’m getting you there, old sweetheart. First you and others hyped “autocorrelation” and now I have you admitting an MA process but denying the AR one. I think you’ll find this amusing: if you look at the ACF/PACF plots you will see the nice little spike at lag k. (Of *course* it’s not real!)

    Short-term memory loss is a problem, and often the first indication for a more severe problem. I’d have that checked out if I were you, just to be sure. But for now, let me remind you, the point of the exercise, my simulation, was to show how you could be misled, and become too confident if you smoothed one (or two) series and then analyzed the smoothed series. J, this comment is for you too (surf over to my place and find the spot where I say that smoothing a time series to predict new values of that series is okey dokey, and the part where I discuss measurement error as an indication for smoothing, then say you are sorry).

    Tell you what. We can solve this easily. There are two problems: (1) your odd claim that I don’t know statistics, and (2) smoothing doesn’t make you too confident.

    For (1), I’d agree to meet you and have an independent third party devise a statistical exam covering each major area of the field. We’d both take it. May the best man win. The loser has to, as you earlier suggested, insert a hockey stick where the sun don’t shine. Deal?

    For (2), let’s do this. How about we both apply a filter—say a padded (so we don’t lose n) 10 point Butterworth low pass—to two randomly generated series. We then compute the errors-in-variables regression (or ordinary, whichever you prefer) between the two. To check for significance, we can run a bootstrap procedure whereby we simulate the models (accounting for the “autocorrelation”, naturally). We’ll have to repeat the whole process, say, 1000 times to be sure. If it turns out that the confidence intervals are too narrow, I win. If not, you do. Deal?

    [Response: I repeat: a k-point moving average of a white-noise process has an MA(k) structure. There's no AR about it. If you maintain otherwise, then you sure as hell shouldn't call yourself a statistician.

    The only thing your "demonstration" demonstrates is that if you ignore the autocorrelation structure of two series, instead testing their correlation as though they were white noise, then you get false results. You've succeeded in impressing the ignorant.

    As for the game you want to play, I think you know where you can put it.]
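
    For what it's worth, the textbook result behind this exchange can be written down directly. If y_t = (1/k) \sum_{i=0}^{k-1} x_{t-i} is the unweighted k-point moving average of a white-noise series x_t, then counting overlapping terms gives the autocorrelations \rho_j = (k - |j|)/k for |j| < k, and \rho_j = 0 for |j| \ge k. Every lag out to k-1 is nonzero and none beyond that, so no AR component is involved. This is offered only as a reference point for the unweighted case; neither party's exact simulation or filter is reproduced here.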

  • Dano // September 10, 2008 at 4:25 pm

    This post certainly is flypaper for totem-sniffing denialists.

    At any rate, it is Dano’s long experience that folk such as Briggs who bring the bluster (you rrrrawk, dude! Awesome smackdown! F- yeah G!) are hiding something. Not 100% effective indicator, but certainly 95%.

    Sometimes the bluster is to hide something personal, other times it is to hide factual content (or lack thereof); on The Internets there are no facial or physical cues to pick up on, so I go with the latter.

    Way back when, the best comment board was on Tech Central Station, and I used to give such bluster a different font face and color to call it out.

    As for the game you want to play, I think you know where you can put it.

    This same game is being played in the presidential campaign. It has a long history in this country - a generation now. That doesn’t make it right, but unfortunately it is a part of our public discourse. Sadly, we must deal with it.

    Best,

    D

  • TCO // September 10, 2008 at 4:57 pm

    Is there a use, a benefit, to processing the two series (in Briggs’s toy example) by smoothing, then regressing, then doing a significance test (even with reduced DOF)? I mean is it “better” to do it this way versus just regressing the data themselves? Or is it the same (and if so, why add the step)? Or is there some disadvantage (even when adjusting the DOF)?

    P.s. For both, please drop the bravado (where to stick the stick). Both of you know some cool stuff. Both are willing to engage somewhat with commenters.

  • Otto Kakashka // September 10, 2008 at 4:58 pm

    Is it not possible to adjust degrees of freedom for significance testing of two autocorrelated series? Something like: N’ = N (1-r1)/(1+r1)

  • Peter C. Webster // September 10, 2008 at 5:02 pm

    I don’t know what you are trying to say that’s not obvious, Cougar. Of course the system interacts at all levels of the atmosphere. I am only pointing out the satellite shows the stratosphere on a downward decadal trend, the satellite and surface readings show the troposphere on an upward decadal trend. Perhaps it doesn’t need to be pointed out. If so, you may ignore it. My point is not to tell you anything, but rather bring up things to think about that may not be on the conscious mind, or to get folks thinking about some issues they may take for granted. I’m just bringing things up, no need to attribute some agenda to my doing so.

    Yes Ray Ladbury, but I was only saying that a simple 280 ppmv and 32 C would give us 14 ppmv per 1 C. Of course, you can’t really do that because it is a log, which is why I mentioned I was ignoring that fact. This is clear from observation of a 100 ppmv rise since 1880 and a likely related rise in the anomaly trend of .8 C over the same period; certainly not 14 ppmv per 1 C.

    But averaging the entire atmosphere to get a picture of the system overall is no more daft than averaging the average gridded anomalies of the N/S hemispheres, day and night side, land and water, and urban and rural. Or a mean of min and max for the day, adjusted for TOBS and other factors, et al.

    J. Of course you can’t ignore the log in practice; as I said, it is simply something to think about. What we know is the last 130 years have a rise in measured atmospheric carbon dioxide of 100 ppmv and during the same time the anomaly trend is up .8 C

    The “what if” is 1) All the trend rise is from CO2. 2) None of the trend rise is masked. 3) There is no lag. And the big one 4) The relationship is linear.

    In that case, ignoring the time before 1880, at best going from 300 ppmv to 600 ppmv would give us 3 * .8 or 2.4 C for a doubling. The problem is decoupling 1-3 from each other, especially given that all 4 of those are known to be false. The point is that all else equal (and decidedly unphysical in the conditions) the data shows at best 2.4 C for a doubling. Unless of course the lag and negative forcings are larger than thought.

    Why “what if” it at all? Prove the anomaly trend reflects an actual rise of total energy levels in the system. Establish a causal relation between greenhouse gas levels and the anomaly. Then we get into your points: Quantify the contribution of the greenhouse gases, “cooling forcings” and lag in the system overall and in relation to each other.

    I understand the relationship between CO2 and global temperatures just fine. The fact it absorbs outgoing long-wave infrared radiation is well known. The overall net effect is an unknown, although lacking it in ModelE loses us 9% of the greenhouse effect. 10% of the net effect seems like a reasonable number.

    The IPCC gives 2005 CO2 radiative forcing levels as 1.49 to 1.83 Wm-2, and the other 3 long-lived GHG as .88 to 1.08 Wm-2, in the AR4 WGI SPM on page 4. http://www.ipcc.ch/pdf/assessment-report/ar4/wg1/ar4-wg1-spm.pdf If 1.75 gives us 10%, 1 would give us 5.7%. How that plays out in the system overall? Good question. I will not attempt to answer it.

    If all else was held equal, increases in the levels of the long-lived GHG increase radiative forcings, and therefore temperature. Since the weather system moves things around the globe, this would have a global impact. As you’ve mentioned however, not all else is held equal, so the relationship is, well, what it is.

    Again, make of the data what you will, and come to your conclusions as to what it means, calculate it as desired.

    Good luck.

  • dean_1230 // September 10, 2008 at 5:06 pm

    Ray,

    Is that truly the definition of “global climate”? If so, then is it really nothing more than an amalgamation of regional climates?

    If it is, and if as has been stated elsewhere that GCMs are not useful in determining regional climates, then exactly what is it that a GCM shows? Is there any way to go from GCM to regional climate? If that’s the case, then how with any confidence can we claim anything (arctic melting, for example) is a direct result of AGW?

  • george // September 10, 2008 at 5:14 pm

    Briggs’s claim that “you absolutely on pain of death do NOT use the smoothed series as input for other analyses!” would be much more accurately (and usefully) stated as “one should be very careful about using smoothed data in later analysis”

    But even the latter is not a particularly profound revelation (though it is one that is too often ignored) and one need not be a statistician to understand that.

    Any time you smooth a data series, you are losing information — just as you lose information in images when you compress them (as jpegs, for example)

    But the real question is whether the loss of information significantly impacts the later analysis (or in the case of the image, whether the compression significantly degrades the image from a viewing standpoint, or even from a “recognition” standpoint.)

    In some cases it might. In others, it might not. It really depends on the details of the case.

    There is a trade-off to be had by smoothing. One might lose a relatively small part of the signal one is interested in (eg, the long term [multidecade] “climate” part of the temperature signal), but at the same time, one might eliminate a lot (or even most) of the part of the signal that may be less relevant to one’s analysis (eg, short term weather noise)

    For the case of an image with lots of noise, smoothing might actually make the image (of a face, for example) more readily recognizable by a person viewing it. The unimportant information from the standpoint of recognition (even the zits on their face) can be suppressed, leaving the relevant information (the shape and proportions of facial features, distance between eyes, etc) essentially intact.

    While it is important to understand how the smoothing might affect the later analysis, it is simply incorrect to claim that “you absolutely on pain of death do NOT use the smoothed series as input for other analyses!”

    As I indicated above, there are lots of cases that belie the above categorical claim.

  • William Briggs // September 10, 2008 at 5:43 pm

    Tamino, my old pal, I give up. But I tell you what. Next time you are in NYC, let me know and I will be very happy to treat you to a beer. We’ll go to the Ginger Man on 36th. They always have at least two casks going.

    I’m serious about this. Email me.

  • Hank Roberts // September 10, 2008 at 5:46 pm

    > devise a statistical exam covering
    > each major area

    The trucks are here to move the goalposts. How far did you want them moved?

  • Hank Roberts // September 10, 2008 at 5:55 pm

    http://www.nature.com/nature/journal/v451/n7176/fig_tab/nature06589_F1.html#figure-title

    Ice age climate and solar variability, long term

  • Timothy Chase // September 10, 2008 at 7:58 pm

    Tamino wrote in response to Hank Roberts:

    Response: Smoothing series induces autocorrelation. Calculating the induced autocorrelation is straightforward. It affects correlation between the series. Testing their correlation as though they were white noise series is bogus. Arcane? Not at all.

    Thank you — that helps a lot. I had begun looking things up earlier this morning, but it would have taken a bit to make it that far.

  • luminous beauty // September 10, 2008 at 9:02 pm

    Dr. Briggs makes a trivially true argument that statistically naive observers can make a serious mistake in seeking to discover correlation between smoothed series.

    The serious bullshit in Dr. Briggs’s argument is in creating the insinuation that such a bonehead error was actually committed by Mann, et al. in their recent paper.

    Given that the proxies are chosen precisely because they are, a priori, known to be strongly correlated with historical instrumental data, it would seem the prime condition of Dr. Briggs’s hypothesis is not satisfied.

  • David B. Benson // September 10, 2008 at 11:14 pm

    Peter C. Webster // September 10, 2008 at 5:02 pm — The equilibrium climate sensitivity consists of two parts: a very fast response (less than seven years), about 60%; the rest, which takes many centuries, at least a dozen, maybe two, before being close to equilibrium.

    So use the standard formula from “Global Warming: Understanding the Forecast” by David Archer;
    sample chapter 4 on greenhouse gases available as a pdf here:
    http://forecast.uchicago.edu/samples.html

    plug in your start and end values of CO2 concentration and the temperature change over that time to determine an approximate climate sensitivity. Don’t forget that 60% factor.
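
    A sketch in Python of the kind of estimate Benson is pointing at. This is not Archer's formula verbatim; the CO2 endpoints, the 0.8 C warming, and the 60% realized-response fraction are the round numbers discussed in this thread, and the calculation attributes all of the observed warming to CO2.

    import math

    dT_observed = 0.8         # assumed warming over the instrumental period, deg C
    co2_start = 290.0         # assumed CO2 at the start of the period, ppmv
    co2_end = 385.0           # assumed CO2 now, ppmv
    realized_fraction = 0.6   # assumed fraction of equilibrium response realized so far

    doublings = math.log2(co2_end / co2_start)
    sensitivity = dT_observed / (realized_fraction * doublings)

    print(f"~{sensitivity:.1f} C per doubling (rough, CO2-only attribution)")

    With those inputs the answer lands a bit above 3 C per doubling, inside the 2-4.5 C range quoted earlier in the thread.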

  • Ray Ladbury // September 11, 2008 at 1:32 am

    Dean_1230, that is purely my own definition. In any case, the important thing is that Earth’s climate is a system, and the behavior of that system at one energy state will be different than at another energy state, even if we cannot say with 100% certainty where every joule of energy will go.

  • Hank Roberts // September 11, 2008 at 2:08 am

    OK, repeating the question above (I tried to ask it at Briggs’s website and couldn’t post, for whatever reason). Has anyone redone Briggs’s hurricane papers that did use smoothing — that he says now was a mistake? Is the conclusion different? (I don’t have cites to the papers so haven’t looked for papers citing them myself to see if a correction was needed in the journal.)

  • Richard // September 11, 2008 at 2:43 am

    Well if you think that you can average all the regional climates and come up with a blancmange global average climate then go ahead. It means nothing, it is nothing and it has no application in the real world. It might have application for modellers but they have ceased to live in the real world for years now.

  • Richard // September 11, 2008 at 2:44 am

    Sorry my previous was addressed to Ray Ladbury.

  • Gavin's Pussycat // September 11, 2008 at 6:57 am

    Tamino: It affects correlation between the series.

    Actually, no. The correlation is and remains zero. It affects the procedure for establishing this fact — offering a golden opportunity to get it all wrong…

    Just a semantics nit ;-)

  • Gavin's Pussycat // September 11, 2008 at 7:29 am

    TCO:

    Is there a use, a benefit, to processing the two series (in Briggs’s toy example) by smoothing, then regressing, then doing a significance test (even with reduced DOF)? I mean is it “better” to do it this way versus just regressing the data themselves? Or is it the same (and if so, why add the step)? Or is there some disadvantage (even when adjusting the DOF)?

    These are actually very good questions. It depends. If both are done correctly, they should produce the same test result.

    I suppose it depends on what you’re after. If you know, say, that you have time series consisting of a slowly varying signal (the stuff you’re interested in) and a high frequency noise (for your purposes, data error that you want to get rid of), then you’re gonna smooth anyway. And then you have a choice of doing the test on the raw or on the smoothed data.

    But the important thing is that in either case, you must use the correct autocovariance structure for the data you’re testing. Note that some data may be autocorrelated without any smoothing at all! Then you’re in trouble too if you use the test for white time series, and no use blaming smoothing…

    More generally, I would rather not rely too much on the r^2 statistic, which is rather quick and dirty. What I would do with data, is fit a model (physical hypothesis) to it in the least-squares (parametric model) sense, using a realistic a priori variance-covariance model for the data as a full matrix. That gives you an estimator for the variance of unit weight that is Fisher, or chi-square, distributed and is a good diagnostic for the general validity of your model assumptions. And then there are ways of finding individual gross errors or “outliers” if there is any suspicion of that.

    But many smoothing algorithms are actually ad hoc, semi-empirical models of just this kind…
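
    As a concrete, toy version of the full-covariance fit described above, here is a sketch in Python: a linear trend is estimated by generalized least squares with an assumed AR(1) covariance matrix, and the a posteriori variance of unit weight is computed as the model-validity diagnostic. The covariance model, the synthetic data, and the parameter values are all illustrative assumptions, not anyone's published method.

    import numpy as np

    def gls_trend(t, y, rho, sigma):
        n = len(y)
        A = np.column_stack([np.ones(n), t])        # design matrix: intercept + slope
        # Assumed a priori covariance model: AR(1), C_ij = sigma^2 * rho^|i-j|
        C = sigma**2 * rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
        Cinv = np.linalg.inv(C)
        N = A.T @ Cinv @ A                          # normal matrix
        x_hat = np.linalg.solve(N, A.T @ Cinv @ y)  # GLS estimate of (intercept, slope)
        resid = y - A @ x_hat
        dof = n - A.shape[1]
        var_unit_weight = (resid @ Cinv @ resid) / dof  # should be ~1 if the model fits
        return x_hat, var_unit_weight

    # Synthetic data: trend of 0.02/yr plus AR(1) noise with rho = 0.5, sigma = 0.1
    rng = np.random.default_rng(1)
    t = np.arange(50, dtype=float)
    noise = np.zeros(50)
    for i in range(1, 50):
        noise[i] = 0.5 * noise[i - 1] + rng.standard_normal() * 0.1 * np.sqrt(1 - 0.25)
    y = 0.02 * t + noise

    print(gls_trend(t, y, rho=0.5, sigma=0.1))

    If the assumed covariance is roughly right, the variance of unit weight comes out near 1; values far from 1 flag a mis-specified noise model, which is the diagnostic role described in the comment.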

  • andy // September 11, 2008 at 10:25 am

    Some aspects Tamino / Mann could have a look:

    Just replace GISS temps with satellite measurements, and ~0.2 C of the warming during 1978 - 2008 disappears.

    Replace old Briffa Torneträsk series with Grudd 2008 series.

    Replace the old bristlecone series with the Ababneh thesis. Etc., etc.

  • Patrick Hadley // September 11, 2008 at 11:59 am

    While I am no fan of the hockey stick in any of its forms, Andy is mistaken if he believes that there has been a 0.2C higher rise in GISS than in the satellite measurements since 1979.

    GISS - RSS shows no trend at all over the period, whereas GISS - UAH has risen by just around 0.1C over the last 30 years.

  • Ray Ladbury // September 11, 2008 at 12:24 pm

    Richard, actually since energy balance applies to the planet as a whole–it being the closest thing we have to a closed system–average global climate is indeed quite meaningful. How energy flows is quite variable. The global flux less so, and the energy in the system determines the phase space available to it.

  • Andy // September 11, 2008 at 2:22 pm

    Why do the instrumental records end in 2000, not 2008?

    [Response: They don't. The axis labels end at 2000, but the data go to 2006.]

  • Andy // September 11, 2008 at 2:47 pm

    OK. Thanks. Was just expecting to see a small downtick at the end of the graph.

  • dean_1230 // September 11, 2008 at 2:53 pm

    Ray,

    The energy balance does apply to the overall planet, but all that does is set the total…. It seems that it gives little insight as to what to expect. In simplistic form, what it says is that

    X + Y = 7

    But what are X & Y???

  • Ross // September 11, 2008 at 3:52 pm

    Mr. Pussycat,

    You state that smoothing is ok, as long as you know what you are doing. Matt Briggs differs. Hu McCulloch (an experienced econometrician) appears to take a middle ground, but believes that Mann et al don’t know what they are doing (did not adequately reduce the degrees of freedom).

    http://www.climateaudit.org/?p=3594 (comment 23)

    Any thoughts on Hu’s analysis?

  • Barton Paul Levenson // September 11, 2008 at 3:55 pm

    Richard writes:

    Climate by definition is regional.

    Does the WMO define climate as “mean regional or global weather over a period of 30 years or more?”

  • Barton Paul Levenson // September 11, 2008 at 3:58 pm

    Thanks, Tamino!

    -BPL

  • Barton Paul Levenson // September 11, 2008 at 4:07 pm

    Richard writes:

    Well if you think that you can average all the regional climates and come up with a blancmange global average climate then go ahead. It means nothing, it is nothing and it has no application in the real world.

    Is the climate on Venus hotter than the climate on Earth?

  • Wade Michaels // September 11, 2008 at 4:37 pm

    Hank,

    I could be wrong, but I thought Briggs replied to Gavin on his post that his “original” paper included what you’re looking for, but in the “published” version it was cut out for brevity or some such.

  • Ray Ladbury // September 11, 2008 at 4:52 pm

    Energy balance determines the phase space–the states the climate could be in, and in a large system, the volume of phase space increases rapidly with energy. More phase space means less predictability for one thing, more fluctuations. And that’s without knowing anything about the system. If we also know forcings and some of the dissipative mechanisms, … Global climate will not tell us with confidence what will happen in Western Australia. It is what we can simulate at present and it does at least give us an inkling of what’s going on at sub-grid scales.

  • Lee // September 11, 2008 at 5:02 pm

    Richard and dean,

    No one is averaging regional climates to arrive at a global climate.

    People are averaging regional climate CHANGES to arrive at a global climate CHANGE.

    This is a critical and simple fact, too often distorted.

  • Dano // September 11, 2008 at 5:13 pm

    Ross at 9-11 3.52 PM

    Any thoughts on Hu’s analysis?

    We’ll think about it when he sends his analysis into the PNAS, thanks. If it’s valid, it’ll show up there.

    Best,

    D

  • Ross // September 11, 2008 at 8:53 pm

    Dano,

    Come on, man. Who wants to wait for the PNAS to publish? One of the fun things about blogs and associated discussion is the ability to discuss in real time. AGW proponents generally like nothing better than to tear down shoddy arguments of the other side. Tamino does it all the time. So does RC - for both blog posts and peer-reviewed articles. So let’s not get all snotty here. Hu’s comments appear very simple for someone with a strong basis in this type of statistics to confirm or refute. Gavin’s Pussycat clearly has that basis (I don’t) so it would be interesting to me (and possibly others) to get an expert’s view. D, learn to embrace instant gratification - you’ll enjoy life far more.

    So Mr. Pussycat, please ignore Dano’s interruption. Thanks.

  • Peter C. Webster // September 11, 2008 at 10:00 pm

    Of course, David Benson. The ballpark estimate of what the climate sensitivity might be is certainly someplace between +2 and +4.5 C. Or +/- 0 or +/-1 or +/-100 or +/- infinity et al.

    I’d peg it at -10 and +10 C per doubling of carbon dioxide, after you incorporate all the other variables of future unknown sign and magnitude that contribute to it in the system for a net effect. Sadly, I can’t prove any of those empirically. If past performance is any indicator of the future, we are on the way to seeing about a 2 C total increase in the anomaly trend for another 200 ppmv of carbon dioxide added to the last 100 ppmv, which would be a doubling from 300 to 600. Since we know that the next 100 and next 100 will be a log, even 2 C might be a little high. But that also ignores what other events may happen on the way to that 600 ppmv in the atmosphere.

    But what carbon dioxide does on its own is of very little use if you can’t factor out all the other variables in some quantitative physical manner. What the models suggest may be a good idea of what might happen. So split the difference and put it as 3.25 C per doubling.

    Patrick Hadley, you are correct to a point on GISS vs RSS, but off a bit on the numbers. As I’ve mentioned, since 1979, RSS shows the satellite per-decade trends in C of

    Lower troposphere +.169
    Upper troposphere +.096
    Troposphere/stratosphere -.019
    Lower stratosphere -.334

    Over the same period, GISS shows a +.2 C trend per decade.

    So if you are looking to compare lower troposphere, my take is that a difference of .031 is inconsequential. If .2 is consequential or not, or what X% of that .2 C per decade is directly related to 17.3 ppmv per decade of carbon dioxide, that is another matter.

    You ask if it’s hotter on Venus, Mr. Levenson. Excuse my confusion on the matter, but are you actually trying to compare:

    Planet 1: Oxygen-nitrogen atmosphere that has water vapor, water clouds, water seas, carbon cycle, life, plate tectonics, an axial tilt, a magnetic field, a 24 hour rotation, 1 ATM at surface and a 7-17 KM troposphere.

    Planet 2: Carbon dioxide-nitrogen atmosphere with no water vapor, sulfur clouds, all land, no carbon cycle, no life, no plate tectonics, no axial tilt, no magnetic field, 243 day rotation, 92 ATM at surface, and a 65 KM troposphere.

    So your answer is: yes, it’s hotter on Venus. Which has nothing to do with Earth, so is a non sequitur.

  • Dano // September 11, 2008 at 11:09 pm

    Ross, the point is that if it’s groundbreakingly wonderific, the scientific community needs to know about it.

    The community knows about it via the journals and listservs.

    If it’s worth sharing with the community, it’ll show up there. Otherwise it’s entertainment.

    RC & Tamino clarify for the lay public and advanced amateurs, they don’t publish their new work on their blogs. The practitioners know this already and know where to go.

    The amateurs try to find crumbs on blogs to feast upon, and some declare them picnics.

    Best,

    D

  • David B. Benson // September 11, 2008 at 11:48 pm

    Peter C. Webster // September 11, 2008 at 10:00 pm — You missed my point. I’ve given you a correct way to use the data for the past 158 years, or part thereof, to obtain an estimate of climate sensitivity. Or you could simply use the LGM stable climate estimate of 3–4 K, explained in a thread on RealClimate.

  • Chris O'Neill // September 12, 2008 at 1:49 am

    Peter C. Webster:

    So your answer is; yes it’s hotter on Venus. Which has nothing to do with Earth,

    apart from both planets obeying the same laws of physics and the fact that Venus has less solar radiation at its surface than the earth. Which of your variables explains why it is hotter at the surface with less solar radiation?

  • dhogaza // September 12, 2008 at 4:39 am

    Who wants to wait for the PNAS to publish? One of the fun things about blogs and associated discussion is the ability to discuss in real time.

    In other words, the revelation is so significant that the author doesn’t want to claim fame by actually submitting it to the normal science community vetting process?

    He’s so altruistic that he’s giving up his place in science just so he can publish as a comment on a blog that a large percentage of climate scientists won’t read because the owner is a denialist shithead fanatic?

    Wow. It’s a pity that people on your side have to publish this way, without peer review (they’d hate that, right?), without any possibility of enhancing their career, etc.
