Inspired by the statistics classes I’m currently teaching, and a conversation I recently had in the pub with some colleagues (because I’m just that exciting), I’ve been wondering about why p < 0.05 is the most common threshold for statistical significance, at least in the psychological sciences. I realised that the choice of threshold was probably arbitrary to a certain extent, but I thought that maybe it was at least a useful arbitrary value for whatever purpose p values were first used for. I had been teaching about t-tests, so they were on my mind. I knew that Student’s t-test was created by William Gosset to help quality control at the Guinness brewery (the brewery forced him to publish under a pseudonym – Student – to conceal from competitors that they were using statistics). Perhaps a false positive rate of 1 in 20 was considered to be a reasonable error rate in brewery quality control? Apparently not…
The threshold, or indeed any threshold, doesn’t seem to have arisen with Gosset. P values certainly pre-date Gosset and the t-test anyway, but the publication of his tables of the t-statistic (or rather, what he referred to as the z-statistic), and the tables of his colleague Pearson’s χ2 distribution, provided precise p values to 4 decimal places for a given value of t or χ2. Instead, our fixation on p < 0.05 seems to be at least in part due to the issues between Pearson, and another statistician, R.A. Fisher. Fisher had created more statistical tests, and wanted to reproduce Gosset’s tables. However, permission was refused because of financial issues over granting copyright and disagreements over theory between Pearson and Fisher, so Fisher had to re-create the tables. Fisher rearranged the data, and instead of providing exact p values for a value of t he provided t values for values of p.
Sections of Student’s (top) and Fisher’s (bottom) tables. From Clauser (2008) Chance, 21(4):6-11
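The difference between the two table layouts is easy to reproduce in R. This is only a rough sketch: the degrees of freedom, the t value and the two-tailed convention are my own choices for illustration, and the original tables did not necessarily use the same conventions.

```r
# Degrees of freedom and observed t value, chosen purely for illustration
df <- 10
t_obs <- 2.5

# Student-style lookup: start from a t value and read off a p value
# (two-tailed here, which is an assumption made for this sketch)
p_from_t <- 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)

# Fisher-style lookup: start from round p values and read off the t value
# needed to reach each of them
p_levels <- c(0.2, 0.1, 0.05, 0.02, 0.01)
t_from_p <- qt(1 - p_levels / 2, df = df)

print(round(p_from_t, 4))
print(round(setNames(t_from_p, p_levels), 3))
```

With the second layout you never see the exact p value for your own t; you only see which round-number column it falls beyond, which is perhaps part of why the round numbers stuck.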
Although it is apparently “a matter of historical fact that Fisher was the first to have published tables in this form”, there is evidence pre-dating Fisher and Pearson that p values were considered as an indication of findings of further interest, and the threshold of interest was usually around 0.05. Warnings about the overuse of thresholds of significance were also surfacing as early as 1919, six years before Fisher’s tables. So it seems unfair to lay the blame for our p value obsession at Fisher’s door, but the publication and widespread use of his tables in a form that focused on round p values seems to have helped to reinforce the habit. Fisher doesn’t appear to have recommended the use of absolute thresholds of significance; he considered p values above 0.2 to be indicative of no effect, but values between 0.05 and 0.2 to be a suggestion that an effect might be detectable with sufficient modification of the experiment. Most of his tables reflected this; they provided values of several test statistics for a range of p values. However, when he produced tables for his newly introduced F statistic, values were only produced for p = 0.05 for simplicity. Although later versions expanded to include other p values, people seem to have latched on to 0.05 as an important value.
Perhaps because the tables opened up the arcane world of statistics to a wider audience, or maybe because of some historical tendency towards 1 in 20 as an intuitive compromise of sensitivity and false-positives, Fisher’s tables seem to have left us with the one thing that everyone who knows anything about statistics ‘knows’. Maybe if Fisher and Pearson had been on better terms, undergraduate statistics might have been very different…
Clauser, B. (2009). War, enmity, and statistical tables. CHANCE, 21(4), 6-11. DOI: 10.1007/s00144-008-0004-8
Stigler, S. (2009). Fisher and the 5% level. CHANCE, 21(4), 12-12. DOI: 10.1007/s00144-008-0033-3
Also, see Gerard Dallal’s article Why P = 0.05? for more detail, or if you can’t access the papers.
Mathematics is vital to neuroscience, biology, and science as a whole. Whether it is statistical examination of experimental data, or differential equations describing the behaviour of dynamical systems, maths is involved. And increasingly so: as the processing power of computers increases, new theoretical fields that examine vast swathes of data, such as bioinformatics, can open up. As subjects that are covered on introductory degree courses increasingly rely on maths, the skills taught in GCSE maths become increasingly inadequate. When students get to degree level there is often a resistance to doing maths; I’ve personally taught students who have stonewalled at the first mention of the mathematics behind statistics. While you might be able to learn how to perform a particular maths based method without understanding the calculations behind it, without understanding at least the concepts they are based on you won’t understand why you perform that particular method. Or more importantly, in which circumstances you shouldn’t perform that method, and how to tell which is which.
So do we need to teach GCSE mathematics better to our under 16s? Or make A-level maths compulsory for everyone, or at least those who intend to take a maths-centred subject? Perhaps not. Part of the problem with maths is that it is so abstract. The students that I taught, after a bit of goading, understood the concepts they needed to as long as they were explained in concrete terms.
Stephen Curry hits on this in his THE article and blogpost: The way to get science students to learn and understand maths is not to teach abstract maths better, but to teach maths as an integral part of science. Although maths can be studied as an abstract field, for science it is a tool. Mathematics is a way of describing quantitative situations and problems, and applying it to concrete concepts makes it easier to understand for most people.
Note that this is not a different way to teach maths to those who don’t ‘get’ maths the ‘normal’ way. Curry points out that Fourier developed his analysis method to determine the conduction of heat along an iron bar, and Alan Kay in Doing with Images Makes Symbols relates an informal survey that found the vast majority of top mathematicians (all but a few out of 100) worked on maths problems by visualising or feeling the problem, rather than through symbolic abstractions. Practice is key too: no one expects to be able to play piano with any degree of competency without practising, but people switch off early with maths, because they don’t ‘get’ it. Yet Curry reports that on his Imperial course there is no detectable effect of having a maths A-level on final results – the key is the ability to learn and practise applied maths.
Some see a problem beyond scientists using maths, and suggest that we need new ways of representing mathematical concepts in general; that abstract notation makes no sense in a world where we have the technology to create interactive visual and visceral representations. Teaching maths by using equations, the article suggests, is like teaching everyone to use computers through the command line.
No reasonable person would expect to be able to pick up most subjects immediately, or to commit to memory abstract rules and theories without having an understanding of how they related to concrete examples. Somehow, mathematics has remained at least partially separated from this reasonable expectation.
Perhaps it’s time for a change.
No, no superstitions about the number 13 delaying this post, I’ve just been doing other things (thesis writing, resting from thesis writing until my brain cools down, feeling guilty about doing anything other than thesis writing or “resting”). Hopefully I’ll pick up again a little bit. Here are some links in the meantime:
The R blog Revolutions reports on a researcher who used R to plot the similarity of journals to guide him on where to submit. He provides the code for those of you who are interested in doing the same thing.
How can we change the culture of science education? Alison Campbell reviews an article in Science.
A fine introduction to MRI analysis. “I don’t think a more concise, accessible explanation of fMRI statistics exists” (Micah Allen – @neuroconscience)
Researchers are getting creative with optogenetics – a group at Georgia Tech used an ordinary LCD projector to control a modified worm, and it’s starting to make its way into studies using fMRI. There was a pretty good article in SciAm on optogenetics (pdf) at the end of last year recapping what it is and what it’s used for.
And more MRI – for those of us with little experience in the field, four things to keep in mind when you’re reading about fMRI.
Finally, a (former) department colleague Mark Humphries has a publication in J. Neuroscience on a new method to discover different patterns in neural activity – Spike-Train Communities: Finding Groups of Similar Spike Trains. I think he used my data!
At the moment I’m trying to learn the R language, which can be used for analysing and graphing data. It’s not going quite as well as I’d hoped, mostly because I’ve only ended up trying it out when I’m analysing data from my thesis. This means that 1) the applications that I’m learning are very limited, and 2) I tend to end up trying to sort out problems in my R code and my stats at the same time. Not a good way to learn. To help me along a little bit, I’ve decided to start blogging summaries of one or two chapters at a time as I work my way through Everitt, An R and S-PLUS Companion to Multivariate Analysis (Holy Christ it’s £60! Thank you university library…). I’ll flesh it out with posts on other applications of R that I try out and when I use R to do my thesis analysis, so it won’t all be multivariate stats. Hopefully it’ll give me motivation to get through the book as well as giving me a chance to reflect on what I’ve read. I’ll also have something I can use as a resource in the future. I’ve got a couple of colleagues who are getting into R as well, so they might be interested, and even chip in!
I got carried away yesterday sorting out some figures for my thesis, like so:
An example of a substantia nigra recording site. Ooooo, ahhhh!
Here are some slightly late links to make up for things.
Three new blogs on the roll: Firstly, There and (Hopefully) Back Again by @biochembelle, on “the adventure and challenge of science and academia”. It’s nice to read other postgrads/docs sharing their research experience, and this one is sprinkled with the odd piece of useful research advice, like the Martha Stewart inspired Six Things to do in the Lab Everyday. Conveniently, Biochem Belle has just posted a quick tour of her favourite posts.
Secondly, Oscillatory Thoughts, where Bradley Voytek writes about all things neurosciencey, including the connection between Mike the headless chicken and orgasms and a fantastic post on how to be a neuroscientist and spot neuro nonsense.
And , who I’ve linked to a couple of times before. Rediscovered it through a post about statistical thresholds, sloppy reporting and biased publishing.
And some other stuff:
Some interesting thoughts by DrugMonkey on Diversifying Your Laboratory by spreading your research interests. It’s targeted more at young PIs, so I don’t know whether I should be thinking in that way yet, but I feel like quite the academic butterfly at the moment; every new postdoc position I find sounds exciting and interesting.
It’s a PhD, not a Nobel Prize! (pdf) Gerry Mullins and Margaret Kiley present research on how experienced examiners assess research theses in Studies in Higher Education. It’s like they read my mind – one of my biggest worries is not knowing what’s expected of me as a postgraduate.
An article from Linux Format sent to me by my brother detailing the stages of burnout (pdf). When I posted it on Facebook, someone replied with “but it just describes doing a PhD”. It’s probably not OK that it’s assumed that PhD=burnout. Read it, recognise it in yourself, do something about it, for your PhD, for the people around you, but most of all for yourself.
A study in PLoS ONE reports that the Journal of the American Medical Association published fewer industry funded trials after introducing a requirement for independent statistical analysis, a decrease that was not seen in control journals. An unsurprising but disappointing finding.
I’m dying to do something that calls for a nomograph (Wikipedia article on nomographs).
And more graphs – ggplot2, the graphing package for the analysis language R (Wiki article on R), has a plot builder with a GUI!
And on a personal note, I’m relaxing between thesis writing sessions by trying to teach myself the tin whistle. So far it’s mostly Fairytale of New York and Whiskey in the Jar. Any suggestions?
A links digest! It’s been a while, so some of these might not be as hot off the press as they could be…
Honesty with statistics – An article by Elizabeth Wager and colleagues in PLoS ONE shows that the Journal of the American Medical Association published fewer industry funded studies after they introduced a requirement for independent statistical analysis, whilst more were published in The Lancet and New England Journal of Medicine.
Boys and girls equally good at maths; ‘gap’ is a self-fulfilling prophecy says long term study…
…And also at science blogging! After a conversation late last year about the presence of female bloggers, ResearchBlogging.org breaks down its blogs by author gender.
Lots of interesting links from Michelle Greene at NeuRealism, including the roulette of paper rejection, and an article on postgraduate researchers and procrastination.
Bradley Voytek at Oscillatory Thoughts has a post on why people writing about connectomics research should be careful not to exaggerate what it hopes to deliver.
Some interesting discussion over at Skepchick with the Afternoon Inquisition: What area of social science most interests you? Will social science(s) ever become methodologically similar to the natural sciences (i.e. make testable predictions, unveil natural laws, etc.)? I always liked the idea that the error bars tend to be larger on social psych research mostly because we haven’t pinned down all the contributing variables. Whether we ever can or will is another matter.
Hi, my name is Craig, and I have an admission. I’ve come to realise that Microsoft Excel isn’t that bad.
I know, I know. For years I’ve treated it as an oversized calculator, good only for storing tables of data, basic mathematics and knocking up a quick graph. If I wanted to do anything more complicated, I’d use dedicated statistical packages, like GraphPad Prism, or recently, R. However, I’ve recently started using a whole range of tools that I didn’t know Excel had, like array maths, and the LOOKUP, IF, COUNTIF, INDEX and MATCH functions. The fact that Excel and similar spreadsheet programs are so widely used should have probably tipped me off to the fact I was only aware of a tiny fraction of Excel’s capabilities, and used a smaller fraction of those. Instead, its many facets remained unknown unknowns – like the built-in functions in the Spike2 language that cut dozens of lines of clumsy code out of my scripts – automatic and easy ways of doing things that were waiting to be discovered by some idle clicking through the manual, or serendipitous choice of search terms.
So I’ve warmed to Excel now, and I’m going to use it as more than just glorified CSV storage. What’s more, now that I’ve discovered Sparklines for Excel, which puts cell sized graphs of data into the spreadsheet, I’m actually pretty happy with it. But perhaps more importantly, I’ve had another reminder of the benefits of idle curiosity.
I’ve stalled a little bit in my thesis writing – I’ve only got reanalysis and in-depth rewriting to be done on this chapter, which I’m not keen to do at this time of night. So I thought I’d bash out a quick post before the new year.
When dealing with sets of data from familiar experiments, it might be tempting to throw the numbers into your favourite statistics software package, and report the coefficient and p value. But researcher beware! Strange things may be hiding in your data… Anscombe’s quartet is a fine example of this. The quartet is four sets of data that have the same sample statistics (mean, variance, correlation coefficient and regression equation), but when graphed, they are clearly very different.
Anscombe's quartet plotted, from Wikipedia
The quartet is only an illustrative example of what is possible; the Wikipedia article has links to other similar data.
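If you want to check this for yourself, the quartet ships with base R as the `anscombe` data frame (columns `x1` to `x4` and `y1` to `y4`). Here is a minimal sketch that puts the summary statistics for the four sets side by side; the variable names are just my own choices.

```r
# 'anscombe' is part of the datasets package that loads with base R
data(anscombe)

# For each of the four sets, collect the sample statistics and the
# coefficients of a simple linear regression of y on x
quartet_stats <- t(sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)
  c(mean_x = mean(x), var_x = var(x),
    mean_y = mean(y), var_y = var(y),
    r = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope = unname(coef(fit)[2]))
}))
rownames(quartet_stats) <- paste("set", 1:4)

print(round(quartet_stats, 2))
```

The rows should come out near enough identical, which is the whole point: nothing in that table tells you the four sets look completely different when plotted.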
But graphing your data doesn’t just guard against mistakes, it can also allow you to see patterns in your data that you hadn’t thought to look for. If you use R, there are plenty of snippets of code that make a summary plot of data, with frequency distributions and Q-Q plots. So give it a go. Work with your data from the bottom up – you never know what you might find.
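As a rough idea of what such a snippet might look like, here is a minimal sketch using only base R graphics. The helper name `summary_plot` and the simulated data are made up for the example; swap in your own vector of measurements.

```r
# Hypothetical helper: a two-panel summary of a single numeric variable
summary_plot <- function(x, label = "value") {
  x <- x[!is.na(x)]                 # drop missing values before plotting
  old_par <- par(mfrow = c(1, 2))   # two panels side by side
  on.exit(par(old_par))             # restore the plotting layout afterwards
  hist(x, main = paste("Distribution of", label), xlab = label)
  qqnorm(x, main = paste("Normal Q-Q plot of", label))
  qqline(x)
  invisible(summary(x))             # return the numeric summary quietly
}

# Simulated, deliberately skewed data standing in for real measurements
set.seed(1)
fake_rates <- rexp(200, rate = 0.2)
summary_plot(fake_rates, label = "firing rate (Hz)")
```

With skewed data like these, the histogram and the curve away from the Q-Q line should both make the non-normality obvious before any test is run.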
P.S. Happy New Year everyone – There’s a surprise coming for Neuromancy next year…
I thought I’d already done a proper introduction of my move into using the R language to analyse the data for my thesis, but I can’t find anything now… I’d previously been using GraphPad Prism, which was easy to use and everything was customisable at a click, but this meant it was a bit slow and bloated. MS Excel hasn’t even entered into the equation, given the effort needed to display standard error bars and the fact that some of its inference test calculations aren’t even right. Instead, I decided to try to teach myself R.
So here’s the start of my journey into R analysis, a “jouRney” if you will. R is an open source implementation of the S language, which was developed at Bell Labs and later sold commercially as S-PLUS. Straight out of the box it can do some reasonably advanced data extraction, analysis and visualisation. Its real strength, however, comes from its openness. Being open source means that thousands of stat geeks around the world have developed add-on “packages” that extend R’s abilities into all sorts of areas of research.
So far, I’m just beginning to learn how to read data in, manipulate it and visualise it, but here’s the result of my first attempt to produce a heat map/contour plot of the firing rate of a dopamine neuron in response to a stimulus, across trials:
#reads in raw data from text file produced by heatplot.s2s
contourplot <- read.table("c:\\documents and settings\\craig\\my documents\\my dropbox\\R stuff\\rawoutput.txt", header=TRUE, sep="\t")
#trims off empty columns
contourplot <- contourplot[,1:40]
#sets table to contain only the data columns
contourplot <- contourplot[, seq(2, 40, by = 2)]
#transposes the data
contourplot <- t(contourplot)
#makes table into data matrix
contourplot_matrix <- as.matrix(contourplot)
#Plots heatmap of matrix "contourplot_matrix"
filled.contour(x=seq(1, 300, length=20), y=seq(-0.5, 1.5, length=20), z=contourplot_matrix[1:20,1:20], nlevels=20, axes=TRUE, color.palette=topo.colors)
Which produces this:
A comparison of contour plot in R (right) and the raw data from Spike2 - Contour plot rotated 90 degrees anticlockwise
Which is essentially a grid of the mean firing rate in 100 ms by 15 trial bins. Hopefully, I should be able to use these data in a principal component analysis, and see if there are separate components which change over time at different rates.
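I haven’t run the PCA on the real recordings yet, but the general shape of that analysis can be sketched with `prcomp` from base R. Everything below is simulated: the 20 by 20 grid, the two made-up response components and all of the variable names are assumptions for illustration, not my data.

```r
# Simulated stand-in for the grid: rows = trial bins, columns = time bins
set.seed(42)
n_trial_bins <- 20
n_time_bins <- 20
time_ms <- seq(-500, 1500, length.out = n_time_bins)
trial_index <- seq_len(n_trial_bins)

# Two invented response components: an early burst that fades across trials
# and a later response that grows across trials
early <- dnorm(time_ms, mean = 100, sd = 80)
late <- dnorm(time_ms, mean = 600, sd = 150)
rates <- outer(exp(-trial_index / 5), early / max(early)) * 20 +
  outer(trial_index / n_trial_bins, late / max(late)) * 10 +
  matrix(rnorm(n_trial_bins * n_time_bins, sd = 0.5), n_trial_bins, n_time_bins)

# PCA treating each trial bin as an observation and each time bin as a variable
pca <- prcomp(rates, center = TRUE, scale. = FALSE)

# Proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained[1:4], 3))

# pca$x holds each trial bin's score on each component, so plotting the first
# two columns shows how the components change over the session
matplot(trial_index, pca$x[, 1:2], type = "l", lty = 1,
        xlab = "trial bin", ylab = "component score")
```

If the real data contain components that change at different rates, they should show up as score curves with different time courses across trial bins, though whether PCA separates them cleanly is something I’d have to check.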
Eventually, I’m aiming to do all of my statistics in R, so we’ll see how it goes.
OK, so I’m probably not going to keep up with my promise. It’s a busy old time at the moment, lots going on. Trying to fit real life stuff around finishing off my thesis (which requires more finishing than I would have liked) and trying to find a job! It’s probably a little unreasonable of me to expect a department that has no knowledge of me to take me on as a research assistant/associate when I haven’t submitted my thesis, and even more of a stretch when I’m applying for places which would prefer a slightly different skill set (making cultures of neurons then doing in vitro electrophys), but I’m enthusiastic, more than willing to learn, and above all optimistic. I’ve applied for a few lab technician jobs too, which, if I get them, would provide some valuable experience in a slightly broader range of techniques. Plus, they pay the bills.
For some reason, I seem to be on my way to being a statistics geek. Maybe I’m just not happy with being a regular science geek… I’m trying to learn the R language to do some of my thesis analysis, and I’ve applied for some data entry/analyst jobs. I’m also on the Civil Service Fast Stream scheme for Assistant Statistician! Erk, it’s all a little bit scary. I’m down in London on the 11th of November for an assessment day, don’t really know what to expect.
So that’s it. Now that I’ve said I won’t be around, I’ll probably end up being around more than when I said I was going to try and make the effort to blog more…
Bye for now!