Understanding p-values

A couple of days ago I was trying to nail down p-values, significance levels and null hypotheses. I know that if the p-value is lower than 0.05, the result is significant and the null hypothesis is rejected – but that’s just memorization. I wanted to understand it, know when it should be applied, and be able to explain it to others. I turned to my trusty co-worker, R., to help in this endeavor. (If you prefer, you can skip the dorky “Brianne comes to understand p-values” conversation and go right to the “why this shit is important” section below.)

Me: R – do you have a good understanding of the concepts underlying p-values, significance levels and null hypotheses?

R: Not really a great understanding. I use stats software to calculate p-values. Anything less than 0.05 is significant.

Me: Yeah, I know that, but can you help me understand something? C’mere and look at my screen.

[R, who is used to me pulling us off-topic, sighs good-naturedly and comes over] 

Me: So look at this bell curve on Wikipedia:

Me: So, a significance level is just some arbitrary number that we usually set at 0.05 or 0.01…

R: I don’t think it’s arbitrary.

Me: Well, it’s not set by …err…  &*^% … I don’t know. Let’s tackle that one another day. So, let’s say we set it at 0.05 (me mumbling under my breath “arbitrarily”). That means 95% of the time (everything not green) we’re not going to observe the null hypothesis?

R: I hate double negatives.

Me: So?

R: No. It’s saying that there is a 5% probability of making our observation purely by chance. The smaller the number the less likely it is that our observation was made by chance. The greater the number the more likely it was made by chance. A small number rejects the null hypothesis.

Me: Okay. It’s an estimate of how likely we are to observe our result by chance. So…let’s say that we’re trying to determine if a chemical is stable over time. We test it several times over a month, then look at the difference over time in how the chemical performs. We do a linear regression and get a slope of 0.1.* The p-value on the data is calculated to be 0.05. That means there’s a 5% probability of obtaining the observed result (0.1) or greater by chance if no real effect exists (that is, if our chemical is stable).

R: Yes.

A long pause.

Me: Sweet. I think I get it. But, that means that p-value is sort of the opposite of what we observe…

R: Yeah, it’s counter-intuitive.

Me: It’s math. It’s all counter-intuitive to me.

Biodork: helping you lose confidence in the scientists behind the development of the technologies you use every day!

I’m kidding. Statistics is not what I do for my company. But understanding statistical analyses is important in my field and it feels good to finally have a grasp of the concept.

I was a little hesitant to write this because it feels like the p-value is something that should be understood by A Real Scientist™. You see p-values all over the literature, so of course everyone understands them, right? But it’s not an easy concept. It is counter-intuitive. And I thought I’d throw it out there that I for one was struggling with it, and it’s okay if you are too. Mathphobes unite!

There’s another good reason to understand p-values and statistical analyses in general – it’s a huge part of not getting fooled by shoddy statistics put out by proponents of bullshit. I’m looking at you, alternative medicine. Some readers of this blog are probably also fans of Science-Based Medicine. SBM has many articles that discuss the statistical gymnastics that alt med proponents perform to make it seem as if their [insert bullshit product here] has an effect on [insert condition here].

I’m reading a book right now called Intuitive Biostatistics by Dr. Harvey Motulsky.  I’m finding it hard to put down, and I don’t often say that about math books. Here’s how Dr. Motulsky explains his book:

Unlike statistics texts that emphasize mathematics, Intuitive Biostatistics focuses on proper scientific interpretation of statistical tests. The book is perfect for researchers, clinicians and students who need to understand statistical results published in biological and medical journals. Intuitive Biostatistics covers all the topics typically found in introductory texts, and adds several topics of particular interest to biologists – including multiple comparison tests, nonlinear regression and survival analysis.

The first chapter is all about how we trick ourselves into seeing patterns where none exist, and the importance of correctly analyzing data so that it can speak for itself without our dumb brains getting in the way. Motulsky gives illuminating examples throughout the book and leaves the math formulae to the statistics textbooks. You can find it free online or on Amazon if you want a paper copy. I recommend this book for every skeptic who has to deal with data sets or statistical defenses of woo.

And finally, if you have any rules or tricks to help explain p-value in a concise manner, I’d love to see them in the comments below. Also, I am a self-admitted mathphobe and doubter of my own mathematical skills, so if you catch me in an error, please do call it out. I’ll be happy to learn and grateful for the chance to clear up any misinformation that I might be spreading.

*For my mathphobe friends – a simple explanation for a linear regression is that it’s a line drawn through the points that we’ve just plotted, and a slope refers to…well…the slope of the line we’ve drawn. A slope of 0.1 in the example means there is about a 10% difference between how the chemical performed on the last day as compared to how it performed on the first day we tested it.

Le hand-drawn by moi linear regression.


278 thoughts on “Understanding p-values”

  1. 1

    A *tiny* clarification:

    “It’s an estimate of how likely we are to observe our result by chance.” Assuming that the null hypothesis is true.

    That’s why we reject the null hypothesis if we get a very small p-value. That does not mean the null hypothesis is false – just that this particular experimental outcome has a low likelihood of occurring if it is true.

    See:

    XKCD

    for another illustration…

  2. 2

    I have my stats students actually flip coins a few hundred times to get a *feel* for probability first. And we go into the relative costs of type 1 and type 2 error, and *why* a cutoff criterion might differ from the “standard” .05 or .01.

    Love this stuff…
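The coin-flipping exercise in the comment above can also be tried on a computer. What follows is only a minimal sketch, not anything the commenter posted: the function names, the seed and the 200-flip count are illustrative assumptions. It simulates fair-coin flips and computes an exact two-sided p-value for a head count under the null hypothesis that the coin is fair.

```python
import random
from math import comb

def simulate_heads(n_flips, rng):
    """Flip a fair coin n_flips times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_flips))

def two_sided_binomial_p(heads, n_flips):
    """Exact probability, assuming a fair coin, of a head count at least
    as far from n_flips / 2 as the one observed."""
    observed_gap = abs(heads - n_flips / 2)
    extreme = sum(comb(n_flips, k) for k in range(n_flips + 1)
                  if abs(k - n_flips / 2) >= observed_gap)
    return extreme / 2 ** n_flips

rng = random.Random(0)   # fixed seed so the run is repeatable (an assumption)
n_flips = 200            # hypothetical class exercise size
heads = simulate_heads(n_flips, rng)
print(heads, two_sided_binomial_p(heads, n_flips))

# A count further from 100 gives a smaller p-value under the fair-coin null.
print(two_sided_binomial_p(110, n_flips))
print(two_sided_binomial_p(115, n_flips))
```

Running the simulation a few times with different seeds gives a feel for how often a perfectly fair coin lands below whatever cutoff is chosen.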

  3. 4

    Yeah, I think you basically get it. A few additional comments.

    First, as you say in part of your post and felicis notes in a comment, the probability dealie assumes the Null hypothesis, for instance, that you really are tossing coins (or whatever). In fact, your expectation (the hypothesis) in a strongly determined system may be way different.

    The 0.05 value is pretty much arbitrary. There may or may not be valid reasons to put the value there, but I would count most of that as arm waving. In the 70s and 80s, in Archaeology, we used 0.10 because otherwise nothing would ever be significant. Large scale studies on the efficacy of dangerous substances that used a 0.05 level might not be convincing to a panel of evaluators prepared to give the substance to babies. In some fields under certain conditions the arbitrary number is lower, like 0.01.

    Now, two somewhat more subtle and often ignored aspects of this. First, we assume in science that we are replicating and reproducing and re-trying results. With a 0.05 level of acceptance, this means that we will frequently make a mistake. In your workplace, consider the number of times someone used a p value in the last year. One in 20 of those (roughly) would be one kind of mistake. That’s a lot.

    But really, it’s not a lot because it is not the case that every single experiment was the only experiment done in relation to a particular system. It is absolutely possible to get a “significant” result when you “know” you shouldn’t, or the opposite; a result that doesn’t fit the overall pattern of results. No matter what, you are always doing a kind of internal meta-study of everything.

    The second thing has to do with that graph and its exact shape. In classic statistics, you calculate F statistics or regressions, or moments (mean, stand. dev. etc) and then make underlying assumptions about the nature of the numbers, which ultimately leads to a p value. In the really old days, people actually looked up some of these numbers on tables using a test statistic, degrees of freedom, etc. But those distributions that are used to ultimately get to a p-value are well understood theoretical distributions (based on a combination of empirical understanding and inference). All phenomena are divided into a handful of these theoretical distributions (even with distribution free stats) and we just assume that this all makes sense. But the actual distributions are anywhere from a little bit to a lot different than those, thus modern techniques like bootstrapping.

    So imagine taking all the data you ever got from a certain lab test. Then you randomly draw 100 cases to make a subsample, and calculate the mean. Repeat 100,000 times. Make a distribution. Now, run your new experiment and find that result on that distribution.

    Now THAT’s your p value. If 1000 of the previously calculated values (out of 100,000) are to the right of your new number, then you’ve got a p value of 1000/100,000. And, if you look at the exact shape of that distribution you made (with the 100,000 samples) you’ll see, perhaps, that it is not exactly like the one you use above in the diagram. It will have a shape that more accurately reflects reality. Bounded on one side, not the other. Discontinuous at high values because you use larger beakers with different graduations on them for larger amounts. Some values are missing because one of your instruments had a spot of dirt on it for three years.
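The resampling recipe described in the comment above might look roughly like this in Python. It is a sketch under stated assumptions: the historical data are made up (a skewed distribution bounded at zero), the subsamples are drawn with replacement, the number of repeats is cut from 100,000 to 10,000 to keep the run short, and the function names are hypothetical.

```python
import random
import statistics

def resampled_means(history, sample_size, n_resamples, rng):
    """Means of n_resamples random subsamples, each drawn with replacement."""
    return [statistics.fmean(rng.choices(history, k=sample_size))
            for _ in range(n_resamples)]

def empirical_p_value(reference_means, new_mean):
    """Fraction of reference means at or to the right of new_mean."""
    at_or_above = sum(m >= new_mean for m in reference_means)
    return at_or_above / len(reference_means)

rng = random.Random(42)
# Hypothetical historical lab results: skewed and bounded below at zero.
history = [rng.expovariate(1 / 5.0) for _ in range(5000)]

# Build the reference distribution from subsamples of 100 cases each.
reference = resampled_means(history, sample_size=100, n_resamples=10_000, rng=rng)

new_mean = 6.0  # hypothetical mean of 100 new measurements
print(empirical_p_value(reference, new_mean))
```

The p-value here is just a count divided by the number of resamples, which matches the 1000/100,000 arithmetic in the comment. No theoretical bell curve is assumed anywhere.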

  4. 5

    I recommend Gotelli and Ellison’s A Primer of Ecological Statistics and Zar’s Biostatistical Analysis as well. I also just purchased the 2nd edition of Intuitive Biostatistics and it looks good, but in the early printings of the 2nd edition, the formula for the standard deviation is missing the radical sign, and is therefore actually the formula for the variance.

  5. 7

    Re:

    I wanted to understand it, know when it should be applied, and be able to explain it to others.

    Or as Einstein supposedly said:

    You do not really understand something unless you can explain it to your grandmother.

  6. 8

    A comment on the linear regression example.

    The slope in your drawing is declining, so the slope of the line should be -0.1, and not 0.1. More importantly, the interpretation of the slope is also wrong. A slope of -0.1 does not mean the activity of the chemical will be 10% lower for each day (the unit on the x-axis), but that it will be 0.1 lower than the day before, measured in the units on the y-axis. Say the activity is 1 on the first day; then it will be 0.9 on the second day, 0.8 on the third, 0.7 on the fourth, etc. 0.7 is surely not 10% less than 0.8. To get a “10% decline each day” interpretation, you would have to use some sort of log-transformation.

    1. 8.1

      Correction on the last bit:

      If your data showed a 10% declining trend, you would have to log transform your data to fit a straight line. If your data is linear already, doing a log transform would ruin the whole thing. I have to admit I’m not quite sure how I would proceed if I wanted a “10% decline” interpretation of the slope, but I would have made the interpretation of the model more complicated than necessary.

    2. 8.2

      Yes – it would be 0.1. Thank you. The intent was to create a line that would show a 10% trend over time (over all points). I’m not sure if an exponential curve would be the appropriate tool to use here, as confusopoly suggests. I’ll find out and report back!
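The distinction drawn in this sub-thread, between a slope of -0.1 units per day and a decline of 10% per day, can be checked numerically. The sketch below uses made-up, noise-free data and a small hand-written least-squares helper (the name `least_squares_slope` is an assumption, not a library function).

```python
import math

def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

days = list(range(10))

# Linear decline: activity drops by 0.1 units each day (1.0, 0.9, 0.8, ...).
linear = [1.0 - 0.1 * d for d in days]

# Proportional decline: each day is 10% lower than the day before.
proportional = [0.9 ** d for d in days]

slope_linear = least_squares_slope(days, linear)
print(slope_linear)  # close to -0.1, in y-axis units per day

# After a log transform the proportional data fall on a straight line,
# and exp(slope) recovers the day-to-day ratio.
slope_log = least_squares_slope(days, [math.log(y) for y in proportional])
print(math.exp(slope_log))  # close to 0.9, i.e. a 10% decline per day
```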

  7. 9

    Adding to Greg’s comment: In the area of transcriptomics studies (quantifying as many as possible transcripts in a tissue/cell sample) this is an important consideration because one is measuring thousands of things for each sample. For every 2000 transcripts I am comparing I should expect about 100 to vary by 2 standard deviations or more just by chance. So if I am comparing 2 samples I need to see more differences than that to be impressed that indeed the two are expressing a different set of transcripts. That is counter-intuitive for people used to old-fashioned molecular biology where one only compared a handful of genes/transcripts at a time.
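The "about 100 out of 2000" figure in the comment above is what a 5% two-sided cutoff gives, which corresponds to roughly 1.96 standard deviations. A strict 2.0 cutoff gives a slightly smaller expected count. A rough sketch, assuming independent, normally distributed measurements with no real differences (function names are illustrative):

```python
import random
from statistics import NormalDist

def expected_false_positives(n_tests, threshold_sd):
    """Expected number of null results beyond +/- threshold_sd by chance."""
    tail = 2 * (1 - NormalDist().cdf(threshold_sd))
    return n_tests * tail

def simulated_false_positives(n_tests, threshold_sd, rng):
    """Count chance 'hits' among n_tests draws from a standard normal."""
    return sum(abs(rng.gauss(0, 1)) >= threshold_sd for _ in range(n_tests))

rng = random.Random(1)
print(expected_false_positives(2000, 1.96))
print(expected_false_positives(2000, 2.0))
print(simulated_false_positives(2000, 1.96, rng))
```

Any real difference between two samples has to stand out against that background of chance hits.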

  8. 10

    The problem with p-values is that they do not indicate the magnitude of the effect you are testing, which is the thing you really want to know. It is essential to understand that a highly significant p-value does not mean the effect you are testing is big or strong. It just means your sample size was large enough to detect some nonrandom signal in your data. In biology, where there is always some natural variation between groups, p-values generally don’t give you useful information.

    Consider an example: suppose you are testing the impact of herbicide drift on the plant diversity of forests that are next to crop fields. You might have data sets on the diversities of forests next to sprayed fields, and forests next to unsprayed fields. Then you might test the null hypothesis that the diversities are the same in both groups of forests. You test your data and find a p-value of 0.0001. You can confidently reject the null hypothesis. Should you get excited about your success? No, because WE KNOW IN ADVANCE that any such exact null hypothesis is false, without taking any data whatsoever. It is virtually impossible that the two groups of real forests would have exactly the same diversities, to five decimal places, even if herbicide drift had no effect whatsoever. This means you can always obtain whatever p-value you want–all you need to do is make sure your sample size is large enough. The p-value framework you are using turns science into an empty game which can always be won if the investigator has enough resources to take big enough samples. (Using a directional null hypothesis would improve things, and multiple independent tests of a directional null hypothesis can give better info, but they still won’t tell you the magnitude of the effect.)

    Null-hypothesis-testing (with its associated p-values) is only the right model if the mere presence of an effect is noteworthy. For example, in physics, the mere presence of objects that travel faster than light would be newsworthy (recall the recent neutrino claims). It didn’t really matter how much faster than light they traveled; a fundamental law was at stake. In this situation, p-values based on a null hypothesis of “no effect” are appropriate. But I think this situation almost never occurs in biology. Almost always, then, the null-hypothesis-testing model, with its p-values, is not appropriate. What you really want to know is not “Is there an effect?” but rather “How big is the effect?” P-values don’t tell you that (they depend strongly on sample size as well as the magnitude of the effect).

    The correct approach is usually the parameter-estimation model, with confidence intervals expressing the statistical uncertainties in the estimates. This means more work– you need to find a measure whose actual magnitudes are interpretable. But it answers the real question.

    All statisticians know this, and any good stat book discusses it. Yet biologists (and many others) continue to use the inappropriate model with its p-values, and most editors and reviewers actually order biologists to keep making this mistake.
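The point about sample size in the comment above can be illustrated with a small calculation. This is a sketch under simplifying assumptions that are not in the comment: two groups of equal size, a known common standard deviation, a z-test, and an observed difference that stays fixed at a tiny 0.05 standard deviations. The function name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(diff, sd, n_per_group, confidence=0.95):
    """Two-sided p-value and confidence interval for a difference in means,
    assuming both groups share a known standard deviation sd."""
    se = sd * sqrt(2 / n_per_group)          # standard error of the difference
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    half_width = NormalDist().inv_cdf(0.5 + confidence / 2) * se
    return p, (diff - half_width, diff + half_width)

# Same tiny observed difference, increasingly large samples.
for n in (10, 100, 1_000, 10_000, 100_000):
    p, (low, high) = two_sample_z(diff=0.05, sd=1.0, n_per_group=n)
    print(f"n={n:>6}  p={p:.4g}  95% CI=({low:.3f}, {high:.3f})")
```

The p-value falls as n grows even though the effect never changes. The confidence interval narrows around the same small difference, so it is the interval that shows how big the effect is.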
