Class 2: One Health, BioStatistics I: The Power and Crisis

1. Introduction and Recap of Last Class

1.1. Recap of Last Class

  • Introduction to Biostatistics
  • First Part Content
  • Second Part Content
  • Laboratory

1.2. Lecture Outline

  • Motivations
  • The Power
  • Probability, Expectations & Variance
  • Intervals, Testing & p-Values
  • StatsBiol
  • Laboratory

2. Motivations

2.1. The Crisis

2.1.1. The Crisis

reproducibility-crisis.jpg

2.1.2. Objectives & Agenda

  • Critical Thinking & Objectivity
  • Bias in Knowledge & Beliefs
  • Customs & Best Practice
  • Societal Pressures

2.1.3. Where we are

  • Ask yourself

    reproducibility-graphic-online1.png

    • ~90% recognize a reproducibility crisis
  • Trust your field?

    reproducibility-graphic-online2.jpg

    • Does quantification make a difference?
    • Physicists & chemists are more confident
  • Have you?

    reproducibility-graphic-online3.jpg

    • Failed to reproduce results:
      • someone else's experiments: 60-80%
      • their own experiments: 40-60%
    • Publishing is harder for failed replications:
      • 13% have published a failed reproduction
      • vs 24% a successful reproduction
  • Why?

    reproducibility-graphic-online4.jpg

    • ~70% cite fraud as a contributing factor
    • >80% cite poor experimental design
    • ~90% cite selective reporting & pressure to publish
  • What to Change?

    reproducibility-graphic-online5.jpg

    • ~90%: better statistical understanding
    • More robust experimental design
    • Better mentoring
    • Better practices
  • Did you?

    reproducibility-graphic-online6.jpg

    • 34% have taken no action
    • 33% took action within the last 5 yrs
    • 7% more than 5 yrs ago
    • 26% from the beginning
2.1.4. Replication Studies

  • The replication crisis is a serious issue: many scientific studies are difficult to reproduce or replicate.
  • In cancer research, only about 10-25% of published studies could be validated or reproduced.
  • In psychology, only about 36% of studies were reproduced.
  • Other affected fields:
    • Medicine
    • Genetics
    • Economics
    • Neuroscience

2.1.5. Reasons

  • Inappropriate scientific practices:
    • HARKing (Hypothesizing After the Results are Known)
    • p-hacking
    • Selective reporting of positive results
    • Poor research design
    • Lack of raw data

2.1.6. Pharma

  • Bayer

    BayerRep.png

    • Oncology, women's health, cardiovascular.
    • 65% of published findings were not reproducible.
  • Amgen

    AmgenPharma.png

    • Oncology and hematology.
    • Of 53 landmark studies, only 6 (11%) were confirmed.
2.1.7. Give me the Power

  • Power Failure
    • The median statistical power in neuroscience is 18-21%.
    • Neuroimaging studies: ~8%.
    • Animal model studies: 18-31%.

    NeuroPowerDist.png

  • Sample Size

    NeuroPowerTable.png

  • Power Effects

    NeuroPower.png

    • Low power
      • The chance of discovering effects that are genuinely true is low.
      • Low-powered studies produce more false negatives than high-powered studies.
    • Low PPV, the positive predictive value (see the sketch after this list)
      • \(PPV = \frac{(1 - \beta)\, R}{(1 - \beta)\, R + \alpha}\), where \(R\) is the pre-study odds that the probed effect is real.
    • Effect inflation
      • Effect inflation occurs whenever claims of discovery are based on thresholds of statistical significance
        • for example, p < 0.05, or other selection filters.
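
    A minimal Python sketch of the PPV formula above (illustrative, not lecture code); PPV collapses when power or the pre-study odds \(R\) are low:

      # PPV as a function of power (1 - beta), pre-study odds R, and alpha.
      def ppv(power, R, alpha=0.05):
          """Positive predictive value: P(true effect | significant result)."""
          return (power * R) / (power * R + alpha)

      print(ppv(0.80, R=1.0))   # well-powered, plausible hypothesis: ~0.94
      print(ppv(0.20, R=0.1))   # low-powered long shot: ~0.29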
  • Low power and other biases
    • Low-powered studies are more likely to provide a wide range of estimates of the magnitude of an effect.
    • Publication bias, selective data analysis and selective reporting are more likely to affect low-powered studies.
    • Small studies may be of lower quality in other aspects of their design as well.
  • More Power

    The probability that a research finding is indeed true depends on:

    • the prior probability of it being true (before doing the study),
    • the statistical power of the study,
    • and the level of statistical significance.
  • PPV
    • After a research finding has been claimed based on achieving formal statistical significance, the post-study probability that it is true is the positive predictive value, PPV.
  • Graphical Assessment

    PowerGraph.png

  • Power & Bias

                          True relationship
    Research finding      Yes                       No                       Total
    Yes                   \(c(1-\beta)R/(R+1)\)     \(c\alpha/(R+1)\)        \(c(R+\alpha-\beta R)/(R+1)\)
    No                    \(c\beta R/(R+1)\)        \(c(1-\alpha)/(R+1)\)    \(c(1-\alpha+\beta R)/(R+1)\)
    Total                 \(cR/(R+1)\)              \(c/(R+1)\)              \(c\)

    • (\(c\) = the number of relationships being probed in the field)
  • Bias

    A combination of various factors that tend to produce research findings when they should not be produced, including:

    • Design
    • Data
    • Analysis
    • Presentation factors
  • Corollaries

    PPV.png

    1. The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.
    2. The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.
    3. The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.
    4. The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
    5. The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
    6. The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.
  • Some Estimates

    \(1-\beta\)   R        Bias, \(u\)   Example                                                                        PPV
    0.80          1:1      0.10          Adequately powered RCT with little bias and 1:1 pre-study odds                0.85
    0.95          2:1      0.30          Confirmatory meta-analysis of good-quality RCTs                               0.85
    0.80          1:10     0.30          Adequately powered exploratory epidemiological study                          0.20
    0.20          1:1000   0.20          Discovery-oriented exploratory research with massive testing, limited bias    0.0015

    • RCT = randomized controlled trial. (A numeric check of these values follows.)
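
    These estimates follow the bias-adjusted PPV formula from Ioannidis (2005), \(PPV = \frac{(1-\beta)R + u\beta R}{R + \alpha - \beta R + u - u\alpha + u\beta R}\). A minimal check in Python (illustrative only):

      # Bias-adjusted PPV; u is the proportion of analyses that become
      # "findings" only because of bias (Ioannidis 2005).
      def ppv_bias(power, R, u, alpha=0.05):
          beta = 1 - power
          num = power * R + u * beta * R
          den = R + alpha - beta * R + u - u * alpha + u * beta * R
          return num / den

      for power, R, u in [(0.80, 1.0, 0.10), (0.95, 2.0, 0.30),
                          (0.80, 0.1, 0.30), (0.20, 0.001, 0.20)]:
          print(ppv_bias(power, R, u))  # ~0.85, 0.85, 0.20, 0.0015, matching the table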
2.1.8. Erroneous Interactions

  • Even the best families: top-ranking journals
    • Behavioural and systems neuroscience.
      • Only ~50% used correct procedures to compare two experimental effects.
      • In ~2/3 of the erroneous cases the error may have had serious consequences.

      ErrInteractions1.png

  • Even the best families: top-ranking journals
    • Cellular and molecular neuroscience.
      • Of 120 additional articles, none used the correct statistical procedure to compare effect sizes.
      • 25 used incorrect procedures to compare significance levels.
  • Comparison Errors

    ErrInteractions2.png

    • Three situations where effect-size comparisons are incorrectly made.
2.1.9. Data Sets

  • Raw data withdrawal

    RawDataRequest.png

  • Absence of raw data means absence of science
  • Open Science, Open Data
3. The Power

    • After a break?

    3.1. Objectives

    • Probability & Statistics
    • Descriptive/Exploratory
    • Inference
    • Hypothesis Testing
    • Some Recommendations for Biology

    3.2. Intro

    3.2.1. Basic Definition

    • Statistical inference is the process of drawing formal conclusions from data.
    • Statistical inference occurs when one wants to infer facts about a population from noisy data, where uncertainty must be taken into account.
    • Statistical inference requires assessing assumptions and tools, and thinking about how to draw conclusions from data.

    3.2.2. Some Inference Goals

    • Benchmarking
      • Effectiveness of a treatment
    • Quantify
      • Proportion of voting
    • Relationship
      • Slope of Hooke's law
    • Impact
      • Confinements
    • Probability
      • Raining tomorrow

    3.2.3. Some tools in Inference

    • Randomization.
      • Unobserved variables may confound inferences of interest.
    • Random sampling.
      • Data representative of a population.
    • Sampling models.
      • Creating a model for the sampling process.
      • Independent Identically Distributed (i.i.d).
    • Hypothesis testing.
      • Decision making under uncertainty.
    • Confidence intervals.
      • Quantify uncertainty in estimation.
    • Probability Models.
      • Formal connection between the data and population of interest.
    • Study Design.
      • Experiment to minimize biases and variability.
    • Nonparametric bootstrapping.
      • Using the data itself to perform inference with minimal probability-model assumptions (see the sketch after this list).
    • Permutation.
      • Randomization and exchangeability testing to perform inferences.
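
    A minimal nonparametric bootstrap sketch in Python (hypothetical data, for illustration): resample the observed values with replacement and recompute the statistic, giving a confidence interval with minimal distributional assumptions.

      import numpy as np

      rng = np.random.default_rng(42)
      data = rng.normal(loc=5.0, scale=2.0, size=30)  # stand-in for an observed sample

      # Resample with replacement, recompute the mean each time.
      boot_means = np.array([rng.choice(data, size=data.size, replace=True).mean()
                             for _ in range(10_000)])

      # Percentile 95% confidence interval for the population mean.
      lo, hi = np.percentile(boot_means, [2.5, 97.5])
      print(lo, hi)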

    3.2.4. Schools of Inference

    • Frequentist probability & inference
      • Probability is the long-run proportion of times an event occurs in independent, identically distributed repetitions.
      • Uses this interpretation of probability to control error rates.
      • Given my data, I control the long-run proportion of mistakes I make at a tolerable level.
    • Bayesian probability & inference
      • Probability is a calculus of beliefs, which follows certain rules.
      • Inference is performed via the Bayesian probability representation of beliefs.
      • Combines subjective beliefs with the objective information from the data to draw inferences.

4. Probability, Expectations & Variance

    4.1. Probability

    4.1.1. Probability Definition

    Given a random variable (an experiment, say rolling a die), a probability measure is a population quantity that summarizes its randomness. A probability measure is:

    • a number between 0 and 1;
    • such that the probability that something occurs is 1 (the die must be rolled); and
    • such that the probability of the union of any two sets of outcomes that have nothing in common (mutually exclusive) is the sum of their respective probabilities.

    4.1.2. Rules probability must follow

    The Russian mathematician Andrey Nikolaevich Kolmogorov formalized these rules.

    • The probability that nothing occurs is 0.
    • The probability that something occurs is 1.
    • The probability of something is 1 minus the probability that the opposite occurs.
    • The probability of at least one of two or more things that cannot simultaneously occur (mutually exclusive) is the sum of their respective probabilities.
    • More interestingly:
      • If an event \(A\) implies the occurrence of event \(B\), then the probability of \(A\) occurring is less than the probability that \(B\) occurs.
      • For any two events, the probability that at least one occurs is the sum of their probabilities minus the probability of their intersection.

    4.1.3. Simple Example

    • Event/condition X has an incidence of 3% in the population.
    • Whereas 10% of the population has event/condition Y.
    • Does this imply that 13% of people will have at least one of these events/conditions?
      • Answer: NO. If the events can occur simultaneously, they are not mutually exclusive.
    • Let:
    \begin{eqnarray*} A_1 & = & \{\mbox{Event X}\} \\ A_2 & = & \{\mbox{Event Y}\} \end{eqnarray*}
    • Then
    \begin{eqnarray*} P(A_1 \cup A_2 ) & = & P(A_1) + P(A_2) - P(A_1 \cap A_2) \\ & = & 0.13 - \mbox{Probability of having both} \end{eqnarray*}
    • Likely, some fraction of the population has both, so the answer is less than 13% (see the simulation below).
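
    A quick simulation check (Python, illustrative; independence of X and Y is a hypothetical assumption): with independent \(P(X) = 0.03\) and \(P(Y) = 0.10\), \(P(X \cap Y) = 0.003\), so \(P(X \cup Y) = 0.127\), not 0.13.

      import numpy as np

      rng = np.random.default_rng(0)
      n = 1_000_000
      x = rng.random(n) < 0.03   # event/condition X
      y = rng.random(n) < 0.10   # event/condition Y

      print(np.mean(x | y))                # ~0.127, empirical P(X or Y)
      print(0.03 + 0.10 - np.mean(x & y))  # same value by inclusion-exclusion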

    4.1.4. Random variables

    • A random variable is a numerical outcome of an experiment.
    • Random variables come in two varieties, discrete and continuous.
      • Discrete random variables take on only a countable number of possibilities; we assign a probability to each specific value.
      • Continuous random variables can take any value on the real line, or some subset of it; we assign probabilities to ranges of values.

    4.1.5. Quantiles

    • Famous sample quantiles:
      • If you scored in the 95th percentile on an exam, 95% of people scored worse than you and 5% scored better.
    • These have population analogs.
    • Definition
      • The \(\alpha^{th}\) quantile of a distribution with distribution function \(F\) is the point \(x_\alpha\) so that \[F(x_\alpha) = \alpha\]
      • A percentile is simply a quantile with \(\alpha\) expressed as a percent.
      • The median is the \(50^{th}\) percentile.
    • For example
      • The \(75^{th}\) percentile of a distribution is the point such that:
        • the probability that a random variable drawn from the population is less than it is 75%;
        • the probability that a random variable drawn from the population is more than it is 25%.
      • (A short sketch follows.)
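
    A minimal sketch of sample vs population quantiles (Python with numpy/scipy, illustrative only):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(1)
      sample = rng.normal(size=1_000)

      print(np.quantile(sample, 0.75))  # sample 75th percentile, ~0.67
      print(stats.norm.ppf(0.75))       # population analog for a standard normal, 0.674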
    4.1.6. Conditional Probability

    • Motivating example
      • The probability of getting a one when rolling a (standard) die is usually assumed to be one sixth.
      • Suppose you were given the extra information that the die roll was an odd number (hence 1, 3 or 5).
      • Conditional on this new information, the probability of a one is now one third.
    • Definition
      • Let \(B\) be an event so that \(P(B) > 0\).
      • Then the conditional probability of an event \(A\) given that \(B\) has occurred is \(P(A ~|~ B) = \frac{P(A \cap B)}{P(B)}\)
      • Notice that if \(A\) and \(B\) are independent, then \(P(A ~|~ B) = \frac{P(A) P(B)}{P(B)} = P(A)\)
      • \(\cap = \mbox{intersection}\)
    • Little example
      • Consider our die roll example, \(P(\mbox{one given that roll is odd}) = P^*\).
        • \(A = \{1\}\) and \(B = \{1, 3, 5\}\). Then

          \begin{eqnarray*} P^* & = & P(A ~|~ B) \\ \\ & = & \frac{P(A \cap B)}{P(B)} \\ \\ & = & \frac{P(A)}{P(B)} = \frac{1/6}{3/6} = \frac{1}{3} \end{eqnarray*}
    4.1.7. Bayes' rule

    • Bayes' rule allows us to reverse the conditioning set provided that we know some marginal probabilities:
      • \[ P(B ~|~ A) = \frac{P(A ~|~ B) P(B)}{P(A ~|~ B) P(B) + P(A ~|~ \neg B)P(\neg B)}\]
      • where \(P(\neg B)\) is the initial degree of belief in not-\(B\) (\(B\) is false), and \(P(\neg B) = 1 - P(B)\).
    • Diagnostic tests
      • Let \(+\) and \(-\) be the events that the result of a diagnostic test is positive or negative respectively.
      • Let \(D\) and \(D^c\) be the events that the subject of the test has or does not have the disease respectively.
      • The sensitivity is the probability that the test is positive given that the subject actually has the disease, \(P(+ ~|~ D)\).
      • The specificity is the probability that the test is negative given that the subject does not have the disease, \(P(- ~|~ D^c)\).
    • More definitions
      • The positive predictive value is the probability that the subject has the disease given that the test is positive, \(P(D ~|~ +)\).
      • The negative predictive value is the probability that the subject does not have the disease given that the test is negative, \(P(D^c ~|~ -)\).
      • The prevalence of the disease is the marginal probability of disease, \(P(D)\).
    4.1.8. Using Bayes' formula

    • Suppose a test has sensitivity \(P(+|D) = 0.997\), specificity \(P(-|D^c) = 0.985\), and the disease has prevalence \(P(D) = 0.001\). Then:

    \begin{eqnarray*} P(D | +) & = &\frac{P(+|D)P(D)}{P(+|D)P(D) + P(+|D^c)P(D^c)}\\ \\ & = & \frac{P(+|D)P(D)}{P(+|D)P(D) + \{1-P(-|D^c)\}\{1 - P(D)\}} \\ \\ & = & \frac{.997\times .001}{.997 \times .001 + .015 \times .999} = 0.062 \end{eqnarray*}
    • Then,
      • A positive test result only implies about a 6% probability that the subject actually has the disease.
      • The positive predictive value is 6% for this test (see the sketch below).
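
    The same computation as a small Python sketch (hypothetical helper names), which also previews the diagnostic likelihood ratio used in the next subsection:

      def positive_predictive_value(sens, spec, prev):
          """P(D | +) via Bayes' rule for a diagnostic test."""
          return (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))

      sens, spec, prev = 0.997, 0.985, 0.001
      print(positive_predictive_value(sens, spec, prev))  # ~0.062

      # Equivalently with the positive diagnostic likelihood ratio:
      dlr_plus = sens / (1 - spec)              # ~66.5
      post_odds = dlr_plus * prev / (1 - prev)  # post-test odds of disease
      print(post_odds / (1 + post_odds))        # same ~0.062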

    4.1.9. Likelihood ratios, using Bayes rule

    \[ P(D|+) = \frac{P(+|D)P(D)}{P(+|D)P(D) + P(+|D^c)P(D^c)} \] \[P(D^c|+) = \frac{P(+|D^c)P(D^c)}{P(+|D)P(D) + P(+|D^c)P(D^c)}\]

    • Therefore
      • \[\frac{P(D|+)}{P(D^c|+)} = \frac{P(+|D)}{P(+|D^c)}\times \frac{P(D)}{P(D^c)}\] i.e. \[\mbox{post-test odds of }D = DLR_+\times\mbox{pre-test odds of }D\]
        • \(DLR_+\) is the diagnostic likelihood ratio for a positive test result.
        • Similarly, \(DLR_-\) relates the decrease in the odds of the disease after a negative test result to the odds of disease prior to the test.

    4.2. Expectations

    4.2.1. Expected values

    • Expected values are useful for characterizing a distribution.
    • The mean is a characterization of its center.
    • The variance and standard deviation are characterizations of how spread out it is.
    • Our sample expected values (the sample mean and variance) will estimate the population versions.

    4.2.2. The population mean

    • The expected value or mean of a random variable is the center of its distribution.
    • For a discrete random variable \(X\) with PMF \(p(x)\), it is defined as follows: \[ E[X] = \sum_x xp(x). \] where the sum is taken over the possible values of \(x\).
    • \(E[X]\) represents the center of mass of a collection of locations and weights, \(\{x, p(x)\}\)

    4.2.3. The sample mean

    • The sample mean estimates this population mean.
    • The center of mass of the data is the empirical mean:

    \[ \bar X = \sum_{i=1}^n x_i p(x_i) \] where \(p(x_i) = 1/n\)

    4.2.4. Example

    • Find the center of mass of the bars

      galton.png

    4.2.5. What about a biased coin?

    • Suppose that a random variable, \(X\), is so that
    • \(P(X=1) = p\) and \(P(X=0) = (1 - p)\)
    • (This is a biased coin when \(p\neq 0.5\))
    • What is its expected value?
    • \[E[X] = 0 * (1 - p) + 1 * p = p\]

    4.2.6. Continuous random variables

    • For a continuous random variable, \(X\), with density, \(f\), the expected value is again exactly the center of mass of the density.

    4.2.7. Summary of Expected Values

    • Expected values are properties of distributions.
    • The average of random variables is itself a random variable and its associated distribution has an expected value.
    • The center of this distribution is the same as that of the original distribution.
    • Therefore, the expected value of the sample mean is the population mean that it is trying to estimate.
    • When the expected value of an estimator is what it is trying to estimate, we say that the estimator is unbiased.

    4.3. Variance

    4.3.1. The variance

    • The variance of a random variable is a measure of spread
    • If \(X\) is a random variable with mean \(\mu\), the variance of \(X\) is defined as
      • \(Var(X) = E[(X - \mu)^2] = E[X^2] - E[X]^2\)
    • The expected (squared) distance from the mean
    • Densities with a higher variance are more spread out than densities with a lower variance
    • The square root of the variance is called the standard deviation
    • The standard deviation has the same units as \(X\)

    4.3.2. Examples Variance

    • Example 1
      • What’s the variance from the result of a toss of a die?
        • \(E[X] = 3.5\)
        • \(E[X^2] = 1 ^ 2 \times \frac{1}{6} + 2 ^ 2 \times \frac{1}{6} + 3 ^ 2 \times \frac{1}{6} + \\ 4 ^ 2 \times \frac{1}{6} + 5 ^ 2 \times \frac{1}{6} + 6 ^ 2 \times \frac{1}{6} = 15.17\)
      • \(Var(X) = E[X^2] - E[X]^2 \approx 2.92\)

  • Example 2
    • What's the variance from the result of the toss of a coin with probability of heads (1) of \(p\)? (A simulation check follows.)
      • \(E[X] = 0 \times (1 - p) + 1 \times p = p\)
      • \(E[X^2] = E[X] = p\)

    \[Var(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)\]
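
    A quick numerical check of both examples (Python, illustrative):

      import numpy as np

      rng = np.random.default_rng(2)

      # Die: Var(X) = E[X^2] - E[X]^2 = 91/6 - 3.5^2 ~ 2.92.
      rolls = rng.integers(1, 7, size=1_000_000)
      print(rolls.var())  # ~2.92

      # Biased coin with p = 0.3: Var(X) = p(1 - p) = 0.21.
      flips = rng.random(1_000_000) < 0.3
      print(flips.var())  # ~0.21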

    4.3.3. The sample variance

    • The sample variance is
      • \(S^2 = \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n-1}\)
      • (almost, but not quite, the average squared deviation from the sample mean)
    • It is also a random variable
      • It has an associated population distribution.
      • Its expected value is the population variance.
      • Its distribution gets more concentrated around the population variance with more data.
    • Its square root is the sample standard deviation.

    4.3.4. Recall the mean

    • Recall that the average of a random sample from a population is itself a random variable.
    • We know that this distribution is centered around the population mean, \(E[\bar X] = \mu\).
    • We also know what its variance is: \(Var(\bar X) = \sigma^2 / n\).
    • This is very useful, since we don't have repeated sample means to estimate their variance; now we know how it relates to the population variance.
    • We call the standard deviation of a statistic its standard error.

    4.3.5. To summarize

    • The sample variance, \(S^2\), estimates the population variance, \(\sigma^2\).
    • The distribution of the sample variance is centered around \(\sigma^2\).
    • The variance of the sample mean is \(\sigma^2 / n\).
      • Its logical estimate is \(S^2 / n\).
      • The logical estimate of the standard error is \(S / \sqrt{n}\).
    • \(S\), the standard deviation, talks about how variable the population is.
    • \(S/\sqrt{n}\), the standard error, talks about how variable averages of random samples of size \(n\) from the population are (see the sketch below).
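
    A minimal simulation sketch (Python, illustrative) showing that the standard deviation of many sample means matches \(\sigma/\sqrt{n}\):

      import numpy as np

      rng = np.random.default_rng(3)
      sigma, n = 2.0, 25

      # 10,000 samples of size n; take each sample's mean.
      means = rng.normal(0, sigma, size=(10_000, n)).mean(axis=1)

      print(means.std())         # ~0.40, empirical standard error
      print(sigma / np.sqrt(n))  # 0.40, theoretical sigma / sqrt(n)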

    4.3.6. Summarizing what we know about variances

    • The sample variance estimates the population variance
    • The distribution of the sample variance is centered at what its estimating
    • It gets more concentrated around the population variance with larger sample sizes
    • The variance of the sample mean is the population variance divided by \(n\)
      • The square root is the standard error

5. Intervals, Testing & p-Values

    5.1. Hypothesis Testing

    5.1.1. Hypothesis testing

    • Hypothesis testing is concerned with making decisions using data.
    • A null hypothesis is specified that represents the status quo, usually labeled \(H_0\).
    • The null hypothesis is assumed true and statistical evidence is required to reject it in favor of a research or alternative hypothesis.

    5.1.2. Hypothesis testing decision

    • The alternative hypotheses are typically of the form \(<\), \(>\) or \(\neq\).
    • Note that there are four possible outcomes of our statistical decision process:

      Truth      Decide     Result
      \(H_0\)    \(H_0\)    Correctly accept null
      \(H_0\)    \(H_a\)    Type I error
      \(H_a\)    \(H_a\)    Correctly reject null
      \(H_a\)    \(H_0\)    Type II error

    5.1.3. General rules

    • The \(Z\) test for \(H_0:\mu = \mu_0\), versus
      • \(H_1: \mu < \mu_0\)
      • \(H_2: \mu \neq \mu_0\)
      • \(H_3: \mu > \mu_0\)
    • Test statistic \(TS = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}\)
    • Reject the null hypothesis when, respectively,
      • \(TS \leq Z_{\alpha} = -Z_{1 - \alpha}\)
      • \(|TS| \geq Z_{1 - \alpha / 2}\)
      • \(TS \geq Z_{1 - \alpha}\)
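
    A minimal numeric sketch of these rejection rules (Python with scipy; the sample summary is hypothetical):

      import numpy as np
      from scipy import stats

      # Test H0: mu = 30 against H3: mu > 30.
      xbar, mu0, s, n, alpha = 32.0, 30.0, 6.0, 36, 0.05

      ts = (xbar - mu0) / (s / np.sqrt(n))  # Z statistic = 2.0
      z_crit = stats.norm.ppf(1 - alpha)    # Z_{1 - alpha} = 1.645

      print(ts >= z_crit)  # True: reject H0 at the 5% level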

    5.1.4. Notes

    • We:
      • fix \(\alpha\) to be low, so if we reject \(H_0\), either our model is wrong or there is a low probability that we have made an error;
      • have not fixed the probability of a type II error, \(\beta\); so we tend to say "fail to reject \(H_0\)" rather than "accept \(H_0\)".
    • Statistical significance is not the same as scientific significance.

    5.1.5. Connections with confidence intervals

    • Consider testing \(H_0: \mu = \mu_0\) versus \(H_a: \mu \neq \mu_0\).
    • Take the set of all possible values for which you fail to reject \(H_0\), this set is a \((1-\alpha)100\%\) confidence interval for \(\mu\).
    • The same works in reverse; if a \((1-\alpha)100\%\) interval contains \(\mu_0\), then we fail to reject \(H_0\).

    5.2. p-Values

    5.2.1. P-values

    • Most common measure of statistical significance.
    • Their ubiquity, along with concern over their interpretation and use, makes them controversial among statisticians.

    5.2.2. What is a P-value?

    Idea: Suppose nothing is going on - how unusual is it to see the estimate we got? Approach:

    1. Define the hypothetical distribution of a data summary (statistic) when “nothing is going on” (null hypothesis)
    2. Calculate the summary/statistic with the data we have (test statistic)
    3. Compare what we calculated to our hypothetical distribution and see if the value is “extreme” (p-value)

    5.2.3. P-values

    • The P-value is the probability under the null hypothesis of obtaining evidence as extreme or more extreme than that obtained
    • If the P-value is small, then either \(H_0\) is true and we have observed a rare event or \(H_0\) is false
    • Suppose that you get a \(T\) statistic of \(2.5\) for 15 df testing \(H_0:\mu = \mu_0\) versus \(H_a : \mu > \mu_0\).
      • What’s the probability of getting a \(T\) statistic as large as \(2.5\)?

    • Therefore, the probability of seeing evidence as extreme or more extreme than that actually obtained under \(H_0\) is 0.0123

    5.2.4. The attained significance level

    • Recall the \(Z\) test sketch above: our test statistic was \(2\) for \(H_0 : \mu_0 = 30\) versus \(H_a:\mu > 30\).
    • Notice that we rejected the one-sided test when \(\alpha = 0.05\); would we reject if \(\alpha = 0.01\)? How about \(0.001\)?
    • The smallest value of \(\alpha\) at which you would still reject the null hypothesis is called the attained significance level.
    • This is equivalent to, but philosophically a little different from, the P-value.

    5.2.5. Notes

    • By reporting a p-value, the reader can perform the hypothesis test at whatever \(\alpha\) level they choose.
    • If the p-value is less than \(\alpha\), you reject the null hypothesis.
    • For a two-sided hypothesis test, double the smaller of the two one-sided P-values.

6. StatsBiol

    Networks and Science Complexity

    6.1. Descriptive Statistics

    • Standard deviation, \[s.d.=\sqrt{\sum(X-\bar{X})^2/(N-1)}\]
      • Meaning: the typical difference between each value and the mean value.
      • Use: describing how broadly the sample values are distributed.
    • Standard error of the mean (s.e.m.), \[s.e.m.=s.d./\sqrt{N}\]
      • Meaning: an estimate of how variable the means will be if the experiment is repeated multiple times.
      • Use: inferring where the population mean is likely to lie, or whether sets of samples are likely to come from the same population.
    • Confidence interval (95% CI), \[CI=mean\pm s.e.m. \times t_{(N-1)}\]
      • Meaning: with 95% confidence, the population mean will lie in this interval.
      • Use: to infer where the population mean lies, and to compare two populations.
    • Independent data
      • Meaning: values from separate samples of the same type that are not linked.
      • Use: testing hypotheses about the population.
    • Replicate data
      • Meaning: values from repeated measurements of the same experiment, kept as similar as possible.
      • Use: serves as an internal check on the performance of an experiment.
    • Sampling error
      • Meaning: variation caused by sampling part of a population rather than measuring the whole population.
      • Use: can reveal bias in the data or problems with the conduct of the experiment. For a binomial distribution the expected s.d. is \[\sqrt{Np(1-p)}\]; for a Poisson distribution the expected s.d. is \[\sqrt{mean}\].
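
    A minimal sketch computing these quantities for a hypothetical sample (Python with scipy for the t quantile):

      import numpy as np
      from scipy import stats

      x = np.array([4.2, 5.1, 4.8, 5.6, 4.9, 5.3])  # hypothetical measurements
      N = x.size

      sd = x.std(ddof=1)                # sample s.d., N - 1 in the denominator
      sem = sd / np.sqrt(N)             # standard error of the mean
      t = stats.t.ppf(0.975, df=N - 1)  # two-sided 95% t quantile
      print(sd, sem, (x.mean() - t * sem, x.mean() + t * sem))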

    6.2. Statistical Hypothesis Testing

    • Null hypothesis examples
      • The null hypothesis for Pearson's correlation test is that there is no relationship between the two variables.
      • The null hypothesis for Student's t test is that there is no difference between the means of two populations.

    6.3. p-value (p)

    • A p-value is the probability of observing the result given that the null hypothesis is true
      • and not the reverse, as is often the case with misinterpretations.
    • \(p \le \alpha\): reject \(H_0\) (e.g., conclude the distributions differ).
    • \(p > \alpha\): fail to reject \(H_0\) (e.g., cannot conclude the distributions differ).

    6.4. Errors

    • There are two types of error:
    • Type I error: rejecting the null hypothesis when there is in fact no significant effect - a false positive.
      • The p-value is optimistically small.
    • Type II error: not rejecting the null hypothesis when there is a significant effect - a false negative.
      • The p-value is pessimistically large.

    6.5. What Is Statistical Power?

    • Statistical power, or the power of a hypothesis test, is the probability that the test correctly rejects the null hypothesis.
      • Power = 1 - Type II Error
      • Pr(True Positive) = 1 - Pr(False Negative)

    More intuitively, the statistical power can be thought of as the probability of accepting an alternative hypothesis, when the alternative hypothesis is true.

    • Low Statistical Power:
      • Large risk of committing Type II errors.
    • High Statistical Power:
      • Small risk of committing Type II errors.

    6.6. Statistical Power

    • The statistical power of a hypothesis test is the probability of detecting an effect, if there is a true effect present to detect.

    6.7. Power Analysis

    • Effect Size.
      • The quantified magnitude of a result present in the population.
      • Effect size is calculated using a specific statistical measure, such as Pearson’s correlation coefficient for the relationship between variables.
    • Sample Size.
      • The number of observations in the sample.
    • Significance.
      • The significance level used in the statistical test, e.g. alpha. Often set to 5% or 0.05.
    • Statistical Power.
      • The probability of accepting the alternative hypothesis if it is true.
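
    These four quantities are linked: fixing any three determines the fourth. A minimal sketch (Python with statsmodels, assuming a two-sample t test; the library choice is illustrative, not the lecture's):

      from statsmodels.stats.power import TTestIndPower

      # Sample size per group to detect a medium effect (Cohen's d = 0.5)
      # with alpha = 0.05 and power = 0.80.
      n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
      print(round(n))  # ~64 per group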

7. Class Recap

    • Motivations
    • The Power
    • Probability, Expectations & Variance
    • Intervals, Testing & p-Values
    • StatsBiol
    • Laboratory

8. Laboratory