In Bayesian probability, we begin with a hypothesis, such as the likelihood of an athlete sustaining an injury, and then we observe some evidence, such as recent performance data or injury history. The goal is to determine the probability that the hypothesis is true, given the new evidence. This is expressed as P(Hypothesis | Evidence), where the symbol “ | ” denotes that we are only considering scenarios where the evidence is true. For example, if we want to assess the probability that an athlete is likely to get injured given a history of previous injuries, we would update our initial belief (prior probability) using the new injury data (evidence) to calculate a revised probability (posterior probability). Bayesian probability thus provides a powerful method for updating our beliefs as new information becomes available, allowing for more accurate predictions and better-informed decisions in sports studies.
Bayesian probability offers a powerful and flexible framework for updating our beliefs in light of new evidence. Unlike frequentist probability, which relies on the frequency of events over many trials, Bayesian probability allows for the incorporation of prior knowledge or beliefs into the calculation of probabilities. This approach is particularly valuable in dynamic fields like sports, where new data is constantly emerging, and decisions often need to be made under uncertainty. In this chapter, we will explore the principles of Bayesian probability, its applications, and how it can be used effectively in sports studies.
Thomas Bayes (1702–1761) never published his theorem. After his death, his close friend Richard Price found the famous formula in Bayes’ “An Essay towards solving a Problem in the Doctrine of Chances” and published it posthumously in 1763. When Price presented the theorem, he gave the example of a cave-dweller seeing the sunrise for the first time and wondering whether it happened every day.
The Sun Rising Example: Imagine someone emerging from a cave and seeing the sun rise for the first time. They might wonder if this is a one-off event or if the sun rises every day. Each subsequent sunrise provides new evidence, gradually increasing their confidence that the sun will rise again. This scenario reflects how Bayesian probability allows us to update our beliefs with each new piece of evidence.
Bayes' Original Thought Experiment: Bayes conceived a thought experiment where he imagined sitting with his back to a perfectly square table, asking his assistant to throw a ball onto the table. Without seeing the ball, Bayes would ask for more balls to be thrown and for the assistant to indicate the relative positions of the balls. Over time, by updating his beliefs with each new piece of evidence, he could increasingly pinpoint the position of the first ball, though he would never be entirely certain. This experiment underscores the iterative nature of Bayesian probability: each new piece of evidence refines our understanding, though some uncertainty always remains.
Bayes' Theorem: At the heart of Bayesian probability lies Bayes' Theorem, which provides a mathematical way to update the probability of a hypothesis based on new evidence. The theorem is expressed as:
$$P(H|E) = \dfrac{P(E|H) \times P(H)}{P(E)} $$

$$\text{Posterior Probability} = \dfrac{\text{Likelihood} \times \text{Prior}}{\text{Marginal Probability}} $$

$$\text{Probability of A being true, given B is true} = \dfrac{\begin{matrix}\text{Probability of B being true,}\\ \text{given A is true}\end{matrix} \times \begin{matrix}\text{Probability of A being true}\\ \text{[the knowledge]}\end{matrix}}{\text{Probability of B being true}} $$

Where:

- $P(H|E)$ – Posterior Probability: The probability that the hypothesis (H) is true given the evidence (E). This represents our updated belief after seeing the evidence.
- $P(H)$ – Prior Probability: The initial probability that the hypothesis (H) is true before any evidence is considered. This is often based on previous knowledge or assumptions.
- $P(E|H)$ – Likelihood: The probability of observing the evidence (E) given that the hypothesis (H) is true.
- $P(E)$ – Marginal Probability: The overall probability of observing the evidence (E), regardless of the hypothesis.
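The mechanics of the theorem can be sketched in a few lines of Python (a minimal illustration; the function name and example numbers are ours, not part of any library):

```python
def bayes_posterior(likelihood: float, prior: float, marginal: float) -> float:
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    if marginal == 0:
        raise ValueError("Marginal probability P(E) must be non-zero")
    return likelihood * prior / marginal

# Arbitrary illustrative numbers: P(E|H) = 0.5, P(H) = 0.25, P(E) = 0.5
print(bayes_posterior(0.5, 0.25, 0.5))  # 0.25
```

The same function applies to every example that follows; only the three input probabilities change.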
Doping Test in Sports (Example 1)
Suppose an athlete undergoes a doping test, and the test returns positive. The doping test is designed to correctly identify 99% of athletes who are actually using banned substances (sensitivity) but also incorrectly identifies 1% of clean athletes as positive (false positive rate). Imagine that the prevalence of doping in the population of athletes is quite low, at around 0.1%.
Step 1: Establish the Prior Probability
The prior probability $P(Doping)$ is the probability that an athlete is doping before considering the test results. Based on the given prevalence, this is 0.1 %, or 0.001.
Step 2: Calculate the Posterior Probability
When the athlete tests positive, we want to calculate the posterior probability, $P(Doping∣Positive)$, which is the probability that the athlete is actually doping given the positive test result. To do this, we use Bayes' Theorem:
$$ P(Doping∣Positive) = \frac{P(Positive∣Doping) \times P(Doping)}{P(Positive)} $$

Where:

- $P(Positive∣Doping) = 0.99$ (The probability that the test is positive given the athlete is doping, also known as the test's sensitivity).
- $P(Doping) = 0.001$ (The prior probability that an athlete is doping).
- $P(Positive) = P(Positive∣Doping) \times P(Doping) + P(Positive∣NoDoping) \times P(NoDoping) = (0.99 \times 0.001) + (0.01 \times 0.999) = 0.00099 + 0.00999 = 0.01098$
Now, calculate the posterior probability:
$$P(Doping∣Positive)= \frac{0.99 \times 0.001}{0.01098} = \frac{0.00099}{0.01098} \approx 0.090 $$

So, despite the positive test, the probability that the athlete is actually doping is only approximately 9 %.
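The arithmetic above can be checked with a short Python sketch (the numbers come straight from the example; the variable names are ours):

```python
sensitivity = 0.99      # P(Positive | Doping)
false_positive = 0.01   # P(Positive | No Doping)
prior = 0.001           # P(Doping): prevalence of doping among athletes

# Law of total probability: P(Positive)
marginal = sensitivity * prior + false_positive * (1 - prior)

# Bayes' theorem: P(Doping | Positive)
posterior = sensitivity * prior / marginal

print(round(marginal, 5))   # 0.01098
print(round(posterior, 3))  # 0.09
```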
Step 3: Consider a Second Test
If the athlete undergoes a second doping test and again tests positive, we can update our probability using the new evidence. The prior probability for the second test becomes the posterior probability from the first test, which is 9 % or 0.09 (instead of the original 0.1 % or 0.001).
Using the same Bayes' Theorem:
$$ P(Doping∣Positive,\,Positive) = \frac{P(Positive∣Doping) \times P(Doping)}{P(Positive)} $$

Where $P(Doping∣Positive) = 0.09$ is now our updated prior. Recalculating the overall probability of a positive result:

$$ P(Positive) = (0.99 \times 0.09)+(0.01 \times 0.91) = 0.0891+0.0091 = 0.0982 $$

Thus, the updated posterior probability after two positive tests:

$$P(Doping∣Positive,\,Positive) = \frac{0.99 \times 0.09}{0.0982} = \frac{0.0891}{0.0982} \approx 0.907 $$

After two positive tests, the probability that the athlete is doping increases significantly, to approximately 90.7 %.
This example illustrates the importance of considering prior probabilities when interpreting test results, particularly in situations where the prevalence of the condition being tested for is low, as is often the case with doping in elite sports. Even with a highly accurate test, the initial likelihood of doping plays a crucial role in determining the final probability. With each subsequent test that returns positive, our confidence that the athlete is doping increases, demonstrating the power of Bayesian updating in sports science.
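This chain of updates, where each posterior becomes the next prior, can be expressed as a loop (a sketch using the example's numbers; carrying the unrounded posterior through gives 0.9075 rather than the 0.907 obtained from the rounded prior of 0.09):

```python
def update(prior: float, sensitivity: float = 0.99, false_positive: float = 0.01) -> float:
    """One Bayesian update after a single positive test result."""
    marginal = sensitivity * prior + false_positive * (1 - prior)
    return sensitivity * prior / marginal

p = 0.001  # initial prevalence-based prior
for test_number in (1, 2):
    p = update(p)
    print(f"Posterior after positive test {test_number}: {p:.4f}")
# Posterior after positive test 1: 0.0902
# Posterior after positive test 2: 0.9075
```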
Probability of Rain on a Cloudy Day (Example 2)
- Posterior Probability $[P(Rain|Cloud)]$: Probability of rain given it is cloudy.
- Likelihood $[P(Cloud|Rain)]$: Probability of cloudiness given it rains (50 %).
- Prior Probability $[P(Rain)]$: General probability of rain (10 %).
- Marginal Probability $[P(Cloud)]$: General probability of cloudiness (40 %).
$$P(Rain|Cloud) = \frac{0.50 \times 0.10}{0.40} = 0.125 = 12.5\,\% $$

Using Bayes’ Theorem, the probability of rain given that it is cloudy can be calculated as 12.5 %. This demonstrates how new information (cloudiness) updates our prior belief about the likelihood of rain.
Probability of Fire When There is Smoke (Example 3)
- Posterior Probability $[P(Fire|Smoke)]$: Probability of fire given there is smoke.
- Likelihood $[P(Smoke|Fire)]$: Probability of smoke given a fire (90 %).
- Prior Probability $[P(Fire)]$: General probability of fire (1 %).
- Marginal Probability $[P(Smoke)]$: General probability of smoke (10 %).
$$P(Fire|Smoke) = \frac{0.90 \times 0.01}{0.10} = 0.09 = 9\,\% $$

The calculation shows there is a 9 % chance of fire if smoke is observed. While the probability is relatively low, it is still significant enough to warrant action. Reminder: it is necessary to know the context and consequences before making a decision (1.5.3.1 Cherry-Picking Data).
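Both Example 2 and Example 3 can be verified with the same one-line application of Bayes' theorem (a minimal sketch; the helper name is ours):

```python
def bayes_posterior(likelihood: float, prior: float, marginal: float) -> float:
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / marginal

p_rain = bayes_posterior(0.50, 0.10, 0.40)  # Example 2: P(Rain | Cloud)
p_fire = bayes_posterior(0.90, 0.01, 0.10)  # Example 3: P(Fire | Smoke)
print(f"P(Rain | Cloud) = {p_rain:.3f}")  # 0.125
print(f"P(Fire | Smoke) = {p_fire:.3f}")  # 0.090
```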
2.12.1 Applications of Bayesian Probability
Bayesian probability is used across various fields, including science, artificial intelligence, and personal decision-making. Here are some key examples:
Treasure Hunting and Bayesian Inference: In 1988, treasure hunter Tommy Thompson used Bayesian methods to locate the wreck of the SS Central America, which contained $700,000,000 worth of gold. By continually updating his estimates based on new findings, Thompson was able to refine his search area and eventually discover the treasure. This example illustrates how Bayesian probability can be applied in real-world situations to make decisions under uncertainty.
Artificial Intelligence: Programmers often use Bayesian probability in developing AI systems, particularly in areas like machine learning and spam filters. For instance, a spam filter may assess the probability that an email is spam based on the occurrence of certain words, or phrases. By updating its belief as it processes more emails, the filter becomes more accurate over time.
Personal Decision-Making: Bayesian probability also applies to how individuals update their beliefs. For example, how you view yourself and your opinions may change as you encounter new evidence or perspectives. This process of updating beliefs is fundamental to the Bayesian approach and is relevant in sports when considering athletes' self-assessment and adaptation to new training techniques.
Injury Risk Assessment: Suppose a coach wants to assess an athlete's risk of injury. The prior probability might be based on the athlete's injury history. As new data (e.g., from training sessions or physiological tests) becomes available, the coach can update the probability of injury, allowing for more informed decisions about training intensity and rest periods.
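As a hypothetical sketch of this idea (all numbers below are invented for illustration, not drawn from real injury data): suppose an athlete's history suggests a 15 % baseline injury risk, and a movement screen flags 80 % of athletes who go on to be injured but also flags 30 % of those who do not.

```python
# Hypothetical numbers, for illustration only
prior = 0.15           # P(Injury): baseline risk from injury history
sensitivity = 0.80     # P(Flag | Injury): screen flags soon-to-be-injured athletes
false_positive = 0.30  # P(Flag | No Injury): screen flags healthy athletes

marginal = sensitivity * prior + false_positive * (1 - prior)  # P(Flag)
posterior = sensitivity * prior / marginal                     # P(Injury | Flag)

print(f"Updated injury risk after a flagged screen: {posterior:.2f}")  # 0.32
```

A flagged screen roughly doubles the estimated risk here, which might justify reducing training load even though injury is still not the most likely outcome.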
Game Strategy: Coaches can use Bayesian probability to refine game strategies based on the performance of their team and opponents. For example, if a basketball team’s shooting percentage improves after adopting a new offensive strategy, the coach can update their prior beliefs about the effectiveness of this strategy using Bayesian methods.
Performance Prediction: Bayesian probability can also help in predicting an athlete's future performance based on past data. If an athlete has a history of performing well under certain conditions, new data from recent performances can be used to update the probability of future success.
Conclusion
Bayesian probability provides a robust framework for updating beliefs and making decisions in the face of uncertainty. By incorporating prior knowledge and adjusting to new evidence, Bayesian methods offer a dynamic approach to problem-solving in sports studies. From injury risk assessments to game strategy, Bayesian probability allows researchers and practitioners to refine their predictions and improve their decision-making processes.
Review Questions
- Explain the difference between prior probability and posterior probability in the context of Bayesian probability.
- How can Bayesian probability be applied to assess the risk of injury in athletes? Provide a hypothetical example.
- Discuss the relevance of Bayesian probability in making game strategy decisions. How does it differ from traditional methods?
Exercise
Using the Bayesian approach, analyse a sports-related problem, such as predicting the outcome of a match or evaluating the likelihood of an injury. Start with a prior probability based on historical data, and then update your probability as new evidence emerges. Discuss how your belief changes with each new piece of data, and what implications this has for decision-making in sports.