Week 8: Bayesian modeling
Introduction to Bioinformatics (LF:DSIB01)

Bayes’ theorem
•P(A|B) = P(B|A) * P(A) / P(B)
•With P(B) expanded: P(A|B) = P(B|A) * P(A) / (P(B|A) * P(A) + P(B|¬A) * P(¬A))

Bayes’ theorem - example
•You might be interested in finding out a patient’s probability of having liver disease if they are an alcoholic.
  ‒Past data tells you that 10% of patients entering your clinic have liver disease: P(A) = 0.10.
  ‒Five percent of the clinic’s patients are alcoholics: P(B) = 0.05.
  ‒Among those patients diagnosed with liver disease, 7% are alcoholics: P(B|A) = 0.07.
•P(A|B) = (0.07 * 0.1) / 0.05 = 0.14
•If the patient is an alcoholic, their chance of having liver disease is 0.14 (14%), up from the 10% suggested by past data alone.

Bayes’ theorem - example
•What is the probability that a woman has cancer if she has a positive mammogram result?
  ‒One in 1000 women has breast cancer.
  ‒98 percent of women who have breast cancer test positive on mammograms.
  ‒1 percent of women without breast cancer have a positive mammogram.
•P(A) = 0.001, P(¬A) = 0.999, P(B|A) = 0.98, P(B|¬A) = 0.01
•P(A|B) = (0.98 * 0.001) / ((0.98 * 0.001) + (0.01 * 0.999)) = 0.0893
•The probability of a woman having cancer, given a positive test result, is ~9%.
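
The two worked examples can be checked with a few lines of Python. This is only a minimal sketch: the function names `posterior` and `posterior_from_rates` are made up for illustration and do not come from the slides; the numbers are the ones given in the examples.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


def posterior_from_rates(prior, true_positive_rate, false_positive_rate):
    """Same rule, with P(B) expanded by the law of total probability."""
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return posterior(prior, true_positive_rate, evidence)


# Liver disease example: P(A) = 0.10, P(B) = 0.05, P(B|A) = 0.07
liver = posterior(prior=0.10, likelihood=0.07, evidence=0.05)

# Mammogram example: P(A) = 0.001, P(B|A) = 0.98, P(B|not A) = 0.01
mammogram = posterior_from_rates(0.001, 0.98, 0.01)

print(f"P(liver disease | alcoholic) = {liver:.2f}")
print(f"P(cancer | positive test)   = {mammogram:.4f}")
```

The second helper is useful whenever P(B) is not given directly, as in the mammogram example, where only the test's rates in women with and without cancer are known.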

Bayes’ theorem - applications
•Bayesian statistics
  ‒Data modeling
  ‒Parameter estimation
•Bayesian networks
  ‒Naïve Bayesian classifier
  ‒Dynamic Bayesian networks
•Hidden Markov models
•Can be used in many different contexts
  ‒Neural networks

Bayesian statistics
•Bayesian statistical methods use Bayes' theorem to compute and update probabilities after obtaining new data.

Bayesian vs. frequentist statistics
•The Bayesian interpretation views probability as a degree of belief in an event.
•The frequentist interpretation views probability as the limit of the relative frequency of an event after many trials.
•In the Bayesian view, a probability is assigned to a hypothesis, whereas under frequentist inference, a hypothesis is typically tested without being assigned a probability.

Bayesian vs. frequentist statistics
•Two main sticking points
  ‒Prior belief
  ‒Situations with small amounts of data

Bayesian statistics – parameter estimation
•The data come from some probability distribution function (PDF).
•The goal is to obtain the PDF of the parameters of this function.

PDF estimation with sampling
•Impossible (or very hard) to compute analytically
•Markov chain Monte Carlo (MCMC)
  ‒This made practical use of Bayesian statistics possible

Bayesian statistics - example
•Student t-test

Bayesian statistics – Generalized linear model
•y = sig(β0 + β1x1 + β2x2)
•y = β0 + β1x1 + β2x2
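
To make the link between MCMC sampling and the logistic model y = sig(β0 + β1x1) concrete, here is a rough sketch of a random-walk Metropolis sampler for the posterior of the two coefficients. Everything specific in it is an assumption made for illustration and is not taken from the slides: the single predictor, the toy data, the independent normal priors (standard deviation 5), the step size, and the function names.

```python
import math
import random


def sigmoid(z):
    """Logistic function, the 'sig' in y = sig(b0 + b1*x)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_posterior(beta, xs, ys, prior_sd=5.0):
    """Unnormalised log posterior: Bernoulli likelihood plus
    independent normal priors on b0 and b1 (an assumption)."""
    b0, b1 = beta
    lp = -(b0 ** 2 + b1 ** 2) / (2.0 * prior_sd ** 2)
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        lp += math.log(p) if y == 1 else math.log(1.0 - p)
    return lp


def metropolis(xs, ys, n_steps=5000, step=0.3, seed=0):
    """Random-walk Metropolis: propose a nearby point and accept it
    with probability min(1, posterior ratio)."""
    rng = random.Random(seed)
    beta = (0.0, 0.0)
    lp = log_posterior(beta, xs, ys)
    samples = []
    for _ in range(n_steps):
        proposal = (beta[0] + rng.gauss(0.0, step),
                    beta[1] + rng.gauss(0.0, step))
        lp_new = log_posterior(proposal, xs, ys)
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            beta, lp = proposal, lp_new
        samples.append(beta)
    return samples


# Toy data (hypothetical): the outcome becomes more likely as x grows.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 1]

samples = metropolis(xs, ys)
kept = samples[1000:]  # drop the first 1000 steps as burn-in
b0_mean = sum(s[0] for s in kept) / len(kept)
b1_mean = sum(s[1] for s in kept) / len(kept)
print(f"posterior mean b0 = {b0_mean:.2f}, b1 = {b1_mean:.2f}")
```

The retained samples approximate the joint PDF of the parameters, so summaries such as means or credible intervals can be read directly off them. In practice a dedicated sampler from a probabilistic programming library would normally replace this hand-written loop.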

Bayesian statistics – Generalized linear model
y scale type    Link function                    PDF
metric          identity                         normal
dichotomous     logistic                         Bernoulli
ordinal         thresholded cumulative normal    categorical
count           exponential                      Poisson

Bayesian statistics – Generalized linear model
Explanatory variable type       Response variable type        Example test type
Categorical                     Categorical                   Fisher test
Categorical (two groups)        Continuous                    t-test
Categorical (multiple groups)   Continuous                    ANOVA
Continuous                      Continuous                    Linear regression
Continuous                      Categorical (two groups)      Logistic regression

Bayesian statistics – hierarchical models

Bayesian networks
•Directed acyclic probabilistic graph
•Represents a set of variables and their conditional dependencies

Bayesian networks - example
•Nodes
  ‒C: diagnosis
  ‒E1: symptom1
  ‒E2: lab result
  ‒E3: symptom2
  ‒E4: lab result2
  ‒E5: Family genetics
  ‒E6: Smoker
  ‒E7: Hot weather
•Probability tables
  ‒P(E5), P(E6), P(E7)
  ‒CPT(C|E5,E6)
  ‒CPT(E1|C,E6)
  ‒CPT(E2|C)
  ‒CPT(E3|C,E7)
  ‒CPT(E4|C,E5,E7)

Dynamic Bayesian networks
•Hidden Markov models (HMM) are a (simple) special case of DBN

Naive Bayesian Classifier

Naive Bayesian Classifier – simple example
•Nodes
  ‒C: diagnosis
  ‒E1: symptom1
  ‒E2: lab result
  ‒E3: symptom2
  ‒E4: lab result2
•Probability tables
  ‒CPT(E1|C)
  ‒CPT(E2|C)
  ‒CPT(E3|C)
  ‒CPT(E4|C)

Naive Bayes classifiers - properties
•Highly scalable: the number of parameters is linear in the number of variables
  ‒curse of dimensionality
•Training can be done by evaluating a closed-form expression, which takes linear time
•Assumption of independence
  ‒Simple model
•Successfully used in practice
•Best if you have a poor understanding of the problem.
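
The closed-form training and the independence assumption can be sketched in a few lines. This is an illustrative example only: the slides do not say which per-feature distribution is used, so Gaussian likelihoods are assumed here, and the function names, class labels and toy numbers are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Closed-form training: one pass to get, per class, the prior
    and a mean and variance for every feature."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6  # avoid zero variance
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Pick the class with the highest log posterior. Features are
    treated as independent given the class, so their log-likelihoods
    simply add up."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (hypothetical): two features per sample, two classes.
X = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3],
     [-1.0, 1.1], [-0.8, 0.9], [-1.1, 1.2]]
y = ["A", "A", "A", "B", "B", "B"]

model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 0.2]))
print(predict(model, [-0.9, 1.0]))
```

Each class stores two numbers per feature plus one prior, which is where the linear parameter count comes from.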

Naive Bayesian Classifier - example
•Pancreatic tumor classification based on miRNA levels
•miRNA sequencing from plasma samples
  ‒~300 miRNAs
•Select ~20 miRNAs to classify tumor types
•Bayesian model - 24 miRNAs -> 85% success rate
•Class node: tumor classification
•Feature nodes: Log expression ratio (miRNA12,miRNA9), …, Log expression ratio (miRNA25,miRNA60), Log expression ratio (miRNA24,miRNA51)

Thank you for your attention!
60 minutes lunch break.
www.ceitec.eu
CEITEC @CEITEC_Brno