Week 1 : Introduction; Algorithm Basics Introduction to Bioinformatics (LF:DSIB01) Panos Alexiou panagiotis.alexiou@ceitec.muni.cz Adobe Systems Introduction to LF:DSIB01 - Course Goals •Introductory course for Bioinformatics • •Students will: ‒become familiar with fundamental concepts ‒able to design, implement, and run basic analyses ‒decipher domain specific language in publications ‒understand the field and opportunities for development of skills • •Not goals: ‒solve specific research questions ‒cover entirety of Bionformatics field ‒teach programming skills • 2 Adobe Systems Introduction to LF:DSIB01 - Course Material •Lecture Notes • •Useful Books ‒An Introduction to Bioinformatics Algorithms, Jones and Pevzner ‒Bioinformatics for Biologists, Pevzner and Shamir ‒ • 3 Adobe Systems Course Schedule 1.Introduction; Algorithm Basics 2.Sequence analysis introduction/common file formats 3.Sequence Alignment / Sequence pattern recognition 1/3 4.Sequence Alignment / Sequence pattern recognition 2/3 5.Sequence Alignment / Sequence pattern recognition 3/3 6.NGS / galaxy -------------------------------------------------------------------------------------------- 7.Introduction to Data Analysis 8.Principles of Data visualisation 9.Clustering / PCA 10.Basic Statistics 11.Bayesian Inference/Bayesian classifier -------------------------------------------------------------------------------------------- 12.Current Developments in Machine Learning 13.Colloquium Evaluation • 4 Adobe Systems Introduction to LF:DSIB01 - Student Responsibilities • •Attend Classes and Practicals • •Complete Practical Exercises • •Demonstrate Understanding of Material • • 5 Adobe Systems 6 Adobe Systems Basics of Algorithms •Definition of an algorithm • •Pseudocode Notation • •Exercise: The Coin Change Problem • •Brute force, Iterative, Recursive • •Big-O notation • 7 Adobe Systems What is an algorithm •A sequence of instructions one must perform to solve a well formulated problem •A step-by-step method of solving a problem •A set of instructions designed to perform a specific task • • 8 Sequence of instructions Step-by-step method Set of instructions Solve Perform Well formulated problem Specific task Adobe Systems 9 Sequence of instructions Step-by-step method Set of instructions Solve Perform Well formulated problem Specific task Adobe Systems Pseudocode Notation 10 Adobe Systems Pseudocode Notation 11 Adobe Systems Pseudocode Notation 12 Adobe Systems Pseudocode Notation 13 Adobe Systems Pseudocode Notation 14 Adobe Systems Pseudocode Notation 15 Adobe Systems Pseudocode vs Computer Code 16 If you were to build a machine that follows these instructions, you would need to make it specific to a particular kitchen and be tirelessly explicit in all the steps (e.g., how many times and how hard to stir the filling, with what kind of spoon, in what kind of bowl, etc.) This is exactly the difference between pseudocode (the abstract sequence of steps to solve a well-formulated computational problem) and computer code (a set of detailed instructions that one particular computer will be able to perform). Adobe Systems Pseudocode Exercise: Coin Change (Euro coins) 17 Convert an amount of money into the fewest number of coins Input: Amount of money (M) Output: the smallest number of 50c (a), 20c (b), 10c (c), 5c (d), 2c (e) and 1c (f) such that 50a+20b+10c+5d+2e+1f = M Try: M=60c; M=55c; M=40c Adobe Systems Pseudocode Exercise: Coin Change (Generalised) 18 Adobe Systems Pseudocode Exercise: Coin Change (US coins) 19 Try M = 40; c1=25, c2=10, c3=5, c4=1 NB: Division = “floor” Adobe Systems Pseudocode Exercise: Coin Change (US coins) 20 M = 40; c1=25, c2=20, c3=10, c4=5, c5=1 Discontinued 1875 for being too confusing NB: Division = “floor” Adobe Systems Pseudocode Exercise: Coin Change (US coins) 21 BetterChange 40= 1x25 + 1x10 + 1x5= 3 coins Incorrect! 40 = 2x20= 2 coins M = 40; c1=25, c2=20, c3=10, c4=5, c5=1 Discontinued 1875 for being too confusing NB: Division = “floor” Adobe Systems Pseudocode Exercise: Coin Change 22 Tries every combination Guaranteed to find optimal Slow Adobe Systems 23 Tries every combination Guaranteed to find optimal Slow Spoiler: (There is a better solution: Stay tuned for Week 4) Brute Force Algorithm Adobe Systems Recursive Algorithms 24 https://www.mathsisfun.com/games/towerofhanoi.html 1 disc = 1 move 2 discs = 3 moves (1-2, 1-3, 2-3) 3 discs = 7 moves (1-3, 1-2, 3-2, 1-3, 2-1, 2-3,1-3) … Adobe Systems Towers of Hanoi (3 disks) 25 https://www.mathsisfun.com/games/towerofhanoi.html More generally, to move a stack of size n from the left to the right peg, you first need to move a stack of size n − 1 from the left to the middle peg, and then from the middle peg to the right peg once you have moved the nth disk to the right peg. To move a stack of size n − 1 from the middle to the right, you first need to move a stack of size n − 2 from the middle to the left, then move the (n − 1)th disk to the right, and then move the stack of n − 2 from the left to the right peg, and so on. 7 moves (1-3, 1-2, 3-2, 1-3, 2-1, 2-3,1-3) Adobe Systems Towers of Hanoi: N disks 26 https://www.mathsisfun.com/games/towerofhanoi.html ? Adobe Systems Towers of Hanoi: N disks 27 https://www.mathsisfun.com/games/towerofhanoi.html Adobe Systems Towers of Hanoi: N disks 28 https://www.mathsisfun.com/games/towerofhanoi.html Adobe Systems Towers of Hanoi: 4 disks 29 https://www.mathsisfun.com/games/towerofhanoi.html Adobe Systems Towers of Hanoi: 4 disks 30 https://www.mathsisfun.com/games/towerofhanoi.html Adobe Systems Iterative Algorithms – Fibonacci Series 31 Adobe Systems Iterative Algorithms – Immortal Rabbits 32 A baby pair of rabbits takes the same time to mature as a mature pair of rabbits takes to produce a new pair. How many rabbits are there after N iterations? PS: rabbits cannot die! 1,1,sum of previous two, … Adobe Systems 33 Iterative: Fast (linear) Recursive: Slow (exponential) Iterative Algorithms vs Recursive Algorithms Adobe Systems 34 Recursive: Slow (exponential) Adobe Systems Algorithms •Brute force : Try Everything, slow but always correct • •Recursive : To Solve for n, first solve for n-1 • •Iterative : Loop on something, can be faster 35 Adobe Systems Fast vs Slow algorithms 36 •How many operations does an algorithm take as N increases? • •Is the relationship linear? Quadratic? Exponential? • •What is the upper limit of the running time of an algorithm as N increases? Adobe Systems Guess random number (up/down) 37 Simple search: For i in 1 to N If i == the number return i Binary search: Range-min = 1 Range-max = N While () i = middle number of range if i == the number; return I elsif i < number; Range-max=i; elsif i > number; Range-min=i; 1ms / check N = 100 Adobe Systems Guess random number (up/down) 38 ??? Guess ??? ~15 times faster Adobe Systems Guess random number (up/down) 39 ~450ms ? ~15 times faster ~15 times faster ? Adobe Systems Guess random number (up/down) 40 ~15 times faster Adobe Systems Guess random number (up/down) 41 Linear 1 ms/element O(n) Logarithmic 1 ms/log2(element) O(log n) (Worst case scenario) Adobe Systems 42 Adobe Systems Common Big-Os 43 https://www.freecodecamp.org/news/big-o-notation-simply-explained-with-illustrations-and-video-87d5 a71c0174/ "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?" Adobe Systems Week 1 Summary 44 -I know what an algorithm is - -I can write pseudocode - -I understand -Brute force -Iterative -Recursive - -Big-O = how slow Adobe Systems Adobe Systems Adobe Systems Adobe Systems Adobe Systems Adobe Systems 45 www.ceitec.eu CEITEC @CEITEC_Brno Panos Alexiou panagiotis.alexiou@ceitec.muni.cz Thank you for your attention! 60 minutes lunch break. >