M7777 Applied Functional Data Analysis
2. From Data to Functions: basis systems
Jan Koláček (kolacek@math.muni.cz)
Dept. of Mathematics and Statistics, Faculty of Science, Masaryk University, Brno
Jan Koláček (SCI MUNI) M7777 Applied FDA Fall 2019

From Data to Functions
How do we go from data to functions?
[Figure: two panels titled "St. Johns"; x-axis: Days.]

Basis Expansions
We consider [...] and [...]. Let us denote
• $\Phi^*(t) = ($ [...]

• K = 2: $\Phi^*(t) = (\text{Bspl1.1}(t), \text{Bspl1.2}(t))$
[Figure: panels "Bsplines Basis" and "St. Johns"; x-axis: Days.]

The Bsplines Basis
Knots monthly (nknots = 13), local constants (m = 1) ⇒ K = 12
$\Phi^*(t) = (\text{Bspl1.1}(t), \text{Bspl1.2}(t), \ldots, \text{Bspl1.12}(t))$
[Figure: panels "Bsplines Basis" and "St. Johns"; x-axis: Days.]

The Bsplines Basis
Knots monthly (nknots = 13), local linear (m = 2) ⇒ K = 13
$\Phi^*(t) = (\text{Bspl2.1}(t), \text{Bspl2.2}(t), \ldots, \text{Bspl2.13}(t))$
[Figure: panels "Bsplines Basis" and "St. Johns".]

The Bsplines Basis
Knots monthly (nknots = 13), local quadratic (m = 3) ⇒ K = 14
$\Phi^*(t) = (\text{Bspl3.1}(t), \text{Bspl3.2}(t), \ldots, \text{Bspl3.14}(t))$
[Figure: panels "Bsplines Basis" and "St. Johns"; x-axis: Days.]

The Bsplines Basis
Knots monthly (nknots = 13), local cubic (m = 4) ⇒ K = 15
$\Phi^*(t) = (\text{Bspl4.1}(t), \text{Bspl4.2}(t), \ldots, \text{Bspl4.15}(t))$
[Figure: panels "Bsplines Basis" and "St. Johns"; x-axis: Days.]

The Bsplines Basis
Knots monthly (nknots = 13), splines of order m = 6 ⇒ K = 17
$\Phi^*(t) = (\text{Bspl6.1}(t), \text{Bspl6.2}(t), \ldots, \text{Bspl6.17}(t))$
[Figure: panels "Bsplines Basis" and "St. Johns"; x-axis: Days.]
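The basis sizes quoted on the slides above (K = 12, 13, 14, 15, 17 for orders m = 1, 2, 3, 4, 6 with 13 monthly knots) are consistent with K = (number of interior knots) + m. The following is a minimal pure-Python sketch that reproduces these counts and evaluates the basis with the Cox-de Boor recursion. It is not the fda library's implementation. The names `MONTH_BREAKS`, `n_basis`, `clamped_knots` and `bspline_values` are hypothetical, and the breakpoint values are an assumption (month boundaries of a 365-day year), since the slides do not list them.

```python
from typing import List

# Assumption: "monthly" knots are the month boundaries of a 365-day year,
# 13 breakpoints including both ends. The slides do not give the values.
MONTH_BREAKS = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365]


def n_basis(breaks: List[float], order: int) -> int:
    """Number of basis functions: (number of interior knots) + order."""
    return (len(breaks) - 2) + order


def clamped_knots(breaks: List[float], order: int) -> List[float]:
    """Knot vector in which each boundary break appears `order` times in total."""
    return [breaks[0]] * (order - 1) + list(breaks) + [breaks[-1]] * (order - 1)


def bspline_values(t: float, breaks: List[float], order: int) -> List[float]:
    """Evaluate all K B-splines of the given order at t (Cox-de Boor recursion)."""
    knots = clamped_knots(breaks, order)

    # Order 1: indicator functions of the half-open intervals [knots[i], knots[i+1]).
    b = [1.0 if knots[i] <= t < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    # Close the last non-degenerate interval so the right endpoint is covered.
    if t == knots[-1]:
        b[len(knots) - order - 1] = 1.0

    # Raise the order one step at a time; terms with a zero denominator are dropped.
    for k in range(2, order + 1):
        nxt = []
        for i in range(len(knots) - k):
            left_den = knots[i + k - 1] - knots[i]
            right_den = knots[i + k] - knots[i + 1]
            left = (t - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = (knots[i + k] - t) / right_den * b[i + 1] if right_den > 0 else 0.0
            nxt.append(left + right)
        b = nxt
    return b


for m in (1, 2, 3, 4, 6):
    print(f"m = {m}: K = {n_basis(MONTH_BREAKS, m)}")  # K = 12, 13, 14, 15, 17

values = bspline_values(100.0, MONTH_BREAKS, 4)
print(len(values))                    # 15 basis functions for cubic splines
print(sum(v > 0 for v in values))     # how many are nonzero at day 100
```

At any single day only a handful of the K values are nonzero, which is why evaluating even a large basis stays cheap.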
The Bsplines Basis
Summary
• Number of basis functions: # interior knots + order (e.g., 13 monthly knots give 11 interior knots, so order 4 yields K = 15).
• Derivatives up to order $m - 2$ are continuous.
• B-spline basis functions are positive over at most m adjacent intervals ⇒ fast computation, even for thousands of basis functions.
• The sum of all B-splines in a basis is always 1; the basis can fit any polynomial of order m.
• The most popular choice is order 4, implying continuous second derivatives. Second derivatives have straight-line segments.

The Bsplines Basis
Choosing Knots and Order
• The order of the spline should be at least k + 2 if you are interested in k derivatives.
• Knots are often equally spaced (a useful default), but there are two important rules:
  • Place more knots where you know there is strong curvature, and fewer where the function changes slowly.
  • Be sure there is at least one data point in every interval.
• Later, we'll discuss placing a knot at each point of observation.
• Coincident knots reduce the number of continuous derivatives at each point. This can be useful (more later).

The Bsplines Basis
Other
The fda library in R also allows the following bases:
Constant: $\Phi^*(t) = 1$, the simplest of all.
Power: $\Phi^*(t) = (t^{\lambda_1}, t^{\lambda_2}, t^{\lambda_3}, \ldots, t^{\lambda_K})$; the powers are distinct but not necessarily integers or positive.
Exponential: $\Phi^*(t) = (e^{\lambda_1 t}, e^{\lambda_2 t}, e^{\lambda_3 t}, \ldots,$ …