M7777 Applied Functional Data Analysis
11. Registration
Jan Koláček (kolacek@math.muni.cz)
Dept. of Mathematics and Statistics, Faculty of Science, Masaryk University, Brno
Jan Koláček (SCI MUNI) M7777 Applied FDA Fall 2019 1/17

Berkeley Growth Data
• Heights of 54 girls taken from ages 0 through 18.
• Growth process is easier to visualize in terms of acceleration (2nd derivative).
• Peaks in acceleration = start of growth spurts.
[Figure: sample of 10 girls; x-axis: Age [years].]

The Registration Problem
• Most analyses only account for variation in amplitude.
• Frequently, observed data exhibit features that vary in time: phase variation.
[Figure: two panels, "Phase variation" and "Amplitude variation"; x-axis from -5.0 to 5.0.]
• Mean of unregistered curves (dashed) has smaller peaks than any individual curve.
• Aligning the curves reduces variation.

Berkeley Growth Data
[Figure: panel "Observed"; sample of 10 girls.]

Defining a Warping Function
Requires a transformation of time.
• The time-warping function is a continuous function $h_i(t)$ defined on $[0, T]$, which is strictly increasing and satisfies $h_i(0) = 0$, $h_i(T) = T$.
• The aligning function $h_i^{-1}(t)$ is the functional inverse of $h_i(t)$, i.e. $h_i^{-1}(h_i(t)) = t$.
• The registered curves are $x_i^*(t) = x_i(h_i^{-1}(t))$.

Landmark Registration
• For each curve $x_i(t)$ we choose points $t_{i1}, \dots, t_{iK}$.
• We define a reference (usually one of the curves, the mean, etc.) $t_{01}, \dots, t_{0K}$.
• We define constraints.
• Finally we find a constrained smooth function.
Berkeley Growth example
• Just one reference point, $t_0 = 11.7$: the average time of the maximal pubertal growth spurt (acceleration crosses 0).
• $t_i$: the time of the maximal pubertal growth spurt for the $i$-th curve.
• Thus the constraints are $h_i^{-1}(11.7) = t_i$, $i = 1, \dots, 54$.

Result
[Figure: girl 3 and girl 7; observed and aligned curves, with the corresponding warping functions; x-axis: Age [years].]
Aligned data with warping functions.

[Figure: x-axis: Age [years].]
Aligned data for 10 girls with their warping functions.

Identifying Landmarks
Major landmarks of interest:
• where the curve $x_i(t)$ crosses some value
• location of peaks or valleys
• location of inflections
Almost all are points at which some derivative of $x_i(t)$ crosses zero.
In practice, zero-crossings can be found automatically, but usually still require manual checking.
For landmark registration in R, the procedure landmarkreg is used.

Continuous Registration
• Let $x_0(t)$ be a reference curve (dashed).
[Figure: two panels, "Phase variation" and "Amplitude variation"; x-axis from -5.0 to 5.0.]
• Phase variation: curves are less correlated with $x_0(t)$; in FPCA, the first 3 PCs explain 55%, 39% and 5% of the variation.
• Amplitude variation: curves are highly correlated with $x_0(t)$; in FPCA, just the first PC explains 100% of the variation.
Main idea: Find $h(t)$ to maximize correlation with $x_0(t)$.

Practical Use
1. Preprocess the data by landmark registration.
2. Do the continuous registration (this step can be repeated).
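
As a rough illustration of the landmark-registration idea from the slides, the following Python sketch aligns synthetic single-peak curves to the reference time 11.7. Everything beyond that value is an assumption made for illustration: the Gaussian-shaped curves, their peak times, the helper names (make_curve, aligning_function, register, find_peak, mean_curve) and the piecewise-linear form of the aligning function, which stands in for the constrained smooth function that landmarkreg would estimate in R.

```python
import math

T = 18.0    # end of the observation interval [0, T]
T0 = 11.7   # reference landmark (value taken from the slides)


def make_curve(peak_time, width=1.5):
    """Synthetic curve with one peak at peak_time (illustrative data only)."""
    return lambda t: math.exp(-((t - peak_time) / width) ** 2)


def aligning_function(t, t_i, t0=T0, end=T):
    """Piecewise-linear stand-in for h_i^{-1} (an assumption, not landmarkreg).

    Strictly increasing, maps 0 -> 0, t0 -> t_i and end -> end.
    """
    if t <= t0:
        return t * t_i / t0
    return t_i + (t - t0) * (end - t_i) / (end - t0)


def register(curve, t_i):
    """Registered curve x_i^*(t) = x_i(h_i^{-1}(t))."""
    return lambda t: curve(aligning_function(t, t_i))


def find_peak(curve, grid):
    """Landmark t_i: the grid point where the curve is largest."""
    return max(grid, key=curve)


def mean_curve(curves):
    """Cross-sectional mean of a list of curves."""
    return lambda t: sum(c(t) for c in curves) / len(curves)


# Evaluation grid on [0, T] with step 0.01.
grid = [k * T / 1800 for k in range(1801)]

# Hypothetical peak times playing the role of individual growth-spurt times.
curves = [make_curve(p) for p in (10.0, 11.0, 12.5, 13.5)]
landmarks = [find_peak(c, grid) for c in curves]
registered = [register(c, t_i) for c, t_i in zip(curves, landmarks)]

# Compare the height of the mean curve before and after registration.
peak_unregistered = max(mean_curve(curves)(t) for t in grid)
peak_registered = max(mean_curve(registered)(t) for t in grid)
print("landmarks:", landmarks)
print("peak of mean, unregistered:", round(peak_unregistered, 3))
print("peak of mean, registered:  ", round(peak_registered, 3))
```

Running it shows that the peak of the cross-sectional mean is noticeably lower before registration than after, which is the effect described under "The Registration Problem". With several landmarks per curve, the same construction would use one linear piece between each pair of consecutive landmarks.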