“The purpose of this volume is to take three fundamental ideas from standard linear model theory and exploit their properties in examining multivariate, time series and spatial data. In decreasing order of importance to the presentation, the three ideas are: best linear prediction, projections, and Mahalanobis distance” (from the preface to the first edition).
This book can be seen as a companion volume to Plane Answers to Complex Questions, by the same author. Numerous references to Plane Answers are made throughout, so it is probably best to read that book first or, at the very least, to have a solid understanding of the material it covers, including generalized least squares, identifiability, estimability, and mixed models with a simple covariance matrix.
The first three chapters cover some topics in statistical learning: nonparametric regression (including regression trees), penalized estimation (e.g. ridge regression and the lasso), and reproducing kernel Hilbert spaces.
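To fix ideas on what penalized estimation refers to, in standard notation (which may differ from the book's), the ridge and lasso estimators minimize a residual sum of squares plus a penalty on the coefficient vector:

\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \, \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2, \qquad \hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \, \|y - X\beta\|_2^2 + \lambda \|\beta\|_1,

where the tuning parameter \lambda \ge 0 controls the amount of shrinkage and \lambda = 0 recovers ordinary least squares.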
Best linear prediction is introduced in chapter 4 and its main properties are collected, with proofs, in an appendix at the end of the book. The Mahalanobis distance is mostly used in the last chapters, where multivariate linear models are introduced.
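For readers who have not met these two concepts, in standard notation (again, the book's own notation may differ): for random vectors x and y with means \mu_x, \mu_y, nonsingular covariance matrix V_{xx} = \mathrm{Cov}(x), and cross-covariance V_{yx} = \mathrm{Cov}(y, x), the best linear predictor of y from x is

\hat{E}(y \mid x) = \mu_y + V_{yx} V_{xx}^{-1} (x - \mu_x),

while the squared Mahalanobis distance of an observation x from a mean \mu under a covariance matrix V is

D^2(x, \mu) = (x - \mu)^{\top} V^{-1} (x - \mu).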
Chapter 4 discusses covariance parameter estimation, that is, estimation in a linear model whose covariance matrix depends on unknown parameters. The results of chapter 4 are then applied to the theory of mixed linear models (chapter 5), time series (chapters 6 and 7), and spatial data (chapter 8). The fourth chapter is therefore a key chapter, and it is worth spending a good deal of time on it before proceeding to the later chapters.
Chapters 9 through 14 deal with multivariate linear models and their applications. They are introductory in nature but nonetheless present very important applications such as profile analysis, growth curves, and principal component analysis.
The R code used to produce the plots in the examples is available online as a PDF file; however, I could not find the corresponding datasets (the R scripts contain file paths referring to local drives). Hopefully, these datasets (and the code) will be made available for download from an online repository such as GitHub in the future.
Some sections contain a list of exercises. Some of these are open-ended questions, such as: “Analyze the repeated measures data given by (…) in Biometrics on pages 562 and 562”. Others are of a more usual type, such as proving a formula or a property of an estimator.
This book is, in my opinion, a very valuable resource for researchers, since it presents the theoretical foundations of linear models in a unified way while discussing a number of applications. Selected chapters might be used to build an advanced course on linear modeling, but I don’t think the author’s goal was to write a textbook: there is no guidance on recommended reading sequences or on dependencies among chapters. This book is definitely worth considering for anyone looking for an extensive and thorough treatment of advanced topics in linear modeling.
Fabio Mainardi (fabio.mainardi@rd.nestle.com) is a mathematician working as a senior data scientist at Nestlé Research, Switzerland. His mathematical interests include number theory, functional analysis, discrete mathematics, and probability.