Canonical Correlation Analysis (CCA)
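Canonical correlation analysis (CCA) finds the maximally correlated lower-dimensional representations of two data sets $X \in \mathbb{R}^{n \times p}$ and $Y \in \mathbb{R}^{n \times q}$, whose rows are observations and whose columns are mean-centered. CCA can be applied to any pair of multidimensional data sets to investigate co-variation. In my field, it is applied to brain signals in order to determine the relationship between two regions of interest. There are plenty of sources out there explaining the theory of CCA, but I thought it would be a good idea to explain it using my own words and notation in my blog. I will try to be consistent with the notation in http://www.csee.umbc.edu/~adali/pubs/IEEEpubs/SPM2010correa.pdf and try to fill the gaps in derivations.

In CCA, we want to maximize the correlation between linear transformations of $X$ and $Y$:

$$\max_{W_x, W_y} \operatorname{corr}(X W_x, Y W_y)$$

where $W_x \in \mathbb{R}^{p \times d}$ and $W_y \in \mathbb{R}^{q \times d}$ are canonical coefficient matrices and $d \le \min(p, q)$. Let’s analyze the method starting with the first columns $w_x$ and $w_y$ of the canonical coefficient matrices:

$$\max_{w_x, w_y} \frac{w_x^T \Sigma_{xy} w_y}{\sqrt{w_x^T \Sigma_{xx} w_x}\,\sqrt{w_y^T \Sigma_{yy} w_y}}$$

where $\Sigma_{xx}$ and $\Sigma_{yy}$ are the covariance matrices of $X$ and $Y$ (assumed invertible) and $\Sigma_{xy}$ is their cross-covariance matrix. Since the objective does not change when $w_x$ or $w_y$ is rescaled, this is equivalent to solving

$$\max_{w_x, w_y} w_x^T \Sigma_{xy} w_y$$

with constraints $w_x^T \Sigma_{xx} w_x = w_y^T \Sigma_{yy} w_y = 1$. Let $a = \Sigma_{xx}^{1/2} w_x$ and $b = \Sigma_{yy}^{1/2} w_y$ to simplify the constraint. The optimization problem now turns out to be:

$$\max_{a, b} a^T K b \quad \text{such that} \quad a^T a = b^T b = 1$$

where $K = \Sigma_{xx}^{-1/2} \Sigma_{xy} \Sigma_{yy}^{-1/2}$. By the method of Lagrange multipliers, the maximum is attained at a stationary point of

$$L(a, b) = a^T K b - \frac{\lambda}{2}\left(a^T a - 1\right) - \frac{\mu}{2}\left(b^T b - 1\right)$$

for constants $\lambda$ and $\mu$. At such a stationary point, we have:

$$K b = \lambda a, \qquad K^T a = \mu b.$$

Multiplying the first equation by $a^T$ and the second by $b^T$ gives $\lambda = a^T K b = \mu$. From these two equations, we can conclude that $K K^T a = \lambda^2 a$ and $K^T K b = \lambda^2 b$. Hence the vectors $a$ and $b$ are the eigenvectors of $K K^T$ and $K^T K$ with the largest eigenvalue, respectively. One can also consider them as the left and right singular vectors of the matrix $K$ with the largest singular value. Denoting these maximizers by $a_1$ and $b_1$, the first canonical vectors are then defined as $w_{x,1} = \Sigma_{xx}^{-1/2} a_1$ and $w_{y,1} = \Sigma_{yy}^{-1/2} b_1$.

We want the second canonical variates to be uncorrelated with the first ones, that is $w_{x,2}^T \Sigma_{xx} w_{x,1} = 0$ and $w_{y,2}^T \Sigma_{yy} w_{y,1} = 0$, which implies orthogonality between $a_1$ and $a_2$, and between $b_1$ and $b_2$. The new optimization problem for the second canonical vectors is defined as

$$\max_{a_2, b_2} a_2^T K b_2 \quad \text{such that} \quad a_2^T a_2 = b_2^T b_2 = 1 \quad \text{and} \quad a_2^T a_1 = b_2^T b_1 = 0.$$

We can use the singular value decomposition of the matrix $K$ to solve this problem:

$$K = U D V^T$$

where $U$ and $V$ are orthogonal matrices and $D$ is a (rectangular) diagonal matrix of singular values $\sigma_1 \ge \sigma_2 \ge \dots$ in decreasing order.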
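The recipe so far (whiten with the inverse square roots of the covariance matrices, take an SVD, map the leading singular vectors back) can be sketched numerically. This is only a minimal illustration with NumPy on synthetic data; the helper `inv_sqrt`, the variable names and the data-generating model are my assumptions for the example, not part of the referenced paper.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

# Synthetic data (assumption): X and Y share one latent signal z.
rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal((n, 1))
X = np.hstack([z, rng.standard_normal((n, 2))]) + 0.5 * rng.standard_normal((n, 3))
Y = np.hstack([rng.standard_normal((n, 1)), z]) + 0.5 * rng.standard_normal((n, 2))

# Center the columns and form the sample (cross-)covariance matrices.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
Sxx = Xc.T @ Xc / (n - 1)
Syy = Yc.T @ Yc / (n - 1)
Sxy = Xc.T @ Yc / (n - 1)

# Whitened cross-covariance and its SVD (singular values in decreasing order).
K = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
U, d, Vt = np.linalg.svd(K)

# Leading left/right singular vectors, mapped back to canonical vectors.
a1, b1 = U[:, 0], Vt[0, :]
wx1 = inv_sqrt(Sxx) @ a1
wy1 = inv_sqrt(Syy) @ b1

# The correlation of the first canonical variates equals the top singular value.
rho1 = np.corrcoef(Xc @ wx1, Yc @ wy1)[0, 1]
print(rho1, d[0])
```

The two printed numbers should agree up to floating-point error, which is a handy sanity check that the whitening and the back-transformation were applied consistently.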
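Using the singular value decomposition $K = U D V^T$, let’s define the transformations $\tilde{a} = U^T a_2$ and $\tilde{b} = V^T b_2$ so that the optimization problem can be written as

$$\max_{\tilde{a}, \tilde{b}} \tilde{a}^T D \tilde{b} \quad \text{such that} \quad \tilde{a}^T \tilde{a} = \tilde{b}^T \tilde{b} = 1 \quad \text{and} \quad e_1^T \tilde{a} = e_1^T \tilde{b} = 0,$$

where $e_i$ is the $i$-th standard basis vector and the last constraint follows from $a_1 = U e_1$ and $b_1 = V e_1$. With the first entries forced to zero, the objective is $\sum_{i \ge 2} \sigma_i \tilde{a}_i \tilde{b}_i \le \sigma_2$ by the Cauchy-Schwarz inequality, with equality at $\tilde{a} = \tilde{b} = e_2$, yielding the value of the second largest singular value of the matrix $K$ (Eckart-Young theorem). Therefore the optimum vectors $a_2 = U e_2$ and $b_2 = V e_2$ are the left and right singular vectors of $K$ with the second largest singular value.

I hope this derivation clarifies the concept of CCA. Please contact me if you have any questions or comments.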
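For readers who want to check the second (and later) canonical pairs numerically, here is a small sketch under the same conventions as the derivation. The function `cca`, the helper `inv_sqrt` and the synthetic data are hypothetical names and choices made for this example only.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs / np.sqrt(vals)) @ vecs.T

def cca(X, Y):
    """Return canonical coefficient matrices and canonical correlations via SVD."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Columns of U and rows of Vt are the left/right singular vectors of K.
    U, d, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    return Sxx_is @ U, Syy_is @ Vt.T, d

# Synthetic data (assumption): two shared latent signals, the second one noisier in Y.
rng = np.random.default_rng(1)
n = 5000
Z = rng.standard_normal((n, 2))
X = np.hstack([Z, rng.standard_normal((n, 2))]) + 0.3 * rng.standard_normal((n, 4))
Y = np.hstack([Z, rng.standard_normal((n, 1))]) + np.array([0.3, 1.0, 0.3]) * rng.standard_normal((n, 3))

Wx, Wy, d = cca(X, Y)

# Canonical variates: one column per canonical pair.
A = (X - X.mean(axis=0)) @ Wx
B = (Y - Y.mean(axis=0)) @ Wy

# Correlation matrix of all variates; rows/columns 0-2 are A, 3-5 are B.
C = np.corrcoef(A, B, rowvar=False)
print(np.round(C[:3, 3:], 3))  # cross block: diagonal holds the canonical correlations
print(np.round(d, 3))
```

The cross-correlation block should be diagonal, with the singular values of the whitened cross-covariance on its diagonal: the second pair is uncorrelated with the first, and its correlation is the second largest singular value.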