Maximize Correlation


Total Posts: 68
Joined: Nov 2009
Posted: 2019-10-15 12:06

This should be very simple, but somehow I could not work out a proper solution to the following problem.

It is known how to choose the weights w of a portfolio of stocks to minimize the portfolio's variance for an estimated covariance matrix S:

min_w (w' S w) s.t. A w <= b

Assume I'm interested in maximizing the correlation to some benchmark, whose weights are given by w_0. The first thing that comes to mind is to find the optimal active weights w_a := w-w_0, which solve this

min_w_a -(w_a' C w_a) s.t. A_a w <= b_a

Although this seems intuitively about right, I don't think this procedure yields the optimal solution to the problem of correlation maximization.
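To make the doubt concrete, here is a minimal sketch. It assumes the objective above is read as minimizing the active variance w_a' C w_a with C a covariance matrix (the reading a later reply takes). The 3-asset covariance matrix, the weights, and the helper names `quad`, `correlation`, and `active_variance` are all made up for illustration. It compares a small tilt away from the benchmark with a portfolio fully invested in a low-vol asset that is perfectly correlated with the benchmark.

```python
import math

def quad(u, S, v):
    """Bilinear form u' S v for plain Python lists."""
    return sum(u[i] * S[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def correlation(w, w0, S):
    """Correlation between portfolios w and w0 under covariance matrix S."""
    return quad(w, S, w0) / math.sqrt(quad(w, S, w) * quad(w0, S, w0))

def active_variance(w, w0, S):
    """Variance of the active position w - w0 (squared tracking error)."""
    a = [wi - w0i for wi, w0i in zip(w, w0)]
    return quad(a, S, a)

# Hypothetical universe: assets 1 and 2 are uncorrelated with 20% vol;
# asset 3 is built as 0.3 x the benchmark return (low vol, correlation 1).
S = [[0.040, 0.000, 0.0060],
     [0.000, 0.040, 0.0060],
     [0.006, 0.006, 0.0018]]
w0 = [0.5, 0.5, 0.0]          # benchmark
w_tilt = [0.7, 0.3, 0.0]      # small tilt away from the benchmark
w_lowvol = [0.0, 0.0, 1.0]    # fully invested in the low-vol asset

for name, w in [("tilt", w_tilt), ("lowvol", w_lowvol)]:
    print(name, round(active_variance(w, w0, S), 4), round(correlation(w, w0, S), 4))
```

In this toy setup the tilt has the smaller active variance while the low-vol portfolio has the higher correlation, so the two objectives can rank portfolios differently.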

I searched long and hard and was surprised not to find any papers addressing this simple problem.

Any pointers would be much appreciated.



Total Posts: 585
Joined: May 2006
Posted: 2019-10-15 15:45
Principal component analysis (PCA).

I could write a long post about PCA, but it is a pretty mainstream topic - better just google it.

"There is a SIX am?" -- Arthur


Total Posts: 64
Joined: Jun 2006
Posted: 2019-10-15 23:19
Am I missing something? If the universe of stocks in the benchmark is the same as your active portfolio universe, then the correlation-maximizing weights would simply be w_0. The correlation would be perfect. Obviously, w - w_0 is minimized when the two are equal.

If the two universes are disjoint, then I don't think you can solve it with the two covar matrices alone -- you would need mean vectors as well. If you have the historical returns matrix X from which your active sample covar matrix is estimated, as well as the historical returns vector y of the benchmark over the same period, then the disjoint case max corr solution is trivial:

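# X: T x N matrix of stock returns; y: length-T vector of benchmark returns
a <- cov(X)              # N x N sample covariance of the stocks
b <- cov(X, y)           # N x 1 covariances of each stock with the benchmark
w <- solve(a, b)[, 1]    # regression weights: solves a %*% w = b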


Total Posts: 68
Joined: Nov 2009
Posted: 2019-10-16 07:26
@schmitty - the universe of stocks in the benchmark and in the portfolio is the same. Yes, w_0 would be optimal, but it wouldn't satisfy the constraints, i.e. A w_0 <= b wouldn't hold.

In my case I want to construct a portfolio consisting of low vol stocks which has the highest possible correlation to the benchmark. So I would put my vol estimates in A, and my upper bounds in b and optimize.

@ronin - thank you for your reply. I am familiar with PCA, but I can't quite imagine how I would use it in this case. Do you mind expanding a bit? How would I even factor in my constraints?


Total Posts: 585
Joined: May 2006
Posted: 2019-10-16 13:55
The rough-and-ready way is to create the principal components of your portfolio and project the benchmark on the first few PCs (i.e. largest eigenvalue, second largest eigenvalue etc).

The PCs are orthogonal, and the eigenvalue corresponding to each PC is its variance. The idea is that the PCs with the largest variance (eigenvalue) contribute most to the variance of the index, so you should be able to get good accuracy with only a handful of components.

The more convoluted way is to throw your benchmark into the portfolio, and look at the PCs of this enlarged portfolio. Now you want those with the smallest eigenvalue (variance). If there is an exact replication, there will be a PC with zero variance and non-zero weight of your benchmark. In the real world, that probably won't be the case - you'll have to use a few PCs, but again hopefully not too many.

Constraints - the eigenvectors will have unit norm. Most of the time, that is your constraint.

If you have a serious problem hitting the constraints - e.g., your eigenvectors have negative component weights and the benchmark is supposed to be long only - then you have to make some decisions at any rate. Maybe the replication you want is impossible, maybe you are happy enough with some projection. But that is a decision for you, not some calculation routine.
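A minimal sketch of the rough-and-ready approach, under stated assumptions: the returns are synthetic (one common factor plus noise), the benchmark is an equal-weight basket of the same stocks, and the function name `pca_replication` is made up for illustration. It regresses the benchmark on the first n principal components and maps the fitted coefficients back to stock weights. It does not enforce long-only or any other constraint.

```python
import numpy as np

def pca_replication(X, y, n_components):
    """Regress benchmark returns y on the first n principal components of X.

    X: (T, N) stock returns, y: (T,) benchmark returns.
    Returns stock weights (N,) and the in-sample correlation with y.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Eigen-decomposition of the sample covariance; eigh returns ascending order.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    top = eigvecs[:, ::-1][:, :n_components]      # largest-variance PCs first
    scores = Xc @ top                             # (T, n) PC return series
    beta, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    w = top @ beta                                # back to stock weights
    corr = np.corrcoef(Xc @ w, yc)[0, 1]
    return w, corr

# Synthetic data for illustration only: one common factor plus noise.
rng = np.random.default_rng(0)
T, N = 500, 8
market = rng.normal(0, 0.01, size=T)
X = market[:, None] * rng.uniform(0.5, 1.5, size=N) + rng.normal(0, 0.005, size=(T, N))
w0 = np.full(N, 1.0 / N)
y = X @ w0                                        # benchmark holds every stock

for n in (1, 3, N):
    w, corr = pca_replication(X, y, n)
    print(n, round(corr, 4))
```

The in-sample correlation cannot decrease as n grows, and with all N components the regression recovers the benchmark weights, which is why the interesting question is how few components are needed.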



Total Posts: 1
Joined: Aug 2014
Posted: 2019-10-16 17:42
If I'm understanding you correctly, your goal can be understood as trying to reconstruct the benchmark using a set of selected stocks. This can be understood as a regression problem and can be accomplished by regressing the excess return (over the riskfree rate) of the benchmark against the excess return (over the riskfree rate) of the selected stocks. If there are additional constraints, such as no shorting, you can use a Quadratic Programming solver such as OSQP.
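As a rough illustration of the constrained regression, the sketch below swaps in SciPy's non-negative least squares routine `scipy.optimize.nnls` rather than OSQP, since no shorting is the only constraint it handles. The data are synthetic stand-ins for excess returns, and the "true" weights are invented so the result can be checked. A full-investment constraint is not imposed here.

```python
import numpy as np
from scipy.optimize import nnls

# Synthetic illustration: the benchmark is a long-only mix of 3 of the 5 stocks.
rng = np.random.default_rng(1)
T, N = 250, 5
X = rng.normal(0.0, 0.01, size=(T, N))       # stand-in for stock excess returns
w_true = np.array([0.5, 0.0, 0.3, 0.0, 0.2])
y = X @ w_true                               # stand-in for benchmark excess returns

# Least squares with w >= 0 (no shorting); rnorm is the residual 2-norm.
w, rnorm = nnls(X, y)
print(np.round(w, 4), rnorm)
```

Because the synthetic benchmark is an exact long-only combination of the stocks, the fit should reproduce `w_true`; with real data the residual is the replication error, and further linear constraints would call for a general QP solver.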

As an aside, examining the closed-form solutions of the mean-variance optimal portfolio and the closed-form solution of linear regression reveals that the mean-variance portfolio is simply the portfolio that best reconstructs an excess return of "1". The minimum-variance portfolio best reconstructs an excess return of "0" with the constraint that all weights must sum to 1. It can be implemented by attempting to reconstruct one stock using the rest, flipping the weights to "cancel out" that stock, and then rescaling. See "A regression representation" [1].

Edit: mean-variance optimization as linear regression only works if you form the covariance matrix using the riskfree rate as the mean rather than the sample mean, and minimum-variance optimization as linear regression uses return differences. See the equation in [1] for the precise formula.
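A small sketch of the minimum-variance case, following the "return differences" version mentioned in the Edit. Whether it matches the exact formula in [1] is an assumption, since [1] is not reproduced here. The function names and the synthetic returns are made up for illustration. The last stock's return is regressed on an intercept and the differences r_N - r_i; the slopes are read as the weights of the other stocks, and the last weight is whatever makes them sum to 1.

```python
import numpy as np

def min_variance_by_regression(R):
    """Global minimum-variance weights from an OLS regression on return differences.

    R: (T, N) returns. Regress the last stock's return on an intercept and the
    differences r_N - r_i; the slopes are w_1..w_{N-1}, and w_N = 1 - sum(slopes).
    """
    r_last = R[:, -1]
    D = r_last[:, None] - R[:, :-1]                   # (T, N-1) return differences
    design = np.column_stack([np.ones(len(R)), D])    # intercept + differences
    coef, *_ = np.linalg.lstsq(design, r_last, rcond=None)
    slopes = coef[1:]
    return np.append(slopes, 1.0 - slopes.sum())

def min_variance_closed_form(R):
    """Textbook solution S^{-1} 1 / (1' S^{-1} 1) using the sample covariance."""
    S = np.cov(R, rowvar=False)
    x = np.linalg.solve(S, np.ones(S.shape[0]))
    return x / x.sum()

rng = np.random.default_rng(2)
R = rng.normal(0.0, 0.01, size=(300, 4)) * np.array([1.0, 1.5, 2.0, 0.8])
print(np.round(min_variance_by_regression(R), 4))
print(np.round(min_variance_closed_form(R), 4))
```

The two printed weight vectors should agree, because the regression residual is the return of a portfolio whose weights sum to 1, and OLS with an intercept minimizes its sample variance.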



Total Posts: 5
Joined: Aug 2009
Posted: 2019-10-16 19:10
I tried to do this in the past with stepwise regression to find dispersion baskets. In the end I wondered whether to target index return or index return squared.


Total Posts: 6
Joined: May 2017
Posted: 2019-10-16 21:25
I think your stated goal of "maximizing correlation" is probably not exactly what you want. For example, if b > 0, you can set w = epsilon*w_0 for some tiny epsilon and achieve a correlation of 1, which is clearly optimal but I assume not what you are looking for in a solution.

Your original suggestion of minimizing the active vol is probably a better criterion, and I think it is the more standard "constrained replication" approach. As another way to understand this, consider minimizing the active vol relative to k*w_0 for all positive k. There's a one-to-one equivalence between this family of problems and the family of problems

max_w w' C w_0 s.t. w' C w <= s^2 and A w <= b

as we vary the risk target s.

In this form, the objective maximizes covariance with w_0, and the risk constraint should hold with equality at the optimum (except in degenerate cases, e.g. where A spans all of w_0's risk), so you are fixing the vol to equal s; among portfolios with volatility s, the one with maximum covariance with w_0 also maximizes the correlation. So you can think of this as finding the most correlated portfolio as a function of the level of risk desired, and another natural choice might be s^2 = var(w_0).
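The fixed-vol argument can be checked numerically. The sketch below uses a hypothetical 3-asset covariance matrix and benchmark, and made-up helper names `quad` and `scale_to_vol`. It rescales a few candidate portfolios to the same volatility s and shows that covariance and correlation with w_0 then differ only by the constant factor s * vol(w_0). The linear constraints A w <= b are ignored, and rescaling does not preserve full investment.

```python
import math

def quad(u, S, v):
    """Bilinear form u' S v for plain Python lists."""
    return sum(u[i] * S[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def scale_to_vol(w, S, s):
    """Rescale w so that its volatility sqrt(w' S w) equals s."""
    k = s / math.sqrt(quad(w, S, w))
    return [k * wi for wi in w]

# Hypothetical covariance matrix and benchmark, for illustration only.
C = [[0.0400, 0.0120, 0.0060],
     [0.0120, 0.0900, 0.0090],
     [0.0060, 0.0090, 0.0225]]
w0 = [0.4, 0.4, 0.2]
vol0 = math.sqrt(quad(w0, C, w0))
s = 0.8 * vol0                                   # risk target

candidates = {"benchmark": w0, "asset1": [1, 0, 0], "asset3": [0, 0, 1]}
results = {}
for name, w in candidates.items():
    ws = scale_to_vol(w, C, s)
    cov = quad(ws, C, w0)
    corr = cov / (s * vol0)                      # same divisor for every candidate
    results[name] = (cov, corr)
    print(name, round(cov, 5), round(corr, 4))
```

Without further constraints the scaled benchmark itself is the maximizer, with correlation 1; the constraints are what make the problem non-trivial.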


Total Posts: 68
Joined: Nov 2009
Posted: 2019-10-17 07:57
Thank you all for your suggestions.

Maybe it would help, if I restated the problem:

Find the portfolio of stocks that has the maximum correlation to the benchmark at 80% of the benchmark's volatility (all of which is measured by some covariance matrix estimate S) AND satisfies the full investment and long only constraints.

@ mktmkr - how would you formulate a quadratic program that solves the above problem?

@ xamuh - I honestly don't know how this is relevant. Do you mind expanding?

@ ETwode - just reducing the benchmark weight would breach the full investment constraint. Assuming C stands for covariance in your suggested QP, it is a formulation of tracking error minimization. That is precisely what I do NOT want to do. The point is that, by reducing the risk target, I increase the TE. The optimizer would prefer stocks whose vol is higher than the target (closer to the benchmark) with low correlation over stocks which are perfectly correlated and have a very low volatility (further away from the benchmark). I would like to prevent this.

@ ronin - OK, so I would run a PCA on the stock returns, scaled by the benchmark's weights, pick the first n PC's and run a regression of the benchmark against those n PC's (projection). Choose n, such that sum of explained variance is at (80%)^2, which corresponds to my risk target. Is this about right so far?


Total Posts: 585
Joined: May 2006
Posted: 2019-10-17 09:39
> @ ronin - OK, so I would run a PCA on the stock returns, scaled by the benchmark's weights, pick the first n PC's and run a regression of the benchmark against those n PC's (projection). Choose n, such that sum of explained variance is at (80%)^2, which corresponds to my risk target. Is this about right so far?

So far so good. Not sure what you mean by "stock returns, scaled by the benchmark's weights". PCA on the plain old stock returns, regress the benchmark against first n PCs, if not sufficiently close increase n and regress again.



Total Posts: 68
Joined: Nov 2009
Posted: 2019-10-17 11:17
@ ronin - OK, I see, thanks for the reply. Forget the scaling part, that's me being stupid.

It's an interesting approach. It doesn't ensure optimality however. I could just as well solve the QP using the correlation matrix instead of the covariance and put my constraints directly in the QP. I don't see the benefit of doing a PCA...
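
For reference, one way to put the constraints directly into an optimizer is sketched below. It is not a definitive formulation. It assumes the restated problem is written as: maximize covariance with the benchmark, subject to full investment, long only, and volatility at most 80% of the benchmark's. The volatility cap is quadratic, so this is strictly a quadratically constrained problem rather than a plain QP, and the sketch uses SciPy's general-purpose SLSQP method instead of a dedicated QP solver. The vols, correlation, benchmark weights, and starting point are all invented.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs: 4 stocks, constant correlation 0.5, made-up vols and weights.
vols = np.array([0.10, 0.15, 0.25, 0.35])
S = 0.5 * np.outer(vols, vols) + 0.5 * np.diag(vols**2)
w0 = np.array([0.1, 0.2, 0.3, 0.4])              # benchmark weights
var0 = w0 @ S @ w0
target_var = (0.8**2) * var0                     # 80% of benchmark volatility

res = minimize(
    lambda w: -(w @ S @ w0) / var0,              # maximise covariance with benchmark
    x0=np.array([0.7, 0.3, 0.0, 0.0]),           # feasible starting point
    method="SLSQP",
    bounds=[(0.0, 1.0)] * len(w0),               # long only
    constraints=[
        {"type": "eq", "fun": lambda w: w.sum() - 1.0},                      # fully invested
        {"type": "ineq", "fun": lambda w: 1.0 - (w @ S @ w) / target_var},   # vol cap
    ],
)
w = res.x
vol_ratio = np.sqrt(w @ S @ w / var0)
corr = (w @ S @ w0) / np.sqrt((w @ S @ w) * var0)
print(res.success, np.round(w, 4), round(vol_ratio, 4), round(corr, 4))
```

If the volatility cap binds at the solution, maximizing covariance and maximizing correlation coincide, as argued earlier in the thread; if it does not bind, the two objectives can differ and the correlation itself would have to be the objective.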