
yanko


Total Posts: 68 
Joined: Nov 2009 


Hi,
this should be very simple, but somehow I could not work out a proper solution to the following problem.
It is known how to choose the weights w of a portfolio of stocks to minimize the portfolio's variance for an estimated covariance matrix S:
min_w (w' S w) s.t. A w <= b
Assume I'm interested in maximizing the correlation to some benchmark, whose weights are given by w_0. The first thing that comes to mind is to find the optimal active weights w_a := w - w_0, which solve this:
min_w_a (w_a' C w_a) s.t. A_a w <= b_a
Although this seems intuitively about right, I don't think this procedure yields the optimal solution to the problem of correlation maximization.
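For reference, the two objectives can be written side by side. This is only a sketch: it assumes the portfolio and the benchmark share one universe, and that the matrix called C in the active-weight problem is the same covariance estimate S.

```latex
\rho(w, w_0) = \frac{w^\top S\, w_0}{\sqrt{w^\top S\, w}\,\sqrt{w_0^\top S\, w_0}},
\qquad
w_a^\top S\, w_a = w^\top S\, w - 2\, w^\top S\, w_0 + w_0^\top S\, w_0 .
```

The active variance depends on w'Sw as well as on the covariance term w'Sw_0, so minimizing it coincides with maximizing the correlation only when w'Sw is held fixed.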
I searched long and hard and was surprised not to find any papers addressing this simple problem.
Any pointers would be much appreciated.
Thanks, yanko 




ronin


Total Posts: 585 
Joined: May 2006 


Principal component analysis (PCA).
I could write a long post about PCA, but it is a pretty mainstream topic - better just google it. 
"There is a SIX am?" - Arthur 


schmitty


Total Posts: 64 
Joined: Jun 2006 


Am I missing something? If the universe of stocks in the benchmark is the same as your active portfolio universe, then the correlation-maximizing weights would simply be w_0. The correlation would be perfect. Obviously, w - w_0 is minimized when the two are equal.
If the two universes are disjoint, then I don't think you can solve it with the two covar matrices alone - you would need mean vectors as well. If you have the historical returns matrix X from which your active sample covar matrix is estimated, as well as the historical returns vector y of the benchmark over the same period, then the disjoint case max corr solution is trivial:
a <- cov(X)
b <- cov(X, y)
w <- solve(a, b)[, 1]
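The same calculation can be sketched in plain Python for two stocks. Everything here is illustrative: the helper names `cov` and `max_corr_weights` are made up, the return series are hypothetical, and the benchmark is deliberately built as a 60/40 mix of the two stocks so that the recovered weights can be checked by eye.

```python
from statistics import mean

def cov(u, v):
    """Sample covariance (n - 1 denominator) of two equal-length series."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def max_corr_weights(x1, x2, y):
    """Solve cov(X) w = cov(X, y) for two stocks using Cramer's rule."""
    a11, a12, a22 = cov(x1, x1), cov(x1, x2), cov(x2, x2)
    b1, b2 = cov(x1, y), cov(x2, y)
    det = a11 * a22 - a12 * a12
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Hypothetical return series, for illustration only
x1 = [0.01, -0.02, 0.03, 0.00, 0.02]
x2 = [0.00, 0.01, -0.01, 0.02, 0.01]
# Benchmark constructed as an exact 60/40 combination of the two stocks
y = [0.6 * a + 0.4 * b for a, b in zip(x1, x2)]

w = max_corr_weights(x1, x2, y)
print(w)  # close to (0.6, 0.4), up to floating-point error
```

Note that this is an unconstrained regression: the weights need not sum to one or be non-negative, and any positive multiple of w has the same correlation with y.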





yanko


Total Posts: 68 
Joined: Nov 2009 


@schmitty - the universe of stocks in the benchmark and in the portfolio is the same. Yes, w_0 would be optimal, but it wouldn't satisfy the constraints, i.e. A w_0 <= b wouldn't hold.
In my case I want to construct a portfolio consisting of low vol stocks which has the highest possible correlation to the benchmark. So I would put my vol estimates in A, and my upper bounds in b and optimize.
@ronin - thank you for your reply. I am familiar with PCA, but I can't quite imagine how I would use it in this case. Do you mind expanding a bit? How would I even factor in my constraints? 



ronin


Total Posts: 585 
Joined: May 2006 


The rough-and-ready way is to create the principal components of your portfolio and project the benchmark on the first few PCs (i.e. largest eigenvalue, second largest eigenvalue etc).
The PCs are orthogonal, and the eigenvalue corresponding to each PC is its variance. The idea is that the PCs with the largest variance (eigenvalue) contribute most to the variance of the index, so you should be able to get good accuracy with only a handful of components.
The more convoluted way is to throw your benchmark into the portfolio, and look at the PCs of this enlarged portfolio. Now you want those with the smallest eigenvalue (variance). If there is an exact replication, there will be a PC with zero variance and nonzero weight of your benchmark. In the real world, that probably won't be the case - you'll have to use a few PCs, but again hopefully not too many.
Constraints - the eigenvectors will have unit norm. Most of the time, that is your constraint.
If you have a serious problem hitting the constraints - e.g., your eigenvectors have negative component weights and the benchmark is supposed to be long only - then you have to make some decisions at any rate. Maybe the replication you want is impossible, maybe you are happy enough with some projection. But that is a decision for you, not some calculation routine.




mktmkr


Total Posts: 1 
Joined: Aug 2014 


If I'm understanding you correctly, your goal can be understood as trying to reconstruct the benchmark using a set of selected stocks. This can be understood as a regression problem and can be accomplished by regressing the excess return (over the risk-free rate) of the benchmark against the excess return (over the risk-free rate) of the selected stocks. If there are additional constraints, such as no shorting, you can use a Quadratic Programming solver such as OSQP.
As an aside, examining the closed-form solutions of the mean-variance optimal portfolio and the closed-form solution of linear regression reveals that the mean-variance portfolio is simply the portfolio that best reconstructs an excess return of "1". The minimum-variance portfolio best reconstructs an excess return of "0" with the constraint that all weights must sum to 1. It can be implemented by attempting to reconstruct one stock using the rest, flipping the weights to "cancel out" that stock, and then rescaling. See "A regression representation" [1].
Edit: mean-variance optimization as linear regression only works if you form the covariance matrix using the risk-free rate as the mean rather than the sample mean, and minimum-variance optimization as linear regression uses return differences. See the equation in [1] for the precise formula.
[1] http://comisef.wikidot.com/tutorial:minimumvariance 
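As a rough sketch, the constrained regression can be put in the standard form a QP solver expects. The symbols are assumptions for illustration: X is the T x n matrix of excess returns of the selected stocks, y the T-vector of benchmark excess returns, and the constraints shown are full investment and no shorting.

```latex
\min_{w}\; \tfrac{1}{2}\, w^\top P\, w + q^\top w
\quad \text{s.t.} \quad \mathbf{1}^\top w = 1,\; w \ge 0,
\qquad P = X^\top X,\quad q = -X^\top y .
```

This objective equals (1/2)||Xw - y||^2 up to the constant (1/2)||y||^2, so it has the same minimizer as the least-squares fit under the same constraints.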



xamuh


Total Posts: 5 
Joined: Aug 2009 


I tried to do this in the past with stepwise regression to find dispersion baskets. In the end I wondered whether to target index return or index return squared. 




ETwode


Total Posts: 6 
Joined: May 2017 


I think your stated goal of "maximizing correlation" is probably not exactly what you want. For example, if b > 0, you can set w = epsilon*w_0 for some tiny epsilon and achieve a correlation of 1, which is clearly optimal but I assume not what you are looking for in a solution.
Your original suggestion of minimizing the active vol is probably a better criterion and I think is a more standard "constrained replication" approach. As another way to understand this, consider minimizing the active vol relative to k*w_0 for all positive k. There's a one-to-one equivalence between this family of problems and the problems { max_w w' C w_0 s.t. w' C w <= s^2 and Aw <= b }, as we vary the risk target s.
In this form, the objective maximizes covariance with w_0, and the risk constraint should hold with equality at the optimum (except in degenerate cases, e.g. where A spans all of w_0's risk), so you are fixing the vol to equal s; among portfolios with volatility s, the one with maximum covariance with w_0 also maximizes the correlation. So you can think of this as finding the most correlated portfolio as a function of the level of risk desired, and another natural choice might be s^2 = var(w_0). 
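The scale-invariance point is easy to check numerically. The following is a minimal sketch with a made-up two-asset covariance matrix C and made-up benchmark weights; `quad` and `corr` are hypothetical helper names.

```python
from math import sqrt

def quad(u, C, v):
    """Bilinear form u' C v for a small dense matrix stored as nested lists."""
    return sum(u[i] * C[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def corr(w, w0, C):
    """Correlation between portfolios w and w0 under covariance matrix C."""
    return quad(w, C, w0) / sqrt(quad(w, C, w) * quad(w0, C, w0))

# Hypothetical covariance matrix and benchmark weights, for illustration only
C = [[0.04, 0.01], [0.01, 0.09]]
w0 = [0.5, 0.5]

eps = 1e-3
w_tiny = [eps * x for x in w0]      # benchmark scaled down by epsilon
rho_tiny = corr(w_tiny, w0, C)      # epsilon cancels in the ratio
rho_other = corr([1.0, 0.0], w0, C) # a portfolio that is not a multiple of w0

print(rho_tiny, rho_other)
```

Because epsilon drops out of the ratio, the correlation alone does not pin down the scale of w; a risk target or a full-investment constraint is needed for that.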



yanko


Total Posts: 68 
Joined: Nov 2009 


Thank you all for your suggestions.
Maybe it would help, if I restated the problem:
Find the portfolio of stocks, which has the maximum correlation to the benchmark at 80% of the benchmark's volatility (all of which is measured by some covariance matrix estimate S) AND satisfies the full investment and long only constraints.
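Written out, the restated problem might look as follows. This is a sketch under stated assumptions: sigma_0^2 := w_0' S w_0 is shorthand for the benchmark variance and 1 denotes a vector of ones; neither symbol is used elsewhere in the thread.

```latex
\max_{w}\; \frac{w^\top S\, w_0}{\sqrt{w^\top S\, w}\;\sigma_0}
\quad \text{s.t.} \quad
w^\top S\, w = (0.8)^2\, \sigma_0^2,\qquad
\mathbf{1}^\top w = 1,\qquad
w \ge 0 .
```

With the variance pinned, the denominator is constant, so this is the same as maximizing w'Sw_0. The equality constraint on the variance is not convex; relaxing it to w'Sw <= (0.8)^2 sigma_0^2 gives a convex problem with the same solution whenever the risk constraint binds at the optimum, which is the argument ETwode made above.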
@ mktmkr - how would you formulate a quadratic program, which solves the above problem?
@ xamuh - I honestly don't know how this is relevant. Do you mind expanding?
@ ETwode - just reducing the benchmark weight would breach the full investment constraint. Assuming C stands for covariance in your suggested QP, it is a formulation of tracking error minimization. That is precisely what I do NOT want to do. The point is that, by reducing the risk target, I increase the TE. The optimizer would prefer stocks, whose vol is higher than the target (closer to the benchmark) with low correlation, over stocks which are perfectly correlated and have a very low volatility (further away from the benchmark). I would like to prevent this.
@ ronin - OK, so I would run a PCA on the stock returns, scaled by the benchmark's weights, pick the first n PC's and run a regression of the benchmark against those n PC's (projection). Choose n, such that sum of explained variance is at (80%)^2, which corresponds to my risk target. Is this about right so far? 




ronin


Total Posts: 585 
Joined: May 2006 


> @ ronin - OK, so I would run a PCA on the stock returns, scaled by the benchmark's weights, pick the first n PC's and run a regression of the benchmark against those n PC's (projection). Choose n, such that sum of explained variance is at (80%)^2, which corresponds to my risk target. Is this about right so far?
So far so good. Not sure what you mean by "stock returns, scaled by the benchmark's weights". PCA on the plain old stock returns, regress the benchmark against first n PCs, if not sufficiently close increase n and regress again.



yanko


Total Posts: 68 
Joined: Nov 2009 


@ ronin - OK, I see, thanks for the reply. Forget the scaling part, that's me being stupid.
It's an interesting approach. It doesn't ensure optimality however. I could just as well solve the QP using the correlation matrix instead of the covariance and put my constraints directly in the QP. I don't see the benefit of doing a PCA... 








