Mutually orthogonal approximations?

lexx


Total Posts: 64
Joined: Apr 2006
 
Posted: 2020-01-17 18:52
Hi!
I was wondering if anyone knows a solution to the following problem:
Given two vectors X* and Y*, I want to create approximations to both of them, minimizing the mean square errors, such that they are orthogonal to each other at the same time:

min ||X* - X||^2 + ||Y* - Y||^2

subject to X'Y = 0

A solution definitely exists, since I can regress one against the other and use the residual,
but that's not exactly the same as the problem above.

kuebiko


Total Posts: 39
Joined: May 2018
 
Posted: 2020-01-17 20:34
I think the problem is not convex.

You are jointly solving for x and y... concatenate them into a single solution vector [x;y]. Without the constraint, you have two uncoupled trivial quadratic programs:

min_{x,y} 0.5*[x;y]’*[I, 0; 0, I]*[x;y] - [x*;y*]’*[x;y]

(in pseudo matlab notation)

But the orthogonality constraint is quadratic:

[x;y]'*[0, I; I, 0]*[x;y] = 0    (this quadratic form equals 2*x'*y)

and so you have a quadratically-constrained quadratic program.

For the constraint to be convex, the matrix appearing in it would need to be positive semidefinite, but [0, I; I, 0] is indefinite (its eigenvalues are +/-1), so the problem is non-convex. In the convex case you could use e.g. conic programming.
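
A minimal numerical sketch of this setup (just an illustration, assuming numpy/scipy; on a non-convex QCQP a solver like SLSQP only guarantees a local optimum):

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x_star, y_star = rng.normal(size=5), rng.normal(size=5)
n = x_star.size

def objective(z):
    # ||x* - x||^2 + ||y* - y||^2 with z = [x; y]
    x, y = z[:n], z[n:]
    return np.sum((x_star - x) ** 2) + np.sum((y_star - y) ** 2)

def orthogonality(z):
    # equality constraint x'y = 0, i.e. the indefinite quadratic form above
    return z[:n] @ z[n:]

res = minimize(objective, np.concatenate([x_star, y_star]), method="SLSQP",
               constraints=[{"type": "eq", "fun": orthogonality}])
x_hat, y_hat = res.x[:n], res.x[n:]
print(res.fun, x_hat @ y_hat)  # local optimum; inner product should be ~0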

lexx


Total Posts: 64
Joined: Apr 2006
 
Posted: 2020-01-17 21:32
Thanks a lot! Interesting take.
Didn't think about it this way.
I will then simplify the problem by adding the orthogonality constraint as a penalty.
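
A minimal sketch of that penalty idea (just an illustration, assuming numpy/scipy): replace the hard constraint with a term lam * (X'Y)^2 and increase lam, warm-starting each solve.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x_star, y_star = rng.normal(size=4), rng.normal(size=4)
n = x_star.size

def penalized(z, lam):
    # fit term plus soft orthogonality penalty lam * (x'y)^2
    x, y = z[:n], z[n:]
    fit = np.sum((x_star - x) ** 2) + np.sum((y_star - y) ** 2)
    return fit + lam * (x @ y) ** 2

z = np.concatenate([x_star, y_star])     # start at the unconstrained optimum
for lam in [1.0, 10.0, 100.0, 1000.0]:   # tighten the penalty gradually
    z = minimize(penalized, z, args=(lam,)).x
print(z[:n] @ z[n:])  # residual non-orthogonality shrinks as lam grows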

nikol


Total Posts: 993
Joined: Jun 2005
 
Posted: 2020-01-18 00:16
(X, Y) is defined as an orthogonal basis in the same plane as (X*, Y*).

Define matrix-C:
Cxx = (X*, X)/|X|^2
Cxy = (X*, Y)/|Y|^2
Cyx = (Y*, X)/|X|^2
Cyy = (Y*, Y)/|Y|^2

(.,.) - scalar (inner) product

It must be that
X* = Cxx.X + Cxy.Y
Y* = Cyx.X + Cyy.Y
ADDED ===
or
Z* = C.Z, Z = (X,Y)
===

You can redefine (X, Y) in terms of (X*, Y*), which gives matrix M
ADDED ===
X = Mxx.X* + Mxy.Y*
Y = Myx.X* + Myy.Y*
or
Z = inv(C).Z* = M.Z*
===

Hence, you want to find min Q(M), where
Q(M) = ((Mxx-1).X* + Mxy.Y* )^2 + (Myx.X* + (Myy-1).Y*)^2

So it is just about finding 4 numbers from 4 equations:

dQ/dMij = 0, where i and j = {x, y}

The key is that it is a 2-dim problem in (x, y)-space.
If (X*, Y*) = 0 (already orthogonal), then (X, Y) = (X*, Y*).
For collinear vectors, X* = k.Y*, there is no solution.

kuebiko


Total Posts: 39
Joined: May 2018
 
Posted: 2020-01-18 14:14
Nikol, that’s a really nice way of looking at it, but I think there’s still something undetermined about the system. If you write out your four derivative equations, you’ll find that only two are unique. The orthogonality requirement gives you a third equation, but I think we’re still missing a fourth. Unless I’m misunderstanding something (very possible)...

nikol


Total Posts: 993
Joined: Jun 2005
 
Posted: 2020-01-18 17:31
> you’ll find that only two are unique

Nope, I don't share this conclusion. The set of dQ/dM equations is not degenerate. Repeat your calculations. I did it on the back of an envelope, just too lazy to type it in here.

(above, added some more lines)

kuebiko


Total Posts: 39
Joined: May 2018
 
Posted: 2020-01-18 19:49
You're right, I made a mistake and didn't say what I meant to say. The system is indeed full rank. But what I mean is, I think the solution it gives is trivial. Have you tried solving the system for arbitrary vectors x* and y*? You get the identity solution for M unless you incorporate the orthogonality constraint.

Either way, the orthogonality condition has to be incorporated somehow, which it is not if we only consider dQ/dM (again, unless I'm missing something).
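
To make that concrete, here is a quick numerical check (just an illustration, assuming numpy/scipy): minimizing Q(M) alone, with no orthogonality condition, returns M = I for generic, linearly independent X*, Y*.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x_star, y_star = rng.normal(size=3), rng.normal(size=3)

def Q(m):
    # nikol's Q(M) with M flattened as [Mxx, Mxy, Myx, Myy]
    mxx, mxy, myx, myy = m
    x = mxx * x_star + mxy * y_star
    y = myx * x_star + myy * y_star
    return np.sum((x - x_star) ** 2) + np.sum((y - y_star) ** 2)

print(np.round(minimize(Q, np.zeros(4)).x, 6))  # ~[1, 0, 0, 1], i.e. M = I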

And we can see the orthogonality condition is nonlinear. You write

Z = M.Z*

So if we require Z'Z = 0 (inner product vanishes), then we need M to satisfy

Z*'M'MZ* = 0


nikol


Total Posts: 993
Joined: Jun 2005
 
Posted: 2020-01-18 20:58
I admit, my notation in ADDED is a bit misleading. It was to illustrate the solution path.

From your Z'Z = 0 (with my Z=(X,Y)) you have (X,Y)'(X,Y) = X^2 - Y^2 = 0, i.e.
X^2 = Y^2, which is not what I meant.

By definition, it must be that X.Y = 0
and if
X = Mxx.X* + Mxy.Y*
Y = Myx.X* + Myy.Y*
then
(Mxx.X* + Mxy.Y*)'.(Myx.X* + Myy.Y*) = 0

> Z*'M'MZ* = 0

Honestly, I don't see where you're trying to lead the solution. It has a simple intuitive picture:

Any two non-collinear vectors (X*, Y*) define a 2D plane, which can have any basis (X, Y) (up to rotation). The question was then to find such X, Y that the sum of the distances between (X, X*) and (Y, Y*) is minimal. What I did was assume an arbitrary solution (X, Y), project (X*, Y*) onto that basis with C, then invert it into M and express (X, Y) as a function of (X*, Y*), i.e. (X(X*, Y*), Y(X*, Y*)). After that I am left with M only.

kuebiko


Total Posts: 39
Joined: May 2018
 
Posted: 2020-01-18 21:35
My point in writing Z*'M'MZ* = 0 is that this isn't a purely linear system for the entries of M, because there is a nonlinear constraint on M. When jointly solving for X and Y, the problem is non-convex (see my first reply). I understand the geometric intuition but I don't see how you can solve this with your approach. Can you give your numerical solution for a simple 2D example? Say X*=[1,0] and Y*=[1,1], so they are not orthogonal: what do you get for the entries of M?

nikol


Total Posts: 993
Joined: Jun 2005
 
Posted: 2020-01-18 22:36
Yep, you're right.
I get Mxx=Myy=1, Mxy=Myx=0 and, therefore, Z=Z* :(

ADDED: I understood all you said about the linear system. Indeed, all my writing about the C-matrix was left out of the discussion and, therefore, the orthogonality condition is not included.

lexx


Total Posts: 64
Joined: Apr 2006
 
Posted: 2020-01-22 00:23
nikol and kuebiko - many thanks for the interesting discussion, I now see the problem from a different perspective.

ronin


Total Posts: 550
Joined: May 2006
 
Posted: 2020-01-22 10:47
> Solution definitely exists, since I can regress one against the other and use residual,
> but it's not exactly the same as above


I think it is the same as above. Wlog set X*=0.

Then your problem is min X^2 + (Y*-Y)^2 subject to X.Y=0. It's just an orthogonal projection.

"There is a SIX am?" -- Arthur

frolloos


Total Posts: 104
Joined: Dec 2007
 
Posted: 2020-01-22 13:35
Why can't you use Gram-Schmidt orthogonalisation? I am likely missing something here, but if the hard constraint is that they are orthogonal then you can use Gram-Schmidt, can't you? Whether the mean square error is minimised I don't know, and I am not sure you can do both at once.

https://en.wikipedia.org/wiki/Gram–Schmidt_process

One man's Theta is another man's Gamma - Me


nikol


Total Posts: 993
Joined: Jun 2005
 
Posted: 2020-01-22 21:49
The problem with G-S is that u1 = v1, i.e. the first (arbitrarily chosen) vector v1 is carried over unchanged into the generated orthogonal set {u}, and therefore u1 - v1 = 0.

As far as I understood the initial problem, lexx wants the {u}'s to be equidistant from the {v}'s.
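
A tiny illustration of that asymmetry (not from the post, assuming numpy), using kuebiko's example vectors: the first vector gets zero error and the second absorbs all of it.

import numpy as np

v1 = np.array([1.0, 0.0])   # X* from the example above
v2 = np.array([1.0, 1.0])   # Y*

u1 = v1                                    # Gram-Schmidt keeps the first vector
u2 = v2 - (v2 @ u1) / (u1 @ u1) * u1       # residual of v2 on u1 -> [0, 1]
print(np.sum((u1 - v1) ** 2), np.sum((u2 - v2) ** 2))  # 0.0 vs 1.0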

kuebiko


Total Posts: 39
Joined: May 2018
 
Posted: 2020-01-23 11:07
So it’s definitely bi-convex, meaning you can fix y and solve for x, then fix x and solve for y, and repeat in alternating fashion until convergence. But in general bi-convexity doesn’t guarantee a global optimum. So you’d have to play around with it and see if it’s sensitive to your choice of initialization. This is a simple problem so somebody who knows more about this may know whether or not there’s a global solution here.
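
A sketch of that alternating scheme (just an illustration, assuming numpy): for fixed y, the best x is simply x* projected onto the orthogonal complement of y, and vice versa. Note that starting from y = y* it converges straight to the one-sided residual answer, which is exactly the initialization sensitivity mentioned above.

import numpy as np

x_star = np.array([1.0, 0.0])   # kuebiko's example again
y_star = np.array([1.0, 1.0])

def project_out(a, b):
    # minimizer of ||a - v||^2 subject to v'b = 0: remove a's component along b
    return a - (a @ b) / (b @ b) * b

x, y = x_star.copy(), y_star.copy()
for _ in range(50):
    x = project_out(x_star, y)   # fix y, solve for x
    y = project_out(y_star, x)   # fix x, solve for y
print(x, y, x @ y)  # [0.5, -0.5], [1, 1]: the one-sided residual solution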

ETwode


Total Posts: 5
Joined: May 2017
 
Posted: 2020-01-23 16:21
I could be wrong but I think it isn't quite as simple as orthogonal projection. Consider the case in 2D where X* and Y* are the standard basis vectors rotated inward toward each other by the same angle. We expect the solution is to symmetrically push them both back outward (projecting onto the axes), and one can show that, e.g., regressing X* on Y* and taking X to be the residual gives a larger error than this. Every case where ||X*|| = ||Y*|| reduces to this situation.

Manipulating the KKT conditions suggests that the general solution is:
X = (1-u^2)^-1 [X* - uY*], Y = (1-u^2)^-1 [Y* - uX*], with
u = (s - sqrt(s^2 - 4r^2)) / (2r), s = ||X*||^2 + ||Y*||^2, r = X*'Y*
which passes some quick/basic sanity checks, but I haven't convinced myself the reasoning is 100% correct.
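
A quick numerical check of this (just an illustration, assuming numpy), on kuebiko's X* = [1, 0], Y* = [1, 1]: the closed form satisfies the constraint and beats the one-sided residual approach.

import numpy as np

x_star, y_star = np.array([1.0, 0.0]), np.array([1.0, 1.0])

s = x_star @ x_star + y_star @ y_star          # s = ||X*||^2 + ||Y*||^2 = 3
r = x_star @ y_star                            # r = X*'Y* = 1
u = (s - np.sqrt(s ** 2 - 4 * r ** 2)) / (2 * r)
x = (x_star - u * y_star) / (1 - u ** 2)       # ~[0.724, -0.447]
y = (y_star - u * x_star) / (1 - u ** 2)       # ~[0.724,  1.171]

err_kkt = np.sum((x - x_star) ** 2) + np.sum((y - y_star) ** 2)
x_res = x_star - r / (y_star @ y_star) * y_star    # residual of X* on Y*, keep Y*
err_res = np.sum((x_res - x_star) ** 2)
print(x @ y, err_kkt, err_res)   # ~0, ~0.382 vs 0.5: the symmetric solution wins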

lexx


Total Posts: 64
Joined: Apr 2006
 
Posted: 2020-01-23 22:29
Wow! thanks for the discussion!
Didn't expect it

I wanted to share the initial motivation for the problem:

I have 2 factors - equity characteristics, say value and quality - and want to combine them based on how they perform.
They often become highly correlated, so I could get better results by orthogonalizing them, which I can do by regressing one against the other and using the residuals instead of the original factor. But the results depend on which factor is regressed and which is kept as is (I tried different factors): the residualized factor is dominated by the one kept as-is (ETwode's point regarding errors).
So the posed problem would avoid that and be "fair" to both factors - nikol's point about equidistance.
The G-S procedure, I believe, creates an orthogonal basis consisting of linear combinations of the vectors, if I'm not mistaken. I would like to avoid that, since interpretation becomes difficult.
Minimizing their respective errors somewhat preserves the "nature" (to some degree) of the approximating factors.