Summary
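This post walks through linear regression when the design matrix is rank deficient. The sample data are included in the attachment: x is a matrix whose columns represent random variables and whose rows represent observations, and y is an n-by-1 vector of observed responses (written b below).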
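The regression system can be written as:
a1·x1 + a2·x2 + … + a10·x10 = b
where xi (i = 1, 2, …, 10) and b are m-dimensional column vectors, and a1, a2, …, a10 are the unknowns to solve for.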
Fig. 1 shows the augmented matrix of the system: the 9×10 block on the left holds the xi, and column 11 is b.
Fig. 1
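Theorem cited:
A non-homogeneous linear system Ax=b in n unknowns has a solution if and only if the rank of the coefficient matrix A equals the rank of the augmented matrix B. When R(A)=R(B)=n the solution is unique, and when R(A)=R(B)=r<n there are infinitely many solutions ([6], p. 96).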
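The three cases of the theorem can be checked numerically by comparing rank(A) with rank([A b]). The following is a minimal MATLAB sketch on a toy 2-by-3 system; the matrix, the right-hand side, and the variable names (rA, rB, verdict) are illustrative assumptions, not the data of this post.

```matlab
% Toy system with 2 equations and 3 unknowns (illustrative numbers only).
A = [1 2 3;
     2 4 7];
b = [6; 13];

n  = size(A, 2);     % number of unknowns
rA = rank(A);        % R(A): rank of the coefficient matrix
rB = rank([A b]);    % R(B): rank of the augmented matrix

if rA ~= rB
    verdict = 'no solution';
elseif rA == n
    verdict = 'unique solution';
else
    verdict = 'infinitely many solutions';
end
fprintf('R(A) = %d, R(B) = %d, n = %d: %s\n', rA, rB, n, verdict);
```

Here R(A) = R(B) = 2 < 3, so the sketch reports infinitely many solutions.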
Method
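Solution: the rank of the coefficient matrix is R(xi)=9 and the rank of the augmented matrix is R(xi, b)=9<10,
so the system in a1, a2, …, a10 has infinitely many solutions.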
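One way to see why equal ranks below the number of unknowns give infinitely many solutions: any vector in the null space of the coefficient matrix can be added to one solution to produce another. Below is a minimal sketch, assuming random stand-in data of the same 9×10 shape because the attachment's values are not reproduced here; x, b, a0, and a1 are placeholder names.

```matlab
rng(0);                      % fixed seed so the stand-in data are reproducible
x = randn(9, 10);            % stand-in for the 9-by-10 coefficient matrix
b = randn(9, 1);             % stand-in for the right-hand side

rCoef = rank(x);             % rank of the coefficient matrix
rAug  = rank([x b]);         % rank of the augmented matrix

a0 = pinv(x) * b;            % one particular solution (the minimum-norm one)
z  = null(x);                % orthonormal basis of the null space of x
a1 = a0 + 2 * z(:, 1);       % shifting along the null space gives another solution

fprintf('rank(x) = %d, rank([x b]) = %d\n', rCoef, rAug);
fprintf('residual norms: %.2e and %.2e\n', norm(x*a0 - b), norm(x*a1 - b));
fprintf('distance between the two solutions: %.2f\n', norm(a1 - a0));
```

Both a0 and a1 satisfy the system up to rounding error even though they differ, and the factor 2 could be any real number.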
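Calling the regress function in MATLAB on this data returns "Warning: X is rank deficient to within machine precision.", which states explicitly that the coefficient matrix of the system is rank deficient. Attachment: prc.rar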
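% In this case, prepend a column of ones (the intercept term) to x
X = [ones(9,1) x];
% regress(y, X): the response vector here is b
[coefs,bint,r,rint,stats] = regress(b, X);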
coefs=
0
1.26061943597438
3.23820029692269
-5.86899781062611
0
0.394522181750229
-1.56586683726205
4.19512963015161
-2.15499211579584
-0.256449681084742
1.82364987066155
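This set of regression coefficients is one of the possible solutions of the system, and it minimizes the residual of the regression (regress essentially computes the least error solution such that sum of residuals of Y - X*B has the least amount of error.).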
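Because the solution is not unique, different solvers can return different coefficient vectors with the same residual. The sketch below assumes random stand-in data and uses the backslash operator and pinv in place of regress, so that no toolbox is needed. It compares a basic solution (at most rank(X) nonzero entries, similar in spirit to the zeros in coefs above) with the minimum-norm solution.

```matlab
rng(1);                        % reproducible stand-in data
x = randn(9, 10);              % stand-in for the predictor matrix
b = randn(9, 1);               % stand-in for the response vector
X = [ones(9, 1) x];            % 9-by-11 design matrix with an intercept column

aBasic   = X \ b;              % basic solution: at most rank(X) nonzero entries
aMinNorm = pinv(X) * b;        % minimum-norm solution

resBasic   = norm(b - X*aBasic);     % residual norm of the basic solution
resMinNorm = norm(b - X*aMinNorm);   % residual norm of the minimum-norm solution

fprintf('nonzero entries: basic %d, minimum-norm %d\n', nnz(aBasic), nnz(aMinNorm));
fprintf('residual norms:  basic %.1e, minimum-norm %.1e\n', resBasic, resMinNorm);
fprintf('solution norms:  basic %.3f, minimum-norm %.3f\n', norm(aBasic), norm(aMinNorm));
```

Both vectors fit the data equally well. Which one a routine returns is a convention of that routine, not a property of the data.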
In general, when the number of equations (9 in this example) is smaller than the number of unknowns (10), a unique solution cannot exist.
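Reference [7] is a very good example and is worth studying carefully. Its use of regress differs in one respect: the input matrix there does not include a constant term (the intercept term). I wrote to @rayryeng to ask about this, and he replied that the person asking the question had required it (didn't include the intercept term because that is specific to his problem) and that he had not pursued the matter at the time (I didn't question it because that was specific to his problem). I tried including a constant term in the input matrix and, following the procedure described there, obtained similar results. The validity of the method should therefore not depend on whether a constant term is included; both variants are correct.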
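The effect of the constant term on the rank argument can be checked directly. Below is a small sketch, again on random stand-in data and not the data from [7]; it compares the number of columns with the rank for the design matrix with and without the intercept column. The names designs, labels, deficit, and resid are placeholders.

```matlab
rng(2);                               % reproducible stand-in data
x = randn(9, 10);                     % stand-in for the predictor matrix
b = randn(9, 1);                      % stand-in for the response vector

designs = {x, [ones(9, 1) x]};        % without and with the intercept column
labels  = {'no intercept', 'with intercept'};
deficit = zeros(1, 2);                % number of unknowns minus rank
resid   = zeros(1, 2);                % residual norm of a least-squares fit

for k = 1:2
    X = designs{k};
    deficit(k) = size(X, 2) - rank(X);
    resid(k)   = norm(b - X * (X \ b));
    fprintf('%-14s: %2d columns, rank %d, deficit %d, residual %.1e\n', ...
            labels{k}, size(X, 2), rank(X), deficit(k), resid(k));
end
```

In both cases the rank is 9, which is below the number of unknowns, so the system stays underdetermined either way; the intercept column only raises the deficit from 1 to 2.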
References
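[1] Rank deficient for multiple regression.
[2] Covariance, from Wikipedia, the free encyclopedia.
[3] 浅谈协方差矩阵 (A brief discussion of the covariance matrix). If X and Y are independent, their covariance is necessarily zero, but the converse does not hold.
[4] 协方差与相关系数 (Covariance and the correlation coefficient).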
[5] A matrix is said to have full rank if its rank equals the largest possible for a matrix of the same dimensions, which is the lesser of the number of rows and columns. A matrix is said to be rank deficient if it does not have full rank. Rank (linear algebra), from Wikipedia, the free encyclopedia.
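[6] Department of Mathematics, Tongji University. 工程数学:线性代数 (Engineering Mathematics: Linear Algebra), 6th ed. Beijing: Higher Education Press, 2014.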
[7] Getting rank deficient warning when using regress function in MATLAB?
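To repost this article, please contact the original author for permission and state that it comes from Li Xu's (李旭) ScienceNet blog.
Link: https://wap.sciencenet.cn/blog-1148346-917270.html?mobile=1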