2021-7-25 19:29

### 2. Building the kinship matrix

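The kinship matrix is a relatedness matrix: it is built from the samples' SNP genotypes and holds the pairwise relatedness between samples. In genomic selection (GS) it is also called the G matrix.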
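To make the idea concrete, here is a minimal sketch of how a G matrix can be computed from SNP genotypes. It uses VanRaden's first method on a made-up 0/1/2 genotype matrix. That choice is an assumption for illustration only: the software used in this walkthrough may center or scale the matrix differently. The names `M`, `Z` and `G` are hypothetical.

```r
# Toy genotype matrix (hypothetical data): rows = samples, columns = SNPs,
# coded as the count of one allele (0, 1 or 2)
set.seed(1)
M <- matrix(sample(0:2, 5 * 200, replace = TRUE), nrow = 5,
            dimnames = list(paste0("S", 1:5), NULL))

p <- colMeans(M) / 2                  # allele frequency of each SNP
Z <- sweep(M, 2, 2 * p)               # center each SNP column by 2p

# VanRaden (method 1) style G matrix: ZZ' scaled by 2 * sum(p * (1 - p))
G <- tcrossprod(Z) / (2 * sum(p * (1 - p)))

print(round(G, 2))                    # samples x samples, symmetric
```

Because every SNP column is centered, each row of `G` sums to (numerically) zero, so the off-diagonal values are relatedness relative to the sample average rather than absolute probabilities.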

「Visualizing with R code:」

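library(data.table)

# Assumes `kinship` was already read with fread() from the exported kinship file,
# with the sample IDs in the first column (V1).
setDF(kinship)                     # convert the data.table to a data.frame in place
row.names(kinship) = kinship$V1    # use the sample IDs as row names
kinship$V1 = NULL                  # then drop the ID column
colnames(kinship) = row.names(kinship)
kinship = as.matrix(kinship)

heatmap(kinship)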

### 3. PCA and visualization

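PCA result 1: PCA scores

PCA result 2: eigenvalues, with the percentage and cumulative percentage of variance explained

PCA result 3: eigenvectors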
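The three results can be reproduced in R to see how they relate to each other. The following is a sketch using base R's `prcomp()` on a made-up genotype matrix; it is not the routine that produced the tables in this post, and the names `geno`, `scores`, `eig_tab` and `loadings` are hypothetical.

```r
set.seed(42)
# Toy genotype matrix (hypothetical data): 6 samples x 50 SNPs, coded 0/1/2
geno <- matrix(sample(0:2, 6 * 50, replace = TRUE), nrow = 6,
               dimnames = list(paste0("S", 1:6), paste0("snp", 1:50)))

pca <- prcomp(geno, center = TRUE, scale. = FALSE)

# Result 1: PCA scores (one row per sample, one column per PC)
scores <- pca$x

# Result 2: eigenvalues with percentage and cumulative percentage
eig <- pca$sdev^2
eig_tab <- data.frame(PC         = colnames(pca$x),
                      eigenvalue = eig,
                      percent    = 100 * eig / sum(eig),
                      cumulative = 100 * cumsum(eig) / sum(eig))

# Result 3: eigenvectors (one row per SNP, one column per PC)
loadings <- pca$rotation

print(eig_tab)
```

The `percent` column is what usually goes into the axis labels of a PCA plot, for example "PC1 (xx%)".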

「Plotting the PCA with R:」

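# PCA

## 2D-PCA
# Assumes `pca_re` holds the PCA scores, with columns named PC1, PC2, ...
library(ggplot2)
ggplot(pca_re, aes(x = PC1, y = PC2)) +
  geom_point(size = 2) +
  geom_hline(yintercept = 0) + # horizontal reference line at y = 0
  geom_vline(xintercept = 0) + # vertical reference line at x = 0
  theme_bw()

## 3D-PCA
# Columns 2:4 are assumed to be PC1 to PC3 (column 1 holding the sample IDs)
library(scatterplot3d)
scatterplot3d(pca_re[, 2:4],
  pch = 16, angle = 30,
  box = TRUE, type = "p",
  lty.hide = 2, lty.grid = 2)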

### 4. MDS and visualization

The MDS method performs classical multidimensional scaling as adapted from the R code for cmdscale(). An alternate name for this analysis is principal coordinate analysis. It produces results that are very similar to PCA (principal components analysis) but starts with a distance matrix and results in coordinate axes that are scaled differently. To use MDS, create a distance matrix from a genotype using Analysis/Distance Matrix in TASSEL or import a distance matrix. Then, select the distance matrix, choose Analysis/MDS, and enter the number of coordinate axes and associated eigenvalues to be reported.

MDS and PCA will handle missing data differently. PCA imputes missing data to the mean allele frequency. The distance matrix calculation computes pairwise distances between taxa, using only sites with non-missing data for each taxon in a pair. There is no theoretical reason to prefer one method over the other. The axes produced by either MDS or PCA can be used as covariates in GLM or MLM models to correct for population structure.
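The difference in missing-data handling is easier to see on a tiny example. The sketch below is written in R with made-up genotypes. The mean imputation follows the description above; the pairwise distance function is a simple stand-in chosen for illustration and is not TASSEL's exact distance formula. All names (`geno`, `imputed`, `pair_dist`, `D`) are hypothetical.

```r
# Toy genotypes coded 0/1/2 with missing calls (hypothetical data)
geno <- rbind(S1 = c(0, 2, NA, 1),
              S2 = c(0, NA, 2, 1),
              S3 = c(2, 0, 0, NA))

# PCA-style handling: replace each missing call with the mean of that site
imputed <- geno
for (j in seq_len(ncol(imputed))) {
  miss <- is.na(imputed[, j])
  imputed[miss, j] <- mean(imputed[, j], na.rm = TRUE)
}

# Distance-style handling: for each pair of samples, use only the sites
# that are non-missing in BOTH samples (illustrative distance, scaled to 0..1)
pair_dist <- function(a, b) {
  ok <- !is.na(a) & !is.na(b)
  mean(abs(a[ok] - b[ok])) / 2
}

n <- nrow(geno)
D <- matrix(0, n, n, dimnames = list(rownames(geno), rownames(geno)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- pair_dist(geno[i, ], geno[j, ])
  }
}

print(imputed)
print(D)
```

Note that S1 and S2 are compared on only two sites here, since each has a missing call where the other does not, whereas the imputed matrix keeps all four sites for every sample.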

「MDS analysis takes two steps:」

• First, build the Distance Matrix
• Then, run the MDS analysis
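The same two steps can be sketched in base R with `dist()` and `cmdscale()`. This is only an illustration on made-up genotypes: Euclidean distance is used as a stand-in for whatever distance the software computes, and the names `geno`, `d`, `mds` and `mds_demo` are hypothetical.

```r
set.seed(7)
# Toy genotype matrix (hypothetical data): 8 samples x 100 SNPs, coded 0/1/2
geno <- matrix(sample(0:2, 8 * 100, replace = TRUE), nrow = 8,
               dimnames = list(paste0("S", 1:8), NULL))

# Step 1: build the distance matrix (Euclidean here, as an assumption)
d <- dist(geno)

# Step 2: classical MDS, keeping 3 coordinate axes and the eigenvalues
mds <- cmdscale(d, k = 3, eig = TRUE)

# Collect the coordinates in a data frame with PC1..PC3 column names
mds_demo <- data.frame(Taxa = rownames(geno), mds$points)
colnames(mds_demo)[-1] <- paste0("PC", 1:3)

print(head(mds_demo))
print(mds$eig[1:3])   # eigenvalues associated with the reported axes
```

With Euclidean distances and no missing data, the MDS coordinates match the PCA scores up to the sign of each axis, which is why the two methods give such similar pictures.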

「Visualizing with R:」

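# MDS

## 2D-MDS
# Assumes `mds_re` holds the MDS coordinates, with columns named PC1, PC2, ...
library(ggplot2)
ggplot(mds_re, aes(x = PC1, y = PC2)) +
  geom_point(size = 2) +
  geom_hline(yintercept = 0) + # horizontal reference line at y = 0
  geom_vline(xintercept = 0) + # vertical reference line at x = 0
  theme_bw()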