[R Language Study Notes] An Unexpected Discovery About Extracting Values from Model Objects
2017-04-15 11:49
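Until now, whenever I fitted regression models or ran statistical tests, extracting individual values from the result always felt like a matter of luck. For instance, suppose I run a t-test and want to pull out the degrees of freedom. As an arbitrary example, based on the mtcars dataset: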
    a <- t.test(mtcars$vs, mtcars$cyl)
The result is:
    	Welch Two Sample t-test

    data:  mtcars$vs and mtcars$cyl
    t = -17.528, df = 35.907, p-value < 2.2e-16
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -6.415358 -5.084642
    sample estimates:
    mean of x mean of y 
       0.4375    6.1875 
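The output clearly contains df = 35.907, but reading it off the screen and copying it by hand every time is not practical. My old habit was to guess from the printed label and try a$df, but that finds nothing: the value is actually stored in the parameter component, for example:
    a$parameter
          df 
    35.90693 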
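As a minimal sketch of the difference between guessing and looking in the right place, the snippet below reruns the same t-test and compares a$df with a$parameter. The variable name df_value is only an illustrative choice, not something from the original post.

```r
# Same t-test as in the post
a <- t.test(mtcars$vs, mtcars$cyl)

a$df          # NULL: the htest object has no component called "df"
a$parameter   # named numeric vector whose single element is named "df"

# [[ ]] with the element name returns the bare number without the name
df_value <- a$parameter[["df"]]
df_value
```

Using [["df"]] rather than [1] keeps the code readable and still works if the value is later passed to functions that do not expect a named vector.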
So the question is: how do I know which value is stored in which component?
This is where the str function turned out to be unexpectedly useful.
For example, call str on the t-test result a:
    str(a)
    List of 9
     $ statistic  : Named num -17.5
      ..- attr(*, "names")= chr "t"
     $ parameter  : Named num 35.9
      ..- attr(*, "names")= chr "df"
     $ p.value    : num 3.5e-19
     $ conf.int   : atomic [1:2] -6.42 -5.08
      ..- attr(*, "conf.level")= num 0.95
     $ estimate   : Named num [1:2] 0.438 6.188
      ..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
     $ null.value : Named num 0
      ..- attr(*, "names")= chr "difference in means"
     $ alternative: chr "two.sided"
     $ method     : chr "Welch Two Sample t-test"
     $ data.name  : chr "mtcars$vs and mtcars$cyl"
     - attr(*, "class")= chr "htest"
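When the full str output is more than needed, a shorter overview is often enough. The sketch below uses names() and two standard arguments of str (max.level and give.attr) to list only the top-level components; the exact number of components may differ between R versions.

```r
a <- t.test(mtcars$vs, mtcars$cyl)

# Only the component names, nothing else
names(a)

# One line per top-level component, with attribute lines suppressed
str(a, max.level = 1, give.attr = FALSE)

# The class tells you which kind of object (and which help page) to look up
class(a)
```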
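The values are stored in the components named after each $. To extract the p-value, for example, just enter:
    a$p.value
    [1] 3.500725e-19
which shows that the p-value is 3.500725e-19.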
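Once the component names are known, several values can be collected in one step for later use. The following is a small sketch under that assumption; the names result and is_significant and the 0.05 threshold are illustrative choices, not part of the original post.

```r
a <- t.test(mtcars$vs, mtcars$cyl)

# Gather the pieces of interest into one named numeric vector
result <- c(
  t       = unname(a$statistic),   # test statistic
  df      = unname(a$parameter),   # Welch degrees of freedom
  p.value = a$p.value,
  ci.low  = a$conf.int[1],         # lower bound of the 95% interval
  ci.high = a$conf.int[2]          # upper bound of the 95% interval
)
result

# The extracted p-value can drive the next step of a script
is_significant <- a$p.value < 0.05
is_significant
```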
In the same way, I run an analysis of variance, again on the mtcars dataset:
    fit.a <- aov(mpg ~ am, data = mtcars)
    summary(fit.a)
                Df Sum Sq Mean Sq F value   Pr(>F)    
    am           1  405.2   405.2   16.86 0.000285 ***
    Residuals   30  720.9    24.0                     
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Now look at which components this ANOVA object contains:
    str(fit.a)
    List of 12
     $ coefficients : Named num [1:2] 17.15 7.24
      ..- attr(*, "names")= chr [1:2] "(Intercept)" "am"
     $ residuals    : Named num [1:32] -3.39 -3.39 -1.59 4.25 1.55 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ effects      : Named num [1:32] -113.65 -20.13 -0.64 4.33 1.63 ...
      ..- attr(*, "names")= chr [1:32] "(Intercept)" "am" "" "" ...
     $ rank         : int 2
     $ fitted.values: Named num [1:32] 24.4 24.4 24.4 17.1 17.1 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ assign       : int [1:2] 0 1
     $ qr           :List of 5
      ..$ qr   : num [1:32, 1:2] -5.657 0.177 0.177 0.177 0.177 ...
      .. ..- attr(*, "dimnames")=List of 2
      .. .. ..$ : chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
      .. .. ..$ : chr [1:2] "(Intercept)" "am"
      .. ..- attr(*, "assign")= int [1:2] 0 1
      ..$ qraux: num [1:2] 1.18 1.18
      ..$ pivot: int [1:2] 1 2
      ..$ tol  : num 1e-07
      ..$ rank : int 2
      ..- attr(*, "class")= chr "qr"
     $ df.residual  : int 30
     $ xlevels      : Named list()
     $ call         : language aov(formula = mpg ~ am, data = mtcars)
     $ terms        :Classes 'terms', 'formula' language mpg ~ am
      .. ..- attr(*, "variables")= language list(mpg, am)
      .. ..- attr(*, "factors")= int [1:2, 1] 0 1
      .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. ..$ : chr [1:2] "mpg" "am"
      .. .. .. ..$ : chr "am"
      .. ..- attr(*, "term.labels")= chr "am"
      .. ..- attr(*, "order")= int 1
      .. ..- attr(*, "intercept")= int 1
      .. ..- attr(*, "response")= int 1
      .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
      .. ..- attr(*, "predvars")= language list(mpg, am)
      .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
      .. .. ..- attr(*, "names")= chr [1:2] "mpg" "am"
     $ model        :'data.frame': 32 obs. of 2 variables:
      ..$ mpg: num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
      ..$ am : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
      ..- attr(*, "terms")=Classes 'terms', 'formula' language mpg ~ am
      .. .. ..- attr(*, "variables")= language list(mpg, am)
      .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
      .. .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. .. ..$ : chr [1:2] "mpg" "am"
      .. .. .. .. ..$ : chr "am"
      .. .. ..- attr(*, "term.labels")= chr "am"
      .. .. ..- attr(*, "order")= int 1
      .. .. ..- attr(*, "intercept")= int 1
      .. .. ..- attr(*, "response")= int 1
      .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
      .. .. ..- attr(*, "predvars")= language list(mpg, am)
      .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
      .. .. .. ..- attr(*, "names")= chr [1:2] "mpg" "am"
     - attr(*, "class")= chr [1:2] "aov" "lm"
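There are 12 components. For example, to look at the model coefficients:
    fit.a$coefficients
    (Intercept)          am 
      17.147368    7.244939 
The intercept is an ordinary number, so it can be used directly in calculations:
    fit.a$coefficients[1]*5
    (Intercept) 
       85.73684 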
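The F value and p-value printed by summary(fit.a) are not components of fit.a itself; they live in the object that summary() returns. The sketch below assumes the usual structure of that object (a list whose first element is the ANOVA table as a data frame) and pulls both numbers out. The names anova_table, f_value, p_value, and intercept are illustrative.

```r
fit.a <- aov(mpg ~ am, data = mtcars)

# coef() is the standard accessor and returns the same values as fit.a$coefficients
coef(fit.a)
intercept <- coef(fit.a)[["(Intercept)"]]   # plain number, name dropped
intercept * 5

# The printed ANOVA table is the first element of the summary object
anova_table <- summary(fit.a)[[1]]
str(anova_table)                            # columns: Df, Sum Sq, Mean Sq, F value, Pr(>F)

# Row 1 is the "am" term, row 2 the residuals
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
c(F = f_value, p = p_value)
```

Rows are indexed by position here because the row names of the summary table are padded with spaces, which makes matching by name fragile.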
In the same way, fit a generalized linear model for logistic regression:
    fit.b <- glm(am ~ mpg + gear, data = mtcars, family = quasibinomial())
    summary(fit.b)
The result is:
    Call:
    glm(formula = am ~ mpg + gear, family = quasibinomial(), data = mtcars)

    Deviance Residuals: 
         Min        1Q    Median        3Q       Max  
    -1.68311  -0.00003  -0.00002   0.04042   1.17990  

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)  
    (Intercept)  -88.2992  7928.1434  -0.011   0.9912  
    mpg            0.3366     0.1403   2.399   0.0231 *
    gear          20.3062  1982.0355   0.010   0.9919  
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    (Dispersion parameter for quasibinomial family taken to be 0.3263161)

        Null deviance: 43.230  on 31  degrees of freedom
    Residual deviance: 11.659  on 29  degrees of freedom
    AIC: NA

    Number of Fisher Scoring iterations: 19
mpg is somewhat significant, so I want to extract its coefficient. The generalized linear model object turns out to have many more components:
    str(fit.b)
    List of 30
     $ coefficients     : Named num [1:3] -88.299 0.337 20.306
      ..- attr(*, "names")= chr [1:3] "(Intercept)" "mpg" "gear"
     $ residuals        : Named num [1:32] 2.01 2.01 1.55 -1 -1 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ fitted.values    : Named num [1:32] 4.99e-01 4.99e-01 6.46e-01 1.73e-09 6.96e-10 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ effects          : Named num [1:32] -0.44841 1.37014 -0.00585 0.13687 0.08687 ...
      ..- attr(*, "names")= chr [1:32] "(Intercept)" "mpg" "gear" "" ...
     $ R                : num [1:3, 1:3] -1.41 0 0 -30.88 4.07 ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : chr [1:3] "(Intercept)" "mpg" "gear"
      .. ..$ : chr [1:3] "(Intercept)" "mpg" "gear"
     $ rank             : int 3
     $ qr               :List of 5
      ..$ qr   : num [1:32, 1:3] -1.41 3.56e-01 3.40e-01 4.87e-05 3.09e-05 ...
      .. ..- attr(*, "dimnames")=List of 2
      .. .. ..$ : chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
      .. .. ..$ : chr [1:3] "(Intercept)" "mpg" "gear"
      ..$ rank : int 3
      ..$ qraux: num [1:3] 1.36 1.09 1
      ..$ pivot: int [1:3] 1 2 3
      ..$ tol  : num 1e-11
      ..- attr(*, "class")= chr "qr"
     $ family           :List of 11
      ..$ family    : chr "quasibinomial"
      ..$ link      : chr "logit"
      ..$ linkfun   :function (mu)
      ..$ linkinv   :function (eta)
      ..$ variance  :function (mu)
      ..$ dev.resids:function (y, mu, wt)
      ..$ aic       :function (y, n, mu, wt, dev)
      ..$ mu.eta    :function (eta)
      ..$ initialize: expression({ if (NCOL(y) == 1) { if (is.factor(y)) y <- y != levels(y)[1L] n <- rep.int(1, nobs) if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1") mustart <- (weights * y + 0.5)/(weights + 1) } else if (NCOL(y) == 2) { n <- y[, 1] + y[, 2] y <- ifelse(n == 0, 0, y[, 1]/n) weights <- weights * n mustart <- (n * y + 0.5)/(n + 1) } else stop("for the 'quasibinomial' family, y must be a vector of 0 and 1's\nor a 2 column matrix where col 1 is no. successes and col 2 is no. failures") })
      ..$ validmu   :function (mu)
      ..$ valideta  :function (eta)
      ..- attr(*, "class")= chr "family"
     $ linear.predictors: Named num [1:32] -0.00586 -0.00586 0.60003 -20.1774 -21.08622 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ deviance         : num 11.7
     $ aic              : num NA
     $ null.deviance    : num 43.2
     $ iter             : int 19
     $ weights          : Named num [1:32] 2.50e-01 2.50e-01 2.29e-01 4.69e-09 1.89e-09 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ prior.weights    : Named num [1:32] 1 1 1 1 1 1 1 1 1 1 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ df.residual      : int 29
     $ df.null          : int 31
     $ y                : Named num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
      ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
     $ converged        : logi TRUE
     $ boundary         : logi FALSE
     $ model            :'data.frame': 32 obs. of 3 variables:
      ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
      ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
      ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
      ..- attr(*, "terms")=Classes 'terms', 'formula' language am ~ mpg + gear
      .. .. ..- attr(*, "variables")= language list(am, mpg, gear)
      .. .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
      .. .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. .. ..$ : chr [1:3] "am" "mpg" "gear"
      .. .. .. .. ..$ : chr [1:2] "mpg" "gear"
      .. .. ..- attr(*, "term.labels")= chr [1:2] "mpg" "gear"
      .. .. ..- attr(*, "order")= int [1:2] 1 1
      .. .. ..- attr(*, "intercept")= int 1
      .. .. ..- attr(*, "response")= int 1
      .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
      .. .. ..- attr(*, "predvars")= language list(am, mpg, gear)
      .. .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
      .. .. .. ..- attr(*, "names")= chr [1:3] "am" "mpg" "gear"
     $ call             : language glm(formula = am ~ mpg + gear, family = quasibinomial(), data = mtcars)
     $ formula          :Class 'formula' language am ~ mpg + gear
      .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
     $ terms            :Classes 'terms', 'formula' language am ~ mpg + gear
      .. ..- attr(*, "variables")= language list(am, mpg, gear)
      .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
      .. .. ..- attr(*, "dimnames")=List of 2
      .. .. .. ..$ : chr [1:3] "am" "mpg" "gear"
      .. .. .. ..$ : chr [1:2] "mpg" "gear"
      .. ..- attr(*, "term.labels")= chr [1:2] "mpg" "gear"
      .. ..- attr(*, "order")= int [1:2] 1 1
      .. ..- attr(*, "intercept")= int 1
      .. ..- attr(*, "response")= int 1
      .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
      .. ..- attr(*, "predvars")= language list(am, mpg, gear)
      .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
      .. .. ..- attr(*, "names")= chr [1:3] "am" "mpg" "gear"
     $ data             :'data.frame': 32 obs. of 11 variables:
      ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
      ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
      ..$ disp: num [1:32] 160 160 108 258 360 ...
      ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
      ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
      ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
      ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
      ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
      ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
      ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
      ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
     $ offset           : NULL
     $ control          :List of 3
      ..$ epsilon: num 1e-08
      ..$ maxit  : num 25
      ..$ trace  : logi FALSE
     $ method           : chr "glm.fit"
     $ contrasts        : NULL
     $ xlevels          : Named list()
     - attr(*, "class")= chr [1:2] "glm" "lm"
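I only want the significant coefficient. All the estimates are stored in the coefficients component, from which the mpg value can then be picked out:
    fit.b$coefficients
    (Intercept)         mpg        gear 
    -88.2992383   0.3366025  20.3061829 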
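Note that fit.b$coefficients returns every estimate, and the p-values are not stored in fit.b at all; they are part of the summary object. A possible way to keep only the significant coefficients automatically is sketched below. The 0.05 cutoff and the names coef_table and sig_names are assumptions made for illustration. The column is called "Pr(>|t|)" here because the quasibinomial family uses a t statistic, as the printed summary shows.

```r
fit.b <- glm(am ~ mpg + gear, data = mtcars, family = quasibinomial())

# Coefficient table as a numeric matrix:
# columns Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- summary(fit.b)$coefficients
colnames(coef_table)

# Names of the terms whose p-value is below the chosen cutoff
sig_names <- rownames(coef_table)[coef_table[, "Pr(>|t|)"] < 0.05]
sig_names

# Look the estimates up by name so the result stays labelled
coef(fit.b)[sig_names]
```

R may warn that fitted probabilities of 0 or 1 occurred for this model; the fit still completes, and the extraction works the same way.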
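To sum up: whenever we run a test, an analysis, or a regression, or even principal component analysis or clustering, the str and summary functions work well together. Once you know where each value is stored, the next step of the workflow can be automated.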
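The same inspection pattern carries over to other kinds of results. As a rough sketch for principal component analysis, the example below uses prcomp on four mtcars columns; the choice of columns and the name var_explained are arbitrary and only for illustration.

```r
# PCA on a few numeric columns, scaled to unit variance
pca <- prcomp(mtcars[, c("mpg", "disp", "hp", "wt")], scale. = TRUE)

# Step 1: find out what the object contains
names(pca)
str(pca, max.level = 1, give.attr = FALSE)

# Step 2: do the same for the summary object
pca_summary <- summary(pca)
str(pca_summary, max.level = 1, give.attr = FALSE)

# Step 3: extract what you need, here the proportion of variance per component
var_explained <- pca_summary$importance["Proportion of Variance", ]
var_explained
```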