
Basic R Programming Tips Collection - 18

2015-04-06 16:57

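1. Processing strings with the stringr package

The stringr package provides convenient string-handling functions such as str_c (str_join in older versions), str_match, str_replace, and str_split. See the help documentation for details on each.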
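As a rough illustration of the functions named above, the following sketch calls each of them once. It assumes the stringr package is installed; the input strings are made up for the example.

```r
library(stringr)  # assumption: stringr is installed

# Join strings with a separator
joined <- str_c("alpha", "beta", sep = "-")

# Replace the first match of a pattern
replaced <- str_replace("hello world", "world", "R")

# Split on a delimiter; str_split returns a list, one element per input string
pieces <- str_split("a,b,c", ",")[[1]]

# Extract capture groups; the result is a matrix whose first column is the
# full match and whose remaining columns are the groups
matched <- str_match("key=value", "(\\w+)=(\\w+)")

print(joined)
print(replaced)
print(pieces)
print(matched)
```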

2. Accessing and modifying functions defined inside another function

library(copula)  # provides dCopula, gumbelCopula, claytonCopula, frankCopula

ARcop.theta <- function(U)
{
  # Gumbel copula density and its log-likelihood in theta
  gumbel <- function(theta, U)
  {
    dCopula(U, gumbelCopula(theta, dim = 2))
  }
  f1 <- function(theta)
  {
    sum(log(gumbel(theta, U)))
  }
  theta1 <- optimize(f1, c(1, 10), maximum = TRUE)$maximum

  # Clayton copula
  clayton <- function(theta, U)
  {
    dCopula(U, claytonCopula(theta, dim = 2))
  }
  f2 <- function(theta)
  {
    sum(log(clayton(theta, U)))
  }
  theta2 <- optimize(f2, c(1, 10), maximum = TRUE)$maximum

  # Frank copula
  frank <- function(theta, U)
  {
    dCopula(U, frankCopula(theta, dim = 2))
  }
  f3 <- function(theta)
  {
    sum(log(frank(theta, U)))
  }
  theta3 <- optimize(f3, c(1, 100), maximum = TRUE)$maximum

  return(list(theta1, theta2, theta3))
}

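An expert suggested the following approach: redefine return inside the body so that it no longer exits early, append an empty function as the final expression, and then inspect the environment of the closure that the modified function returns: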

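oldbody <- body(ARcop.theta)

newbody <- bquote(
  {
    .(redefine);   # shadow return() so the old body does not exit early
    .(oldbody);    # the original body, unchanged
    .(addstuff)    # last value: a closure that captures the local environment
  },
  list(redefine = quote(return <- function(x) {x}),
       oldbody  = oldbody,
       addstuff = quote(function() {})))

body(ARcop.theta) <- newbody

closure <- ARcop.theta(U)

ls(environment(closure))

environment(closure)$f1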

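PS: I could not test this code with ARcop.theta because I did not know where dCopula comes from (it is provided by the copula package). Once you have loaded that function you can try the code above. When calling ARcop.theta(U), be sure to pass your own valid argument U.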
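Since the example above needs the copula package and real data, here is a minimal self-contained sketch of the same trick on a toy function. The names outer_fun and helper are hypothetical and exist only for this illustration.

```r
# Toy function with an inner helper that is normally unreachable from outside
outer_fun <- function(x) {
  helper <- function(v) v * 2
  return(helper(x) + 1)
}

oldbody <- body(outer_fun)

# New body: shadow return(), run the old body, then yield an empty closure
newbody <- bquote(
  {
    .(redefine)
    .(oldbody)
    .(addstuff)
  },
  list(redefine = quote(return <- function(x) x),
       oldbody  = oldbody,
       addstuff = quote(function() {})))

body(outer_fun) <- newbody

# The call now returns the empty closure instead of the numeric result
closure <- outer_fun(10)

# Its environment is the execution environment of outer_fun
inner_names <- ls(environment(closure))
print(inner_names)

# The inner helper can now be retrieved and called directly
inner_helper <- environment(closure)$helper
print(inner_helper(5))
```

Note that the environment belongs to that one call only. Changing a function inside it does not affect later calls to outer_fun, which build a fresh environment each time.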

3. Extracting variable names from an expression with all.vars

all.vars(~ var1 + Var2 + var3)

#[1] "var1" "Var2" "var3"

all.vars(expression(sin(x+y)))

#[1] "x" "y"
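For comparison, a small sketch contrasting all.vars with its sibling all.names, which also reports function and operator names. The formula is invented for the example.

```r
f <- y ~ log(x1) + x2:x3

# Variables only: function names such as log are dropped
vars <- all.vars(f)
print(vars)

# all.names also lists operators and functions (~, +, log, :)
nms <- all.names(f)
print(nms)
```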

4. "1L" denotes an integer

Appending L to a whole-number literal gives it type integer. Without the L, the literal defaults to numeric (double).

> class(1)

[1] "numeric"

> class(1L)

[1] "integer"
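A few more checks along the same lines, as a sketch of how the two types behave when compared or mixed:

```r
t_double  <- typeof(1)       # storage type of a plain literal
t_integer <- typeof(1L)      # storage type with the L suffix
t_range   <- typeof(1:3)     # the colon operator yields integers here

same_value  <- (1L == 1)         # numeric comparison ignores the type
same_object <- identical(1L, 1)  # identical() also compares the type

t_mixed <- typeof(1L + 1)    # mixing integer and double promotes to double

print(c(t_double, t_integer, t_range, t_mixed))
print(c(same_value, same_object))
```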

5. Drawing a spiral

#test data,may be WRONG!

rho <- seq(0,10*pi,by=pi/100)

ph <- sqrt(rho)

x <- ph*cos(rho)

y <- ph*sin(rho)

#main plot

plot.new()

plot.window(xlim=c(-10,10), ylim=c(-10,10))

lines(x,y,col="green")

#axis

arrows(0,0,7,0)

arrows(0,0,0,7)

#x axis

segments(1:3,0,1:3,0.5)

text(0:3,0,0:3,pos=1)

#y axis

segments(0,1:3,0.5,1:3)

text(0,1:3,1:3,pos=2)

#main title

title(main=expression(paste("Spiral: ",rho,"=",sqrt(theta))))



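6. Writing data to the clipboard

On Windows, for example:

writeClipboard(str = 'CCCC')

After running this statement, the clipboard holds the string CCCC.

On Linux, you can open a pipe and pass the data to xclip.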
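One possible way to wrap both cases is sketched below. The helper name write_clip is hypothetical, and the Linux branch assumes that the xclip command is installed and that an X display is available.

```r
# Hypothetical helper: copy a character vector to the system clipboard
write_clip <- function(text) {
  if (.Platform$OS.type == "windows") {
    # writeClipboard exists only in Windows builds of R
    utils::writeClipboard(text)
  } else if (nzchar(Sys.which("xclip"))) {
    # Open a write pipe to xclip and send the text through it
    con <- pipe("xclip -selection clipboard", "w")
    on.exit(close(con))
    writeLines(text, con)
  } else {
    stop("no supported clipboard tool found")
  }
  invisible(text)
}

# Usage (not run here, since it changes the clipboard):
# write_clip("CCCC")
```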

7. Showing text only instead of legend keys

cog.data = read.table(textConnection('
A:239:RNA processing and modification
B:309:Chromatin structure and dynamics
C:919:Energy production and conversion
D:1206:Cell cycle control, cell division, chromosome partitioning
E:1125:Amino acid transport and metabolism
F:225:Nucleotide transport and metabolism
G:1795:Carbohydrate transport and metabolism
H:489:Coenzyme transport and metabolism
I:623:Lipid transport and metabolism
J:2243:Translation, ribosomal structure and biogenesis
K:2455:Transcription
L:2335:Replication, recombination and repair
M:1164:Cell wall/membrane/envelope biogenesis
N:230:Cell motility
O:2157:Posttranslational modification, protein turnover, chaperones
P:861:Inorganic ion transport and metabolism
Q:860:Secondary metabolites biosynthesis, transport and catabolism
R:4564:General function prediction only
S:1265:Function unknown
T:1788:Signal transduction mechanisms
U:693:Intracellular trafficking, secretion, and vesicular transport
V:422:Defense mechanisms
W:10:Extracellular structures
Y:8:Nuclear structure
Z:552:Cytoskeleton'), sep = ':', header = FALSE, col.names = c('Code', 'Gene-Number', 'Functional-Categories'))

# one color per row, so no color is recycled
barplot(cog.data[, 2], col = rainbow(nrow(cog.data)), xlim = c(0, 30), ylim = c(0, 5000), names.arg = cog.data[, 1], las = 1, width = 0.5, cex.names = 0.6, cex.axis = 0.6)

# no fill, pch, or lty is given, so the legend shows text only
legend(15, 5000, legend = paste(cog.data[, 1], cog.data[, 3], sep = ': '), y.intersp = 0.5, bty = "n", cex = 0.7)

title('COG Function Classification of A-Unigene.fa Sequence')



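8. Using CRAN Task Views to install all packages for a field in bulk

To use this feature, first install the ctv package:

install.packages("ctv")

Then load it and run:

library(ctv)
install.views("Econometrics")

This downloads and installs all the packages listed in the Econometrics view.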

The views available at the time of writing are:

Bayesian         Bayesian Inference
Cluster          Cluster Analysis & Finite Mixture Models
Econometrics     Computational Econometrics
Environmetrics   Analysis of ecological and environmental data
Finance          Empirical Finance
Genetics         Statistical Genetics
Graphics         Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization
gR               gRaphical models in R
MachineLearning  Machine Learning & Statistical Learning
Multivariate     Multivariate Statistics
SocialSciences   Statistics for the Social Sciences
Spatial          Analysis of Spatial Data

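The coverage is already fairly broad, and more topics will probably be added over time.

Every CRAN mirror has a Task Views section that introduces each view. The Econometrics view, for example, contains a lot of useful information, especially about which packages serve each area of econometrics. Details are available at:
http://cran.r-project.org/src/contrib/Views/Econometrics.html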

9. What the I() function does

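I() separates an arithmetic expression from formula syntax, because a formula gives operators such as + and ^ meanings of their own. For example, y ~ a + I(b+c) has two predictors, a and the sum b + c, whereas y ~ a + b + c has three predictors: a, b, and c.

Consider the following example. In a formula, ^ means term crossing, so x^2 reduces to x and lm(y ~ x + x^2) fits the same model as lm(y ~ x). Only I(x^2) adds the squared term: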

> x = iris[, 1]
> y = iris[, 2]
> lm(y ~ x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
    3.41895     -0.06188

> lm(y ~ x + x^2)

Call:
lm(formula = y ~ x + x^2)

Coefficients:
(Intercept)            x
    3.41895     -0.06188

> lm(y ~ x + I(x^2))

Call:
lm(formula = y ~ x + I(x^2))

Coefficients:
(Intercept)            x       I(x^2)
     6.4158      -1.0856       0.0857
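The difference can also be checked without fitting a model, by looking at the design matrix columns and the term labels. This is a small sketch using the same iris columns as above.

```r
x <- iris[, 1]
y <- iris[, 2]

# Without I(), x^2 is formula syntax and collapses to x
cols_plain <- colnames(model.matrix(y ~ x + x^2))

# With I(), x^2 is evaluated arithmetically and becomes its own column
cols_I <- colnames(model.matrix(y ~ x + I(x^2)))

# terms() does not evaluate the variables, so a, b, c need not exist
labels_I <- attr(terms(y ~ a + I(b + c)), "term.labels")

print(cols_plain)
print(cols_I)
print(labels_I)
```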

10. Formatting numbers as strings with formatC and prettyNum

> xx <- pi * 10^(-5:4)
> cbind(format(xx, digits = 4), formatC(xx))
      [,1]        [,2]
 [1,] "3.142e-05" "3.142e-05"
 [2,] "3.142e-04" "0.0003142"
 [3,] "3.142e-03" "0.003142"
 [4,] "3.142e-02" "0.03142"
 [5,] "3.142e-01" "0.3142"
 [6,] "3.142e+00" "3.142"
 [7,] "3.142e+01" "31.42"
 [8,] "3.142e+02" "314.2"
 [9,] "3.142e+03" "3142"
[10,] "3.142e+04" "3.142e+04"
> cbind(formatC(xx, width = 9, flag = "-"))
      [,1]
 [1,] "3.142e-05"
 [2,] "0.0003142"
 [3,] "0.003142 "
 [4,] "0.03142  "
 [5,] "0.3142   "
 [6,] "3.142    "
 [7,] "31.42    "
 [8,] "314.2    "
 [9,] "3142     "
[10,] "3.142e+04"
> cbind(formatC(xx, digits = 5, width = 8, format = "f", flag = "0"))
      [,1]
 [1,] "00.00003"
 [2,] "00.00031"
 [3,] "00.00314"
 [4,] "00.03142"
 [5,] "00.31416"
 [6,] "03.14159"
 [7,] "31.41593"
 [8,] "314.15927"
 [9,] "3141.59265"
[10,] "31415.92654"
> cbind(format(xx, digits = 4), formatC(xx, digits = 4, format = "fg"))
      [,1]        [,2]
 [1,] "3.142e-05" "0.00003142"
 [2,] "3.142e-04" "0.0003142"
 [3,] "3.142e-03" "0.003142"
 [4,] "3.142e-02" "0.03142"
 [5,] "3.142e-01" "0.3142"
 [6,] "3.142e+00" "3.142"
 [7,] "3.142e+01" "31.42"
 [8,] "3.142e+02" "314.2"
 [9,] "3.142e+03" " 3142"
[10,] "3.142e+04" "31416"

>
> formatC(c("a", "Abc", "no way"), width = -7)  # <=> flag = "-"
[1] "a      " "Abc    " "no way "
> formatC(c((-1:1)/0, c(1, 100) * pi), width = 8, digits = 1)
[1] "    -Inf" "     NaN" "     Inf" "       3" "   3e+02"
>
> ## note that some of the results here depend on the implementation
> ## of long-double arithmetic, which is platform-specific.
> xx <- c(1e-12, -3.98765e-10, 1.45645e-69, 1e-70, pi*1e37, 3.44e4)
> ## 1 2 3 4 5 6
> formatC(xx)
[1] "1e-12" "-3.988e-10" "1.456e-69" "1e-70" "3.142e+37" "3.44e+04"
> formatC(xx, format = "fg")  # special "fixed" format.
[1] "0.000000000001"
[2] "-0.0000000003988"
[3] "0.000000000000000000000000000000000000000000000000000000000000000000001456"
[4] "0.0000000000000000000000000000000000000000000000000000000000000000000001"
[5] "31415926535897927982886620480086844226"
[6] "34400"
> formatC(xx[1:4], format = "f", digits = 75)  #>> even longer strings
[1] "0.000000000000999999999999999980050680026266718414262868463993072509765625000"
[2] "-0.000000000398765000000000018314655347850816724530886858701705932617187500000"
[3] "0.000000000000000000000000000000000000000000000000000000000000000000001456450"
[4] "0.000000000000000000000000000000000000000000000000000000000000000000000100000"
>
> formatC(c(3.24, 2.3e-6), format = "f", digits = 11, drop0trailing = TRUE)
[1] "3.24" "0.0000023"
>

> r <- c("76491283764.97430", "29.12345678901", "-7.1234", "-100.1", "1123")
> ## American:
> prettyNum(r, big.mark = ",")
[1] "76,491,283,764.97430" "      29.12345678901" "             -7.1234" "              -100.1"
[5] "               1,123"
> ## Some Europeans:
> prettyNum(r, big.mark = "'", decimal.mark = ",")
[1] "76'491'283'764,97430" "      29,12345678901" "             -7,1234" "              -100,1"
[5] "               1'123"
>
> (dd <- sapply(1:10, function(i) paste((9:0)[1:i], collapse = "")))
 [1] "9" "98" "987" "9876" "98765" "987654" "9876543"
 [8] "98765432" "987654321" "9876543210"
> prettyNum(dd, big.mark = "'")
 [1] "            9" "           98" "          987" "        9'876" "       98'765" "      987'654"
 [7] "    9'876'543" "   98'765'432" "  987'654'321" "9'876'543'210"
>

> ## examples of 'small.mark'
> pN <- stats::pnorm(1:7, lower.tail = FALSE)
> cbind(format(pN, small.mark = " ", digits = 15))
     [,1]
[1,] "1.58655 25393 1457e-01"
[2,] "2.27501 31948 1792e-02"
[3,] "1.34989 80316 3009e-03"
[4,] "3.16712 41833 1199e-05"
[5,] "2.86651 57187 9194e-07"
[6,] "9.86587 64503 7698e-10"
[7,] "1.27981 25438 8584e-12"
> cbind(formatC(pN, small.mark = " ", digits = 17, format = "f"))
     [,1]
[1,] "0.15865 52539 31457 05"
[2,] "0.02275 01319 48179 21"
[3,] "0.00134 98980 31630 09"
[4,] "0.00003 16712 41833 12"
[5,] "0.00000 02866 51571 88"
[6,] "0.00000 00009 86587 65"
[7,] "0.00000 00000 01279 81"
>
> cbind(ff <- format(1.2345 + 10^(0:5), width = 11, big.mark = "'"))
     [,1]
[1,] "      2.2345"
[2,] "     11.2345"
[3,] "    101.2345"
[4,] "  1'001.2345"
[5,] " 10'001.2345"
[6,] "100'001.2345"
> ## all with same width (one more than the specified minimum)
>
> ## individual formatting to common width:
> fc <- formatC(1.234 + 10^(0:8), format = "fg", width = 11, big.mark = "'")
> cbind(fc)
      fc
 [1,] "      2.234"
 [2,] "      11.23"
 [3,] "      101.2"
 [4,] "      1'001"
 [5,] "     10'001"
 [6,] "    100'001"
 [7,] "  1'000'001"
 [8,] " 10'000'001"
 [9,] "100'000'001"
>

> ## complex numbers:
> r <- 10.0000001; rv <- (r/10)^(1:10)
> (zv <- (rv + 1i*rv))
 [1] 1+1i 1+1i 1+1i 1+1i 1+1i 1+1i 1+1i 1+1i 1+1i 1+1i
> op <- options(digits = 7)  ## (system default)
> (pnv <- prettyNum(zv))
 [1] "1+1i" "1+1i" "1+1i" "1+1i" "1+1i" "1+1i" "1+1i" "1+1i" "1+1i" "1+1i"
> stopifnot(pnv == "1+1i", pnv == format(zv),
+           pnv == prettyNum(zv, drop0trailing = TRUE))
> ## more digits change the picture:
> options(digits = 8)
> head(fv <- format(zv), 3)
[1] "1.0000000+1.0000000i" "1.0000000+1.0000000i" "1.0000000+1.0000000i"
> prettyNum(fv)
 [1] "1.0000000+1.0000000i" "1.0000000+1.0000000i" "1.0000000+1.0000000i" "1.0000000+1.0000000i"
 [5] "1.0000001+1.0000001i" "1.0000001+1.0000001i" "1.0000001+1.0000001i" "1.0000001+1.0000001i"
 [9] "1.0000001+1.0000001i" "1.0000001+1.0000001i"
> prettyNum(fv, drop0trailing = TRUE)  # a bit nicer
 [1] "1+1i" "1+1i" "1+1i" "1+1i"
 [5] "1.0000001+1.0000001i" "1.0000001+1.0000001i" "1.0000001+1.0000001i" "1.0000001+1.0000001i"
 [9] "1.0000001+1.0000001i" "1.0000001+1.0000001i"
> options(op)
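The transcript above covers many corner cases. For everyday use, a handful of calls is usually enough. The following sketch shows some common ones; the input values are invented for the example.

```r
# Zero-pad a number to a fixed width
id <- formatC(7, width = 3, flag = "0")

# Fixed notation with two decimal places
two_dp <- formatC(3.14159, digits = 2, format = "f")

# Thousands separators combined with fixed decimals
money <- formatC(1234567.891, format = "f", digits = 2, big.mark = ",")

# prettyNum adds separators to an already suitable number
big <- prettyNum(1234567, big.mark = ",")

# Positive width right-justifies strings, negative width left-justifies
right <- formatC("abc", width = 6)
left  <- formatC("abc", width = -6)

print(c(id, two_dp, money, big))
print(c(right, left))
```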

11. Using functions that are not exported

tools:::makeLazyLoadDB
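The triple-colon operator reaches into a package namespace. A sketch of two equivalent ways to fetch the same internal function is shown below; getFromNamespace is from the utils package. Internal functions may change without notice between R versions, so this is best kept to exploration.

```r
# Triple colon: access an object in the namespace whether or not it is exported
f1 <- tools:::makeLazyLoadDB

# Same lookup through a regular function call
f2 <- utils::getFromNamespace("makeLazyLoadDB", "tools")

print(is.function(f1))
print(identical(f1, f2))
```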

12. Use "/" rather than "\\" as the path separator

"/" works on every operating system, while "\\" works only on Windows. For example:

ds = read.table("dir_location\\file.txt", header=TRUE) # Windows only

ds = read.table("dir_location/file.txt", header=TRUE) # all OS (including Windows)
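Rather than typing separators by hand, paths can also be assembled with file.path, which joins its arguments with "/". A short sketch using the same placeholder names as above:

```r
# file.path joins components with "/" on every platform
p <- file.path("dir_location", "file.txt")
print(p)

# Longer paths work the same way
q <- file.path("dir_location", "sub", "file.txt")
print(q)
```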

13. Creating a directory

dir.create()
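dir.create needs a path argument. The sketch below creates a nested directory under the session's temporary directory, so it leaves nothing behind in the working directory. The folder names are made up.

```r
# Build a nested path under the per-session temp directory
new_dir <- file.path(tempdir(), "demo", "nested")

# recursive = TRUE also creates the missing parent "demo";
# showWarnings = FALSE stays quiet if the directory already exists
dir.create(new_dir, recursive = TRUE, showWarnings = FALSE)

created <- dir.exists(new_dir)
print(created)
```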

14. Computing the trace of a matrix

The trace is the sum of all elements on the main diagonal.

sum(diag(x))
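A worked example on a small matrix, wrapped in a hypothetical helper named tr that also checks that the input is square:

```r
# Hypothetical helper: trace of a square matrix
tr <- function(m) {
  stopifnot(is.matrix(m), nrow(m) == ncol(m))
  sum(diag(m))
}

# matrix() fills column by column, so the diagonal here is 1, 5, 9
x <- matrix(1:9, nrow = 3)
trace_x <- tr(x)
print(trace_x)
```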

15. Drawing multi-line text in a plot

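a <- c(1, 10)
b <- c(2, 12)
amean <- mean(a)
bmean <- mean(b)

plot(a, b)

# height of one line of text in user coordinates
hght <- strheight("Here")

# one list element per line; bquote() substitutes the computed means
Lines <- list("Here are the values", "", bquote(amean==.(amean)), "and", bquote(bmean==.(bmean)))

# draw each line 1.5 text heights below the previous one
text(amean, bmean-(hght*1.5*seq_along(Lines)), do.call(expression, Lines), adj=c(0,0))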
