您的位置:首页 > 其它

sas数据导入终极汇总-之二

2014-09-27 21:12 260 查看
从FTP读入数据

read raw data via FTP in SAS?

SAS has the ability to read raw data directly from FTP servers. Normally, you would use FTP to download the data to your local computer and then use SAS to read the data stored on your local computer. SAS allows you to bypass the FTP step and read the data
directly from the other computer via FTP without the intermediate step of downloading the raw data file to your computer. Of course, this assumes that you can reach the computer via the internet at the time you run your SAS program. The program below illustrates
how to do this. After the filename in you put ftp to tell SAS to access the data via FTP. After that, you supply the name of the file (in this case 'gpa.txt'. lrecl= is used to specify the width of your data. Be sure to choose a value that is at least as wide
as your widest record. cd= is used to specify the directory from where the file is stored. host= is used to specify the name of the site to which you want to FTP. user= is used to provide your userid (or anonymous if connecting via anonymous FTP). pass= is
used to supply your password (or your email address if connecting via anonymous FTP).

FILENAME in FTP 'gpa.txt' LRECL=80

CD='/local2/samples/sas/ats/'

HOST='cluster.oac.ucla.edu'

USER='joebruin'

PASS='yourpassword' ;

DATA gpa ;

INFILE in ;

INPUT gpa hsm hss hse satm satv gender ;

RUN;

PROC PRINT DATA=gpa(obs=10) ;

RUN;

读入多个数据文件

quarter1.dat

1 120321 1236 154669 211326

1 326264 1326 163354 312665

1 420698 1327 142336 422685

1 211368 1236 156327 655237

1 378596 1429 145678 366578

quarter2.dat

2 140362 1436 114641 362415

2 157956 1327 124869 345215

2 215547 1472 165578 412567

2 204782 1495 150479 364474

2 232571 1345 135467 332567

quarter3.dat

3 140357 1339 142693 205881

3 149964 1420 152367 223795

3 159852 1479 160001 254874

3 139957 1527 163567 263088

3 150047 1602 175561 277552

quarter4.dat

4 479574 1367 155997 36134

4 496207 1459 140396 35941

4 501156 1598 135489 39640

4 532982 1601 143269 38695

4 563222 1625 147889 39556

filename year ('d:\quarter1.dat' 'd:\quarter2.dat' 'd:\quarter3.dat' 'd:\quarter4.dat');

data temp;

infile year;

input quarter sales tax expenses payroll;

run;

proc print data = temp;

run;

读取excel 数据集

Reading an Excel file into SAS

Suppose that you have an Excel spreadsheet called auto.xls. The data for this spreadsheet are shown below.

MAKE MPG WEIGHT PRICE

AMC Concord 22 2930 4099

AMC Pacer 17 3350 4749

AMC Spirit 22 2640 3799

Buick Century 20 3250 4816

Buick Electra 15 4080 7827

Using the Import Wizard is an easy way to import data into SAS. The Import Wizard can be found on the drop down file menu. Although the Import Wizard is easy it can be time consuming if used repeatedly. The very last screen of the Import Wizard gives you the
option to save the statements SAS uses to import the data so that they can be used again. The following is an example that uses common options and also shows that the file was imported correctly.

PROC IMPORT OUT= WORK.auto1

DATAFILE= "C:\auto.xls"

DBMS=EXCEL REPLACE;

SHEET="auto1";

GETNAMES=YES;

MIXED=YES;

USEDATE=YES;

SCANTIME=YES;

RUN;

proc print data=auto1;

run;

Obs MAKE MPG WEIGHT PRICE

1 AMC Concord 22 2930 4099

2 AMC Pacer 17 3350 4749

3 Amc Spirit 22 2640 3799

4 Buick Century 20 3250 4816

5 Buick Electra 15 4080 7827

First we use the out= statement to tell SAS where to store the data once they are imported.

Next the datafile= statement tells SAS where to find the file we want to import.

The dbms= statement is used to identify the type of file being imported. This statement is redundant if the file you want to import already has an appropriate file extension, for example *.xls.

The replace statement will overwrite an existing file.

To specify which sheet SAS should import use the sheet="sheetname" statement. The default is for SAS to read the first sheet. Note that sheet names can only be 31 characters long.

The getnames=yes is the default setting and SAS will automatically use the first row of data as variable names. If the first row of your sheet does not contain variable names use the getnames=no.

SAS uses the first eight rows of data to determine whether the variable should be read as character or numeric. The default setting mixed=no assumes that each variable is either all character or all numeric. If you have a variable with both character and numeric
values or a variable with missing values use mixed=yes statement to be sure SAS will read it correctly.

Conveniently SAS reads date, time and datetime formats. The usedate=yes is the default statement and SAS will read date or time formatted data as a date. When usedate=no SAS will read date and time formatted data with a datetime format. Keep the default statement
scantime=yes to read in time formatted data as long as the variable does not also contain a date format.

Example 1: Making a permanent data file

What if you want the SAS data set created from proc import to be permanent? The answer is to use libname statement. Let's say that we have an Excel file called auto.xls in directory "d:\temp" and we want to convert it into a SAS data file (call it myauto) and
put it into the directory "c:\dissertation". Here is what we can do.

libname dis "c:\dissertation";

proc import datafile="d:\temp\auto.xls" out=dis.myauto replace;

run;

Example 2: Reading in a specific sheet

Sometimes you may only want to read a particular sheet from an Excel file instead of the entire Excel file. Let's say that we have a two-sheet Excel file called auto2.xls. The example below shows how to use the option sheet=sheetname to read the second sheet
called page2 in it.

proc import datafile="auto2.xls" out=auto1 replace;

sheet="page2";

run;

Example 3: Reading a file without variable names

What if the variables in your Excel file do not have variable names? The answer here is to use the statement getnames=no in proc import. Here is an example showing how to do this.

proc import datafile="a:\faq\auto.xls" out=auto replace;

getnames=no;

run;

Writing Excel files out from SAS

It is very easy to write out an Excel file using proc export in SAS version 8. Consider the following sample data file below.

Obs MAKE MPG WEIGHT PRICE

1 AMC 22 2930 4099

2 AMC 17 3350 4749

3 AMC 22 2640 3799

4 Buick 20 3250 4816

5 Buick 15 4080 7827

Here is a sample program that writes out an Excel file called mydata.xls into the directory "c:\dissertation".

proc export data=mydata outfile='c:\dissertation\mydata.xls' replace;

run;

SAS读入复杂分隔数据——字符长度不同,字符中间有空格作为间隔符

1.字符长度不同

data web;

length site $41;

input age site $ hits;

datalines;

12 http://www.site1.org/default.htm 123456

130 http://www.site2.com/index.htm 97654

254 http://www.site3.edu/department/index.htm 987654

;

proc print;

run;

Obs site age hits

1 http://www.site1.org/default.htm 12 123456

2 http://www.site2.com/index.htm 130 97654

3 http://www.site3.edu/department/index.htm 254 987654

或者用

data web;

input age site & $41. hits;

datalines;

12 http://www.site1.org/default.htm 123456

130 http://www.site2.com/index.htm 97654

254 http://www.site3.edu/department/index.htm 987654

;

proc print;

run;

Obs age site hits

1 12 http://www.site1.org/default.htm 123456

2 130 http://www.site2.com/index.htm 97654

3 254 http://www.site3.edu/department/index.htm 987654

2.字符有多个单词,单词之间用空格隔开

data fruit;

infile 'C:\messy.txt' delimiter = ' ' dsd;

length fruit $22;

input zip fruit $ pounds;

proc print;

run;

Obs fruit zip pounds

1 apples, grapes kiwi 10034 123456

2 oranges 92626 97654

3 pears apple 25414 987654

或者

data fruit;

input zip fruit & $22. pounds;

datalines;

10034 apples, grapes kiwi 123456

92626 oranges 97654

25414 pears apple 987654

;

proc print;

run;

Obs zip fruit pounds

1 10034 apples, grapes kiwi 123456

2 92626 oranges 97654

3 25414 pears apple 987654

没有格式库的情况下读入数据:

read a SAS data file when I don't have its format library

If you try to use a SAS data file that has permanent formats but you don't have the format library, you will get errors like this.

ERROR: The format $MAKEF was not found or could not be loaded.

ERROR: The format FORGNF was not found or could not be loaded.

Without the format library, SAS will not permit you to do anything with the data file. However, if you use options nofmterr; at the top of your program, SAS will go ahead and process the file despite the fact that it does not have the format library. You will
not be able to see the formatted values for your variables, but you will be able to process your data file. Here is an example.

OPTIONS nofmterr;

libname in "c:\";

PROC FREQ DATA=in.auto;

TABLES foreign make;

RUN;

高效的保留或者去掉部分变量的方式:

The following program creates exactly the same file, but is a more efficient program because SAS only reads the desired variables.

DATA auto2;

SET auto (KEEP = make mpg price);

RUN;

The drop data step option works in a similar way.
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: