CSVsql 去掉字段双引号内部两个双引号之间有换行符导入sas出错求助

点击联系发帖人 时间：2017-05-31 14:24

mysql 字段换行符

温馨提示！由于新浪微博认证机制调整，您的新浪微博帐号绑定已过期，请重新绑定！&&|&&
LOFTER精选
网易考拉推荐
用微信&&“扫一扫”
将文章分享到朋友圈。
用易信&&“扫一扫”
将文章分享到朋友圈。
另外，从SAS编程的角度也可以解决这个问题的，但由于笔者我对编程语句不是很熟悉，所以等老子我熟悉以后在更新内容。
阅读(2625)|
用微信&&“扫一扫”
将文章分享到朋友圈。
用易信&&“扫一扫”
将文章分享到朋友圈。
历史上的今天
在LOFTER的更多文章
loftPermalink:'',
id:'fks_',
blogTitle:'SAS导入EXCEL出现的字符问题',
blogAbstract:'
导入SAS后，中文变量只剩下一个汉字，也就是只能显示2个字符，其实原因可以从以下方面总结：当我出现了这个问题后，我首先是从两个方面来着手。
1、excel数据本身。&&&&& 更改字符串的属性,选择格式设置进行更改；&&&& excel本身，尤其是2007版本和SAS本身有冲突所导致的，所以可以将excel转变为其他格式，csv或txt格式后导入sas.2、sas本身的问题。
因为：在SAS里面，空格是字符变量的缺失值，那么第一个变量和第二个变量 SAS系统认为是字符型变量，默认长度为2，所以，这个字段之后的变量都会被默认为2个字符的长度，所以导入数据的时候保证excel前面的两条记录不空；',
blogTag:'sas,字符串,图解',
blogUrl:'blog/static/',
isPublished:1,
istop:false,
modifyTime:6,
publishTime:7,
permalink:'blog/static/',
commentCount:3,
mainCommentCount:3,
recommendCount:1,
bsrk:-100,
publisherId:0,
recomBlogHome:false,
currentRecomBlog:false,
attachmentsFileIds:[],
groupInfo:{},
friendstatus:'none',
followstatus:'unFollow',
pubSucc:'',
visitorProvince:'',
visitorCity:'',
visitorNewUser:false,
postAddInfo:{},
mset:'000',
remindgoodnightblog:false,
isBlackVisitor:false,
isShowYodaoAd:false,
hostIntro:'',
hmcon:'0',
selfRecomBlogCount:'0',
lofter_single:''
{list a as x}
{if x.moveFrom=='wap'}
{elseif x.moveFrom=='iphone'}
{elseif x.moveFrom=='android'}
{elseif x.moveFrom=='mobile'}
${a.selfIntro|escape}{if great260}${suplement}{/if}
{list a as x}
推荐过这篇日志的人：
{list a as x}
{if !!b&&b.length>0}
他们还推荐了：
{list b as y}
转载记录：
{list d as x}
{list a as x}
{list a as x}
{list a as x}
{list a as x}
{if x_index>4}{break}{/if}
${fn2(x.publishTime,'yyyy-MM-dd HH:mm:ss')}
{list a as x}
{if !!(blogDetail.preBlogPermalink)}
{if !!(blogDetail.nextBlogPermalink)}
{list a as x}
{if defined('newslist')&&newslist.length>0}
{list newslist as x}
{if x_index>7}{break}{/if}
{list a as x}
{var first_option =}
{list x.voteDetailList as voteToOption}
{if voteToOption==1}
{if first_option==false},{/if}&&“${b[voteToOption_index]}”&&
{if (x.role!="-1") },“我是${c[x.role]}”&&{/if}
&&&&&&&&${fn1(x.voteTime)}
{if x.userName==''}{/if}
网易公司版权所有&&
{list x.l as y}
{if defined('wl')}
{list wl as x}{/list}导出csv文件时，处理分隔符问题
CSV文件默认以英文逗号做为列分隔符，换行符作为行分隔符。
& 如果不提供网页形式只用命令行或二进制程序输出数据到CSV，只需要将数据段按,分割，行按\n分割，写入到一个.csv文件即可。
& 但有时字段里含有,和换行符就麻烦了，数据输出会出现混乱。这时可以使用双引号"来将每个字段内容括起来，CSV默认认为由""括起来的内容是一个栏位，
这时不管栏位内容里有除"之外字符的任何字符都可以按原来形式引用。例如
&"姓; 名:k,j",email，是一行的内容，
&" 姓;名:k,j"
是一个单元格里的。用""括起来的栏位内容里有CSV默认的换行符也没关系。这样输出数据就很方便了。注意"是要加转义字符。
"\"".$result_name."\",\"".$result_employer."\",\"".$result_specialism."\",\"".$rank[0]."\",";
唯一需要注意的就是若字段内容里含有"怎么办。这时只需将"替换成两个双引号("")即可。CSV会将字段里的两个双引号""显示成一个。
&$result_specialism =
str_replace("\"","\"\"",$datamap['specialism']-&content());
如果需要使用网页导出数据的形式，需要写header了以PHP为例：
&首先做一个页面，是被下载的CSV文件页 saveCSV.php
//指明此文件类型，页面类型
&header("Content-Disposition:
filename=".$filename.".csv");
&header(''Content-Type:text/csv'');
处理数据输出
?&直接点击到一个php页面就能下载了
&form里用一个&input type='button'
value='导出CSV'
onclick='saveCSV()'/&，saveCSV()是一个JS方法，将form的action指向saveCSV.php
已投稿到：
以上网友发言只代表其个人观点，不代表新浪网的观点或立场。hjhnju 的BLOG
用户名：hjhnju
文章数：206
评论数：78
访问量：656596
注册日期：
阅读量：5863
阅读量：12276
阅读量：421649
阅读量：1110001
51CTO推荐博文
文本文件多以.txt、.dat以及.csv为后缀（在Unix/Linux世界，还可能出现.data数据甚至没有后缀的情况）。一般分两种，其一看起来跟下面一样――如果用UltraEdit等编辑器打开，你能看到指示列数的标尺，而且每个字段的列数是固定的，称为字段固定(Fixed Fields)的文本文件：
2810 61 MOD& F2804 38 HIGH F
下面的数据中，每个字段的列数不必相同，但都由同一样个的分隔符（这里是逗号）分开，所以称为分隔符固定(Delimited Data)或者自由格式(Free-format)的文件文件。如果分隔符为逗号（通常以.csv为后缀），干脆就叫做“用逗号分隔的文本文件”：
1-Mar-90,LON,19813-Mar-90,FRA,2073
对以上两种文本格式的数据，SAS提供了以下四种基本的输入模式：
列输入模式(Column) ――应用于字段固定的文本文件
格式化输入模式(Formatted) ――应用于字段固定的文本文件
列举输入模式(List) ――应用于分隔符固定的文本文件
命名输入模式(Named)
对以上几种输入模式，基本的语句如下，区别就在于input语句的具体设定：
data 你对导入数据的命名;&&& infile& 源文件名,加上具体的盘符位置;&&& input 变量输入设定；
1.列输入模式(Column)--应用于字段固定的文本文件
对字段固定的源文件，input语句的形式是
input 变量名1&s&开始列数-结束列数变量名2&s&开始列数-结束列数 …;
一个能工作的列输入模式语句看起来就像下面展示的：
data work.example1;&&& infile 'C:\data\example1.dat'& firstobs=2 obs=100;&&& input ID 1-3 Name $ 5-10;
对字段的选择非常灵活，可以任意选择字段，以及安排读入的次序；
整个字段或其部分可以重复读入；
字段间不一定要有空格或者其他分隔符来分隔；
字符变量最多可含32K个字符，并可以包含空格符；
对缺失数据没有特殊占位符的要求。一个空格字段就读为一个缺失值，也不会引起其他字段读入的错误。
可以设定输入长度，但不可以设定输入格式。对数值型变量，只能读入标准数据值(Standard Numeric Data Value)，即只包括数字、正负号、小数点和科学计数符号E构成的数，对日期型数据，以及包含美元符号、逗号等其他符号的数值，因为需要设定输入格式并按格式读入，列输入模式就无法正确读取。
2.格式化输入模式(Formatted)--应用于字段固定的文本文件
格式化输入模式类似于列模式：
它适用于字段固定格式的数据文件；
它也给出字段开始的列数，但不直接给出结束列数，而是通过输入格式给出读入长度；
它可以设定输入格式。
它的input语句的格式为：
input &指针控制& 变量名输入格式 …;
注1：指针控制，就是将输入列指针控制在某个位置，作为读入字段的开始列号，它有 @n 或者 +n 的形式：
@n表示从第n列开始读入（指示开始列数的绝对位置）；
+n表示将列控制指针增加n列侯读入（指示相对位置）。
一个能工作的格式化输入模式语句看起来就像下面展示的：
data work.example2;&&& infile 'C:\data\example2.dat';&&& input Name $ 2. @3 Job $5. +7 Place $8.;
3.列举输入模式(List)--应用于分隔符固定的文本文件
因为分隔符固定的数据可以用分隔符来确定字段，input语句特别简单：
input 变量名&s&…;
如果要读入非空格分隔符的数据，就需要在infile语句中指明（默认为空格）：
infile 文件盘符 &dlm="分隔符"&;
在列举模式下，变量长度的缺省值为8，长度超过8的字符变量在读入时会被截断，这时可以用lenght语句来设定长度：
length 变量名&s& 长度;
列举模式也可以设定输入格式，这只需要在输入变量设定时附加上相关修饰：
input 变量名&s&: 输入格式 …;
一个能工作的列举输入模式语句看起来就像下面展示的：
data work.example3;&&& infile 'C:\data\example3.dat' dlm=',';&&& length item $ 10.;&&& input ID Name $ item $&& income:comma9.;
4.命名输入模式(Named)
命名输入模式很少见到，因为很难得见到这种格式的原数据，其中三个变量是ID、Name和Score：
1 Name=Tom Score=A2 Name=Jim& Score=C
相应的input语句为:
input& ID Name=$3. Score=$1.
参考资料：
汪嘉冈《SAS V8基础教程》，北京：中国统计出版社，2001
SAS .lineTutor: Basic and Intermediate SAS
了这篇文章
类别：┆阅读(0)┆评论(0)
20:44:18 20:45:21SAS数据步导入数据终极汇总——经典
一、将数据录入SAS——DATA Step / Viewtable
&&& 1.Internal
raw data- Datalines or Cards 命令；
&&& 2.External
Raw data files- Infile 命令 + Input 命令;
二、将数据文件读入SAS ——DATA Step / PROC IMPORT
&& 1.将SAS文件读入SAS——
&& data sasuser.
set "F:\sas1.sas7bdat";
&& proc contents
data=sasuser.
&& 2.将其他形式文件导入成SAS ——PROC IMPORT
/ 直接读入其他形式文件
&& proc import datafile =
"c:\data\hsb2.sav" out= work.hsb2;
&& proc contents data=hsb2;
&& SAS导入数据：SAS recognizes the
file type to be imported by file extension.
&&&&&&&&&&&&&&&&&&&&&
对数据长度的限制
在一些操作环境，SAS假定外部文件的纪录对最长为256（一行数据包括空格等所有字符在内的长度），如果预计读入的纪录长度超过256，可在Infile语句中使用LRECL=n
这个命令。
读入以空格作为分隔符的原始数据
如果原始数据的不同变量之间是以至少一个空格作为分隔符的，那可以直接采用List方法将这些数据读入SAS。
List Input读数据非常方便，但也有很多局限性：
（1）&&&&&&
不能跳过数据；
（2）&&&&&&
所有的缺失值必须以点代替
（3）&&&&&&
字符型数据必须是不包含空格的，且长度不能超过8；
（4）&&&&&&
不能直接读入日期型等特殊类型的数据。
程序举例：
INPUT Name $ Age H
&&&&&&&&&&&&&&&&&&&&&
读入按列组织的数据
有些原始数据的变量之间没有空格或其他分隔符，因此这样的文件不能以List形式对入SAS。但若不同变量值的都在每条记录的固定位置处，则可以按照Column
形式读入数据。Colunm读数据方法要求所有的数据均为字符型或者标准的数值型（数值中仅包括数字，小数点，正负号，或者是E，不包括逗号或日期型数据）。
相对于List方法，Column读数据方法有如下优点：
（1）&&&&&&
变量值之间无需用空格分开；
（2）&&&&&&
可以空格表示缺失值；
（3）&&&&&&
字符型数据中可包括空格；
（4）&&&&&&
可跳过数据。
程序举例：
INPUT Name $ 1-10 Age 11-13 Height 14-18;
使用格式命令读入非标准格式的数据
字符型数据： $informat w.
数值型数据：&& informat w.d
日期型数据：&& Datew.
(1)字符型：
$CHARw. ：不删除前后空格，读入字符数据；
$HEXw. ：将16进制的数据转化成字符数据；
：删除前面空格，读入字符数据；
(2)日期，时间或日期时间型数据
DATEw.&&&&&&&&&&
：以ddmmmyy或ddmmmyyyy形式读入日期；
DATETIMEw. ：以ddmmmyy hh:mm:ss.ss 形式读入日期时间；
DDMMYYw.&&&&
：以ddmmyy或ddmmyyyy读入日期；
JULIANw.&&&&&&&
：以yyddd或yyyyddd读入Julia日期；
MMDDYYw.&&&&
：以mmddyy或mmddyyyy形式读入日期；
TIMEw.&&&&&&&&&&&&
：以hh:mm:ss.ss形式读入时间；
(3)数值型数据
COMMAw.d&&&&&&
：读入数值型数据，将其中的逗号，$ 删除，并将括号转化为负号
HEXw.&&&&&&&&&&&&&&&
：将16进制数据转化成浮点型数据
IBw.d&&&&&&&&&&&&&&&&&
：读入整数二进制数据；
PERCENTw.&&&&
：将百分数转化为普通数据；
w.d&&&&&&&&&&&&&&&&&&&&&
：读入标准的数值型数据。
INPUT Name $16. Age 3. +1 Type $1. +1 Date
MMDDYY10.&
(Score1 Score2 Score3 Score4 Score5) (4.1);
&&&&&&&&&&&&&&&&&&&&&&&&
多种输入格式综合
读入位置控制——列指针
+n &n ：控制列指针从当前位置向前或向后移动n个字符；
@n&& ：控制列指针指向
INPUT ParkName $ 1-22 State $ Year @40 Acreage COMMA9.;
&&&&&&&&&&&&&&&&&&&&&&&&
读入杂乱数据
在不确定从哪一列开始读入数据，但知道读入的数据均位于某一特定字符或字符串之后时，可采用@’character’列指针。
如：有字符串如下，需读入Breed：后面的字符串
My dog Sam Breed: Rottweiler Vet Bills: $478
（1）SAS 语句：Input @’Breed: ’ DogBreed
读入内容： Rottweil
读入Breed：后面的字符串，赋给DogBreed，读入时到空格时，自动结束。
（2）SAS 语句：Input @’Breed:’ DogBreed
$20.;&& 读入内容：Rottweiler Vet
读入Breed: 后面的字符串，赋给DogBreed，读入字符串的长度为20。
（3）SAS语句：Input @’Breed:’ DogBreed
:$20.;&& 读入内容：Rottweiler
读入Breed: 后面的字符串，赋给DogBreed，读入字符串的长度为20，但遇到空格时不再继续读数据。
从原始数据中读入多行数据作为SAS的一条观测
使用行指针：
‘ / ’—— 到下一行读数据
‘#n ’——到第n 行读数据
INPUT City $ State $ / NormalHigh NormalLow #3 RecordHigh
从一行原始数据中读入多个观测
在Input语句末尾使用@@标示，告诉SAS继续读入本行后面的数据。
INPUT City $ State $ NormalRain MeanDaysRain @@;
有选择的读入原始数据
SAS让用户无需读入所有的原始数据，然后再判断是否是用户所需要的数据。用户仅需先读入足够的变量，以判断该条观测是否为自己所需，然后在INPUT语句后以@结尾，以使SAS读数据的指针停在此处，保留此行数据。然后用户使用IF语句判断读入的观测是否为所需数据，若是，则使用第二个INPUT语句继续读入其余数据。
INPUT Type $ @;&
&& IF Type = ’surface’ THEN
&& INPUT Name $ 9-38 AMTraffic
均为锁定数据行的标示；
@标示在SAS进入下个循环之前就释放锁定的数据行，而@@标示在继续锁定数据行
在INFILE语句中控制输入的选项
（1）FIRSTOBS=n : 从n条观测开始读入数据
（2）OBS=n 读入n条观测
（3）当读进内存的观测长度小于INPUT语句设定的长度时
当SAS指针到达一条记录的末尾，而在INPUT语句中尚有未读入的变量时，SAS默认继续读入下一行数据。
MISSOVER：不读入下一行数据，而将未赋值的变量以缺失值填充。&
TRUNCOVER：当使用column或格式化读入方式时，某些数据行长度小于其他数据行长度时，使用TRUNCOVER选项，可防止SAS读入下一行数据。
使用DATA步读入分隔符文件
在INFILE语句中使用DLM= 选项或者DSD选项可以读入以特定符号作为分隔符的原始文件。
（1）The DLM= option （i.e. DLM=’&’）
如果是以Tab作为分隔符，则使用DLM=’09’X命令
（2）The DSD option：主要有三个功能
忽略单引号内的分隔符；
不将引号作为数据读入SAS；
将一行内连续两个单引号作为一个缺失值处理。
使用IMPORT程序步读入分隔符文件
IMPORT 程序的功能
（1）&&&&&&
自动扫描数据文件，并确定变量的类型（数值型或字符型）；
（2）&&&&&&
为字符型变量，自动设定变量的长度；
（3）&&&&&&
识别一些日期型数据；
（4）&&&&&&
将两个连续的分隔符作为一个缺失值读入SAS
（5）&&&&&&
读入引号内数据
（6）&&&&&&
自动将原始数据中不存在的变量赋缺失值；
PROC IMPORT DATAFILE=’filename’ OUT=data-
SAS根据读入文件的扩展名确定文件的类型。若读入文件没有正确的扩展名，或者是DLM文件，用户必须在IMPORT程序步中使用DBMS=option
选项。当读入数据集的名称已经存在于SAS库中，可用REPLACE选项将原数据覆盖。
PROC IMPORT DATAFILE=’filename’ OUT=data-set DBMS=identifier
在默认情况下，IMPORT程序步将第一行数据作为变量的名称。若第一行数据并非变量名，可在IMPORT语句后使用GETNAMES=NO语句。
若IMPORT程序读入的是分隔符文件，默认分隔符为空格。若不是，则需使用DILIMITER=statement语句指定分隔符。
PROC IMPORT DATAFILE=’filename’
OUT=data-set&
&&&&&&&&&&&&&&&&&&&&&&&&&&&
DBMS=DLM REPLACE;
GETNAMES=NO;
DELIMITER=’delimiter-character’;
使用IMPORT程序步读入PC文件
PROC IMPORT DATAFILE=’filename’ OUT=data-set
DBMS=identifier REPLACE;
列示SAS数据集的内容
PROC CONTENTS DATA=data-
CONTENTS程序步的功能是显示SAS对数据集的具体描述，主要内容有：
（1）&&&&&&
数据集描述
&&&&&&&&&&&&&&
数据集的名称；
&&&&&&&&&&&&&&
观测的数量；
&&&&&&&&&&&&&&
变量的数量；
&&&&&&&&&&&&&&
（2）&&&&&&
&&&&&&&&&&&&&&
变量类型；
&&&&&&&&&&&&&&
变量长度；
&&&&&&&&&&&&&&
变量的输出格式；
&&&&&&&&&&&&&&
变更的输入格式；
&&&&&&&&&&&&&
变量标识。
1.读入逗号分隔数据：cars_novname.csv
Acura,MDX,SUV,Asia,All,"$36,945 ","$33,337
",3.5,6,265,17,23,
Acura,RSX Type S 2dr,Sedan,Asia,Front,"$23,820 ","$21,761
",2,4,200,24,31,
Acura,TSX 4dr,Sedan,Asia,Front,"$26,990 ","$24,647
",2.4,4,200,22,29,
Acura,TL 4dr,Sedan,Asia,Front,"$33,195 ","$30,299
",3.2,6,270,20,28,
Acura,3.5 RL 4dr,Sedan,Asia,Front,"$43,755 ","$39,014
",3.5,6,225,18,24,
proc import datafile="cars_novname.csv" out=mydata dbms=csv
proc contents data=
SAS creates default variable names as VAR1-VARn when variables
names are not present in the raw data file.
2.读入制表键分隔的数据：
proc import datafile="cars.txt" out=mydata dbms=
&& getnames=
3.根据不同任务将不同的数据集永久保存到对应任务的文件夹下：
libname dis "c:\dissertation";
proc import datafile="cars.txt" out=dis.mydata dbms=dlm
&& delimiter='09'x;
&& getnames=
3.读入空格键分隔的数据：
proc import datafile="cars_sp.txt" out=mydata dbms=dlm
4.分隔符的终极例子：
Other kinds of delimiters
You can use delimiter= on the infile statement to tell SAS what
delimiter you are using to separate variables in your raw data
file. For example, below we have a raw data file that uses
exclamation points ! to separate the variables in the file.
The example below shows how to read this file by using
delimiter='!' on the infile statement.
INFILE 'readdel1.txt' DELIMITER='!' ;
PROC PRINT DATA=
As you can see in the output below, the data was read properly.
It is possible to use multiple delimiters. The example file below
uses either exclamation points or plus signs as delimiters.
By using delimiter='!+' on the infile statement, SAS will recognize
both of these as valid delimiters.
INFILE 'readdel2.txt' DELIMITER='!+' ;
PROC PRINT DATA=
As you can see in the output below, the data was read properly.
import缺陷及注意事项：
Proc import does not know the formats for your variables, but it is
able to guess the format based on what the beginning of your
dataset looks like. Most of the time, this guess is fine. But if
the length of a variable differs from beginning to end of your
file, you might end up with some truncated values.
重点语法-Infile options
For more complicated file layouts, refer to the infile options
described below.
The dlm= option can be used to specify the delimiter that separates
the variables in your raw data file. For example, dlm=','indicates
a comma is the delimiter (e.g., a comma separated file, .csv file).
Or, dlm='09'x indicates that tabs are used to separate your
variables (e.g., a tab separated file).
The dsd option has 2 functions. First, it recognizes two
consecutive delimiters as a missing value. For example, if your
file contained the line 20,30,,50 SAS will treat this as 20 30 50
but with the the dsd option SAS will treat it as 20 30 . 50 , which
is probably what you intended. Second, it allows you to include the
delimiter within quoted strings. For example, you would want to use
the dsd option if you had a comma separated file and your data
included values like "George Bush, Jr.". With the dsd option, SAS
will recognize that the comma in "George Bush, Jr." is part of the
name, and not a separator indicating a new variable.
This option tells SAS what on what line you want it to start
reading your raw data file. If the first record(s) contains header
information such as variable names, then set firstobs=n where n is
the record number where the data actually begin. For example, if
you are reading a comma separated file or a tab separated file that
has the variable names on the first line, then use firstobs=2 to
tell SAS to begin reading at the second line (so it will ignore the
first line with the names of the variables).
This option prevents SAS from going to a new input line if it does
not find values for all of the variables in the current line of
data. For example, you may be reading a space delimited file and
that is supposed to have 10 values per line, but one of the line
had only 9 values. Without the missover option, SAS will look for
the 10th value on the next line of data. If your data is supposed
to only have one observation for each line of raw data, then this
could cause errors throughout the rest of your data file. If you
have a raw data file that has one record per line, this option is a
prudent method of trying to keep such errors from cascading through
the rest of your data file.
Indicates which line in your raw data file should be treated as the
last record to be read by SAS. This is a good option to use for
testing your program. For example, you might use obs=100 to just
read in the first 100 lines of data while you are testing your
program. When you want to read the entire file, you can remove the
obs= option entirely.
A typical infile statement for reading a comma delimited file that
contains the variable names in the first line of data would be:
INFILE "test.txt" DLM=',' DSD MISSOVER FIRSTOBS=2 ;
读入有缺失值的数据或者读入数值中含有分隔符的数据
DATA cars2;
length make $ 20 ;
INFILE 'readdsd.txt' DELIMITER=',' DSD ;
PROC PRINT DATA=cars2;
48,'Bill Clinton',210
50,'George Bush, Jr.',180
DATA guys2;
length name $ 20 ;
INFILE 'readdsd2.txt' DELIMITER=',' DSD ;
PROC PRINT DATA=guys2;
最经典例子：从某行开始读入数据
DATA cars2;
length nf 8;
INFILE 'F:\cars1.csv' DELIMITER=',' dsd MISSOVER firstobs=2 ;
INPUT nf zh hh xb cs IHA
PROC PRINT DATA=cars2;
从FTP读入数据
read raw data via FTP in SAS?&
SAS has the ability to read raw data directly from FTP servers.
Normally, you would use FTP to download the data to your local
computer and then use SAS to read the data stored on your local
computer. SAS allows you to bypass the FTP step and read the data
directly from the other computer via FTP without the intermediate
step of downloading the raw data file to your computer. Of course,
this assumes that you can reach the computer via the internet at
the time you run your SAS program. The program below illustrates
how to do this. After the filename in you put ftp to tell SAS to
access the data via FTP. After that, you supply the name of the
file (in this case 'gpa.txt'. lrecl= is used to specify the width
of your data. Be sure to choose a value that is at least as wide as
your widest record. cd= is used to specify the directory from where
the file is stored. host= is used to specify the name of the site
to which you want to FTP. user= is used to provide your userid (or
anonymous if connecting via anonymous FTP). pass= is used to supply
your password (or your email address if connecting via anonymous
FILENAME in FTP 'gpa.txt' LRECL=80&
&&&&&&&&&&&&&&&
CD='/local2/samples/sas/ats/'&
&&&&&&&&&&&&&&&
HOST='cluster.oac.ucla.edu'
&&&&&&&&&&&&&&&
USER='joebruin'
&&&&&&&&&&&&&&&
PASS='yourpassword' ;
&& INPUT gpa hsm hss hse satm
PROC PRINT DATA=gpa(obs=10) ;
读入多个数据文件
quarter1.dat
quarter2.dat
quarter3.dat
quarter4.dat
filename year ('d:\quarter1.dat' 'd:\quarter2.dat'
'd:\quarter3.dat' 'd:\quarter4.dat');
input quarter sales
proc print data =
读取excel 数据集
Reading an Excel file into SAS
Suppose that you have an Excel spreadsheet called auto.xls. The
data for this spreadsheet are shown below.
MAKE&&&&&&&&&&
MPG WEIGHT PRICE
AMC Concord&&&
Pacer&&&&&
Spirit&&&&
Buick Century 20&& 3250
Buick Electra 15&& 4080
Using the Import Wizard is an easy way to import data into SAS. The
Import Wizard can be found on the drop down file menu. Although the
Import Wizard is easy it can be time consuming if used repeatedly.
The very last screen of the Import Wizard gives you the option to
save the statements SAS uses to import the data so that they can be
used again. The following is an example that uses common options
and also shows that the file was imported correctly.
PROC IMPORT OUT= WORK.auto1&
&&&&&&&&&&&
DATAFILE= "C:\auto.xls"&
&&&&&&&&&&&
DBMS=EXCEL REPLACE;
SHEET="auto1";&
GETNAMES=YES;
MIXED=YES;
USEDATE=YES;
SCANTIME=YES;
proc print data=auto1;
MAKE&&&&&&&&&&&&
Concord&&&&&&
Pacer&&&&&&&&
Spirit&&&&&&&
4&&& Buick
Century&&&&
5&&& Buick
Electra&&&&
First we use the out= statement to tell SAS where to store the data
once they are imported.&
Next the datafile= statement tells SAS where to find the file we
want to import.&
The dbms= statement is used to identify the type of file being
imported. This statement is redundant if the file you want to
import already has an appropriate file extension, for example
The replace statement will overwrite an existing
To specify which sheet SAS should import use the sheet="sheetname"
statement. The default is for SAS to read the first sheet. Note
that sheet names can only be 31 characters
The getnames=yes is the default setting and SAS will automatically
use the first row of data as variable names. If the first row of
your sheet does not contain variable names use the
getnames=no.&
SAS uses the first eight rows of data to determine whether the
variable should be read as character or numeric. The default
setting mixed=no assumes that each variable is either all character
or all numeric. If you have a variable with both character and
numeric values or a variable with missing values use mixed=yes
statement to be sure SAS will read it
correctly.&
Conveniently SAS reads date, time and datetime formats. The
usedate=yes is the default statement and SAS will read date or time
formatted data as a date. When usedate=no SAS will read date and
time formatted data with a datetime format. Keep the default
statement scantime=yes to read in time formatted data as long as
the variable does not also contain a date
Example 1: Making a permanent data file
What if you want the SAS data set created from proc import to be
permanent? The answer is to use libname statement. Let's say that
we have an Excel file called auto.xls in directory "d:\temp" and we
want to convert it into a SAS data file (call it myauto) and put it
into the directory "c:\dissertation". Here is what we can do.
libname dis "c:\dissertation";
proc import datafile="d:\temp\auto.xls" out=dis.myauto
Example 2: Reading in a specific sheet
Sometimes you may only want to read a particular sheet from an
Excel file instead of the entire Excel file. Let's say that we have
a two-sheet Excel file called auto2.xls. The example below shows
how to use the option sheet=sheetname to read the second sheet
called page2 in it.
proc import datafile="auto2.xls" out=auto1
sheet="page2";
Example 3: Reading a file without variable names
What if the variables in your Excel file do not have variable
names? The answer here is to use the statement getnames=no in proc
import. Here is an example showing how to do this.
proc import datafile="a:\faq\auto.xls" out=
Writing Excel files out from SAS
It is very easy to write out an Excel file using proc export in SAS
version 8. Consider the following sample data file below.
MAKE&&&&&&&&&&&&&&
MPG&&&&&&&&&
WEIGHT&&&&&&&&&&
AMC&&&&&&&&&&&&&&&&
22&&&&&&&&&&&
2930&&&&&&&&&&&
AMC&&&&&&&&&&&&&&&&
17&&&&&&&&&&&
3350&&&&&&&&&&&
AMC&&&&&&&&&&&&&&&&
22&&&&&&&&&&&
2640&&&&&&&&&&&
Buick&&&&&&&&&&&&&&
20&&&&&&&&&&&
3250&&&&&&&&&&&
Buick&&&&&&&&&&&&&&
15&&&&&&&&&&&
4080&&&&&&&&&&&
Here is a sample program that writes out an Excel file called
mydata.xls into the directory "c:\dissertation".
proc export data=mydata outfile='c:\dissertation\mydata.xls'
SAS读入复杂分隔数据——字符长度不同，字符中间有空格作为间隔符
1.字符长度不同
length site $41;
input age site $
12&&123456
130&&97654
254&&987654
Obs&&&&&&&&&&&&&&&&&&&&&
site&&&&&&&&&&&&&&&&&&&&&&
1&&&&&&&&&&&&&&&&&&
2&&&&&&&&&&&&&&&&&&&
input age site & $41.
130&&97654
age&&&&&&&&&&&&&&&&&&&&&
site&&&&&&&&&&&&&&&&&&&&&&&
12&&&&&&&&&&&&&&&&
130&&&&&&&&&&&&&&&&&&&
254&&&&&&&
2.字符有多个单词，单词之间用空格隔开
infile 'C:\messy.txt' delimiter = ' '
length fruit $22;
input zip fruit $
fruit&&&&&&&&&&&&&&&&&&
apples, grapes
oranges&&&&&&&&&&&&&&&
apple&&&&&&&&&&&
input zip fruit & $22.
10034 apples, grapes kiwi 123456
92626 oranges 97654
25414 pears
apple&&&&&
fruit&&&&&&&&&&&&&&&&&
10034&&& apples,
grapes kiwi&&&
oranges&&&&&&&&&&&&&&&&
25414&&& pears
apple&&&&&&&&&&&
没有格式库的情况下读入数据：
read a SAS data file when I don't have its format
If you try to use a SAS data file that has permanent formats but
you don't have the format library, you will get errors like
ERROR: The format $MAKEF was not found or could not be
ERROR: The format FORGNF was not found or could not be
Without the format library, SAS will not permit you to do anything
with the data file. However, if you
top of your program, SAS will go ahead and process the file despite
the fact that it does not have the format library. You will not be
able to see the formatted values for your variables, but you will
be able to process your data file. Here is an example.
libname in "c:\";
PROC FREQ DATA=in.
高效的保留或者去掉部分变量的方式：
The following program creates exactly the same file, but is a more
efficient program because SAS only reads the desired variables.
DATA auto2;
&& SET auto (KEEP = make mpg
The drop data step option works in a similar way.
DATA AUTO2;
&& SET auto (DROP = rep78 hdroom
trunk weight length&
&&&&&&&&&&&&&&&&&&&
turn displ gratio foreign);
比较双份录入的差异是否存在：
proc compare base = person1 compare = person2
已投稿到：
以上网友发言只代表其个人观点，不代表新浪网的观点或立场。}

天天发财游戏网