Discussion:
proc import, .csv, and commas in fields
(too old to reply)
Andrei Jirnyi
2012-05-07 20:47:26 UTC
Permalink
Is there any way to use proc import to import a .csv file, where
several variables contain longish text fields with commas in them
(enclosed in quotes, possibly containing newlines and literal quotes
escaped by doubling)? e.g. if I have:

# test.csv:
a,b,c
1, "some text", 11
2, "some, more, text", 22
3, "yet some, ""quoted text"" with, commas", 33

# test.sas:
proc import datafile="test.csv" out=test dbms=csv;
run;

proc print data=test;
run;

I get garbage:
# test.lst
Obs a b c

1 1 some text 11
2 2 some, more, 22
3 3 yet some, " 33
4 .

Obs VAR4 VAR5

1 .
2 .
3 .
4 .
Reeza
2012-05-07 23:09:46 UTC
Permalink
Post by Andrei Jirnyi
Is there any way to use proc import to import a .csv file, where
several variables contain longish text fields with commas in them
(enclosed in quotes, possibly containing newlines and literal quotes
a,b,c
1, "some text", 11
2, "some, more, text", 22
3, "yet some, ""quoted text"" with, commas", 33
proc import datafile="test.csv" out=test dbms=csv;
run;
proc print data=test;
run;
# test.lst
Obs a b c
1 1 some text 11
2 2 some, more, 22
3 3 yet some, " 33
4 .
Obs VAR4 VAR5
1 .
2 .
3 .
4 .
When you import SAS will generate the code in your log.
Copy the log and modify the length of the variables to import until you get it the way you want/need.

Though I can see that third line might be difficult...

TIP you can get rid of the line numbers by holding down ALT and then just selecting the code, rather than the log numbers.
lchmdream
2012-05-08 02:23:15 UTC
Permalink
Post by Reeza
Post by Andrei Jirnyi
Is there any way to use proc import to import a .csv file, where
several variables contain longish text fields with commas in them
(enclosed in quotes, possibly containing newlines and literal quotes
a,b,c
1, "some text", 11
2, "some, more, text", 22
3, "yet some, ""quoted text"" with, commas", 33
proc import datafile="test.csv" out=test dbms=csv;
run;
proc print data=test;
run;
# test.lst
Obs               a    b              c
 1                1    some text      11
 2                2    some, more,    22
 3                3    yet some, "    33
 4                .
Obs    VAR4               VAR5
 1                           .
 2                           .
 3                           .
 4                           .
When you import SAS will generate the code in your log.
Copy the log and modify the length of the variables to import until you get it the way you want/need.
Though I can see that third line might be difficult...
TIP you can get rid of the line numbers by holding down ALT and then just selecting the code, rather than the log numbers.- Hide quoted text -
- Show quoted text -
/
**********************************************************************
* PRODUCT: SAS
* VERSION: 9.1
* CREATOR: External File Interface
* DATE: 04MAY12
* DESC: Generated SAS Datastep Code
* TEMPLATE SOURCE: (None Specified.)

***********************************************************************/
data WORK.text ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'C:\0 Frank LI\Text.csv' delimiter = ',' MISSOVER DSD
lrecl=32767 firstobs=2 ;
informat a best32. ;
informat b $50. ;
informat c $31. ;
informat VAR4 $10. ;
informat VAR5 best32. ;
format a best12. ;
format b $50. ;
format c $31. ;
format VAR4 $10. ;
format VAR5 best12. ;
input
a
b $
c $
VAR4 $
VAR5
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR
detection macro variable */
run;

data textnew;
set text;
bb = compress(b, '"');
n = count(bb, ',') + 1;
do i = 1 to n;
y = scan(bb, i, ',');
output;
end;
keep a b c y;
run;

Loading...