Discussion:
Wildcard character in Proc Datasets
Hari
2006-01-16 17:45:24 UTC
Hi,

I can use Proc datasets to copy many files STARTING with a particular
letter or string (using a colon : after the repeating letters).

How do I copy only selected datasets from my library when the
files of interest have names ENDING with, let's say, "_Pattern" and I
want to copy all such data sets? For example, I want to copy the datasets
named "ABC_Pattern", "YetAnother_Pattern", "More_Pattern",
"NeverEnding_Pattern", "Meaningless_Pattern", "ItMakesSense_Pattern"
without specifying the name of each file. Is it possible? Is there a
"wildcard character" for the start of the filename?

Similarly I want to delete all the files ENDING with "_DeleteMe" from a
particular library. Is there a method equivalent to specifying colon
for the same?

regards,
Hari
India

PS: Is it non-standard to name files in the above manner?
Stéphane Colas
2006-01-16 20:02:49 UTC
In proc datasets you can select files whose names start with a given
string (using the colon :), or all the files in a library,

but selecting on the ending string is difficult. I suggest you read the
library's member list and load the names that match '%_pattern' into macro
variables (myfiles1, myfiles2, ...) via a call symput statement.

Afterwards you expand the names with a macro %do loop (inside a macro)...

proc datasets...;
select %do i=1 %to &numfiles ; &&myfiles&i %end;;
quit;
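
A fuller sketch of that idea, under some assumptions: SRC and DEST are placeholder librefs that must already be assigned, and the macro name CopyEndingWith is made up for illustration. It reads the member list from sashelp.vstable, stores each matching name in its own macro variable, and expands those names in the SELECT statement.

```sas
/* Sketch only: SRC, DEST and the macro name are placeholders. */
%macro CopyEndingWith(suffix=, from=SRC, to=DEST);
  %local i numfiles;
  %let numfiles = 0;

  /* Store each matching data set name in myfiles1, myfiles2, ... */
  data _null_;
    set sashelp.vstable;
    where libname = "%upcase(&from)";
    /* Keep only names that are long enough and end with the suffix */
    if length(memname) >= length("&suffix");
    if upcase(substr(memname, length(memname) - length("&suffix") + 1))
       = upcase("&suffix");
    count + 1;
    call symputx(cats('myfiles', count), memname, 'L');
    call symputx('numfiles', count, 'L');
  run;

  /* Run PROC DATASETS only when at least one name matched */
  %if &numfiles > 0 %then %do;
    proc datasets library=&from nolist;
      copy out=&to memtype=data;
      select %do i = 1 %to &numfiles; &&myfiles&i %end; ;
    quit;
  %end;
%mend CopyEndingWith;

%CopyEndingWith(suffix=_Pattern, from=SRC, to=DEST);
```

The double ampersand matters here: &&myfiles&i resolves first to &myfiles1, &myfiles2, and so on, and then to the stored names.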


Stéphane.
Richard A. DeVenezia
2006-01-16 21:10:00 UTC
Post by Hari
Hi,
I can use Proc datasets to Copy many files STARTING with a particular
letter or string (using colon : after the repeating letters) .
How do I copy only selected datasets from my library when the
files of interest have names ENDING with, let's say, "_Pattern" and I
want to copy all such data sets?
Use the metadata dictionary to make a list of names based on any selection
criteria you can dream up.
Post by Hari
For example I want to copy dataset
named "ABC_Pattern", "YetAnother_Pattern", "More_Pattern",
"NeverEnding_Pattern", "Meaningless_Pattern" , "ItMakesSense_Pattern"
without specifying the name of each file. Is it possible?
Without specifying each -- no. Possible -- yes.
Post by Hari
Is there a "wildcard character" for starting of the filename?
Not in DATASETS
Post by Hari
Similarly I want to delete all the files ENDING with "_DeleteMe" from
a particular library. Is there a method equivalent to specifying colon
for the same?
No.


Going back to "Use the metadata dictionary to make a list of names based on
any selection criteria you can dream up." I chose Perl regular expressions
as the agent of specification for selection.
------------------------
%macro ListMembersNamedLike (pattern=123, lib=work, resultList=,
memtype=DATA);

proc sql;
select memname into :&resultList separated by ' '
from dictionary.members
where libname = "%upcase(&lib.)"
and prxMatch("/&pattern./i",trim(memname)) ne 0
and memtype = "%upcase(&memtype)"
;
quit;
%mend;


data x_1 x_2 x_3 x_4 x_5 x_55 x_100;
x=1;
run;

data x_42 / view=x_42;
set x_1;
run;

options mprint;

%let myList=;
%ListMembersNamedLike (pattern=^x_, lib=work, resultList=myList);
%put myList=&myList;

%let myList=;
%ListMembersNamedLike (pattern=_\d$, lib=work, resultList=myList);
%put myList=&myList;

%let myList=;
%ListMembersNamedLike (pattern=_\d+$, lib=work, resultList=myList);
%put myList=&myList;


%let myList=;
%ListMembersNamedLike (pattern=_\d{2}$, lib=work, resultList=myList,
memtype=VIEW);
%put myList=&myList;
------------------------
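
To connect the macro back to the original question, here is one way the resulting list might be handed to PROC DATASETS. This is a sketch, not tested against the poster's data: SRC and DEST are placeholder librefs, and it assumes %ListMembersNamedLike from above has already been compiled. If nothing matches, the list is empty and the SELECT or DELETE statement will fail, so check the list first in real code.

```sas
/* Copy every data set whose name ends in _Pattern (match ignores case) */
%let copyList=;
%ListMembersNamedLike (pattern=_Pattern$, lib=SRC, resultList=copyList);
%put copyList=&copyList;

proc datasets library=SRC nolist;
  copy out=DEST memtype=data;
  select &copyList;
quit;

/* Delete every data set whose name ends in _DeleteMe */
%let killList=;
%ListMembersNamedLike (pattern=_DeleteMe$, lib=SRC, resultList=killList);
%put killList=&killList;

proc datasets library=SRC memtype=data nolist;
  delete &killList;
quit;
```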

Richard A. DeVenezia
http://www.devenezia.com/
Hari
2006-01-18 19:52:34 UTC
Richard,

Thanks for the RegEx solution. Your solution will give me a starting
point for learning how to use RegEx.

I was trying RegEx this past week for a problem of my own in SAS and
couldn't make much headway (later it got solved through some other
method).

Regards,
Hari
India
Daniel Boisvert
2006-01-17 22:46:33 UTC
If you don't want to use Perl expressions, you could use something like:

PROC SQL;
SELECT DISTINCT memname
INTO :delme SEPARATED BY ' '
FROM sashelp.vstable
WHERE UPCASE(libname)='WORK' AND SCAN(UPCASE(memname),-1,'_')='DELETEME';
QUIT;

PROC DATASETS MT=data LIB=work NOLIST;
DELETE &delme.;
QUIT;
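
A possible variant for the copy case, offered as a sketch: it swaps SCAN for a LIKE test with an escape character, so that a name such as PATTERN (no underscore) is not picked up, and it skips PROC DATASETS when nothing matches. The libref DEST and the macro name CopyPattern are assumptions for illustration only.

```sas
/* Sketch only: DEST is a placeholder libref, CopyPattern a made-up name. */
%macro CopyPattern;
  %local copyme;

  proc sql noprint;
    select memname
      into :copyme separated by ' '
      from sashelp.vstable
      where libname = 'WORK'
        /* ^_ means a literal underscore, not the one-character wildcard */
        and upcase(memname) like '%^_PATTERN' escape '^';
  quit;

  /* SQLOBS holds the number of rows the SELECT returned */
  %if &sqlobs > 0 %then %do;
    proc datasets library=work nolist;
      copy out=dest memtype=data;
      select &copyme;
    quit;
  %end;
  %else %put NOTE: No data sets ending in _Pattern were found.;
%mend CopyPattern;

%CopyPattern
```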

Just a thought...
Dan
Hari
2006-01-18 19:06:52 UTC
Daniel,

I liked this method a lot. In fact, a person in this group used INTO
within Proc SQL for one of my problems, and since then I have been
hooked on it and have been using it indiscriminately (though in many
places it would have been more efficient to use some other method).

I will have to start exploring the different tables within SASHELP (I
peeked into it just now) so that I can put them to use.

I have 2 requests:-

a) Could you please point me to a place where I can look up the
information stored in "dictionary.table" (this was in the FROM clause of
Proc SQL in a solution I got for another topic)? I am able to see
SASHELP.Table in my SAS explorer. Is dictionary.table accessible from
SAS explorer?

b) Is there any place in HELP where the "system tables" such as
sashelp.table and dictionary.table are documented? I would get a better
understanding of the information already recorded by SAS in a "neat" form
and might be able to put it to some original use in future.

Regards,
Hari
India
Richard A. DeVenezia
2006-01-18 19:53:13 UTC
Post by Hari
Daniel,
I liked this method a lot. In fact, a person in this group used INTO
within Proc SQL for one of my problems, and since then I have been
hooked on it and have been using it indiscriminately (though in many
places it would have been more efficient to use some other method).
You best be prepared to get bit. Indiscriminate use will lead you to the
64K limit of macro variable values.

%let x = %sysfunc(repeat(1,65535));


INTO is not so polite as to tell you up front when your into :var will be
receiving more data than is allowed in a macro variable.


data x;
do x = 1 to 100000;
output;
end;
run;

proc sql;
reset noprint;
select x into :xs separated by ' ' from x;
quit;

Only when you try to resolve it does the error appear.

%put %length(&xs);


Note that the macro variable (or rather the macro system?) knows the
oversized length of the value. This is a little odd and might
indicate that longer macro variable values will be allowed in future releases.
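
One way to stay clear of the limit altogether is to skip the macro variable and let a DATA step generate the statements with CALL EXECUTE. The sketch below applies that to the _DeleteMe case from earlier in the thread; the library (WORK) and the suffix are taken from that example, and the rest is an assumption about how one might wire it up.

```sas
/* Sketch only: builds one DELETE statement per matching data set. */
data _null_;
  set sashelp.vstable end=done;
  where libname = 'WORK';
  retain started 0;

  if prxmatch('/_DeleteMe$/i', strip(memname)) then do;
    /* Open PROC DATASETS once, on the first matching name */
    if not started then do;
      call execute('proc datasets library=work memtype=data nolist;');
      started = 1;
    end;
    call execute('delete ' || strip(memname) || ';');
  end;

  /* Close the procedure only if it was opened */
  if done and started then call execute('quit;');
run;
```

The generated statements are queued and run after the DATA step ends, so no single string ever has to hold the whole list of names.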
--
Richard A. DeVenezia
http://www.devenezia.com/
Hari
2006-01-18 19:56:27 UTC
Richard,

Thanks for the tip. I wasn't aware that such limits exist. I got the error

ERROR: The text expression length (588894) exceeds maximum length (65534). The text
       expression has been truncated to 65534 characters.
1372  %put %length(&xs);
65534

regards,
Hari
India
Hari
2006-01-18 20:00:33 UTC
Daniel and Richard,

I just wanted to add why my filenames end with the patterns
described above rather than start with them.

If I want to find files in Windows Explorer or even SAS Explorer and
many of the file names start with the same pattern, it becomes
difficult to navigate quickly to a particular file. So the differentiation
is kept in the starting part. Does this logic make sense? Does everybody
maintain files in such a manner? (Anyway, not all my files follow such
naming rules; I devise rules, though probably not the best, as per
the situation.)

Regards,
Hari
India
David L Cassell
2006-01-18 19:38:43 UTC
Post by Hari
I liked this method a lot. In fact, a person in this group used INTO
within Proc SQL for one of my problems, and since then I have been
hooked on it and have been using it indiscriminately (though in many
places it would have been more efficient to use some other method).
I will have to start exploring the different tables within SASHELP (I
peeked into it just now) so that I can put them to use.
I have 2 requests:-
a) Could you please point me to a place where I can look up the
information stored in "dictionary.table" (this was in the FROM clause of
Proc SQL in a solution I got for another topic)? I am able to see
SASHELP.Table in my SAS explorer. Is dictionary.table accessible from
SAS explorer?
b) Is there any place in HELP where the "system tables" such as
sashelp.table and dictionary.table are documented? I would get a better
understanding of the information already recorded by SAS in a "neat" form
and might be able to put it to some original use in future.
In reverse order...

The SASHELP.### tables are merely views of the DICTIONARY.@@@
tables. The names are not identical, but close enough to tell which is
which. So the dictionary.indexes table is also accessible through the
sashelp.vindex view.

There are differences here and there. The dictionary.members table
has meta-data on *all* objects in defined SAS libraries. This has a
matching view: sashelp.vmember. But it also gets split apart into
sashelp.vscatalg, sashelp.vslib, sashelp.vstable, sashelp.vstabvw, and
so on for all the different types of members.

If you go to the SAS Online Docs and search for 'VCOLUMN' you should find
a table of the dictionary datasets, their related sashelp views, and a quick
description. But to get the exact contents, I recommend you look for
yourself:


proc sql ;
describe table dictionary.indexes;
quit;


Now you have the exact details on the contents of the file. To see how the
views are built from these dictionary tables, do this:


proc sql;
describe view sashelp.vindex;
quit;
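
For a quick inventory from inside SAS itself, something like the following may help. It is a sketch that assumes your release provides DICTIONARY.DICTIONARIES (a dictionary table describing the dictionary tables); if it does not, the first query will simply fail and the documentation table is the fallback.

```sas
proc sql;
  /* One row per dictionary table, with its label */
  select distinct memname, memlabel
    from dictionary.dictionaries;

  /* The SASHELP views whose names start with V */
  select memname
    from dictionary.views
    where libname = 'SASHELP'
      and memname like 'V%';
quit;
```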


HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330

Hari
2006-01-20 17:09:00 UTC
David,

Thanks for the reply.
Post by David L Cassell
and search for 'VCOLUMN' you should find
a table of the dictionary datasets, their related sashelp views, and a quick
description. But to get the exact contents, I recommend you look for
I got it in Base SAS/ SAS LC/ SAS file concepts/ DICTIONARY tables.

Regards,
Hari
India
Daniel Boisvert
2006-01-19 14:32:16 UTC
Hari,

Sounds fine. As long as it is consistent.
David L Cassell
2006-01-19 19:26:42 UTC
Post by Hari
I just wanted to add why my filenames end with the patterns
described above rather than start with them.
If I want to find files in Windows Explorer or even SAS Explorer and
many of the file names start with the same pattern, it becomes
difficult to navigate quickly to a particular file. So the differentiation
is kept in the starting part. Does this logic make sense? Does everybody
maintain files in such a manner? (Anyway, not all my files follow such
naming rules; I devise rules, though probably not the best, as per
the situation.)
As long as you maintain a consistent naming system that meets company
requirements and also meets *your* needs, I don't see a problem.

If you have so many files in a directory that you need special naming
conventions just to find them with WinDoze Exploder, then perhaps the
problem is that you need to organize the files into subdirectories that
better accommodate your task requirements.

And not everyone maintains files and hierarchies the same way. I think you
might benefit from looking around and seeing how some of your colleagues
do it. When there are multiple projects running around loose, it is often
better to have a separate directory hierarchy for each project, with the
same structure in each hierarchy so that people can find things when they
move to another project, and so that code (like autocall macros) can run
consistently regardless of project.

Do you separate out projects?
Within projects, do you have separate directories for data sources, code,
logs, list output, HTML, graphs, etc.?
Would such an arrangement simplify your life?

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330

Hari
2006-01-20 18:17:05 UTC
David,
Post by David L Cassell
If you have so many files in a directory that you need special naming conventions
just to find them with WinDoze Exploder, then perhaps the problem is that you
need to organize the files into subdirectories that better accommodate your
task requirements.
Actually I do separate out projects and do have *lots* of sub-folders to
organize my stuff. Within a sub-folder I might have, let's say, on average
20 or 25 files, and it's much easier to see the files of interest
quickly when I have some customized conventions as per the project
commonalities. For example, a single project might involve close to, let's
say, 10 vendors, each having their own formats, type of data and so on. My
company doesn't have any set standards as far as naming of files/folders
is concerned. I keep trying to find the best way to organize my stuff,
but feel that I can do a lot better by learning from more experienced
folks (my colleagues aren't bitten by the nomenclature bug as I am).

Regards,
Hari
India

David L Cassell
2006-01-19 19:29:58 UTC
Post by Hari
Richard,
Thanks for the tip. I wasn't aware that such limits exist. I got the error
ERROR: The text expression length (588894) exceeds maximum length (65534). The text
       expression has been truncated to 65534 characters.
1372  %put %length(&xs);
65534
Yes, that's the error you should get. 65534 = (2**16) - 2 . So it's a
built-in limit based on data storage. (It used to be a lot smaller.)

My feeling is that if I have created a macro string which is threatening to
hit that limit, then I have made a mistake somewhere in my code design.

David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330
