Discussion:
Grammar for the SAS language?
Trond Ydersbond
2010-02-09 10:44:58 UTC
Is there an open source grammar for the SAS language (or important
subsets)? I have searched this group, and the closest I came to an
answer was a mention of the need for this in 1995... In the Carolina
project (http://www.dullesopen.com/), they successfully used the
ANTLR (Java) compiler tool to implement a significant subset of
SAS, but it does not seem that the reverse-engineered SAS grammar they
used has been published.

Why should anyone need such a grammar? Well, just think about it.

Trond
Savian
2010-02-09 12:11:12 UTC
Why do you need it? Are you going to create a parser?

I have most of it (procs at least) but I am unsure whether I would
release it w/o knowing why. Similarly, I have the regex for a lot of
the language that I use in my SAS cleanup tool.

Alan
http://www.savian.net
Trond Ydersbond
2010-02-09 13:08:13 UTC
If you have any doubts about releasing, don't do it.
And I'm sure quite a few people have something similar to what you have,
having put quite a lot of work into it, too. Think about that.

If you can't see why someone should need it, well, don't bother.

Then, you might think a bit about why R is such an enormous success.

If I can't find it elsewhere, I'm going to start doing a subset,
usable for antlr. And I will publish it.

The list of potential uses for such a grammar is very, very long.

Trond
Savian
2010-02-09 17:17:08 UTC
I know what the need is and I agree on the usefulness. That said,
you were very cryptic in what you asked.

It isn't easy, btw, so you need to realize that from the get-go.
Things are not delimited well and certain constructs are very, very
hard to parse. I have mulled over building a flex/bison parser as well,
but I don't see much reason to go there at this time. I have a lot of
the regexes, especially for the DATA step, and have ALL of the procs
documented in XML.

I don't think you will find anything on the web. The only ones I know
who have worked in this area are WPS, Savian, and Carolina. WPS won't
give it to you and I highly doubt Dulles Research would.

The procs are the hardest, btw, along with the INPUT statement.

SaviClean, found on my utilities page, will illustrate the parsing.
Just paste in SAS code and give it a whirl.

Alan
http://www.savian.net
Trond Ydersbond
2010-02-10 13:28:22 UTC
Thanks!
I am aware of at least some of the problems. I think they go back in
part to SAS's Fortran heritage?
This is also why I think it is best to use tools like ANTLR. Scanning
and parsing can't really be separated completely from each other.
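
One concrete case where the scanner needs statement-level context is in-stream data: after a DATALINES; or CARDS; statement, the following lines are raw data until a line containing a semicolon, so they must not be tokenized as SAS code. Below is a minimal Python sketch of that mode switch. The function name split_code_and_data is made up for illustration, and DATALINES4/CARDS4, comments, and macro-generated code are deliberately ignored.

```python
import re

# Matches a line that ends with a DATALINES; or CARDS; statement
# (case-insensitive, since SAS keywords are not case sensitive).
DATA_START = re.compile(r"\b(datalines|cards)\s*;\s*$", re.IGNORECASE)


def split_code_and_data(source):
    """Separate SAS-like code lines from in-stream data blocks.

    Returns (code_lines, data_blocks); data_blocks holds one list of
    raw lines per DATALINES/CARDS block found.
    """
    code_lines = []
    data_blocks = []
    in_data = False
    for line in source.splitlines():
        if in_data:
            if ";" in line:
                # A line containing a semicolon ends the data block
                # and is treated as code again.
                in_data = False
                code_lines.append(line)
            else:
                data_blocks[-1].append(line)
        else:
            code_lines.append(line)
            if DATA_START.search(line):
                # Switch the scanner into raw-data mode.
                in_data = True
                data_blocks.append([])
    return code_lines, data_blocks


if __name__ == "__main__":
    sample = "data a;\ninput x $ y;\ndatalines;\nit's 1\nb 2\n;\nrun;"
    code, data = split_code_and_data(sample)
    print(code)
    print(data)
```

Note the unbalanced apostrophe in the sample data line: a tokenizer that did not know it was inside a data block would treat it as the start of a string literal.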
Savian
2010-02-10 14:39:19 UTC
SAS should be primarily PL/I, but it doesn't matter much.

Scan it out and see what you come up with.

Input statement is really, really nasty. Most of the other stuff is
fine and is easy to work with. I would focus on the most basic of
procs and ignore the more complex ones like IML.

Here is an example of happiness:

INPUT

@20 X ?:$test5.
;

or...

INPUT

+(-10) X ?:$test5.
;

They all look easy until you view them as a whole, and then it gets complex.
PUTs are the same. Macros also have their weirdness, but it is all
doable.
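
To make the two INPUT examples concrete, here is a minimal Python sketch of a regex tokenizer that handles just those two statements. These are not the regexes from SaviClean, which were not posted. The token names (COLUMN_POINTER, RELATIVE_POINTER, MODIFIER, INFORMAT and so on) are labels invented for this sketch, and many real INPUT forms (a bare $, column ranges, grouped lists, line pointers) are not covered.

```python
import re

# One alternative per token kind; order matters (INFORMAT before NAME,
# so that "$test5." is not split into pieces).
TOKEN_RE = re.compile(
    r"""
    (?P<COLUMN_POINTER>@\d+)                           # absolute column, e.g. @20
  | (?P<RELATIVE_POINTER>\+\(-?\d+\)|\+\d+)            # relative move, e.g. +(-10) or +3
  | (?P<INFORMAT>\$?[A-Za-z_]\w*\.\d*|\$?\d+\.\d*)     # e.g. $test5. or 8.2
  | (?P<MODIFIER>[?:&~])                               # format modifiers
  | (?P<NAME>[A-Za-z_]\w*)                             # keyword or variable name
  | (?P<END>;)                                         # statement terminator
  | (?P<SKIP>\s+)                                      # whitespace, dropped
    """,
    re.VERBOSE,
)


def tokenize_input(statement):
    """Return a list of (kind, text) pairs for a simple INPUT statement."""
    tokens = []
    pos = 0
    while pos < len(statement):
        match = TOKEN_RE.match(statement, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {statement[pos]!r} at offset {pos}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for text in ("INPUT\n\n@20 X ?:$test5.\n;", "INPUT\n\n+(-10) X ?:$test5.\n;"):
        print(tokenize_input(text))
```

Each token is easy in isolation; the difficulty described above starts when the parser has to decide which combinations and orderings of these tokens are legal.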

I will assume you are going to make a translator (R to SAS and vice
versa), but that is a guess ;-]

Alan
http://www.savian.net
a***@gmail.com
2012-09-01 00:40:06 UTC
Hi Savian,

I am taking a Grad course this semester where I have to use SAS. As an experiment I was thinking of an open source project to emulate SAS using R.

I'd be interested in getting access to your grammar if you would like to contribute it to the project.

Thanks.
Kenneth M. Lin
2012-09-04 22:51:26 UTC
Very simple. A semicolon after every complete statement/sentence. Not case
sensitive.
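
Taken literally, those two rules give a first-pass statement splitter, though even that has to respect quoted strings and comments. Here is a minimal Python sketch under those assumptions; the function names split_statements and first_keyword are hypothetical, and `* ...;` comment statements, in-stream data, and macro code are not handled.

```python
def split_statements(source):
    """Split SAS-like source into statements at semicolons.

    Semicolons inside quoted literals or /* ... */ comments do not
    end a statement. Comments are replaced by a single space.
    """
    statements = []
    current = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch in ("'", '"'):
            # Copy a quoted literal verbatim; a doubled quote character
            # inside the literal stands for one quote.
            j = i + 1
            while j < n:
                if source[j] == ch:
                    if j + 1 < n and source[j + 1] == ch:
                        j += 2
                        continue
                    break
                j += 1
            current.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("/*", i):
            # Skip a block comment (to end of input if unterminated).
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
            current.append(" ")
        elif ch == ";":
            text = "".join(current).strip()
            if text:
                statements.append(text)
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)  # trailing text with no final semicolon
    return statements


def first_keyword(statement):
    """Return the first word upper-cased, since SAS is not case sensitive."""
    return statement.split(None, 1)[0].upper()


if __name__ == "__main__":
    src = "data a; x = 'semi;colon'; /* skip; this */ RUN;"
    for stmt in split_statements(src):
        print(first_keyword(stmt), "|", stmt)
```

This only finds statement boundaries. What may appear inside each statement, which is where the INPUT statement and the procs get hard, is a separate problem.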
c***@coopology.com
2014-04-22 14:02:26 UTC
Did anyone ever pursue this? I've been thinking about taking on a project that would use a SAS grammar like this and would love it if there was a published version or if someone could share one with me.

Thanks!
m***@gmail.com
2017-07-11 19:01:45 UTC
If anyone else is still looking, CPAN has a SAS parser that may be of help:
http://search.cpan.org/~mlf/SAS-Parser-0.93/
