Discussion:
Use of Proc SQL to Count no of rows iteratively with a group
(too old to reply)
Aj
2006-06-19 08:23:46 UTC
Permalink
Hi All

I think using "RETAIN" and the last . or first . function this can be
done in a data step. But then
how can we use Proc SQL to assign rank to product with a BATCH ID (
Product being sorted in ascending order) .
THE TABLE IS AS FOLLOWS:

Batch Id Product
1 Q
1 W
1 E
2 A
2 C
3 U
3 T

TO RESULT SHOULD LOOK LIKE

Batch Id Product RANK
1 E 1
1 Q 2
1 W 3
2 A 1
2 C 2
3 T 1
3 U 2
Eric Hoogenboom
2006-06-19 09:02:03 UTC
Permalink
Aj,

See below for a three-step solution using the monotonic function. Not very
pretty but it delivers your required result.

I don't think (challenge!) that it can be done in one step.

Such a shame that Proc Rank does not rank on alphanumeric variables...


proc sql;
create table aj2 as
select batch
, product
from aj
order by 1, 2;

create table aj3 as
select *, monotonic() as rank
from aj2;

select aj3.batch
, aj3.product
, rank - minrank + 1 as rank
from aj3
, (select batch, min(rank) as minrank
from aj3
group by 1) batchrank
where aj3.batch = batchrank.batch;
quit;

Hth,
Eric
jshansen
2006-06-19 11:47:46 UTC
Permalink
Post by Eric Hoogenboom
Such a shame that Proc Rank does not rank on alphanumeric variables...
You can convert any alphanum var to its binary value and use that ...

Look here:
data one;
length batch 8 id $1 ;
input batch id $ ;
bin=input(put(id,binary.),best.);
cards;
1 Q
1 W
1 E
2 A
2 C
3 U
3 T
;;;;
run;
proc rank data=one out=two;
by batch;
var bin;
run;
proc sort data=two;
by batch bin;
run;

Jan Selchau-Hansen
Eric Eyre
2006-06-19 13:00:57 UTC
Permalink
AJ

* BTW I think it would also be nice if one could something like

proc sql;
select *, monotonic(some special argument) as rank
from one
group by id;

and get this kind of enumeration within groups

* below is not sql, but probably simpler

proc sort data=one out=two;
by id product;

data three;
set two;
by id;
if first.id then rank=0;
rank+1;

id product rank
1 E 1
1 Q 2
1 W 3
2 A 1
2 C 2
3 T 1
3 U 2

Erico
Post by Aj
Hi All
I think using "RETAIN" and the last . or first . function this can be
done in a data step. But then
how can we use Proc SQL to assign rank to product with a BATCH ID (
Product being sorted in ascending order) .
Batch Id Product
1 Q
1 W
1 E
2 A
2 C
3 U
3 T
TO RESULT SHOULD LOOK LIKE
Batch Id Product RANK
1 E 1
1 Q 2
1 W 3
2 A 1
2 C 2
3 T 1
3 U 2
unknown
2006-06-19 13:06:41 UTC
Permalink
The calculation of the binary equivalent could be done in a SQL or Data step view, so there will be only one pass of the data (through PROC RANK). Case sensitivity is in the eye of the beholder. If it isn't important then just do an upcase function. There are no unequal lengths (we are dealing with fixed length character vaiables) but leading spaces, among a host of other potential issues, might need to be handled in the view.

The biggest issue is portability. Would we get different binary values depending on the OS (e.g. PC vs MVS)?

Clint

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-***@LISTSERV.UGA.EDU]On Behalf Of Eric
Hoogenboom
Sent: Monday, June 19, 2006 8:40 AM
To: SAS-***@LISTSERV.UGA.EDU
Subject: Re: Use of Proc SQL to Count no of rows iteratively with a
group


Jan,

You are right, however it still needs an extra data step. And the binary
format is case-sensitive which may be a disadvantage. And what about
strings of unequal lengths.

I guess Rank is a very old and rusty Proc that SAS engineers prefer not to
touch anymore.

Eric
Eric Hoogenboom
2006-06-19 12:40:16 UTC
Permalink
Jan,

You are right, however it still needs an extra data step. And the binary
format is case-sensitive which may be a disadvantage. And what about
strings of unequal lengths.

I guess Rank is a very old and rusty Proc that SAS engineers prefer not to
touch anymore.

Eric
Howard Schreier <hs AT dc-sug DOT org>
2006-06-19 17:18:43 UTC
Permalink
First, introduce some ties to see how they are treated:

data have;
input batch id $ ;
cards;
1 Q
1 Q
1 Q
1 W
1 E
2 A
2 C
3 U
3 T
;

Now try:

proc sql;

create table ranked as
select batch,
id,
(select count(id)
from have
where batch=outer.batch and id<outer.id
)+1 as rank
from have as outer
order by batch, id
;

quit;

Result:

Obs batch id ran

1 1 E 1
2 1 Q 2
3 1 Q 2
4 1 Q 2
5 1 W 5
6 2 A 1
7 2 C 2
8 3 T 1
9 3 U 2
Post by Aj
Hi All
I think using "RETAIN" and the last . or first . function this can be
done in a data step. But then
how can we use Proc SQL to assign rank to product with a BATCH ID (
Product being sorted in ascending order) .
Batch Id Product
1 Q
1 W
1 E
2 A
2 C
3 U
3 T
TO RESULT SHOULD LOOK LIKE
Batch Id Product RANK
1 E 1
1 Q 2
1 W 3
2 A 1
2 C 2
3 T 1
3 U 2
Ian Whitlock
2006-06-19 18:46:18 UTC
Permalink
proc sql *, monotonic(some special argument) as rank
from one
group by id;
to enumerate within groups.
MONOTONIC is undocumented and in SQL by definition one should not
depend consistent behavior requiring a particular order of reading.
Consequently, the reader is responsible for reading below here and
using any of that information.
However, in the simple example given, one might guess that SQL will find no
reason to mess up the order. Hence a simple macro and a sort might do the
job.
%macro monotonic(temp) ;
%if %length(&temp) = 0 %then monotonic() ;
%else
monotonic() as &temp
, (calculated &temp - min(calculated &temp) + 1)
%end ;
%mend monotonic ;
Create test data.
data w ;
input id product $ ; r = ranuni(2354) ;
cards ;
1 E 1
2 C 2
1 W 3
2 A 1
1 Q 2
3 U 2
3 T 1
;
Then do the test.
/* sort required to standardize input to SQL
order clause not allowed in inline subquery
*/
proc sort data = w out = q ;
by id product ;
run ;
proc sql ;
select id, product, %monotonic(m) as rank
from q
group by id
order by id, rank
;
quit ;
With the exception of the required sort and the final ORDER clause, the
code is as simple as Erico required.
However, it is probably not a good idea to run with millions (maybe even
thousands) of observations. Add the procedure option _TREE to see
the complexity of the code request.
On the other hand it is hard to see how any under the hood implementation
of the requested MONOTONIC function could do much better than hide the
temporary variable without drastically changing the nature of SQL.
Ian Whitlock
Loading...