Discussion:
Weighted Counts in PROC REPORT
Vandenbroucke, David A
2009-09-04 12:34:48 UTC
I do a lot of work with survey data that must be weighted in order to produce population-level estimates. I can't seem to work out how to get weighted frequencies when using Proc Report. The WEIGHT statement will give weighted sums, means, etc., but the N statistic is always unweighted. Now, unweighted counts are also useful to have, but often I need weighted ones as well, sometimes in the same table. It's not unusual for me to want to cross-tabulate the unweighted count, the weighted frequency, and the mean in the same report.

My work-around has been to use a data step to define a variable as Unit = 1, a constant. Then I can use a SUM statistic to get the weighted frequency. There has to be a better way to do this, without having to go through that extra step.
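A minimal sketch of the work-around just described, assuming the input data set is the placeholder `something` used in the example that follows. The output name `with_unit` is hypothetical.

```sas
/* Add a constant so that a weighted SUM of Unit equals the sum of the weights. */
DATA with_unit;      /* hypothetical output data set name */
  SET something;     /* placeholder input name from the example */
  Unit = 1;          /* every sample case counts as one unit before weighting */
RUN;
```

The PROC REPORT step would then read DATA=with_unit.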

As an example, here is a Report step using housing data.
ZAdeq = a categorical variable measuring housing quality
ZSMHC = a continuous variable containing the monthly housing cost
SMSA = metropolitan area identifier
Weight = the weighting variable

PROC REPORT DATA=something NOWD;
COLUMN SMSA ZAdeq, (N Unit ZSMHC);
DEFINE SMSA/Group "Metro Area";
DEFINE ZAdeq /Across "Housing Quality";
DEFINE N / "Sample Cases";
DEFINE Unit /SUM "Housing Units";
DEFINE ZSMHC /MEAN "Mean Cost";
WEIGHT Weight;
RUN;

Dav Vandenbroucke
Senior Economist
U.S. Dept. HUD
***@hud.gov
202-402-5890

I disclaim any disclaimers.
Joe Matise
2009-09-04 13:09:50 UTC
N is always unweighted (always!) in SAS (that I've encountered anyway).
SUMWGT is the sum of the weights, and would be effectively the 'weighted N'.

-Joe

Vandenbroucke, David A
2009-09-04 13:17:51 UTC
Post by Joe Matise
N is always unweighted (always!) in SAS (that I've encountered anyway). SUMWGT is the sum of the weights, and would be effectively the 'weighted N'.
Thank you. Would you show me how I would use SUMWGT in a PROC REPORT step? When I insert it into a COLUMN statement as I would N, I get an error.
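
One way this request might be met, as far as I understand PROC REPORT: N is the only statistic that can stand by itself in a COLUMN statement, so SUMWGT has to be attached to an analysis variable, here through an alias. The alias name WtdN is invented for illustration, and the sketch has not been run against the housing data.

```sas
PROC REPORT DATA=something NOWD;
  /* ZSMHC=WtdN creates a second column from ZSMHC under the alias WtdN (hypothetical name) */
  COLUMN SMSA ZAdeq, (N ZSMHC=WtdN ZSMHC);
  DEFINE SMSA  / GROUP  "Metro Area";
  DEFINE ZAdeq / ACROSS "Housing Quality";
  DEFINE N     / "Sample Cases";           /* unweighted count */
  DEFINE WtdN  / SUMWGT "Housing Units";   /* sum of the weights */
  DEFINE ZSMHC / MEAN   "Mean Cost";       /* weighted mean */
  WEIGHT Weight;
RUN;
```

SUMWGT should be computed over the nonmissing values of the variable it is attached to, so the result would match the Unit = 1 approach only where ZSMHC is never missing.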

Dav Vandenbroucke
Senior Economist
U.S. Dept. HUD
***@hud.gov
202-402-5890

I disclaim any disclaimers.
Arthur Tabachneck
2009-09-04 22:54:38 UTC
David,

I find it interesting that I can't find any historical posts on sas-l that
address how to use sumwgt in proc report.

Does the following code accomplish the same thing?

PROC REPORT DATA=test NOWD;
COLUMN SMSA ZAdeq, (weight Unit ZSMHC);
DEFINE SMSA/Group "Metro Area";
DEFINE ZAdeq /Across "Housing Quality";
DEFINE weight /SUM "Sample Cases";
DEFINE Unit /SUM "Housing Units";
DEFINE ZSMHC /MEAN "Mean Cost";
WEIGHT Weight;
RUN;

i.e., where 'weight' is your weighting variable and you simply report the
sum of that variable.
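
One caution on the code above, hedged because it has not been run: the WEIGHT statement applies to every analysis variable, so summing Weight under WEIGHT Weight would, as I read the documentation, give the sum of the squared weights rather than the sum of the weights. A possible variation drops the global WEIGHT statement and uses the WEIGHT= option of DEFINE so that only the mean is weighted. It also needs neither Unit nor the extra data step.

```sas
PROC REPORT DATA=test NOWD;
  COLUMN SMSA ZAdeq, (N Weight ZSMHC);
  DEFINE SMSA   / GROUP  "Metro Area";
  DEFINE ZAdeq  / ACROSS "Housing Quality";
  DEFINE N      / "Sample Cases";                  /* unweighted count */
  DEFINE Weight / SUM "Housing Units";             /* plain sum of the weights */
  DEFINE ZSMHC  / MEAN WEIGHT=Weight "Mean Cost";  /* weight applied to this item only */
RUN;
```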

HTH,
Art
Mark Miller
2009-09-05 00:12:07 UTC
IFF you have positive integer weights (rare but possible), you could use
FREQ instead of WEIGHT.

SAS offers both FREQ and WEIGHT for weighting, but they seem always to have
been handled differently in different procedures, and not in ways that I have
always found useful. I'm not sure whether you can get away with using both of
these at the same time in PROC REPORT.
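
A sketch of the FREQ approach, assuming Weight holds positive integers. With a FREQ statement each observation is treated as if it occurred Weight times, so N itself reports the weighted count, and the unweighted sample count is no longer available from N in the same step. SAS truncates non-integer FREQ values and skips observations whose value is missing or less than 1.

```sas
PROC REPORT DATA=something NOWD;
  COLUMN SMSA ZAdeq, (N ZSMHC);
  DEFINE SMSA  / GROUP  "Metro Area";
  DEFINE ZAdeq / ACROSS "Housing Quality";
  DEFINE N     / "Housing Units";   /* N now counts each case Weight times */
  DEFINE ZSMHC / MEAN   "Mean Cost";
  FREQ Weight;                      /* assumes integer-valued weights */
RUN;
```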

... Mark Miller