Discussion:
how to exclude value 0 when doing proc means?
(too old to reply)
demin
2009-11-12 16:11:10 UTC
Permalink
Raw Message
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;

Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Paige Miller
2009-11-12 16:20:22 UTC
Permalink
Raw Message
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
  var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat  another item?
Thank in advance.
Without further explanation from you, I think excluding zeros from
your calculations is a very dangerous thing to do, and could result in
seriously misleading results. In fact, this is one reason why the
Space Shuttle Challenger was allowed to be launched and subsequently
exploded, because someone did an analysis and excluded all zeros.

Now in some cases, it may be appropriate to exclude zeros. Can you
explain why it is appropriate to exclude zeros?

--
Paige Miller
paige\dot\miller \at\ kodak\dot\com
Joe Matise
2009-11-12 16:49:20 UTC
Permalink
Raw Message
With Paige's suggestion strongly in mind (but understanding you have reasons
you might do this, such as survey data where 0=refused, though i'd use
special missing .R or such), if you must do this, then you can use a format
that sets 0 to display as . [missing].

For example:

proc format lib=work;
value elevmisf
11=.
.=.
;
quit;

data class;
set sashelp.class;
format age elevmisf.;
if age=15 then age=.;
run;

proc means data=class;
format age elevmisf2.;
var age;
run;

The 15's I code over to missing because you have to have one actual missing
value for this to work. The 11's are the actually interesting ones.
Pretend that 11 is 0, in your data. By applying a format to them that is
identical to the missing value format, they become considered to be missing,
the same as anything else, and are excluded from the means. HOWEVER, this
seems to only work when you actually have some real missing values in there;
if you comment out the age=15 bit, then you just get a row with them
included as .

-Joe
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Gerhard Hellriegel
2009-11-12 16:29:52 UTC
Permalink
Raw Message
In normal cases it makes no sense to treat 0 as missing. Therefor proc
means and any other procs don't use that similar.
You should use a datastep before you go into proc means:

data out;
set in;
if a=0 then a=.;
if b=0 then b=.;
...
run;

proc means;
...

If you want that for all numeric variables, you can use:

data out;
set in;
array nn (*) _numeric_;
do i=1 to dim(nn);
if nn(i)=0 then nn(i)=.;
end;
drop i;
run;


To force that behaviour.
Gerhard
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Gerhard Hellriegel
2009-11-12 17:07:05 UTC
Permalink
Raw Message
that might be a interesting thing. It's hard for me to understand where it
is of use to treat 0 as missing. for sum it is unimportant. But what about
range, mean, ... and all the others?
Look at the values
0 12 24
mean = 18 or mean = 12 ?
range= 12 ?

even simply the _freq_ is strange, cause there are 3 values, not 2.
If you have the birthdate 01jan1960, you are 0 years old??? If you are
born one day later, you are something around 50 years old!! What one day
makes a difference! And that "new born"-state remains the same until you
die!!!! (or you change the tool to calculate the age...)

Gerhard
Post by Paige Miller
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Without further explanation from you, I think excluding zeros from
your calculations is a very dangerous thing to do, and could result in
seriously misleading results. In fact, this is one reason why the
Space Shuttle Challenger was allowed to be launched and subsequently
exploded, because someone did an analysis and excluded all zeros.
Now in some cases, it may be appropriate to exclude zeros. Can you
explain why it is appropriate to exclude zeros?
--
Paige Miller
paige\dot\miller \at\ kodak\dot\com
demin
2009-11-12 20:38:01 UTC
Permalink
Raw Message
Post by Gerhard Hellriegel
that might be a interesting thing. It's hard for me to understand where it
is of use to treat 0 as missing. for sum it is unimportant. But what about
range, mean, ... and all the others?
Look at the values
0 12 24
mean = 18 or mean = 12 ?
range= 12 ?
even simply the _freq_ is strange, cause there are 3 values, not 2.
If you have the birthdate 01jan1960, you are 0 years old??? If you are
born one day later, you are something around 50 years old!! What one day
makes a difference! And that "new born"-state remains the same until you
die!!!! (or you change the tool to calculate the age...)
Gerhard
Post by Paige Miller
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
  var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat  another item?
Thank in advance.
Without further explanation from you, I think excluding zeros from
your calculations is a very dangerous thing to do, and could result in
seriously misleading results. In fact, this is one reason why the
Space Shuttle Challenger was allowed to be launched and subsequently
exploded, because someone did an analysis and excluded all zeros.
Now in some cases, it may be appropriate to exclude zeros. Can you
explain why it is appropriate to exclude zeros?
--
Paige Miller
paige\dot\miller \at\ kodak\dot\com- Hide quoted text -
- Show quoted text -
sorry, I only received new message from Joe in my gmail, so I didn't
know many replies here. Thanks!

My case is about some aggregate scores(>=0), my boss has no interest
in those respondents who get all items wrong, so need to exclude them.
Thanks for the interesting example.
Francois van der Walt
2009-11-13 04:53:26 UTC
Permalink
Raw Message
Not discounting any of the comments regarding the merit of excluding 0's
will the use of "where" give you what you want?

Here is a piece of sample code that demonstrate the difference:

data example;
input var1;
datalines;
1
3
2
4
5
6
0
4
0
4
7
0
;
proc means;
run;
proc means;
where var1 ne 0;
run;

Results:

Without where
The MEANS Procedure

Analysis Variable : var1

N Mean Std Dev Minimum Maximum
12 3.0000000 2.4120908 0 7.0000000


With Where ne 0

N Mean Std Dev Minimum Maximum
9 4.0000000 1.8708287 1.0000000 7.0000000
Post by demin
Post by Gerhard Hellriegel
that might be a interesting thing. It's hard for me to understand where it
is of use to treat 0 as missing. for sum it is unimportant. But what about
range, mean, ... and all the others?
Look at the values
0 12 24
mean = 18 or mean = 12 ?
range= 12 ?
even simply the _freq_ is strange, cause there are 3 values, not 2.
If you have the birthdate 01jan1960, you are 0 years old??? If you are
born one day later, you are something around 50 years old!! What one day
makes a difference! And that "new born"-state remains the same until you
die!!!! (or you change the tool to calculate the age...)
Gerhard
Post by Paige Miller
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Without further explanation from you, I think excluding zeros from
your calculations is a very dangerous thing to do, and could result in
seriously misleading results. In fact, this is one reason why the
Space Shuttle Challenger was allowed to be launched and subsequently
exploded, because someone did an analysis and excluded all zeros.
Now in some cases, it may be appropriate to exclude zeros. Can you
explain why it is appropriate to exclude zeros?
--
Paige Miller
paige\dot\miller \at\ kodak\dot\com- Hide quoted text -
- Show quoted text -
sorry, I only received new message from Joe in my gmail, so I didn't
know many replies here. Thanks!
My case is about some aggregate scores(>=0), my boss has no interest
in those respondents who get all items wrong, so need to exclude them.
Thanks for the interesting example.
p***@gmail.com
2017-10-27 14:32:40 UTC
Permalink
Raw Message
It makes sense to exclude some var's zeros in the case of survey answers when we do not want to have those whose answers that we do not want to see.
Post by Francois van der Walt
Not discounting any of the comments regarding the merit of excluding 0's
will the use of "where" give you what you want?
data example;
input var1;
datalines;
1
3
2
4
5
6
0
4
0
4
7
0
;
proc means;
run;
proc means;
where var1 ne 0;
run;
Without where
The MEANS Procedure
Analysis Variable : var1
N Mean Std Dev Minimum Maximum
12 3.0000000 2.4120908 0 7.0000000
With Where ne 0
N Mean Std Dev Minimum Maximum
9 4.0000000 1.8708287 1.0000000 7.0000000
Post by demin
Post by Gerhard Hellriegel
that might be a interesting thing. It's hard for me to understand where it
is of use to treat 0 as missing. for sum it is unimportant. But what about
range, mean, ... and all the others?
Look at the values
0 12 24
mean = 18 or mean = 12 ?
range= 12 ?
even simply the _freq_ is strange, cause there are 3 values, not 2.
If you have the birthdate 01jan1960, you are 0 years old??? If you are
born one day later, you are something around 50 years old!! What one day
makes a difference! And that "new born"-state remains the same until you
die!!!! (or you change the tool to calculate the age...)
Gerhard
Post by Paige Miller
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Without further explanation from you, I think excluding zeros from
your calculations is a very dangerous thing to do, and could result in
seriously misleading results. In fact, this is one reason why the
Space Shuttle Challenger was allowed to be launched and subsequently
exploded, because someone did an analysis and excluded all zeros.
Now in some cases, it may be appropriate to exclude zeros. Can you
explain why it is appropriate to exclude zeros?
--
Paige Miller
paige\dot\miller \at\ kodak\dot\com- Hide quoted text -
- Show quoted text -
sorry, I only received new message from Joe in my gmail, so I didn't
know many replies here. Thanks!
My case is about some aggregate scores(>=0), my boss has no interest
in those respondents who get all items wrong, so need to exclude them.
Thanks for the interesting example.
Francois van der Walt
2009-11-18 04:08:57 UTC
Permalink
Raw Message
Dear Demin,

Did you expect the following result:
Variable N Mean Std Dev Minimum
Maximum

var1 9 4.0000000 1.8708287 1.0000000
7.0000000
var2 9 4.0000000 1.8708287 1.0000000
7.0000000
var3 12 3.0000000 2.4120908 0
7.0000000

Well the following adaptation to the code will deliver that:

data example;
input var1 var2 var3;
* add the following to make it as general as you want;
if var1=0 then var1=.;
* add if statements for all the other conditions;
;
datalines;
1 1 1
3 3 3
2 2 2
4 4 4
5 5 5
6 6 6
0 . 0
4 4 4
0 . 0
4 4 4
7 7 7
0 . 0
;

proc means;
run
;
proc means;
* Comment out or delete the following line;
* where (var1 ne 0) ;
run
;

Let me know if you have any questions.

Francois van der Walt



Please consider the environment before printing this email.


-----Original Message-----
From: demin qu [mailto:***@gmail.com]
Sent: Wednesday, 18 November 2009 1:07 AM
To: Francois van der Walt
Subject: Re: how to exclude value 0 when doing proc means?

Thanks for your reply.
Sorry, I am late to reply you because I found something not correct, but not
sure:
if add more variable,

data example;
input var1 var2 var3;
datalines
1 1 1
3 3 3
2 2 2
4 4 4
5 5 5
6 6 6
0 . 0
4 4 4
0 . 0
4 4 4
7 7 7
0 . 0
;

proc means;
run
;
proc means;
where (var1 ne 0) ;
run
;

if I only want to exclude those 0 for var1 but not for var3, seems doesn't
work, var3 mean still get the same as var1
variable N Mean Std Dev Minimum Maximum
var1 9 4.0000000 1.8708287 1.0000000 7.0000000
var2 9 4.0000000 1.8708287 1.0000000 7.0000000
var3 9 4.0000000 1.8708287 1.0000000
7.0000000
But var3 mean should = 3.


On Thu, 12 Nov 2009 23:53:26 -0500, Francois van der Walt
Post by Francois van der Walt
Not discounting any of the comments regarding the merit of excluding 0's
will the use of "where" give you what you want?
data example;
input var1;
datalines;
1
3
2
4
5
6
0
4
0
4
7
0
;
proc means;
run;
proc means;
where var1 ne 0;
run;
Without where
The MEANS Procedure
Analysis Variable : var1
N Mean Std Dev Minimum Maximum
12 3.0000000 2.4120908 0 7.0000000
With Where ne 0
N Mean Std Dev Minimum Maximum
9 4.0000000 1.8708287 1.0000000 7.0000000
Post by demin
Post by Gerhard Hellriegel
that might be a interesting thing. It's hard for me to understand where it
is of use to treat 0 as missing. for sum it is unimportant. But what about
range, mean, ... and all the others?
Look at the values
0 12 24
mean = 18 or mean = 12 ?
range= 12 ?
even simply the _freq_ is strange, cause there are 3 values, not 2.
If you have the birthdate 01jan1960, you are 0 years old??? If you are
born one day later, you are something around 50 years old!! What one day
makes a difference! And that "new born"-state remains the same until you
die!!!! (or you change the tool to calculate the age...)
Gerhard
Post by Paige Miller
Post by demin
I want to exclude all items =0 and =. when doiong proc means, exclude
missing valude in the default of SAS. But if I want to also exlude all
items=0 but there are lots of items need to do the proc means. Do I
have to do the items one by one? Seems the exclusive option doesn't
work for this.
proc means data=mytable n mean max;
var Q1 Q2 ... Q10;
ods output Summary=mytable;
run;
Do I have to deal with each item, and delete those obs=0 items, and
then do the proc means for one item, and then repeat another item?
Thank in advance.
Without further explanation from you, I think excluding zeros from
your calculations is a very dangerous thing to do, and could result in
seriously misleading results. In fact, this is one reason why the
Space Shuttle Challenger was allowed to be launched and subsequently
exploded, because someone did an analysis and excluded all zeros.
Now in some cases, it may be appropriate to exclude zeros. Can you
explain why it is appropriate to exclude zeros?
--
Paige Miller
paige\dot\miller \at\ kodak\dot\com- Hide quoted text -
- Show quoted text -
sorry, I only received new message from Joe in my gmail, so I didn't
know many replies here. Thanks!
My case is about some aggregate scores(>=0), my boss has no interest
in those respondents who get all items wrong, so need to exclude them.
Thanks for the interesting example.
Loading...