Discussion:
SAS DATE format - identifying blank dates
(too old to reply)
c***@gmail.com
2006-01-13 19:44:04 UTC
Permalink
Hi,

I have a variable called "start_date" and I formatted it as follows:

format start_date DATE9.

I found in my dataset that not all participants filled in this field.
I would like to identifying these missing records. How do i do this?

I've tried reformatting this variable to a character and then putting
if start_date = ' ' then missing=1. And I've even tried to put if
start_date = . then missing=1. But the format DATE 9. is neither
character nor numeric.

Can anyone help?

Thanks.
machelle
2006-01-13 20:12:12 UTC
Permalink
If start_Date has a format of date9. the following should work:

If start_date = . then missing=1;

Are you sure your format statement worked? If you right click the data
set it created (if it's a temporary data set, it will be located in
your work library) and select "View Columns" you'll see the format
attributed to each variable. Alternatively, you can just runa proc
contents data=(whatever you called the data set you created) and check
it there.

Machelle
c***@gmail.com
2006-01-13 20:27:41 UTC
Permalink
Hi,

Yes, I tried "If start_date = . then missing=1;" and still nothing.

When I do a proc contents:

Variable Type Len Format
start_date Num 8 YYMMDDN8

Just to clarify, for example, I have the dates '20040331' in the format
9.
When I do a proc freq, I see "Frequency missing =3D 123". I want to
specifically identfiy those missing so I can create a new variable
called
"missing".

So, let's say I have the folllowing data set

participant_id start_date
1 20040331
2 20040511
3

I want to end up with:

participant_id start_date missing
1 20040331 0
2 20040511 0
3 1


Is this clearer?



Thanks for your help
Howard Schreier <hs AT dc-sug DOT org>
2006-01-13 20:07:04 UTC
Permalink
Post by c***@gmail.com
Hi,
format start_date DATE9.
I found in my dataset that not all participants filled in this field.
I would like to identifying these missing records. How do i do this?
I've tried reformatting this variable to a character and then putting
if start_date = ' ' then missing=1. And I've even tried to put if
start_date = . then missing=1. But the format DATE 9. is neither
character nor numeric.
It is a numeric format.

But formats are irrelevant to missing-ness.

"if start_date = . then missing=1;" should work.

Are you sure that START_DATE is numeric?
Post by c***@gmail.com
Can anyone help?
Thanks.
toby dunn
2006-01-13 20:05:06 UTC
Permalink
Cresentmoon,

YOu are right that formats and informats for that matter are neither
character nore numeric. They are instructions to SAS stating how SAS should
reprecent the number help in start_date. And missing numbers are
represented as . if you look at data values for start_date.

if you would like a simple count of missing values of start_date there are
many ways such as a data step, proc means, proc freq, proc summary. YOur
gonna have to give us some more info. about what your doing here and/or want
to do. Preferably with some sample data and output for us to look at.
Help us help you.



Toby Dunn





From: ***@GMAIL.COM
Reply-To: ***@GMAIL.COM
To: SAS-***@LISTSERV.UGA.EDU
Subject: SAS DATE format - identifying blank dates
Date: Fri, 13 Jan 2006 11:44:04 -0800
MIME-Version: 1.0
Received: from bay0-mc7-f14.bay0.hotmail.com ([65.54.244.214]) by
imc2-s15.hotmail.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 13 Jan 2006
11:46:27 -0800
Received: from mailgw.cc.uga.edu ([128.192.1.101]) by
bay0-mc7-f14.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 13
Jan 2006 11:46:21 -0800
Received: from listserv.cc.uga.edu (listserv.uga.edu [128.192.1.75])by
mailgw.cc.uga.edu (8.12.11/8.12.11) with ESMTP id k0DInkgx013783;Fri, 13 Jan
2006 14:45:18 -0500
Received: from LISTSERV.UGA.EDU by LISTSERV.UGA.EDU (LISTSERV-TCP/IP release
1.8d) with spool id 836057 for SAS-***@LISTSERV.UGA.EDU; Fri, 13 Jan
2006 14:45:09 -0500
Received: from alfredo.cc.uga.edu (alfredo.cc.uga.edu [128.192.1.77]) by
listserv.cc.uga.edu (8.12.11/8.12.11) with ESMTP id k0DJj8YQ013187
for <sas-***@listserv.uga.edu>; Fri, 13 Jan 2006 14:45:08 -0500
Received: from alfredo.cc.uga.edu (localhost.localdomain [127.0.0.1]) by
alfredo.cc.uga.edu (8.12.11/8.12.11) with ESMTP id k0DJj8qG015920 for
<sas-***@listserv.uga.edu>; Fri, 13 Jan 2006 14:45:08 -0500
Received: (from ***@localhost) by alfredo.cc.uga.edu
(8.12.11/8.12.11/Submit) id k0DJj8wo015915 for
sas-***@listserv.uga.edu; Fri, 13 Jan 2006 14:45:08 -0500
X-Message-Info: 9FGFoCED9ZMJEtArYsGHkCokJxcKtOT+iOL+BEbKrws=
X-Authentication-Warning: alfredo.cc.uga.edu: news set sender to
***@listserv.uga.edu using -f
X-Newsgroups: comp.soft-sys.sas
Lines: 18
X-Complaints-To: groups-***@google.com
User-Agent: G2/0.2
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET
CLR 1.1.4322),gzip(gfe),gzip(gfe)
Complaints-To: groups-***@google.com
Injection-Info: f14g2000cwb.googlegroups.com; posting-host=205.207.78.4;
posting-account=tJm2GA0AAAAjxWsQswcVHEyXnm1JV68w
X-Scanned-By: Digested by UGA Mail Gateway on 128.192.1.75
Organization: http://groups.google.com
Comments: To: sas-***@uga.edu
Return-Path: owner-sas-***@LISTSERV.UGA.EDU
X-OriginalArrivalTime: 13 Jan 2006 19:46:21.0837 (UTC)
FILETIME=[078C9BD0:01C6187A]

Hi,

I have a variable called "start_date" and I formatted it as follows:

format start_date DATE9.

I found in my dataset that not all participants filled in this field.
I would like to identifying these missing records. How do i do this?

I've tried reformatting this variable to a character and then putting
if start_date = ' ' then missing=1. And I've even tried to put if
start_date = . then missing=1. But the format DATE 9. is neither
character nor numeric.

Can anyone help?

Thanks.
toby dunn
2006-01-13 20:46:38 UTC
Permalink
Sharon ,

data one ;
infile cards ;
input participant_id start_date date9. ;
cards ;
1 01Jan1957
2 21Feb1964
3
;
run ;

proc freq
data = one ;
table start_date / missing ;
format start_date date9. ;
run ;


data two ;
set one ;
if Start_date = . then missing = 1 ;
else missing = 0 ;
run ;

proc print
data = two ;
format start_date date9. ;
run ;



Toby Dunn





From: Sharon Carter <***@gmail.com>
To: toby dunn <***@hotmail.com>
CC: SAS-***@listserv.uga.edu
Subject: Re: SAS DATE format - identifying blank dates
Date: Fri, 13 Jan 2006 15:18:27 -0500


Hi Toby,

For example, I have the dates '01Jan1957' in the format 9.
When I do a proc freq, I see "Frequency missing = 123". I want to
specifically identfiy those missing so I can create a new variable called
"missing".

So, let's say I have the folllowing data set

participant_id start_date
1 01Jan1957
2 21Feb1964
3

I want to end up with:

participant_id start_date missing
1 01Jan1957 0
2 21Feb1964 0
3 1


Is this clearer?



Thanks for your help
Post by toby dunn
Cresentmoon,
YOu are right that formats and informats for that matter are neither
character nore numeric. They are instructions to SAS stating how SAS
should
reprecent the number help in start_date. And missing numbers are
represented as . if you look at data values for start_date.
if you would like a simple count of missing values of start_date there
are
Post by toby dunn
many ways such as a data step, proc means, proc freq, proc summary. YOur
gonna have to give us some more info. about what your doing here and/or
want
to do. Preferably with some sample data and output for us to look at.
Help us help you.
Toby Dunn
Subject: SAS DATE format - identifying blank dates
Date: Fri, 13 Jan 2006 11:44:04 -0800
MIME-Version: 1.0
Received: from bay0-mc7-f14.bay0.hotmail.com ([65.54.244.214]) by
imc2-s15.hotmail.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 13 Jan
2006
11:46:27 -0800
Received: from mailgw.cc.uga.edu ([128.192.1.101]) by
bay0-mc7-f14.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.211); Fri,
13
Jan 2006 11:46:21 -0800
Received: from listserv.cc.uga.edu (listserv.uga.edu [128.192.1.75])by
mailgw.cc.uga.edu (8.12.11/8.12.11) with ESMTP id k0DInkgx013783;Fri, 13
Jan
2006 14:45:18 -0500
Received: from LISTSERV.UGA.EDU by LISTSERV.UGA.EDU (LISTSERV-TCP/IP
release
Jan
2006 14:45:09 -0500
Received: from alfredo.cc.uga.edu (alfredo.cc.uga.edu [128.192.1.77]) by
listserv.cc.uga.edu (8.12.11/8.12.11) with ESMTP id k0DJj8YQ013187
Received: from alfredo.cc.uga.edu (localhost.localdomain [127.0.0.1]) by
alfredo.cc.uga.edu (8.12.11/8.12.11) with ESMTP id k0DJj8qG015920
for
Post by toby dunn
(8.12.11/8.12.11/Submit) id k0DJj8wo015915 for
X-Message-Info: 9FGFoCED9ZMJEtArYsGHkCokJxcKtOT+iOL+BEbKrws=
X-Authentication-Warning: alfredo.cc.uga.edu: news set sender to
X-Newsgroups: comp.soft-sys.sas
Lines: 18
User-Agent: G2/0.2
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET
CLR 1.1.4322),gzip(gfe),gzip(gfe)
Injection-Info: f14g2000cwb.googlegroups.com; posting-host=205.207.78.4;
posting-account=tJm2GA0AAAAjxWsQswcVHEyXnm1JV68w
X-Scanned-By: Digested by UGA Mail Gateway on 128.192.1.75
Organization: http://groups.google.com
X-OriginalArrivalTime: 13 Jan 2006 19:46:21.0837 (UTC)
FILETIME=[078C9BD0:01C6187A]
Hi,
format start_date DATE9.
I found in my dataset that not all participants filled in this field.
I would like to identifying these missing records. How do i do this?
I've tried reformatting this variable to a character and then putting
if start_date = ' ' then missing=1. And I've even tried to put if
start_date = . then missing=1. But the format DATE 9. is neither
character nor numeric.
Can anyone help?
Thanks.
c***@gmail.com
2006-01-14 03:24:29 UTC
Permalink
hi toby,
your code worked! thanks to everyone for their assistance!
monal kohli
2006-01-13 19:50:48 UTC
Permalink
Not very clear,but can you give a sample of your
start_date..
Post by c***@gmail.com
Hi,
I have a variable called "start_date" and I
format start_date DATE9.
I found in my dataset that not all participants
filled in this field.
I would like to identifying these missing records.
How do i do this?
I've tried reformatting this variable to a character
and then putting
if start_date = ' ' then missing=1. And I've even
tried to put if
start_date = . then missing=1. But the format DATE
9. is neither
character nor numeric.
Can anyone help?
Thanks.
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
Sierra Information Services
2006-01-15 00:09:36 UTC
Permalink
I have another potential approach to this problem I would like to
share. I think what CrescentMoon is looking to do is to assign a
format label that says something like "missing" or "start date missing"
if the value of the variable is missing, BUT if the value is not
missing, format it with a SAS supplied date variable, such as mmddyy10.
If I am correct, then there's no need to run the data set through a
data step to create an otherwise unncessary "indicator variable"
representing whether or not "start date" is mssing.

Here is my approach:
a) Create a VALUE format MYDATEFMT using PROC FORMAT. See code below.
What I have done is used a "format within a format," which may a tool
that some readers on the list don't know about. The PROC FORMAT code
should below should be pretty straightforward: if the value of the
variable is missing ('dot') then the label is Not Shipped Yet. If the
value is NOT missing, then the value of the MMDDYY10 Format is used to
display the value.
b) The MYDATEFMT is used to represent the values of SHIPDATE in my
made-up data set.

I think this approach may have some value for folks for whom the
"missingness" of a date variable's value is "informative." You can get
all the information (both missing and non-missing values) of the date
variable of interest in ONE report/analysis and use the "format within
a format" facility to select how you want the both the missing and
non-missing values to be portrayed in your output. For example, you
could pick another SAS-supplied data format or change the format label
associated with the missing values to something more appropriate to
your specific analysis (e.g., "Patient Did Not Return for Treatment" or
"Response Not Received."

Here is my sample code. You can cut and paste it in to your program
editor and then try it yourself. I hope it is useful!

proc format;
* if the date variable's value is missing, the format label
"Not Shipped Yet" appears, if non-missing, format the date
value with the SAS-supplied MMDDYY10. format;
value mydatefmt
. = 'Not Shipped Yet'
other = [mmddyy10.];
run;

data shipping_data;
* generate some example data, and assign a few observations
to have missing values of shipdate;
do record= 1 to 15;
orderdate = (today() - 90) + ceil(uniform(0)*10);
if record in(1,3,5,8,11,14) then do;
shipdate = .;
end;
else do;
shipdate = orderdate + (ceil(uniform(0)*7));
end;
output;
end;
run;

options nodate nonumber nocenter;
proc freq data=shipping_data;
tables orderdate*shipdate/missing list nocum nopercent;
format shipdate mydatefmt. orderdate mmddyy10.;
title 'Using a Format within a Format';
run;

Andrew Karp
Sierra Information Services
www.SierraInformation.com

Loading...