Objective of this chapter:
To create a simple SAS program in the PC SAS System and understand the
basic program constructs. Specifically:
- Learn to use the DATA statement.
- Learn to use the INPUT statement.
- Learn to use the CARDS statement.
- Learn where to use the semicolon (;).
- Learn the RUN statement.
- Learn the TITLE statement.
- Learn to use two procedures: PRINT and FREQ.
- Learn how to create a permanent dataset using LIBNAME.
- Learn how to execute a SAS program.
Pre-Requisite:
You should know how to open the SAS System for PC application and
identify the Program Editor, Log, and Output windows. You should also
be able to create a test program by following the instructions below.
Let us start with a simple SAS program. This program creates a SAS
dataset named 'sample_accounts' with 3 fields: two with a numeric data
type and one with a character data type. It demonstrates the use of the
DATA, CARDS, and INPUT statements and of two procedures, PROC PRINT and
PROC FREQ. Every component of the program is explained in detail below.
/* start*/
Data sample_accounts;
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
CARDS;
1234670 Z 0000
1234671   3000
1234672 Z 0000
1234673 T 3200
1234674   8000
1234675 D 2000
1234676   4000
1234677 S 6000
1234678 T 8000
1234679 T 2000
;
run;
Proc Print data = sample_accounts;
Title " Account Sample";
run;
Proc Freq data =sample_accounts ;
tables StatusCode/missing;
run;
/* end */
Copy and paste the above program block into the SAS Program Editor and
click 'Submit'.
Note: SAS keywords such as DATA, INPUT, and PROC, as well as field
names, are case insensitive.
Proc Print and Proc Freq are included to show how these two procedures
are used on a dataset. They are explained in detail later in this guide.
DATA Statement
Use: Names a SAS data set - in this case, 'sample_accounts'
Syntax: DATA SOMENAME;
Data sample_accounts;
Result: A SAS data set named 'sample_accounts' is created
The DATA statement signals the beginning of a DATA step. The general
form of the SAS DATA statement is:
DATA SOMENAME;
The DATA statement names the data set you are creating. The name must
be 1-32 characters long, begin with a letter or underscore, and contain
only letters, digits, and underscores.
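A data set with a one-level name such as sample_accounts is temporary: it is stored in the WORK library and is deleted when the SAS session ends. The chapter objectives mention LIBNAME for creating a permanent dataset. A minimal sketch follows, assuming a hypothetical folder c:\sasdata that already exists and a hypothetical library reference mylib (any valid name of up to 8 characters would do). Only three rows of the sample data are used to keep it short.

```sas
/* Assumption: the folder c:\sasdata exists; 'mylib' is an illustrative libref */
libname mylib 'c:\sasdata';

/* The two-level name mylib.sample_accounts stores the data set in that
   folder, so it remains available after the SAS session ends */
data mylib.sample_accounts;
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
   cards;
1234670 Z 0000
1234671   3000
1234673 T 3200
;
run;
```

In a later session, submitting the same LIBNAME statement should make mylib.sample_accounts available again without re-reading the data.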
INPUT Statement
Use: Defines the field names, data types, and reading order of the
variables for SAS
Syntax: INPUT <variable name> <variable type> <data position>;
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
Result: Input data are defined for SAS
The INPUT statement specifies the names, data types, and reading order
of the variables in your data. Although there are several INPUT styles,
which can be used in combination, this program uses only the 'column'
input style.
Have a look at the INPUT statement in our example:
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
In this statement we tell SAS to create a numeric variable 'Account'
from positions 1-7 of each input data line. In the same way,
'StatusCode' is created as a character variable from position 9 (a
character variable is defined with a dollar sign ($) after the variable
name). The third variable, 'CreditLimit', is again numeric and is read
from positions 11-14.
If you look at the data lines in the program, it is easy to see how the
positions for each variable are determined.
Possible bugs: If you omit the $ sign when reading a character
variable, the SAS log will show a note such as 'NOTE: Invalid data for
StatusCode in line x 9-9.' and StatusCode will be missing in every row.
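The missing-$ problem described above can be reproduced with a short sketch. It uses three rows from the sample data; the data set name no_dollar is a hypothetical name chosen for illustration.

```sas
/* StatusCode is declared WITHOUT the $ sign, so SAS treats it as numeric */
data no_dollar;
   input Account 1-7 StatusCode 9 CreditLimit 11-14;
   cards;
1234670 Z 0000
1234671   3000
1234673 T 3200
;
run;

/* StatusCode should print as missing (.) in every row; the log should
   carry an "Invalid data for StatusCode" note for the rows holding Z and T */
proc print data=no_dollar;
run;
```

The row with a blank status code raises no note, because a blank is a legitimate missing value for a numeric variable. Restoring the $ after StatusCode should make the letters read correctly.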
CARDS Statement
Use: Signals that input data will follow
Syntax: CARDS;
Result: Data can be processed for the SAS data set
The CARDS statement signals that the data follow next; it immediately
precedes your input data lines. Use CARDS when the data need to be
embedded within the program. In newer versions, DATALINES can be used
in place of CARDS.
A typical use of the CARDS statement is:
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
CARDS;
1234670 Z 0000
1234671   3000
1234672 Z 0000
1234673 T 3200
1234674   8000
1234675 D 2000
1234676   4000
1234677 S 6000
1234678 T 8000
1234679 T 2000
;
run;
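Since DATALINES can stand in for CARDS, the same kind of step could be written as in the sketch below. It uses three rows of the sample data, and the data set name accounts_dl is a hypothetical name chosen so that it does not overwrite sample_accounts.

```sas
/* Same column-style INPUT as before; only the keyword CARDS is replaced */
data accounts_dl;
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
   datalines;
1234670 Z 0000
1234671   3000
1234673 T 3200
;
run;
```

The data lines and the closing semicolon on its own line are unchanged; the two keywords are interchangeable here.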
Note: If the data is contained in an external file, use an INFILE
statement instead of CARDS to specify where that file resides.
(Example: INFILE 'c:\accounts.txt';)
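As a sketch of the INFILE alternative, the step below reads the same three variables from the file named in the note. It assumes that c:\accounts.txt exists and uses the same column layout as the embedded data lines; the data set name accounts_file is hypothetical.

```sas
/* Assumption: c:\accounts.txt exists and holds one account per line,
   laid out in the same columns as the data lines shown earlier */
data accounts_file;
   infile 'c:\accounts.txt';
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
run;

/* Print the result to confirm the file was read as expected */
proc print data=accounts_file;
run;
```

Note that INFILE comes before INPUT and that no CARDS statement or data lines appear in the step.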
Possible bugs: Always check that the data are aligned in the program
editor the way the INPUT statement expects. For example, the INPUT
statement here expects the StatusCode value at position 9 and the
CreditLimit value at positions 11-14. Any variation in the positions
might cause data to be read into the wrong variables.
SEMICOLON
Use: Signals the end of any SAS statement
Syntax: A DATA step or PROCedure statement; (DATA;)
DATA sample_accounts;
INPUT Account 1-7 StatusCode $ 9
CreditLimit 11-14 ;
CARDS;
Proc Print
data = sample_accounts;
Title " Account Sample"; run;
Result: SAS is signaled that the statement is complete
The semicolon (;) is used as a delimiter to indicate the end of SAS
statements.
By now you might have noticed that every statement ends with a
semicolon (the data lines after CARDS are not statements, so they do
not). Because the semicolon, not the line break, ends a statement, a
statement may span several lines and several statements may share one
line. If you miss a semicolon, SAS typically reports an error such as
'Statement is not valid or it is used out of proper order'.
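A short sketch of how the semicolon, rather than the line layout, separates statements. The data set name semi_demo is a hypothetical name, and only two rows of the sample data are used.

```sas
data semi_demo;
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
   cards;
1234670 Z 0000
1234673 T 3200
;
run;

/* Three statements written on three lines */
proc print data=semi_demo;
   title "Account Sample";
run;

/* The same three statements written on a single line */
proc print data=semi_demo; title "Account Sample"; run;
```

Both PROC PRINT steps should produce the same listing. Writing one statement per line is only a readability convention.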
RUN Statement
Use: A step boundary that marks the end of a step.
Syntax: RUN;
Result: The statements and procedures specified in the SAS program
blocks are executed one by one.
The RUN statement signals where a step ends, and SAS executes the steps
one by one. The RUN statement is optional, because the next DATA or
PROC statement also ends the previous step, but it is common practice
to end every step with a RUN statement.
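The sketch below marks each step boundary with a comment. The data set name run_demo is a hypothetical name, and only two rows of the sample data are used.

```sas
data run_demo;
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
   cards;
1234670 Z 0000
1234673 T 3200
;
run;                        /* boundary: the DATA step executes here */

proc print data=run_demo;
run;                        /* boundary: PROC PRINT executes here */

proc freq data=run_demo;
   tables StatusCode;
run;                        /* boundary: PROC FREQ executes here */
```

If the final RUN were left out in an interactive session, the last step would normally wait until another step boundary is submitted, which is one reason for the habit of ending every step with RUN.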
PROC PRINT & PROC FREQ
PROC PRINT prints the contents of the newly created dataset to the
Output window so that you can see what the SAS DATA step read. PROC
FREQ creates a frequency table on the 'StatusCode' variable and prints
it to the Output window. These procedures are dealt with in more detail
later in this guide.
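As a small illustration of the two procedures, the sketch below uses three rows of the sample data, one of which has a blank status code. The data set name proc_demo is hypothetical, and the VAR statement and NOOBS option are additions not used in the chapter's program.

```sas
data proc_demo;
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
   cards;
1234670 Z 0000
1234671   3000
1234673 T 3200
;
run;

/* VAR limits the listing to the named columns;
   NOOBS suppresses the observation-number column */
proc print data=proc_demo noobs;
   var Account StatusCode;
run;

/* MISSING counts the blank StatusCode as a category of its own
   instead of leaving it out of the table */
proc freq data=proc_demo;
   tables StatusCode / missing;
run;
```

Without the MISSING option, PROC FREQ would report the blank status code only as a count of missing values beneath the table.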
TITLE Statement
Use: Puts titles on your output
Syntax: TITLE 'some title';
Title " Account Sample";
Result: A title is added at the top of each page of the printed output.
The TITLE statement assigns a title, which appears at the top of each
output page.
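A title stays in effect for later steps until it is changed or cleared. The sketch below shows this, assuming a hypothetical data set name title_demo and two rows of the sample data.

```sas
data title_demo;
   input Account 1-7 StatusCode $ 9 CreditLimit 11-14;
   cards;
1234670 Z 0000
1234673 T 3200
;
run;

title "Account Sample";

proc print data=title_demo;      /* output is headed "Account Sample" */
run;

proc freq data=title_demo;       /* the same title is still in effect */
   tables StatusCode;
run;

title;                           /* a TITLE statement with no text clears the title */
```

This is why the chapter's program shows the title on the PROC FREQ output as well, even though TITLE appears only inside the PROC PRINT step.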
How to Run a SAS Program
In a PC SAS (MS Windows) environment, three windows support SAS
programming. The Program Editor is used to compose and execute
programs, the Log window displays the execution log (error messages,
warnings, and other messages), and the Output window displays the
output generated by the program steps.
Analysts often use remote 'signon' to log on to a shared Unix SAS
server, in which case the program runs on the remote server. If you use
remote sign-on from PC SAS, the log and output are automatically
downloaded to the local PC.
Once the program has executed, check the log for any errors or
warnings. The log normally gives a clear idea of whether the syntax was
used incorrectly or there was a type mismatch in variables. Whenever
there is an error, fix it and resubmit the program.
On Unix and mainframe systems, programs are composed and executed
differently; please refer to the relevant manual. The good news is that
the SAS procedures and statements used on these platforms are the same.