A Simple Program Explained

Objective of this chapter:

To create a simple SAS program in the SAS System for PC and understand the basic program constructs. Specifically:
  1. Learn to use the DATA statement.
  2. Learn to use the INPUT statement.
  3. Learn to use the CARDS statement.
  4. Learn where to use the semicolon (;).
  5. Learn the RUN statement.
  6. Learn the TITLE statement.
  7. Learn to use two procedures: PRINT and FREQ.
  8. Learn how to create a permanent dataset using LIBNAME.
  9. Learn how to execute a SAS program.
Pre-Requisite: You should know how to open the SAS System for PC and identify the Program Editor, Log and Output windows. You should also be able to create a test program by following the instructions below.

Let us start with a simple SAS program. This program creates a SAS dataset named 'sample_accounts' with 3 fields: two with a numeric data type and one with a character data type. The program demonstrates the DATA, INPUT and CARDS statements and two procedures, PROC PRINT and PROC FREQ. Every component of the program is explained in detail below.

/* start*/

Data sample_accounts;
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
CARDS;
1234670 Z 0000
1234671   3000
1234672 Z 0000
1234673 T 3200
1234674   8000
1234675 D 2000
1234676   4000
1234677 S 6000
1234678 T 8000
1234679 T 2000
;
run;

Proc Print data = sample_accounts;
Title " Account Sample";
run;

Proc Freq data =sample_accounts ;
tables StatusCode/missing;
run;
/* end */

Copy and paste the above program block into the SAS Program Editor and click 'Submit'.

Note: SAS keywords such as DATA, INPUT and PROC, as well as field names, are case insensitive.


Proc Print and Proc Freq are included to show how these two procedures are used on a dataset. They are explained in detail later in this guide.

DATA Statement

Use: Names a SAS data set - in this case, 'sample_accounts'
Syntax: DATA SOMENAME;

Data sample_accounts;
Result: A  SAS data set named 'sample_accounts' is created

The DATA statement signals the beginning of a DATA step. The general form of the SAS DATA statement is:
DATA SOMENAME;
The DATA statement names the data set you are creating. The name must be 1-32 characters long, must begin with a letter or an underscore, and may contain only letters, digits and underscores.
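
Objective 8 mentions permanent datasets. A dataset created with a one-level name such as 'sample_accounts' is temporary: it is stored in the WORK library and is deleted when the SAS session ends. As a minimal sketch, assuming a folder 'c:\sasdata' already exists on your PC (the folder and the library name 'mylib' are made up for illustration), a LIBNAME statement together with a two-level name makes the dataset permanent:

/* start */
Libname mylib 'c:\sasdata';   /* hypothetical folder; it must already exist */

Data mylib.sample_accounts;   /* two-level name: library.dataset */
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
CARDS;
1234670 Z 0000
1234671   3000
;
run;
/* end */

In a later session, the same LIBNAME statement followed by 'Proc Print data = mylib.sample_accounts;' should find the stored dataset again.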

INPUT Statement

Use: Tells SAS the names, data types and reading order of the variables
Syntax: INPUT <variable name>  <variable type>  <data position>;

INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
Result: Input data are defined for SAS

The INPUT statement specifies the names, data types and reading order of the variables in your data. Although there are several styles of input, which can be used in combination, this program uses only the 'column' input style.


Have a look at the INPUT statement in our example :

INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;


This statement tells SAS to create a numeric variable 'Account' from positions 1-7 of each input data line. In the same way, 'StatusCode' is created as a character variable from position 9 (a character variable is marked with a dollar sign ($) after the variable name). The third variable, 'CreditLimit', is again numeric and is read from positions 11-14.

If you look at the data lines in the program, it is easy to see how the positions for each variable are determined.

Possible bugs: If you omit the $ sign when reading a character variable, SAS treats the variable as numeric and the log shows a note similar to 'NOTE: Invalid data for StatusCode in line x 9-9.' for each line that holds a letter in that position. StatusCode will then have all values missing.
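
To see this problem for yourself, the following sketch repeats two of the data lines with the $ sign left out on purpose. The dataset name 'missing_dollar' is made up for illustration. After submitting it, check the Log window for the invalid data note.

/* start */
Data missing_dollar;
INPUT Account 1-7 StatusCode 9 CreditLimit 11-14 ;   /* $ omitted on purpose */
CARDS;
1234670 Z 0000
1234673 T 3200
;
run;

Proc Print data = missing_dollar;
run;
/* end */

Putting the $ back after StatusCode should make the note disappear and the status codes appear in the printed output.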

CARDS Statement

Use: Signals that input data will follow
Syntax: CARDS;
Result: Data can be processed for the SAS data set

The CARDS statement signals that the data will follow next; it immediately precedes your input data lines. DATALINES is the newer name for the same statement and can be used in place of CARDS. Use CARDS when the data needs to be embedded within the program.

The CARDS statement is used as follows:

Data sample_accounts;
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
CARDS;
1234670 Z 0000
1234671   3000
1234672 Z 0000
1234673 T 3200
1234674   8000
1234675 D 2000
1234676   4000
1234677 S 6000
1234678 T 8000
1234679 T 2000
;
run;
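
As a short sketch of the DATALINES alternative mentioned above, the same step can be written with DATALINES in place of CARDS. Only the first three data lines are repeated here to keep the example short.

/* start */
Data sample_accounts;
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
DATALINES;
1234670 Z 0000
1234671   3000
1234672 Z 0000
;
run;
/* end */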

Note: If the data is contained in an external file, instead of CARDS, you will use an INFILE statement to specify where that file resides. (Example: INFILE 'c:\accounts.txt';).
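
As a sketch of the INFILE approach, assume the file 'c:\accounts.txt' from the note exists and holds the same fixed-column layout as the data lines above. The dataset name 'accounts_from_file' is made up for illustration.

/* start */
Data accounts_from_file;
INFILE 'c:\accounts.txt';     /* must come before the INPUT statement */
INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
run;
/* end */

No CARDS statement and no data lines are needed in this case, because SAS reads the records from the file.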

Possible bugs: Always check that the data lines are aligned in the Program Editor the way the INPUT statement expects. For example, the INPUT statement above expects the StatusCode value at position 9 and the CreditLimit value at positions 11-14. Any variation in the positions might result in data being read into the wrong variables.


SEMICOLON

Use: Signals the end of any SAS statement
Syntax: Any DATA step or PROC statement, followed by a semicolon (for example, DATA sample_accounts;)
DATA sample_accounts;

INPUT Account 1-7 StatusCode $ 9 CreditLimit 11-14 ;
CARDS;

Proc Print data = sample_accounts;
Title " Account Sample"; run;
Result: SAS is signaled that the statement is complete
The semicolon (;) is used as a delimiter to indicate the end of SAS statements.

By now you will have noticed that almost every statement ends with a semicolon. The data lines that follow CARDS are the exception. If you miss a semicolon, SAS usually reports an error in the log, for example "Statement is not valid or it is used out of proper order".
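
The sketch below shows one way a missing semicolon can go wrong, assuming the 'sample_accounts' dataset already exists. The exact error message depends on where the semicolon is missing, so check the Log window rather than expecting one particular text.

/* start */
/* Incorrect: the semicolon after the PROC PRINT statement is missing, */
/* so SAS reads TITLE as part of the PROC PRINT statement.             */
Proc Print data = sample_accounts
Title "Account Sample";
run;

/* Correct */
Proc Print data = sample_accounts;
Title "Account Sample";
run;
/* end */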

RUN Statement

Use:  A step boundary that shows the end of that step.
Syntax: RUN;
Result:  The statements and procedures specified in the SAS program blocks are executed one by one.

The RUN statement signals where a step ends, and SAS executes the steps one by one. The RUN statement is optional, but it is common practice to end every step with one.
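
A sketch of how step boundaries work, assuming the 'sample_accounts' dataset from the first program already exists. The PROC FREQ statement itself ends the PROC PRINT step, so in an interactive session only the last step strictly needs a RUN statement:

/* start */
Proc Print data = sample_accounts;     /* no RUN: the next PROC ends this step  */
Proc Freq data = sample_accounts;
tables StatusCode/missing;
run;                                   /* without this, the last step waits     */
/* end */

Ending each step with its own RUN, as the main program does, is easier to read and keeps the log messages for each step next to the statements that produced them.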

PROC PRINT & PROC FREQ

PROC PRINT prints the contents of the dataset to the Output window so that you can see what the DATA step read. PROC FREQ creates a frequency table on the 'StatusCode' variable and prints it to the Output window. The procedures are covered in more detail later in this guide.
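
The MISSING option in the TABLES statement deserves a brief illustration. In the sample data, three of the ten records have a blank StatusCode. With /missing, PROC FREQ lists the blank value as a category of its own; without it, those records are left out of the table and are only counted in a 'Frequency Missing' line beneath it. The sketch below, which assumes 'sample_accounts' already exists, runs both variants. It also uses a VAR statement, not used elsewhere in this chapter, to limit the columns that PROC PRINT shows.

/* start */
Proc Print data = sample_accounts;
var Account StatusCode;          /* print only these two columns */
run;

Proc Freq data = sample_accounts;
tables StatusCode;               /* blank values left out of the table */
run;

Proc Freq data = sample_accounts;
tables StatusCode/missing;       /* blank values shown as a category */
run;
/* end */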

TITLE Statement

Use:  Puts TITLES on your output
Syntax: TITLE 'some title';
Title " Account Sample";
Result:  A TITLE is added at the top of each page of the output printed.
The TITLE statement assigns a title, which appears at the top of the output page.
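
Titles are global: once set, a title stays in effect for later steps until it is changed or cancelled. A minimal sketch, assuming 'sample_accounts' already exists (the second title text is made up for illustration):

/* start */
Title "Account Sample";
Title2 "Status code frequencies";   /* second title line */
Proc Freq data = sample_accounts;
tables StatusCode/missing;
run;

Title;                              /* cancels all titles */
/* end */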

How to Run a SAS Program

In a PC SAS (MS Windows) environment, three windows help in SAS programming. The Program Editor is used to compose and execute programs, the Log window displays the log of the execution (error messages, warnings and other messages), and the Output window displays the output generated by the program steps.


Analysts often use remote ‘signon’ to log on to a shared SAS server on Unix, in which case the program runs on the remote server. If you are using remote sign-on from PC SAS, the log and output are automatically downloaded to the local PC.


Once the program has executed, check the log for errors and warnings. The log normally gives a clear idea of whether the syntax was used incorrectly or there was a type mismatch in variables. Whenever there is an error, fix it and resubmit the program.


On Unix and mainframe systems, programs are composed and executed differently; please refer to the relevant manual. The good news is that the SAS procedures and statements used on these platforms are the same.

 

Copyright free public information. All trademarks, service marks, logos and names are properties of their respective owners.