45+ TOP SAS Programming Interview Questions and Answers
1. What areas of SAS are you most interested in?
A) BASE, STAT, GRAPH, ETS
2. Briefly describe 5 ways to do a “table lookup” in SAS.
A) Match Merging, Direct Access, Format Tables, Arrays, PROC SQL
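As an illustration of one of these methods, here is a minimal sketch of a format-table lookup. The format name $statefmt, the data set lookup_demo, and the codes are hypothetical and only serve the example.

```sas
/* Hypothetical lookup table stored as a character format */
proc format;
  value $statefmt
    'NY'  = 'New York'
    'CA'  = 'California'
    other = 'Unknown';
run;

data lookup_demo;
  input code $2.;
  /* PUT applies the format, so the lookup needs no sort or merge */
  state_name = put(code, $statefmt.);
  datalines;
NY
CA
TX
;
run;

proc print data=lookup_demo;
run;
```

A format lookup avoids sorting either table, which is why it is often preferred when the lookup table is small and the main table is large.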
3. What versions of SAS have you used (on which platforms)?
A) SAS 9.1.3, 9.0, and 8.2 on Windows and UNIX; SAS 7 and 6.12
4. What are some good SAS programming practices for processing very large data sets?
A) Test on a sample by using the OBS= option or by subsetting, comment the code, and use DATA _NULL_ when no output data set is needed.
5. What are some problems you might encounter in processing missing values? In Data steps? Arithmetic? Comparisons? Functions? Classifying data?
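A) An arithmetic operation on a missing value produces a missing value, whereas functions such as SUM and MEAN ignore missing arguments. In comparisons, a missing numeric value is treated as smaller than any number. Most SAS statistical procedures exclude observations with missing values in the analysis variables, and observations with missing classification values are usually dropped unless you ask for them.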
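The difference between arithmetic operators, functions, and comparisons can be checked with a short sketch; the variable names are illustrative only.

```sas
data _null_;
  a = 5;
  b = .;
  plus_result = a + b;       /* arithmetic operator: result is missing */
  sum_result  = sum(a, b);   /* SUM function ignores the missing argument: 5 */
  is_smaller  = (b < a);     /* missing compares lower than any number: 1 */
  put plus_result= sum_result= is_smaller=;
run;
```

The log also carries a note that missing values were generated by the operation on missing values, which is worth watching for in production jobs.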
6. How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?
A) Using PROC TRANSPOSE
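A minimal sketch of this, assuming a hypothetical input data set named long with a single variable named value:

```sas
/* Build a hypothetical data set with 30 observations and 1 variable */
data long;
  do i = 1 to 30;
    value = i * 10;
    output;
  end;
  drop i;
run;

/* Result: 1 observation with 30 variables named col1-col30 */
proc transpose data=long out=wide(drop=_name_) prefix=col;
  var value;
run;
```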
7. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
A) Functions work across variables within one observation, inside the DATA step. Procedures work across observations, have a wider scope, and can send their results to a different data set.
8. If you were told to create many records from one record, show how you would do this using array and with PROC TRANSPOSE?
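A) With arrays: declare an array over the variables in the record, loop over it with a DO loop, and write one observation per element with an OUTPUT statement. With PROC TRANSPOSE: list the variables in the VAR statement.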
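Both approaches can be sketched as follows. The data set wide and the variables id, q1-q3, quarter, and amount are hypothetical names chosen for the example.

```sas
data wide;
  input id q1 q2 q3;
  datalines;
1 10 20 30
2 40 50 60
;
run;

/* Array approach: one output record per array element */
data long_array;
  set wide;
  array q{3} q1-q3;
  do quarter = 1 to dim(q);
    amount = q{quarter};
    output;
  end;
  keep id quarter amount;
run;

/* PROC TRANSPOSE approach: one output record per VAR variable per BY group */
proc sort data=wide;
  by id;
run;

proc transpose data=wide out=long_tr(rename=(col1=amount)) name=quarter_var;
  by id;
  var q1-q3;
run;
```

The array version gives full control over the names and values of the new variables; the PROC TRANSPOSE version needs less code but leaves the source variable name in a text column.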
9. What do the SAS log messages “numeric values have been converted to character” mean? What are the implications?
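A) It means that a numeric value was used where a character value was expected, for example as an argument to a character function, so SAS converted it automatically. The implication is that the conversion is implicit and its result (the format used, leading blanks) may not be what was intended, so the code should be checked.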
10. Why is a STOP statement needed for the POINT= option on a SET statement?
A) Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially.
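A minimal sketch of direct access with POINT=, assuming a hypothetical data set big with 100 observations:

```sas
data big;
  do n = 1 to 100;
    output;
  end;
run;

/* Read observations 5, 10, and 15 directly */
data picked;
  do obsnum = 5, 10, 15;
    set big point=obsnum;   /* obsnum is temporary and is not written out */
    output;
  end;
  stop;   /* without STOP the step would never reach end-of-file */
run;
```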
11. How do you control the number of observations and/or variables read or written?
A) Use the FIRSTOBS= and OBS= options to control observations, and KEEP= and DROP= to control variables.
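A short sketch using the SASHELP.CLASS sample data set; the output data set name subset is hypothetical.

```sas
/* Read observations 5 through 10 and only three variables,
   then drop one of them when writing the output data set */
data subset(drop=weight);
  set sashelp.class(firstobs=5 obs=10 keep=name age weight);
run;
```

Note that OBS= names the last observation to read, not a count, so this reads six observations.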
12. Approximately what date is represented by the SAS date value of 730?
A) 31st December 1961
13. Identify statements whose placement in the DATA step is critical.
A) INPUT, DATA and RUN…
14. What does the RUN statement do?
A) The RUN statement marks the end of a DATA or PROC step and tells SAS to execute it. If another DATA or PROC step follows, the start of that step acts as the step boundary, so the RUN statement can be omitted, although including it is clearer.
15. Why is SAS considered self-documenting?
A) SAS is considered self-documenting because, when a data set is created, SAS stores descriptive information inside it, such as the creation date and time, the number of variables, and their labels. You can view this information with PROC CONTENTS.
16. What are some good SAS programming practices for processing very large data sets?
A) Sort them once, and use FIRSTOBS= and OBS= to test on a subset.
17. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
A) Functions are used inside the DATA step and work on the same data set, whereas with PROCs you can create new data sets to hold the results. There may be other differences.
18. If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?
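A) I would use PROC TRANSPOSE when there are few variables and arrays when there are many; it depends on the data.
19. What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?
A) FIRST. and LAST. variables require a BY statement, and by default the data must be sorted by the BY variable. If the data are grouped but not sorted, add the NOTSORTED option to the BY statement; otherwise sort the data first.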
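For data that are grouped but not in sorted order, the NOTSORTED option of the BY statement can be sketched like this. The data set visits and its variables are hypothetical.

```sas
data visits;
  input site $ count;
  datalines;
B 1
B 2
A 3
A 4
C 5
;
run;

data flagged;
  set visits;
  by site notsorted;       /* groups are contiguous but not in sort order */
  is_first = first.site;   /* 1 on the first record of each run of site values */
  is_last  = last.site;    /* 1 on the last record of each run */
run;
```

If the same site value appeared again later in the file, NOTSORTED would treat it as a new group, so this only substitutes for sorting when the groups are contiguous.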
20. How do you debug and test your SAS programs?
A) First, check the log for errors, warnings, and in some cases notes; alternatively, use the DATA step debugger.
21. What other SAS features do you use for error trapping and data validation?
A) Check the log, and for data validation use procedures such as PROC FREQ, PROC MEANS, or sometimes PROC PRINT to see what the data look like.
22. How would you combine 3 or more tables with different structures?
A) Sort them by their common variables and use a MERGE statement. The exact approach depends on what “different structures” means.
23. What are _numeric_ and _character_ and what do they do?
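A) _NUMERIC_ and _CHARACTER_ are special variable lists that refer to all numeric and all character variables currently defined in the data set, so a statement can read or write all of them without listing each name.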
24. For what purpose would you use the RETAIN statement?
A) The RETAIN statement holds the values of variables across iterations of the DATA step. Normally, variables created by INPUT or assignment statements are reset to missing at the start of each iteration.
What is the order of evaluation of the arithmetic operators: + - * / ** ()?
A) (), **, then * and /, then + and -
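Both the RETAIN behavior and the operator order can be checked with a short sketch. It uses the SASHELP.CLASS sample data set; the names running and total are hypothetical.

```sas
/* RETAIN: total keeps its value from one iteration to the next */
data running;
  set sashelp.class(keep=name weight);
  retain total 0;
  total = total + weight;
run;

/* Operator order: ** before *, * before +, parentheses first */
data _null_;
  x = 2 + 3 * 2 ** 2;     /* 2 + 3*4 = 14 */
  y = (2 + 3) * 2 ** 2;   /* 5 * 4 = 20 */
  put x= y=;
run;
```

Without the RETAIN statement, total would be reset to missing on every iteration and the running sum would stay missing.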
25. How could you generate test data with no input data?
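A) Use a DATA step with no input, for example a DO loop with an OUTPUT statement to build a data set, or DATA _NULL_ with PUT statements to write a test file.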
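A minimal sketch of generating a test data set with no input data. The data set name test, the seed, and the variables are assumptions made for the example.

```sas
data test;
  length group $4;
  call streaminit(12345);            /* fixed seed so the run is repeatable */
  do id = 1 to 100;
    score = rand('uniform') * 100;   /* pseudo-random value between 0 and 100 */
    if mod(id, 2) = 0 then group = 'even';
    else group = 'odd';
    output;                          /* one observation per loop iteration */
  end;
run;
```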
26. How do you debug and test your SAS programs?
A) Use OBS=0 to check the syntax without processing data, and use system options to trace the program execution in the log.
27. What can you learn from the SAS log when debugging?
A) It shows the execution of the whole program and its logic. It also reports each error with its line number so that you can find and edit the problem in the program.
28. What is the purpose of _error_?
A) _ERROR_ is an automatic variable with only two values: 1 when an error occurs during execution of the DATA step and 0 when there is no error.
29. How do you test for missing values?
A) Use conditional statements such as IF-THEN/ELSE, WHERE, and SELECT, comparing against a missing value (for example IF x = .) or calling the MISSING function.
30. In the flow of DATA step processing, what is the first action in a typical DATA Step?
A) When you submit a DATA step, SAS first compiles it: it checks the syntax and creates the input buffer (when raw data are read) and the program data vector (PDV). It then executes the step and creates the new SAS data set.
31. What are SAS/ACCESS and SAS/CONNECT?
A) SAS/ACCESS lets SAS read and write data in external databases such as Oracle, SQL Server, and MS Access. SAS/CONNECT lets a SAS session work with SAS sessions on remote servers.
32. What is the one statement to set the criteria of data that can be coded in any step?
A) OPTIONS Statement, Label statement, Keep / Drop statements.
33. What is the purpose of using the N=PS option?
A) The N=PS option creates a buffer in memory which is large enough to store PAGESIZE (PS) lines and enables a page to be formatted randomly prior to it being printed.
34. What are the scrubbing procedures in SAS?
A) Proc Sort with nodupkey option, because it will eliminate the duplicate values.
35. What are the new features included in the new version of SAS i.e., SAS9.1.3?
A) The main advantages of version 9 are faster execution of applications and centralized access to data and support.
Many changes were made in version 9 compared with version 8. The following are a few:
SAS version 9 supports format and informat names longer than 8 characters, which is not possible in version 8.
The length allowed for a numeric format name in version 9 is 32, and for a character format name it is 31.
The length allowed for a numeric informat name in version 9 is 31, and for a character informat name it is 30.
Three new informats are available in version 9 to convert various date, time, and datetime forms of data into a SAS date, time, or datetime value:
• ANYDTDTEw. – Converts to a SAS date value.
• ANYDTTMEw. – Converts to a SAS time value.
• ANYDTDTMw. – Converts to a SAS datetime value.
The CALL SYMPUTX routine was added in version 9. It creates a macro variable at execution time in the DATA step by:
• Trimming leading and trailing blanks.
• Automatically converting a numeric value to character.
A new ODS option (COLUMNS=) creates multiple columns in the output.
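Two of these version 9 features can be sketched briefly. The macro variable name count and the sample date string are assumptions made for the example.

```sas
data _null_;
  /* ANYDTDTEw. reads a date without naming the exact date layout */
  d = input('2024-01-15', anydtdte10.);
  put d= date9.;

  /* CALL SYMPUTX converts the number to text and trims blanks */
  n = 42;
  call symputx('count', n);
run;

%put count=&count;
```

With the older CALL SYMPUT, the numeric value would be converted with a note in the log and stored with leading blanks, which is the behavior CALL SYMPUTX was added to avoid.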
36. What do the PUT and INPUT functions do?
A) The INPUT function converts character data values to numeric values. The PUT function converts numeric values to character values.
Ex: for INPUT: INPUT(source, informat)
For PUT: PUT(source, format)
Note that the INPUT function requires an informat and the PUT function requires a format.
If we omit INPUT or PUT during a conversion, SAS detects the mismatched types and attempts an automatic character-to-numeric or numeric-to-character conversion, writing a note to the log. This does not always work; for example, a character value containing a $ sign cannot be converted automatically. It is therefore advisable to code INPUT and PUT explicitly wherever conversions occur.
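A minimal sketch of both conversions; all variable names are illustrative.

```sas
data _null_;
  char_amount = '1234.50';
  num_amount  = input(char_amount, 8.);   /* character to numeric */

  num_id   = 987;
  char_id  = put(num_id, 8.);             /* numeric to character, right-aligned */
  char_id2 = strip(put(num_id, 8.));      /* same, with the blanks removed */

  put num_amount= char_id= char_id2=;
run;
```

Because the conversions are explicit, the log carries no conversion notes and the informat and format used are visible in the code.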
37. Which date function advances a date, time or datetime value by a given interval?
A) The INTNX function advances a date, time, or datetime value by a given interval and returns a date, time, or datetime value. Ex: INTNX(interval, start-from, number-of-increments, alignment)
Related functions:
INTCK(interval, start-of-period, end-of-period) counts the number of interval boundaries between two SAS date, time, or datetime values.
DATETIME() returns the current date and time of day.
DATDIF(sdate, edate, basis) returns the number of days between two dates.
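The date functions above can be tried with a short sketch; the dates and variable names are chosen only for illustration.

```sas
data _null_;
  start = '15JAN2024'd;

  /* Default alignment is the beginning of the interval */
  next_month_start = intnx('month', start, 1);          /* 01FEB2024 */
  next_month_end   = intnx('month', start, 1, 'end');   /* 29FEB2024 */

  /* INTCK counts interval boundaries crossed, not elapsed time */
  months_between = intck('month', '31JAN2024'd, '01FEB2024'd);   /* 1 */

  /* Actual number of days between the two dates */
  days_between = datdif('01JAN2024'd, '01MAR2024'd, 'actual');   /* 60 */

  put next_month_start= date9. next_month_end= date9.
      months_between= days_between=;
run;
```

The INTCK result shows why the function should not be read as "months elapsed": one day apart still counts as one month boundary.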
38. How might you use MOD and INT on numeric to mimic SUBSTR on character Strings?
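A) The first argument to the MOD function is numeric and the second is a nonzero numeric; the result is the remainder when argument-1 is divided by argument-2. The INT function takes one argument and returns its integer portion, truncating the decimal portion. Note that the argument can be an expression. Dividing by a power of 10 and applying INT drops digits on the right, while MOD by a power of 10 keeps digits on the right, so together they extract digit ranges the way SUBSTR extracts characters.
DATA NEW ;
A = 123456 ;
X = INT( A/1000 ) ;              /* first three digits: 123 */
Y = MOD( A, 1000 ) ;             /* last three digits: 456 */
Z = MOD( INT( A/100 ), 100 ) ;   /* third and fourth digits: 34 */
PUT A= X= Y= Z= ;
RUN ;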
39. In ARRAY processing, what does the DIM function do?
A) DIM returns the number of elements in an array. When DIM supplies the stop value of an iterative DO statement, you do not have to re-specify the stop value if you change the dimension of the array.
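A minimal sketch of DIM as the stop value of a DO loop, using the SASHELP.CLASS sample data set; the output data set name scaled is hypothetical.

```sas
data scaled;
  set sashelp.class;
  array nums{*} height weight;   /* add or remove variables here freely */
  do i = 1 to dim(nums);         /* DIM follows the array size automatically */
    nums{i} = round(nums{i}, 1);
  end;
  drop i;
run;
```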
40. How would you determine the number of missing or nonmissing values in computations?
A) To count the missing values that are excluded from a computation, use the NMISS function; to count the nonmissing values, use the N function.
DATA _NULL_ ;
m = . ;
y = 4 ;
z = 0 ;
N = N(m , y, z);
NMISS = NMISS (m , y, z);
PUT N= NMISS= ;
RUN ;
The above program results in N = 2 (number of nonmissing values) and NMISS = 1 (number of missing values).
41. Do you need to know if there are any missing values?
A) The MISSING function takes a single argument and returns 1 if that value is missing and 0 otherwise: is_missing=MISSING(field1);
To check several numeric fields at once, or to find out how many missing values there are, use num_missing=NMISS(field1,field2,field3); the result is 0 when none are missing.
You can also find the number of nonmissing values with non_missing=N(field1,field2,field3);
42. What are the validation tools in SAS?
A) For data sets: DATA data-set-name / DEBUG and DATA data-set-name / STMTCHK
For macros: OPTIONS MPRINT MLOGIC SYMBOLGEN;
43. How can you put a “trace” in your program?
A) Use ODS TRACE ON and ODS TRACE OFF; the trace records are written to the log.
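How would you code a merge that will keep only the observations that have matches from both data sets?
A) Use the IN= data set option, which creates a temporary variable set to 1 when that data set contributes to the current observation. In a match-merge (with a BY statement on the common key), either of the following forms keeps only the matching observations:
merge one(in=x) two(in=y);
if x=1 and y=1;
or, equivalently:
merge one(in=x) two(in=y);
if x and y;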
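A complete version of this merge can be sketched as follows. The key variable id and the sample values are hypothetical; the data set names one and two follow the example above.

```sas
data one;
  input id a;
  datalines;
1 10
2 20
3 30
;
run;

data two;
  input id b;
  datalines;
2 200
3 300
4 400
;
run;

/* Both inputs are already in id order, so no PROC SORT is needed here */
data both;
  merge one(in=x) two(in=y);
  by id;
  if x and y;   /* keep only ids present in both data sets: 2 and 3 */
run;
```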
44. What are input dataset and output dataset options?
A) Input data set options include OBS=, FIRSTOBS=, and IN=. Output data set options include COMPRESS= and REUSE=. Options that can be used on both input and output data sets include KEEP=, DROP=, RENAME=, and WHERE=.
45. What is SAS GRAPH?
A) SAS/GRAPH software creates and delivers accurate, high-impact visuals that enable decision makers to gain a quick understanding of critical business issues.