40+ Top SAS Interview Questions and Answers
1. What is SAS?
SAS is an integrated set of software products. The acronym stands for Statistical Analysis System.
2. What are the special input delimiters?
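The special input delimiters are the DLM= and DSD options of the INFILE statement. DLM= names the delimiter character; DSD treats two consecutive delimiters as a missing value and allows delimiters inside quoted strings.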
3. What is the difference between a format and an informat?
Format: A format is an instruction for writing data, e.g. WORDDATE18. and WEEKDATEw.
Informat: An informat is an instruction for reading data, e.g. COMMAw., DOLLARw. and the date and time informats (MMDDYYw., DATEw., TIMEw.), or PERCENTw.
Describe a SAS function.
TRIM: removes trailing blanks from a character expression.
Str1 = 'my';
Str2 = 'dog';
Result = TRIM(Str1) || Str2;
Result = 'mydog'
4. What is a PDV?
A PDV, or Program Data Vector, is a logical area in memory where SAS builds a data set one observation at a time. At compilation time, an input buffer is created to hold a record from an external file. The PDV is created after the input buffer.
5. What is a PUT statement?
The PUT statement is a flexible tool in DATA step programming. It writes variable values and text to the SAS log or to the file named in a FILE statement.
Examples of a PUT statement are:
PUT _all_ – writes the values of all variables
PUT 132*'_' – writes 132 underscores
PUT one two three – writes three variable values separated by a space.
6. What is the difference between SAS functions and procedures?
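Functions expect values to be supplied across an observation, that is, across several variables in the same row.
Procedures expect one variable value per observation, that is, they work down a column over many rows.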
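The contrast can be sketched as follows. This is a minimal illustration, and the data set names (work.scores, work.row_means) and variables (x1 to x3) are made up for the example.

```sas
/* Hypothetical sample data: two observations, three numeric variables */
data work.scores;
  input x1 x2 x3;
  datalines;
1 2 3
4 5 6
;
run;

/* Function: MEAN works across the variables of each observation */
data work.row_means;
  set work.scores;
  row_mean = mean(x1, x2, x3);   /* 2 for the first row, 5 for the second */
run;

/* Procedure: PROC MEANS works down one variable over all observations */
proc means data=work.scores mean;
  var x1;                        /* mean of 1 and 4 */
run;
```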
7. Which SAS Statement does not perform automatic conversions in comparisons?
The WHERE statement does not perform automatic conversions in comparisons.
8. SAS contains how many data types? What are they?
SAS has 2 data types, Character and Numeric.
9. What is SAS GRAPH?
SAS/GRAPH software creates and delivers accurate, high-impact visuals that enable decision makers to gain a quick understanding of critical business issues.
10. Why is a STOP statement needed for the POINT= option on a SET statement?
When you use the POINT= option, you must include a STOP statement to stop DATA step processing, programming logic that checks for an invalid value of the POINT= variable, or both. Because POINT= reads only the observations that are specified in the DO statement, SAS cannot read an end-of-file indicator as it would if the file were being read sequentially. Reading an end-of-file indicator is what ends a DATA step automatically, so failing to supply another means of ending the DATA step when you use POINT= can cause the step to go into a continuous loop.
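A minimal sketch of the pattern, assuming a made-up data set work.source with five observations; the names work.source, work.picked and obsnum are illustrative only.

```sas
/* Hypothetical input data set with five observations */
data work.source;
  do id = 1 to 5;
    value = id * 10;
    output;
  end;
run;

/* Read observations 2 and 4 directly by observation number */
data work.picked;
  do obsnum = 2, 4;
    set work.source point=obsnum;
    if _error_ then abort;   /* guard against an invalid POINT= value */
    output;
  end;
  stop;                      /* no end-of-file is reached with POINT=, so end the step explicitly */
run;
```

The POINT= variable (obsnum here) is temporary and is not written to the output data set.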
11) Which date function advances a date, time or datetime value by a given interval?
The INTNX function (described under question 29).
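12) How can we call macros within a DATA step?
We can invoke a macro from a DATA step with CALL EXECUTE. CALL SYMPUT is the related routine that assigns a DATA step value to a macro variable.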
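As a hedged sketch of the two DATA step interfaces to the macro facility: CALL SYMPUT creates a macro variable from a DATA step value, while CALL EXECUTE is the routine that actually invokes a macro. The macro name greet and the macro variable who are invented for this example.

```sas
/* Hypothetical macro used only for illustration */
%macro greet(name);
  %put Hello, &name;
%mend greet;

data _null_;
  call symput('who', 'World');     /* creates macro variable WHO with the value World */
  call execute('%greet(World)');   /* invokes the macro from inside the DATA step */
run;

%put who=&who;                     /* WHO is available once the DATA step has finished */
```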
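13) In the flow of DATA step processing, what is the first action in a typical DATA step?
When you submit a DATA step, SAS first compiles it and then executes it to create a new SAS data set.
Compilation phase: SAS checks the syntax of the statements and creates the input buffer and the PDV.
Execution phase: SAS processes the data one observation at a time and writes the new data set.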
14) How do you identify a macro variable?
A macro variable reference is identified by a leading ampersand (&), for example &a.
15) What are SAS/ACCESS and SAS/CONNECT?
SAS/ACCESS provides interfaces for reading and writing data held in external databases such as Oracle, SQL Server and MS Access. SAS/CONNECT links a local SAS session to a SAS session on a remote server.
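16) How could you generate test data with no input data?
Use a DATA step with no SET or INPUT statement: a DO loop with an OUTPUT statement writes one observation per iteration.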
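17) What is the one statement that sets the criteria for the data and can be coded in any step?
The WHERE statement, which can be used in both DATA steps and PROC steps. Other statements that can appear in either kind of step include the OPTIONS, LABEL and KEEP / DROP statements, but they do not select observations.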
18) What is the purpose of using the N=PS option?
The N=PS option (on the FILE statement) creates a buffer in memory that is large enough to store PAGESIZE (PS) lines, so the lines of a page can be written in any order before the page is printed.
19) What are the scrubbing procedures in SAS?
PROC SORT with the NODUPKEY option, because it eliminates observations with duplicate BY values.
20) What are the new features included in the new version of SAS i.e., SAS9.1.3?
The main advantage of version 9 is faster execution of applications and centralized access of data and support.
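21) What is the YEARCUTOFF option in SAS?
The YEARCUTOFF= system option sets the first year of the 100-year span that SAS uses to interpret two-digit years. For example, a span that ends in 2049 starts in 1950, so YEARCUTOFF=1950 covers the years 1950 to 2049.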
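The effect on two-digit years can be sketched as below. The cutoff value 1950 is chosen purely for illustration, giving a span of 1950 to 2049.

```sas
/* Illustrative cutoff: two-digit years are read as 1950 through 2049 */
options yearcutoff=1950;

data _null_;
  d1 = input('01/01/49', mmddyy8.);   /* 49 falls at the end of the span: 2049 */
  d2 = input('01/01/50', mmddyy8.);   /* 50 falls at the start of the span: 1950 */
  put d1= date9. d2= date9.;          /* writes d1=01JAN2049 d2=01JAN1950 */
run;
```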
22) How to add a number to a macro variable?
Use %EVAL to do simple integer arithmetic on macro variables.
%let a = 1;
%let b = %eval(&a+1);
%put a=&a b=&b;
23) What has been your most common programming mistake?
When I started learning SAS and in my initial projects, my common mistakes were missing semicolons, not checking the log after submitting a program, not using debugging techniques, and not making enough use of FSVIEW.
24) Have you ever had to follow SOPs or programming guidelines?
An SOP describes the process that assures that standard coding activities, which produce tables, listings and graphs, functions and/or edit checks, are conducted in accordance with industry standards and are appropriately documented.
25) Name several ways to achieve efficiency in your program. Explain trade-offs.
Efficiency and performance strategies can be classified into 5 different areas.
· CPU time
· Data Storage
· Elapsed time
CPU time and elapsed time are the baseline measurements.
26) What other SAS products have you used and consider yourself proficient in using?
Data _NULL_ statement, Proc Means, Proc Report, Proc tabulate, Proc freq and Proc print, Proc Univariate etc.
27) What is the significance of the ‘OF’ in X=SUM (OF a1-a4, a6, a9);
If we don't use the OF keyword, the expression may not be interpreted as we expect. Without OF, SUM(a1-a4, a6, a9) calculates a1 minus a4, plus a6 and a9, and not the whole sum of a1 to a4 plus a6 and a9. The same is true for the MEAN function.
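A small sketch makes the difference concrete. The values assigned to a1 to a4, a6 and a9 are arbitrary and chosen only for this example.

```sas
data _null_;
  a1 = 1; a2 = 2; a3 = 3; a4 = 4; a6 = 6; a9 = 9;

  with_of    = sum(of a1-a4, a6, a9);   /* variable list: 1+2+3+4+6+9 = 25 */
  without_of = sum(a1-a4, a6, a9);      /* subtraction:   (1-4)+6+9   = 12 */

  put with_of= without_of=;
run;
```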
28) What do the PUT and INPUT functions do?
INPUT function converts character data values to numeric values.
PUT function converts numeric values to character values.
Example for INPUT: INPUT(source, informat)
Example for PUT: PUT(source, format)
Note that the INPUT function requires an informat and the PUT function requires a format.
If we omit the INPUT or the PUT function during data conversion, SAS detects the mismatched variable types and attempts an automatic character-to-numeric or numeric-to-character conversion. This does not always work; for example, a character value that contains a $ sign cannot be converted automatically. It is therefore advisable to use the INPUT and PUT functions explicitly whenever a conversion occurs.
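A minimal sketch of explicit conversion in both directions; the variable names and the choice of the COMMA5. informat and Z5. format are assumptions made for the example.

```sas
data _null_;
  char_amt = '1,234';
  num_amt  = input(char_amt, comma5.);   /* character to numeric: 1234 */

  num_id  = 42;
  char_id = put(num_id, z5.);            /* numeric to character: '00042' */

  put num_amt= char_id=;
run;
```

The comma in char_amt would defeat an automatic conversion, which is why an informat that understands it is supplied explicitly.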
29) Which date function advances a date, time or datetime value by a given interval?
INTNX: INTNX function advances a date, time, or datetime value by a given interval, and returns a date, time, or datetime value.
INTCK: INTCK(interval, start-of-period, end-of-period) is an interval function that counts the number of intervals between two given SAS dates, times and/or datetimes.
DATETIME () returns the current date and time of day.
DATDIF (sdate,edate,basis): returns the number of days between two dates.
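The following sketch exercises INTNX, INTCK and DATDIF on arbitrary dates picked for illustration. INTNX is called without its optional alignment argument, so the result is aligned to the beginning of the interval.

```sas
data _null_;
  start = '15JAN2020'd;

  next_month = intnx('month', start, 1);              /* start of the next month: 01FEB2020 */
  months     = intck('month', start, '01MAR2020'd);   /* month boundaries crossed: 2 */
  days       = datdif(start, '01MAR2020'd, 'act/act'); /* actual days between the dates: 46 */

  put next_month= date9. months= days=;
run;
```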
30) What do the MOD and INT function do? What do the PAD and DIM functions do?
MOD: MOD(value, modulo) returns the remainder after the numeric value is divided by the modulo, which is a constant or numeric variable.
INT: returns the integer portion of a numeric value, truncating the decimal portion.
PAD: not a function but an option of the INFILE statement. It pads each record with blanks so that all data lines have the same length. It is useful only when missing data occurs at the end of the record.
DIM: returns the number of elements in an array.
CATX: concatenate character strings, removes leading and trailing blanks and inserts separators.
SCAN: returns a specified word from a character value. If the target variable has no previously defined length, SCAN gives it a length of 200.
SUBSTR: extracts a substring or replaces character values.
Extraction of a substring: Middleinitial=substr(middlename,1,1);
Replacing character values: substr(phone,1,3)='433';
If the SUBSTR function is on the left side of an assignment statement, it replaces the contents of the character variable.
TRIM: trims the trailing blanks from character values.
SCAN vs. SUBSTR:
SCAN extracts words within a value that is marked by delimiters.
SUBSTR extracts a portion of the value by stating the specific location. It is best used when we know the exact position of the sub string to extract from a character value.
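The SCAN and SUBSTR comparison can be sketched with a made-up name string; the variable names and the sample value are illustrative only.

```sas
data _null_;
  fullname = 'Smith, John Q';

  last    = scan(fullname, 1, ',');     /* first word before the comma: Smith */
  first   = scan(fullname, 2, ', ');    /* second word, comma or blank as delimiter: John */
  initial = substr(fullname, 13, 1);    /* one character at position 13: Q */

  put last= first= initial=;
run;
```

SCAN needs only the delimiters, while SUBSTR needs the exact position, which is why SUBSTR suits fixed-layout values.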
31) How might you use MOD and INT on numerics to mimic SUBSTR on character strings?
The first argument to the MOD function is a numeric, the second is a non-zero numeric; the result is the remainder when argument-1 is divided by argument-2. The INT function takes only one argument and returns the integer portion of that argument, truncating the decimal portion. Note that the argument can be an expression.
DATA NEW ;
A = 123456 ;
X = INT( A/1000 ) ;            /* first three digits: 123 */
Y = MOD( A, 1000 ) ;           /* last three digits: 456 */
Z = MOD( INT( A/100 ), 100 ) ; /* third and fourth digits: 34 */
PUT A= X= Y= Z= ;
RUN ;
32) How do you create an external data set with SAS code?
One idea is a FILENAME statement along these lines:
filename fileref <device-type>
As written, this does not work. Host options such as BLK, DISP, UNIT and SIZE would also need to be added.
33) What is the use of PROC SQL?
PROC SQL is a powerful tool in SAS that combines the functionality of DATA and PROC steps. PROC SQL can sort, summarize, subset, join (merge) and concatenate data sets, create new variables, and print the results or create a new data set, all in one step. It can use fewer resources than the equivalent DATA and PROC steps. To join tables, PROC SQL does not require the data to be sorted first, which is a must for a DATA step MERGE.
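A hedged sketch of a join on unsorted inputs; the two small data sets (work.emp, work.pay) and their contents are invented for the example.

```sas
/* Hypothetical inputs, deliberately not sorted by ID */
data work.emp;
  input id name $;
  datalines;
3 Carol
1 Alice
2 Bob
;
run;

data work.pay;
  input id salary;
  datalines;
2 52000
3 61000
1 48000
;
run;

/* Join, select columns and order the result in a single step, with no prior PROC SORT */
proc sql;
  create table work.emp_pay as
  select e.id, e.name, p.salary
  from work.emp as e
       inner join work.pay as p
       on e.id = p.id
  order by e.id;
quit;
```

A DATA step MERGE with a BY statement on the same inputs would need both data sets sorted by id first.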
34) How can you create a zero-observation data set?
Create the data set by using the LIKE clause in PROC SQL. For example:
proc sql;
create table latha.emp like oracle.emp;
quit;
Here the LIKE clause copies the structure of the existing table to the new table. This method results in the creation of an empty table.
35. What are input dataset and output dataset options?
Input data set options are OBS=, FIRSTOBS=, WHERE= and IN=; output data set options are COMPRESS= and REUSE=. Options that apply to both input and output data sets include KEEP=, DROP= and RENAME=.
36. What other SAS features do you use for error trapping and data validation?
What are the validation tools in SAS?
For DATA steps: data dataset-name / debug; and data dataset-name / stmtchk;
For macros: options mprint mlogic symbolgen;
37. How can you put a “trace” in your program?
Use ODS TRACE ON and ODS TRACE OFF. While tracing is on, SAS writes a trace record to the log for each output object.
38. What is Enterprise Guide? What is the use of it?
Enterprise Guide is a point-and-click Windows client for SAS. Among other things, it offers a way to import text files into SAS. (It comes free with Base SAS version 9.0.)
39. What is the main difference between rename and label?
1. Label is global and rename is local, i.e. the LABEL statement can be used in either a PROC step or a DATA step, whereas the RENAME statement should be used only in a DATA step.
2. If we rename a variable, the old name is lost, but if we label a variable, its short name (the old name) still exists along with its descriptive label.
40. What is the difference between a compiler and an interpreter?
Give any one example (software product) that acts as an interpreter.
The two serve a similar purpose but differ in how they achieve it. An interpreter translates instructions one at a time and executes each one immediately. A compiler takes a whole program (source) written in a language such as the SAS programming language and translates it into object code or machine language. Compiled code does the work much more efficiently, because the compiler produces a complete machine language program, which can then be executed.