Advanced SAS Interview Questions and Answers

Last updated on 13th Oct 2020, Blog, Interview Question

About author

Kannan (Senior Technical Manager )

He has more than 6 years of experience in his technical domain and has worked as a technical trainer for the past 5 years. He shares these articles with readers.

Want to shift your career to Advanced SAS? This page collects interview questions and answers at every level of expertise to help you clear an interview on the first attempt. If you are good at SAS concepts, many leading companies offer job roles such as Base SAS and Advanced SAS, Clinical SAS Programmer, Analytics with SAS, SAS Analyst, Data Analyst (R/SAS), Sr Analyst Marketing SAS, Advanced Analytics – SAS, AVP – Analytics & Reporting, and Software Developer – SAS Base. For further details about Advanced SAS jobs and Advanced SAS interview questions and answers, visit Wisdomjobs.com.

1. Explain what SAS informats are.

Ans:

  • SAS informats are used to read, or input, data from external files, also known as flat files (ASCII files, text files, or sequential files).
  • An informat tells SAS how to read data into SAS variables.
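
As a minimal sketch of informats at work, the step below reads a date and a number containing a comma from inline data. The data set name, variable names, and values are made up for illustration; MMDDYY10. and COMMA10. are standard SAS informats.

```sas
/* Illustrative data: the data set and variable names are assumptions */
data sales;
    /* The colon modifier applies each informat while still stopping at the next blank */
    input sale_date : mmddyy10. amount : comma10.;
    /* A format controls display only; the informats above control reading */
    format sale_date date9.;
    datalines;
01/15/2020 1,250
02/03/2020 12,400
;
run;

proc print data=sales;
run;
```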

2. What is the difference between %LOCAL and %GLOBAL?

Ans:

A macro variable declared with %LOCAL exists only within the macro in which it is declared, whereas a macro variable declared with %GLOBAL remains available until the end of the SAS session.
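
The difference in scope can be sketched as follows. The macro name scope_demo and the variable names inner and outer are hypothetical, and the sketch assumes no global variable named inner already exists in the session.

```sas
%macro scope_demo;
    %local inner;    /* exists only while scope_demo is executing */
    %global outer;   /* exists until the SAS session ends */
    %let inner = local value;
    %let outer = global value;
    %put Inside the macro: &inner;
%mend scope_demo;

%scope_demo

%put Outside the macro: &outer;
/* %SYMEXIST returns 1 if the macro variable exists and 0 otherwise */
%put Does inner still exist? %symexist(inner);
```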

3. What are SYMGET and SYMPUT?

Ans:

SYMPUT puts a value from a DATA step into a macro variable, whereas SYMGET gets the value of a macro variable and returns it to the DATA step.
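
A small sketch of the round trip, using a hypothetical macro variable named cutoff: the first step stores a value with CALL SYMPUT, and the second retrieves it with SYMGET while the DATA step is executing.

```sas
/* Store a value in the macro variable CUTOFF (name is illustrative) */
data _null_;
    call symput('cutoff', '13');
run;

/* Retrieve it at execution time and use it to subset the data */
data older;
    set sashelp.class;
    limit = input(symget('cutoff'), best12.);  /* SYMGET returns character text */
    if age > limit;
run;
```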

4. What system options would you use to help debug a macro?

Ans:

  1. SAS offers a number of system options that help debug macro problems.
  2. The results of these options are written to the SAS log.
  3. Commonly used macro debugging options include MPRINT (shows the SAS code generated by a macro), MLOGIC (traces macro execution), and SYMBOLGEN (shows how macro variables resolve).
  4. MEMRPT specifies that memory usage statistics be displayed in the SAS log.
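
As an illustration, the sketch below switches on three widely used macro debugging options (MPRINT, MLOGIC, and SYMBOLGEN), runs a small macro, and switches them off again. The macro show_means is a made-up example.

```sas
/* Turn on macro tracing in the log */
options mprint mlogic symbolgen;

%macro show_means(dsn);
    proc means data=&dsn;
    run;
%mend show_means;

%show_means(sashelp.class)

/* Turn the options off again to keep later logs short */
options nomprint nomlogic nosymbolgen;
```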

5. What are automatic variables for macro?

Ans:

Every time SAS is invoked, the macro processor automatically creates certain macro variables.

Examples: &sysdate, &sysday.

6. What is call symput?

Ans:

CALL SYMPUT takes a value from a DATA step and assigns it to a macro variable, which can then be used in later steps.

To assign a value to a single macro variable, use CALL SYMPUT with this general form:

  • CALL SYMPUT("macro-variable-name", value);

7. How are parameters passed to a macro?

Ans:

  • A macro variable defined in parentheses in a %MACRO statement is a macro parameter.
  • Macro parameters allow you to pass information into a macro.
  • Here is a simple example that uses keyword parameters, which must be passed by name when the macro is called.
  • %macro plot(yvar=, xvar=);
  • proc plot;
  • plot &yvar*&xvar;
  • run;
  • %mend plot;
  • %plot(yvar=age, xvar=sex)

8. For what purposes have you used SAS macros?

Ans:

  1. To execute the same PROC step on multiple data sets.
  2. To accomplish repetitive tasks quickly and efficiently, since a macro program can be reused many times.
  3. To customize results through parameters passed to the macro program, without changing the code inside it.
  4. To make a small change in one place and have SAS echo that change throughout the program.

9. How would you define the end of a macro?

Ans:

The end of a macro is defined by the %MEND statement.

10. How would you identify a macro variable?

Ans:

A macro variable reference is identified by a leading ampersand (&) sign.

11. What is the maximum length of the macro variable?

Ans:

A macro variable name can be up to 32 characters long. The value it holds can be much longer, up to 65,534 characters.

12. What data types does SAS contain?

Ans:

The data types in SAS are Numeric and Character.

13. Explain what PROC GLM does.

Ans:

PROC GLM performs simple and multiple regression, analysis of variance (ANOVA), analysis of covariance, multivariate analysis of variance, and repeated measures analysis of variance.
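
For instance, an analysis of covariance could be sketched as below. This assumes SAS/STAT is licensed; the choice of sashelp.class and of weight, height, and sex as model variables is purely illustrative.

```sas
/* Analysis of covariance: weight modeled by a continuous and a class variable */
proc glm data=sashelp.class;
    class sex;                   /* categorical predictor */
    model weight = height sex;   /* continuous covariate plus class effect */
run;
quit;                            /* PROC GLM is interactive, so end it explicitly */
```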

14. Explain the use of PROC GPLOT.

Ans:

Compared with PROC PLOT, PROC GPLOT has more options and can create more colorful and fancier graphics.

15. What is the difference between nodupkey and nodup options?

Ans:

The difference between NODUP and NODUPKEY is that NODUP compares all the variables in the data set, while NODUPKEY compares just the BY variables.

16. Two ways to select every second row in a data set?

Ans:

  • data example;
  • set sashelp.class;
  • if mod(_n_,2) eq 0;
  • run;
  1. The MOD function returns the remainder from dividing the first argument by the second. _N_ holds the number of the current DATA step iteration, which here corresponds to the row number.
  2. For the second row, mod(2,2) returns a remainder of zero, so the row is kept.

  • data example1;
  • do i = 2 to nobs by 2;
  • set sashelp.class point=i nobs=nobs;
  • output;
  • end;
  • stop;
  • run;

17. How to select every second row of a group?

Ans:

Suppose we have the table sashelp.class and want the second row within each value of the variable 'sex'. The sorted copy is written to a new data set because the SASHELP library is normally read-only.

  • proc sort data = sashelp.class out = class_sorted;
  • by sex;
  • run;
  • data example2 (drop = N);
  • set class_sorted;
  • by sex;
  • if first.sex then N = 1;
  • else N + 1;
  • if N = 2 then output;
  • run;

18. How to calculate cumulative sum by group?

Ans:

Create Sample Data

  • data abcd;
  • input x y;
  • cards;
  • 1 25
  • 1 28
  • 1 27
  • 2 23
  • 2 35
  • 2 34
  • 3 25
  • 3 29
  • ;
  • run; 

Cumulative Sum by X

  • data example3;
  • set abcd;
  • by x;
  • if first.x then z1 = y;
  • else z1 + y;
  • run;

19. Can both WHERE and IF statements be used for subsetting on a newly derived variable?

Ans:

WHERE vs. IF

  1. No. Only the IF statement can be used for subsetting on a newly derived variable.
  2. A WHERE statement would return an error saying that the newly derived variable is not on file, because WHERE is applied to the input data set before the variable is created.
  3. Note, however, that the WHERE= data set option on the output data set can be used for subsetting on a newly created variable.

  • data example4 (where=(z <= 50));
  • set abcd;
  • z = x*y;
  • run;

20. Select the Second Highest Score with PROC SQL?

Ans:

  • data example5;
  • input Name $ Score;
  • cards;
  • sam 75
  • dave 84
  • sachin 92
  • ram 91
  • ;
  • run;
  • proc sql;
  • select * from example5
  • where score in (select max(score) from example5 where score not in (select max(score) from example5));
  • quit;

21. Two ways to create a macro variable that counts the number of observations in a dataset

Ans:

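  • /* Method 1: read the NOBS= value in a DATA step without reading any rows */
  • data _NULL_;
  • if 0 then set sashelp.class nobs=n;
  • call symputx('totalrows',n);
  • stop;
  • run;
  • %put nobs=&totalrows.;
  • /* Method 2: count the rows with PROC SQL */
  • proc sql;
  • select count(*) into :nrows from sashelp.class;
  • quit;
  • %put nobs=%left(&nrows.);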

22. Suppose you have data for employees comprising each employee's name, ID, and manager ID. You need to find the manager's name for each employee ID.

Ans:

SQL: Self Join

Create Sample Data

  • data example2;
  • input Name $ ID ManagerID;
  • cards;
  • Smith 123 456
  • Robert 456  .
  • William 222 456
  • Daniel 777 222
  • Cook 383 222
  • ;
  • run;

SQL Self Join

  • proc sql;
  • create table want as
  • select a.*, b.Name as Manager
  • from example2 as a left join example2 as b
  • on a.managerid = b.id;
  • quit;

Data Step: Self Join

  • proc sort data=example2 out=x;
  • by ManagerID;
  • run;
  • /* Keep only the manager's name and ID so the merge does not overwrite the employee's ID */
  • proc sort data=example2 out=y (keep=Name ID rename=(Name=Manager ID=ManagerID));
  • by ID;
  • run;
  • data want;
  • merge x (in=a) y (in=b);
  • by managerid;
  • if a;
  • run;

Create a macro variable and store TomDick&Harry

Issue: When the value is assigned to the macro variable, the ampersand after TomDick may cause SAS to interpret &Harry as a macro variable reference, and a warning message would be issued.

  • %let x = %NRSTR(TomDick&Harry);
  • %PUT &x.;

%NRSTR is a macro quoting function that hides the normal meaning of special tokens and of comparison and logical operators so that they appear as constant text. It also masks the macro triggers (% and &).

23. Difference between %STR and %NRSTR

Ans:

Both %STR and %NRSTR functions are macro quoting functions which are used to hide the normal meaning of special tokens and other comparison and logical operators so that they appear as constant text.

The only difference is %NRSTR can mask the macro triggers ( %, &) whereas %STR cannot.
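
A short sketch of the two functions side by side. The macro variable names step and firm and their values are made up for illustration.

```sas
/* %STR masks the semicolons so they do not end the %LET statement */
%let step = %str(proc print data=sashelp.class; run;);

/* %NRSTR additionally masks the ampersand so &T is not treated as a reference */
%let firm = %nrstr(AT&T);

%put &step;
%put &firm;
```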

24. How to pass unmatched single or double quotations text in a macro variable?

Ans:

  • %let eg  = %str(%'x);
  • %let eg2 = %str(x%");
  • %put &eg;
  • %put &eg2;

If the argument to %STR or %NRSTR contains an unmatched single or double quotation mark, or an unmatched open or close parenthesis, precede each of these characters with a % sign.

25. How can we use COUNTW function in a macro?

Ans:

  • %let cntvar = %sysfunc(countw(&nvar));

Several useful Base SAS functions are not directly available in the macro language; %SYSFUNC makes them usable in macro code.

The following assignments set up the macro variables referenced in question 67:

  • %let x=temp;
  • %let n=3;
  • %let x3=result;
  • %let temp3 = result2;

26. How to reference a macro variable in selection criteria?

Ans:

Use double quotes when referencing a macro variable in selection criteria. Macro variables do not resolve inside single quotes.
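
A minimal sketch, using a hypothetical macro variable named target:

```sas
%let target = M;

/* Double quotes: &target resolves, so this selects rows where sex is M */
proc print data=sashelp.class;
    where sex = "&target";
run;

/* Single quotes: the literal text &target is compared, so no rows match */
proc print data=sashelp.class;
    where sex = '&target';
run;
```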

27. How to debug %IF %THEN statements in a macro code

Ans:

The MLOGIC option writes to the log whether each %IF condition evaluated to TRUE or FALSE each time the macro runs.
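
As a sketch, the hypothetical macro check below contains one %IF condition. With MLOGIC on, the log should report whether that condition was TRUE or FALSE for the call.

```sas
options mlogic;

%macro check(n);
    %if &n > 5 %then %put &n is greater than 5;
    %else %put &n is not greater than 5;
%mend check;

%check(7)

options nomlogic;
```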

28. Difference between %EVAL and %SYSEVALF functions

Ans:

Both %EVAL and %SYSEVALF perform mathematical and logical operations with macro variables. However, %EVAL handles only integer arithmetic, so the following statement returns an error because the operands are floating-point values:

  • %let last = %eval(4.5+3.2);

This is where the %SYSEVALF function comes into the picture:

  • %let last2 = %sysevalf(4.5+3.2);
  • %put &last2;

29. What would be the value of i after the code below completes

Ans:

  • data test;
  • set temp;
  • array nvars {3} x1-x3;
  • do i = 1 to 3;
  • if nvars{i} > 3 then nvars{i} =.;
  • end;
  • run;

The answer is 4. The first time the loop executes, the value of i is 1; the second time, 2; and the third time, 3. At the beginning of the fourth iteration, i is incremented to 4, which is greater than the stop value of 3, so the loop stops. The value of i is therefore 4 after the loop, not 3.

30. How to compare two tables with PROC SQL

Ans:

The EXCEPT operator returns rows from the first query that are not part of the second query.

  • proc sql;
  • select * from newfile
  • except
  • select * from oldfile;
  • quit;

31. Selecting Random Samples with PROC SQL

Ans:

The RANUNI function and the OUTOBS= option can be used to select N random rows. RANUNI generates the random numbers used for ordering, and OUTOBS= limits the number of rows written to the output table.

  • proc sql outobs = 10;
  • create table tt as
  • select * from sashelp.class
  • order by ranuni(1234);
  • Quit;

32. How to use NODUPKEY kind of operation with PROC SQL

Ans:

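In PROC SORT, the NODUPKEY option removes duplicates based on the BY variables. In SQL, a similar result can be obtained as follows. Note that MONOTONIC() is an undocumented function, so its behavior is not guaranteed.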

  • proc sql noprint;
  • create table tt (drop = row_num) as
  • select *, monotonic() as row_num
  • from readin
  • group by name
  • having row_num = min(row_num)
  • order by ID;
  • quit;

33. How to make SAS stop macro processing on Error

Ans:

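One common approach is to check the automatic macro variable &SYSERR (or &SYSCC) after each step and to leave the macro with %RETURN or %ABORT when it indicates an error.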
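
One possible way to do this, sketched under assumptions: the macro name run_steps and its two steps are hypothetical, and the check relies on the automatic macro variable SYSERR, where values above 4 generally indicate that the preceding step ended with an error.

```sas
%macro run_steps;
    data step1;
        set sashelp.class;
    run;

    /* SYSERR reflects the status of the step that just finished */
    %if &syserr > 4 %then %do;
        %put ERROR: step1 ended with SYSERR=&syserr..;
        %return;   /* leave the macro without running later steps */
    %end;

    proc means data=step1;
    run;
%mend run_steps;

%run_steps
```

%RETURN only ends the current macro. %ABORT is a stronger alternative when the rest of the submitted program should stop as well.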

34. Count the number of variables listed in a macro variable

Ans:

  • %macro nvars (ivars);
  • %let n=%sysfunc(countw(&ivars));
  • %put &n;
  • %mend;
  • %nvars (X1 X2 X3 X4);

35. Write a SAS Macro to extract Variable Names from a Dataset

Ans:

  • *Selecting all the variables;
  • proc sql noprint;
  • select name into :vars separated by " "
  • from dictionary.columns
  • where LIBNAME = upcase("work")
  • and MEMNAME = upcase("predata");
  • quit;

DICTIONARY.COLUMNS contains information such as name, type, length, and format about all columns in the table. LIBNAME is the library name and MEMNAME is the data set name.

  • %put variables = &vars.;

36. How would DATA step MERGE and PROC SQL JOIN work on data sets that have a many-to-many relationship?

Ans:

Many-to-Many Merge

The DATA step does not handle many-to-many matching very well.

When we perform a many-to-many merge, the result should be a Cartesian (cross) product.

For example, if three records from one contributing data set match two records from the other, the resulting data set should have 3 × 2 = 6 records. A DATA step MERGE instead pairs the records in sequence within each BY group, so it would return only three records here.

PROC SQL, by contrast, creates the Cartesian product in the case of a many-to-many relationship.
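
The 3 × 2 case can be reproduced with a small sketch. The data sets one and two and their values are invented for illustration.

```sas
data one;
    input id a $;
    datalines;
1 a1
1 a2
1 a3
;
run;

data two;
    input id b $;
    datalines;
1 b1
1 b2
;
run;

/* DATA step MERGE pairs rows in sequence within the BY group */
data merged;
    merge one two;
    by id;
run;

/* PROC SQL pairs every matching row of ONE with every matching row of TWO */
proc sql;
    create table joined as
    select one.id, one.a, two.b
    from one inner join two
    on one.id = two.id;
quit;
```

With this input, MERGED should contain three rows and JOINED six.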

37. Have you created CRTs? If so, what did you do?

Ans:

Yes, I have created patient profiles at the request of my manager and the statistician.

I used PROC CONTENTS and PROC SQL to create a simple patient listing with all the information about a specific patient, including age, gender, race, etc.

38. Store value in each row of a variable into macro variables

Ans:

  • data _null_;
  • set sashelp.class;
  • call symput(cats('x',_n_),Name);
  • run;
  • %put &x1. &x2. &x3.;

The CATS function concatenates 'x' with _N_ (the row index number) and removes leading and trailing spaces from the result.

39. List the functions performed by SAS.

Ans:

SAS (Statistical Analysis System) has its own importance in every business domain.

Listed below are some of the functions performed by SAS.

  1. Data Management and Project Management
  2. Data Warehousing
  3. Operational Research and Decision Support
  4. Information Retrieval and Quality Management
  5. Business Planning
  6. Statistical Analysis

40. What are the 3 components in SAS programming?

Ans:

The 3 components in SAS programming are:

  1. Statements
  2. Variables
  3. Dataset

41. List the syntax rules followed in SAS statements.

Ans:

A SAS program is written in the Editor window. It consists of a series of statements that must follow the proper syntax for SAS to understand them.

Some of the syntax rules for SAS statements are as follows.

  • The end of any statement is marked by a semicolon (;).
  • A semicolon also separates multiple statements that appear on a single line.
  • SAS statements are not case sensitive, and extra spacing before statements is ignored.
  • Comments can be included in a SAS program in two different ways:
  • A statement beginning with an asterisk (*) and ending with a semicolon (;).
  • Text beginning with a forward slash and an asterisk (/*) and ending with an asterisk and a forward slash (*/).

42. What are the data types that SAS contains?

Ans:

'Numeric' and 'Character' are the two data types that SAS contains.

43. What is the PDV and what are its functions?

Ans:

The Program Data Vector (PDV) is a logical concept, defined as the area of memory where SAS builds a data set.

Functions of the PDV are as follows.

  1. SAS builds the data set in the PDV one observation at a time.
  2. The input buffer for holding the data from an external file is created at compilation time.
  3. The PDV contains two automatic variables: _N_ (counts the number of times the DATA step has iterated) and _ERROR_ (signals that an error occurred during execution).
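
One way to look at the PDV is to write its contents to the log on each iteration, as in this sketch. The derived variable ratio is made up for illustration.

```sas
data _null_;
    set sashelp.class (obs=3);
    ratio = weight / height;
    /* PUT _ALL_ writes every variable in the PDV, including _N_ and _ERROR_ */
    put _all_;
run;
```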

44. What do you know about the SAS data set?

Ans:

A SAS data set is the data that is available for analysis within a SAS program. A SAS data set is also referred to as a SAS data table.

A SAS data table consists of two parts.

  1. Columns of variables
  2. Rows of observations

Useful information about SAS data sets can be summarized as follows.

  1. SAS can read data from external sources such as Excel and Access, and it also provides built-in data sets.
  2. A data set that is used only in the current session and discarded when the session ends is known as a temporary data set.
  3. A data set that is stored for use in future sessions is known as a permanent data set.
  4. The built-in data sets can be accessed through the path Libraries -> My Libraries -> SASHELP.

45. Explain why double trailing @@ is used in Input Statements?

Ans:

Within a DATA step, a double trailing @@ in an INPUT statement tells SAS to hold the current record across iterations, so that the next INPUT statement continues reading from the same record rather than moving to a new one.
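
A brief sketch with invented data: a single input line holds four name and score pairs, and @@ lets each DATA step iteration pick up where the previous one stopped.

```sas
data scores;
    /* @@ holds the line so the next iteration reads the next pair from it */
    input name $ score @@;
    datalines;
sam 75 dave 84 sachin 92 ram 91
;
run;

proc print data=scores;
run;
```

The resulting data set should contain four observations, one per pair.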

46. Explain the difference between NODUP and NODUPKEY options?

Ans:

For removing duplicate values from a table, PROC SORT offers two options. The difference between them is as follows.

  • NODUPKEY: compares just the BY variables and removes observations that have duplicate values of the variables listed in the BY statement.
  • NODUP: compares all the variables present in the data set and removes observations that are duplicated in full.

47. Which command is used to perform sorting in the SAS program?

Ans:

  • PROC SORT

This procedure sorts a data set on a single variable or on multiple variables. When the OUT= option is specified, a new data set is created as the result of sorting and the original data set remains unchanged; without OUT=, the sorted data replaces the original data set.

Sorting can be done in both ascending as well as descending order.

48. Differentiate INPUT and INFILE.

Ans:

An INFILE statement identifies the external file that contains the data, whereas an INPUT statement describes the variables to be read from that file.
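
The two statements can be seen together in the sketch below. To stay self-contained it first writes a small comma-delimited file to a temporary fileref; the fileref name demo and the data values are assumptions.

```sas
/* Create a small external file to read (temporary fileref, illustrative data) */
filename demo temp;
data _null_;
    file demo;
    put "sam,75";
    put "dave,84";
run;

data scores;
    infile demo dlm=",";   /* INFILE: which file to read and how it is delimited */
    input name $ score;    /* INPUT: which variables to create from each record */
run;
```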

49. Explain the use of PROC print and PROC contents?

Ans:

The PROC step of a SAS program invokes built-in procedures for analyzing the data in a data set.

PROC PRINT – Lists the observations in the data set, which helps confirm that the data were read correctly.

PROC CONTENTS – Displays descriptive information about the SAS data set, such as its variables and their attributes.

50. Explain DATA _NULL_.

Ans:

As the name suggests, DATA _NULL_ is a DATA step that does not create any data set.

It is used for:

  1. Creating macro variables.
  2. Writing output without creating a data set.
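
Both uses appear in this sketch, which writes values to the log and stores a row count in a hypothetical macro variable named n_students.

```sas
data _null_;
    set sashelp.class end=last;
    put name= age=;                                /* write to the log, no data set */
    if last then call symputx('n_students', _n_);  /* create a macro variable */
run;

%put Number of students: &n_students;
```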

51. How is the character variable converted into a numeric variable and vice versa?

Ans:

In SAS programming, there are many tasks where a character value must be converted to numeric, or a numeric value must be converted to character.

PUT() is used to convert numeric to character, and INPUT() is used to convert character to numeric. With PUT(), the format must match the type of the source variable, so a numeric source requires a numeric format.
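
A minimal sketch of both directions, with made-up variable names and values:

```sas
data convert;
    char_amount = '123.45';
    num_amount  = input(char_amount, best12.);  /* character to numeric via an informat */

    num_year  = 2020;
    char_year = put(num_year, 4.);              /* numeric to character via a format */
run;

/* PROC CONTENTS shows the resulting variable types */
proc contents data=convert;
run;
```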

52. What is the purpose of _CHARACTER_ and _NUMERIC_?

Ans:

_CHARACTER_ refers to all the character variables that are currently defined in the data set.

Example: PROC MEANS analyzes only numeric variables, so to include all the character variables in a procedure, use one that accepts them, such as PROC PRINT.

  • PROC PRINT;
  • VAR _character_;
  • Run;

_NUMERIC_ refers to all the numeric variables that are currently defined.

Example: To include all the numeric variables in PROC MEANS, the following statements are used.

  • PROC MEANS;
  • VAR _numeric_;
  • Run;

53. What commands are used in the case of including or excluding any specific variables in the data set?

Ans:

  • The DROP and KEEP statements, and the DROP= and KEEP= data set options, are used for this purpose.
  • The variables to remove from the output data set are specified in the DROP statement.
  • The variables to retain in the output data set are specified in the KEEP statement.

54. Differentiate between PROC MEANS and PROC SUMMARY.

Ans:

The difference between PROC MEANS and PROC SUMMARY can be understood as follows.

  • PROC MEANS produces a printed report by default in the OUTPUT window, whereas PROC SUMMARY produces a printed report only when the PRINT option is included in the PROC statement.
  • PROC MEANS analyzes all the numeric variables by default, whereas PROC SUMMARY analyzes the variables listed in the VAR statement.

55. Explain the purpose of SUBSTR functions in SAS programming.

Ans:

In SAS programming, the SUBSTR function is used to extract a substring from a character variable.

Given a start position and a length, the function returns that portion of the character string.
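
For example, with an invented phone number:

```sas
data _null_;
    phone = '415-555-0123';
    /* Start at position 1 and take 3 characters */
    area = substr(phone, 1, 3);
    put area=;   /* writes area=415 to the log */
run;
```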

56. Name and describe a few SAS character functions that are used for data cleaning in brief.

Ans:

A few SAS character functions that are used for data cleaning are listed below.

  1. COMPRESS(char_string): removes blanks or specified characters from a given string.
  2. TRIM(str): removes trailing blanks from a given string.
  3. LOWCASE(char_string): converts all the characters in a given string to lowercase.
  4. UPCASE(char_string): converts all the characters in a given string to uppercase.
  5. COMPBL(str): converts multiple blanks to a single blank.
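
The functions above can be tried on a single messy value, as in this sketch. The variable names and the sample string are assumptions.

```sas
data _null_;
    raw = '  John    SMITH  ';
    no_blanks = compress(raw);          /* removes every blank */
    one_blank = compbl(raw);            /* collapses runs of blanks to one */
    lower     = lowcase(raw);
    upper     = upcase(raw);
    joined    = trim(raw) || '|';       /* TRIM drops trailing blanks before joining */
    put no_blanks= one_blank= lower= upper= joined=;
run;
```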

57. Differentiate between CEIL and FLOOR functions.

Ans:

  • CEIL function: returns the smallest integer that is greater than or equal to the argument. Example: CEIL(12.85) returns 13.
  • FLOOR function: returns the greatest integer that is less than or equal to the argument. Example: FLOOR(12.85) returns 12.
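
A quick sketch that also covers a negative argument, where the direction of rounding is easy to get wrong:

```sas
data _null_;
    up_pos   = ceil(12.85);    /* 13 */
    down_pos = floor(12.85);   /* 12 */
    up_neg   = ceil(-12.85);   /* -12 */
    down_neg = floor(-12.85);  /* -13 */
    put up_pos= down_pos= up_neg= down_neg=;
run;
```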

58. What are the ways in which Macro variables can be created in SAS programming?

Ans:

A number of different techniques can be used to create macro variables in SAS programming.

Listed below are the five most commonly used methods.

  1. %LET statement
  2. Macro parameters (named as well as positional)
  3. %DO statement (iterative)
  4. INTO in PROC SQL
  5. CALL SYMPUTX routine

59. Explain the purpose of the RETAIN statement.

Ans:

As the word 'RETAIN' suggests, the statement keeps a value once it has been assigned.

When the DATA step moves from the current iteration to the next, the RETAIN statement tells SAS to retain the values of the listed variables rather than reset them to missing.
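
A running total is a simple illustration. In this sketch the variable total_weight is made up; without the RETAIN statement it would be reset to missing at the start of every iteration and the sum would never accumulate.

```sas
data running_total;
    set sashelp.class;
    retain total_weight 0;                   /* keep the value across iterations */
    total_weight = total_weight + weight;
run;

proc print data=running_total;
    var name weight total_weight;
run;
```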

60. Which command is used to save logs in the external file?

Ans:

  • PROC PRINTTO

This procedure redirects the SAS log to an external file.
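
A minimal sketch of the redirection. It uses a temporary fileref named logfile so that it runs anywhere; in practice a quoted path to a permanent file would take its place.

```sas
/* Temporary fileref for illustration; a real path would normally be used */
filename logfile temp;

/* Send the log to the file; NEW replaces any existing contents */
proc printto log=logfile new;
run;

data _null_;
    put "This line is written to the redirected log";
run;

/* PROC PRINTTO with no options restores the default log destination */
proc printto;
run;
```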

61. Mention some common errors that are usually committed in SAS programming.

Ans:

Listed below are some of the common errors, which are especially likely when you are new to this programming language.

  • Missing a semicolon at the end of a statement, which is the most common mistake.
  • Not checking the log after submitting the program.
  • Commenting errors, such as failing to use comments where necessary or using comments inappropriately.
  • Not using proper debugging methods.

62. Mention SAS system options to debug SAS macros.

Ans:

To help track the macro code as well as the SAS code generated by macros, system options such as MPRINT, MLOGIC, and SYMBOLGEN can be used.

63. Differentiate between SAS functions and SAS procedures.

Ans:

The major difference can be understood from how each one handles its input.

A function takes argument values from across a single observation, for instance several variables in the same row. A procedure, by contrast, expects one variable value per observation and performs its calculation down the observations of the data set.

64. What do you know about SYMPUT and SYMGET?

Ans:

The major differences between the two are mentioned below.

SYMPUT is used for storing the value of a data set into the macro variable whereas SYMGET is used for retrieving the value from the macro variable to the data set.

65. Explain the special input delimiters used in SAS programming.

Ans:

The special input delimiters used in SAS programming are:

  1. DLM
  2. DSD

Both are options of the INFILE statement. DLM= specifies the delimiter character, and DSD treats delimiters that appear inside quotation marks as part of the data value rather than as separators.
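
The effect of DSD can be sketched with a small comma-separated file. The fileref demo, the variable names, and the data are assumptions; the file is written to a temporary location so the example is self-contained.

```sas
filename demo temp;
data _null_;
    file demo;
    put 'Smith,"Boston, MA",42';   /* comma inside quotes is part of the value */
    put 'Jones,,37';               /* two commas in a row mean a missing value */
run;

data people;
    infile demo dsd;               /* DSD uses a comma as the default delimiter */
    length name $10 city $20;
    input name $ city $ age;
run;

proc print data=people;
run;
```

For a file that uses another separator, DLM= can be added to the INFILE statement, for example dlm='|'.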

66. Which function is used to count the number of intervals between two SAS dates?

Ans:

The INTCK function counts the number of intervals between two given SAS dates.
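
A short sketch with two arbitrary dates. By default INTCK counts the interval boundaries crossed between the dates, so the month count reflects how many first-of-month boundaries fall between them rather than the elapsed time divided by a month length.

```sas
data _null_;
    start  = '15JAN2020'd;
    finish = '13OCT2020'd;
    months = intck('month', start, finish);  /* month boundaries crossed */
    days   = intck('day', start, finish);    /* day boundaries crossed */
    put months= days=;
run;
```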

67. What would %put &&x&n; and %put &&&x&n; return?

Ans:

Assume the macro variables x=temp, n=3, x3=result, and temp3=result2.

&&x&n: The two ampersands (&&) resolve to one ampersand (&), the scanner continues, and &n resolves to 3. The resulting &x3 then resolves to result.

&&&x&n: The first two ampersands (&&) resolve to &, then &x resolves to temp and &n resolves to 3. Finally, &temp3 resolves to result2.
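
The resolution can be checked directly with the assignments the explanation relies on:

```sas
%let x = temp;
%let n = 3;
%let x3 = result;
%let temp3 = result2;

%put &&x&n;    /* resolves to &x3 and then to result */
%put &&&x&n;   /* resolves to &temp3 and then to result2 */
```

Turning on the SYMBOLGEN option before the %PUT statements should show each resolution pass in the log.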
