Informatica Interview Questions and Answers
Last updated on 23rd Sep 2020, Blog, Interview Question
1. What is Informatica PowerCenter?
Ans:
Informatica PowerCenter is an ETL/data integration tool that has a wide range of applications. This tool allows users to connect and fetch data from different heterogeneous sources and subsequently process the same.
For example, users can connect to a SQL Server Database or an Oracle Database, or both, and also integrate the data from both these databases to a third system.
2. Mention some typical use cases of Informatica.
Ans:
There are many typical use cases of Informatica, but this tool is predominantly leveraged in the following scenarios:
- When organizations migrate from the existing legacy systems to new database systems
- When enterprises set up their data warehouse
- While integrating data from various heterogeneous systems including multiple databases and file-based systems
- For data cleansing
3. How can we filter rows in Informatica?
Ans:
There are two ways to filter rows in Informatica, they are as follows:
- Source Qualifier Transformation: It filters rows while reading data from a relational data source. It minimizes the number of rows while mapping to enhance performance. Also, Standard SQL is used by the filter condition for executing in the database.
- Filter Transformation: It filters rows within a mapped data from any source. It is added close to the source to filter out the unwanted data and maximize performance. It generates true or false values based on conditions.
4. Differentiate between Joiner and Lookup transformations.
Ans:
Joiner | Lookup |
---|---|
It is not possible to override the query | It is possible to override the query |
Only the ‘=’ operator is available | All operators are available for use |
Users cannot restrict the number of rows while reading relational tables | Users can restrict the number of rows while reading relational tables |
It is possible to join tables with Joins | It behaves as Left Outer Join while connecting with the database |
5. In Informatica Workflow Manager, how many repositories can be created?
Ans:
Depending upon the number of ports that are required, repositories can be created. In general, however, there can be any number of repositories.
6. What are the types of lookup transformation?
Ans:
There are four different types of lookup transformation:
- 1. Relational or flat file lookup: It performs lookup on relational tables.
- 2. Pipeline lookup: It performs lookup on application sources.
- 3. Connected or unconnected lookup: While the connected lookup transformation receives data from source, performs lookup, and returns the result to the pipeline, the unconnected lookup happens when the source is not connected. It returns one column to the calling transformation.
- 4. Cached or uncached lookup: Lookup transformation can be configured to cache lookup data, or we can directly query the lookup source every time a lookup is invoked.
7. How do pre- and post-session shell commands function?
Ans:
A command task can be called as a pre- or post-session shell command for a session task. Users can run it as a pre-session command, a post-session success command, or a post-session failure command. Based on use cases, the application of shell commands can be changed or altered.
8. What can we do to improve the performance of Informatica Aggregator Transformation?
Ans:
Aggregator performance improves dramatically if records are sorted before passing to the aggregator and if the ‘sorted input’ option under Aggregator Properties is checked. The record set should be sorted on those columns that are used in the Group By operation. It is often a good idea to sort the record set in the database level, e.g., inside a source qualifier transformation, unless there is a chance that the already sorted records from the source qualifier can again become unsorted before reaching the aggregator.
9. How can we update a record in the target table without using Update Strategy?
Ans:
A target table can be updated without using ‘Update Strategy.’ For this, we need to define the key in the target table in Informatica level, and then we need to connect the key and the field we want to update in the mapping target. In the session level, we should set the target property as ‘Update as Update’ and check the ‘Update’ check box.
Let us assume, we have a target table ‘Customer’ with fields as ‘Customer ID,’ ‘Customer Name,’ and ‘Customer Address.’ Suppose if we want to update ‘Customer Address’ without an Update Strategy, then we have to define ‘Customer ID’ as the primary key in Informatica level, and we will have to connect ‘Customer ID’ and ‘Customer Address’ fields in the mapping. If the session properties are set correctly as described above, then the mapping will only update the ‘Customer Address’ field for all matching customer IDs.
10. Why do we use mapping parameters and mapping variables?
Ans:
Basically, mapping parameters and mapping variables represent values in mappings and mapplets.
Mapping Parameters
- Mapping parameters represent constant values that are defined before running a session.
- After creation, parameters appear in the Expression Editor.
- These parameters can be used in source qualifier filters, in user-defined joins, or for overriding.
Mapping Variables
- As opposed to mapping parameters, mapping variables can change values during sessions.
- The last value of a mapping variable is saved to the repository at the end of each successful session by the Integration Service. However, it is possible to override saved values with parameter files.
- Basically, mapping variables are used to perform incremental reads of data sources.
Subscribe For Free Demo
Error: Contact form not found.
11. What are the features of complex mapping?
Ans:
The features of complex mapping are as follows:
- There are more numbers of transformations
- It uses complex business logic
12. How can we identify whether a mapping is correct or not without a connecting session?
Ans:
With the help of the debugging option, we can identify whether a mapping is correct or not without connecting sessions.
13. Can we use mapping parameters or variables, developed in one mapping, into any other reusable transformation?
Ans:
Yes, we can use mapping parameters or variables into any other reusable transformation because they don’t have any mapplet.
14. What is the use of the aggregator cache file?
Ans:
If extra memory is needed, the aggregator provides extra cache files for keeping the transformation values. It also keeps the transitional values that are there in the local buffer memory.
15. What is lookup transformation?
Ans:
The transformation that has an entrance right to RDBMS is known as lookup transformation.
16. What do you understand by the term ‘role-playing dimension’?
Ans:
The dimensions that are used for playing diversified roles while remaining in the same database domain are known as role-playing dimensions.
17. How can we access repository reports without SQL or other transformations?
Ans:
We can access repository reports by using a metadata reporter. There is no need of using SQL or other transformations as it is a web app.
18. Mention the types of metadata that are stored in the repository.
Ans:
The types of metadata, which is stored in the repository, are Target definition, Source definition, Mapplet, Mappings, and Transformations.
19. What is code page compatibility?
Ans:
When data is transferred from the source code page to the target code page then all the characteristics of the source page must be present in the target page to prevent data loss, this feature is called Code Page Compatibility.
Code page compatibility comes into picture when the INFORMATICA server is running in Unicode data movement mode. In this case, the two code pages are said to be identical when their encoded characters are virtually identical and thus results in no loss of data.
For complete accuracy, it is said that the source code page is the subset of the target code page.
20. How can we confirm all mappings in the repository simultaneously?
Ans:
At a time, we can validate only one mapping. Hence, mapping cannot be validated simultaneously.
21. Define Aggregator transformation.
Ans:
It is different from expression transformation in which we can do calculations in set, but in aggregator transformation we can do aggregate calculations such as averages, sum, etc.
22. What is Expression transformation?
Ans:
It is used for performing nonaggregated calculations. We can test conditional statements before the output results are moved to the target tables.
An expression transformation is a collective Powercenter mapping transformation.
23. Define Filter transformation.
Ans:
Filter transformation is a way of filtering rows in a mapping. It has all ports of input/output, and the row which matches with that condition can only pass by that filter.
24. Define Joiner transformation.
Ans:
It combines two associated mixed sources located in different locations, while a source qualifier transformation can combine data rising from a common source.
25. What do you mean by Lookup transformation?
Ans:
Lookup transformation is used for maintaining data in a relational table through mapping. We can use multiple lookup transformations in a mapping.
26. Define the command task.
Ans:
This task permits one or more shell commands in UNIX or DOS in Windows to run during the workflow.
27. Explain a standalone command task.
Ans:
This task can be used anywhere in the workflow to run the shell commands.
28. What is a predefined event?
Ans:
A predefined event is the file-watch event. It waits for a specific file to arrive at a specific location.
29. What is a user-defined event?
Ans:
User-defined events are a flow of tasks in the workflow. Events can be developed and then raised as per requirement. It allows users to create and name an event.
30. Define workflow.
Ans:
The group of directions that communicates with the server about how to implement tasks is known as workflow.
31. Mention the different tools used in Workflow Manager?
Ans:
The different tools used in Workflow Manager are:
- Task Developer
- Task Designer
- Workflow Designer
32. Name the other tools used for scheduling purposes other than Workflow Manager and pmcmd.
Ans:
‘CONTROL M’ is a third-party tool used for scheduling purposes.
33. Define OLAP (Online Analytical Processing).
Ans:
It is a process by which multi-dimensional analysis occurs.
34. From where can we find the throughput option in Informatica?
Ans:
In Workflow Monitor, we can find the throughput option. By right-clicking on session, then pressing on get run properties, and, under source/target statistics, we can find this option.
35. Define target load order.
Ans:
It is specified on the criteria of source qualifiers in a mapping. If there are many source qualifiers attached to various targets, then we can entitle an order in which Informatica loads data in targets.
36. Define Informatica.
Ans:
These process flow diagrams are known as mappings. Once a mapping is made, it can be scheduled to run as and when required. In the background, the Informatica server takes care of fetching data from the source, transforming it, and loading it to the target.
Informatica is a tool, supporting all the steps of the Extraction, Transformation, and Load (ETL) process. Nowadays, Informatica is also being used as an integration tool. Informatica is an easy-to-use tool. It has got a simple visual interface like forms in visual basic. You just need to drag and drop different objects (known as transformations) and design process flow for data extraction, transformation, and load.
37. What are the new features of Informatica 9.x Developer?
Ans:
From an Informatica Developer’s perspective, some of the new features in Informatica 9.x are as follows:
- 1. Lookup can be configured as an active transformation—it can return multiple rows on successful matches.
- 2. You can write SQL override in uncached lookups also. Previously, you could do it only on cached lookups.
- 3. You can control the size of the session log. In a real-time environment, you can control the session log file size or time.
- 4. Database deadlock resilience feature—this will ensure that the session does not immediately fail if it encounters any database deadlock; it will now retry the operation again. You can configure the number of retry attempts.
38. What are the advantages of Informatica?
Ans:
The following are the advantages of Informatica:
- It is a GUI tool; coding in any graphical tool is generally faster than hand code scripting.
- It can communicate with all major data sources (mainframe/RDBMS/Flat Files/XML/VSM/SAP, etc).
- It can handle very large data very effectively.
- The user can apply mappings, extract rules, cleansing rules, transformation rules, and aggregation logic and loading rules are in separate objects in an ETL tool. Any change in any of the objects will give a minimum impact on other objects.
- The object is reusable (Transformation Rules).
Informatica has different ‘adapters’ for extracting data from packaged ERP applications (such as SAP or PeopleSoft). - Resources are available in the market.
- It can be run on Windows and Unix environment
39. Which are the different editions of INFORMATICA PowerCenter that are available?
Ans:
Different editions of INFORMATICA PowerCenter are:
- Standard Edition
- Advanced Edition
- Premium Edition
The current version of PowerCenter available is v10 with a high-performance increase.
40. How can you differentiate between PowerCenter and Power Map?
Ans:
Given below are the differences between PowerCenter and Power Map.
INFORMATICA PowerCenter | INFORMATICA PowerMart |
---|---|
Processes high volume of data | Processes low volume of data |
Supports global and local repositories | Supports only local repositories |
Supports data retrieval from ERPSources like SAP, PeopleSoft etc. | Do not support data retrieval from ERP sources |
Converts local repositories to global | Do not convert local repositories to global |
41. What are the different components of PowerCenter?
Ans:
Given below are the 7 important components of PowerCenter:
- 1. PowerCenter Service
- 2. PowerCenter Clients
- 3. PowerCenter Repository
- 4. PowerCenter Domain
- 5. Repository Service
- 6. Integration Service
- 7. PowerCenter Administration Console
- 8. Web Service Hub
42. What are the different clients of PowerCenter?
Ans:
Here is the list of PowerCenter clients:
- PowerCenter designer
- PowerCenter workflow monitor
- PowerCenter workflow manager
- PowerCenter repository manager
43. How to elaborate Tracing Level?
Ans:
Tracing level can be defined as the amount of information that the server writes in the log file. Tracing level is created and configured either at the transformation level or at session-level else at both the levels.
Given below are the 4 types of tracing level:
- None
- Terse
- Verbose Initialization
- Verbose Data
44. What is Enterprise Data Warehousing?
Ans:
When a large amount of data is assembled at a single access point then it is called Enterprise Data Warehousing. This data can be reused and analyzed at regular intervals or as per the need of the time requirement.
Considered as the central database or say a single point of access, enterprise data warehousing provides a complete global view and thus helps in decision support.
It can be more understood from the following points which define its features:
- All important business information stored in this unified database can be accessed from anywhere across the organization.
- Although the time required is more, periodic analysis on this single source always produces better results.
- Security and integrity of data are never compromised while making it accessible across the organization.
45. What is the benefit of Session Partitioning?
Ans:
While integration service is running in the environment the workflow is partitioned for better performance. These partitions are then used to perform Extraction, Transformation, and Loading.
46. How can we create an Index after completion of the Load Process?
Ans:
Command tasks are used to create an Index. Command task scripts can be used in a session of the workflow to create an index.
47. What are Sessions?
Ans:
Session is a set of instructions that are used while moving data from the source to the destination. We can partition the session to implement several sequences of sessions to improve server performance.
After creating a session we can use the server manager or command-line program pmcmd to stop or start the session.
48. How can we use Batches?
Ans:
Batches are the collection of sessions that are used to migrate the data from the source to target on a server. Batches can have the largest number of sessions in it but they cause more network traffic whereas fewer sessions in a batch can be moved rapidly.
49. What is Rank Transformation?
Ans:
Rank transformation is Active as well as Connected. It is used to sort and rank a set of records either top or bottom. It is also used to select data with the largest or smallest numeric value based on a specific port.
50. What is Rank Index in Rank transformation?
Ans:
Rank Index is assigned by the task designer to each record. The rank index port is used to store ranking positions for each row. Rank transformation identifies each row from the top to bottom and then assigns Rank Index.
51. What is Status Code in INFORMATICA?
Ans:
Code provides an Error handling mechanism during each session. Status Code is issued by the stored procedure to recognize whether it is committed successfully or not and provides information to the INFORMATICA server to decide whether the session has to be stopped or continued.
52. What are Junk Dimensions?
Ans:
Junk dimension is a structure that consists of a group of some junk attributes such as random codes or flags. It forms a framework to store related codes with respect to a specific dimension at a single place instead of creating multiple tables for the same.
53. What is Mapplet in Informatica?
Ans:
Mapplet is a reusable object that contains a certain set of rules for transformation and transformation logic that can be used in multiple mappings. Mapplet is created in the Mapplet Designer in the designer tool.
54. What is Decode in Informatica?
Ans:
To understand Decode, let’s consider it as similar to the CASE statement in SQL. It is basically the function that is used by an expression transformation in order to search a specific value in a record.
There can be unlimited searches within the Decode function where a port is specified for returning result values. This function is usually used in cases where it is required to replace nested IF statements or to replace lookup values by searching in small tables with constant values.
55. What is Union Transformation in INFORMATICA?
Ans:
Union transformation is used to combine the data from different sources and frame it with the same port and data type. It is much like a clause in SQL.
56. What is Source Qualifier Transformation in INFORMATICA?
Ans:
Source Qualifier transformation is useful in Mapping, whenever we add relational flat files it is automatically created. It is an active and connected transformation that represents those rows which are read by integration service.
57. What is INFORMATICA Worklet?
Ans:
Worklet works as a Mapplet with the feature of reusability, the only difference is that we can apply the Worklet to any number of workflows in INFORMATICA, unlike Mapplet. Worklet saves the logic and tasks at a single place to reuse.
Worklet is much similar to the Mapplet and is defined as the group of tasks that can be either reusable or non-reusable at the workflow level. It can be added to as many workflows as required. With its reusability feature, much time is saved as reusable logic can be developed once and can be placed from where it can be reused.
In the INFORMATICA PowerCenter environment, Mapplets are considered as the most advantageous feature. They are created by Mapplet designers and are a part of the Designer tool.
It basically contains a set of transformations that are designed to be reused in multiple mapping.
Mapplets are said to be reusable objects which simplify mapping by:
- Including multiple transformations and source definitions.
- Not required to connect to all input and output ports.
- Accept data from sources and pass to multiple transformations
Well, overall when it is required to reuse the mapping logic then the logic should be placed in the Mapplet.
58. What is SUBSTR in INFORMATICA?
Ans:
SUBSTR is a function that extracts or removes a set of characters from a larger character set.
Syntax:
- SUBSTR( string, start [,length] )
Where,
string defines the character that we want to search.
start is an integer that is used to set the position where the counting should get started.
Length is an optional parameter that is used to count the length of a string to return from its starting position.
For Example, SUBSTR(Contact, 5, 8), where we start at the 5th character of our contact and return to the next 8 characters.
59. What is the scenario which compels an informatica server to reject files?
Ans:
This happens when it faces DD_Reject in update strategy transformation. Moreover it disrupts the database constraint filed in the rows that were condensed.
60. Tell us something about an Expression Transformation in Informatica and how it is achieved.
Ans:
An expression transformation is a common Powercenter mapping transformation which is typically used for transforming data passed, one record at a time. It is passive and connected.
Data can be manipulated, variables can be created, and output ports can be generated within an expression. Conditional statements can be written within output ports or variables can be created for transforming data according to requirements.
61. Why does a user need an ETL tool?
Ans:
ETL tools prevent the users from writing complex code for connecting multiple sources and handle errors. ETL has been developed to prevent any such complex code-writing.
62. Define the terms ‘Active Transformation’ and ‘Passive Transformation’.
Ans:
An Active Transformation is the one that performs the given functions:
- Changes the number of rows during transformation input and output, such as Filter transformation.
- Changes the transaction boundary by defining commit or rollback points, such as the transaction control transformation
- Changes the row type, such as Update Strategy is active as it flags the rows for insert, delete, update or reject
A Passive Transformation is that type of Transformation which does not change the number of rows that pass through it, such as an Expression transformation.
63. What is the effect on a mapping if the data types are modified between the Source and the concerned Source Qualifier?
Ans:
The Source Qualifier transformation is used to display the transformation data types. The transformation data types, on the other hand, are used to determine how the source database binds data when the Integration Service reads it. If the data types are explicitly modified by the user in the Source Qualifier transformation or if the datatypes in the source definition and the Source Qualifier transformation do not match, the Designer marks the mapping as invalid when the project is finally saved.
64. Consider a situation: All the Select Distinct and the number of sorted ports property in the SQ have been utilized. Here, Custom SQL Query is added. What changes will occur?
Ans:
Whenever the user adds the Custom SQL or the SQL override query, the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation are overridden. Thus, only the user-defined SQL Query is finally executed in the database and all other options are implicitly ignored.
65. Where can we use the Source Filter, Select Distinct and number of sorted ports properties of the Source Qualifier transformation?
Ans:
The Source Filter option is generally used for causing a reduction in the number of rows for the Integration Service queries in order to improve performance.
The Select Distinct option is used to use the Integration Service for selecting unique values from a source and for filtering unnecessary data which might have stayed earlier in the data flow in order to improve performance.
The Number Of Sorted Ports option is used to sort the data from the source so as to use it in transformations like Aggregator or Joiner, as sorted input shows an improvement in the performance.
66. What happens if the SELECT list COLUMNS in the Custom overrides the SQL Query and the OUTPUT PORTS order in the SQ transformation are dissimilar?
Ans:
Any form of mismatch or change in the order of the list of selected columns in accordance with the connected transformation output ports usually causes a session failure.
67. Tell us about some situations where the Joiner transformation is used instead of the Source Qualifier transformation.
Ans:
When the user needs to join the Source Data of heterogeneous sources and to join flat files, he/she generally uses the Joiner transformation. The Joiner transformation can be freely used when the following types of sources have to be joined:
- Join data from different Flat Files.
- Join relational sources and flat files.
- Join data from different Relational Databases.
68. Tell us about the maximum number that can be used within the number of sorted ports for the Sybase source system.
Ans:
The Sybase can easily support a maximum of 16 columns in an ORDER BY clause. Hence a source that is Sybase, more than 16 columns should not be sorted.
69. How can the user successfully implement the Aggregation operation without actually using an Aggregator Transformation?
Ans:
The Expression Transformation can be used to achieve this task at a time when the user can access the previous row data as well as the currently processed data in an Expression Transformation. The Sorter, Expression and Filter transformation can be used for aggregation at the Informatica level.
70. What is a Sorter Transformation?
Ans:
Sorter transformation helps us sort collections of data by port or ports.
71. What is the procedure to import VSAM files from source to target? Do I need a special plugin?
Ans:
As far as I know, by using the power exchange tool to convert VSAM files to oracle tables then do mapping as usual to the target table.
72. What is Joiner Transformation?
Ans:
The joiner transformation is an active and connected transformation used to join two heterogeneous sources.
73. What is a Decode in Informatica?
Ans:
A decode in Informatica is a function used within an Expression Transformation.
74. What is a Router Transformation?
Ans:
The Router Transformation allows users to split a single pipeline of data into multiple.
75. What is a Rank Transformation?
Ans:
The Rank Transformation lets users sort and ranks the top or bottom set of records based on a specific port.
76. What is Filter Transformation?
Ans:
Filter transformation is used to filter out rows in a mapping. It is active and connected.
77. What is a Sequence Generator Transformation?
Ans:
Sequence Generator Transformation is used to generate primary fundamental values or a range of sequence numbers for calculations or processing. It is passive and connected.
78. What is a Master Outer Join?
Ans:
A master outer join is a specific join typesetting within a joiner transformation.
79. What are some examples of Informatica ETL programs?
Ans:
Some examples of Informatica ETL programs are mappings, workflows, and tasks.
80. Explain dimensional tables?
Ans:
Dimension tables contain attributes that describe fact records in the table.
81. What is star schema?
Ans:
It is the simplest form of data warehouse schema that consists of one or more dimensional and facts tables.
82. Describe snowflake schema?
Ans:
A snowflake schema is a fact table that is connected to several dimensional tables.
83. What is a mapplet?
Ans:
Mapplet is essential to create reusable mappings in Informatica.
84. What is a natural primary key?
Ans:
A natural primary key uniquely identifies each record within a table and relates records to additional data stored in other tables.
85. What is a surrogate key?
Ans:
A surrogate key is used as a substitute or replacement for the primary key when the update process becomes difficult for a future requirement. A Surrogate Key is any column or set of columns attached to every record in a Dimension table in a Data Warehouse.
86. What is the difference between a repository server and a powerhouse?
Ans:
A repository server controls the complete repository, which includes tables, charts, and various procedures, etc.
A powerhouse server governs the implementation of various processes among the factors of the server’s database repository.
87. How many repositories can be created in Informatica?
Ans:
As many as required.
88. Describe Data Concatenation?
Ans:
Data concatenation is the process of bringing different pieces of the record together.
89. How can one identify whether the mapping is correct or not without connecting the session?
Ans:
With the help of debugging options.
90. Name the designer tools for creating transformations?
Ans:
Mapping designer, transformation developer, and mapplet designer are used for creating transformations.
91. Differentiate between sessions and batches?
Ans:
A session is a set of commands for the server to move data to the target, while a batch is a set of tasks that can include one or more tasks.
92. What is the difference between static cache and dynamic cache?
Ans:
In the static cache, the data will remain the same for the entire session, whereas in the dynamic cache, whenever a row is inserted, the cache will also be updated.
93. What is the command used to run a batch?
Ans:
The pmcmd command.
94. What is a Session?
Ans:
A session in Informatica is a set of instructions to be followed when data is being transferred from source to target using Session Command. A Session Command can be a pre-session command or a post-session command.
Are you looking training with Right Jobs?
Contact Us- Tableau Interview Questions and Answers
- SSIS Tutorial
- Apache Spark Tutorial
- Talend Interview Questions and Answers
- Ab Initio Interview Questions and Answers
Related Articles
Popular Courses
- Talend Online Training
11025 Learners
- Tableau Online Training
12022 Learners
- Ab Initio Online Training
11141 Learners
- What is Dimension Reduction? | Know the techniques
- Difference between Data Lake vs Data Warehouse: A Complete Guide For Beginners with Best Practices
- What is Dimension Reduction? | Know the techniques
- What does the Yield keyword do and How to use Yield in python ? [ OverView ]
- Agile Sprint Planning | Everything You Need to Know