IDQ Interview Questions and Answers

Last updated on 21st Oct 2020

About author

Anishkumar (Sr. Data Quality Manager)

A high-level domain expert at top MNCs with 8+ years of experience, he has handled around 20+ projects and shares his knowledge by writing these blogs for us.

Informatica offers products for ETL, data masking, data quality, data replication, data virtualization, master data management, etc., and these technologies are very much in demand, so the future scope of Informatica is good. Learning Informatica is worth it in 2019 as well as 2020. Among ETL tools, Informatica is the leading tool in the market, so career growth and salary prospects for Informatica professionals are very good. It is also very easy to learn Informatica and start a career in it.

1. What is Lookup transformation?

Ans:

Lookup transformation is used to look up a source, source qualifier, or target in order to get relevant data. It is used to look up a ‘flat file’, ‘relational table’, ‘view’ or ‘synonym’.

Lookup can be configured as Active or Passive as well as Connected or Unconnected transformation. When the mapping contains the lookup transformation, the integration service queries the lookup data and compares it with lookup input port values. One can use multiple lookup transformations in a mapping.

The lookup transformation is created with the following type of ports:

  1. Input port (I)
  2. Output port (O)
  3. Lookup port (L)
  4. Return port (R)

2. What is Informatica PowerCenter?

Ans:

Informatica PowerCenter is an ETL/data integration tool that has a wide range of applications. This tool allows users to connect and fetch data from different heterogeneous sources and subsequently process the same.

For example, users can connect to a SQL Server Database or an Oracle Database, or both, and also integrate the data from both these databases to a third system.

3. Mention some typical use cases of Informatica.

Ans:

There are many typical use cases of Informatica, but this tool is predominantly leveraged in the following scenarios:

  • When organizations migrate from the existing legacy systems to new database systems
  • When enterprises set up their data warehouse
  • While integrating data from various heterogeneous systems including multiple databases and file-based systems
  • For data cleansing

4. How can we filter rows in Informatica?

Ans:

There are two ways to filter rows in Informatica, they are as follows:

  1. Source Qualifier transformation: It filters rows while reading data from a relational data source and minimizes the number of rows in the mapping to enhance performance. The filter condition uses standard SQL and executes in the database.
  2. Filter transformation: It filters rows within a mapping, from any kind of source. It is added close to the source to filter out unwanted data early and maximize performance. It returns true or false for each row based on the specified condition.

5. What are the types of lookup transformation?

Ans:

There are four different types of lookup transformation:

Relational or flat file lookup: It performs a lookup on a relational table or a flat file.

Pipeline lookup: It performs lookup on application sources.

Connected or unconnected lookup: A connected lookup transformation receives data from the pipeline, performs the lookup, and returns the result to the pipeline. An unconnected lookup is not connected to the pipeline; it is called from another transformation and returns one column to the calling transformation.

Cached or uncached lookup: Lookup transformation can be configured to cache lookup data, or we can directly query the lookup source every time a lookup is invoked.

6. Differentiate between Joiner and Lookup transformations.

Ans:

Joiner | Lookup
It is not possible to override the query | It is possible to override the query
Only the ‘=’ operator is available | All operators are available for use
Users cannot restrict the number of rows while reading relational tables | Users can restrict the number of rows while reading relational tables
It is possible to join tables with Joins | It behaves as a Left Outer Join while connecting with the database
It has large community support | Community support is comparatively restricted

7. In Informatica Workflow Manager, how many repositories can be created?

Ans:

In general, there can be any number of repositories; they are created depending upon the requirement.

8. How do pre- and post-session shell commands function?

Ans:

A command task can be called as a pre- or post-session shell command for a session task. Users can run it as a pre-session command, a post-session success command, or a post-session failure command. Based on use cases, the application of shell commands can be changed or altered.

9. What can we do to improve the performance of Informatica Aggregator Transformation?

Ans:

Aggregator performance improves dramatically if records are sorted before being passed to the aggregator and the ‘Sorted Input’ option under Aggregator Properties is checked. The record set should be sorted on the columns used in the Group By operation. It is often a good idea to sort the record set at the database level, e.g., inside the source qualifier transformation, unless there is a chance that the already sorted records can become unsorted again before reaching the aggregator.
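
As a rough illustration of why sorted input helps, here is a minimal Python analogy (not Informatica code): like the Aggregator with ‘Sorted Input’ checked, itertools.groupby can aggregate in a single pass only because the records arrive pre-sorted on the group-by key.

    from itertools import groupby

    # Records already sorted on the group-by column, as the Aggregator expects
    rows = [("A", 10), ("A", 5), ("B", 7), ("B", 3)]

    # One pass over the data, no need to buffer the whole set per group
    totals = {key: sum(amount for _, amount in group)
              for key, group in groupby(rows, key=lambda r: r[0])}
    print(totals)  # {'A': 15, 'B': 10}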

10. How can we update a record in the target table without using Update Strategy?

Ans:

A target table can be updated without using an Update Strategy. For this, we need to define the key of the target table at the Informatica level and then connect the key and the field we want to update in the mapping target. At the session level, we should set the target property to ‘Update as Update’ and check the ‘Update’ check box.

Let us assume we have a target table ‘Customer’ with the fields ‘Customer ID,’ ‘Customer Name,’ and ‘Customer Address.’ If we want to update ‘Customer Address’ without an Update Strategy, we have to define ‘Customer ID’ as the primary key at the Informatica level and connect the ‘Customer ID’ and ‘Customer Address’ fields in the mapping. If the session properties are set correctly as described above, the mapping will update only the ‘Customer Address’ field for all matching customer IDs.
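
For clarity, here is a minimal Python sketch of the ‘Update as Update’ behavior described above; the dictionary stands in for the target table keyed on ‘Customer ID,’ and the data values are made up for illustration.

    # Target table keyed on Customer ID (the key defined at the Informatica level)
    target = {101: {"name": "Asha", "address": "Old Road"}}

    incoming = [{"id": 101, "address": "New Road"},   # matching ID: updated
                {"id": 999, "address": "Elsewhere"}]  # no match: ignored

    for row in incoming:
        if row["id"] in target:
            # Only the connected non-key field (Customer Address) is overwritten
            target[row["id"]]["address"] = row["address"]

    print(target)  # {101: {'name': 'Asha', 'address': 'New Road'}}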

11. Why do we use mapping parameters and mapping variables?

Ans:

Basically, mapping parameters and mapping variables represent values in mappings and mapplets.

Mapping Parameters

  1. Mapping parameters represent constant values that are defined before running a session.
  2. After creation, parameters appear in the Expression Editor.
  3. These parameters can be used in source qualifier filters, in user-defined joins, or for overriding.

Mapping Variables

  1. As opposed to mapping parameters, mapping variables can change values during sessions.
  2. The last value of a mapping variable is saved to the repository at the end of each successful session by the Integration Service. However, it is possible to override saved values with parameter files.
  3. Basically, mapping variables are used to perform incremental reads of data sources, as sketched below.
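
The incremental-read pattern can be sketched in plain Python as follows; the variable name $$LAST_LOADED_ID and the in-memory ‘repository’ dictionary are illustrative stand-ins for a real mapping variable persisted by the Integration Service.

    # Stand-in for the repository that persists the variable between runs
    repository = {"$$LAST_LOADED_ID": 0}

    def read_new_rows(source_rows):
        last_id = repository["$$LAST_LOADED_ID"]
        new_rows = [r for r in source_rows if r["id"] > last_id]
        if new_rows:
            # On session success, the last value is saved back to the repository
            repository["$$LAST_LOADED_ID"] = max(r["id"] for r in new_rows)
        return new_rows

    source = [{"id": 1}, {"id": 2}, {"id": 3}]
    print(read_new_rows(source))  # first run: all three rows
    print(read_new_rows(source))  # second run: [] -- nothing new to read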

12. Define the surrogate key.

Ans:

A surrogate key is basically an identifier that uniquely identifies modeled entities or objects in a database. Not being derived from any other data in the database, surrogate keys may or may not be used as primary keys. It is basically a unique sequential number. If an entity exists in the outside world and is modeled within the database, or represents an object within the database, it is denoted by a surrogate key. In these cases, surrogate keys for specific objects or modeled entities are generated internally.
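
A minimal Python sketch of internal surrogate key generation: the key is just the next value of a sequence, independent of any natural (business) key in the row. The column names are illustrative.

    from itertools import count

    surrogate_key = count(start=1)  # sequential, never derived from the data

    customers = [{"cust_code": "C-909"}, {"cust_code": "C-101"}]
    for row in customers:
        row["customer_sk"] = next(surrogate_key)

    print(customers)
    # [{'cust_code': 'C-909', 'customer_sk': 1},
    #  {'cust_code': 'C-101', 'customer_sk': 2}]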

13. Explain sessions and also shed light on how batches are used to combine executions.

Ans:

A session is nothing but a set of instructions that has to be implemented to move data from a source to a target. To carry out sessions, users need to leverage the Session Manager or use the pmcmd command. For combining sessions, in either a serial or a parallel manner, batch execution is used. Any number of sessions can be grouped into batches for migration.

14. What is incremental aggregation?

Ans:

Basically, incremental aggregation is the process of capturing changes in the source and calculating aggregations in a session. In this process, the Integration Service updates the targets incrementally and avoids recalculating aggregations on the entire source.
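
The idea can be sketched in a few lines of Python: keep the aggregates from earlier runs and fold in only the newly captured rows, instead of re-aggregating the entire source. The data values are made up for illustration.

    # Aggregate totals saved from earlier session runs
    saved_totals = {"A": 15}

    # Only the new/changed source rows captured since the last run
    new_rows = [("A", 5), ("B", 7)]

    for key, amount in new_rows:
        saved_totals[key] = saved_totals.get(key, 0) + amount

    print(saved_totals)  # {'A': 20, 'B': 7} -- full source never re-read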

15. How can we delete duplicate rows from flat files?

Ans:

We can delete duplicate rows from flat files by using the Sorter transformation and selecting the ‘Distinct’ option, which removes the duplicate rows.
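
As a rough Python equivalent of the Sorter transformation's ‘Distinct’ option: sort the rows and keep a single copy of each duplicate.

    rows = [("A", 1), ("B", 2), ("A", 1), ("C", 3)]

    # 'Distinct' behavior: duplicates dropped, output sorted
    distinct_rows = sorted(set(rows))
    print(distinct_rows)  # [('A', 1), ('B', 2), ('C', 3)]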

16. What are the features of Informatica Developer 9.1.0?

Ans:

From an Informatica developer's perspective, some of the new features in Informatica Developer 9.1.0 are as follows:

  • Lookup can be configured as an active transformation: it can return multiple rows on a successful match.
  • SQL override can now be written on uncached lookups as well; previously it was possible only on cached lookups.
  • Control over the size of the session log: in a real-time environment, we can control the session log file size or log file time.
  • Database deadlock resilience: this ensures that a session does not immediately fail if it encounters a database deadlock. It retries the operation, and we can configure the number of retry attempts.

17. What are the advantages of using Informatica as an ETL tool over Teradata?

Ans:

First up, Informatica is a data integration tool, while Teradata is an MPP database with some scripting and fast data movement capabilities.

Advantages of Informatica over Teradata:

  1. It functions as a metadata repository for the organization's ETL ecosystem. Informatica jobs (sessions) can be arranged logically into worklets and workflows in folders. This leads to an ecosystem that is easier to maintain and quicker for architects and analysts to analyze and enhance.
  2. Job monitoring and recovery: it is easy to monitor jobs using Informatica Workflow Monitor. It is also easier to identify and recover failed or slow-running jobs, and it can restart from the failed row/step.
  3. Informatica MarketPlace: a one-stop shop for lots of tools and accelerators that make the SDLC faster and improve application support.
  4. Plenty of developers with varying skill levels and expertise are available in the market.
  5. Lots of connectors to various databases are available, including support for Teradata MLoad, TPump, FastLoad, and Parallel Transporter in addition to the regular (and slow) ODBC drivers.
  6. Surrogate key generation through shared sequence generators inside Informatica can be faster than generating them inside the database.
  7. If a company decides to move away from Teradata to another solution, vendors like Infosys can execute migration projects to move the data and change the ETL code to work with the new database quickly, accurately, and efficiently using automated solutions.
  8. Pushdown optimization can be used to process the data in the database.
  9. ETL can be coded such that the processing load is balanced between the ETL server and the database box; this is useful if the database box is ageing and/or the ETL server has a fast disk and large enough memory and CPU to outperform the database in certain tasks.
  10. Processes can be published as web services.

Advantages of Teradata over Informatica:

  1. Cheaper (initially): there are no initial ETL tool license costs, and OPEX costs are lower since there is no yearly support to pay to Informatica Corp.
  2. A great choice if all the data to be loaded is available as structured files, which can then be processed inside the database after an initial stage load.
  3. A good choice for a lower-complexity ecosystem.
  4. Only Teradata developers, or resources with good ANSI/Teradata SQL/BTEQ knowledge, are required to build and enhance the system.

18. Differentiate between various types of schemas in data warehousing.

Ans:

Star Schema: The star schema is the simplest style of data mart schema and the approach most widely used to develop data warehouses and dimensional data marts. It features one or more fact tables referencing numerous dimension tables.

Snowflake Schema: A logical arrangement of tables in a multidimensional database, the snowflake schema is represented by centralized fact tables connected to multiple dimension tables. Dimension tables in a star schema are normalized through snowflaking; once normalized, the resulting structure resembles a snowflake with the fact table at the middle. Low-cardinality attributes are removed, and separate tables are formed.

Fact Constellation Schema: A fact constellation is a collection of multiple fact tables sharing dimension tables, viewed as a collection of stars, and used in online analytical processing (OLAP). It can be seen as an extension of the star schema.

19. Define OLAP. What are the different types of OLAP?

Ans:

OLAP or Online Analytical Processing is a specific category of software that allows users to analyze information from multiple database systems simultaneously. Using OLAP, analysts can extract and have a look at business data from different sources or points of view.

Types of OLAP:

ROLAP: ROLAP or Relational OLAP is an OLAP server that maps multidimensional operations to standard relational operations.

MOLAP: MOLAP or Multidimensional OLAP uses array-based multidimensional storage engines for multidimensional views on data. Numerous MOLAP servers use two levels of data storage representation to handle dense and sparse datasets.

HOLAP: HOLAP or Hybrid OLAP combines both ROLAP and MOLAP for faster computation and higher scalability of data.

20. What is the target load order? How to set it?

Ans:

When a mapplet is used in a mapping, Designer allows users to set target load order for all sources that pertain to the mapplet. In Designer, users can set the target load order in which Integration Service sends rows to targets within the mapping. A target load order group is basically a collection of source qualifiers, transformations, and targets linked together in a mapping. The target load order can be set to maintain referential integrity while operating on tables that have primary and secondary keys.

Steps to Set the Target Load Order

Step 1: Create a mapping that contains multiple target load order groups

Step 2: Click on Mappings and then select Target Load Plan

Step 3: The Target Load Plan dialog box lists all Source Qualifier transformations with targets that receive data from them

Step 4: Select a Source Qualifier and click on the Up and Down buttons to change its position

Step 5: Repeat Steps 3 and 4 for other Source Qualifiers if you want to reorder them

Step 6: Click on OK after you are done

21. Define Target Designer.

Ans:

If we are required to perform ETL operations, we need source data, target tables, and the required transformations. Target Designer in Informatica allows us to create target tables and modify the pre-existing target definitions.

Target definitions can be imported from various sources, including flat files, relational databases, XML definitions, Excel worksheets, etc.

For opening Target Designer, click on the Tools menu and select the Target Designer option.

22. What are the advantages of Informatica?

Ans:

The advantages of Informatica as a data integration tool are many.

  1. It facilitates effective and efficient communication and transformation of data between different sources.
  2. Informatica is faster, cheaper, and easy to learn.
  3. Monitoring jobs becomes easy with it, as does recovering failed jobs and pointing out slow jobs.
  4. It has many robust features including database information, data validation, migration of projects from one database to another, etc.

23. List some of the PowerCenter client applications with their basic purpose.

Ans:

  • Repository Manager: An administrative tool which is used to manage repository folders, objects, groups, etc.
  • Administration Console: Used to perform service tasks
  • PowerCenter Designer: Contains several designing tools including source analyzer, target designer, mapplet designer, mapping manager, etc.
  • Workflow Manager: Defines a set of instructions that are required to execute mappings
  • Workflow Monitor: Monitors workflows and tasks

24. What are sessions? List down their properties.

Ans:

Available in the Workflow Manager, sessions are configured by creating a session task. Within a mapping program, there can be multiple sessions which can be either reusable or non-reusable.

Properties of Sessions

  1. Session tasks can run concurrently or sequentially, as per the requirement.
  2. They can be configured to analyze performance.
  3. Sessions include log files, test load, error handling, commit interval, target properties, etc.

25. What are the various types of transformations possible in Informatica?

Ans:

The various types of transformations are:

  1. Aggregator Transformation
  2. Expression Transformation
  3. Normalizer Transformation
  4. Rank Transformation
  5. Filter Transformation
  6. Joiner Transformation
  7. Lookup Transformation
  8. Stored Procedure Transformation
  9. Sorter Transformation
  10. Update Strategy Transformation
  11. XML Source Qualifier Transformation
  12. Router Transformation
  13. Sequence Generator Transformation

26. What are the features of connected lookup?

Ans:

The features of connected lookup are as follows:

  • It takes in the input directly from the pipeline.
  • It actively participates in data flow, and both dynamic and static cache are used.
  • It caches all lookup columns and returns default values as the output when the lookup condition does not match.
  • It is possible to return more than one column value to the output port.
  • It supports user-defined default values.

27. Define junk dimensions.

Ans:

Junk dimensions are structures that consist of a group of a few junk attributes such as random codes or flags. They form a framework to store related codes with respect to a specific dimension at a single place, instead of creating multiple tables for the same.

28. What is the use of Rank Transformation?

Ans:

Rank transformation is an active and connected transformation used to sort and rank a set of records either from the top or from the bottom. It is also used to select data with the largest or smallest numeric value based on a specific port.

29. Define Sequence Generator transformation.

Ans:

A passive and connected transformation, the Sequence Generator is responsible for generating primary keys or a sequence of numbers for calculations or processing. It has two output ports that can be connected to numerous transformations within a mapplet. These ports are:

  1. NEXTVAL: This can be connected to multiple transformations to generate a unique value for each row or transformation.
  2. CURRVAL: This port is connected when NEXTVAL is already connected to some other transformation within the mapplet.

30. What is the purpose of the INITCAP function?

Ans:

When invoked, the INITCAP function capitalizes the first character of each word in a string and converts all other characters to lowercase.

Syntax:

  • INITCAP(string_name)
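
To illustrate the behavior, here is a simplified Python sketch (it treats only spaces as word boundaries, unlike the real function):

    def initcap(s: str) -> str:
        # First character of each word upper-cased, the rest lower-cased
        return " ".join(word.capitalize() for word in s.split())

    print(initcap("inFORMATICA powerCENTER"))  # Informatica Powercenter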

31. Define enterprise data warehousing?

Ans:

When the data of an organization is maintained at a single point of access, it is known as enterprise data warehousing.

32. Differentiate between a database and a data warehouse?

Ans:

A database holds a focused set of useful information and is small in size compared to a data warehouse. A data warehouse stores every kind of data, whether immediately useful or not, and data is extracted from it as per the customer's requirements.

33. What do you understand by the term ‘domain’?

Ans:

The term ‘domain’ refers to all the interlinked nodes and relationships that are governed by a single organizational point.

34. Differentiate between a repository server and a powerhouse.

Ans:

A repository server mainly guarantees the repository's reliability and uniformity, while a powerhouse server handles the execution of many procedures against the server's database repository.

35. Mention the advantages of partitioning a session.

Ans:

The main advantage of partitioning a session is that it improves the server's performance and efficiency. Another advantage is that it implements solo sequences within the session.

36. How can we create indexes after completing the load process?

Ans:

With the help of command tasks at the session level, we can create indexes after the load procedure.

37. Define sessions in Informatica ETL.

Ans:

A session is a set of instructions that directs how and when to move data from a source to a target.

38. How many sessions can we have in one group?

Ans:

We can have any number of sessions, but it is advisable to have fewer sessions in a batch because that makes migration easier.

39. Differentiate between a mapping parameter and a mapping variable.

Ans:

The values that alter during the session’s implementation are known as mapping variables, whereas the values that don’t alter during the session’s implementation are known as mapping parameters.

40. What are the features of complex mapping?

Ans:

The features of complex mapping are as follows:

  1. It involves a larger number of transformations
  2. It uses complex business logic

41. How can we identify whether a mapping is correct or not without a connecting session?

Ans:

With the help of the debugging option, we can identify whether a mapping is correct or not without connecting sessions.

42. Can we use mapping parameters or variables, developed in one mapping, into any other reusable transformation?

Ans:

Yes, we can use mapping parameters or variables in any other reusable transformation because a reusable transformation does not belong to any particular mapplet or mapping.

43. What is the use of the aggregator cache file?

Ans:

If extra memory is needed, the aggregator creates cache files for keeping the transformation values. It also stores the transitional values found in the local buffer memory.

44. What is the benefit of partitioning a session?

Ans:

Partitioning a session means solo implementation sequences within the session. Its main purpose is to improve the server's operation and efficiency. Extractions, transformations, and other outputs of single partitions are carried out in parallel.

45. What do you understand by the term ‘role-playing dimension’?

Ans:

The dimensions that are used for playing diversified roles while remaining in the same database domain are known as role-playing dimensions.

46. How can we access repository reports without SQL or other transformations?

Ans:

We can access repository reports by using the Metadata Reporter. Since it is a web application, there is no need for SQL or other transformations.

47. Mention the types of metadata that are stored in the repository.

Ans:

The types of metadata stored in the repository are target definitions, source definitions, mapplets, mappings, and transformations.

48. What is code page compatibility?

Ans:

When data is transferred from one code page to another, both code pages must have the same character sets; then, data failure will not occur.

49. How can we validate all mappings in the repository simultaneously?

Ans:

We can validate only one mapping at a time; mappings cannot be validated simultaneously.

50. Define Aggregator transformation.

Ans:

Unlike Expression transformation, which performs calculations row by row, Aggregator transformation lets us perform aggregate calculations on groups of rows, such as averages and sums.

51. What is Expression transformation?

Ans:

It is used for performing nonaggregated calculations. We can test conditional statements before the output results are moved to the target tables.

52. Define Filter transformation.

Ans:

Filter transformation is a way of filtering rows in a mapping. All its ports are of input/output type, and only rows that match the filter condition can pass through it.

53. Define Joiner transformation.

Ans:

It combines two related heterogeneous sources located in different locations, while a source qualifier transformation can combine only data arising from a common source.

54. How are indexes created after completing the load process?

Ans:

For creating indexes after the load process, command tasks at the session level can be used. Index-creating scripts can be aligned with the session's workflow or the post-session implementation sequence. However, this type of index creation cannot be controlled at the transformation level after the load process.

55. How can we use Union transformation?

Ans:

It is a multiple input group transformation that is used to combine data from different sources.

56. Define incremental aggregation.

Ans:

Incremental aggregation is done whenever a session is created for a mapping that performs aggregation.

57. Differentiate between a connected lookup and an unconnected lookup.

Ans:

A connected lookup receives input directly from the mapping pipeline and returns values to the pipeline; it can return multiple columns and uses either a static or a dynamic cache. An unconnected lookup is not connected to the pipeline; it is called from another transformation as an expression and returns a single column to the calling transformation.

58. Define mapplet.

Ans:

A mapplet is a reusable object that is created using the Mapplet Designer.

59. What is reusable transformation?

Ans:

A reusable transformation can be used multiple times in a mapping. Unlike other transformations, it is stored as metadata and remains independent of the mappings that use it.

60. Define update strategy.

Ans:

Whenever a row has to be updated or inserted based on some sequence, an update strategy is used. In this case, the condition should be specified beforehand so that the processed row is flagged as Update or Insert.

61. Explain the scenario which compels the Informatica server to reject files.

Ans:

When the Informatica server encounters DD_REJECT in an update strategy transformation, it sends those rows to the reject file.

62. Mention the prerequisite tasks to achieve the session partition.

Ans:

In order to perform session partition, one needs to configure the session to partition source data and then install the Informatica server on a machine with multiple CPUs.

63. In the Informatica server, which files are created during session runs?

Ans:

The following types of files are created during session runs:

  • Errors log
  • Bad file
  • Workflow log
  • Session log

64. Define a session task.

Ans:

It is a set of instructions that guides the PowerCenter server about how and when to move data from sources to targets.

65. Define the command task.

Ans:

This task permits one or more shell commands in UNIX, or DOS commands in Windows, to run during the workflow.

66. Explain a standalone command task.

Ans:

This task can be used anywhere in the workflow to run the shell commands.

67. What is a predefined event?

Ans:

A predefined event is the file-watch event. It waits for a specific file to arrive at a specific location.

68. What is a user-defined event?

Ans:

User-defined events are a flow of tasks in the workflow. Events can be developed and then raised as per requirement.

69. Define workflow.

Ans:

The group of directions that communicates with the server about how to implement tasks is known as workflow.

70. Mention the different tools used in Workflow Manager?

Ans:

The different tools used in Workflow Manager are:

  1. Task Developer
  2. Worklet Designer
  3. Workflow Designer

71. Name the other tools used for scheduling purposes other than Workflow Manager and pmcmd.

Ans:

‘CONTROL M’ is a third-party tool used for scheduling purposes.

72. Define OLAP (Online Analytical Processing).

Ans:

It is a process by which multi-dimensional analysis occurs.

73. Name the different types of OLAP.

Ans:

Different types of OLAP are ROLAP, MOLAP, HOLAP, and DOLAP.

74. Define worklet.

Ans:

A worklet is a group of workflow tasks. It can include the timer, decision, command, event wait, etc.

75. Mention the use of a Target Designer.

Ans:

With the help of a Target Designer, we can create target definitions.

76. Where can we find the throughput option in Informatica?

Ans:

The throughput option is in Workflow Monitor: right-click on the session, select ‘Get Run Properties,’ and look under Source/Target Statistics.

77. Define target load order.

Ans:

It is specified on the criteria of source qualifiers in a mapping. If many source qualifiers are attached to various targets, we can set the order in which Informatica loads data into the targets.

78. Define Informatica.

Ans:

Informatica is a tool supporting all the steps of the Extraction, Transformation, and Load (ETL) process. Nowadays, Informatica is also being used as an integration tool. It is easy to use, with a simple visual interface like forms in Visual Basic. You just need to drag and drop different objects (known as transformations) and design the process flow for data extraction, transformation, and load.

These process flow diagrams are known as mappings. Once a mapping is made, it can be scheduled to run as and when required. In the background, the Informatica server takes care of fetching data from the source, transforming it, and loading it to the target.

79. What are the different clients of PowerCenter?

Ans:

Here is the list of PowerCenter clients:

  • PowerCenter Designer
  • PowerCenter Workflow Monitor
  • PowerCenter Workflow Manager
  • PowerCenter Repository Manager

80. What are the different lookup cache(s)?

Ans:

Informatica lookups can be cached or uncached (no cache). A cached lookup can be either static or dynamic. A static cache is not modified once it is built and remains the same during the session run. A dynamic cache, on the other hand, is refreshed during the session run by inserting or updating records in the cache based on the incoming source data.

By default, the Informatica cache is a static cache. A lookup cache can also be classified as persistent or non-persistent, based on whether Informatica retains the cache even after the session run completes or deletes it.
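
The static/dynamic distinction can be sketched in Python; the dictionary stands in for the lookup cache, and the function and key names are illustrative.

    # Cache built once before rows are processed
    lookup_cache = {"C-101": "Alice"}

    def lookup(key, incoming_value, dynamic=False):
        if key in lookup_cache:
            return lookup_cache[key]            # cache hit
        if dynamic:
            lookup_cache[key] = incoming_value  # dynamic cache: insert the new row
        return None                             # static cache stays unchanged on a miss

    print(lookup("C-101", "Alice"))              # hit
    print(lookup("C-202", "Bob"))                # static miss: cache untouched
    print(lookup("C-202", "Bob", dynamic=True))  # dynamic miss: row cached now
    print(lookup("C-202", "Bob"))                # later rows see the inserted key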

81. How to elaborate Tracing Level?

Ans:

Tracing level can be defined as the amount of information that the server writes in the log file. It can be configured at the transformation level, at the session level, or at both levels.

82. What is Decode in Informatica?

Ans:

To understand Decode, let’s consider it as similar to the CASE statement in SQL. It is basically the function that is used by an expression transformation in order to search a specific value in a record.

There can be unlimited searches within the Decode function where a port is specified for returning result values. This function is usually used in cases where it is required to replace nested IF statements or to replace lookup values by searching in small tables with constant values.

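The search-and-return behavior can be mimicked in Python; this is an illustrative sketch of the DECODE idea, not PowerCenter expression syntax.

    def decode(value, *pairs, default=None):
        # pairs come as search1, result1, search2, result2, ...
        mapping = dict(zip(pairs[::2], pairs[1::2]))
        return mapping.get(value, default)

    print(decode(10, 10, "Ten", 20, "Twenty", default="Other"))  # Ten
    print(decode(99, 10, "Ten", 20, "Twenty", default="Other"))  # Other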

83. What is an Informatica ETL Tool?

Ans:

Informatica ETL tool is the market leader in data integration and data quality services. Informatica is a successful ETL and EAI tool with significant industry coverage. ETL refers to extract, transform, and load. Data integration tools are different from other software platforms and languages.

They have no inbuilt feature to build a user interface where the end user can see the transformed data. The Informatica ETL tool PowerCenter has the capability to manage, integrate, and migrate enterprise data.

84. How many sessions can one group into a batch?

Ans:

One can group any number of sessions, but migration is easier when a batch contains fewer sessions.

85. What is the need for an ETL tool?

Ans:

The problem comes with traditional programming languages where we need to connect to multiple sources and then handle errors. For this, we have to write complex code. ETL tools provide a ready-made solution for this. We don’t need to worry about handling these things, and hence we can concentrate on coding the required part.

86. Is ‘sorter’ an active or passive transformation?

Ans:

It is an active transformation because, when configured to output distinct rows, it removes duplicates and consequently changes the number of rows.

87. What is Aggregator Transformation in INFORMATICA?

Ans:

Aggregator transformation is an active, connected transformation. It works like the GROUP BY clause in SQL and is useful for performing aggregate calculations on groups in INFORMATICA PowerCenter, using aggregate functions such as SUM, AVG, MAX, and MIN.

88. What is the difference between active and passive transformation?

Ans:

Active Transformation: An active transformation can perform any of the following actions:

  1. Change the number of rows that pass through the transformation: for instance, the Filter transformation is active because it removes rows that do not meet the filter condition.
  2. Change the transaction boundary: for example, the Transaction Control transformation is active because it defines a commit or rollback transaction based on an expression evaluated for each row.
  3. Change the row type: for example, the Update Strategy transformation is active because it flags rows for insert, delete, update, or reject.

Passive Transformation: A passive transformation is one which satisfies all of these conditions (see the sketch below):

  1. Does not change the number of rows that pass through the transformation
  2. Maintains the transaction boundary
  3. Maintains the row type
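
A toy Python contrast of the two behaviors; the row data and the high-amount rule are made up for illustration.

    rows = [{"amt": 120}, {"amt": 40}, {"amt": 300}]

    # Active: a filter can change the row count (3 rows in, 2 out)
    active_out = [r for r in rows if r["amt"] > 100]

    # Passive: deriving a new column keeps one row out per row in (3 in, 3 out)
    passive_out = [{**r, "band": "HIGH" if r["amt"] > 100 else "LOW"} for r in rows]

    print(len(active_out), len(passive_out))  # 2 3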

89. Name the output files created by the Informatica server during session running.

Ans:

Informatica server log: Informatica server (on UNIX) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files will be created in the Informatica Home directory.

Session log file: Informatica server creates session log files for each session. It writes information about sessions into log files such as initialization process, creation of SQL commands for reader and writer threads, errors encountered and load summary. The amount of detail in the session log file depends on the tracing level that you set.

Session detail file: This file contains load statistics for each target in mapping. Session detail includes information such as table name, number of rows written or rejected. You can view this file by double clicking on the session in the monitor window.

Performance detail file: This file contains session performance details which tells you where performance can be improved. To generate this file, select the performance detail option in the session property sheet.

Reject file: This file contains the rows of data that the writer does not write to targets.

Control file: Informatica server creates a control file and a target file when you run a session that uses the external loader. The control file contains the information about the target flat file such as data format and loading instructions for the external loader.

Post-session email: Post-session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completes successfully and another if the session fails.

Indicator file: If you use the flat file as a target, you can configure the Informatica server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete or reject.

Output file: If a session writes to a target file, the Informatica server creates the target file based on file properties entered in the session property sheet.

Cache files: When the Informatica server creates a memory cache, it also creates cache files. The server creates index and data cache files for cache-building transformations such as Aggregator, Rank, Joiner, and Lookup.

90. What are the types of groups in router transformation?

Ans:

  • Input group
  • Output group
  • Default group

91. What is the difference between STOP and ABORT options in Workflow Monitor?

Ans:

On issuing the STOP command on the session task, the Integration Service stops reading data from the source, although it continues processing the data already read to the targets. If the Integration Service cannot finish processing and committing data, we can issue the ABORT command.

The ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing data within the timeout period, it kills the DTM process and terminates the session.

92. Why is sorter an active transformation?

Ans:

When the Sorter transformation is configured to treat output rows as distinct, it assigns all ports as part of the sort key. The integration service discards duplicate rows that were compared during the sort operation. The number of input rows will vary as compared to the output rows and hence it is an active transformation.

93. What are the different ways to filter rows using Informatica transformations?

Ans:

  1. Source Qualifier
  2. Joiner
  3. Filter
  4. Router

94. What are data driven sessions?

Ans:

When you configure a session using an update strategy, the ‘data driven’ session property instructs the Informatica server to use the instructions coded in the mapping to flag rows for insert, update, delete, or reject. This is done by specifying DD_UPDATE, DD_INSERT, or DD_DELETE in the update strategy transformation.

“Treat source rows as” property in session is set to “Data Driven” by default when using an update strategy transformation in a mapping.
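
A hedged Python sketch of the flagging logic: the DD_* constant values match those used by PowerCenter, while the row structure and the deletion rule are illustrative.

    DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

    def flag_row(row, existing_keys):
        # Per-row logic decides the operation, i.e. the session is "data driven"
        if row.get("deleted"):
            return DD_DELETE
        return DD_UPDATE if row["id"] in existing_keys else DD_INSERT

    existing = {1, 2}
    rows = [{"id": 1}, {"id": 3}, {"id": 2, "deleted": True}]
    print([flag_row(r, existing) for r in rows])  # [1, 0, 2]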

95. What are mapplets?

Ans:

  • A Mapplet is a reusable object that we create in the Mapplet Designer.
  • It contains a set of transformations and lets us reuse that transformation logic in multiple mappings.

96. What is the use of a source qualifier?

Ans:

The source qualifier transformation is an active, connected transformation used to represent the rows that the Integration Service reads when it runs a session. You need to connect the source qualifier transformation to the relational or flat file definition in a mapping. The source qualifier transformation converts the source data types to the Informatica native data types, so you should not alter the data types of the ports in the source qualifier transformation.
