Ab Initio Interview Questions and Answers


These Ab Initio interview questions have been designed to acquaint you with the nature of questions you may encounter during an interview on Ab Initio. In my experience, good interviewers rarely plan to ask any particular question; interviews normally start with some basic concept of the subject and then continue based on further discussion and your answers. We are going to cover the top Ab Initio interview questions along with their detailed answers, including Ab Initio scenario-based interview questions, Ab Initio interview questions for freshers, and Ab Initio interview questions and answers for experienced candidates.

1) What is the relation between the EME, GDE and Co>Operating System?

Ans:

EME stands for Enterprise Meta>Environment, GDE for Graphical Development Environment, and the Co>Operating System can be regarded as the Ab Initio server. The relation between them is as follows: the Co>Operating System is the execution server. It is installed on a particular OS platform, which is called the native OS. The EME is analogous to the repository in Informatica; it holds the metadata, transformations, database configuration files, and source and target information. The GDE is the end-user environment where we can develop graphs (the equivalent of mappings in Informatica); the designer uses the GDE to design graphs and save them to the EME or to a sandbox. The GDE sits on the user side, whereas the EME sits on the server side.

2) What are the benefits of data processing according to you?

Ans:

Well, processing of data delivers a large number of benefits. Users can separate out the factors that matter to them. In addition, with the help of this approach, one can easily keep up the pace by deriving structured data from a totally unstructured format. Processing is also useful in eliminating various bugs that are often associated with data and cause problems at a later stage. It is for these reasons that data processing has wide application in a number of tasks.

3) What exactly do you understand by the term data processing, and why can businesses trust this approach?

Ans:

Processing is basically a procedure that converts data from a useless form into a useful one without much effort. The approach may vary depending on factors such as the size of the data and its format. A sequence of operations is generally carried out to perform this task, and depending on the type of data, this sequence can be automatic or manual. In the present scenario, most of the devices that perform this task are PCs, so the automatic approach is more popular than ever before. Users are free to obtain data in forms such as tables, vectors, images, graphs, and charts. This is something business owners can simply take advantage of.

4) How data is processed and what are the fundamentals of this approach?

Ans:

There are certain activities which require the collection of data, and processing largely depends on the same in many cases. The fact is that data needs to be stored and analyzed before it is actually processed. This task depends on some major factors; they are:

  • Collection of Data
  • Presentation 
  • Final Outcomes
  • Analysis
  • Sorting

These are also regarded as the basic fundamentals that can be trusted to keep up the pace in this matter.

5) What would be the next step after collecting the data?

Ans:

Once the data is collected, the next important task is to enter it into the concerned machine or system. Gone are the days when storage depended on paper. In the present time, data size is very large and storage needs to be performed in a reliable manner. The digital approach is a good option for this, as it simply lets users perform this task easily without compromising on anything. A large set of operations then needs to be performed for meaningful analysis. In many cases, conversion also matters, and users are always free to consider the outcomes which best meet their expectations.

6) What is a data processing cycle and what is its significance?

Ans:

Data often needs to be processed continuously while it is being used at the same time; this is known as the data processing cycle. It provides results quickly, or may take extra time, depending on the type, size, and nature of the data. This boosts the complexity of the approach, and thus there is a need for methods that are more reliable and advanced than existing approaches. The data cycle simply makes sure that complexity is avoided to the extent possible without much extra effort.

7) What are the factors on which storage of data depends?

Ans:

Basically, it depends on the sorting and filtering. In addition to this, it largely depends on the software one uses. 

8) Do you think effective communication is necessary in data processing? What are your strengths in this regard?

Ans:

The biggest ability one could have in this domain is the ability to rely on the data or the information. Of course, communication matters a lot in accomplishing several important tasks, such as the representation of the information. There are many departments in an organization, and communication makes sure things remain good and reliable for everyone.


9) Suppose we assign you a new project. What would be your initial point and the key steps that you follow?

Ans:

The first thing that largely matters is defining the objective of the task and then engaging the team in it. This provides a solid direction for the accomplishment of the task, which is especially important when one is working on a set of data which is completely unique or fresh. The next big thing that needs attention is effective data modeling; this includes finding the missing values and data validation. The last thing is to track the results.

10) Suppose you find the term Validation mentioned with a set of data, what does that simply represent?

Ans:

It represents that the concerned data is clean and correct, and can thus be used reliably without worrying about anything. Data validation is widely regarded as one of the key steps in any processing system.

11) What do you mean by data sorting?

Ans:

It is not always necessary that data remains in a well-defined sequence; in fact, it is often a random collection of objects. Sorting is nothing but arranging the data items in desired sets or sequences.

12) Name the technique which you can use to combine multiple data sets simply?

Ans:

It is known as aggregation.

13) How is scientific data processing different from commercial data processing?

Ans:

Scientific data processing means data with a great amount of computation, i.e., arithmetic operations: a limited amount of data is provided as input and bulk data is produced as the outcome. Commercial data processing, on the other hand, is different: the outcome is limited compared to the input data, and the computational operations are limited as well.

14) What are the benefits of data analysis?

Ans:

It makes sure of the following:

  1. Explanation of development related to the core tasks can be assured
  2. Hypotheses can be tested with an integration approach
  3. Patterns can be detected in a reliable manner

15) Mention what is Ab Initio?

Ans:

"Ab initio" is a Latin phrase meaning "from the beginning." Ab Initio is a tool used to extract, transform and load data. It is also used for data analysis, data manipulation, batch processing, and graphical-user-interface-based parallel processing.

16) Explain what is the architecture of Abinitio?

Ans:

The architecture of Ab Initio includes:

  • GDE (Graphical Development Environment)
  • Co-operating System
  • Enterprise meta-environment (EME)
  • Conduct-IT

17) Mention what is the role of Co-operating system in Abinitio?

Ans:

The Ab Initio Co>Operating System provides features such as:

  1. Managing and running Ab Initio graphs and controlling the ETL processes
  2. Providing Ab Initio extensions to the operating system
  3. ETL process monitoring and debugging
  4. Metadata management and interaction with the EME

18) Explain what does dependency analysis mean in Abinitio?

Ans:

In Ab Initio, dependency analysis is a process through which the EME examines a project entirely and traces how data is transferred and transformed, from component to component, field by field, within and between graphs.

19) Explain how Abinitio EME is segregated?

Ans:

The Ab Initio EME is logically divided into two segments:

  1. Data Integration portion
  2. User Interface (access to the metadata information)

20) Mention how can you connect EME to Abinitio Server?

Ans:

To connect to the Ab Initio Server, there are several ways:

  • Set AB_AIR_ROOT
  • Log in to the EME web interface: http://serverhost:[serverport]/abinitio
  • Through the GDE, you can connect to the EME data-store
  • Through the air command

21) List out the file extensions used in Abinitio?

Ans:

The file extensions used in Abinitio are:

  1. .mp: Stores an Ab Initio graph or graph component
  2. .mpc: Custom component or program
  3. .mdc: Dataset or custom dataset component
  4. .dml: Data Manipulation Language file or record type definition
  5. .xfr: Transform function file
  6. .dat: Data file (multifile or serial file)

22) Mention what information does a .dbc file extension provides to connect to the database?

Ans:

The .dbc file provides the GDE with the information required to connect to a database:

  • Name and version number of the data-base to which you want to connect
  • Name of the computer on which the data-base instance or server to which you want to connect runs, or on which the database remote access software is installed
  • Name of the server, database instance or provider to which you want to link

23) Explain how you can run a graph infinitely in Ab initio?

Ans:

To execute a graph infinitely, the graph's end script should call the .ksh file of the graph itself. Therefore, if the graph name is abc.mp, then the end script of the graph should call abc.ksh. This will run the graph indefinitely.

24) Mention what the difference is between a "Lookup File" and "Lookup" in Ab Initio?

Ans:

A lookup file defines one or more serial files (flat files); it is a physical file where the data for the lookup is stored. Lookup, on the other hand, is the component of an Ab Initio graph in which we can save data and retrieve it by using a key parameter.

25) Mention what are the different types of parallelism used in Abinitio?

Ans:

Different types of parallelism used in Ab Initio include:

  1. Component parallelism: A graph with multiple processes executing simultaneously on separate data uses component parallelism.
  2. Data parallelism: A graph that works with data divided into segments and operates on each segment separately uses data parallelism.
  3. Pipeline parallelism: A graph with multiple components executing simultaneously on the same data uses pipeline parallelism. Each component in the pipeline reads continuously from the upstream component, processes data, and writes to the downstream component, so both components can operate in parallel.

26) Explain what is Sort Component in Abinitio?

Ans:

The Sort component in Ab Initio re-orders data. It has two parameters, "Key" and "Max-core":

  • Key: Determines the collation order
  • Max-core: Controls how often the Sort component dumps data from memory to disk

27) Mention what the Dedup and Replicate components do?

Ans:

  1. Dedup component: Used to remove duplicate records
  2. Replicate component: Combines the data records from its inputs into one flow and writes a copy of that flow to each of its output ports

28) Mention what is a partition and what are the different types of partition components in Abinitio?

Ans:

In Ab Initio, partitioning is the process of dividing data sets into multiple sets for further processing. The different types of partition components include:

  1. Partition by Round-robin: Distributes data evenly, in block-size chunks, across the output partitions
  2. Partition by Range: Divides data evenly among nodes, based on a set of partitioning ranges and a key
  3. Partition by Percentage: Distributes data so that the output is proportional to fractions of 100
  4. Partition by Load Balance: Performs dynamic load balancing
  5. Partition by Expression: Divides data according to a DML expression
  6. Partition by Key: Groups data by a key

29) Explain what is SANDBOX?

Ans:

A sandbox is a collection of graphs and related files that are saved in a single directory tree and behave as a group for the purposes of navigation, version control, and migration.

30) Explain what is de-partition in Abinitio?

Ans:

De-partitioning is done in order to read data from multiple flows or operations and is used to re-join data records from different flows. Several de-partition components are available, including Gather, Merge, Interleave, and Concatenation.

31) What Is The Function That Converts A String Into A Decimal?

Ans:

  1. Use a decimal cast in the transform function when the size of the string and the decimal are the same.
  2. For example, if the source field is defined as string(8), the destination is defined as decimal(8), and the field name is salary, the rule is:
     out.field :: (decimal(8)) in.salary;
  3. If the size of the destination field is smaller than the input, the string_substring() function can be used.
  4. For example, if the destination field is decimal(5), use:
     out.field :: (decimal(5)) string_lrtrim(string_substring(in.field, 1, 5));
  5. The string_lrtrim() function removes leading and trailing spaces from the string.
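As a minimal sketch, both cases above could live in a single Reformat transform (the field names salary and emp_code are hypothetical):

  out :: reformat(in) =
  begin
    // same-size conversion: string(8) to decimal(8)
    out.salary_dec :: (decimal(8)) in.salary;
    // smaller destination: take the first 5 characters, then trim spaces
    out.code_dec :: (decimal(5)) string_lrtrim(string_substring(in.emp_code, 1, 5));
  end;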

32) Describe The Order Of Evaluation Of Parameters?

Ans:

Following is the order of evaluation:

  • The host setup script is executed first
  • All common (that is, included) parameters are evaluated
  • All sandbox parameters are evaluated
  • The project script, project-start.ksh, is executed
  • All form parameters are evaluated
  • Graph parameters are evaluated
  • The start script of the graph is executed

33) Explain PDL With An Example?

Ans:

To make a graph behave dynamically, PDL (Parameter Definition Language) is used:

  1. Suppose there is a need to add a dynamic field to a predefined DML while executing the graph.
  2. A graph-level parameter can then be defined.
  3. This parameter is utilized while embedding the DML in the output port.
  4. For example, define a parameter named myfield with the value:
     string("|") name;
  5. Use ${myfield} at the time of embedding the DML in the out port.
  6. Use $substitution as the interpretation option.
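A minimal sketch of the embedded record format with the parameter interpolated (the surrounding fields id and status are hypothetical):

  record
    string("|") id;
    ${myfield}
    string("\n") status;
  end

With the $substitution interpretation option, ${myfield} expands to the parameter's value (string("|") name;) before the record format is parsed, so the graph picks up the extra field at run time.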

34) State The Working Process Of Decimal_strip Function?

Ans:

  1. decimal_strip takes the decimal values out of the data.
  2. It trims any leading zeros.
  3. The result is a valid decimal number.

Ex:

  • decimal_strip("-0184o") := "-184"
  • decimal_strip("oxyas97abc") := "97"
  • decimal_strip("+$78ab=-*&^*&%cdw") := "78"
  • decimal_strip("Honda") := "0"

35) State The First_defined Function With An Example?

Ans:

  • This function is similar to the NVL() function in the Oracle database.
  • It returns the first non-NULL value among the values supplied to it and assigns that value to the variable.
  • Example: A set of variables, say v1, v2, v3, v4, v5, v6, is assigned NULL, and another variable num is assigned the value 340 (num = 340). Then:
    num = first_defined(NULL, v1, v2, v3, v4, v5, v6, num)
    The result of num is 340.

36) What Is Max Core Of A Component?

Ans:

  1. MAX CORE is the memory space a component consumes for its calculations.
  2. Each component has a different MAX CORE.
  3. A component's performance is influenced by its MAX CORE setting.
  4. The process may slow down or speed up if a wrong MAX CORE is set.

37) What Are The Operations That Support Avoiding Duplicate Record?

Ans:

Duplicate records can be avoided by using the following:

  • Using Dedup sort
  • Performing aggregation
  • Utilizing the Rollup component

38) What Is A Deadlock And How Does It Occur?

Ans:

  1. A graph or program hang is known as a deadlock.
  2. The progress of a program stops when a deadlock occurs.
  3. The data flow pattern is the likely cause of a deadlock.
  4. If a graph's flows diverge and converge within a single phase, there is potential for a deadlock.
  5. During the convergence, a component might wait for records to arrive on one flow while unread data accumulates on the others.
  6. From GDE version 1.8 onwards, the occurrence of a deadlock is very rare.

39) What is surrogate key?

Ans:

Surrogate key is a system generated sequential number which acts as a primary key.

40) Differences Between Ab-Initio and Informatica?

Ans:

Informatica and Ab Initio both support parallelism, but Informatica supports only one type of parallelism while Ab Initio supports three:

  • Component parallelism
  • Data parallelism
  • Pipeline parallelism

There is no scheduler in Ab Initio as there is in Informatica; you need to schedule through scripts or run graphs manually.

Ab Initio supports different types of text files, meaning you can read the same file with different structures, which is not possible in Informatica. Ab Initio is also more user friendly than Informatica.

Informatica is an engine-based ETL tool; its power is in its transformation engine, and the code that it generates after development cannot be seen or modified.

Ab Initio is a code-based ETL tool; it generates ksh or bat code, which can be modified to achieve goals that cannot be met through the ETL tool itself.

Initial ramp-up time with Ab Initio is quick compared to Informatica; when it comes to standardization and tuning, both probably fall into the same bucket.

Ab Initio doesn't need a dedicated administrator; a UNIX or NT admin will suffice, whereas Informatica needs a dedicated administrator.

With Ab Initio you can read data with multiple delimiters in a given record, whereas Informatica forces you to have all the fields delimited by one standard delimiter.

Error handling: in Ab Initio you can attach error and reject files to each transformation and capture and analyze the messages and data separately. Informatica has one huge log, which is very inefficient when working on a large process with numerous points of failure.

41) What is the difference between rollup and scan?

Ans:

Using Rollup we cannot generate cumulative summary records; for that we use Scan. Rollup produces one summary record per key group, whereas Scan produces a running-summary record for every input record.
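As a sketch, a template-mode Rollup transform that produces one summary record per key (the field names cust_id and amount are hypothetical):

  out :: rollup(in) =
  begin
    out.cust_id :: in.cust_id;
    out.total_amount :: sum(in.amount);
  end;

Scan accepts the same style of transform but emits a running total for every input record instead of one record per group.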

42) Why we go for Ab-Initio?

Ans:

  1. Ab Initio is designed to support the largest and most complex business applications.
  2. We can develop applications easily using the GDE for business requirements.
  3. Data processing is very fast and efficient when compared to other ETL tools.
  4. It is available on both Windows NT and UNIX.

43) What is the difference between partitioning with key and round robin?

Ans:

PARTITION BY KEY: In this, we have to specify the key based on which the partitioning will occur. Records with the same key always land in the same partition, but the partitions may be unevenly balanced if the key values are skewed. It is useful for key-dependent parallelism.

PARTITION BY ROUND ROBIN: In this, the records are partitioned sequentially, distributing data evenly in block-size chunks across the output partitions. It is not key based and results in well-balanced data, especially with a block size of 1. It is useful for record-independent parallelism.

44) How to Create Surrogate Key using Ab Initio?

Ans:

A key is a field or set of fields that uniquely identifies a record in a file or table.

A natural key is a key that is meaningful in some business or real-world sense. For example, a social security number for a person, or a serial number for a piece of equipment, is a natural key.

A surrogate key is a field that is added to a record, either to replace the natural key or in addition to it, and has no business meaning. Surrogate keys are frequently added to records when populating a data warehouse, to help isolate the records in the warehouse from changes to the natural keys by outside processes.
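A common way to generate a surrogate key inside an Ab Initio transform is the built-in next_in_sequence() function. A minimal sketch follows (the field names are hypothetical; next_in_sequence() is unique only within a partition, so in parallel layouts it is usually offset by the partition number):

  out :: reformat(in) =
  begin
    // globally unique across partitions: interleave the per-partition sequences
    out.surrogate_key :: (next_in_sequence() * number_of_partitions()) + this_partition();
    out.* :: in.*;
  end;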

45) What are the most commonly used components in Ab Initio graphs?

Ans:

  1. input file / output file
  2. input table / output table
  3. lookup / lookup_local
  4. reformat
  5. gather / concatenate
  6. join
  7. run sql
  8. join with db
  9. compression components
  10. filter by expression
  11. sort (single or multiple keys)
  12. rollup
  13. partition by expression / partition by key

46) How do we handle it if the DML changes dynamically?

Ans:

There are many ways to handle DMLs which change dynamically within a single file.

Some of the suitable methods are to use a conditional DML, or to call the vector functionality while calling the DMLs.

47) What is meant by limit and ramp in Ab Initio? In which situations are they used?

Ans:

Limit and ramp are variables used to set the reject tolerance for a particular graph. This is one of the options for the reject-threshold parameter; when this option is enabled, the limit and ramp values must be supplied.

The graph stops execution when the number of rejected records exceeds the following formula:

limit + (ramp * number_of_records_processed)

The default value is 0.0.

The limit parameter contains an integer that represents a number of reject events. The ramp parameter contains a real number that represents a rate of reject events per record processed. For example, with limit = 1 and ramp = 0.01, after 100 processed records the threshold is 1 + 0.01 * 100 = 2, so a third rejected record aborts the graph.

Typical Limit and Ramp settings:

  1. Limit = 0, Ramp = 0.0: Abort on any error
  2. Limit = 50, Ramp = 0.0: Abort after 50 errors
  3. Limit = 1, Ramp = 0.01: Abort if more than 2 in 100 records cause errors
  4. Limit = 1, Ramp = 1: Never abort

48) What are data mapping and data modeling?

Ans:

Data mapping deals with the transformation of the extracted data at the FIELD level, i.e., the transformation of a source field to a target field is specified by the mapping defined on the target field. The data mapping is specified during the cleansing of the data to be loaded. Data modeling, by contrast, defines the structure of the data: the entities, fields, keys, and the relationships between them.

For Example:

  • source:
  • string(35) name = "Siva Krishna  ";
  • target:
  • string("01") nm = NULL("");      /* maximum length is string(35) */

Then we can have a mapping like:

  • Straight move: trim the leading or trailing spaces.
  • The above mapping specifies the transformation of the field nm.
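A minimal sketch of that mapping as a transform rule (assuming the source field name and target field nm above):

  out :: reformat(in) =
  begin
    // straight move with leading and trailing spaces trimmed
    out.nm :: string_lrtrim(in.name);
  end;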

49) Can you explain the performance and scalability of the Co>Operating System?

Ans:

The Co>Operating System was designed from the ground up to achieve maximum performance and scalability. Every aspect of the Co>Operating System has been optimized to get maximum performance from your hardware. And you don’t need “cloud” technology because the Co>Operating System naturally distributes processing across farms of servers.

50) Can you explain the Co>Operating System’s processing model?

Ans:

The Co>Operating System is a distributed peer-to-peer processing system. It must be installed on all the servers that will be part of running an application. Each of these servers may be running a different operating system (Unix, Linux, Windows, or z/OS).

51) How Co>Operating System Integrates with legacy codes?

Ans:

While Ab Initio enables users to build end-to-end applications completely with the Graphical Development Environment and run those applications completely within the Co>Operating System, users often have existing applications or 3rd-party products that run fine and are not worth re-implementing.

Ab Initio makes it easy to reuse those existing applications, whether they were coded in C, C++, Cobol, Java, shell scripts, or whatever. In fact, the Co>Operating System makes it possible to integrate those applications into environments they were not originally designed for.

Legacy codes are integrated into Ab Initio applications by turning them into components that behave just like all other Ab Initio components.

52) What is Ab Initio Enterprise Meta>Environment (EME)?

Ans:

The Ab Initio Enterprise Meta>Environment (EME) is a centralized repository in which application assets are stored, managed, and reused both within and across applications.

53) What you can store, manage and reuse centrally in Ab Initio Enterprise Meta>Environment (EME)?

Ans:

Here are the elements of what we can centrally store, manage, and reuse: 

  1. Record formats
  2. Business and logic rules
  3. Sections of applications (applications are called "graphs" and the sections are called "subgraphs")
  4. Orchestrations of applications ("plans" in Ab Initio's Conduct>It)

54) What can the Metadata Importer do in Ab Initio?

Ans:

The Metadata Importer can load external metadata such as:

  • Reporting tools: MicroStrategy, Business Objects, Cognos, …
  • Modeling tools: ERwin, ERstudio, and Rational Architect, …
  • Database system catalogs for all major and most minor relational database management systems
  • Tabular metadata, usually stored in spreadsheets using either predefined templates or customer-specific layouts
  • Industry-standard protocols for metadata exchanges, including Common Warehouse Model XML Metadata Interchange Format (CWM XMI)

55) What is the Ab Initio Business Rules Environment (BRE)?

Ans:

The Ab Initio® Business Rules Environment (BRE) allows business analysts to specify business rules in a form that is very familiar and comfortable to them: grid-like spreadsheets.

In the BRE, the rules are specified in business terms, not technical terms, and with expressions that are clear to anyone who has worked with Microsoft Excel. As a consequence, not only can rules be specified quickly and accurately, they are also easily understood by other business people.

56) What are “business rules” in Ab Initio Business Rules Environment (BRE)?

Ans:

The BRE supports three different styles of business rules: decision rules, validation rules, and mapping rules. While they are fundamentally similar, business users are comfortable thinking of rules as belonging in one of these categories.


57) How does the BRE work with the Co>Operating System?

Ans:

It’s straightforward: the BRE takes the rules created by the user and puts them into a component in a graph run by the Co>Operating System.

58) What is Conduct>It?

Ans:

Ab Initio Conduct>It is a tool for developing high-volume data processing systems. It enables combining graphs from the Graphical Development Environment with custom scripts and programs from other vendors.

59) What is Ab Initio?

Ans:

"Ab initio" is a Latin phrase which stands for "from the beginning". Ab Initio is a tool that helps in extracting, transforming, and loading data. It can also be used for data analysis, batch processing, and graphical-user-interface-based parallel processing.

60) What is de-partition?

Ans:

De-partitioning is done for the purpose of reading data from multiple flows or operations and is used for rejoining data records from different flows. Numerous de-partition components are on hand, such as Merge, Concatenation, Gather, and Interleave.

61) Describe the Grant/Revoke DDL facility and how it is implemented?

Ans:

It is a part of the D.B.A.'s responsibilities. GRANT means giving permissions, for instance GRANT CREATE TABLE, GRANT CREATE VIEW, and many more.

REVOKE, on the other hand, cancels the granted permissions. Therefore, both the GRANT and REVOKE commands fall under the D.B.A.

62) What is local lookup?

Ans:

  • A local lookup file has records that you can place in main memory.
  • A local lookup uses a transform function to get back records far quicker than retrieving them from disk.
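A minimal sketch of a lookup call inside a transform (the lookup-file label "Customers" and the fields are hypothetical; lookup_local() is the partition-local variant):

  out :: reformat(in) =
  begin
    // fetch the record whose key matches in.cust_id, then read its name field
    out.cust_name :: lookup("Customers", in.cust_id).name;
  end;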

63) How can you force the optimizer to use a particular index?

Ans:

Use hints (/*+ <hint> */); they act as directives to the optimizer. For example, an Oracle index hint looks like SELECT /*+ INDEX(emp emp_idx) */ ename FROM emp (the table and index names here are illustrative).

64) What is Rollup parameter?

Ans:

The Rollup component assists users in grouping records on certain field values. It is a multi-stage transform function, comprising initialize, rollup (computation), and finalize stages.

65) What are local and formal parameters?

Ans:

Both are graph-level parameters. For a local parameter, you need to initialize the value at declaration time, whereas a formal parameter does not need to be initialized, as it prompts for a value at the time of running the graph.

66) Define a local lookup?

Ans:

A local lookup file has records which can be placed in main memory; a transform function is then used to retrieve records much faster than retrieving them from disk.

67) How can you process data, and what are the fundamentals of this approach?

Ans:

There are certain activities which require the collection of data, and processing largely depends on the same in many cases. The fact is that data needs to be stored and analyzed before it is processed.

This task depends on some main factors; they are:

  1. Analysis
  2. Collection of data
  3. Presentation
  4. Final outcomes
  5. Sorting

Data sorting: It is not always essential that data remains in a well-defined series; it is often a random collection of objects. Sorting is nothing but arranging the data items in preferred sets or sequences.

68) What do you mean by primary keys and foreign keys?

Ans:

In an RDBMS, the association between two tables is represented as a primary key and foreign key relationship. The primary key table is the parent table and the foreign key table is the child table.

69) Explain an outer join?

Ans:

An outer join is used when one wants to select all the records from a port, whether or not they fulfill the join criteria.

70) How can you improve the performance of the graph?

Ans:

To improve the performance of the graph:

  • Use a limited number of components in a particular phase.
  • Minimize the number of sort components.
  • For large datasets, avoid broadcasting them as partitioned data.
  • Use only the necessary fields in the sort, reformat, and join components.
  • Use the optimal value of max-core for the sort and join components.
  • Avoid repartitioning of data without reason.
  • Use phasing/flow buffers in the case of merged or sorted joins.

71) Mention the kinds of layouts that Ab Initio supports?

Ans:

Ab Initio supports serial and parallel layouts, and a graph can have both at the same time. The parallel layout depends on the degree of data parallelism: if the multifile system is 4-way parallel, a component in the graph can run 4-way parallel, provided its layout is defined to match the degree of parallelism.

72) State the difference between check-point and phase?

Ans:

The difference between them is:

CHECKPOINT:

  1. When a graph fails in the middle of a run, a recovery point called a checkpoint is created.
  2. The state of the process is preserved at the checkpoint.
  3. Data from the checkpoint is retrieved and execution continues from there after correction.

PHASE:

  1. All the phases run one by one.
  2. The intermediate files are deleted once a phase completes.
  3. If a graph is created with phases, each phase is assigned some portion of memory, one after another.

73) What are the layouts that ab initio supports?

Ans:

Ab Initio supports two kinds of layouts.

  1. Serial layout
  2. Parallel layout

A graph can have both layouts at the same time, but the parallel layout depends on the degree of data parallelism. The layout is defined to match the degree of parallelism; for example, in a 4-way parallel multifile system, a component in the graph can run 4-way parallel.

74) What is the difference between roll-up and scan?

Ans:

Using Scan, we can create cumulative summary records (one running summary per input record), whereas using Rollup we get only one summary record per key group.

75) Difference between conventional loading and direct loading?

Ans:

  1. Conventional load: The table constraints are checked against the data before the data is loaded.
  2. Direct load: Used for fast loading. The data is loaded first, irrespective of the table constraints, and checked later; unmatched or bad data is not indexed.

76) Explain the procedure of running the graph without GDE (Graphical Development Environment)?

Ans:

In the GDE: Run ==> Deploy ==> As script. This creates a .bat file in your host directory.

Now, run the .bat file from the command prompt.

77) What are the continuous or continuously enabled components in Ab Initio?

Ans:

Continuous components are used to create graphs that produce usable output while running continuously, e.g., Continuous Rollup, Continuous Update, and Batch Subscribe.
