Informatica training in Hyderabad

Get Informatica Online Training in Hyderabad from the Leading Institute


Informatica software is specially designed for data warehousing projects. In the business world, almost every organization depends on Informatica because it helps maintain and record data accurately and appropriately. The software offers a wide variety of ETL capabilities that enable companies to progress and help them function effectively, and it presents information in a powerful style. In recent years, Informatica has also opened up plenty of career scope and opportunities, and many candidates are considering building a career around this software.

Time To Build Your Career:

Therefore, if you are looking forward to building your career in this field, it is wise to acquire Informatica online training in Hyderabad. Hyderabad is one of the hot destinations for this software training, and many institutes offer effective training in it. However, while looking for the ideal institute, you should come to none other than us. We are one of the leading institutes, offering some of the best training techniques across the entire course. Whether you are a total novice or already have some knowledge of the software, we will guide and train you in the best way, which in turn will effectively shape your career.

Learning The Technological Features:

Informatica is a course that requires learning the technological features of the product. At our institute, experienced and trained professionals give you in-depth knowledge of all the technological features. In addition, with us you will learn many new aspects of the product that are not taught at most institutes. We also include management skills in our program, focused entirely on the use of data related to this field. Therefore, you will get the opportunity to learn a lot by enrolling in our course.

Learning With Ease:

Even if you are completely new to the product, we assure you that learning Informatica can be easy and fun. We adopt the best tools and techniques to provide the Informatica training in Hyderabad Ameerpet required to know the software in detail. We believe that if candidates are willing to learn, they will find the learning process easy and enjoyable. The significance and elements of the software can be recognized quickly, while the behaviour of the application can be understood through the exercise applications that we provide. In addition, we provide diverse course materials that will help you get through the entire process with ease.

Research The Actual Life Problems:

As you get started with Informatica training from the best Informatica training institutes in Hyderabad, it will help you research the various problems that are confronted in real projects. Consequently, it will enable you to deal with data and records quickly. We also provide the necessary documentation on Informatica for appropriate practice. We even prepare students for the certification examination regardless of who you are and what you want to do. In any case, you can be assured that the documentation will help improve your efficiency so that you can obtain the preferred outcomes for managing the everyday needs of the customers.

Some of the Common Highlights:

Now that you know the importance of Informatica training in Hyderabad in today’s business scenario, you should find out what we have to offer you through training. Some of our highlights include:


  • In-depth course material and real-time scenarios
  • Classes and demos at the convenience of students
  • 24/7 technical support and assistance
  • Informatica online training sessions conducted by highly qualified and experienced trainers
  • Every topic is covered with real-time solutions
  • Every session is recorded so that it can be replayed later
  • Certification-oriented training with a 100% pass guarantee
  • Placement support for our candidates through multiple consultancies in India, Australia, the USA, and the UK
  • Complete support while attending interviews


A Structured Way Of Learning:

Informatica training teaches you how to integrate different data sources and convert raw data into useful information. Our training program in Hyderabad aims to deliver a structured way of learning for candidates with different profiles. The course curriculum covers configuring data governance tools, matching the various servers and workflows, and so on. Apart from that, each session offers a brief explanation of the different modules along with doubt clarification. We put in extra effort to make sure every candidate understands the concepts clearly, adhering to the requirements of the industry.

Our Informatica online training course also aims to equip trainees with adequate skills so that they can become Informatica Certified Developers on completion of the course. Our courses are led by live instructors who hold interactive sessions with candidates to ensure that they thoroughly understand the concepts. Needless to say, our approach is practical, which makes it easier for candidates to grasp information and knowledge quickly. Thus, they will learn how to transform their data in multiple ways like never before.

Quality Training Course Topics:

We aim to offer quality training at our institute across a wide variety of course topics. Some of the common data warehousing topics covered by our professionals are:


  • Dimensional tables
  • Metadata
  • Types of fact tables
  • Types of dimension tables
  • Online analytical processing (OLAP)
  • Online transaction processing (OLTP)
  • OLTP vs OLAP
  • Different methodologies of DWH
  • Benefits of OLAP, etc.



Apart from these, there are many sub-topics that we cover. Therefore, if you join our institution, you will certainly learn Informatica in the best way, which will give you a bright and solid career. Studying Informatica will give you an advantage in every way because it will deliver great outcomes in the years to come.

It is high time to get in touch with us directly or through email, and we will be more than happy to guide you through the training procedure. You should not miss out on the opportunity to shape your career with our expert training and guidance on Informatica.


Informatica Notes:



Key Points:

Data Warehouse – Ralph Kimball:

  1. Definition: A Data Warehouse is a Relational Database Management System (RDBMS) which is especially designed for analyzing the trend of the business and making decisions to achieve the business goals, but not for business transactional processing.
  2. A DW House is an integrated database which consolidates data from multiple OLTP databases.


Diagram 1

  1. A DW House is a historical database.
  2. A DW House is a read-only database (Select queries).
  3. A DW House is a decision support system (DSS).


*What do I need to build a Data Warehouse?

To build a DW House we need the following:

1. Relational database

2. Database that supports enormous storage capacity

(Billions of rows and Tera bytes of data)

3. Database that supports parallel reading. (Select)

4. Database that supports unlimited scalability (Extension)

The data should be expandable: we may start the DWH at one size and, as the business grows, increase it by terabytes.

5. Database that supports High availability.

Even when a software or hardware component is down, 100% of the data remains available to users, i.e., high availability.

Types of ETL:

If we want to consolidate the data, we need to implement the ETL process.

Data Acquisition: is a combination of E + T + L

(Extraction + Transformation + Loading)

Data Acquisition: It is the process of extracting data from multiple operational source systems, transforming the data into the desired business format, and delivering the data to the target or destination systems.
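As a rough plain-SQL sketch of E + T + L combined in one statement (the table and column names here are invented purely for illustration, not taken from any project):

-- Extraction (E): read from the operational/OLTP source table
-- Transformation (T): reformat the data into the desired business format
-- Loading (L): insert into the destination/warehouse table
INSERT INTO dwh_customer (customer_key, customer_name, country_code)
SELECT customer_id,
       UPPER(TRIM(customer_name)),   -- simple cleansing/formatting
       NVL(country_code, 'UNK')      -- default for missing values
FROM   oltp_customer;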

There are 2 types of ETL:

  1. Code-based ETL:
  • Our primary role is developing ETL applications.
  • We develop the ETL application using a programming language such as SQL or PL/SQL.
  • A typical interview question: how many applications are developed (designed) in your current project? For example, 50 applications.
  • The applications are consolidated from multiple OLTPs.

Examples of code-based ETL: SAS, PL/SQL, and the Teradata ETL utilities.

Code-based ETL has limitations in the market.

2. GUI-based ETL:

* GUI-based ETL has no such limitations. Even if part of the source is in Oracle and part of the source is in Teradata, a GUI-based ETL tool can handle it: it can connect to any type of source system.

ETL applications are developed using a simple graphical user interface with point-and-click techniques.

A common interview question: Informatica is a costly tool, so why are you using it?

  • GUI-based ETL tools can connect to any type of source to pull the data.
  • Debugging is faster.



Other GUI-based ETL tools include DataStage.

OLTP System = OLTP Application + OLTP Database. Systems here means business domains, and we have many domains (Retail, etc.).



A classroom analogy: preparing food.

  • Collect the raw materials (go to the vegetable market)
  • Cook in the kitchen, but do not eat in the kitchen
  • Go to the dining table to eat

Every application is configured to connect to an OLTP database.

Data Warehousing: is the process of developing a Data Warehouse.

Data Warehouse: is a storage godown, a container. It is a concept, not a technology.

OLTP can store only about 1 week, 1 month or 6 months of transactions, because it is meant for transaction processing. For longer history we design a separate database called the DWH, where we preserve the transaction history.


Best examples of OLTP:

  • An ATM withdrawal
  • A Tatkal ticket booking – each is one transaction

Every transaction is stored in the database.

For example:

ABC Corporation is a company (sales and retail). In ABC Corporation we have many branches.


Diagram 2


  • Every branch/company has its own separate database.

ELT (Extraction, Loading, Transformation) is an advanced concept.

Every day's transactions are eventually stored in the DW House, while enterprise data is scattered geographically across multiple locations.

Goal of OLTP:

Complete the transaction within seconds. Transactions are carried out in OLTP. Only about 1 week, 1 month or 1 year of transactions are stored in OLTP systems, which is not useful for analysis.

  • The last 10 or 20 years of transactions are preserved in the Data Warehouse.
  • Without a database we cannot design a Data Warehouse.
  • Informatica is an ETL tool; it performs extraction, transformation and loading.
  • A Data Warehouse is a large database holding very massive data, called in short a VLDB (Very Large Database).
  • Databases:

Oracle: sequential reading

Teradata: parallel reading

Sequential reading: queries the data record by record.

*When we develop a DW House, the database evaluation is most important.

Not every database is recommended for a DW House.

Low-range database: storage capacity in megabytes.

Mid-range database: storage capacity in gigabytes.

Ex: Oracle, SQL Server, Sybase

High-range database: storage capacity in terabytes.

Ex: Teradata, Netezza

  • Netezza is a powerful database, about 10 times more powerful than Teradata. It is an IBM product; retrieval is faster, and the product architecture makes it roughly 10 times faster.
  • Market-wise, Teradata occupies about a 75% share, being the long-established product in this space.
  • Every database's architecture is different; SQL is common to all of them, but with some differences.
  • In developing a DW House our role is:

ETL developer

ETL tester

  • SQL is a prerequisite for ETL development and testing.
  • Informatica is 90% GUI, 10% coding.
  • ETL is of 2 types:
  1. Code-based ETL: SAS
  2. GUI-based ETL: Informatica

Code-based ETL:

Developing the ETL application using programming languages (for example SAS, or languages such as C and C++) is called code-based ETL.

GUI-based ETL:

Developing the ETL application using point-and-click techniques.

*How many applications are there in your project?

For example, 70 applications in my project.

Which applications did we develop? Our role is developing ETL applications.

  • The DW House is a concept; it is consolidated from multiple OLTPs.
  • A Data Warehouse is a data centre, an integrated database, a read-only database.
  • The DW House is for analyzing the business and taking the right decisions at the right time, so the DW House is a decision support system.
  • Analysis in a database means: Select queries.
  • Transactions involve: Insert, Update and Delete (Delete is used only sometimes).
  • The DW House involves: Select.

i.e., it only retrieves data; it does not perform any DML operations.
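To make the contrast concrete, here is a small sketch with invented table names: an OLTP system runs short DML transactions, while the DW House runs read-only analytical queries.

-- OLTP side: a short transaction that changes data
INSERT INTO oltp_orders (order_id, customer_id, amount, order_date)
VALUES (1001, 42, 250.00, SYSDATE);
COMMIT;

-- DW House side: read-only analysis over history (no DML)
SELECT customer_id,
       SUM(amount) AS total_sales,
       COUNT(*)    AS order_count
FROM   dwh_sales
GROUP  BY customer_id
ORDER  BY total_sales DESC;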

OLTP is especially designed for transactional processing. It contains only about 1 week or 1 month of transactions, which is not effective for analysis.

  • Where do we preserve the transactional history? In the Data Warehouse.
  • The DW House is especially designed for analysis and decision making.
  • What is the need for a DW House? Why is OLTP not a DSS?

The OLTP database is volatile: it keeps only current data; it does not preserve historical data.

For ex:


Diagram 3


Primary key (PK):

A primary key is a constraint. It uniquely identifies every row (for example, every customer). PKs are also called natural keys.

  • In the DWH, data is non-volatile, i.e., static (it does not change), and history is preserved.
  • For every update in OLTP, a new record is inserted in the Data Warehouse.

Artificial or surrogate keys:

A surrogate key is a generated sequence number. Surrogate keys are not natural keys.
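A minimal sketch of the idea in SQL (the sequence and table names are hypothetical): a sequence generates the surrogate keys, and each OLTP update is captured as a new warehouse row rather than an overwrite.

-- Surrogate key generator for the customer dimension
CREATE SEQUENCE customer_dim_seq START WITH 1 INCREMENT BY 1;

-- When a customer's details change in OLTP, insert a NEW row
-- into the dimension instead of updating the old one
INSERT INTO dim_customer (customer_key, customer_id, customer_name, city, load_date)
VALUES (customer_dim_seq.NEXTVAL,  -- surrogate (artificial) key
        42,                        -- natural key coming from OLTP
        'RK REDDY',
        'HYDERABAD',
        SYSDATE);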

  • In the DW House we maintain current data as well as previous data.
  • OLTP has normalized data (no redundancy).
  • OLTP is designed for clerical operations.
  • The DW House has denormalized data (see the sketch below).

The DW House is designed for managerial operations.

DW House databases are non-volatile databases.
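A small illustration of normalized versus denormalized structures, again with made-up table names:

-- OLTP (normalized): customer and city kept in separate related tables
CREATE TABLE oltp_city     (city_id NUMBER PRIMARY KEY, city_name VARCHAR2(30));
CREATE TABLE oltp_customer (customer_id   NUMBER PRIMARY KEY,
                            customer_name VARCHAR2(30),
                            city_id       NUMBER REFERENCES oltp_city(city_id));

-- DWH (denormalized): the dimension repeats the city name on every row
CREATE TABLE dim_customer  (customer_key  NUMBER PRIMARY KEY,
                            customer_id   NUMBER,
                            customer_name VARCHAR2(30),
                            city_name     VARCHAR2(30));  -- repeated value, no join needed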

Disadvantage of the DWH:

It occupies a large amount of disk space.

  • Informatica is a data integration tool and a data migration tool. It is used for building the DW House.
  • Goal of OLTP: keep/complete the transaction within seconds.
  • Even if we have business in only one place, we still recommend a DW House.

ETL – Client/Server Process:

Def: ETL Client:

  1. A client is a graphical user interface which allows you to develop ETL applications.
  2. The client provides a design-time environment, or development environment, to design ETL applications.

ETL Applications:

  1. An ETL application logically defines extraction, transformation and loading.
  2. An application is made up of 3 metadata components:

a) Source Definition (E)

b) Business Rule (T)

c) Target Definition (L)

a) Source:

It defines extraction from the operational systems or OLTP database.

b) Business Rule:

A set of instructions that tells how to process the data into the desired business format.

c) Target:

It defines data loading into a destination system (the DW House).


Metadata:

i. Metadata defines the data and the process.

ii. Metadata is the core component used to develop ETL applications.

ETL Repository:

It is the brain of an ETL system; it stores the metadata, i.e., the ETL objects.

ETL Server:

i. It is the engine that performs extraction, transformation and loading.

ii. An ETL server provides the run-time environment.

iii. It executes ETL applications.

Without metadata we cannot develop an ETL application.

An OLTP system is an operational system; the target is where we preserve (load) the data.

  • Every ETL developer works on designing ETL applications.

The client is a graphical user interface which provides the design-time environment.

  • An application logically defines extraction, transformation and loading.

The source is required for extraction; the business rule tells into what format to translate the data; the target is required for loading (the DW House).

  • Another ETL tool is BODS, which is related to SAP.
  • The business rule tells what type of desired data we want. To develop a business rule we need at least minimal knowledge of SQL.
  • For Ex:


Diagram 4


As per the business rule, the data is transformed into the desired business format.

  • An ETL application logically defines extraction, transformation and loading.
  • Minimum components of an ETL application: source, business rule, target.
  • The business rule is the area where we develop complex logic.
  • Every ETL developer needs the client.

Meta Data:

Without metadata we can’t develop ETL Application.

For Example:

Diagram: 5

Data Structure or Data Specification or Data Definition is called Meta data.

Diagram: 6

Metadata describes data. Metadata is the core component used to design an ETL application.

To develop an ETL application we need these components:

Source metadata component

Business rule

Target metadata component

  • Unknown characters for Gender are mapped to 'UNK'; the valid values are 'M', 'F' and 'UNK'.
  • Decoding the Gender column: 0 – M, 1 – F, anything else – UNK.

Business Rule:

DECODE(Gender, 0, 'M', 1, 'F', 'UNK')
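For comparison, the same rule written in plain SQL (the table name is only an illustration):

-- Equivalent of DECODE(Gender, 0, 'M', 1, 'F', 'UNK')
SELECT customer_id,
       CASE gender
            WHEN 0 THEN 'M'
            WHEN 1 THEN 'F'
            ELSE 'UNK'
       END AS gender_code
FROM   oltp_customer;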

  • In DWH primary keys are called surrogate key. Surrogate keys are artificial keys.
  • Client is GUI here we developed ETL Application.
  • Applications stored in Repository.
  • Application Executes in Server.

Client: Provides design time Environment.

Server: Provides run time Environment.

  • Data Migration from source to Target only at runtime.
  • As per Tool terminology ETL application means Mapping.

1 Mapping = 1 ETL Application. (In Informatica)

1 Job = 1 ETL Application (In Data Stage Terminology)

1 Graph = 1 ETL Application (in Ab Initio)

  • Oracle data types are automatically converted into Informatica data types.
  • Informatica uses point-and-click and drag-and-drop techniques.
  • At design time we need only metadata; the design-time environment is provided by the client.


The client is where we develop the ETL application.

Ports mean columns.

At design time, data never moves. At the time of designing the ETL application we need the metadata of the source, the metadata of the target, and sometimes a business rule.

  • Data moves only at run time.


Diagram: 7


Explained with a human-body analogy:

Client: only designing.

Repository: the brain of the system; whatever application we develop is stored in the repository.

Server: only execution.


Diagram: 8


In the client we design the ETL application, which means we design the metadata.


Data Mart:

  • Interview question: does your current project implement a DW House or a Data Mart?

DW House: is a godown that contains enterprise information (information from all branches or all departments). All the departments' information is consolidated in the DW House. The right terminology for the DW House is EDW (Enterprise Data Warehouse).

Data Mart: is also a database; it contains department-specific information.

  • The DW House is designed for decision making; the Data Mart is also designed for decision making.
  • The DW House is designed for top-level (enterprise-level) decision making.
  • The Data Mart is for department-level decision making.
  • A Data Mart is a subset of the Enterprise DW House (EDW).

Data Marts are of 2 types:

  1. Dependent Data Mart
  2. Independent Data Mart
  • A Data Mart is a concept.
  • The fathers of the DW House are Ralph Kimball and Bill Inmon.
  • Inmon was the first person to introduce the DW House, in 1987 at SBC (Southern Bell Corporation).
  • Ralph Kimball's approach is more popular for developing DW Houses.

Data Mart and Types of Data Marts:

  1. A Data Mart is a subset of an Enterprise Data Warehouse.
  2. A Data Mart is a subject-oriented database that supports the business needs of a department-specific user community.
  3. Data Marts are also known as HPQS (High Performance Query Structures).

There are 2 types of Data Marts,

  1. Dependent Data Mart
  2. Independent Data Mart

Who will design the tables of the DW House?

  1. The data modeller or database architect.

ERwin (a data modelling tool) is a product of CA (Computer Associates).

  • A schema is nothing but a data model, and a data model is nothing but a collection of tables (these tables are called target tables).
  • Using the ERwin tool, the database architect designs the tables.
  • Some OLTPs are completely normalized.
  • For each ETL specification we design one ETL application.
  • The data modeller designs the schemas as per the requirements.
  • Schemas are designed at the enterprise level [top level].
  • A Data Mart implemented from the EDWH is called a dependent Data Mart.
  • Top-Down [Bill Inmon]

Diagram: 9

Bottom – Up [Ralph Kimball]

Diagram: 10

Top-Down Data Warehousing Approach [Bill Inmon]:

According to Inmon, design an Enterprise Data Warehouse (EDW) first; from the EDW, design department-specific, subject-oriented schemas called Data Marts.

Diagram: 11

Dependent Data Mart:

The Data Mart's development depends on the enterprise-level schema.

Target tables means the tables of the EDW House or the tables of the Data Mart.

Data Marts are further divided into:

  1. Physical Data Mart
  2. Logical Data Mart

Bottom-Up Data Warehousing Approach [Ralph Kimball]:

According to Kimball, design the department-specific schemas, known as Data Marts, first, and then integrate the Data Marts into an enterprise-level schema [EDW].

Diagram: 12

Independent Data Marts:

In a bottom-up approach, Data Mart development is independent of an EDW House.

Difference between a Data Warehouse (EDW) and a Data Mart:

EDW:
  1. Stores enterprise-level data.
  2. Designed for enterprise-level decision making by top management (CEO).

Data Mart:
  1. Stores department-specific data.
  2. Designed for department-level decision making by middle management (head of the department).


  • Target tables are designed by data modellers.
  • The ETL developer loads data into the target tables.

Types of Loading:

Loading is the process of inserting the data into the destination tables.

There are 2 types of the data loads,

  1. Initial Load or full Load
  2. Incremental Load or Delta Load

Initial load means the first-time load.

If there are 200 schemas, we develop 200 ETL applications: for every schema we develop one ETL application.

Extraction is defined in terms of a [Select query].

Loading is defined in terms of an [Insert].

  • Loading the entire source data into the target table is called a full load.

The first-time load is the initial (full) load; the second, third or any later load is an incremental load.

  • OLTP schemas are the base for designing the target (DW House) schemas.

Diagram: 13

This is the full load.

After 1 week or 1 month, 100 new customers have joined in OLTP.

Diagram: 14

We load only those 100 new records into the DW House; this is the incremental load.
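A rough SQL sketch of the two load types, using invented table and column names:

-- Initial (full) load: move the entire source into the empty target
INSERT INTO dwh_customer (customer_id, customer_name, created_date)
SELECT customer_id, customer_name, created_date
FROM   oltp_customer;

-- Incremental (delta) load on a later run: pick up only the rows added
-- since the previous load, e.g., the 100 customers who joined afterwards
INSERT INTO dwh_customer (customer_id, customer_name, created_date)
SELECT customer_id, customer_name, created_date
FROM   oltp_customer
WHERE  created_date > (SELECT MAX(created_date) FROM dwh_customer);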

  • Which schema are you using in your project?

Star schema or snowflake schema (a small sketch follows below).
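A minimal, hypothetical star-schema sketch (one fact table surrounded by dimension tables; all names are invented). In a snowflake schema, the dimension tables themselves would be further normalized into sub-dimensions.

-- Dimension tables
CREATE TABLE dim_product  (product_key  NUMBER PRIMARY KEY, product_name  VARCHAR2(30));
CREATE TABLE dim_customer (customer_key NUMBER PRIMARY KEY, customer_name VARCHAR2(30));

-- Fact table referencing the dimensions (the centre of the star)
CREATE TABLE fact_sales (
    sale_id      NUMBER PRIMARY KEY,
    product_key  NUMBER REFERENCES dim_product(product_key),
    customer_key NUMBER REFERENCES dim_customer(customer_key),
    quantity     NUMBER,
    amount       NUMBER(10,2));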

Informatica Power Centre 9.5

  • A data integration tool which combines data from multiple OLTP source systems, transforms the data into a homogeneous format, and delivers the data throughout the enterprise at any speed.
  • It is a GUI-based ETL product from Informatica Corporation, which was founded in 1993 in Redwood City, California.
  • There are many products from Informatica Corporation, for example:
  1. Informatica Analyzer
  2. Informatica Life Cycle Management
  3. Informatica Master Data Management

Informatica has many products; Informatica PowerCenter is one of them.

Using Informatica PowerCenter we do the extraction, transformation and loading.

  • Informatica Power Centre Architecture:

The following Components get installed,

  1. Power Centre Clients
  2. Power Centre Repository
  3. Power Centre Domain
  4. Power Centre Repository Service (PCRS)
  5. Power Centre Integration Service (PCIS)
  6. Informatica Administrator

Mapping is nothing but ETL Application.

  1. Power Centre Clients:

There are 4 client applications or client components that get installed:

  1. Designer
  2. Workflow Manager
  3. Workflow Monitor
  4. Repository Manager

Every mapping is made up of source metadata, target metadata, and a business rule.

  1. Designer: It is a GUI-based client component which allows you to design ETL applications called mappings.
  2. A mapping logically defines extraction, transformation and loading.
  3. A mapping is made up of the following metadata components:
  • Source Definition (E): which is required for extraction
  • Business Rule (T)
  • Target Definition (L)

Diagram: 15

Without metadata we cannot build a mapping.


Diagram: 16

Session: a session is a task, a PowerCenter object. One session can execute only one mapping.

  • Why are sessions required?

A session is required to provide the run-time properties needed to execute a mapping, and it defines the connections to the source DB and the target DB.

In the session we provide:

Normal load or bulk load

Source database connection

Target database connection

  • To execute a mapping we create a session; the run-time information is provided by the session.

Diagram: 17

  • Hundreds of sessions can be grouped into one workflow.
  • In a sequential process, the sessions execute one after another.

Workflow: a set of instructions that tells how and when to execute the sessions.

  • This depends on the data-loading specification.

Q. I have 2 sessions, S1 and S2. S2 should execute only if S1 has completed successfully 10 times.

Ans. Use a sequential batch process, because there is a dependency between S1 and S2.

  • In my DWH there are 500 tables: 450 tables require a daily load and 50 tables require a weekly load. (Only when the daily load has succeeded 7 times is the weekly load done.) A link condition on a workflow variable enforces this:

MOD($$CNT, 7) = 0

  • Extraction: Reading
  • Loading: Writing


Diagram: 18


  • We run (start) the workflow manually or on a schedule.
  • Session succeeded means the data was loaded through the mapping successfully.
  • To execute 500 mappings we develop 500 sessions, group those 500 sessions, and connect them to workflows.
  • Processing is done based on the dependencies (sequential or parallel batches).
  1. Work flow Manager:

It is a GUI-based client which allows you to create the following ETL objects:

  1. Session
  2. Workflow
  3. Scheduler

1. Session: a session is a task that executes a mapping.

i. A session is created for each mapping.

ii. A session is created to provide the run-time properties.

A session is a set of instructions that tells the ETL server to move the data from source to destination.

Work flow: is a set of instructions that tells how to run the session tasks and when to run the session tasks.

How to run means: Sequential or parallel

When to run means: Time

A workflow can be designed with 2 types of batch processes:

  1. Sequential batch process
  2. Parallel or Concurrent batch process

1. Sequential batch process:

i. The workflow runs the sessions one after another.

ii. It is recommended when there is a dependency between data loads.

Diagram: 19

2. Concurrent or parallel batch process:

i. Work flow runs all the session tasks at once.

ii. It is recommended to design when there is no dependency between data loads.

Diagram: 20

3. ETL Objects:

1. Mapping [Designer]

2. Session [Workflow Manager]

3. Work flow [Workflow Manager]

* Without a session we cannot run a mapping.

A session holds the run-time properties [source DB connection, target DB connection].

Top object in the development hierarchy: the workflow; without a workflow we cannot run sessions.

Summary point: all objects are stored in the repository.

There are 3 ETL objects:

  1. Mapping [Designer client]
  2. Session [Workflow Manager]
  3. Workflow [Workflow Manager]
  • A workflow runs one or more sessions at a time.

Normal Load:

Load the data from source to target record by record; this is called a normal load.

Bulk Load:

Load the data from source to target in packets: the source data is clubbed into packets and those packets are loaded into the target.

A session holds the run-time properties, i.e., the load type (normal or bulk load) and the connections (source DB connection, target DB connection).

A batch is nothing but a group of sessions assigned to one workflow.

Target tables means destination tables.

A session is a set of instructions that moves the data from source to target.

Run the sessions by using 2 methods,

  1. Automation (System do this)
  2. Run on demand (Manually run sessions)

Summary Point:

Note: The top object in the development hierarchy is workflow.

Scheduling Work flow:

It is an automated process that runs the workflow at a given date and time.

The following are the schedulers that can schedule the work flows,

  1. CRON (a UNIX operating-system scheduling tool)
  2. Autosys (used only in very rare situations)
  3. Control-M
  4. WLM
  5. Tivoli
  6. Tidal

These are third-party schedulers.

  • If Informatica has its own scheduler, why use a third-party scheduler?

Ans. The client uses various applications (mainframes, Oracle Apps, etc., which may use a tool such as Tivoli). Integrating the different applications and scheduling them together is much easier with a third-party scheduler.

Workflow Monitor:

  1. It is a GUI-based client application which allows you to monitor the ETL objects running on the ETL server.
  2. It collects run-time statistics such as:
  • Number of records extracted
  • Number of records loaded
  • Number of records rejected
  • The session log (which can be fetched from here)
  • Throughput
  • Complete information can be accessed from the Workflow Monitor.
  • For every session, one log file is created.
  • Could you tell what type of errors you got in your project?

Ans. Fatal or non-fatal errors.

  • Types of errors in Informatica?

When a fatal error occurs, the session stops immediately and returns the status Failed.

  • When a non-fatal error occurs, the session does not stop immediately.


Examples: primary key violation errors, and precision errors such as:

Source: Empno NUMBER(5)

Target: Empno NUMBER(3)

  • Here we get a precision error.
  • Sometimes the session status is Success but 0 records are loaded; in that case, first check the precision.
  • Objects are developed using the Designer and the Workflow Manager.
  • Throughput is nothing but efficiency (the rate of efficiency): the number of records loaded per second, from which we can estimate how long a load will take.
  • Advantage of throughput:

You can estimate the performance.

Throughput is expressed as rows/sec and bytes/sec (bytes/sec depends on rows/sec).


Throughput: It defines the rate at which records are extracted per second and loaded per second.

Throughput can be expressed in 2 variants:

  1. Rows per second
  2. Bytes per second

The bytes-per-second figure depends on the rows per second, the data types, and the number of columns; for example, at 1,000 rows/sec with an average row size of 200 bytes, the throughput is about 200,000 bytes/sec.

Using throughput we estimate the performance; throughput is evaluated to determine the performance statistics.

  • As an ETL developer, you are authorised to use these 3 client components:

Designer

Workflow Manager

Workflow Monitor

  • Example: the source has 10 records but only 8 records are loaded into the target.

Troubleshooting is done using the Workflow Monitor.

  • Could you please tell me what types of errors you faced in your project?

Ans: Common errors:

Fatal and non-fatal errors

Error threshold

Database errors

By default the session is not stopped automatically on non-fatal errors.

Stop on Errors = 0 (does not stop the session): unlimited non-fatal errors.

Stop on Errors = 5 sets the error threshold to 5.

Error log types:

We can log the errors to a flat file or to relational tables; the error code and error message are captured there.



Overflow Rows:

Source: Empno NUMBER(5)

Target: Empno NUMBER(3)

Records where a numeric value overflows the target precision will be rejected, but records that only overflow a character width will still be loaded.
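A small, hypothetical illustration of why the numeric case is rejected: a 5-digit value simply cannot fit into a NUMBER(3) column.

CREATE TABLE src_emp (empno NUMBER(5), ename VARCHAR2(10));
CREATE TABLE tgt_emp (empno NUMBER(3), ename VARCHAR2(10));

INSERT INTO src_emp VALUES (70345, 'RAVI');

-- A direct load of this row fails with a precision error, which is
-- why such rows end up rejected during the ETL run:
INSERT INTO tgt_emp (empno, ename)
SELECT empno, ename FROM src_emp;   -- ORA-01438: value larger than specified precision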

ETL development Processing Steps:

Step1: Create Source definition

Step2: Create Target definition

Step3: Design mapping with or without business rule.

Step4: Create session for each mapping

Step5: Design work flow

Step6: Run work flow

Step7: Monitor the ETL process (check whether it succeeded or failed)


Note: Steps 1, 2 and 3 are implemented using the Designer client.

Note: Steps 4, 5 and 6 are implemented using the Workflow Manager client.

Note: Step 7 can be checked from the Workflow Monitor client.

These are the 5 objects needed to move data from source to target:

  1. Source
  2. Target
  3. Mapping
  4. Session
  5. Work flow

Summary Point:

The following are the 5 basic ETL objects required to move the data from source to destination:

  1. Source – required for extraction
  2. Target – required for loading
  3. Mapping – logically defines E, T and L
  4. Session – executes the mapping
  5. Workflow – groups the sessions, batching them either sequentially or in parallel, and runs them
  • Without a source and a target we cannot design a mapping.

We never use the Informatica scheduler; in real time we use third-party schedulers.

How do we organise or identify the ETL objects?

Ans: By creating folders.

We organise the ETL objects using folders.

Folders are usually created per developer name, and the folder is an important object.

In real time we (developers) cannot create folders.

  • Why are folders created only by administrators in real time?

Administrators create the folders and grant access to them.

Administrators give access permissions with the required permissions (read, write, execute).


Diagram: 21


We can view other developers' folders, but we do not get write and execute permissions on their objects.

Testers have read and execute permissions.

  • Through the version control feature (check in, check out) we can identify which user changed an object.
  • "Brain death": the repository may get crashed.
  • So on a regular (daily) basis we take a backup of the ETL objects from the repository.
  • We take the backup from the repository as .xml; XML is the metadata file format.
  • The developer's role involves:

Designer client

Workflow Manager (the major clients are the Designer and the Workflow Manager)

Workflow Monitor

  • Non- Power Centre Clients:

DOS prompt (command line)

Power Centre Clients:

Designer, Workflow Manager, work flow monitor, Repository manager.

iv. Repository Manager:

It is a GUI based administrative client which allows performing the following administrative tasks.

  1. Create, Edit and delete folders
  2. Assign users to access the folders with read, write and execute permissions.
  3. Backup and Restore repository objects



Backup:

It is the process of exporting repository objects in a metadata file format (.xml file).

There are 3 life-cycle environments in real time:

  1. Development: we export the objects to an xml file.
  2. Testing: those objects are tested.
  3. Production: once testing succeeds, the objects are moved to the production (real-time) environment.

One xml can contain a source, target, mapping, session, mapplet, or any other object.

Restore:

It is the process of importing objects from the metadata (.xml) file.

Code means metadata (source, target, mapping, session).

There are 4 PowerCenter clients, but we have permission only for the 3 GUI-based clients: Designer, Workflow Manager, Workflow Monitor.

  • The repository database stores only metadata objects.
  • Before installing Informatica we first install an RDBMS such as Oracle; after installing the RDBMS we install Informatica 9.5.
  • The repository resides on a relational database which consists of system tables, where our ETL objects are stored.

Power Centre Repository:

The repository is the brain of the ETL application; it stores the ETL objects.

A mapping is not saved in a single table; it is saved across multiple system tables of the repository.

  • Metadata is simply called the ETL code.
  • Current flavour used here: the Windows flavour of Informatica; the UNIX flavour of Informatica (with its schedulers) is what is available in real time.

Power Centre Repository:

  1. The repository resides on a relational database (for example, Oracle) and stores the metadata, ETL objects, or ETL code required to perform ETL.
  2. Repository data base consists of system tables where ETL Objects are saved.
  3. Repository is also known as data dictionary of ETL.

The following are the objects that gets saved in the repository,

  1. Source
  2. Target
  3. Mapping
  4. Session
  5. Work flow
  6. Scheduler
  7. Source and target data base Connections

There are many other objects as well (mapplets, mappings, user-defined functions, transformations, and so on).

  • The objects developed in the clients are stored in the repository.

Objects: source, target, session, mapping

Clients: Designer, Workflow Manager, Workflow Monitor

The execution log contains:

The number of records rejected and accepted.

  • The clients access only run-time information.
  • The run-time information is provided by the ETL server, stored in the repository, and accessed by the Workflow Monitor.
  • Run-time information: what the error is, and how many records were accepted and rejected.
  • The run-time information is created by the ETL server.
  • The repository consists of system tables (metadata).
  • The Repository Service (which manages the dialogue between the clients and the repository) is a server, but it is not the ETL server; it is responsible for saving metadata (objects) in the repository and for retrieving, updating and deleting that metadata.
  • This service does not perform ETL.
  • This architecture is important when we develop objects in the development environment.
  • Workflow Monitor: run-time statistics.
  • If the Repository Service is down (not running) we cannot develop (save) metadata.
  • Repository Manager: manages objects, folder assignments and permissions.
  • The Repository Service and its permission handling are internally developed in Java.

Power Centre Repository Service:

i. The PowerCenter client applications connect to the repository through the Repository Service.

ii. It is a set of processes that inserts, updates, deletes and retrieves metadata from the repository.

iii. It ensures the consistency of the metadata stored in the repository.


Development Environment:

Diagram: 22

This architecture is called development architecture.

Power Centre Integration Service:

  1. It is the ETL server that executes the ETL objects.
  2. It provides the run-time environment and performs extraction, transformation and loading.
  3. It reads metadata from the repository through the Repository Service.
  4. It stores run-time metadata, such as the number of records extracted, loaded and rejected, and the session log, in the repository through the Repository Service.
  5. An Integration Service consists of the following components:

i. Reader (basically for Extraction) [E]

ii. DTM (basically for Transformation) [T]

iii. Writer (basically for Loading) [L]

i. Reader: it extracts the data from the mapping sources.

ii. DTM (Data Transformation Manager): it processes the data according to the business rules that you configure in the mapping.

iii. Writer: it inserts the data into the mapping targets.


We need two run-time connections:

Source connection (reader connection)

Target connection (writer connection)

  • The run request is initiated from a client, and that client is the Workflow Manager.
  • The client and server communicate over the TCP/IP protocol.
  • As soon as the run request is initiated, the request goes to the repository.

Diagram: 23

The original objects are available in the repository.

A session instance is not an original object; it is an image of the original.

The Integration Service cannot connect directly to the repository.

Target DB means the warehouse.

Using the repository objects, the data always moves from source to target; it is with the help of these objects that we move the data.

  • The Integration Service needs the original objects.
  • Using the objects we move the data from source to target.

Integration Service:

The entire business rule, i.e., the entire ETL process, is carried out by the Integration Service.


Runtime Environment:

Diagram: 24

The Reader performs extraction, the Writer performs loading, and the transformations are processed in between (by the DTM).


Log Contains:

It records how many records were rejected, accepted and loaded.

The log contains information about the records; the log is created by the server, saved in the repository, and accessed by the Workflow Monitor.

  1. The Integration Service creates the session log, and it is saved in the repository through the Repository Service.
  2. The session log can be accessed from the Workflow Monitor client.

Prerequisites to develop an ETL process:

To move data from source to target we develop the 7 steps (objects) above; before developing these steps we set up some prerequisites.

Step 1: Set up the source and target databases:

  • The source database is an already existing database with already existing tables. Similarly, we need to set up the target DB. Here the target DB is built with Oracle, but we can work with any database; for example, the source could be Oracle and the target Teradata.
  • Here the source DB is Oracle and the target DB is also Oracle.
  • The source DB always exists already; similar to the source DB, we need to set up the target DB.

Diagram: 25

  1. EMP is the source table, which belongs to the Oracle Scott schema, from which the data is extracted.
  2. T_Employee is the target table, which belongs to the Oracle 'Batch7am' schema, into which the data is loaded.

Setup target Database:

  • Start: programs: Oracle: Application Development: SQL Plus
  • Log on the oracle with the following details

Diagram: 26


  • At the SQL prompt:

SQL> CREATE USER Batch7am IDENTIFIED BY target;

SQL> GRANT DBA TO Batch7am;

SQL> CONNECT Batch7am/target@ORCL;

SQL> SHOW USER
USER is "BATCH7AM"

SQL> CREATE TABLE T_Employee (Employee_ID NUMBER(4) PRIMARY KEY, Name VARCHAR2(10), Job VARCHAR2(10), Salary NUMBER(7,2), Deptno NUMBER(2));


  • Without the metadata of the source and the metadata of the target we cannot design a mapping or ETL application.
  • Common language: ODBC acts as an interpreter, an interface or middleware between the client and the DB.
  • The Designer is a client; it belongs to the GUI community.
  • Scott belongs to the RDBMS community.
  • Without ODBC connections we cannot create the metadata of the source and the metadata of the target.

Creation of ODBC Connections:

An ODBC connection is a middleware or interface that connects to various databases.

I. Source ODBC Connection:

DSN stands for Data Source Name, i.e., the connection name.

A User DSN is recommended for a standalone system; in a real environment we go for a System DSN.

  • Start: Settings: Control Panel: Administrative Tools: Data Sources (ODBC)
  • In ODBC select System DSN tab click on Add
  • Select the driver Oracle in oraDblog_home\ click on finish.

Oracle ODBC Driver Configuration:

Data Source Name: (any name)

Description: (optional)

TNS Service Name: ORCL (if the wrong TNS service name is given, the test connection returns an error)





  • Enter the following details:

Data Source Name: Scott_ODBC_ORACLE

TNS Service Name: ORCL

User ID: Scott

Click on Test Connection.

Enter the password 'tiger' and click OK.

The connection succeeds; click OK.

Target ODBC Connection:

  1. From the System DSN tab click on Add, select the driver 'Oracle in oradblog_home', and click on Finish.
  2. Enter the following details:

Oracle ODBC Driver Configuration:

Data Source Name: Batch7am_ODBC_Oracle

TNS Service Name: ORCL

User ID: Batch7am

Click on Test Connection.

Enter the password 'target' and click OK (the connection succeeds).

Click OK.

3rd Prerequisite:

Start up the PowerCenter services:

  1. Start: Settings: Control Panel: Administrative Tools: Services

Select Informatica 9.5.0, right-click, and click on Start.

  2. We can also start the services another way:

Start: Programs: Informatica 9.5.0: Server: Start Informatica Services




4th Prerequisite:

Create a folder.

Whatever objects we create are stored in our folder, and that folder is stored in the repository.

Creation of the Folder:

  1. A folder is created using the PowerCenter Repository Manager client.
  2. A folder is created to organise the ETL objects.
  3. Every client should connect to the Repository Service.
  4. Repositories: (Repository Service)

Nipuna_rep: this name was given at installation time.

The folder configuration is left at its defaults.

  1. Start: Programs: Informatica 9.5.0: Client: Power Centre Client: Power Centre Repository Manager.
  2. From the left window double-click on the repository service with the name nipuna_rep.
  3. Connect to the repository with the following details:

Username: Administrator

Password: Administrator

  4. Click on Connect.
  5. From the Folder menu click on Create.

Enter the folder name: Radha

Click OK.

ETL development Process Steps:

  1. Step 1: Create the source definition

i. A source definition is created using the Source Analyzer tool in the Designer client component.

ii. The Source Analyzer connects to the source database using ODBC.

  • The Designer belongs to the GUI side and the source DB belongs to the RDBMS side; to communicate with each other they need an ODBC connection.

Diagram: 27

Step 1 Procedure:

Start: Programs: Informatica 9.5.0: Client: Power Centre Client: Power Centre Designer. Every client should connect to the Repository Service.

  • In real time we have our own username and password to connect to the repository.
  • When we start the PowerCenter client, the Repository Service is running, but the client does not connect to the repository automatically; we connect to the repository ourselves.
  • The left-hand window is called the repository space.
  • The right-hand window is called the workspace.
  • The folder must be in open mode; only one folder can be active at a time.
  • How do we know which folder is active?

The active folder is shown in bold (folder in open mode), for example:

Batch 7 AM

  • Business Components
  • Sources
  • Targets
  • User-defined functions (sub-folders)

The sub-folders are created automatically; we do not create sub-folders, and we cannot delete them.

  1. From the left window double-click on the repository service with the name nipuna_Rep.
  2. Connect to the repository with the following details:

Username: Administrator

Password: Administrator

Click on Connect.

  3. From the left window select the folder.
  4. From the Tools menu select Source Analyzer.
  5. From the Sources menu click on Import from Database.
  6. Connect to the database with the following details:

ODBC data source: SCOTT_ODBC_ORACLE (Oracle in oraDblog_home)

Username: Scott

Owner name: Scott

Password: tiger

Click on Connect.

Select the table (EMP) and click OK.

  7. From the Repository menu click on 'Save'.

2. Step2: Create Target Definition.

i. A target definition is created using Target Designer tool in the designer client component.

ii. The target designer tool connects to the Target Database using ODBC.

Diagram: 28


Step 2 Procedure:

  1. From the Tools menu select Target Designer.
  2. From the Targets menu click on Import from Database.
  3. Connect to the target database with the following details:

ODBC Data Source: Batch7am_ODBC_Oracle

User Name: Batch7am

Owner Name: Batch7am

Password: target

Click on Connect.

Select the table T_Employee.

Click OK.

  4. From the Repository menu click on Save.

Sources are organized in the Sources sub-folder; targets are organized in the Targets sub-folder.

Step 3: Design a mapping without a business rule (flat mapping)

  1. A mapping without a business rule is known as a flat mapping or pass-through mapping.
  2. A mapping is created using the 'Mapping Designer tool' in the Designer client component.
  3. Every mapping is uniquely identified by its name, for example: m_flat_oracle.

In real time the mapping names are given in the ETL specification; we do not create the mapping names ourselves.

In ETL testing, testers test the naming standards.

Peer review:

Is done by another developer

Code review: is done by developer

  • Why is the source definition associated with a Source Qualifier?

Ans. Informatica has its own data types. Oracle data types are native data types. Native data types are translated into PowerCenter (Informatica) data types by the Source Qualifier.

  • In real time a source may have 200 columns; manually connecting the source to the target is difficult, which is why we use Autolink.
  • We have to design the mapping as it is done in real time.
  1. Every source definition is by default associated with a Source Qualifier.
  2. A Source Qualifier translates native data types into PowerCenter data types.

  • Why is a session needed?

Ans: A session executes the mapping; the session provides the run-time information.

Later we will see much more functionality in the Source Qualifier.

Step 3: Procedure:

  • From the Tools menu select Mapping Designer.
  • From the Mappings menu select Create.
  • Enter the mapping name (m_flat_mapping_oracle) and click OK.
  • From the left window expand the Sources sub-folder and drop the source definition (EMP) into the Mapping Designer workspace.
  • From the left window expand the Targets sub-folder and drop the target definition into the Mapping Designer workspace.
  • From the Source Qualifier, map (link) the columns using the drag-and-drop technique.
  • From the Repository menu click on Save.
  • ODBC connections are used only at development time; at run time we use the run-time connections (reader and writer connections) provided at the session level.
  • UNIX shell commands are executed by a Command task.

Command task: executes OS commands; the Integration Service is configured with the UNIX commands (shell scripts).

  • A session is a task that executes a mapping.
  • Following naming standards is expected, though not compulsory (mandatory).
  • Sessions:

s_m_flat: the original session.

Reader: a relational reader; since our database is Oracle, the reader is a relational reader.

Step 4: Procedure:

Create a session for each mapping:

i. A session is a task that executes a mapping.

ii. Reusable session tasks are created using the Task Developer tool in the Workflow Manager client component.

  • Reusable session tasks: sessions that we can use in any number of workflows.
  • Non-reusable session tasks: sessions that belong to a single workflow only.

  1. Access the PowerCenter Workflow Manager client.
  2. From the left window select the folder (Batch7am).
  3. From the Tools menu select 'Task Developer'.
  4. From the Tasks menu select Create.
  5. Select the task type 'Session'.

Enter the name (s_m_flat_mapping_oracle).

Click on Create.

  6. Select the mapping and click OK. Click on Done.

Every session is configured with reader and writer connections, because the reader reads the data from the source DB and the writer loads it into the destination (target) DB.

Configuring the session with reader and writer connections:

Creation of the reader connection: Procedure

  1. Open the PowerCenter Workflow Manager.

Native drivers are faster compared to third-party drivers.

Example: the Oracle native driver, developed by Oracle.

Third-party drivers: Microsoft drivers.

Example: ODBC.

  2. Select the Connections menu and click on Relational.
  3. Select the type Oracle and click on New.
  4. Enter the following details:

Name: Oracle_Scott_DB

User Name: scott

Password: tiger

Connect String: ORCL

Click OK.

Creation of the writer connection:

  1. Select the type 'Oracle' and click on New.
  2. Enter the following details:

Name: Oracle_Batch7am_DB

User Name: Batch7am

Password: target

Connect String: ORCL

Click OK.

  1. Double-click the session and select the Mapping tab.
  2. From the left window select the source (SQ_Emp).
  3. From the Connections section [right window], click to open the relational connection browser.
  4. Select the type 'Oracle'.

Select the Oracle_Scott_DB connection and click OK.

  5. From the left window select the target (T_Employee).
  6. From the Connections section (right window), click to open the relational connection browser.
  7. Select the type Oracle.

Select Oracle_Batch7am_DB and click OK.

  8. From the Properties section [right window], select the target load type as 'Normal'.

  • What is the difference between a normal load and a bulk load?

Normal load: record-by-record loading.

Bulk load: loading massive data in packets.

Click Apply and click OK.

  9. From the Repository menu click on Save.

Step 5: Create the workflow:

  1. A workflow is created using the Workflow Designer tool.

Why is a workflow required?

We group (batch) the sessions into a workflow.

Session: inside a workflow it appears as a session instance.

  1. From the Tools menu select Workflow Designer.
  2. From the Workflows menu select Create.

Enter the name and click OK.

  3. From the left window drag the session and drop it beside the Start task.
  4. From the toolbar click on Link Task, drag the link from the Start task, and drop it on the session instance.
  5. From the Repository menu click on Save.

Start the workflow:

From the Workflows menu click on Start Workflow.

Workflow: the workflow submits the sessions.

The workflow itself always shows as Succeeded.

If a session fails: in the Workflow Monitor window, press "E" directly.

Only one mapping can be selected at a time.

CL SCR: clears the screen in SQL*Plus (Oracle).

  • To alter the table columns in Oracle:

ALTER TABLE t_Emp MODIFY (deptno NUMBER(2), dname VARCHAR2(10), Loc VARCHAR2(10));

Table altered.

*** Note: I created one table with the following details in the source and in the target (metadata only, no data in the target):

Source table:

Empno (2)    Ename (10)    Job (10)
10           RK Reddy      Salesman

Target table:

Empno (1)    Ename (5)    Job (5)

  • In this situation the session succeeds but no data is loaded into the target table, because the Empno width in the source is 2 while in the target we declared the Empno length as 1.

Who will design the schema of the DW House?

Ans: The database modeller designs the schemas using the ERwin tool.

  • In real time we do not have permission to modify or create target tables; we just use them. The target tables are designed by the architect.
  • To create a target table using the front-end environment (when I have no SQL knowledge):

Target Designer: Targets: Create

  • If my target table structure is very similar to the source table structure, with only simple changes:

Ans: Drag the source table into the Target Designer.

Note: When we design a target table using the front end, we must export it to the back end:

Ans: Target Designer: Targets: Generate/Export.

Before configuring the session, we create the reader and writer connections.

  • At run time the reader connects to the source DB and extracts the data from the source; likewise, the writer connects to the target DB and loads the data into the target.
  • Is there another way to run a workflow?

Yes, by using pmcmd:

pmcmd startworkflow -f Batch7am W-S-M-red

Press Enter. An exit code of 0 means success; 1 means failure.



  1. A transformation is a PowerCenter object which allows you to develop the business rules to process the data into the desired business formats.
  2. Transformations are categorized into 2 types:
  • Active transformation
  • Passive transformation
  • Informatica does not use SQL functions directly; it has its own functions. Using Informatica functions it is very easy to understand and implement a business rule.



  • Informatica test functions:

ISNULL()

IS_SPACES()

IS_NUMBER()

  • ALL represents AND; ANY represents OR.
  • Example: to check that a set of columns are not null:

NOT (ISNULL(SAL) AND ISNULL(COMM) AND ISNULL(TAX))

*** In your current project, which transformations can filter the rows?

Ans: All active transformations can reduce (filter) the number of rows.

  • In our source I have 6 columns, but I want 2 extra columns in the target: Tax and HRA.

Ans: By using the Expression transformation:

Tax = sal * 0.15

HRA = sal * 0.11

It does not affect the number of records; that is why the Expression T/R is a passive T/R.
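For comparison, the same derivation written as a SQL query against the standard EMP table (the 0.15 and 0.11 rates are just the example values above):

-- Derive two extra output columns without changing the row count,
-- the same idea as the (passive) Expression transformation
SELECT empno,
       ename,
       sal,
       sal * 0.15 AS tax,
       sal * 0.11 AS hra
FROM   emp;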

  • The Java transformation and the Custom transformation are passive transformations; they are used only in rare situations.
  • There are 16 transformations available to develop business rules.

Active Transformation:

A transformation that can affect the rows, i.e., change the number of rows, is known as an active transformation.

The following is the list of active transformations used for processing the data:

  1. Source Qualifier T/R
  2. Filter T/R
  3. Router T/R
  4. Aggregator T/R
  5. Joiner T/R
  6. Union T/R
  7. Rank T/R
  8. Sorter T/R
  9. Update Strategy T/R – DML operations are implemented here
  10. Transaction control T/R
  11. Normaliser T/R
  12. Look up T/R


  • Why is a cache required to sort or join the data?

Passive Transformation:

A transformation which does not change or affect the number of rows while processing the data is known as a passive transformation. The following are the passive T/Rs used for processing the data:

  1. Expression T/R
  2. Stored Procedure T/R – in real time we use this compulsorily
  3. Sequence Generator T/R
  4. Lookup T/R
  5. XML Source Qualifier T/R

Up to version 8.6 the Lookup T/R was purely passive; from the 9 series (9.0, 9.1, 9.5) Lookup can be made active as well as passive. By default the Lookup T/R is active.

Questions regularly asked in SQL interviews:

What are cursors? What are the cursor methods?

What are the types of cursors?

What are stored procedures in SQL?

  • Without Lookup, no real implementation takes place.
  • What are the best-practice methods in Lookup?
  • Can we write SQL queries in Lookup?

Note: In SQL, correlated subqueries are most important (see the sketch below).
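A classic correlated-subquery example on the standard EMP table (assuming the usual Scott schema columns), since the note above singles these out as important:

-- Employees earning more than the average salary of their own department;
-- the inner query is re-evaluated for every outer row (correlated on deptno)
SELECT e.empno, e.ename, e.sal, e.deptno
FROM   emp e
WHERE  e.sal > (SELECT AVG(i.sal)
                FROM   emp i
                WHERE  i.deptno = e.deptno);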

A transformation is where the business rule is built.

  • The active and passive T/Rs are further divided into 2 types:
  1. Connected T/R
  2. Unconnected T/R

Ports & Types of ports:

A port represents column of the table or a file.

There are 2 types of the ports,

  1. Input port (I)
  2. Output port(O)

1. Input Port:

A port which can receive the data is known as an Input port.

2 .Output Port:

A port which can provide the data is known as output port.

  • A port is nothing but a column.
  • In database terminology a table is created with columns; in transformation terminology a column is called a port.

Ex: Salary port, Ename port


Diagram: 29


  • A connected T/R returns multiple ports.
  • An unconnected T/R always returns one port.

Connected & Un Connected T/R’s:

Connected T/R:

  1. Transformation is a part of mapping dataflow.
  2. It can receive the multiple input ports and can return multiple output ports.
  3. It is connected to the source and connected to the target.
  4. All active and passive transformations can be configured as connected transformations.


A connected transformation is connected to the source as well as to the target.

Every transformation is a part of the data flow, and the data flow starts from the source.

Every connected T/R has input ports as well as output ports.

Example: all 16 transformations can be configured as connected T/Rs.

(Dimension key, Data Warehouse key and surrogate key are different names for the same generated key.)

If we want to represent diagrammatically,

Diagram: 30

  • Why are connected & unconnected T/Rs required?
  • One of the characteristic features of a connected T/R is that it can receive multiple input ports and can provide multiple output ports.

For Example:

8 input ports can provide 9 output ports (normal ports in the input, while in the output we add the derived port).

  • If the target table has an index, loading performance degrades. We improve loading by using 2 methods,


  1. Drop Index (before loading)
  2. Create Index (after loading)

Why do we drop and re-create the index?

If we do not drop the index before loading the data and re-create it after the load completes, the target table loading performance degrades.

  • Only 2 T/Rs can be used as unconnected T/Rs,
  1. Lookup
  2. Stored Procedure
  • Lookups are of 2 types,
  1. Connected lookup
  2. Unconnected lookup
  • Stored Procedures are of 2 types,
  1. Connected stored procedure
  2. Unconnected stored procedure

Note: Un connected T/R always return only one port.

Un Connected T/R:

  1. It is not part of mapping data flow direction
  2. It is neither connected to the source nor connected to the target.
  3. It can receive multiple input ports but always returns a single output port.
  • Is it possible to get multiple output ports using an unconnected T/R?

Ans: Concatenate the individual columns into a single port, then split the single port back into multiple ports using a transformation.

The following transformations can be configured as unconnected,

  1. Stored Procedure T/R
  2. Lookup T/R
  • What is transformation language?

Ans: The Informatica transformation language is a set of built-in functions, like SQL functions.

Why are functions required?

Answer: To build the business rules.

Trimming means removing

  • We want to remove spaces on both sides, but there is no single both-sides trim function; writing rtrim (ltrim (Ename)) every time is difficult. That's why Informatica provides user-defined functions. Using user-defined functions we can create our own set of functions.
  • Functions are like SQL functions.

Ltrim, rtrim () : built in functions

  • The set of operators and the set of functions together are called the transformation language in Informatica.


Transformation Language:

  1. It is a set of built in functions like SQL functions which are used to develop the business rules to process the data.

The functions are categorised as follows,

  1. Character Functions
  2. Numeric Functions
  3. Date Functions
  4. Aggregate Functions
  5. Variable Functions
  6. Test Functions
  7. Conversion Functions
  8. Data Cleansing functions
  9. Special Functions
  10. Financial Functions
  • Informatica Help
  • Help: Contents: Contents – Transformation language Reference: Functions
  • Select * from Emp where substr (Ename, 1, 1) = ‘S’
  • Language is nothing but built in functions.


Filter Transformation

  1. This is a type of active transformation that can filter the source rows based on the given condition.
  2. Filter T/R supports only single condition.
  3. Integration service evaluates the condition, returns TRUE or False
  4. A True indicates that records are allowed for further processing or loading into the target.
  5. A FALSE indicates that records are rejected from filter T/R.
  6. * Rejected records can’t be captured (can’t be located in a session log file and in a bad file)
  • Filter means data cleansing or sanitization.


Sal > 3000 and Job = ‘Manager’ – Here we have 2 conditions

Is null (COMM) and sal > 8000 – Here we have 2 conditions

(Deptno = 30 OR Deptno = 20) and Job = 'Clerk' – Here we have 2 conditions

These 3 are compound (multiple) conditions.

Of these three conditions we can create only one condition in the Filter T/R; we cannot create 3 conditions at a time.

What if my requirement specification says I want to create multiple conditions?

Ans: There is a way to create multiple conditions

i.e., Router T/R

* At runtime the Integration Service evaluates the condition.

Is it possible to view the rejected records in the session log file? Can you see rejected records in the bad file?

Ans: We can't view the rejected records in the session log, and we can't see the rejected records in the bad file.

Drawbacks of Filter T/R:

1. We can write only one condition in the Filter T/R.

2. Rejected records cannot be captured anywhere.

3. TRUE: the source row satisfies the condition.

I have a source with 10 rows I want to load all 10 records in the target using filter T/R.

Ans: Using a flat (pass-through) mapping also we can load all 10 records.

  1. In the Filter T/R, using TRUE as the condition loads all records into the target.

Note: TRUE is the default filter condition.

  1. The default filter condition is TRUE.
  • The Filter T/R is a GUI object; it functions like the WHERE clause in SQL.


  1. Single Condition
  2. Rejected records can’t be captured.
  • What is cache? Why is cache needed/required? Can your requirement be processed without cache?
  • In any domain (banking, telecom, retail, insurance)

We develop business rules through the Expression transformation.

The project logic is developed in expressions,

For example: Total Sal for each Employee = sal + COMM (May having null values)

IIF (Condition, V1, V2)

(is a test function)

Condition is true returns V1

Condition is false returns V2

In SQL, NVL() is used to handle the null in the same way.

If condition (comm) is null return sal, otherwise return sal + comm.

IIF (isnull (comm), sal, sal + comm)
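For comparison, a minimal SQL sketch of the same rule (assuming the standard Scott Emp table), using NVL() to replace a null commission with 0:

SELECT ename, sal, comm, sal + NVL (comm, 0) AS total_sal
FROM Emp;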


  • Expression T/R is used process row by row.


0 is decoded as M

1 is decoded as F

others are decoded as UNK


  • IIF (Gender = 0, ‘M’, ‘F’)
  • For unknown Gender in this situation we use nested if.

Nested ifs:

IIF (Gender =0, ‘M’, IIF (Gender =1, ‘F’, ‘UNK’))

  • In Informatica, use the Decode() function when we would otherwise need a nested IIF.
  • Decode() is more efficient compared with nested IIF.
  • Informatica supports only single-line comments: -- or //.
  • Decode searches a port for a value.

Decode (Gender, 0, 'M',

1, 'F',

'UNK')



**** Expression Transformation


  1. This is a type of passive transformation which allows you to develop expressions; the expressions are calculated on a row-by-row basis.
  2. An Expression transformation operates on each source row.
  3. An Expression T/R is created with the following types of ports,
  4. Input Port (I)
  5. Output Port (O)
  6. Variable Port (V)
  7. Expressions can be developed only in output ports or variable ports.
  8. Expressions are developed using functions and arithmetic operators.

All transformations should have input ports and output ports.

We develop expressions in the Expression T/R; that is where we use variable ports.

Q. What is a variable port?

Ans. The Expression T/R is a passive T/R; it processes data row by row, which is why the Expression T/R receives all rows.

  • When do we develop an expression only in an output port, and when do we develop it in a variable port?

Answer: Suppose we derive new columns, Tax and HRA.


HRA                                       IIF (Sal > 8000, sal * 0.25, sal * 0.15)

Tax                                        Sal * 0.16

Any expression which is non-reusable is built directly in an output port. Any expression which is reusable is built in a variable port; for any reusable expression we have to create a variable port.

Informatica power Centre – Comment lines:

The transformation language provides two comment specifiers to let you insert comments in an expression:

Two dashes (--)

Two slashes (//)

The Power Centre Integration Service ignores all text on a line preceded by these two comment specifiers.


Scenario. 1:

Calculate the tax for employees who belong to the sales department as follows,

If salary is greater than 8000 calculate the tax as sal * 0.25 else calculate the tax as sal * 0.17

The sales dept unique identification number is 30.


Keep expression T/R. Close to source Qualifier.

Diagram: 31

  • This design gives an accurate result, but from a performance point of view it is not good; the design takes more time.
  • A better suggestion is to use the Filter T/R as close to the Source Qualifier as possible. It gives the same result and the performance is better.

Filter T/R – Performance Consideration:

Keep the filter T/R as close to the source Qualifier as possible to filter the rows early in the data flow, there by number of records are reduced for further processing.

  • ETL Specification Document (Mapping Specification Dec)
  • As per ETL mapping specification Document (Spec doc)

It contains details of the source: each column's description, data types, source table name, and source column name.

Target: Single target or multiple target

Note: One mapping contains any number of transformations

  • In the Router T/R we can add multiple conditions.
  • How do we create the target database using the front end? In real time, designing the target database is the work of the database architect.
  • Usually people with a minimum of 8 or 9 years of experience design the target databases.

Schema is nothing but collection of tables.

  • In real time our role start from designing environment design mapping
  • Data is never loaded directly from OLTP to OLAP; in between we use a staging database (stg).
  • Number (P, S) is a kind of decimal.
  • Scrubbing: deriving new attributes (i.e., deriving new columns) is called scrubbing.

Creation of the target table using target Designer tool: Procedure

  1. From the client power centre designer select tools menu click on target designer.
  2. From the left window expand sources sub folder
  3. Drop the source definition (EMP) to the target designer work space.
  4. Double click the target definition click on Re name.

Enter the table name (xyz_emp). Click OK.

  1. Select columns tab, from tool bar click on new column add new columns.


Column Name     Data Type        Prec    Scale    Not Null    Key Type

Tax             Number (P, S)    7       2                    Not a Key


Click apply and click ok.

  1. From targets menu click on Generate & Execute SQL.
  2. Click on Connect.
  3. Connect to the database with the following details

ODBC Data Source: Batch7am_ODBC_Oracle (Oracle in_)

User Name: Batch7am

Password: Target

Click on Connect

  1. Select create table: Click on Generate & Execute: Click OK
  2. Click on ‘Close’
  • Any transformation we create in a mapping is by default non-reusable.

To convert a non-reusable transformation into a reusable one:

Double click on the transformation: select the 'Make reusable' option: check it.





Scenario 1:


  1. Create a mapping:

Drop the source and target definition to the mapping designers work space.

  1. From Transformation menu select ‘Create’: Select the transformation type filter: enter the name: Click on Create and done.
  2. From source Qualifier copy the required ports to the filter transformation.
  3. Double click the filter transformation: select properties tab,

Transformation Attribute                     Value

Filter Condition                                         Dept no =30

Tracing level                                               Normal

Clicks apply and click OK.

  1. From transformation menu select create: select the transformation type expression: enter the name: click on create and done.
  2. From filter T/R copy the ports to expression
  3. Double Click the Expression transformation: Select ports tab
  4. From tool bar click on add a new port,


Port Name    Data Type    Prec    Scale    I    O    V    Expression

Tax          Decimal      7       2             O         IIF (Sal > 8000, sal * 0.25, sal * 0.17)


Clicks apply and click OK

  1. From expression T/R connect the ports to the target definition.
  2. From Repository menu click on save.

Now our mapping is ready.

Can create session

Execute the session

Scenario: 2

Verify the null for the following list of columns Ename, Job, Sal, deptno, Comm. If anyone column is having the null value in a row, filter that row.

  • ISNULL is considered a test function; with it we can verify a column.
  • ISNULL returns TRUE when the column is null; if the column has a value, it returns FALSE.
  • To develop any scenario we have to follow 2 rules,
  1. Identify which transformation to use.
  2. Decide what business rule to implement inside the transformation.

ALL: logically represents AND

ANY: logically represents OR

  • A non-reusable transformation in a mapping can be converted into a reusable one.
  • In the Filter T/R: FALSE means the row is rejected.

: TRUE means the row is loaded.


  1. Double click the filter T/R: Select properties tab: Create the following filter condition,



IIF (ISNULL (Ename) OR ISNULL (Job) OR ISNULL (comm) OR ISNULL (deptno), FALSE, TRUE)


Clicks apply and click OK.

  • Without IIF we can directly use NOT.

NOT (ISNULL (Ename) OR ISNULL (JOB) OR


ISNULL (comm) OR ISNULL (deptno))

The default value for this condition is TRUE. In the third way, we use an Expression T/R together with a Filter T/R for filtering rows.
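As a cross-check, a hedged SQL equivalent of the same null verification (column list taken from the scenario above; adjust to your actual spec):

SELECT * FROM Emp
WHERE ename IS NOT NULL
AND job IS NOT NULL
AND sal IS NOT NULL
AND comm IS NOT NULL
AND deptno IS NOT NULL;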

  • Why do we use an Expression T/R here? The Expression T/R itself does not filter; it only verifies the nulls. For each row it returns a flag,

i.e., Y or N (TRUE, FALSE)

Rows having nulls return: 'Yes'

Rows not having nulls return: 'No'

The flag is nothing but a string literal.

  • After the Expression T/R we have to use a Filter T/R.
  • In the 3rd way we use 2 transformations: first an Expression T/R, then a Filter T/R.
  • Either way, we have to use

Record_flag: data type is string; it is created as an output-only port.

Record_flag returns either 'Yes' or 'No'.

3rd way:

Diagram: 32

Record- flag: is case sensitive. It can accept only upper case. i.e., (Yes or No)


  1. Create the transformation tag Expression: double click the expression T/R: Select ports tab.
  2. From tool bar click on add a new port,

In the filter T/R we have to use conditions such as,

Sal > 3000 OR Job = 'Manager'

ISNULL (comm) OR Sal > 8000

Note: (Deptno = 20 OR Deptno = 30) OR Job = 'Clerk'

Deptno = 30 OR Job = 'Clerk'

In the filter T/R these 2 conditions do not show the correct result.

Errors:  unique constraints violated

Oracle Fatal Errors

Rolling back all the targets due to fatal session error

Writer execution failed

Writer run terminated [commit Error]


IIF (ISNULL (comm.), sal, sal + comm.)

  • Decode (Gender, 0, 'M',

1, 'F',

'UNK') – for this we develop an Expression T/R, and we write the condition in an output port.

For this condition we take the source columns and target columns like this,


Source:  Name varchar2 (10), Gender number (2)


Target:  Name varchar2 (10)

Gender number (2)

Decode varchar2 (6)

Scenario 4:

Design a mapping to decode the genders as follows, 1 decodes as ‘F’, ‘0’ decode as ‘M’ and others decode as ‘UNK’.

Source Data:



  • It is a row-by-row operation; that's why it is a passive transformation. Here we are not filtering rows, which is why we use only the Expression T/R.




In the target I want to take,

Gender varchar2 (3)

The Gender column coming from the source is taken as an input-only port.

Note: Power Centre supports only single-line comments.

By using comment lines, any user can understand the expression logic.

  • Rather than using nested IIF condition we used Decode () is best.
  • Decode can improve the performance.
  • Transformation language: is a set of built in functions.
  • Decode(): searches a port for a value you specify. If the function finds the value, it returns the result value you specify.
  • Decode (Gender, 0, 'M')
  • In the Gender port the search value is '0' and the return value is 'M'.
  • Decode (Gender, 0, 'M',

1, 'F', 'None')

  • Every search is separated from the other searches with ','.
  • Keep the Filter T/R close to the Source Qualifier.
  • Informatica suggests that in the Expression T/R we use Decode(); it can improve the expression efficiency.
  • Important functions:

Date_Compare, Error, Greatest, To_Char, Substr

Scenario 4:

Calculate the total salary for each employee based on salary and commission. The commission column may contain the Null values.

Total_Sal = sal + comm.

This is a row-by-row operation, i.e., a passive T/R. Anything added to null gives a null result.


Variable Port:

  1. A port which can store the data temporarily is known as “variable port”.
  2. The data is erased when the session completes
  3. Variable ports are created to simplify complex expressions and reuse expressions in several output ports.
  4. Variable ports are created to improve the efficiency of calculations.
  5. Variable ports are not visible in the normal view of the transformation but can view in edit view.
  6. Variable ports are local to the transformation.


  1. Calculate tax, HRA, Annual sal and total sal based on total salary.
  2. The total salary is calculated as,  sal + comm.
  3. The commission column may contains the NULL values.
  4. Calculate the tax as follows,

If total salary > 10000

Then calculate tax as total salary * 0.28 else total salary * 0.17

  1. The HRA is calculated as, total salary * 0.15
  2. Annual salary is calculated as, total salary * 12
  • When the session starts, the variable port holds the data; when the session completes, the variable port data is erased.

Benefits of variable port:

We can simplify complex expression. You can reuse the expression in several output ports.

  • Power Centre data type for Tax: decimal (equivalent to number (P, S)).
  • Here we develop the expression in an output port.
  • IIF (Total_Salary > 10000, Total_Salary * 0.28, Total_Salary * 0.17)

Here Total_Salary is repeated 3 times,

in the condition

  • and in each of the two calculated branches.

Tax: IIF (IIF (ISNULL (comm), sal, sal + comm) > 10000,

IIF (ISNULL (comm), sal, sal + comm) * 0.28,

IIF (ISNULL (comm), sal, sal + comm) * 0.17)

HRA: IIF (ISNULL (comm), sal, sal + comm) * 0.15

Annual Sal: IIF (ISNULL (comm), sal, sal + comm) * 12

Total: IIF (ISNULL (comm), sal, sal + comm)

  • All of these are output ports which provide the data to the target.
  • Overall, the total salary is calculated repeatedly 6 times – the same calculation.
  • Because of this, the Expression T/R takes more time.
  • Why can't we calculate it once and store it temporarily?
  • A reusable expression can be assigned to several output ports.
  • V_Total_Salary

(the V prefix is a naming convention, not mandatory)

  • Here we develop the expression in a variable port,

V_Total_Salary    Decimal    7    2

IIF (ISNULL (comm), sal, sal + comm)

O_Tax:

IIF (V_Total_Salary > 10000, V_Total_Salary * 0.28, V_Total_Salary * 0.17)

O_HRA: V_Total_Salary * 0.15

O_Annual_Sal: V_Total_Salary * 12

O_Total_Sal: V_Total_Salary

Drawbacks of variable port:

  1. Variable ports are local to the transformation; they are reusable only within the same T/R in which they are created.
  2. They store the data only temporarily.
  • The order of ports is most important: first we have to create the variable port, and below the variable port we have to create the output ports.


Month ID, Month Name, Quarter, Season

To_Char (Hiredate, 'MM')

  • In real time, variable ports are very frequently used.
  • Only while creating a transformation do we use Power Centre data types; for the source and target we use Oracle data types.
  • Variable ports are not visible in the normal view (i.e., mapping, session); they can be viewed only in the edit view (double-click the T/R).


  1. Source table Emp target table: Stg_ Employee_ Variable Port.

(Emp no, name, Job, sal, comm, tax, HRA, Annual Sal, total sal)


Double click the transformation select the ports tab,

  • Here the source has 18 records but only 15 are loaded – why? Because of a precision error. (Actually the T/R is passive, which means it should load all records from source to target.)
  • We can find the bad file under Informatica 9.5: server: infa_shared: BadFiles.
  • Note: The bad file contains bad records (constraint errors, precision errors).

Bad File Directory:

C:\informatica\9.5.0\server\infa-shared\BadFiles (Enter)

  • Scheduling Tool: run is UNIX based scheduling process.


Calculate the gross salary for each employee,

Gross Salary = Total Salary + HRA – Tax

Total_Salary = Salary + comm

HRA = Sal * 0.17

Tax = Sal * 0.2

We want only one output, i.e., Gross Salary, and we have 3 intermediate values (Total Sal, HRA, Tax).
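A minimal SQL sketch of the same gross salary rule, assuming the formulas above (Total Salary = sal + comm, HRA = sal * 0.17, Tax = sal * 0.2):

SELECT empno, (sal + NVL (comm, 0)) + (sal * 0.17) - (sal * 0.2) AS gross_salary
FROM Emp;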



Date Attribute:

DAY_NAME varchar2 (15)

Day_Number_in_Week        Number (5),

Day_Number_ in _Month

Day_ Number_in _Year         Number (5),

Week_ Number_in_Month                Number (5),

Week_Number_in_Year      Number (5),

Month_ Number                     Number (5),

Month_Name                           Varchar2 (20),

Quarter_ Number                   Number (5),

Year                                               Number (5)
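The same date attributes can be cross-checked in SQL; a hedged Oracle sketch using TO_CHAR format models (format codes are assumptions, verify against your database version):

SELECT hiredate,
TO_CHAR (hiredate, 'Day')   AS day_name,
TO_CHAR (hiredate, 'D')     AS day_number_in_week,
TO_CHAR (hiredate, 'DD')    AS day_number_in_month,
TO_CHAR (hiredate, 'DDD')   AS day_number_in_year,
TO_CHAR (hiredate, 'W')     AS week_number_in_month,
TO_CHAR (hiredate, 'WW')    AS week_number_in_year,
TO_CHAR (hiredate, 'MM')    AS month_number,
TO_CHAR (hiredate, 'Month') AS month_name,
TO_CHAR (hiredate, 'Q')     AS quarter_number,
TO_CHAR (hiredate, 'YYYY')  AS year_number
FROM Emp;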

Note: The Informatica default date format is MM/DD/YYYY.

Teradata ANSI date format – 'YYYY-MM-DD'

Teradata integer date format – 'YY-MM-DD'


Source data:

7004JaladiRadh KrishnaVizag


Concatenate two string fields Fname and Lname to a full name.

Performance Consideration:

Use the || String operation to concatenate Strings. Using the || String operator instead of CONCAT improves power centre Integration Service Performance.

  • Here we concatenate 2 strings; this is a row-by-row operation, which is why we take a passive T/R, i.e., the Expression T/R.




Cust_ID                        Cust Name          City

Take                              Cust_FName



Concat () – takes 2 arguments.

Special character – space or comma.

  • Rather than using Concat(), using the || operator is faster.
  1. Concat (Concat (Cust_FName, ' '), Cust_LName)
  2. Initcap (Cust_FName || ' ' || Cust_LName)

Error: Total run time was insufficient for any meaningful statistics.


  1. Create the transformation type Expression.
  2. Double click the expression T/R – Select the ports tab- From tool bar click on add a new port.
  • The source system contains the full name; in the target we want FName and LName.
  • The source system contains the full name; in the target we want FName, MName and LName.
  • Create variable ports.

Use the Decode function rather than nested IIFs; use the || operator rather than Concat().

Basic Rules to Designing:

Diagram: 0

This flow is allowed

Diagram: 1

Flow equality does not match

Best example for this: Filter


  1. Create the transformation type filter and define the following filter conditions,

Substr (Ename, 1, 1) = ‘S’


Substr (Ename, 1, 1) = ‘J’




ETL Application:


Diagram: 2



Design a mapping with single source with 3 target instances.



Source data:

EID         Ename Salary    Job         Phno (Number (10))

Source data type is Number


Phone (varchar2 (15))

It is row by row operation that’s why we use passive T/R. i.e., Expression T/R

Substr () – it works only on strings.

'(' || Substr (To_Char (phone), 1, 3) || ') ' || Substr (To_Char (phone), 4, 4) || '-' || Substr (To_Char (phone), 8, 3)


Design a mapping that can load the Employees whose Employee number is even.

MOD (EMP no, 2) =0

  • Here we filter the records that’s why we use filter T/R.

Note: In real time we use MOD () for daily load, weekly load, Monthly load.
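A hedged SQL equivalent of the even-employee filter (illustration only):

SELECT * FROM Emp WHERE MOD (empno, 2) = 0;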

In real time, one workflow may have multiple sessions. We can also write functions in the workflow.



Design a mapping that can load the employees whose employee name is having more or equal to 6 characters.

  • Here we are filtering rows; that's why we use the Filter T/R with the condition LENGTH (Ename) >= 6.
  • What is surrogate key? How to generate sequence numbers?

Scenario: Design a mapping to generate sequence numbers.

There are 5 techniques to generate sequence numbers,

  1. Variable port in an Expression T/R (not used in real time)
  2. Sequence generator T/R
  3. Dynamic lookup
  4. Mapping Variable
  5. By writing a function stored procedures.
  • Why don't we use a variable port in real time?
  • It stores the data only temporarily. When the session completes, the last generated sequence value is lost, so when we restart the session the sequence starts again from the first value.
  • Ex:
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • In real practice the next run should continue the sequence,

from 10, 11, 12 and so on,

but the variable port starts again from

1, 2, 3 …

Source Table:


Target table:

Stg_Emp_SK (Employee_SK + all Source Columns)

  • Surrogate keys are artificial keys or warehouse IDs,
  • Generally in real implementation,

Employee_SK is primary key.

  • For practice purpose we use,

Employee_SK is not a primary key.


Target table create instances.

  • The default value for numerical variable port is zero ( 0 ) and default value for String Variable port is space (is a String)


  1. Create the transformation type expression
  2. From source Qualifier copy the ports to expression: Double click the expression T/R Select the ports tab


  1. The default value for numerical variable port is zero (0).
  2. The default value for string variable port is space.


Design a mapping that can load alternate records into the target (i.e., odd-numbered records).


Informatica Power Centre supports SYSDATE, which returns the current date of the system (i.e., the date of the machine on which the Integration Service runs).

In Teradata – DATE

  • Date_Diff ()

Calculating the experience for each employee is a row-by-row operation, i.e., Expression T/R.

Date_Diff (sysdate, Hiredate, ‘YYYY’)

This expression goes in an output port in the Expression T/R.
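For reference, a hedged SQL sketch of the same experience calculation (MONTHS_BETWEEN is Oracle-specific):

SELECT ename, hiredate, FLOOR (MONTHS_BETWEEN (SYSDATE, hiredate) / 12) AS experience_years
FROM Emp;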


Source data:

Cust_ID    Cust_Name    E-mail



USER is a reserved word in Oracle; if we name a column 'user', the session will fail.

Target data:

Cust_ID    Cust_Name    Email    User_Name


First find out the position of the special character; INSTR returns an integer position.

Substr () – it returns part of the String.

Substr (Email, 1, Instr (email, ‘@’) – 1) here we use Expression T/R.
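A hedged SQL sketch of the same rule (the Customers table name is illustrative, not from the spec):

SELECT cust_id, cust_name, email,
SUBSTR (email, 1, INSTR (email, '@') - 1) AS user_name
FROM Customers;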


Router Transformation

  1. This is a type of an active T/R which allows you to create multiple conditions and passes the data to the multiple target tables.
  2. Here we implement simple conditions or complex conditions.
  3. Here we write multiple conditions and we can route the data to multiple targets.
  4. Here rejected records can be captured (which is not possible in the Filter T/R).
  5. The demerits of the Filter T/R are overcome by using the Router T/R.


The router T/R is created with 2 types of groups,

  1. Input Group
  2. Output Group

Input Group:

  1. Always one input group that can receive the data from source pipeline.

Output Group:

There are 2 types of output groups

  1. User defined output group
  2. Default group

User defined output group:

  1. Each group has one condition.
  2. All group conditions are evaluated for each row.
  3. One row can pass multiple conditions.

Default group:

  1. There is always one default group.
  2. It captures the rows that fail all group conditions (rejected rows).

Performance Consideration:

The Router T/R has a performance advantage over multiple Filter T/Rs because a row is read once into the input group but evaluated multiple times based on the number of groups.

Whereas using multiple filters T/R requires the same row data to be duplicated for each filter T/R.

Filter T/R                                          Router T/R

  1. Single condition                               1. Multiple conditions
  2. Single target                                  2. Multiple targets
  3. Rejected rows cannot be captured               3. Default group captures rejected rows


Filter T/R:

Diagram: 4

  • We spend, say, 3000 milliseconds just on duplicating the rows; it takes a lot of time. That's why Informatica introduced the Router T/R.
  • The Router T/R is created with 2 groups (input, default).
  • The Router always has one input group that receives data from the source pipeline. "Source pipeline" means whatever precedes the Router; an Expression T/R placed before the Router is also part of the source pipeline.
  • Default group returns only one.



Router T/R:

Diagram: 4

  • We can't edit the default group.
  • The default group captures the rows that fail all the group conditions.
  • One row can pass multiple conditions.
  • If at least one condition is satisfied, the row doesn't go to the default group. Here, any one of the 3 conditions may be satisfied.


There are two targets with the names T1 and T2. Load T1 with the records that do not have null values. Load T2 with the records having null values.


  1. Source table: Emp
  2. Target table: Stg_Emp_NOTnull


  1. Create a mapping drop the source and target definitions.
  2. Create the transformation type Router.
  3. Double click the router T/R. Select the groups tab
  4. From toolbar click on add a new group.

Group Name:

Employee_ Not _ Nulls

Group Filter Condition:



Click Apply: Click OK

  1. Connect the output groups to the corresponding targets.


There are 2 targets with the name T1 and T2. Alternative rows pass to T1 and the remaining rows pass to T2.

  • Only one user defined group: MOD (Empno, 2) = 0

One default group takes the remaining rows.



If we have multiple targets, then we have to use the Router T/R.

When we apply the Filter T/R, we can't capture rejected records and we can't view them in the session log file.

If we want to see the rejected records of a Filter T/R, we use the error function, which writes them to the session log file.


  1. Causes the power centre Integration service to skip a row and issue an error message, which you define. The error message displays in the session log. The power centre Integration service does not write these skipped rows to the session reject file.
  2. Use Error in Expression transformations to validate data. Generally, you use error with in an IIF or Decode function to set rules for skipping rows.
  3. Error(), ABORT () by using this two functions to validate function (useful in testing), skipping rows, analyzing log files (i.e., Transformation Errors)
  4. By using these two functions I can view the rejected records in session log file. But not capture in bad file. Informatica default date format MM/DD/YYYY


IIF (ISNULL (COMM), ERROR ('COMM column having null value'), TRUE)

Note: In filter T/R Create the following conditions and study the session log.

IIF (ISNULL (comm), ERROR ('comm column having null values'), TRUE)

  • Purging log:

10 days log will be purge

  • I am using Error () in filter T/R


Diagram: 5


IIF (ISNULL (comm), ERROR ('comm column having NULL values'), TRUE)

By using the ERROR() function we can view the rejected records in the session log. Normally, with the Filter T/R we can't capture or view rejected records, but we can view them by using ERROR().


Diagram: 6


D-Comm: IIF (ISNULL (comm), Error ('Comm column having Null Values'), TRUE)


Rank Transformation

  1. This is a type of active T/R which allows you to find either the top performers or the bottom performers.
  2. Rank T/R is created with the following types of the port,


  1. Input Port (I)
  2. Output Port (O)
  3. Rank Port (R)
  4. Variable Port (V)

Rank Port (R):

i. A port which participates in determining the ranks is known as the rank port.

ii. Only one port can be designated as the rank port.

  1. The following properties can be set to determine the ranks,


  1. Top / Bottom
  2. Number of Ranks

Key point:

Integration service uses the cache memory to process the rank T/R.

  1. Rank T/R can build two types of the cache memory
  2. Index Cache
  3. Data Cache

First “N” rows are loaded into cache memory.

N = number of ranks

Once the data is loaded into cache memory then the integration service compares the next row from source with values of cache.

Case i:

Integration Service builds single Cache memory called ‘data cache’ – for a simple rank (Without group by option)

Case ii:

When u select group by ranking, the integration service builds’ 2 types of the cache memory,

  1. Index Cache
  2. Data Cache
  • Why is cache required? Can't we calculate the ranks without cache?
  1. When the mapping contains a Rank T/R, the Integration Service creates a cache.
  2. Informatica adopted this method to improve performance: keep the first N rows temporarily.
  • By using the cache we reduce the number of comparisons by roughly 80%.


Simple rank Without Group by:

For Example, we configure a rank transformation to find the top three sales.







The integration Service caches the first three rows (10,000, 12,000 and 5000). When the Integration Service reads the next row (2000), It compares it to the cache values. Since the row is lower in rank than the cached rows, it discards the row with 2000. The next row (6000), however, is higher in rank than one of the cached rows. Therefore, the integration service replaces the cached row with the higher ranked input row.

  • What is normal rank and dense rank
  • Department wise top 5 employees based on sal?

Note: Our rank T/R support only normal rank.

Rank T/R doesn’t support Dense Rank.

  • What is the difference between Rank and Dense Rank?

Normal Rank:

It skip the values when we get same value.

Dense Rank:

It doesn’t skip the values.

Note: Rank T/R supports only normal ranking, doesn’t support the dense rank.
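A hedged SQL sketch of the difference (analytic functions shown only for comparison; the Rank T/R itself behaves like RANK()):

SELECT ename, sal,
RANK () OVER (ORDER BY sal DESC) AS normal_rank,
DENSE_RANK () OVER (ORDER BY sal DESC) AS dense_rnk
FROM Emp;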

  • If Rank builds 2 caches, what values are stored in the index cache and what values in the data cache?

Note: The Rank T/R builds 2 types of cache memory when you select the group-by port option.


If the source contains n rows load n+5 rows in the target. (Here we choose top/bottom any algorithm we have to choose)

GUI Solution:

Diagram: 7


SQL Code:

Select * from Emp

Union All

Select * from Emp where rownum <= 5

  • ROWNUM supports < and <=; it does not support = or > for arbitrary values.
  • When a Rank T/R is created, a Rank Index port is created by default.
  • Ordering of ranks and skipping is done by the Rank Index.


  1. Create the T/R type Rank
  2. Double Click the Rank T/R
  3. Select the ports tab for a port name salary: Select Rank port
  4. Select the properties tab,

Transformation Attribute            Value

Top/Bottom                                       Top

Number of Ranks                             5

  1. Click Apply: Click OK


Calculate the tax for the top 3 employees who belong to the sales department. The top 3 employees are identified by salary. The sales department is decoded as 30. If the salary > 5000 then calculate the tax as sal * 0.25, else calculate it as sal * 0.15.

Sorter Transformation

  1. This is a type of active T/R that can sort the source data either in ascending order or in descending order.
  2. A key is required to define the sort on single column or multiple columns.

Simple Sort:

A key is selected on a single column – the column which participates in sorting.

Key: the column on which we want to perform the sort.

Complex Sort: A key is selected on more than one column.

Sorter Transformation Properties:

Property 1: Case Sensitive:

  • The case sensitive property determines whether the Integration Service considers case when sorting data. When you enable the case sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

Property 2: Null Treated (low):

  • Enable this property if we want the integration service to treat null values as lower than any other value when it performs the sort operation.

Property 3: Distinct

  • If you enable this property, the Integration Service eliminates duplicate rows; hence the Sorter is known as an active T/R.
  • All columns get selected as keys.

Sorter Cache:

  1. If the mapping contains the Sorter T/R, the Integration Service uses the cache memory to process the Sorter T/R.
  2. All the source data is loaded into the cache memory before sort operation is performed.
  3. The integration service build’s the single cache memory called data cache.
  4. If the cache size specified in the properties exceeds the available amount of memory on the integration service process machine then the integration service fails the session.
  5. If the amount of incoming data is greater than the cache size specified then the power centre will temporarily store the data in the sorter transformation work directory.


It is a temporary path for sorting.

  • Why is Sorter an active T/R?

At first glance the Sorter could be a passive T/R, but the Sorter T/R can eliminate duplicate rows.

When we double-click the Sorter T/R and select the Distinct property, that property makes it an active T/R. Because we can eliminate duplicates using the Sorter T/R, it is an active T/R.

  • Selecting a key is compulsory – either a single column or multiple columns.
  • If the key is selected on a single column, it is a simple sort.

Complex Sort (nested sort):

We can sort on multiple Columns.

Case Study 1:

  1. Cache size specified in properties tab of the sorter T/R is 6MB.
  2. The available amount of memory on integration service machine is 3 MB.
  3. The amount of incoming data size is 9MB.


When u run the session, integration service fails the session with fatal error.

Case 2:

  1. Cache size specified in properties tab of sorter T/R is 6MB.
  2. The amount of incoming data size is 9MB.
  3. Amount of memory on integration service machine is 8MB.


  1. Integration Service never fails the session.
  2. 6 MB of the cache memory is allocated on RAM.
  3. 3 MB of space is allocated in a disk memory.

Case 3:

  1. Cache size is specified in properties tab is Auto (default)



  1. Session never fails
  2. Integration service may use RAM memory And/or disk memory.
  • By using Aggregate [Sum, avg, first(), last(), varience(), radian(), Median(), Count()] we do calculations.
  • Aggregations are 2 types
  1. Sorted Aggregation
  2. Unsorted Aggregation


***Aggregator Transformation

  1. This is a type of an Active T/R that calculates Aggregate Expression for group of detailed records.
  2. An aggregator T/R is created with following components,
  3. Group by
  4. Aggregate Expressions
  5. Sorted Input
  6. Aggregate Cache
  7. Group by:

Defines a group on a single or multiple ports.

Example: Date(or Day) , product, store, deptno etc.,

Aggregate Expressions:

  1. Aggregate expressions are developed only in output ports or variable ports.
  2. Aggregate expressions are developed using the following aggregate functions,

SUM ()

Avg ()

Max ()

Min ()

Count ()

  3. Aggregate functions are available only within the Aggregator.
  4. An aggregate function returns a single row or single value for each group.
  5. One aggregate function can be nested within another aggregate function.

Sorted Input:

  1. Keep the Sorter T/R prior to the Aggregator T/R.
  2. Group-by ports are sorted using the Sorter T/R.
  3. Sort the data according to the order of the group-by ports.
  4. Sorted data can be aggregated more efficiently than unsorted data.

Aggregate Cache:

  1. When the mapping contains an Aggregator T/R, the Integration Service uses the cache memory to process the Aggregator T/R.
  2. There are 2 types of cache memory,
  3. Index Cache
  4. Data Cache

Index Cache: It contains the group by port values.

Data Cache: It contains,

  1. Non- group by input port
  2. An output port which contains an aggregate expressions.
  3. Local Variable Ports.

Types of Aggregation:

  1. Un Sorted Aggregation:

  1. No rows are released from the cache until the last record is aggregated.
  2. More space is required in the cache memory.
  3. A heavyweight process.
  4. Takes more time to perform the aggregation.

Sorted Aggregation:

An integration service will cache the data for each group; release the group upon reaching the 1st record of next group.

  • Less space is required in the cache memory.
  • A lightweight process.
  • Aggregation is the process of calculating aggregates for a group of detail rows (the source contains detail rows: every employee has detailed information, i.e., detail data).

Nested Aggregate:

One Aggregate function nested with another Aggregate function.

Example: Sum (Avg (sal))

Sum (count (*))
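A hedged SQL sketch of a group-by aggregation equivalent to a simple Aggregator T/R (deptno as the group-by port):

SELECT deptno, SUM (sal) AS sum_sal, AVG (sal) AS avg_sal, MAX (sal) AS max_sal, MIN (sal) AS min_sal, COUNT (*) AS cnt
FROM Emp
GROUP BY deptno;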

  • Sorted aggregation is a combination of the Sorter and Aggregator T/Rs.
  • In ETL we have 2 types of aggregation,


  1. Unsorted Aggregation
  2. Sorted Aggregation
  • Sorted data:
  • The entire aggregation process is done in the cache.

80% of the cache is occupied by source data,

20% is left for calculations [a heavyweight process].

If there is more free space, the aggregation functions run much faster (a lightweight process).

UN Sorted Aggregation:

  • In case of UN sorted aggregation no rows are released from cache until the last record is aggregated.
  • All rows (10000 or any number of) loaded into cache then perform calculations.

Two types of Cache:

  1. Index Cache
  2. Data Cache

Rule of thumb for sorted input:

The group-by key is treated as the sort key.


Lookup Transformation

  1. This is a type of active/passive transformation which allows you to look up on relational tables, flat files, synonyms and views.
  2. By default the Lookup T/R functions as a left outer join.
  3. By default the Lookup T/R acts as a passive T/R.
  4. Lookup T/R supports multiple lookup conditions.
  5. It supports only “AND” operator between conditions (Doesn’t support “OR” conditions)
  6. Lookup T/R supports inequality comparisons such as < , <=, > , >=, !=
  7. Lookup T/R is created with following type of the ports,
  8. Input Port (I)
  9. Output Port (O)
  10. Lookup Port (L)
  11. Return Port (R)
  12. A lookup source is defined with lookup ports.
  13. Lookup source can be database, flat file etc.,
  14. Database can be source Database or target database.
  15. A lookup T/R is configured in a 2 different ways,
  16. Connected Lookup
  17. Un connected Lookup
  18. Connected Lookup:

It is connect to the source and target.

It can receive the multiple input ports and can return multiple output ports.

  1. Un Connected Lookup:

It is neither connected to the source nor connected to the target.

It can receive the multiple input ports but always returns a single output port that should be designed as Return port (R).

  1. By default Integration Service built the cache to process lookup T/R.
  2. By default the cache is erased when the session completes. Hence it is known as non- persistence lookup cache.
  3. By default the Integration service built read only cache, cache can’t be updated. Hence it is known as static lookup cache.
  4. The following are types of lookup caches built,
  5. Non – Persistent lookup Cache (Default)
  6. Static Lookup Cache (Default)
  7. Persistent Lookup Cache
  8. Dynamic Lookup Cache
  9. Shared Lookup Cache
  10. Named Cache
  11. Un Named Cache

Lookup Cache – how it works:

  1. There are 2 types of cache memory,
  2. Index Cache
  3. Data Cache
  4. All ports values from the lookup table where the ports is part of the lookup condition are loaded into index cache.
  5. The index cache contains all port values from the lookup table where the port is specified in the lookup condition.
  6. The data cache contains all port values from the lookup table that are not in lookup condition and that are specified as “Output” ports.
  7. After the cache is loaded, values from the lookup input port (s) that are part of the lookup condition are compared to the index cache.
  8. Upon a match the rows from the cache are included in stream.


  • Why Lookup T/R?

Ans: A lookup is created to check whether a record exists or not. If no matching record exists in the lookup table, it returns null (in SQL terminology, an outer join).

Inner Join: only matching records are combined.

Outer Join: matching records and non-matching records are combined.



Key Points:

  1. Create a lookup to check whether a record exists or not.
  2. If the record exists lookup returns not nulls
  3. If the record doesn’t exist lookup returns null values.
  4. A lookup condition is created to check for an existence of a record.


Display the grade for each employee whose salary is between low sal and high sal.

Source: Emp, Salgrade (Grade, Lowsal, Highsal)
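A hedged SQL sketch of this lookup scenario as a left outer join with an inequality (BETWEEN) condition, using the column names given above:

SELECT e.empno, e.ename, e.sal, s.grade
FROM Emp e
LEFT OUTER JOIN Salgrade s
ON e.sal BETWEEN s.lowsal AND s.highsal;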

Joiner Transformation

  1. This is a type of active T/R that combines fields from two data sources into a single combined dataset based on one or more common fields (the join condition).
  2. Joiner T/R supports only two input streams.
  3. The inputs to the Joiner T/R should be designated as Master Source and Detailed Source.
  4. A Master Source is defined with master ports (M).
  5. A Source which contains the lesser number of records should be designated as master source (Primary Table or Parent Table)
  6. Integration Service built the cache for only master source.
  7. Joiner T/R supports Multiple join Conditions.
  8. The Joiner T/R supports only the AND operator between join conditions; OR conditions are not supported.
  9. Joiner T/R Supports only equality comparison.
  10. Joiner T/R supports relational and non relational sources
  11. Non- relational sources includes flat files and xml files
  12. Joiner T/R supports heterogeneous data sources.
  13. A join which is made on the same database is known as a homogeneous join.

Example: Emp (oracle) + Dept (oracle)

  1. A join which is made on two different databases is known as a heterogeneous join.
  2. Joiner T/R supports the following types of the joins,
  3. Normal Join (Equi Join or Inner Join)
  4. Master Outer Join
  5. Detail Outer Join
  6. Full Outer Join

The default join type is normal Join.

  1. Normal Join:

It keeps only matching rows from master and detail source.

  1. Master Outer Join:

It keeps all the rows from the detail source and matching rows from master source.

  1. Detail Outer Join:

It keeps all the rows from the master source and matching rows from the detail source.

  1. Full Outer Join:

It keeps all the records from both master and detail source.
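A hedged SQL sketch mapping these join types to SQL, assuming Dept is the master source and Emp the detail source:

-- Normal join: matching rows only
SELECT e.empno, d.dname FROM Emp e INNER JOIN Dept d ON e.deptno = d.deptno;

-- Master outer join: all detail (Emp) rows + matching master (Dept) rows
SELECT e.empno, d.dname FROM Emp e LEFT OUTER JOIN Dept d ON e.deptno = d.deptno;

-- Detail outer join: all master (Dept) rows + matching detail (Emp) rows
SELECT e.empno, d.dname FROM Dept d LEFT OUTER JOIN Emp e ON e.deptno = d.deptno;

-- Full outer join: all rows from both sources
SELECT e.empno, d.dname FROM Emp e FULL OUTER JOIN Dept d ON e.deptno = d.deptno;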

Joiner Cache – How it works:

  1. There are two types of cache memory,
  2. Index Cache
  3. Data Cache
  • All rows from the master source are loaded into cache memory.
  • The index cache contains all port values from the master source where the port is specified in the join condition.
  • The data cache contains all port values not specified in the join condition.
  • After the cache is loaded the detail source is compared row by row to the values in the index cache.
  • Upon a match the rows from the data cache are included in the stream.

Performance Considerations:

The master source should be the source that will take up the least amount of space in cache.

  • The Joiner T/R supports both homogeneous and heterogeneous relational sources.
  • A text file is also called a flat file.
  • The Joiner T/R supports non-relational sources (it may be a text file or an XML file).


Diagram: 8

  • Joiner T/R support only equality ( = ) Comparisons.
  • What is Difference between Joiner T/R and Lookup T/R?

Lookup T/R:

  1. By default the Lookup T/R is passive.
  2. By default the Lookup T/R performs a left outer join.
  3. It supports inequality comparisons.

Joiner T/R:

  1. By default Joiner T/R is active.
  2. By default Joiner T/R is normal join or equi Join
  3. It supports equality comparisons.

Note: One Joiner T/R Combines only two inputs. The two inputs should be designated as master source and detailed source.

In SQL we can write number of Join conditions (i.e., n number of Join Conditions)

Parent Table – Master Table

  • Primary key table is called parent or primary or master table.


  • Foreign key table is called child or detail.


  • What is the need to designate the sources as master and detail?

Ans: The Integration Service builds a cache only for the master source; it doesn't build a cache for the detail source.

  • As a developer we should able to designate as Master Source and detail source.
  • Master Source is identified by master ports.

Diagram: 9

  • A primary key table is known as master table or parent table.
  • A foreign key table is known as child or detailed table.
  • When the sources are different (Oracle, SQL Server) [flat file, XML file], the criterion we follow is: the source with the lesser number of records is designated as the master source.
  • Joiner T/R Supports inner Join [Equi Join], Outer Join.
  • In SQL  terminology joins are,

Inner join

Outer join- Left Outer Join, Right Outer Join, Full Outer Join

  • In informatica terminology Joins are,

Normal join

Master Outer join

Detail Outer Join

Full Outer Join

  • I have 3 sources that I will use 2 Join conditions

Diagram: 10

  • One Join T/R Join 2 data sources (2 inputs)
  • I have 3 tables or sources or Input Stream i.e., any relational tables (i.e., Homogenous, heterogeneous)

Diagram: 11


Like this we can use any number of Joiner T/Rs.

For heterogeneous sources, choose the master source based on the number of rows.

Joiner T/R – Advantages:

  1. Can join heterogeneous sources.
  2. Can join non-relational sources.

Joiner T/R – Limitations:

  1. Can only join 2 input data streams per joiner.
  2. Only supports equality comparisons.
  3. Does not support "OR" conditions.

How to reduce the amount of memory:

In SQL – Where condition

In Informatica – Filter Condition


Lookup T/R: Advantages

  1. Can re-use cache across session run (persistent lookup cache)
  2. Can re-use cache with mapping (Shared Lookup cache)
  3. Can modify the cache dynamically (Dynamic Lookup Cache)
  4. Can query relational table or flat file
  5. Inequality comparisons are allowed.
  6. Can choose to cache or not to cache.
  7. SQL override supported.
  8. Can unconnected and invoked as needed.


Lookup T/R – Limitations:

  1. An unconnected lookup can return only a single port (it has only 1 return port).
  2. "OR" conditions are not supported.

Differences between Joiner and Lookup T/R?

Joiner T/R:

  1. Active T/R
  2. Supports only equality comparison.
  3. Can’t be invoked as unconnected.
  4. The default is normal join or equi join.
  5. Can be cached (joiner should be cached)

Lookup T/R:

  1. Default is passive
  2. Supports inequality comparison
  3. Can be invoked as un connected.
  4. Default is left outer join
  5. Can choose to cache or un cache.

Formula for Outer Join:

An outer join also combines the records for which there is no equality match.

Outer Join = Inner Join (Normal Join) + non-matching rows from the detail source

An inner join produces only the matching (equal) records.

Implementation of Homogeneous Joins:





Emp (Oracle)

Dept (Oracle)


Emp-Dept (Empno, Ename, Job, Sal, Deptno, Dname,Loc)

  1. Design a Mapping drop the sources and target definitions.
  2. Create the transformation type joiner
  3. From SQ-Emp copy the following ports to the joiner (Empno, Ename, Job, Sal, deptno)
  4. From SQ-Dept copy the following ports to the joiner transformation (Deptno, Dname, Loc)
  5. Double Click the Joiner T/R: Select the condition tab – From toolbar click on add a new condition.

Clicks apply and Click OK.

  1. From Joiner connect the ports to the target.



Union Transformation


  1. Similar data sources are combined into a single result set using union T/R (Similar source Definitions)
  2. Union T/R functions as ‘Union All’ set operator.
  3. Union T/R is created with 2 types of the groups,
  4. Input (I/P) group
  5. Output (O/P) group

Input Group:

  1. It receives the data from the source pipeline.
  2. Multiple input groups are allowed.
  3. Number of sources = number of input groups.

Output Group:

  1. Only one output group, which provides the combined result set.
  2. Union T/R supports heterogeneous data sources.

A source definition means metadata.

  • Data validation test:

Checks whether the mapping is loading the right amount of data into the target.

  • In Informatica we can also write subqueries; for slowly changing dimensions we will use subqueries.

Set Operators in SQL:


Union all

Union: a set operator that combines data sets (two or more). It eliminates duplicates.

  • Union T/R functions act as Union all Set operator in SQL.

Union All: combines data sets (two or more); it doesn't eliminate duplicates. It keeps all the data.
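A hedged SQL sketch of the difference (the two source tables are illustrative):

SELECT empno, ename, sal FROM Emp_US
UNION ALL   -- keeps duplicates, like the Union T/R
SELECT empno, ename, sal FROM Emp_UK;

SELECT empno, ename, sal FROM Emp_US
UNION       -- eliminates duplicates
SELECT empno, ename, sal FROM Emp_UK;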

  • Union T/R supports Homogenous and Heterogeneous data source.
  • In this T/R we can change data types.
  • Union T/R provides always one output group only.
  • In this T/R we have 2 groups,
  1. Input group
  2. Output group
  • Unlike other active T/Rs, the Union T/R doesn't change the number of records.

Stored Procedure Transaction

  1. This is of type passive T/R that imports the stored procedure from Database into the mapping.
  2. Stored Procedure T/R can be configured in  a 2 different ways,
  3. Connected Stored Procedure
  4. Un Connected Stored Procedure

Connected Stored Procedure:

  1. Connected to the source and target.
  2. Can receive the multiple input ports and can return multiple output ports.

UN Connected Stored Procedure:

  1. Neither Connected to the source nor connected to the target.
  2. It can receive the multiple input ports but it returns a single output port.
  3. Create the following stored procedure in the Scott database:

CREATE OR REPLACE PROCEDURE Stg_Emp_Calc (V_Empno IN NUMBER,

Totalsal OUT NUMBER,

Tax OUT NUMBER,

HRA OUT NUMBER)

IS

BEGIN

SELECT sal + NVL (comm, 0), sal * 0.1, sal * 0.4

INTO Totalsal, Tax, HRA

FROM Emp

WHERE Empno = V_Empno;

END;


  • PL/SQL is an extension to SQL.

It is a procedural language. A stored procedure is like a program which consists of a set of SQL statements, conditional statements (if-then-else) and control statements (for loop, while).

  • SQL alone doesn't support conditional and control statements.

Every stored procedure argument has one of 3 modes:

  1. IN – receives an input value
  2. OUT – provides an output value
  3. IN/OUT
  • Here the stored procedure is created to operate on every row.

Connected Stored Procedure:

It is connected to the source or connected to the target.

It can receive multiple input ports and provide multiple output ports.

UN connected Stored Procedure:

It can receive multiple input ports and provide only one output port.

Why do we use a stored procedure while designing a mapping? A stored procedure is a reusable template; the template can be used in any number of mappings, reusing the business rule.

A stored procedure contains business rules. The best real-time

example: Index

Index: an index is created to retrieve the data very fast (more efficiently).




—>For database space calculation we use a stored procedure (to find out how much space is available).

—>For such calculations we use stored procedures.

—>What is the need for creating stored procedures?

—>In Informatica it is not possible to calculate space before loading; this is possible by creating stored procedures.

—>CREATE OR REPLACE PROCEDURE Emp_Calcs


(v_index_nam IN VARCHAR2)

IS

BEGIN

EXECUTE IMMEDIATE

'DROP INDEX ' || v_index_nam;

END;
—>Generally team leads prepare stored procedures; when the team leads are busy, the most experienced people in the team prepare the stored procedures.

—>The stored procedures are created in the Scott database.

—>From which database do you want to invoke the stored procedure? We use the Scott database.

Select procedure

[+] Scott                      [connect]

—>All T/Rs are connected T/Rs, and only 2 T/Rs can be both connected and unconnected: Lookup and Stored Procedure.


Source table:                                                              Target table:

EMP                                                                                 Stg – EMP – proc (empno, ename, job, sal, comm,

Deptno, totalsal, tax, HRA)

1.    Create a mapping (M – Stg – EMP – connected – proc)

2.    Drop the source & target definitions

3.    Transformation menu —>select create —>select the transformation type stored procedure —->enter the name —>click on create

4.    Connect to the database with the following details,

ODBC Data source: Scott – ODBC – oracle

Username: Scott

Owner name: Scott

Password: tiger

Select the procedure (Stg – EMP – calc) —>click ok click on done.

5.    From a source, Qualifier connects the port Empno to the stored procedure.

6.    From stored procedure connect the ports to the target (Totalsal, tax, HRA)

7.    From a source, Qualifier connects the remaining ports to the target.

8.    Double click the stored procedure T / R —->select properties tab,

Transformations attribute                                                             value

Connection information                                                                Oracle – Scott – DB

(By default the connection is $Target, but here we created the stored procedure in the source DB.)

Click apply —-> click ok

—>Why are stored procedures required?

—>From the Informatica point of view it is a GUI. In real time, we have some code, which is SQL code (or) PL/SQL code.

—>A source qualifier converts the native data type into the Informatica data type.

—>What is the difference between Joiner & Lookup T/R?

—>Difference between Joiner T/R & Source Qualifier T/R?

—>Files are also associated with a source qualifier.



1.    SQL Queries

2.    User defined join

3.    Source filter

The above three are disabled for file sources, because a file doesn't understand SQL statements.


1.    An active transformation that reads the data from relational tables and flat files.

2.    Use the source qualifier T/R to implement SQL override

SQL override: it is the process of writing user-defined SQL statements or changing the default SQL at runtime.

3.    SQL override supports when source is relational database

4.    Define the following properties of the source qualifier

I.    SQL Query: 1. This property supports a user-defined SELECT statement, or we can change the default SQL query.

2. The user can write only a SELECT query. (No other statement is supported, i.e., no insert, update, delete.)

II.    Source filter: 1. It will insert a WHERE clause to filter the source rows.

2. Performance consideration: keep the Filter T/R as close to the source qualifier as possible to filter the rows early in the data flow; if possible, move the filter condition into the source qualifier itself.

III.    User defined join: 1. can join any no of tables

2. Support standard SQL joins,

a)    Inner join

b)    Left outer join

c)    Right outer join

d)    Full outer join

—->Using the Joiner T/R also we can do a self-join; in this case we use sorted input. In the Source Qualifier T/R also we can implement a self-join.

—->How can we implement a self-join using the Joiner T/R?

a)    Here we have 4 thumb rules.

3. Supports only homogeneous relational tables (that means it doesn't support heterogeneous sources, text files, XML files).

—>Source qualifier joins any No of tables

4. Defines the where clause to join separate data sources.

IV.    No of sorted ports: 1. it will insert an order by clause to perform sorting either ascending or descending order default is ascending.

2. Can define the sort on single or multiple ports.

V.    Distinct: eliminates duplicated rows

—–>The source filter property filters the source rows & the distinct property eliminates duplicates; that’s why the source qualifier is an active T/R

VI.    Pre-SQL: 1. SQL statements are executed using source database connection before extraction begins.

2. Multiple SQL statements are separated using (semi-colon)

VII.    Post SQL: 1. SQL statements are executed using source database connection after extraction completes.

—->A source is associated with a source qualifier. The source qualifier converts the native data type into the Informatica data type.

—->Using the source qualifier T/R the user can write SQL queries; the SQL statements we write are submitted to the source database.

—->SQL statements are submitted to the database server by the integration service, because the integration service itself does not execute SQL statements.

—->Teradata is not used for OLTP; it is only for decision support systems & OLAP.

—->SQL override is nothing but writing user-defined SQL statements, or changing the default SQL statements.

—->The source qualifier generates SQL statements by default; we can change those default SQL statements.
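For example (a sketch only, using the Scott EMP table from these notes), the default query the source qualifier generates and one possible SQL override could look like:

Default generated SQL:
SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.DEPTNO FROM EMP

SQL override (user-defined change):
SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.DEPTNO FROM EMP WHERE EMP.SAL > 4000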

—>The lookup should be on a relational database

SQL override is supported when the source is an RDBMS, in

1)    Source qualifier T/R

2)    Look up T/R

—>Look up can be on source or target tables


The following transformations support SQL override when source is relational database,

a)    Source qualifier T/R

b)    Look up T/R

—->Using source qualifier T/R what can I do?

—>What is the diff between joins define source qualifier & joins define joiner T/R?

(A) 1. Using the source qualifier T/R we can join any No of tables.

2. It supports only homogeneous relational tables (coming from the same database, i.e., EMP (oracle), employee (oracle)). It doesn’t support heterogeneous relational tables (coming from different databases).

(Q) Differences between joiner T/R & source qualifier T/R?

Joiner T/R                                                                                     Source qualifier T/R

1. Supports only two tables per join                                          1. Supports any No of tables to make joins

2. Supports heterogeneous sources                                          2. Supports only homogeneous relational tables

3. Supports non-relational sources (flat files)                          3. Doesn’t support non-relational sources

—>The queries we write in the source qualifier are submitted by the integration service to the database server.

—->If the source database server does not have enough database cache, it degrades the performance.

—->If we want to use more joins, we need a larger database cache size.

—->The joiner T/R uses the integration service cache; the source qualifier uses the database cache.

—->Place the filter T/R as close to the source qualifier as possible to improve the performance.


—->If your source is a relational database, use the source qualifier T/R; the filter condition can be written in the source qualifier itself.

—->There are 2 approaches:

1.    Extract all the data & filter it.

2.    Extract only the relevant data.

1)    Extract all the data & filter it (relational reader): at runtime every row is extracted record by record and then filtered downstream.

2)    Extract relevant data: push the filter into the source qualifier so that only the required rows are read.

3)    Flat file (file reader): the filter cannot be pushed into the file, so in this situation we use a filter T/R.

—->Using source qualifier T/R,

1.    Filtering

2.    Joining

—->Why use the joiner T/R & sorter T/R when we have the source qualifier T/R?

(A) When the source is a text/flat file we use the sorter T/R, filter T/R and joiner T/R, because the source qualifier T/R supports SQL only for homogeneous relational sources.

Using the filter T/R we can filter the data from flat files as well as relational sources.

—->When the source is a relational database, whatever we want to do can be done with select statements, so we use the source qualifier T/R.

—–>Even when the source is relational, sometimes the source qualifier is not recommended.

Source qualifier joins – advantages:-

—->we can join any No of table this is one major advantage

Full functionality of standard SQL:-

It supports standard SQL joins. We can implement cross joins, all joins.

May reduce the volume of data on the network:

The where clause (where EMP.DEPTNO = DEPT.DEPTNO) filters rows on the database side.

Does it affect performance?

It may or may not.

(A) When the join query runs on a source database server that does not have enough database memory (cache), performance may degrade, but it is not guaranteed to degrade.

Source qualifier join:-


Advantages:

1.    Can join any No of tables; the full functionality of standard SQL is available

2.    May reduce the volume of data on the network (where clause)

Disadvantages:

1.    Can only join homogeneous relational tables.

2.    Can affect performance on the source database (when the source database doesn’t have enough database cache or memory)

—->If we have strong SQL knowledge, do we need to use any other property in the source qualifier T/R? The SQL query property alone can cover filtering, joining and sorting.

—->SQL override: we can write the SQL query from scratch or change the default SQL query; this is called SQL override.

—->Source filter: here we don’t write the where keyword or a select statement, only the condition itself.

Ex:- EMP.SAL > 4000

This condition is inserted into the generated SQL as a where clause.

—>No of sorted ports: the default is ‘0’ (zero, no sorting); setting it to 1 sorts on the first port.

—->Sorting always applies to the first N ports in order. If I have 8 ports and set the value to 2, the first 2 ports are used for sorting; I cannot pick arbitrary ports.

—->If no of sorted ports is 6, the first 6 columns (ports) are used; but suppose we want to sort on salary only.

In this situation we change the default SQL query:

select EMP.EMPNO, EMP.ENAME, EMP.SAL from EMP order by EMP.SAL

(replacing the generated order by EMP.EMPNO).

—->Pre SQL: statements executed before the load (extraction) begins.

—->Before extraction begins, your integration service submits the pre SQL statements to the source database server with the help of the source DB connection. If we have multiple SQL queries, every statement is separated by ; (semicolon).

Here we can write any type of SQL statement,

Select ——-

Update ——

These statements execute on the source database server.

Post SQL is used in the same way as pre SQL, after extraction completes.

—>In pre-SQL we can write any type of SQL statement, but in the SQL query property we can use only a select statement.

—->Update override: this is a target-side property (it belongs to the target definition, not to the source qualifier T/R).

—>Source table doesn’t have any index:

My source table doesn’t have any index, but I want to improve extraction efficiency: before extraction create an index (pre SQL), and after extraction drop the index (post SQL).

Target table has an index:-

—->Before the load drop the index, and after the load re-create the index, when our target table has an index.
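A minimal sketch of what could go into the pre SQL / post SQL properties for this index scenario (the index names IDX_EMP_DEPTNO and IDX_STG_EMP and the target table STG_EMP are only illustrative; multiple statements are separated by semicolons):

Source pre SQL:   CREATE INDEX IDX_EMP_DEPTNO ON EMP(DEPTNO)
Source post SQL:  DROP INDEX IDX_EMP_DEPTNO

Target pre SQL:   DROP INDEX IDX_STG_EMP
Target post SQL:  CREATE INDEX IDX_STG_EMP ON STG_EMP(EMPNO)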

—->Integration service sorting can be faster than database sorting, because the database may or may not have enough buffer size.

Sorter T/R – performance consideration:-

Using a sorter transformation may improve performance over an “order by” clause in a SQL override when the source is a database, because the source database may not be tuned with the buffer size needed for sorting.

Scenario 1:-

If the source contains ‘n’ rows load n+5 rows in the target.

(A) I would implement it using SQL override (union all).

Alternatively, we can use the rank T/R or union T/R.

—>I have a source of 20 rows – how many rows do we get by using rownum > 10?

(A) Zero rows. ROWNUM doesn’t work with the > operator.

It supports <, <= and = 1 (equality with any value other than 1 returns no rows).
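A hedged sketch (Oracle syntax; the alias rn is only illustrative) of why rownum > 10 returns nothing and how rows after the 10th can still be fetched:

SELECT * FROM EMP WHERE ROWNUM > 10
-- returns 0 rows: ROWNUM is assigned only as a row passes the filter, so no row ever reaches 11

SELECT * FROM (SELECT EMP.*, ROWNUM rn FROM EMP) WHERE rn > 10
-- materialize ROWNUM in an inline view first, then filter on the alias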

–>Materialized views are important to implementing warehousing.

—>Before we write a query in the source qualifier, copy the ports from the source qualifier to some T/R (or) to the target; the reason is that only then can the SQL query be validated for errors.

—>Select * from EMP                              ODBC data source: Scott – ODBC –

Union all

Select * from EMP where rownum < 6                             [validate]

If the query is highlighted in yellow, it is wrong.

For this reason I recommend connecting the source qualifier to the target (copying the ports to any T/R (or) the target) before writing the SQL query; that way no syntactical error goes unnoticed.



1.    Double click the source qualifier T/R —->Select the properties tab.

Transformation attribute                                                    value

SQL Query                                                                                 select * from EMP

union all

select * from EMP where rownum < 6

Scenario 2:-

If the source contains ‘n’ rows where n=even (not odd number). Load second half records into the target.

Suppose I have 18 rows ———– the second half is 9 rows.

[9] —-> I want the 2nd half.

We can try a SQL solution, and we can also try it without SQL.


With SQL solution:-

Query 1: Select * from EMP —-> 14 rows

Query 2: Select * from EMP where rownum <=

(Select count (*)/2 from EMP) —-> the first half

—->The second half is Query 1 – Query 2 (set difference); see the sketch below.
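Putting the two queries together, one possible SQL override for the second half (Query 1 minus Query 2, Oracle syntax, a sketch only):

SELECT * FROM EMP
MINUS
SELECT * FROM EMP WHERE ROWNUM <= (SELECT COUNT(*)/2 FROM EMP)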

Scenario 3:-

If the source contains ‘n’ rows. Where ‘n’ even loads first half records into the target.

(A) Select * from EMP where row num<= (select count (*)/2 from EMP)



If the source contains ‘n’ rows where ‘n’ is odd, for example n = 5, then we get 2 records for this query: (select * from SALGRADE where rownum <= (select count (*)/2 from SALGRADE))

Note:- when ‘n’ is even we get exact halves; with an odd number of records the split is uneven.

—->how to use joins in source qualifier T/R?

—->Scenario: we write this query in SQ T/R —>Properties —>SQL Query

Select * from EMP —>has 14 records

UNION

Select * from EMP where rownum < 6 —-> these 5 rows are all duplicates of rows in the first query

We get 14 records, because union doesn’t allow duplicates.

Select * from EMP —-> 14 records

MINUS

Select * from EMP where rownum < 6 —> 5 records

—–>Result is 9 records.

Select * from EMP where rownum < 6

MINUS

Select * from EMP

We get ‘0’ records, because the 1st query has 5 records and the second query has 14 records; when we use the minus set operator between Q1 & Q2, only the rows of Q1 that are not in Q2 are returned, which here is zero records.


If the source contains ‘n’ rows where n is odd, here n = 5 (records),

for this query we get the first 2 records only, because n is odd:

Select * from SALGRADE where rownum <= (select count (*)/2 from SALGRADE)

If the source contains ‘n’ rows where n is odd, here n = 5 (records),

for this query we get the last 3 records:

Select * from SALGRADE

MINUS

Select * from SALGRADE where rownum <= (select count (*)/2 from SALGRADE)

—>No of sorted ports: SQ T/R —>Property —>no of sorted ports

No of sorted ports

SQL Query






—>File versioning for a mapping: check in after create, check out to edit.

M – Even           Validate

M – Odd   —–>The mapping is modified but not saved.

—–>Exported objects (files) are by default in XML format.

—-> I want to export objects (mapping, source, target, session, workflow). If I export the workflow, then the source, target, mapping, session and workflow are all exported at a time. Now I am exporting one workflow from one folder (suja) to another folder (Renu).

Procedure: Go to power center repository manager (or the designer also works)

Select the folder (suja) —>click on the workflow to export.

After selecting export, it asks for the path; click save.


—->Select the folder into which we want to import; then go to the repository menu —>select import objects.

—>Select that workflow (W – S – M – CUTS – email) —–> click ok

Then press [next] in the import Wizard.

In the folder navigation window select the Renu folder —> then click ok. Now the destination folder is

Nipuna – rep:: Renu

Then press next, and continue clicking next through the remaining wizard screens until the import completes.

—->Then go and check the Renu folder to verify whether the workflow was imported (select the Renu folder before checking).

—>My mapping is an invalid mapping: we didn’t connect the source or the target. Sometimes we forget to take the target; that is when we get a mapping like this,

*M – even —->is an invalid mapping

—>On right click there is no delete option; that’s why we drag the mapping into the mapping designer.

If I want to delete a mapping, open the mapping designer, drag in the mapping we want to delete, go to the mappings menu —>select delete.

If I want to validate and save that mapping, again drag the mapping into the mapping designer, then go to the mappings menu —->select validate (i.e., mappings —->validate).

Then it will automatically refresh and save; like this we can validate the mapping.

—->In Wizard I have developed mappings (i.e., SCD’S). I want to check,

(A) Go to mappings menu –>select wizard —->slowly changing —->slowly changing dimensions

—->Walk through the wizard screens (click next on each) and press [finish].


Note: -Lookup is mandatory to SCD’S without lookup we can’t develop slowly changing dimensions.


M – EMP —–>Select this mapping —->right click on mapping

Select versioning —-> check in

Then we get,



Then that mapping is check in i.e.,

*—— m – emp

Drag this mapping into the mapping designer. In it the source ports, target ports and transformation ports are disabled, because we checked in the mapping.

—->now we select above mapping (check in mapping)

*—— m – EMP —>select these mappings ->right click on mapping select versioning —->check out.


Then we get,


Then that mapping is checked out i.e.

*——m – EMP

Drag this mapping into the mapping designer. In it the source ports, target ports and transformation ports are enabled, because we checked out the mapping. Check out means we can do modifications.

—>I want to link all ports at a time without using the select all option.

(A) Layout menu (in power center designer —->select mapping designer)

—->Auto link by name

By using this, without select all, all ports are linked to the other T/Rs or to the target at once.

Source Qualifier – homogeneous joins:-

Sources:                                               Target: EMP – Dept – SQL – Joins

EMP (oracle)                                       (EMPNO, ENAME, JOB, SAL, Deptno, Dname, Loc)

Dept (oracle)

1.    Create a mapping —>drop the source & target definitions

2.    The source qualifier T/R supports only homogeneous relational sources.

3.    One source qualifier can join any No of tables.

4.    If we drop 30 sources, we delete 29 source qualifiers and use one source qualifier for all the sources. Here we use n-1 join conditions in the source qualifier.

5.    A single source qualifier gets associated with multiple sources.

6.    Delete SQ – DEPT —->from dept source definition copy all the columns to SQ – EMP

7.    From SQ – EMP connect the ports to the target.

8. Double clicks the source qualifier —–>select the properties tab,





Transformation attribute                                                     value

SQL Query                                                                               Select EMP.EMPNO, EMP.ENAME, EMP.JOB, … (the full join query; see the sketch below)
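The query text above is cut off; a sketch of what the full value could look like for this EMP/DEPT join (standard Scott tables, target columns as listed for EMP – Dept – SQL – Joins):

SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.DEPTNO, DEPT.DNAME, DEPT.LOC
FROM EMP, DEPT
WHERE EMP.DEPTNO = DEPT.DEPTNO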




—->Standard SQL joins available in SQL Query property

—–>Self-join —->Alias names

—->Cartesian join

Accidental Cartesian join

—–>Join index: Index created for join. For large complex Query, we create index.

—->outer join: Inner join + non – matching rows

—->Left outer join: matching rows + non-matching rows from the left side table

—->We can join 64 columns in Teradata

—->We can join 32 columns in oracle.

Click apply —-> click ok

—>I want to join 40 tables. If you drag 40 tables, you get 40 source qualifiers; out of the 40 we can delete 39 source qualifiers.

40 sources + 1 SQ + 1 target = 42 objects

A performance issue will come.

Rather than 40 sources I create 1 source; one source represents the 40 sources.

i.e., sources —>create —>from a performance point of view this is what we use in real time.

Scenario: -Database is different account.


—–>In the two sources using source Qualifier?

Home preparation:

1.    What is inlining view (In Teradata derived views)

2.    What is materialized view

3.    Client can’t directly give access permissions to a table that may be flat file (or) XML file.
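A brief sketch for items 1 and 2 above (Oracle syntax; the view name MV_DEPT_SAL and the aggregation are only illustrative):

-- Inline view (derived table): a subquery used in place of a table
SELECT d.DEPTNO, d.TOTAL_SAL
FROM (SELECT DEPTNO, SUM(SAL) AS TOTAL_SAL FROM EMP GROUP BY DEPTNO) d

-- Materialized view: the query result is physically stored and can be refreshed
CREATE MATERIALIZED VIEW MV_DEPT_SAL
REFRESH COMPLETE ON DEMAND
AS SELECT DEPTNO, SUM(SAL) AS TOTAL_SAL FROM EMP GROUP BY DEPTNO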

Scenario: the source tables are in a different database account.

Grant select on dept to Scott —–>run in my own user (suja)
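A minimal sketch of the scenario, assuming DEPT lives in the suja account and Scott has to read it:

-- run as suja
GRANT SELECT ON DEPT TO SCOTT;

-- run as Scott (or in the source qualifier SQL override), qualifying the table with the owning schema
SELECT EMP.EMPNO, EMP.ENAME, SUJA.DEPT.DNAME
FROM EMP, SUJA.DEPT
WHERE EMP.DEPTNO = SUJA.DEPT.DEPTNO;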


Flat files

1.    A text file which is saved with an extension .txt, .csv (comma separated values), .dat

2.    There are 2 types of the flat files,

1.    Delimited flat file

2.    Fixed width flat file

—->What is a flat file?

An ASCII character text file, which is saved as .txt, .csv (the data opens in Excel), .dat

.txt —->normal notepad (the data gets stored in notepad)

—–>In real time we get flat files.

Delimited:- every column (field) is separated by some special character.

Symbols: | (pipe), tab, space, , (comma), semicolon

Delimiter means column separator.

Ex:- Empno|Ename|Job —–>pipe delimiter


—–>Performance wise, fixed-length files are better.

—–>How to extract the data from the file?

Indirect method:-

—–>single session can read the data from multiple files.

—->what is file list? (List of files that can be merged)


—->Without UNIX can I manage this activity?

—->Generally full UNIX knowledge is not required; a shell script is required, and in it we use the pmcmd utility.

Shell scripting is used for administration.

Which commands are useful for a developer?

—->UNIX is very important for flat files.

In UNIX, we use escape characters.

—->Interview point of view:




Project explanation

Delimited flat file:

Every column or field is separated with some special character such as comma, tab, space, semicolon, pipe etc.

—->In real time files are kept in server side (next)

—>How to extract data from single text file?

—->How to extract data from multiple files (list file)

Ex: -customer – east – region – notepad

Customer – id, customer – name, account

7369,                   James,                      7500

7499,                  David,                        4000

7566,                  Krishna,                     8000

7001,                   Ravi,                           9000


Step1: import metadata from flat file


1.    from tools menu —>select source analyzer

2.    from sources menu —>click on import from file

3.    Select the file directory (c:\flatfiles) —->where we store file

4.    Select files of type —->All files

5.    Select the file (customer – east – region. txt) —–>click ok.

—–>In files always default is *delimited

—>If the first row is treated as field names, the data integration service starts reading the data from the 2nd row onward: start import at row [2], *Import field names from first line

6.    Select the flat file type —->delimited

7.    Select the import field names from the first line

8.    Click on next —->click on next

9.    Click on finish

10.    from repository menu —>click on save

—>Sometimes we receive the data with a single quote or double quote text qualifier; if there is no single or double quote, choose ‘no quotes’ / ignore it.

—>For a | (pipe) delimiter we select Others [|]

In case the file has no header row and I want to give my own column names:

Start import at row [1]   [ ] import field names

—>For .csv files, we open them with Excel:

Start —>MS office —->Microsoft excel



At the times of saving save with .csv —> (comma delimited)

(Default is.XLS)

Just give the file name: Employee

—>click ok —>click ok

It asks anything means we get any message we click ok

—>Sometimes we receive an .XLS file.

For .XLS we need XLS reader drivers, which are not available.

So save the .XLS as .csv; a .csv is a simple flat file.

—->sometimes we design the mapping with flat file

Target flat file


—->When the source is a flat file, it is also associated with a SQ (source qualifier)


Step2: procedure:

1.    Create the target table with the database type oracle.

Note: Informatica power center also supports flat files as targets.

Step3: procedure:

1. Design flat mapping

Step4: process

1. Create a session —>Double click the session —>select the mapping tab —–>From left window select the source (SQ – Customer – region)

—>Source File type:

Direct: a method to read the data from a single file.

Indirect: reads data from multiple files (a file list).

2. From the properties section set the following attributes

Attribute                                             value

Source file type                                 Direct

Source File directory                        c:\flatfiles

3. from left window select the target

4. Set writer connection with the load types normal

5. Click apply —>click ok

File list

1.    A file list is a list of flat files with similar metadata definitions that can be merged with the source file type indirect

2.    In file list minimum 2 files required

Source data files:-

1)    Customer – east – region. Txt

Customer – id, customer – name, amount

7369, James, 7500

7499, David, 4000

2)    Customer – north – region. Txt

Customer – id, customer – name, amount

5001, patro, 1500

5002, Mohanty, 5000

3)    Customer – south – region. Txt

Customer – id, customer – name, amount

4001, Shobha, 1900

4002, Divya, 3000

4)    Customer – west – region. Txt

Customer – id, customer – name, amount

6001, Verma, 1500

7002, Rama, 6000

—–>How to create a file list:

Open notepad (see the paths below).

—->A file list is itself a file, but it contains the list of data files.

Creation of file list:-

A file list is a file that contains the list of source data files.


Open notepad —->type the paths of the source data files as given below

C:\flatfiles\customer – east – region.txt

C:\flatfiles\customer – north – region.txt

C:\flatfiles\customer – west – region.txt

C:\flatfiles\customer – south – region.txt

–>we need one metadata (any one of file metadata is enough) in source analyzer.



1.    Import metadata from any of the source data file

2.    Create a target table with the database type oracle.

3.    Design flat mapping

4.    Create session

5.    Double click the session

6.    Select mapping tab

7.    from left window select the source

8.    From properties, section set the following attributes

Attribute                                       value

Source file type                                Indirect

Source file directory                        c:\INFA – File – list —-> the file list directory (where we saved the file list)

Source filename                              Customer – region.txt —> the file list name

9.    from left window select the target —>set relational writer connection with load type normal.

10.    Click apply —->click ok

Scenario: I have 3 columns, but I want to add an extra column, the filename (4th column), to display the name of the file. We should create the target table with 4 columns.

Scenario: Display the name of the file along with each record in the target.

Workflow Scenarios:


1.    Source —->Right click select edit

Select properties —>Add currently processed flat file name port

2.    Decimal separator (property)

Dot is default

Ex:-4000.00 —>. (Dot is decimal Sep)

3.    Date format

[Mm / dd / yyyy hh24: mi: ss] —>UA: (Altra sector)

Dd —>What format data will be coming

Yyyy —>Informatica data format

4.    Thousand Separator [None]

Fixed width flat files

1)    Every record is of same length (width)

2)    Break the record into the multiple columns using breakpoints.

3)    Operating system dependent

4)    For UNIX operating system set record delimiter as 1 byte

5)    For windows operating system set record delimiter as 2 bytes

6)    Fixed width flat files can improve the performance over delimited flat files

—>Delimited flat files are very simple to create

—->In a fixed width flat file every record has the same length

—->In real time we receive both delimited & fixed-length files

—>Design a mapping whose source is oracle and whose target is a fixed-length flat file.

Ex: -Employees. Txt

7369, Krishna, 500030

7566, Kumar, 200010

7499, Satish, 600010

6363, James, 400020

Step1: Import metadata from fixed width flat files


1)    From sources menu —>click on import from file.

2)    Select the file directory (c:\flatfiles)

3)    Select the type of files all files

4)    Select the file (Employee.txt) —>click ok

5)    Select the flat file type fixed width —>click on next

Break line: -Break the record into multiple columns.

6)    Click at the desired location to create break line



Click on next

7)    Provide the column name for each field —>click on finish

8)    Right click on the source definition —->click on edit

9)    Click on advanced

10)    Set no of bytes to skip between records 2

11)    Click ok

12)    Again click apply —>click ok

XML Source qualifier T/R

1.    Reads the data from XML files

2.    XML source definition associates with XML Source qualifier

3.    XML files are case sensitive markup language

4.    Files are saved with an extension.XML

5.    XML files are hierarchical or parent-child relationship file formats

6.    Files can be normalized or denormalized

—>While creating an XML file we follow either upper case or lower case consistently.

Every XML file is associated with an XML source qualifier.

—->One parent —->many children. In the denormalized format the parent data is repeated for each child (Venkat —-> Nishal, Venkat —-> Nipuna); in the normalized format the parent (p) and child (c) are kept in separate related groups.

—–>An XML file has no precision or scale.

Ex: -Employee. XML
















Import XML source definition:-

1.    from sources menu click on —–>Import XML definition

2.    Select the file directory (c:\XML)

3.    Select the files of type XML files

4.    Select the file (EMP.XML) —>Click on open

—>Click on yes —->click ok —>click on next

Select *hierarchy relationships

Select *denormalized XML views


Click on finish

—>Drop the XML source definition into the target designer.

A relational target —>may be oracle, SQL server

An XML target

—>Informatica supports XML as a target.

—>When we installed the repository service we used oracle as the target database; that’s why when we drag the source into the target designer it is automatically taken as an oracle (database) target.

—>XML source definition associated XML source qualifier


1)    Create the target table with the database type oracle

2)    Create a flat mapping

3)    Create a session

4)    Double click the session –>select the mapping tab

5)    From left window select the source

—>We see 3 readers

1.    Relational readers —->it can read the data from databases (oracle, SQL server)

2.    File reader —>it can read the data from the file that is a flat files, the flat list, fixed width flat files.

XML Reader —>It can read the data from XML files.

6)    From properties section —>set the following attributes,

Attribute                                                                 value

Source file directory                                           c:\XML

Source file name                                                   employee. XML


7)     From left window —>select target —>set the target load type normal —>set the writer connection to the target

8)    Click ok —>click apply —>click ok

Note:- If we set the target load type to Bulk here, the session will fail.

—>Data pivoting: one record in the source is translated into multiple records. It is possible by using the normalizer T/R.

Reverse pivoting: multiple records are converted into one record. This is not possible using the normalizer T/R.

—>From an interview point of view data pivoting and the normalizer T/R are important, but in real time they are used only in some rare situations.

–>using normalizer T/R we can read the data from COBOL files (or) VSAM sources.

—->by using normalizer T/R we can do data pivoting

—>To read the data from relational tables & flat files —->SQ T/R

XML files —>XML SQ T/R

COBOL files —>normalizer T/R

—->We have migration projects (COBOL —>Other technologies) that time we will use normalizer T/R


Normalizer transformation


1)    This is an active T/R which reads the data from COBOL files and VSAM sources (virtual storage access method)

2)    Normalizer T/R act like a source qualifier T/R while reading the data from COBOL files

3)    Use normalizer T/R that converts a single input record into multiple output records. This is known as data pivoting.

Source data file:


Year, account, month1, month2, month3

2001, salaries, 45000, 90000, 60000

2002, benefits, 15000, 20000, 35000

—->In any project we use about 8 T/Rs. We may use the normalizer T/R when working in the banking or finance domain. In a tally system this kind of example comes up: the financial data arrives organized as in the source below, and my tally system converts it into the target layout.

–>Month id:- automatically generated by the normalizer T/R. If we need the month name, after the normalizer T/R we use an expression T/R with the decode function (see the sketch below).

—>The month column is called the GCID (Generated column id)
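A sketch of the expression T/R port mentioned above, turning the generated column id into a month name with DECODE (Informatica expression language; the output port name MONTH_NAME is only illustrative, GCID_Amount is the port produced by the normalizer):

MONTH_NAME = DECODE(GCID_Amount, 1, 'JAN', 2, 'FEB', 3, 'MAR', 'UNKNOWN')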

—>reverse pivoting not possible in normalizer T/R

(A) Aggregator T/R, group by (year) and group by (account).

Reverse pivoting is done only by the aggregator T/R.

In the output ports we use the IIF function on month – id (see the sketch below).
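A hedged sketch of the aggregator approach to reverse pivoting (group by YEAR and ACCOUNT; port names are illustrative): one output port per month picks up the amount for that month id, and MAX collapses the three input rows per group back into one row.

MONTH1_AMOUNT = MAX(IIF(MONTH = 1, AMOUNT, NULL))
MONTH2_AMOUNT = MAX(IIF(MONTH = 2, AMOUNT, NULL))
MONTH3_AMOUNT = MAX(IIF(MONTH = 3, AMOUNT, NULL))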






Year           account                     month1              month2           month3

1997           salaries                     75000                   86000            94000

1999          benefits                     15000                    17000            14000


Year                   accounts                 month                   amount

1997                  salaries                       1                          75000

1997                  salaries                       2                          86000

1997                   salaries                       3                           94000

1999                  benefits                     1                           15000

1999                   benefits                     2                           17000

1999                    benefits                    3                            14000


—->create target from scratch:

In power center designer —->select target designer —->from target menu select create —->enter target tally name.



1.    Import metadata from an account. Txt flat file

2.    Create the target table Stg – accounts with the database type oracle (year, account, month, amount)

3.    Create a mapping (m – data – pivot)

4.    Drop the source & target definitions to the mapping designers workspace

5.    Create the transformation type normalizer.

—->The ports are not simply copied from the source qualifier to the normalizer T/R; instead, double click the normalizer T/R and define the columns there (the level option is used for COBOL files).

6.    Double click the normalizer T/R —>select normalizer tab —>from toolbar click on add a new column.

Column name             Occurs          data type       precision         scale

Year                                   0                 number             4                     0

Account                            0                 string                10                    0

Amount                            3                  number             7                     2

Click apply —->click ok

7.    From source, qualifier connects the ports to normalizer

8.    From normalizer connect the ports to the target

Year —–>year

Account —->accounts

GCID – Amount —>month

9.     GCID Stands for generated column ID

10.    GK stands for generated key

GCID —>restarts for each source record (1, 2, 3, then 1, 2, 3, …)

GK —>is a continuous number: the first session run gives numbers (1 – 6), and the next run continues with (7, 8, ———- 12)

Month —>GCID                   GK

1                                               1

2                                               2

3                                                3

1                                                4

2                                                 5

3                                                 6


Converts columns into rows (data pivoting)




Source data:

ID               JAN — Sale                  FEB — sale             MARCH —- sale

1                         100                         120                            135

2                          110                        130                            120

ID                        Month             sales

1                            JAN                100

1                            FEB                 120

1                            MARCH           135

2                              JAN                 110

2                              FEB                  130

2                             MARCH            120

—>Because the occurs property is 3, the normalizer T/R generates the month – ID (GCID) values 1 to 3 by default


—>I had a mapping that runs for a long time (it has a bottleneck, some problem); I want to optimize the mapping.






*What is the diff B/W ETL & ELT?

A. The letters are the same but there are many differences between ETL & ELT.

*Paid options: for these we take a license (paid money), e.g. ELT / pushdown optimization (licensed in 9.5) and session partitions.

*By using an unconnected lookup we can reuse the same lookup for multiple targets

Substring ()

Instr ()

Informatica Advanced course

1. Dimensional Modeling

a. Star schema

b. Snow-Flake schema

c. Galaxy-schema

2. Implementation of Type-1 Dimension

3. Type-ii Dimension to preserve historical data

4. Type-iii Dimension with partial historical data

5. Unconnected lookup (Conditional lookup)

6. Unconnected stored procedure to drop & recreated indexes

7. Differences between connected & unconnected lookup?

8. Types of lookup cache?

9. Implementation of persistence lookup cache

10. Implementation of shared lookup cache

11. Implementation of dynamic lookup cache

12. Mapping parameters for standardizing business rules

13. Mapping variables for incremental extraction

14. Session parameters

15. Mapplets & types of mapplets

16. Reusable T/Rs

17. User defined functions (UDF)

18. Diff between variable port and mapping variable

19. Link conditions

20. Task & Types of Tasks

21. Worklet & Types of worklets

22. Objects sharing

23. Pushdown optimization (ELT)

24. Session partitions

25. Session Recovery

26. Diff between Normal & bulk loading

27. Importing & exporting repository objects

28. Constraint-based load order (CBL) primary key, foreign key

29. Target load plan (TLP)

30. PMCMD utility

31. PMREP utility for administration

32. Data validation testing

33. RTL integration testing

34. Performance testing (Identify performance bottle neck)

35. Version control (Check in, checkout)

36. Mapping debugger

Home Assignment Topics:-

1. What is Error Threshold?

2. What are the types of errors in power center?

3. Type of Error Logs?

4. Different between Stop & Abort?

5. What is user defined commit?

6. Different between Source based commit & Target based commit

7. Define incremental Aggregation?

8. Different between ETL & ELT?

*9. What is update override?

[Update override is related to the update strategy T/R (DML commands in SQL);

we discuss it in slowly changing dimensions]

10. Define enable high precision data

11. Difference between cached & uncached lookup?

* Could you please tell where an uncached lookup improves the performance?

*A cache can improve the performance in the lookup T/R; caching is enabled by default.

12.  Explain reject truncated/overflowed rows

13. How to implement self-joins using the joiner T/R?

*In the joiner we need sorted input (the sorted input property also exists on the aggregator T/R)

Joiner —->double click —>select properties tab —->Sorted Input (this property belongs to the joiner T/R and the aggregator T/R)

14. What is code page compatibility?

15. What is a tracing level? And types of tracing levels?

16. What are the types of data movement mode in Informatica power center?

*a. ASCII mode

b. Unicode mode (Types of data movement modes

17. Collect performance data

*Performance monitoring (how much cache is used)

Normalized T/R ——>Properties—–>collect performance data

18. Target – pre SQL

Target – Post SQL

19. Explain what the work directory is. Which T/Rs require a work directory?

2. There are 2 Types of Errors (session Errors)

a. Fatal Error: A fatal error occurs when the integration service can’t access the source, target (or) repository (the session stops immediately when a fatal error occurs)

b. Non-Fatal Error: A non-fatal error is an error that doesn’t force the session to stop on its first occurrence(Session will not stop immediately when non-fatal error occurs)

3.Dimensional Modeling:-

Data Modeling:-

1. It is a process of designing the database by fulfilling the business requirement specifications.

2. A data modeler (or) database architect designs the warehouse database using a GUI based data modeling tool called “ERWIN”

3. ERWIN is a data modeling tool from Computer Associates (CA)

4. A dimensional model consists of the following types of schemas designed for a data warehouse

a. Star schema

b. Snowflake schema

c. Galaxy schema

5. A schema is a data model which consists of one or more tables

*Could you please tell which databases are designed with a star schema?

The Teradata database is designed with the star schema.

*What are the differences between star schema & snowflake schema?

*Which schema are you using in your project?

*Why would you create target tables?

Just to check the business rules; in real time we never create target tables, we can’t simply drag & drop them.

*Data modeler (or) Architects will design schema (Schema is nothing but data model)

*Administrator maintain the database

*What is schema?

*What is data model?

A. Data model consists of one (or) more tables targets

*In real time already existing target tables

*The offset part in a data warehouse in data modeling?

*Business analyst gather the requirement.

*Using the ERWIN tool the data modeler designs schemas:

Reverse engineering a schema

Repairing a schema ——> changing data types

Forward engineering a schema

A data modeler should be an expert in the ERWIN tool

*My warehouse design with 3 schemas

1. Star schema

2. Snowflake schema

3. Galaxy schema

Interview Questions based on schemas:

Load rules, load frequency, load types, load order, performance techniques





a. Star schema:

1. A star schema is a database design that consists of a centrally located fact table which is ideally surrounded by multiple dimension tables

2. Since the database design looks like a star hence it is called as star schema (or) star model

3. There are 2 variances of the star schema,

a. Simple star schema

b. Complex star schema

A. Simple Star schema:-

A simple star schema contains only fact table

B. Complex star schema:-

A Complex star schema contains more than one fact table

Fact Tables:-

1. A fact table contains facts. Facts are numeric

2. Not every numeric is a fact; numerics which are key performance indicators are known as facts

3. Facts are business measures which are used to evaluate the performance of an enterprise

4. A fact table contains the facts at lowest level granularity

5. A fact granularity defines level of details


1. A dimension is descriptive data (or text) which describes the key performance indicators known as facts

2. A dimension table can organize the data in a hierarchical format, hence dimension tables are denormalized

* What is the problem in a star schema? How do we overcome it using the snowflake schema?

*The star schema is one of the database design techniques

*In a data model a table is represented with this symbol:

we keep only the primary key above the horizontal line; below the horizontal line we keep the foreign key or non-key attributes (other than the key)


Fact table(composite key)




*Depends on requirement we have any no.of dimension

What is the difference between a dimension table & a fact table?

The fact table has a primary key (we call it a composite key).

Composite key means = FK1+FK2+FK3+FK4

Surrogate key (it is an artificial key)

*The composite key here represents the combination of the 4 FKs

*The primary key of the fact table is the combination of the keys of all the dimension tables

Composite primary key:-

A primary key which is made up of more than one column (see the sketch below)
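A small DDL sketch of a fact table whose composite primary key is made of the dimension foreign keys (table, column and constraint names are illustrative):

CREATE TABLE SALES_FACT (
  CUSTOMER_KEY NUMBER NOT NULL,
  PRODUCT_KEY  NUMBER NOT NULL,
  STORE_KEY    NUMBER NOT NULL,
  TIME_KEY     NUMBER NOT NULL,
  QUANTITY     NUMBER,
  SALE_AMOUNT  NUMBER,
  CONSTRAINT PK_SALES_FACT PRIMARY KEY (CUSTOMER_KEY, PRODUCT_KEY, STORE_KEY, TIME_KEY)
)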





*There are 2 variances of schema

1. Simple star schema

2. Complex star schema

*The schema diagram with 400 tables is given in PDF format; moving horizontally and vertically through 400 tables is tedious,

so in the PDF we use search ——->type the target name.

As per the spec document (a separate document containing the source & target table names with data types & sizes) we compare the names; if the schema names and the spec document names do not match, or I am doubtful, we contact the database architect.

What is load order?

First we load the data into the dimension tables; then the data is loaded from the dimensions into the fact table.

Dimension tables & fact tables ———–>(means) target tables

*If my schema contains 40 dimensions and 1 fact, we develop 41 mappings and 41 sessions

*Each dimension table having one mapping fact also has 1 map

*Load Rule: based on load rule we design complex workflow

If all dimension loads are success then load data into fact table in such a way we have to design our workflow

Load frequency: -The rate at which data loads (daily basis, weekly loads, monthly loads) refreshing means update with new data

*My DB has 500 tables. 400 refresh with new data tables refreshed on daily load—–>MOD ()

Incremental Load: -Data gets refreshed incrementally

——>Dimension—–>Master (or) parent

Fact—–> Dentail (or)child

Load order:-

Design Mapping applications that first loads the data into the dimension tables and then load the data into the fact table

Load Rule: – If all dimension table loadings are success then load the data into the fact table

Load frequency:-

Database gets refreshed on daily loads, weekly loads, and monthly loads

*I have 2 sessions 1st session 10 times success then execute weekly load (2nd session)

A. In this session execute based on loads (Daily, weekly, monthly load)

Load type:-

a. Initial load (or) full load——>First-time load is initial load

b. Incremental load (or) Delta load—->the 2nd, 3rd, … loads are all incremental / delta loads

Initially, every target table is Empty

Initial load——>1st load is initial load





Initial load (1st load is initial load)

Fact table is always incremental load

*Based on primary key & foreign key we load the data into dimension

*A daily refresh is called a daily load, a weekly refresh a weekly load, and a monthly refresh a monthly load

*Fact is always incremental load (daily transactions are incremented in fact table)

*What is dimension & what is fact?

EX: ABC corporation —->sales business operation

Profit—–>is numeric (a KPI)

Key here means key performance indicator, not a PK & FK

*Facts are numeric: every fact is numeric, but every numeric is not a fact

168  10

169  20

{These numerics are not facts, because through them we can’t analyze the business}

*The team that prepares reports is called the BI (Business Intelligence) team

BI tools means reporting tools

We generate reports by using select statements

E Name                                sal

Krishna                                 8000   {here salary is the KPI}

(This text describes the numeric.)

*A dimension is descriptive data (text) which describes a KPI; such attributes are called dimensions.




*If we want to generate a report, at least one dimension and one fact are required.

Note:- A dimension table organizes the data in hierarchies.

Why are dimension tables denormalized?

*A dimension table may organize the data hierarchically; in that case the dimension is denormalized (see the example above).

*Even when a dimension table is not organized hierarchically, it may still be denormalized.


Customer –SK(PK)







Customer –key   cust-ID  Name                  City

1              7319       Ram       HYD

2              7319      Ram       BNG

{Denormalized Data Pattern}

Q. Which tables are big (fact or dimension)? Fact tables.

A. The fact table is always large when we compare it with any dimension table.

Q. What happens if the fact table is very large?

A. Query performance degrades (select query).

How do we improve performance when the fact table is very large?

What happens if a dimension is large?

A deep dimension (having more records)

*Large dimensions are normalized into multiple dimensions; this technique is used in the snowflake schema. For that reason we say the snowflake schema is derived from the star schema.

*I want to do daily loads, weekly loads & monthly loads: 7 days of daily loads summarized (clubbed) gives (or) lets us do a weekly load,

and 30 (or) 31 days of daily loads summarized (grouped) lets us do a monthly load; loads are summarized based on facts only.

*The fact table contains the facts at the lowest level of granularity with respect to each dimension table

(diagram: each dimension hierarchy drills down to its lowest level)

The fact table is very large compared to any dimension table

*Why is the fact table very large, and what do we do about it?

The database architect & administrator sit together and develop partitions; partitions allow parallel reading.

Partitions are divided on a monthly basis or weekly basis; some companies do it on a daily basis.

Range partition——>1 1000 1001 200

Star Schema Key Points:-

1. A fact table is very large relative to any dimension table in the schema

2. A fact table contains the transactions which are the multiplication of multiple dimensions

3. If the fact table is very large then SELECT QUERY performance degrades

4. A fact table requires to be partitioned to improve select query performance by avoiding full table scan

5. A range partition is required to define on fact table

Partition: -Large sets divided into multiple subsets

*Facts are always normalized; the fact table always has an incremental load

*A large dimension is also called a deep dimension (or) wide dimension

*What happens if your dimension is very large?

The denormalized dimension is split into normalized dimensions. Deep dimension (or) large dimension —>more rows; wide dimension —>more columns

6. Partitions are created on either a monthly basis, quarterly basis or weekly basis (we usually do not partition on days & years)







Very large Dimension:-

A dimension can be set very large when it is a deep dimension (or) wide dimension

A deep dimension contains more no.of records

A wide dimension contains more columns

If the dimension is very large then select query performance degrades

Q. Why dimension is very large?





Summary points to report:-

The select request may scan or browse 5 lakh records from the dimension table just to display or return 100 records in the report,

so the select request runs for a longer period of time.

To improve the select query performance it is recommended to split a very large denormalized dimension table into multiple normalized dimensions.

Every large fact table is divided into partitions to increase query performance.

*Dimension table is very large that time we can split into normalized dimensions when dimension table is large that time we use snowflake schema

* Snow-Flake schema is an extension of star schema

*Complex star schema is called Galaxy schema


Q. Without using the import & export process, how do we transfer many workflows to a group (a deployment group we create under the menu —–>with our own name as we wish)?

A. In real time instead of import & Export, we will use deployment groups (in repository manager)





Deployment groups——–>Right click select new group




When we press ok, then automatically we get




Now uncheck the workflow ——->[] workflows W-S-decode —–>now drag & drop this workflow into the RK group.

Then we get RK (3)

1. Work flow-logs

2. Data

How to create workflow without creating session

Go to mapping designer——->Select mapping menu

Snowflake Schema:-

1. A large denormalized dimension table is split into multiple normalized dimensions

Advantage:

1. Select query performance increases

Disadvantage:

1. Maintenance cost increases due to the number of tables

*In a star schema no dimension table has a parent. In a snowflake schema a dimension (child) table itself can have a parent dimension; in a star schema no single dimension table has a parent.

Here only [100] records are scanned in the snowflake schema, whereas in the star schema the whole denormalized dimension is scanned.

*Which type of schema can have normalized dimensions?

(A) Snowflake schema

For D1 the parent is D11, for D11 the parent is D12
For D1 patient is D11, for D11 parent is D12

Difference between star schema & snowflake schema

Star schema                                                        Snow Flake Schema

1. No dimension table having parents                     1. A dimension table may have parents

2. Denormalized dimension                                          2.Normalized dimension

3. Less joins                                                                          3.More joins

Galaxy schema:-

1. Conformed dimensions: -A dimension which is shared by multiple fact tables is known as conformed dimensions

2. Fact constellation: – It is a process of joining two fact tables

3.A Galaxy schema is also known as hybrid schema (or) Multi star schema (or) bus schema (or) complex star schema (or) constellation schema(or) integrated schema


Bus Schema:-

*A dimension common to two fact tables is called a conformed dimension

*In PDF format every developer receive schemas

What are the types of fact tables?

What are types of facts?

A. Fact tables are of 3 types

1. Detailed fact table

2. Summarized (aggregate) fact table

3. Fact less fact table

(Facts themselves are additive (sum, i.e. +), semi-additive, or non-additive; see below.)


Types of fact tables:-

Detailed fact table

Summarized fact table (Aggregator)

Fact less fact table

  1. Detailed Fact Table:-

A fact table which contains detailed transaction facts same as OLTP known as detailed fact table




2. Summarized fact table:-

A fact table which contains aggregator facts

Sum (Quantity)

Sum (Sale –Amount)

Sum (profit)

Avg (Quantity)

What is a scenario for a fact less fact table?

3. Fact less fact table: A fact table without any facts is known as a fact less fact table

Note: -Prepare a case study on fact less fact table


Type of facts:-

There are 3 types of the facts that can be stored in fact table

1. Additive facts

2. Semi-Additive facts

3. Non additive facts

1. Additive Facts: – A fact which can be summarized for all the dimensions is known as additive facts

Ex: Quantity

Sale amount

2. Semi-additive Facts: -A Fact which can be summarized for few dimensions but not for all the dimensions is known as semi-additive fact

3. Non-additive facts: -A fact which can’t be additive or summarized for all the dimensions is known as Non-additive facts

EX:-discount (percentages) anytime in % format

% (percentages) is always non-additive facts




1. Quantity, sale-amount can be additive with all dimensions

2. Qty, sale-amount can be additive with the store dimension

3. Qty, sale-amount can be additive with the product dimension

*Different companies have different naming standards for fact tables & dimension tables, sources & targets, but the business rules (logics (or) techniques) are the same in any company





What is the total current balance for all accounts?

Current-balance can be additive with account dimension

Current- Balance cannot be additive with time dimension
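A SQL sketch of why current balance is semi-additive (table and column names are illustrative): summing it across the account dimension at one point in time is meaningful, but summing it across time is not, so across time an average or latest snapshot is used instead.

-- additive across accounts for one point in time
SELECT SUM(CURRENT_BALANCE) FROM ACCOUNT_FACT WHERE TIME_KEY = 20240131

-- not additive across time: take an average (or the latest snapshot) per account instead
SELECT ACCOUNT_KEY, AVG(CURRENT_BALANCE) FROM ACCOUNT_FACT GROUP BY ACCOUNT_KEY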

Fact granularity:- It defines the level of detail, in other words the lowest level of detail;

the fact table stores data at the lowest level of granularity

Slowly changing dimensions:

1. A dimension that can capture the changes which take place over the period of the time is known as slowly changing dimensions

2. There are 3 types of dimensions

a. Type 1 Dimension

b. Type 2 Dimension

c. Type 3 dimension

Type 0——> Always insert into target

Type 1 ——>Insert else update

Type 2 —–> Update else insert

1. Type 1 Dimension:

1. It captures only current data

2. It doesn’t maintain history

2. Type 2 Dimension:-

1. It keeps complete historical data in the target

2. For each update in OLTP, it will insert a new record in the target

3. Type 3 Dimension:-

1. It Captures current & previous information(Partial historical data)
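In Informatica these dimensions are built with a lookup on the target plus an update strategy T/R, as covered later; purely as a SQL-equivalent sketch of the Type 1 behaviour (update else insert), with illustrative table and sequence names:

MERGE INTO DIM_EMPLOYEE t
USING STG_EMPLOYEE s
ON (t.EMPNO = s.EMPNO)
WHEN MATCHED THEN UPDATE SET t.ENAME = s.ENAME, t.JOB = s.JOB
WHEN NOT MATCHED THEN INSERT (EMPLOYEE_KEY, EMPNO, ENAME, JOB)
VALUES (DIM_EMPLOYEE_SEQ.NEXTVAL, s.EMPNO, s.ENAME, s.JOB)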

Sequence generator Transformation

This is a passive T/R which generates unique numeric values used as keys

This T/R creates 2 default output ports (NEXTVAL and CURRVAL)

This T/R doesn’t allow you to create new ports and doesn’t let you edit the existing output ports

It doesn’t support input ports

Typically connect the NEXTVAL port to any downstream T/R or to the target

Define the following properties to work with the sequence generator T/R


Current value:-

1. Integration service uses the current value as the base to start generating sequence numbers

2. When the session completes integration service updates the original current value to the last value generated plus an increment by value

3. Last stored value = Last generated + as increment by value

4. The default current value is 1

ii. Increment by:

Defines the difference between any two consecutive sequence numbers; the default is 1

iii. End value:

It is the highest value or the maximum value that the integration service uses to generate sequence numbers

Session fails if the integration service reaches the maximum value

iv. Reset:-

If enable each session starts generating sequence number from the original current value

Otherwise, integration service starts generating sequence number from the last stored value for each session run

v. Start value:

1. The integration service uses the start value as a base to generate the sequence numbers if the cycle option is enable

2. The default start value is ‘0′ (zero)

vi. CYCLE:-

If enable the sequence generator returns to the start value when the end value is used

It is used to generate repeating sequence numbers for each cycle

Performance consideration:

It is best to configure the sequence generator transformation as close to the target as possible in a mapping otherwise a mapping will be carrying extra sequence number through the transformation process which will not be transformed

*A sequence generator is a row by row operation; it generates unique sequence numbers used as keys [surrogate key, sequence]

*It can also generate repeating sequence numbers

*Repeated sequence numbers:

Example: for year 2014 the month ID cycles 1 … 12 and then repeats — this is a cycle.

*It doesn’t allow new ports, and we can’t edit the existing 2 O/P ports (NEXTVAL, CURRVAL).

They behave like pseudo columns in Informatica: NEXTVAL ——->connect to any other T/R (or) to the target (it generates 1, 2, ——-).

CURRVAL on its own generates the same value (‘1’) for every row.


Performance consideration diagrams:-

(diagram) If the sequence generator is placed early in the mapping, the generated number is copied through 2 downstream transformations, which takes extra time; we want to reduce these 2 operations with the layout in the following diagram,

(diagram) where only 1 copy operation is performed, by placing the sequence generator close to the target.

*When we select cycle [x], the integration service uses the start value as the base (the default start value is ‘0’; it can be changed, e.g. replaced with 1). When the end value is reached, the integration service starts again from the start value onwards.

Example (round robin sequence, or cycle): start value — 1 (by default ‘0’), end value — 12, cycle — [x] —> generates 1, 2, …, 12, 1, 2, … repeatedly.

When cycle is not used, the end value can stay at its default, which is a very big number.


*When we do not select cycle []:

When we run the session, by default the current value is ‘1’ and increment by is ‘1’, so consecutive numbers differ by 1 (1, 2, 3, 4, …).

End value — 10 here (with 2 targets there are 2 end values; with 1 target, 1 end value).

Reset [x] —> every session run starts again from the original current value: 1, 2, …, 10, and the next session again 1, 2, …

Reset [ ] —> when the session completes, the last generated value plus the increment is stored as the new current value. If I have 18 records, the numbers 1, 2, …, 18 are generated and the stored current value becomes 18 + 1 = 19 —–> so the next time we run the session we get sequence numbers from 19 onwards (current value – 19, increment by – 1, end value –, reset – []).

That’s why, when we do not select cycle [], we may select reset [] so that every run starts from the original current value.

*Performance consideration for the filter T/R: place the filter T/R as close to the source (source qualifier) as possible.

*Performance consideration for the sequence generator T/R: place the sequence generator T/R as close to the target as possible.

Important for interviews:

Update override in the update strategy T/R

  • Lookup T/R
  • Sequence generator: number of caches and how it improves performance in the sequence generator T/R
  • Sequence generator T/R: the generated sequence numbers are used as keys

Worked example: current value —->19, increment by —–>1.

End value —–> the highest value (or last value) the integration service uses to generate sequence numbers; by default it is a very large number. When we restart the session the sequence numbers start from 19 onwards (19, 20, ……).

When the previous session completed, the sequence numbers generated were (1, 2, ……. 18), so the last stored value is 18 + 1 = 19.

Even if the end value is not mentioned, the current value is automatically stored as 18 (the no. of records) + 1 —>19.

With 4 target tables the end value —->4; with 3 target tables the end value —–>3.

In the router T/R —–>Next val = 1 picks records (1, 4, 7, …); see scenario 1 below.

Scenario 1:- (I have 14 records in the source)

There are 3 target tables with the names T1, T2 and T3. The first record passes to T1, the second record to T2, the third record to T3, the 4th record to T1, the 5th record to T2, the 6th record to T3, and so on.

  1. Create the transformation type Sequence Generator.
  2. Double click the sequence generator ----> select the Properties tab,

Transformation Attribute                            Value

Start value                                                       1

Increment by                                                    1

End value                                                         3

Current value                                                   1

Cycle                                                                [x] (selected)

Click apply ----> Click ok

  1. Create the T/R type router ——>From source qualifier copy all the ports to the router
  2. From sequence generator copy next val port to the Router T/R
  3. Double click the router T/R —–>select the groups tab,

Group Name                                      Group Filter condition

T1                                                                  Next val = 1

T2                                                                  next val = 2

T3                                                                  next val = 3


Click apply & Click ok

Scenario 2:

There are 2 target tables with the names T1 & T2. Every 4th record passes to T1 and every 8th record passes to T2.

  1. Use a Sequence Generator & Router T/R with the 2 targets.



Scenario 3:

There are 2 target tables with the names T1 & T2; alternate records pass to one target & the remaining records pass to the second target.
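A possible solution sketch for scenario 3, following the same sequence-generator-plus-router pattern as scenario 1 (the property values and group names below are illustrative, not taken from the notes):

Sequence Generator properties: Start value = 1, Increment by = 1, End value = 2, Current value = 1, Cycle = selected

Router T/R groups:

Group Name                          Group Filter condition

T1                                           NEXTVAL = 1

T2                                           NEXTVAL = 2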







Scenario 4:-

There are 2 targets with the names T1 and T2; generate two different sets of sequence numbers for each target.

Take one Sequence Generator T/R and do not change any of its properties (double click the sequence generator ----> Edit to verify the defaults).

*Scenario: two different sets of sequence numbers for two different targets.

A. This is the above scenario: when the NEXTVAL port of one sequence generator is connected to the two targets, the Integration Service generates two different sequence blocks, and the sequence blocks execute one after another.

Next scenario: a source contains n rows and there are two target tables with the names T1 and T2; generate the same set of sequence numbers in both targets.

Note:- Don't change the default properties of the sequence generator.
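One way to get the same set of sequence numbers in both targets (a suggested approach, not spelled out in the notes): route NEXTVAL through an Expression T/R and connect that single expression output port to both target instances, for example:

Sequence Generator (NEXTVAL) ----> Expression T/R (o_NextVal) ----> T1
                                                                ----> T2

Because both targets receive the value from the same expression port, they get identical sequence numbers, whereas connecting NEXTVAL directly to the two targets gives two different sequence blocks, as described above.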





I have a source with the name ORDERS which has null values in the ORDER_ID column. Replace the nulls with unique identification numbers.
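A minimal sketch of the replacement logic, written in the expression syntax already used in these notes (the port names ORDER_ID and O_ORDER_ID are assumptions): connect NEXTVAL from a Sequence Generator into an Expression T/R along with the source ports and add an output port

O_ORDER_ID = IIF(ISNULL(ORDER_ID), NEXTVAL, ORDER_ID)

then connect O_ORDER_ID to the target in place of ORDER_ID.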












Slowly changing dimensions (updating)

*This topic is based on dimension tables.

*In OLAP, the primary keys of dimension tables are called synthetic keys or surrogate keys (for example Customer_Key (PK)). Slowly changing dimensions typically change only over a period of 2 or 3 years.

*Whenever there is a change in the OLTP data, at that time we insert a new record in OLAP.

*Identify the natural keys in OLTP (for example EMPNO); based on the natural key we develop the SCD.

OLTP (source): (PK) EMPNO, ENAME, JOB ----> in the target we take Employee_Key, EMPNO, ...

*The Router T/R can capture rejected records.

*The Update Strategy T/R can also capture rejected records.

Type-1 ----> insert, update (it maintains only current data)

Type-2 ----> update as insert (insert, insert) ----> it maintains complete historical data (uses the Lookup SQL override)

Type-3 ----> insert as update (insert, update) ----> partial history

For Type-1, Type-2 and Type-3, the Lookup T/R is mandatory and the Update Strategy T/R is also mandatory.

Lookup on the target table: the lookup checks whether the source record already exists in the target table, not in the OLTP (source).
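Conceptually, the lookup on the target performs a check equivalent to the following SQL (the table and column names follow the Type-1 target used later in these notes):

SELECT EMPLOYEE_KEY
FROM   DIM_EMPLOYEE_TYPE1
WHERE  EMPNO = <EMPNO of the current source row>;

If nothing is returned (Employee_Key is null), the source record is new and must be inserted; if a row is returned, the record already exists and may only need an update.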


Source (OLTP):

Emp no                  E name                 Job
7312                      Ravi                      SE ----> SSE ----> TL (the job changes over time)

Type-1 target (insert, update; we do 2 operations and keep only the current value):

Emp-Key              Emp no                 E name                 Job
1                          7312                      Ravi                      TL (the job is overwritten from SE to SSE to TL)

Type-2 target (insert, insert; here we do only insert operations and keep the full history):

Emp-Key              Emp no                 E name                 Job
1                          7312                      Ravi                      SE
2                          7312                      Ravi                      SSE
3                          7312                      Ravi                      TL

Type-3 target (insert, update; partial history is kept in a previous-value column):

Emp-Key              Emp no                 E name                 Job         Prev-Job
1                          7312                      Ravi                      SE            --

i.e. after the changes the row becomes 1, 7312, Ravi, TL, SSE; we get partial history.

*Initially the target table is empty.

Source (compare) ----> target lookup returns null means there is no matching record in the target.

Source (compare) ----> target lookup returns not null means the target table already has the record.

*The source-to-target comparison is possible through the Lookup T/R.

Update strategy T/R:-

An active T/R that flags the source records for insert, update, delete or reject.

Informatica PowerCenter supports 2 different methods to build an update strategy:

Create an Update Strategy T/R in the mapping

Define the session property called "Treat Source Rows As"

1. Mapping - Update strategy:-

a. If we want the Integration Service to perform multiple, different database operations, then create an Update Strategy T/R in the mapping; this is known as data driven.

Data driven: it instructs the Integration Service to follow the instructions coded in the Update Strategy T/R in the mapping.

2. The Update Strategy T/R supports expressions using the following constants:

a. DD_INSERT ----> 0

b. DD_UPDATE ----> 1

c. DD_DELETE ----> 2

d. DD_REJECT ----> 3

The default is DD_INSERT.

3. The Integration Service prepares SQL statements for DD_INSERT, DD_UPDATE and DD_DELETE.

4. No SQL is prepared for DD_REJECT.

5. The Update Strategy T/R supports conditional expressions using IIF() and DECODE() (see the DECODE sketch after this list).

Ex: IIF(ISNULL(COMM), DD_REJECT, DD_INSERT), or equivalently IIF(ISNULL(COMM), 3, 0)

6. The Update Strategy T/R forwards the rejected records to the bad file directory,

C:\Informatica 9.5.0\server\Infa-Shared\bad files
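A minimal sketch of the DECODE() form mentioned in point 5, using the same COMM example (this exact expression is an illustration, not copied from the notes):

DECODE(TRUE,
       ISNULL(COMM), DD_REJECT,
       DD_INSERT)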

2. Session - update strategy:-

1. If we want the Integration Service to perform multiple, same database operations, then define the update strategy at the session level using the property called "Treat Source Rows As".

2. The default is Insert.

Key Points:-

An update strategy is a concept of DML operations.

The target table requires a primary key to update or delete a record from the target.

By default, the primary key column appears in the WHERE clause of the UPDATE and DELETE statements.

A user can override the WHERE clause from key attributes to non-key attributes; this is known as update override (not SQL override).

An Update Strategy T/R is used in "slowly changing dimensions (SCD)".

*Update strategy (a concept of DML)



The following T/R supports to capture rejected rows

Router T/R

Update strategy T/R

What is Data Driven?

What is Update override?

What is SQL override?

In Type 2 we have 2 scenarios:

Insert ----> scenario 1 for Type-2

Update (as insert) ----> scenario 2 for Type-2

The target table requires a PK to update (or) delete a record from the target.


Example target row: 101, suja, 50000

(Without a PK we cannot update this record.)

By default, because the target has a PK, we can update the record; the PK column is used in the WHERE clause of the UPDATE statement.

Ex: UPDATE Stg_Emp SET ......... WHERE ENO = 7369   (this works as-is because ENO is the PK column)

Can I override the PK column in the WHERE clause?

A. Yes. For example:

UPDATE Stg_Emp SET ......... WHERE ENAME = 'SMITH' AND JOB = 'MANAGER'

ENAME and JOB are non-key attributes; using them in the WHERE clause is only possible through the update override, i.e. by changing the WHERE clause from key attributes to non-key attributes.
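For reference, the target's Update Override property refers to the incoming target ports with the :TU qualifier. A minimal sketch (the column list is illustrative) of moving the WHERE clause from the key column to a non-key column:

Default UPDATE generated by the Integration Service (keyed on the PK):

UPDATE STG_EMP SET ENAME = :TU.ENAME, JOB = :TU.JOB, SAL = :TU.SAL WHERE ENO = :TU.ENO

Update override with a non-key attribute in the WHERE clause:

UPDATE STG_EMP SET JOB = :TU.JOB, SAL = :TU.SAL WHERE ENAME = :TU.ENAME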

Implementation of Type 1 Dimension:

(In Type 2 we additionally use a lookup with a SQL query, i.e. the Lookup SQL override, discussed later.)

The PK of the source is the natural key, because it uniquely identifies the records. Every developer first identifies the natural keys.

Business rules for Type 1, based on the Employee_Key returned by the lookup on the target:

ISNULL(Employee_Key)   ----> DD_INSERT (0); the Employee_Key for the new row comes from the sequence generator NEXTVAL

NOT ISNULL(Employee_Key) AND the source columns differ from the target columns   ----> DD_UPDATE (1)

Source data (OLTP): EMPNO, ENAME, JOB, SAL (e.g. 7369, smith, clerk, 7000)

Target data: Employee_Key, EMPNO, JOB, SAL (e.g. 1, 7369, clerk, 7000)

SAL1 is renamed to Trg_Sal; for renaming purposes we use an Expression T/R.

The mapping reads a record from the source and looks it up in the target. Sometimes the source record already exists, but we still have to check whether the data has changed. For example, if the source row is 7369, smith, MANAGER, 7000 and the target row is 7369, smith, SALES, 7000, then we use the update operation.

A Type-1 dimension does not maintain any history; it shows only the current data.

In a Type-1 dimension: to look up the records in the target we use a Lookup T/R; after the lookup we use an Expression T/R to rename the ports; then a Router T/R to write the business rules (insert group and update group); then, to insert and update the records, we use a Sequence Generator T/R (for the new keys) and an Update Strategy T/R, connected to the target instances.




*From a performance point of view, do not use an Update Strategy T/R just to insert records into the target; use the Update Strategy T/R only for the updates.

*To rename a port we use an Expression T/R.

The update condition compares source and target values, e.g. Src_Job != Trg_Job OR Src_Sal != Trg_Sal.

Date method ----> 2 inserts, 1 update ----> used in real time (Type 2).

Version method ----> 2 inserts (2 target instances, 2 inserts) (Type 2).

*In a flat (simple insert) mapping, the session property Treat Source Rows As [Insert] is the default.



Type-1 dimension procedure:

Source: Emp

Target: – Dim-Employee- Type 1(Emp-Key + all source columns)

Make sure that emp-key is a primary key
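A minimal DDL sketch for this target (the exact column list is an assumption based on the EMP source; adjust it to your spec document):

CREATE TABLE DIM_EMPLOYEE_TYPE1
(
    EMP_KEY   NUMBER(6)    PRIMARY KEY,   -- surrogate key fed by the sequence generator
    EMPNO     NUMBER(4),                  -- natural key from the source
    ENAME     VARCHAR2(10),
    JOB       VARCHAR2(9),
    MGR       NUMBER(4),
    HIREDATE  DATE,
    SAL       NUMBER(7,2),
    COMM      NUMBER(7,2),
    DEPTNO    NUMBER(2)
);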

Create a mapping

Drop the source definition

Drop the target definition with the 2 instances

Create the transformation type lookup that lookup on target table(Dim employee type1)

*Delete the unwanted lookup columns; it improves performance, because every lookup column is stored in the lookup cache and caching all the columns degrades performance.

*A column that never changes (e.g. DEPTNO) ----> delete it from the lookup; columns that must be compared (e.g. source job against target job) ----> do not delete.

*Based on the requirement we implement Type 1, 2 or 3.

*Do not delete the columns that our spec document says must be compared.

5. From source, Qualifier copies the port EMP no to the lookup

6. Double click the lookup T/R —–> select ports tab—–> delete unwanted lookup output ports

7. Uncheck the output port for a port name Emp no Which is copied from source Qualifier

Port Name               Data Type      Prec    Scale    I    O    L    R

Employee-Key          decimal            6          0             *     *

Emp no                     decimal            4          0             *     *

Job                            string                                         *     *

Sal                             decimal                                      *     *

Comm                      decimal                                       *     *

Emp no1                   decimal            4          0       *

8. Select the Condition tab,

Lookup table column          Operator          Transformation port

Emp no                                   =                     Emp no1

We write this condition to check whether the record already exists in the target or not.

9. Select properties tab,

Transformation Attribute   Value

Connection information    oracle- batch 7 am-db

Click apply and click ok

When the source port names are the same as the target port names, we get confused while developing the business logic. For renaming purposes only we use an Expression T/R; we do not write any business logic in it.

10. Create the transformation type expression

11. From source, qualifier copies all the ports to expression

12. From lookup, T/R copy the following ports to the expression T/R (Employee-key, job, sal, comm.)

13. Double click the expression T/R select the ports

Tab  ——>Rename the ports as follows






COMM1  —–> Trg – comm.

Click apply and click ok

14. Create the transformation type router

*From a performance point of view, use MD5 (message digest algorithm) when many columns have to be compared.

For example: Src-Job != Trg-Job OR Src-Sal != Trg-Sal ... repeated for 40 columns; with that many comparisons the performance degrades, so a single MD5 hash of the columns can be compared instead.

15.From expression T/R Copy all the ports to the router

16.Double Click the router T/R select the groups tab

Group Name          Group Filter condition

Insert                      ISNULL(Employee-Key)

Update                   NOT ISNULL(Employee-Key) AND
                              (Src-Job != Trg-Job OR
                               Src-Sal != Trg-Sal OR
                               Src-Comm != Trg-Comm)

Click apply and Click ok

*If I want design flexibility, I can use an Update Strategy T/R for the inserts as well.

*From a performance point of view, there is no need to use an Update Strategy T/R just to insert records.

*For the updates coming from the source we use the Emp-Key (the primary key in the target) through the Update Strategy T/R connected to the target.

Insert new records in the target:-

Create the transformation type sequence generator


From router T/R, from insert group ——->connect the ports

To the first target instance

From sequence generator connect next val port to first target instance

Update an existing record in the target:

Create the T/R type update strategy

From router T/R from update group ——–>Copy the following ports to update strategy ( Employee-key, src-job, src-sal, src-comm)

Double-click the update strategy —–> select the properties tab,

Transformation Attribute value

Update strategy expression        1

Click  apply & click ok

4. From update strategy T/R connect the ports to the second target instance

Note:-Src-comm —-> 0(Default: 0)

—->Incremental load —>Adding a new record after initial load






*After writing the UPDATE statement on the source (Scott schema) we must use a COMMIT statement; otherwise the change is not saved in the source.

*First the update is done in the source table; then, after running the session, we check the target table.
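For example, a quick way to exercise the Type-1 update path (the specific EMPNO and JOB values here are only illustrative):

-- run as Scott (source schema)
UPDATE EMP SET JOB = 'MANAGER' WHERE EMPNO = 7369;

COMMIT;

-- then run the session and check the row for EMPNO 7369 in the Type-1 dimension.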

Type-3 dimension procedure and how to develop the business rule

It maintains partial history.

*For the update we need a PK, i.e. Employee-Key (PK), in the target.

*If we have more target instances, performance degrades.


Source: EMPNO, ENAME, JOB, SAL (e.g. 7369, Ram, clerk, 3500; the salary later changes to 6000 and then to 9000)

Business rule:

ISNULL(Emp-Key)   ----> DD_INSERT (0); Emp-Key = NEXTVAL, Prev-Sal = NULL

NOT ISNULL(Emp-Key) AND Src-Sal != Trg-Sal   ----> DD_UPDATE (1); the new salary overwrites SAL and the old salary moves to Prev-Sal

Target: Emp-Key, EMPNO, ENAME, JOB, SAL, Prev-Sal

After the initial load:           1, 7369, Ram, clerk, 3500, --
After the first change:         1, 7369, Ram, clerk, 6000, 3500
After the second change:    1, 7369, Ram, clerk, 9000, 6000

----> We maintain uniqueness; for that purpose we maintain the Emp-Key.

----> For a Type-1, Type-2 or Type-3 design it is possible to work without an Emp-Key; in that case EMPNO is the PK in the target as well.


1.    A type – 3 dimensions captures partial history,

Source table: EMP                                 Target table: Dim – EMP – type 3


Make sure that EMP – key is a PK column.

1)    Create a mapping —->drop the source definition

2)    Drop the target definition with the 2 instances

3)    Create the T/R type lookup, that lookup on target table (Dim – Employee – type3)

4)    From source, qualifier copies the port EMPNO to lookup

5)    Double click the lookup T/R —>select the port tab

6)    For port name Empno1 uncheck the output ports

7)    Delete unwanted lookup output ports


8)    Select the condition tab,

Lookup table column                          operator                        transformation port

EMPNO                                                        =                                          EMPNO1

9)    select the properties tab,

Transformations attribute                                                     value

Connection information                                                        oracle – Batch 7 AM – DB

Click apply —->click ok

10)    Create the transformation type expression to rename the ports as follows

11)    From source, Qualifier copies the following ports to expression (EMPNO, ENAME, JOB, SAL)

12)    From lookup, T/R copy the following ports to expression (EMP – key, SAL)

—->EMP – key is required for update a record in the target & for writing the logic

13)    Double click the expression T/R select the ports tab —>Rename the ports as follows,

Sal —> Src – sal

Sal 1 —–> Trg – sal

Click apply —->click ok

14)    Create the transformation type router

15)    From expression copy all the ports to the router

16)    Double click the router –>select groups tab,

Group name                                                  Group filter condition

Insert                                                                 ISNULL (EMP – key)

Update                                                               NOT is null (EMP – key)


Src – sal! = Trg – sal

Clicks apply & click ok

Insert new records in the target:

1)    Create the T/R type sequence generator

2)    From sequence generator, T/R connects the next Val port to first target instance.

3)    From router T/R from insert group, connect the ports to first target instance, expect prev – sal (while insert a record prev – sal is null)

Update an existing record in the target:

1)    Create T/R type update strategy

2)    From router T/R from update group —->copy the following ports to update strategy,

EMP – key, Src – sal, Trg – sal

3)    Double-click the update strategy T/R —->select properties tab,

Transformations attribute                                          value

Update strategy expression                                           1 –>means DD – update

Clicks apply & click ok

4)    From update, strategy connects the ports to second target instance

EMP – key —-> EMP – key

Src – sal —–> Sal

Trg – sal —–>prev – sal
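The update branch above effectively issues an UPDATE like the following against the Type-3 target (shown here in the update override notation introduced earlier; it is an illustration of what the Integration Service generates, not copied from the notes):

UPDATE DIM_EMP_TYPE3
SET    SAL      = :TU.SAL,          -- new salary coming from the source (Src-sal)
       PREV_SAL = :TU.PREV_SAL      -- old target salary (Trg-sal) moved into the history column
WHERE  EMP_KEY  = :TU.EMP_KEY;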

----> First update the source (Scott schema); show user: Scott

UPDATE EMP SET SAL = SAL + 2000 WHERE EMPNO = 7566;

COMMIT;

----> Without a query we cannot implement Type-2; we develop the SQL query in the Lookup SQL Override property.

----> For Type-1, 2 and 3 dimensions we do the lookup on the target, because we need to check, for example, whether source record 7369 already exists in the target, i.e. lookup on target.


Scenario: -Display previous and previous – job

Implementation of type – 2 dimensions:

1)    A type – 2 dimension stores the full history in the target

2)    History can be maintained using 3 different methods,

I.    Using start date & end date —>90% use in real time

II.    Keep the primary version number in a separate column

III.    Mark the current dimension record with the flag (0, 1)

----> SYSDATE becomes the start date of the new record (like the start date of a new role in your career).

Date range ----> start date & end date.

----> Rather than the flag method, we use the other 2 methods in real time (start date / end date, version number).

----> Update as insert ----> when the source record is updated, a new record is inserted.

----> For the new record, by default start-date = SYSDATE. For the old record, end-date = SYSDATE - 1; "sysdate - 1" cannot be written directly in Informatica, so we use ADD_TO_DATE().

----> 2 inserts and 1 update: we have to use 3 target instances & only one Update Strategy T/R.

—>How to find out active record in type – 2

I.    End date is null

II.    Maximum surrogate key

III.    Maximum (or) last start – date value

—->Active record is current record (End date is null)



Historical record —>End date is not null.

Note: In a type – 2 dimension active records can be identified using following techniques,

a)    Write a select Query with where clause, where end – date is null

b)    Max surrogate – key for each employee

c)    Maximum of start – date (greatest)

In – active record: In a type – 2 dimension inactive records can be identified using following techniques,

1.    End – date is not null

2.    Surrogate – key value less than max surrogate – key value


Source: EMP                                                            Target: dim – EMP – type 2



1.    Create a mapping

2.    Drop the source definition

3.    Drop the target definition with the 3 instances

4.    Create the T/R type lookup, that lookup on target table (Dim – EMP – Type2)

5.    From source, qualifier copies the port EMPNO to the lookup

6.    Double click the lookup —>Select the ports tab

7.    Delete unwanted lookup output ports (ENAME, SAL, DEPTNO, START – DATE, END – DATE)

8.    For port – name EMPNO1 uncheck output port

Port name              Data type          Prec     Scale     I     O     L     R

EMP-KEY                 decimal              6                               *      *

EMPNO                    decimal              4                               *      *

JOB                          string                                                *      *

EMPNO1                  decimal              4                       *

9.    Select the condition tab,

Lookup table column                                   operator                                  transformation port

EMPNO                                                         =                                                  EMPNO1

10.    Select properties tab,

Transformations attribute                                value

Lookup SQL override:

1) SELECT OUT.EMPLOYEE_KEY AS EMPLOYEE_KEY, OUT.JOB AS JOB, OUT.EMPNO AS EMPNO
   FROM DIM_EMPLOYEE_TYPE2 OUT
   WHERE OUT.EMPLOYEE_KEY = (SELECT MAX(INN.EMPLOYEE_KEY)
                             FROM DIM_EMPLOYEE_TYPE2 INN
                             WHERE INN.EMPNO = OUT.EMPNO)

The active record can also be identified with either of the following queries:

2) SELECT * FROM <target table> WHERE END_DATE IS NULL

3) SELECT EMPNO, MAX(EMP_KEY) FROM DIM_EMPLOYEE_TYPE2 GROUP BY EMPNO

----> What is the difference between traditional sub-queries & correlated sub-queries?

Traditional (normal) sub-queries are independent (bottom up): the inner query executes first and the outer query runs on its result. In a correlated sub-query (top down), the inner query references a column of the outer query (as INN.EMPNO = OUT.EMPNO above), so the inner query is evaluated once for each row processed by the outer query.
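A small side-by-side illustration on the EMP table (purely illustrative):

-- traditional sub-query: the inner query runs once, independently of the outer query
SELECT * FROM EMP
WHERE  SAL > (SELECT AVG(SAL) FROM EMP);

-- correlated sub-query: the inner query refers to the outer row (E.DEPTNO) and runs once per outer row
SELECT * FROM EMP E
WHERE  SAL > (SELECT AVG(SAL) FROM EMP I WHERE I.DEPTNO = E.DEPTNO);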

Connection information                                                oracle – Batch 7 AM

Clicks apply & click ok

11.    Create the T/R type expression

12.    From source, Qualifier copies the required ports to expression (EMPNO, ENAME, JOB, SAL)

13.    From lookup, T/R copy the following ports to expression T/R, EMP – key, job

14.    Double click the expression T/R —->select the ports tab

15.    Rename the ports as follows

Job —–>Src – job

Job1 —->Trg – job

Clicks apply & click ok

16.    From expression T/R copy all the ports to router T/R

17.    Double click the router —->select the groups tab

Group name                                Group filter condition

Insert                                                 ISNULL (EMP – Key)

Update                                              NOT ISNULL (EMP – key)


Src – job! = Trg – job

This condition performs 2 database operations insert, update

Insert new records in target:

1.    Create the T/R type expression & sequence generator

2.    From router, T/R from insert group copy the required ports to expression (EMPNO, ENAME, Src – job, sal)

3.   Double click the expression T/R —->selects the ports tab,

Port name                    data type           O                    expression

Start – date                  date/time        *                      sys date

Clicks apply & click ok

4.    From expression connect the ports to 1st target instance (Don’t connect to end – date)

5.    From sequence generator connect the next Val port to 1st target instance

Update as insert:

1.    When source record is updated a new record will be inserted in the target

2.    Create the T/R type expression

3.    From router, T/R from update group copy the following ports to expression (EMPNO, ENAME, Src – job, sal)

4.    Double click the expression T/R —->select the ports tab

Port name                     O               expression

Start – date                  *                   sys date

5.    From expression T/R connect the ports to the 2nd target instance

6.    From sequence generator (existing sequence generator) connect the port next Val to 2nd target instance

Update as update (END – DATE):

1.    Create the T/R type expression & update strategy

2.    From router T/R, from update group —>copy the port EMP – Key expression T/R (only one port required i.e. EMP – Key)

3.    Double click the expression T/R —->Select the ports tab from toolbar on add a new port,

Port name                Data type                 I      O      Expression

EMP-Key                   decimal                    *       *

End-date                   date/time                         *      ADD_TO_DATE(SYSDATE, 'DD', -1)

Clicks apply & click ok

4.    From expression copy the ports to update strategy.

5.    Double-click the update strategy —->select properties tab,

Transformations attribute                                value

Update strategy expression                                  1

Clicks apply & click ok

6.    From update strategy T/R connect the ports to the 3rd target instance

----> Type 6 is a combination of Type 1, Type 2 and Type 3 (1 + 2 + 3 ----> "6").

----> Type 4 is a combination of Type 1 and Type 3 (1 + 3 ----> "4").

----> Flag method (0, 1); in the flag method the active record carries 0.

----> Version method (0, 1, 2, ...); the older version numbers are the inactive records.

----> Why do we use 2 insert operations in Type-2? Because besides inserting brand-new source records, an updated source record is also inserted as a new row (update as insert).

Implementation of type – 2 dimensions with version method:-

Source table: EMP (Oracle)

Target table: Dim – employee – type 2 – version (Employee – key, EMPNO, ENAME, JOB, SAL, Version type)
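The notes stop here for the version method, so as an illustration only (assuming the version column on the target above is called VERSION), the active record per employee could be identified with a query such as:

SELECT *
FROM   DIM_EMPLOYEE_TYPE2_VERSION T
WHERE  T.VERSION = (SELECT MAX(I.VERSION)
                    FROM   DIM_EMPLOYEE_TYPE2_VERSION I
                    WHERE  I.EMPNO = T.EMPNO);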

TASK and Type of tasks

1.    A task is an executable set of actions, commands, and functions

2.    Tasks are categorized into 2 types,

a)    Reusable task

b)    Non – reusable task

Reusable task:- A task which can be assigned to multiple workflows is known as a reusable task. Reusable tasks are created using the Task Developer tool and the Worklet Designer tool.


1.    Session

2.    Email

3.    Command

The above three are created with the Task Developer tool.

4.    Worklet ----> is created with the Worklet Designer tool.

Non – reusable task:-

I.    A task which is created specific to a work flow is known non – reusable task

II.    Non – reusable tasks are created using work flow designer.

III.    Non – reusable tasks can be converted into reusable task, but not every.


1)    Session

2)    Email

3)    Command

4)    Work let

5)    Decision

6)    Assignment

7)    Event wait

8)    Event raise

9)    Control

10)    Timer

Note:- Only Session, Command, Email and Worklet tasks can be converted from non-reusable into reusable; converting a reusable task back into a non-reusable task is not possible.


First we have to do daily load. Group of daily loads is called weekly (7 daily loads) or monthly (31 (or) 30 daily loads)


—>Transformations before what to do that are related to mapping based

—->But in task and type of tasks onward it is related to work flow based

—->How to convert non – reusable task into reusable tasks,


Link conditions

1.    A link condition controls the execution of the tasks during work flow run.

2.    Link conditions are defined using a predefined workflow variable called “status”

3.    Link conditions are created to control the tasks which are in sequence.

Sequential batch process:-



1.    Double click the link between S1 & S2

2.    From predefined tab —->double click on the variable status

3.    Type = SUCCEEDED

4.    Click on validate —->click ok

5.    Similarly, create the link conditions between other session tasks

—->without link conditions session run flow:


Here S1 completes (it may succeed or fail) and S2 starts; S2 completes (success or fail) and S3 starts; S3 completes (success or fail).

In the above diagram, I give wrong connections in s2.

Workflow monitor (without link conditions):

----> S1 succeeded

S2 failed ----> even though S2 failed,

S3 succeeded ----> S3 still executed and succeeded.

With link condition:


In this flow also I gave wrong connections in S2.

Workflow monitor (with link conditions):

Workflow ----> S1 succeeded

S2 failed ----> because S2 failed, S3 will not execute.

Concurrent (parallel) batch process:

Here too we give wrong connections in S2. Even then S1 runs, because in a concurrent batch S1, S2 and S3 all execute at the same time (at once).

—->All dimensions tables are independent in star schema.

—->Dimensions are dependent in snow flake schema.


Event raise task:-

1.    An event raise task triggers user defined events

2.    The user defined events must be declared in the workflow events tab.

3.    An event raise task is always used in conjunction with event wait task.

4.    An event raise task sends a signal to event wait task to start subsequent tasks in the workflow

5.    A user defined event is defined as completion of the tasks from the start task to event raise task

Event wait task:-

1.    It pauses the processing of the pipeline until a specified event occurs

2.    It waits for an event to occur to start subsequent tasks in the workflow

3.    There are 2 types of events can be defined,

I.    User defined event

II.    Pre – defined event (file watch event)


I.    If all dimension loads are success then load the data into fact table

II.    There are 3 sessions with the name s1, s2, s3 represents dimensional loading

III.    S4 session represents fact loading

IV.    Design a workflow that meets the above requirement specification

User defined event (all dimensions are dependent)

Pic12: –


I.    Open the client power center work flow manager

II.    From tools menu —>select work flow designer

III.    From work flow menu —>select create

IV.    Enter the work flow name W – star – load – user – event

V.    Select events tab —>from toolbar click on new

VI.    Enter the event name Dim – load – complete —>click ok

VII.    Drop the session’s s1, s2, s3 in sequence

VIII.    Create the task type event raise & event wait

IX.    Drop the session s4 beside the event wait task

X.    Create the links between the tasks as described above

XI.    Create the link conditions between s1, s2, s3 and event raise tasks

XII.    Double click on event raise task —>select properties tab

Attribute                                   value

User defined event                    Dim – load – complete

Clicks apply —>click ok

XIII.    Double click on event wait task —>select events tab —->select user defined —>click on browse events

—>select event Dim – load – complete

Clicks apply —>click ok

XIV.    Save the workflow


1.    There are 3 sessions with the name s1, s2, s3 that represents dimensional loading, loads are independent

2.    S4 is the session that represents fact loading (otherwise s4 represent fact)

3.    Design the workflow in such a way that if all dimension loads are success then load data into fact table

--> Procedure (all dimension loads are independent): for example, if we have 40 sessions, writing link conditions for all of them degrades the performance, so instead we use the "Treat the input links as AND" setting on S4.


1.    Double click s4 —->from general tab —->set treat the input links as,

* AND                     OR

Clicks apply —->click ok

2.    Save the workflow & start workflow

Note: If there are more link conditions workflow performance may degrades

Decision task:-

1.    The decision task has a predefined variable called $Decision_task_name.Condition that represents the result of the decision condition.

2.    The Integration Service evaluates the condition in the decision task and sets the predefined condition variable to true (1) or false (0).

3.    Use the decision task instead of multiple link conditions in a workflow. Instead of specifying multiple link conditions, use the predefined condition variable of the decision task to simplify the link conditions:

$Decision_task_name.Condition = TRUE

$Decision_task_name.Condition = FALSE

4.    The decision task functions like a Filter T/R.


1. Double clicks the decision task —>select the properties tab

Attribute                          value

Decision name              $s1. Status = succeeded AND $s2. Status = succeeded AND $s3. Status=succeeded

Clicks apply & click ok

2.    Double click on the link between decision & s4

3.    Create the following link condition,

$Decision. Condition = true

4.    Double click on the between decision & Email

5.    Create the following link condition,

$Decision. Condition = false

6.    Double click the email task —>select properties tab,

Attribute                                   value

Email user name                       RKR – DSR @

Email subject                             dimension load failed

Assignment task:-

1.    It assigns the value or expression to a user defined workflow variable

2.    A user-defined workflow variable must be declared in the workflow variables tab


There are 2 sessions with the name s1 & s2

S1 represents a daily load

S2 represents a weekly load

If the daily load 7 times success then perform weekly load



1.    From workflows menu —->select create enter the workflow name w – daily – weekly – load

2.    Select the variables tab —->from tool bar click on add a new variable,


Name                              data type                         persistent —> (save the value)

$$X                                   integer                                  [*]

Enter the default value: [    0     ]

Click ok

3.    Drop the sessions S1, S2

4.    Create the task type assignment and decision

5.    Make the links between the tasks

6.    Create a link condition between s1 & assignment task ($s1. Status = succeeded)

7.    Double click on assignment task —->select expression tab

8.    From a toolbar click on add a new expression,

User defined variables                      operator                             expression

$$X                                                  =                                          $$X+1

Clicks apply —>click ok

9.    Double click on the decision task —->select the properties tab,

Attribute                                          value

Decision name                             MOD ($$X, 7) = 0

10.    Double click the link between decision & s2 —-> from predefined tab —>double click on variable condition

$Decision. Condition = true

—->What task function act as filter T/R?

(A) Decision task

—>What are the types of events?

1.    Predefined event

2.    User defined event

—->What are the types of variables?

1.    Predefined variables

2.    User defined variables —->$$X, (X)—->as ur wish what name we want

—>How to write condition in decision task?

(A) Based on required write the condition

$Decision.Condition = TRUE (or FALSE), where "Decision" is the task name and "Condition" is the predefined variable; or a condition such as MOD($$X, 7) = 0.

Whatever name you gave the task when you created it, use that name here. Another example: if the task name is abc, the variable is $abc.Condition.

Scenario 1: –

There are 3 sessions with the name s1, s2, & s3

If s1 success then executes s2 else execute s3

Scenario 2: –

There are 4 sessions with the name s1, s2, s3 & s4

S1 and s2 are in parallel

S1 and s3 are in sequence

S2 and s4 are in sequence

If s3 succeeded then execute s4


Scenario 3: –

There are 3 sessions with the name s1, s2, & s3 which are in sequence

If s1 is succeeded execute s3 and by pass s2


Scenario 4:-

There are 2 sessions with the name s1 & s2

If s1 success 10 times then execute s2

Scenario 1 solution:

Pic17: –

Scenario 3 solution:



1.    It will send email notifications during run

2.    There are 2 types of email tasks,

I.    Reusable email task

II.    Non – reusable email task

3.    An email task is configured in a 2 different ways in a work flow

a)    Stand alone email task

b)    Post session email task

Stand alone email task:-

1)    It can be used anywhere in the workflow, defined with link conditions to notify the success or failure of prior tasks

2)    Visible in flow diagram

3)    Email variables can’t be defined with stand-alone email tasks

Post session email task: –

1)    An email task is defined within the context of session

2)    Email variables can be defined to provide run-time statistics that include,

No of records loaded

No of records rejected

Session status

Mapping name

Session name

Workflow name

% e ———>return session status

% g ——–>Attach session log

% l ——–>returns no of records loaded

% r ——>returns no of records rejected

% m —->returns mapping name

% n —->returns folder name

% s —>returns session name

% w —->returns workflow name

% v —->returns integration service name

% d —>returns repository name

Procedure: –

1)    From tools menu —>Select task developer

2)    From task menu —>select create

3)    Select the task tab —->Email —>Enter the name

4)    Select the task type email, enter the name on – success – email click on create and done

5)    Similarly, create an email task with the name on – failure – email

6)    Double click the session —>select the components tab,

Task                                 type                               value

On – success – email               reusable                        on – success – email              [   ] —>edit

On – failure – email                reusable                         on – fail – email

7)    Click on edit success email,

Attribute                               value

Email user                            RKR–

Email subject                           % e

Email text                                 % g

Clicks apply and click ok

8)    similarly, edit failure email task,

Attribute                        value

Email user            

Email subject                            % e

Email text                                   % g

Clicks apply and click ok

9)    Create & start workflow

Email user:

A)    RKR– (dynamic): if this employee moves to another company, this user is replaced with a new user name at run time.

b)    A static email user means we cannot change that user name even when that person moves to another company. Email user: RK– (static) ----> static mail: it is set up through session parameters.

c)    The email task is mainly required for the production environment.

d)    How to create the static mail: Start ----> Run ----> Notepad

[    $   ]

Timer task

I.    It waits for a specified period of time to execute (to start) next task

II.    The timer task is operated in a two different modes,

a)    Absolute mode

b)    Relative mode

Absolute mode: –

a)    Absolute time: the user can specify the date and time to start the timer from

b)    Date time variable: the user can provide a variable that lets the time task know when to start the timer from

Relative mode: – (in real time we use this)

Relative time: -The timer task can start the timer from the start timer of the timer task, the start time of the workflow or work let, or from the start time of the parent workflow


—>Timer task is mainly used for scheduling workflow

Workflow 11AM —>timer (11:05 AM) —>absolute mode

Anytime workflow start after 5 mins Timer (5 mins) will start —->Relative mode

Events wait task – predefined events (file watch event)

Event waits task (cont’d):-

For a predefined event, the task waits for the physical presence of a file in a directory local to the Integration Service process machine. This file is known as an indicator file. If the file does not exist, the event wait task will not complete. When the file is found, the event wait task completes and the workflow proceeds to the subsequent tasks. The event wait task can optionally delete the indicator file once it is detected, or the file can be deleted by a subsequent process.

Loading specifications: – (see diagram in next page same side)

1.    A workflow is schedule to run every day at 11 AM. Data provider provides the source data files through FTP before the scheduled time

2.    When the source data files are available at FTP location he will send indicator file that indicates the source data files are readily available at a given location

3.    If the indicator file doesn’t appear with in the scheduled time wait for 5 hours from the start time of the workflow. Then send an email and stop the workflow

4.    If the file appears within 5hrs then start the session that loads the data

Control task

1.    Control tasks are used to alter the normal processing of a workflow. They can stop, abort or fail any workflow

—–>Module lead architect (or) ETL architect will design the workflows.

—–>stop: Orderly shutdown (orderly stop session)

Abort: immediate shutdown

—->What is the dif between stop & abort?

Stop: – Orderly shut down

Stop = stop executing after orderly shutdown

Abort: -Stop executing immediately.

Fail: – It gives status fail

—->As a developer our main role is,

Develop mapping and sessions.

----> We cannot read the OLTP system directly; it cannot give accurate data while it is live. In real time, we therefore take the data in the form of files.

----> OLTP data is live data, so we cannot access it directly. For example, IRCTC (11 AM to 5 PM): customers and employees access and use the system at the same time, and while the OLTP data is live we cannot get an accurate, consistent extract.

IRCTC down time is 11.30 PM to 12.30 AM. Down time means the OLTP system is taken offline and backed up.

Why can't we use the live OLTP? The down time is only 1 hour, but extraction, transformation & loading take longer than that.

That is why we cannot use live OLTP data for extraction, transformation & loading.

----> File watch event: the event wait task watches for the file and for how long it takes to arrive.

----> To see how many times our workflow has executed: on the workflow (W - S - M - event), right click on the workflow name ----> select View Persistent Values.

Loading specification diagram: if the indicator file does not arrive within the 5-hour window (the loading-specific time), the production people run the sessions manually.

----> The event wait task's predefined event works through an indicator file. The indicator file is also called a touch file or a '0' byte file.

When the indicator file exists, the developer understands that the data is available.

Once the file is found, the indicator file is erased automatically (when the delete option is selected).

----> What is the difference between "start task" and "start workflow from task"?

(A) Start task: the Integration Service executes only that session or task.

Start workflow from task: the Integration Service starts the execution of the workflow from that task onwards.

Drawback of the earlier design: even when all the sessions have succeeded, the workflow still waits for the timer task to complete. To overcome this problem we use the following diagram:


1.    Double click on event wait task —->select the events tab —>select predefined —->enter the name of the file to watch.

*predefined: [c: /JPMC/test. Txt]

2.    Select properties tab —->select delete file watch file click apply & click ok

3.    Double click on the timer task —>select the timer tab —>select relative time —>set start after as ur wish 2 mins

—->select from the start time of the top level workflow

*relative time

Clicks apply & click ok

4.    Double click on the control task —>select properties tab,

Attribute                                               value

Control options                               stop top – level workflow

Clicks apply & click ok

5.    Define the standalone email task

6.    Save the work flow

Without setting any time in timer task:


The workflow keeps on running:

Timer ----> Running (we did not set a time in this task, that's why it keeps on running)

Event wait ----> Running (the indicator file is not received, that's why it keeps on running)


Workflow: Running —->succeeded

Timer: running —-> reach 2 mins —->succeeded

If indicator file come

Event wait: running —->before 2 mins —->succeeded

S1: running

Drawback:-

Even if the sessions have succeeded, the workflow waits for the timer task to complete before everything shows as succeeded.

----> When no indicator file arrives:

If the event wait task does not receive the indicator file within the 2 minutes set in the timer task, then after 2 minutes the timer executes, the email task sends the message, and the control task stops the workflow.

1.    Workflow: Running

2.    Email wait: Running (It waits 2 mins for indicator file then timer execute)

3.    Timer: Running (2 mins do not receive any file)

4.    Email: Running —>After 2 mins —->Succeeded (It send message)

5.    Control: Running —>After 2 mins —->Succeeded

—->If the workflow contains work let then we take in timer task select

*relative mode

*start time of the parent workflow/work let


1.    A work let is defined as group of related tasks

2.    There are 2 types of the work let,

I.    Reusable work let

II.    Non – reusable work let

3.    Work let expands and executes the tasks inside the workflow.

4.    A workflow which contains the work let is known as parent workflow.

1.    Reusable work let: –

Created using work let designer tool

Can be assigned to multiple workflows

2.    Non – reusable work let: –

Created using workflow designer tool

Created specific to workflow

Business purpose: –

A workflow may contain dozens of tasks, whether they are concurrent or sequential. During workflow design they will naturally be organized into "groupings" of meaningfully related tasks, run in the appropriate operational order.

Note: –

The relationship between workflow & work let is like main program & sub program

Creation of: reusable work let

1.    From tools menu —->Select work let designer

2.    From work lets menu —->Select create

3.    Enter the work let name (WKT – Sequence – batch – process) click ok

4.    Drop the sessions in a sequence order

5.    Create the link conditions between the tasks

6.    from repository, menu clicks on save

7.    Create a workflow

8.    From left window drop, the work let beside the workflow

9.    Link the tasks —->from repository menu click on save

—->Before 7 version there is no work let. In 6 version, 5 version it is called as batch. A batch is a group of statements.

—–>Workflow —>start, work let —>start, are having same symbols

—->work let is dependent. We can’t run work let without workflow

—->Batch is a group of task

—-> Before 7.0 version we called work let as Batch.

—->Work lets are simplify the workflow design


a)    It executes operating system commands

b)    There are 2 types of command tasks

I.    Reusable command task

II.    Non – reusable command tasks

c)    Command task can be configured in a two different ways,

I.    Stand alone command task

II.    Pre & post session command task

d)    If the Integration Service is configured on a Windows OS then it executes DOS commands & batch files

e)    If the integration service configured on Unix OS then it executes Unix commands & shell scripts

Procedure: –

1)    From tools menu —->select task developer

2)    From task menu —->select create

3)    Select the task type command —->enter the name

(On – success – CMD)

4)    Click on create & done

5)    Double-click the command task —->select commands tab

6)    From toolbar —>click on add a new command

Name                                               command

Command – success                    copy c: \ result. Txt   c: \ JPMC

Clicks apply —> click ok

7)    From repository menu —>click on save


If s1 success executes s2 else execute s3. Implement the above scenario using post – session command task.

Pic27: –

Mapping debugger

1.    Through the session log we can find the error messages, but even when the mapping succeeds we do not know whether the correct data was loaded.

2.    By using the debugger we can find out which columns are loaded and whether the port values are loaded correctly; for every T/R we can see how the data flows into the different T/Rs.

3.    Running the debugger means running the session.

4.    Debugger options:

O use an existing session instance

O use an existing reusable session

O create a debug session instance ----> when no session is available to run the debugger (it just executes the mapping)

—> *Discard target data

Uncheck this option means no data loaded into target

—->In Debugger,

Instance —>Entire mapping where the data is loading

—>F10 —>Next instance

Note: -As running debugger we can’t close workflow

—>Internally we have many options in debugger

For example: EMPNO is the PK, but one record has a null value in it. To find that record we create break points (we use break points instead of stepping with F10 because we may have millions of records).

a)    Mapping debugger is the power center designer component which helps on monitoring the data flow at each T/R level.

b)    To start debugger —>go to mapping menu —>Debugger —>start debugger

c)    Now debugger wizard will start

Click on next,

Debug mapping



*use an existing session instance

O use an existing reusable session

O create a debug session instance

Use an existing session instance: –

Select this option when a valid session is available for the mapping. Selecting this option with out having a session will give an error.

Use an existing a reusable session: –

Select this option when reusable session is available for the mapping. Selecting this option without having a reusable session gives an error.

Create a debug session instance: –

a)    Select this option to debug a mapping without creating a session


Mapping debugger

I.    Debugger is a tool by using this we can identity records are loaded or not and correct data is loaded or not from one T/R to other T/R

II.    Session succeeded, but records are not loaded. In this situation, we have to use debugger tool

III.    What is the functionality of F10 in Informatica?

(A) F10 —->Next instance

IV.    Which T/R is having a cache?

(A) Lookup T/R

Note:- The file-watch wait is not available in just any task; it is available only in the event wait task.

—->F5 —->Start debugger

—>Debugger is used for test the records are loader or not, correct data is loader or not

—>Debugger is used only for to test valid mapping but not invalid mapping.

Working with post session command task


I.    Create the 2 folders with the name success and fail in ‘c’ drive

II.    Create a zero-byte file with the name result. Txt in ‘c’ drive

Step2: -Creation of reusable command tasks (Using task developer tool):

I.    Create to reusable command tasks with the name (on – success – command, on – failure – command). Double click on – success – command task —->select commands tab —>from tool bar click on add a new command

Name                                                        command

Command - success                             copy c:\result.txt   c:\success

----> If the Integration Service is configured on Windows we use DOS commands.

----> If the Integration Service is configured on Unix we use Unix commands.

Clicks apply ----> click ok

II.    Double click on on – failure – command —> select commands tab —>from toolbar click on add a new command,

Name                                                           command

Command – failure                                   copy c:\result. Txt c:\fail

Clicks apply —-> click ok

From repository, menu clicks on save

Step3: configuring post session command task:

1)    Double click the session s1 —->select components tab

Task                                                                     type                               value

Post – session success command               reusable                          on – success – command

Post – session failure                                    reusable                          on – failure – command

Clicks apply —->click ok



a)    Create the workflow —>Drop the session’s s1, s2 & s3

b)    Create two events wait tasks with the name, on – success – event – wait, on – failure – event – wait

c)    Create 2 control tasks with the name on – success – stop, on – failure – stop

d)    Create the links between the tasks,

Pic28: –

e)    Double click on – success – event – wait —>select events tab —->select predefined —>enter the name of the file to watch c:\success\result. Txt

f)    Select properties tab —->select delete file watch file click apply —>click ok

g)    Double click on on – failure – event – wait —–>Select events tab —->select predefined —->Enter the name as the file to watch c: \ fail \ result. Txt

h)    Select properties tab —->Select delete file watch file click apply —->click ok

i)    Double click on control task with the name, on – success – stop —->select properties tab,

Attribute                                                 value

Control options                                      stop – top – level – workflow

j)    Double click on control task with the name, on – failure – stop —>select properties tab,

Attribute                                                      value

Control options                                             stop top – level workflow

Clicks apply & click ok

k)    Save & start the workflow

—->without control task,


Workflow:  running ———>manually we can stop the workflow not automatic. For this

S1: succeeded           purpose we use control task. It stops the session task or any task

On – success – event: succeeded         automatically

On – failure – event: running

Unconnected stored procedure

a)     It is neither connected to the source nor connected to the target

b)    It can receive the multiple input ports but always returns a single output port


Business problem: –

1.    If the target is having an index then loading performance degrades

2.    An index is created for each record, as a result loading performance (or) loading time increases


I.    Drop an index before loading the data into the target and recreate the index after loading the data into the target (pre & post activities)

Creation of target table with index: –

a)    Logon to oracle with the target database account,

Username: Batch 7 Am

Password: Target

Connect string: ORCL

b)    In the SQL prompt,

SQL > CREATE TABLE EMP_TRG AS SELECT * FROM SCOTT.EMP WHERE 1 = 2;

Table created

SQL > CREATE INDEX EMP_IDX ON EMP_TRG (EMPNO);

Index created

SQL > SELECT INDEX_NAME FROM USER_INDEXES WHERE TABLE_NAME = 'EMP_TRG';

INDEX_NAME
-----------
EMP_IDX

Creation of procedures in target database account: –

Procedure to drop index: –

CREATE OR REPLACE PROCEDURE EMP_DROP_INDEX
(
    V_INDEX_NAME IN VARCHAR2
)
AS
BEGIN
    -- note the single space after 'DROP INDEX' before the concatenated index name
    EXECUTE IMMEDIATE 'DROP INDEX ' || V_INDEX_NAME;
END;
/



Procedure to create an index:-

CREATE OR REPLACE PROCEDURE EMP_CREATE_INDEX
(
    V_TABLE_NAME IN VARCHAR2,
    V_COL_NAME   IN VARCHAR2,
    V_INDEX_NAME IN VARCHAR2
)
AS
BEGIN
    EXECUTE IMMEDIATE 'CREATE INDEX ' || V_INDEX_NAME || ' ON '
                      || V_TABLE_NAME || ' (' || V_COL_NAME || ')';
END;
/
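A quick way to verify the two procedures from the SQL prompt before wiring them into the mapping (illustrative only):

SQL > EXEC EMP_DROP_INDEX('EMP_IDX');

SQL > EXEC EMP_CREATE_INDEX('EMP_TRG', 'EMPNO', 'EMP_IDX');

SQL > SELECT INDEX_NAME FROM USER_INDEXES WHERE TABLE_NAME = 'EMP_TRG';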



Designing mapping: –

1.    Source table: EMP                                         target table: EMP – Trg

2.    Create a mapping M – drop – create – index

3.    Drop the source & target definitions to the mapping designers workspace

4.    From source, Qualifier connects the ports to the target

5.    From transformation menu —->select create select T/R procedure —->enter the name (SP – drop – index) —->click on create

6.    Connect to the target database with the following details,

ODBC data source: Batch – ODBC – Oracle

User name: Batch 7 am

Password: target

Click on connect

7.    Select the procedure with the name EMP – drop – index clicks ok —>click on done

8.    Similarly, import the stored procedure with the name EMP – Create – index

9.    Double click the stored procedure T/R with the name SP – Drop – index –>selects the properties tab,

Transformations attribute                                                            value

Stored procedure type                                                                 Target Pre Load

Call text                                                                                          EMP_DROP_INDEX("EMP_IDX")

Connection information                                                               oracle – batch 7 AM – DB

Clicks apply —->click ok

10.    Double click the stored procedure T/R with the name SP – create – index —->select properties tab,

Transformations attribute                                                              value

Stored procedure type                                                                     Target Post Load

Call text                                                                                              EMP_CREATE_INDEX("EMP_TRG", "EMPNO", "EMP_IDX")

Connection information                                                                  oracle - Batch 7 AM - DB

Clicks apply —-> click ok

11.    from repository menu —->click on save

—-> (Run)

Pic30: –

If the target table has an index, the loading time increases & performance degrades. That is why, before loading the data, we drop the index on the target table; this process is called target pre load.

After loading the data into the target table, we recreate the index on the target table; this process is called target post load.

Unconnected lookup transformation

1)    It is not part of mapping dataflow

2)    It is neither connected to the source nor connected to the target

3)    It can receive the multiple input ports but always returns a single output port. That should be designated as return port (R)

4)    If the return port is not checked the mapping will be valid but the session created for that mapping will fail at run time

5)    Use unconnected lookup when the lookup is not needed for every row

6)    The lookup data is called at the point in the mapping that needs it

7)    Lookup function can be set with in any transformation that supports to write expressions

8)    Use the lookup function within a conditional statement. The condition is evaluated for each row but the lookup function is only called if the condition evaluates to TRUE.

9)    The unconnected lookup T/R is called using following key expression,

: Lkp. Lookup name

Note: – unconnected lookup is also known as conditional lookup

Business purpose: –

A source table or file may have a percentage of records with incomplete data. The holes in the data can be filled by performing a look up to another table or tables. As only a percentage of the rows are affected it is better to perform the lookup on only those rows that need it and not the entire data set

Differences between connected & unconnected lookup: –

Connected lookup                                                              Unconnected lookup

1) Part of the mapping data flow                                       1) Separate from the mapping data flow

2) Returns multiple values (by linking                               2) Returns one value, by checking the Return (R)
   output ports to another transformation)                            port option for the output port that provides
                                                                                             the return value

3) Executed for every record passing                                 3) Only executed when the lookup function is
   through the transformation                                                 called

4) More visible: shows where the lookup                           4) Less visible: the lookup is called from an
   values are used                                                                    expression within another transformation

5) Default values are used                                                   5) Default values are ignored

Note: – An unconnected lookup returns NULL by default. In a connected lookup we can specify a default value (for example 'CEO' for the JOB column), and the connected lookup accepts that default value in the JOB column.

Connected lookup: –

Here the integration service retrieves all records and then processes them.

Pic31: –

Suppose we want to load only the records having nulls; with a connected lookup, however, all records are processed and loaded into the target. This takes a lot of time and degrades performance. For this purpose we use an unconnected lookup, since the connected lookup has several drawbacks in this situation.

Unconnected lookup: –

Pic32: –

--->An unconnected lookup returns only one output port; for this we check the Return port (R) option. The port designated this way is called the return port.

--->Here we give the lookup a name (for example RKR); it is then called as a function, RKR().

--->In the lookup SQL override property we can write multiple conditions at a time.

--->Why is the stored procedure not connected in the mapping?

(A) Target pre load and target post load stored procedures are not part of the mapping data flow, so they are not connected.

If the target table has no index, we do not need to perform these two operations.



Procedure for conditional Lookup: –

Source table: EMP                                                          Target table: Stg – EMP – conditional – LKP


Lookup table: EMP

1)    Create a mapping with the name M – conditional – LKP

2)    Drop the source and target definitions

3)    From transformation menu —>select create —>select the transformation type lookup —-> enter the name (LKP – EMP)

Click on create.

4)    Select the source —->Select the table EMP —->Click ok and click on done

5)    Double click the lookup —->select the ports tab

6)    Delete the following ports (ENAME, JOB, Mgr, hire date, comm, dept no)

7)    From tool bar —->click on add a new port

Port name        Data type      Prec      Scale      I      O      L      R
Empno            decimal        4         0                 *      *
Sal              decimal        7         2                 *      *      *
In – Empno       decimal        4         0          *

8)    Select the condition tab,

Lookup table column                    operator                      transformation port

Empno                                                  =                                     In – Empno

9)    Select the properties tab,

Transformation attribute                Value
Connection information                  Oracle – Scott – DB

Click apply ----> click OK

10)    Create the T/R expression —->From source Qualifier copy the required ports to expression (Empno, Ename, Job, sal, comm, dept no)

11)    Double click expression T/R —>select the ports tab from toolbar click on add a new port

Port name          Data type        Prec          Scale        I         O       V
O – Comm           decimal          7             2                      *

Expression for O – Comm:

IIF(ISNULL(comm), :LKP.LKP – EMP(EMPNO), comm)

Click apply & click OK

12)    From expression connect the ports to target


Cron ----> UNIX based (development & administration)

WLM (workload manager)

The above are third-party schedulers, regularly used in real-time environments.

--->Scheduling is the automation of running a workflow at a given date & time.

--->Payroll applications, HR applications and Informatica applications use third-party schedulers.

--->Informatica receives data from multiple, different applications (for example a payroll application and an HR application); that is why Informatica uses a third-party scheduler.

--->Unconnected lookup and conditional lookup are the same thing.

Because the lookup on the source is performed based on a condition, it is called a conditional lookup; this is the unconnected lookup. When the lookup is not needed for every record, we use an unconnected lookup.

Here the lookup name is treated as a lookup function, i.e. :LKP.LKP – EMP()

PIC33: –

Here the unconnected lookup is called for a source record only when the condition is satisfied.


Scheduling workflow

1.    A schedule is an automation of running the workflow at a given date & time

2.    There are 2 types of schedulers,

I.    Reusable scheduler

II.    Non reusable scheduler

Reusable scheduler: –

A reusable scheduler can be assigned to multiple workflows

Non reusable scheduler: –

A non reusable scheduler is created specific to the workflow

A non reusable scheduler can be converted into a reusable scheduler.

The following are the 3 rd party schedulers,

1.    Cron (Unix based scheduling process)

2.    Tivoli

3.    Control-M

4.    Auto sys

5.    Tidal

6.    WLM (work load Mgr)

Three of these are very frequently used.

--->In production, scheduling is done almost all (99%) of the time.

--->Earlier we ran the workflow manually; through scheduling, the workflow runs automatically. This is called auto running.


Creation of reusable scheduler: –

1.    Open the client power center workflow manager

2.    from left window activate the folder

3.    from workflows menu —->click on schedulers

4.    Click on new —>enter the name daily – scheduler

5.    Click on the Schedule tab ----> from Run options select

*Run on Integration Service initialization; from Schedule options ----> select Run every

*Run every    days    hours    minutes
              1       0        0

From Start options ----> select start date & time

From End options ----> select Forever (runs continuously, no end date)

Click apply & click OK

*Run on demand ----> the default (we run the workflow manually)

*Run continuously ----> streams the workflow continuously (real time, live, stream)

Assign – scheduler to workflow: –

1.    From workflows menu select Edit ----> select the Scheduler tab ----> select Reusable ----> click on Scheduler ----> select the scheduler ----> click OK

Click apply ----> click OK

From repository menu click on save

Mapping parameters: –

I.     A parameter represents a constant value that can be defined before the mapping run

II.    Mapping parameters are local to the mapping

III.    Mapping parameters can be used with source qualifier T/R to define SQL override (where clause condition)

IV.    Mapping parameters are created to standardize the business rules. That increases the flexibility in mapping development

V.    The mapping can be reused for various constants

VI.    Decrease the overhead in development

VII.    Mapping parameters can also be created to perform incremental extraction (SQL override)

VIII.    Mapping parameters are assigned their constant values using a parameter file, which is saved with the extension .txt or .prm (parameter)

Syntax: –

[folder_name.WF:workflow_name.ST:session_name]

$$Parameter = constant value

$$Parameter1 = constant value

$$Parameter2 = constant value

Ex: -Gold rates, home loans, tax






Incremental Extraction: –

1. The default date format in Informatica is MM/DD/YYYY

Overhead: – Having to modify the mapping logic repeatedly in order to reuse the same mapping is called overhead; with mapping parameters the same mapping can be reused by changing only the parameter file.
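A hedged sketch of how incremental extraction can be expressed as an SQL override in the Source Qualifier; the parameter name $$Last_Extract_Date and the HIREDATE column are assumptions used only for illustration, with the value supplied from the parameter file in the default MM/DD/YYYY format:

-- Source Qualifier SQL override (illustrative): pull only rows newer than the last run
SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.DEPTNO
FROM EMP
WHERE EMP.HIREDATE > TO_DATE('$$Last_Extract_Date', 'MM/DD/YYYY')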


1.    Source table: EMP; target table: Stg-Emp-Param (Empno, Ename, Job, Sal, Tax, Deptno)

2.    Create the mapping with the name m-param

3.    Drop the source & target definitions

4.    Create the T/R type expression

5.    from mapping menu ——> select parameters and variables

6.    From toolbar click on Add a new variable

Name         Type            Data type      Prec      Scale
$$Dno        Parameter       integer        10        0
$$Tax        Parameter       decimal        5         2

Click OK

7.    From Source Qualifier copy the required ports to the expression

8. Double click the S qualifier T/R —–>select the properties tab

Transformation attribute             Value
SQL Query                            SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.DEPTNO FROM EMP WHERE EMP.DEPTNO = $$Dno

Click apply ----> click OK

9. Double click the expression T/R ----> select the ports tab

Port Name          Data type        Prec     Scale     O       Expression
Tax                decimal          7        2         yes     SAL * $$Tax

Note: – With two tax parameters this could be IIF(SAL > 8000, SAL * $$Tax1, SAL * $$Tax2)

Click apply ——>Click ok ( here we create 2 parameters)

10. From Expression T/R Connect the ports to target

11. Click save


Creation of parameter file:-

Open the text editor notepad

[Batch 7 am.WF:W-S-M-Stg-Emp-Param.ST:S-M-Stg-Emp-Param]   (folder.workflow.session task)

$$Dno = 20

$$Tax = 0.17

Note: – In a parameter file we can declare any number of parameters

12. From file menu click on save

13. Select the save-as type "All files"

File name: "Test.prm"

Click on save

----> C:\Sales pavan\test.prm

14. Create a session—–>double click the session —-> select the properties tab,

Attribute            Value

Parameter filename            C:\ sales param\ test.prm

15. Select the mapping tab —–>define the reader & writer connection with the target load type normal

Click apply —–>Click ok

16. Save the session

17. Create & start the workflow

*Diff between variable port & mapping variables

* Diff between mapping variables & mapping parameters

* What are the ways to generate sequence numbers?

A. 1. Sequence generator T/R

2. Expression T/R (using variable port)

3. Mapping variables (regularly used in real time)


Mapping variables

1. A variable represents a value that can be changed during mapping run

2. A mapping variable is created with a name, type, data type, precision, scale & aggregation

3. After each successful completion of the session, the integration service saves the current variable value in the repository (the value is persistent, i.e. permanently saved)

4. The integration service reads the current variable value from the repository for each subsequent run & performs incremental operations.

5. Mapping variables are local to the mapping

6. Mapping variables can be used with source Qualifier to define SQL override filters

Business purpose:-

1. Mapping variables are created to perform incremental extraction

Differences between variable port & mapping variable

Variable port                                             Mapping variable

1. Local to the T/R                                             1.Local to the mapping

2. Values are non-persistent                       2.Values are persistent

3. Can’t be used with SQL override            3.Can be used with SQL override

*Mapping variables are used for incremental extraction

*With mapping variables there is no need to change the date manually; it is changed automatically

*With a mapping parameter you have to change the date & time yourself (in the parameter file)

Home Assignment:-

Read the help about incremental extraction using mapping variables


Generate sequence numbers using mapping variables

Source: Emp

Target Table:- Stg-Emp-Sequence-variable(Employee-key+ all source Columns)

1. Create a mapping

2. Drop the source & target definition

3. From mapping menu ----> select Parameters & Variables

4. From toolbar click on Add a new variable

Name        Type          Data type     Prec      Scale      Aggregation
$$CNT       Variable      integer       10        0          Count

(For other data types such as decimal, the available aggregation types are Max and Min; Count is used here.)

Is Expr Var: False by default (set to True to use these mapping variables in expressions)

Initial value [0]

*Which T/R supports SQL override?

A. Source Qualifier (the Lookup T/R also has a lookup SQL override property)


What is the difference between a variable port & mapping variables?

How to capture the rejected records?

Router T/R

Update strategy T/R [] Forward rejected rows

Note: A variable port's default value is '0'; there is no need to initialize the value

Note: Once the session completes, variable port values are automatically discarded

5. Create the T/R type expression —–>From source Qualifier copy all the ports to expression

6. Double click the expression T/R —–>select the ports tab from toolbar click on add a new port

Port Name           Data type      Prec      Scale      O      Expression
Employee key        integer        10        0          *      SETCOUNTVARIABLE($$CNT)

The variable functions available in expressions are:

SETCOUNTVARIABLE()

SETMAXVARIABLE()

SETMINVARIABLE()

SETVARIABLE()

Click apply ——>Click ok

What is diff between mapplet & reuseable T/R?

7. Connect the ports from expression to target

View current variable value:-

From Wok flow ——>Select the session —–>Right click

Click on view persistent values

Session Parameters

1. Session parameters represent values that can change between session runs, such as database connections or source and target files

Use user-defined session parameters in session or workflow properties and define their values in a parameter file. When you run the session, the integration service matches the parameters in the parameter file with the parameters defined in the session.

$DBConnection = relational connection

Built-in session parameters: –

Use built-in session parameters to get run-time information such as the folder name, service names or session run statistics (source table name, target table names, how many records were loaded or rejected, integration service start date & time)

Note: Session parameter is 2 types

1.    User defined session parameter

2.    Built in session parameter




1. Design a flat mapping

2. Create session

3. Double Click the session ——> Select the mapping tab

4. From left window select the source ( SQ-EMP)

5. On the right-hand side window, from the Connections section,

Click on the relational connection browser to open it

•    Use connection variable ----> enter the name

$DBConnection – Src

Click OK

6. Similarly, parameterize the writer connection

($DBConnection – Trg)

Click apply —–>Click ok

•    In this example (scenario) we use the same database (Oracle) with different users (Scott, sess) that have the same table name (EMP)

We extract the table metadata from one of them; the session parameters switch only the connections, not the table metadata itself

*Parameters are defined at the session level; that is why these parameters are called session parameters





[May be possible here we create two parameter files one for oracle database & another one for my SQL server]

How many ways to get the information (statistics) in run time?

1. Built-in session parameter (Integration service create predefined parameter in session level)

2. Post session email task (to provide run time Statistics)

User-defined session parameters:

The user defines the parameter values before running the session

Creation of parameter file:-

1. Open the notepad

Typing the following syntax,

[Batch 7 am.S-M-Stg-Emp-Session-Param]

$DBConnection – Src = Oracle-Scott-DB

$DBConnection – Trg = Oracle-Batch 7 am-DB

Save the file with the extension .txt (C:\sales parm\sessparm.txt)

2. Double Click the session —->Select the properties tab,

Attribute                               Value

Parameter file name                     C:\sales parm\sessparm.txt

Save the session

3. Create & Start workflow

Types of lookup caches

1. By default, integration service builds cache to process the lookup T/R

2. Look up T/R can be configured with the following types of the caches

i. Non-persistent cache (default)

ii. Static lookup cache (default)

iii. Persistent lookup cache

iv. Shared lookup cache

1. Non-persistent lookup cache:-

i. By default, the integration service erases the cache when the session completes; hence it is known as a non-persistent lookup cache

ii. For each session run, the integration service prepares a SELECT query which is submitted to the lookup database

2. Static lookup cache:-

i. By default the integration service builds a read-only cache; the cache can't be updated, hence it is known as a static lookup cache.

*Q. Can you convert a non-persistent cache into a persistent cache?

A. Yes (double click the lookup T/R ----> Properties tab ----> check Lookup cache persistent)

3. Persistent (reusable) lookup cache:-

i. Can be configured with lookup properties tab

ii. When we configure the lookup T/R with the lookup cache persistent property, the cached data is saved to an integration service local drive.

iii. At the end of the session, the cached data is saved in two files

a.    Index cache file (.idx)

b.    Data cache file (.dat)

iv. Cache files are located in the following server directory

[C:\informatica \ 9.5.0\ server \infa-shared\ cache]

v. For subsequent session runs, the integration service builds the cache from the local server hard drive (not from the database server)

vi. Session performance increases

Re-cache from lookup source:-

1. Rebuilds the cache from lookup source, erase historical cache

2. Use persistent lookup cache when the lookup source is static, which doesn’t change very frequently

Note: – The default static lookup cache is also a non-persistent cache (the cache is erased when the session completes)

4. Shared lookup cache:-

1. Persistent cache is shareable

2. Persistent cache can be configured with cache file name prefix property

Unnamed cache: –

The persistent cache is shareable within the mapping when you look up the same source with the same properties

Named cache: –

The persistent cache is shareable with other mappings when you look up the same source with the same properties

Implementation of persistent lookup cache:-

1. Source Table:-

Emp 1

Note: SQL> conn scott/tiger;

SQL> create table Emp1 as select * from Emp where deptno in (20,30);

Target table:

Stg-Emp-Dept-Lkp (Empno, Ename, Job, Sal, Deptno, Dname, Loc)

Lookup table:-


2. Create a mapping with the name m-persistent-lookup-cache

3. Drop the source & target table to the mapping designer workspace

4 .Create the T/R type lookup that lookup on source table Dept

5. From source, Qualifier copies the port dept no to the lookup

6. Double click the lookup T/R ----> select the ports tab; for the port named Deptno1, uncheck the output port

7. Select the condition tab,

Lookup table column        Operator        Transformation port

Dept no                                         =                Dept no 1

8. Select the properties tab,


Transformation Attribute             Value
Connection information               Oracle_Scott_DB (the lookup source, because the Dept table is available only in the source database)
Lookup cache persistent              [checked]
Cache file name prefix               Sales (any name of your choice)

Click apply  —–> Click ok

9. From lookup T/R connect Dname and Loc to the target definition

10. From Source Qualifier connect the remaining ports to the target definition (Empno, Ename, Job, Sal, Deptno)

Note: Locate the cache files in the following directory

[C:\ Informatica \9.5.0\server \infa- shared \cache]

*For the prefix "sales" we get the following cache files,

sales.idx0, sales.idx1 (index cache files) and sales.dat0, sales.dat1 (data cache files)

The cache holds the Dept data (Deptno, Dname, Loc).

[These files are controlled by a lock file]

sales.lok ----> lock file

Here the cache is created for Dept because the lookup is on Dept; that is the reason the integration service builds the cache.

Implementation of shared lookup cache:-

Source table:


Note: – SQL> conn scott/tiger

SQL> create table Emp1 as select * from Emp where deptno in (20,30);

Target table: –

(Empno, Ename, Job, Sal, Deptno, Dname, Loc)

Lookup table:-


1. Create a mapping with the name, M-Shared-lookup- cache

2. Drop the source & target definitions

3. Create the T/R type lookup that lookup on source table dept

4.  From the source, qualifier copies the port dept no to the lookup

5. Double click the lookup T/R ----> select the ports tab; for the port named Deptno1, uncheck the output port

6. Select the condition tab,


Lookup table column        Operator        Transformation port
Deptno                     =               Deptno1

7. Select the properties tab,

Transformation Attribute            Value
Connection information              Oracle-Scott-DB
Lookup cache persistent             [checked]
Cache file name prefix              sales

Click apply —–> Click ok

8. From lookup T/R connect d name, loc to the target definition

9. From source, qualifier connects the remaining ports to the target definition

10. Save the mapping

Note: – To verify whether the cache is shared or not, select the session S-M-Shared-Lookup-Cache ----> right click ----> click on Get Session Log

Debug (severity) ---------- Message (warning)

*A shared lookup cache uses the persistent lookup cache

*Only when the lookup source is static do we prefer a persistent lookup cache


Mapplets: –

1.    A mapplet is a reusable object created with business rules using a set of transformations

2.    A mapplet is created using mapplet designer tool in the designer client component

3.    There are 2 types of mapplets

i.    Active Mapplet

ii. Passive Mapplet

i. Active mapplet: – A mapplet which is created with at least one active transformation is known as an active mapplet

ii. Passive mapplet: – A mapplet which is created with all passive T/Rs is known as a passive mapplet


Mapplet Limitations:-

1. Keep the following instructions in mind while creating mapplets

a. If you want to use a stored procedure T/R, you should use a stored procedure T/R of type Normal

b. If you want to use a sequence generator T/R, you should use a reusable sequence generator T/R

c. The following objects can’t be used in the mapplet

1. XML Source Qualifier T/R

2. Normalizer T/R

3. Pre (or) Post stored procedure T/R s

4. Mapplet


1. A mapplet is defined with mapplet input and mapplet output T/Rs

2. Mapplet input is optional, but mapplet output is mandatorily required

Q. Can I design a mapping without a source?

A. Yes, we can design a mapping without a source through the concept of a mapplet


Creation of a mapplet: –

1. Open the client power center designer

2. From tools menu ----> select mapplet designer ----> from mapplets menu ----> select create

3. Enter the mapplet name MT-Sales-Employees-Tax-Calc

Click OK

4. From transformation menu ----> select create ----> select the T/R type Mapplet Input ----> enter the name M-Sales-Employees ----> click on create

5. Similarly create mapplet output T/R

6. Create the T/R type filter & expression

7. Double Click the mapplet input T/R  —–> select the ports

Tab —-> From toolbar click on add a new port


Empno

Ename

Sal

Comm

Deptno

Click apply ----> click OK

8. From Mapplet input copy the ports to the filter T/R

9. Double click the filter T/R   —->Select the ports tab change the data type, precision& scale for the required ports

10. Select the properties tab& create the filter condition

NOT ISNULL(COMM) AND DEPTNO = 30

Click apply ——-> Click ok

11. From Filter copy the ports to expression

12. Double click the expression T/R —-> select port tab

Port Name        Data type    Prec    Scale    I    O    V    Expression
V_TotalSal                                               *    SAL + COMM
Tax                                           *               IIF(V_TotalSal > 5000, V_TotalSal * 0.15, V_TotalSal * 0.11)

Click apply  —–>Click ok

13. From expression copy the ports to mapplet output


Using the mapplet in a mapping: –

1. Create the source & target definitions

2. Create a mapping, drop the source & target definitions

3. From the left window expand mapplet subfolder drag the mapplet, drop beside the source qualifier T/R

4. From source qualifier connects the ports to mapplet input, from mapplet output, connect the ports to target.


1. From mapplets menu —-> select create enter the name


2. From left window expand source subfolder drop the following source to the mapplet designer workspace EMP, Dept

3. Create the T/R types ----> Joiner & Expression [join condition: Deptno = Deptno1; Tax = Sal * 0.25]

4. Create the T/R type Mapplet output

5. From ExpressionT/R copy the ports to the mapplet output transformation

6. Save the mapplet



1. Create target definition (Emp no, Ename, Job , Sal, Dept no, Tax, D name , loc)

2. Create a mapping from left window expand mapplet

Subfolder —–>drag the mapplet, drop on mapping designers

3. Drop the target

4. From mapplet output connect the ports to target

Q. A customer wants to develop a business rule using a single T/R; is it possible using a mapplet?

A. No, it is not possible (a mapplet is created with more than one T/R).

For this, we use a reusable transformation.

Q. What is the difference between a mapplet & a reusable T/R?

A. Both are similar; the only difference is that in a mapplet we develop a business rule using more than one T/R, whereas in a reusable T/R we develop the business rule using a single T/R.

Active Mapplet:-

Passive T/R

Active T/R (at least one)

Passive T/R

Passive T/R

[We can have any number of passive T/Rs, but if even one active T/R is present it can change the number of records, so the mapplet is active]

*What is Mapplet?

*What are the types of Mapplets?

*Store procedures —–>Property tab —->Normal (select this one)

Target preload

Target post load

Source preload

Source post load



Reusable Transformations: –

1. A reusable T/R is a reusable object created with a business rule using a single T/R

2. It is created in two different ways

i. Converting non-reusable T/R into a reusable T/R

ii. Using transformation developer tool


Source qualifier T/R can’t be defined as reusable T/R

i. Converting non-reusable T/R into a reusable T/R:-

General procedure:-

1. In mapping select transformation

2. Double click the T/R which we want to convert to reusable

3. From transformation tab —–> select make reusable

[] make reusable

Click yes

Click apply —–>Click ok

4.    From Repository menu —–> Click on save

ii. Using transformation developer tool:-

1.    From tools menu  —>select transformation developer

2.    From transformation menu —–> select create —->

Select the T/R type—-> sequence generator  —-> enter

The name —–>click on create and done

3.    From repository menu —–>Click on save


1.    Standard sequence generator T/R is created using Mapping designer tool

2.    A Reusable sequence generator T/R is created using transformation developer tool

Mapping designer


1. Filter

2. Router {Reusable}

3. Sorter


1. When you drag a reusable T/R to the mapping designer workspace, it is created as an instance

2. The instance properties can be modified or altered (the logic can't be modified)

3. When you change the instance properties, the original object is not affected or impacted


Open transformation developer ——>develop reusable T/R object









In this example we change the reusable aggregator two times, but there is no effect on the original T/R because we drag only an instance; when we drag a T/R, it automatically comes in as an instance only, not the original T/R.

D. Mapping designer


1.    Reusable T/R

2.    Sorter T/R {reusable T/R (developed in transformation developer tool}

How many ways to develop reusable T/R?

1. Using Transformation developer tool

2. When we create a T/R it is non-reusable by default; we then convert it into a reusable T/R

*User defined functions

1. A user-defined function extends the functionality of power center built in functions

2. A user-defined function is a reusable object that can be used to define the business rules in multiple mappings

3. It is a reusable object created using power center designer client component

4. :UDF is the identifier used to call a user-defined function


1.    Open the client power center designer

2.    From left window select the subfolder user defined functions

3.    From tools menu —–>select user defined functions

——>Click on new

4.    Enter the name (for example Trim_Both; any name can be given)

Select type: Public

From toolbar click on add a new argument

Name        Data type     Prec
Arg1        String

Click on launch editor

5. Develop the following expression,

LTRIM(RTRIM(Arg1))

Click on validate

Click ok —–> Click ok

6.    From repository menu ----> click on save

Note: – Create an expression T/R

Port Name        Data type    Prec    Scale    I    O    V    Expression
Ename                                           *
O – Ename                                            *         :UDF.Trim_Both(Ename)

*Q. Informatica provides so many built-in functions; why create user-defined functions?

A. Informatica provides LTRIM() and RTRIM(), but there is no single built-in function that trims both sides at a time; that is the reason we develop a user-defined function, which is then called with

:UDF.Trim_Both(Ename) ----> the column we want to trim is passed to the user-defined function (Trim_Both is the name of the user-defined function)

If my spec document says to trim 25 columns of data type varchar2(30), then with only built-in functions we would have to write the nested trim expression in 25 output ports, which wastes time.

For this reason, we develop user-defined functions.

*User-defined functions are used in the expression T/R

Port Name        Data type    I    O    V    Expression
Ename            varchar      *
O – Ename                          *         :UDF.Trim_Both(Ename)
Job              varchar      *
O – Job                            *         :UDF.Trim_Both(Job)

For the 25 columns we have to create 25 output ports in this way.

*This user-defined function is a reusable object; we can use it in any mapping.


Constraint based load ordering (CBL): –

DEF: – It defines the load order into multiple targets which are related with a PK & FK relationship.

Business Purpose: –

Loading the data into snowflake dimensions which are related with a PK & FK relationship.

Snowflake dimensions:-





*Before doing CBL we have to join the different sources.

This target Stg-Emp-Dept is then taken as the source for CBL.





Load Order:-

Dept——> Emp


1. Design the mapping as described in the data flow diagram

2. Create a session

3. Double click the session ----> select the Config Object tab

Attribute                                   Value
Constraint based load ordering              [checked]

4. Select the mapping tab ----> define the reader & writer connections with the target load type Normal

5. Click apply ——> Click Ok

6. Save the session


Access the session logs and read the log messages

*The constraint-based load ordering option is available at the session level []

*This tells the integration service to load the data based on the PK & FK relationship, i.e. first load the table having the PK, then load the FK table
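As a hedged illustration only (the target table names and the exact DDL below are assumptions, not from these notes), the PK & FK relationship that drives constraint based load ordering could look like this:

CREATE TABLE DEPT_TRG (
  DEPTNO NUMBER(2) PRIMARY KEY,                     -- parent / master table
  DNAME  VARCHAR2(14),
  LOC    VARCHAR2(13)
);

CREATE TABLE EMP_TRG (
  EMPNO  NUMBER(4) PRIMARY KEY,                     -- child / detail table
  ENAME  VARCHAR2(10),
  SAL    NUMBER(7,2),
  DEPTNO NUMBER(2) REFERENCES DEPT_TRG (DEPTNO)     -- FK back to the parent
);

-- With constraint based load ordering enabled, the integration service loads
-- DEPT_TRG (the PK / parent table) before EMP_TRG (the FK / child table).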

PK & FK tables: –

•    Dept (fewer records) ----> primary table (or) parent table (or) master table

•    Emp (more records) ----> foreign key table (or) child table (or) detail table

•    In a CBL mapping we can't see the loading order in the designer; the loading process and any errors are seen in the session log file, where we get how the data was loaded

•    Load the data into target tables in 2 ways

1. Constraint-based load ordering (based on PK& FK)

2. Target load Plan (not based on PK & FK)

[Meaning: we plan how to load the data into the targets; here we load the data without relying on PK & FK, and we develop a mapping with 2 pipelines for the target load plan]

•    When we create the target table by renaming the source table, we select Generate & Execute SQL ----> a window opens in which we select

[*] Primary key

[*] Foreign key    {select both keys}

Target Load Plan:-

1.    It defines the order in which data is extracted from source qualifier T/R

Business Purpose:-




*In flat files there is no consistency of data; there may be duplicate data and null data. For this reason, before loading the data into the data warehouse target tables we keep the data in a staging area. This staging area is also called a temporary area because it stores data temporarily; it is also called a data parking area.

Truncate load: – This property (option) is available in,

Session ----> Mapping ----> Truncate target table option []

*The staging area may retain data for some time (for example 90 days or 2 months); that is why it is also called an ODS (Operational Data Store).

What are load types?

1. Initial load

2. Incremental load { further divided into 2 types}

i. Normal load

ii. Bulk load

Informatica load order:-

1. CBL

2. Target load plan

Load order in a star schema?

1.    Dimension tables are loaded first

2.    The fact table is loaded afterwards

Data flow diagram:-




Data flow diagram:





Single mapping 2 pipelines

Alternate solution for above 2 pipeline mapping

1.    Create 2 mapping with 2 session [m1][m2]

2.    Keep session in sequential order [s1][s2]



1.    Design the mapping as described in DFD

2.    Select mapping menu ——>Click on target load plan

3.    Change the load order using up & down arrows click ok

4.    Save the mapping


PMCMD (command utility): –

1. It is a command-based client program that communicates with the integration service to perform some of the tasks which can also be performed using the workflow manager client

2. Using PMCMD, we can perform the following tasks

i. Starting workflow

ii. Scheduling workflow

3. The PMCMD can be operated in two different modes

i. Interactive mode

ii. Command line mode

i.    Interactive Mode:-

You establish and maintain an active connection to the integration service. This lets you issue a series of commands.

PMCMD indicates the success or failure of a command with a return code. Return code '0' indicates that the command succeeded; any other return code indicates that the command failed (range 1 to 25).

Use the DOS or UNIX echo command to see the return code for the command:

In a DOS shell: echo %ERRORLEVEL%

In a UNIX Bourne or Korn shell: echo $?

In a UNIX C shell: echo $status

*Accessing PMCMD in interactive mode:-


Start — Run——> type Pmcmd —> Click OK

PMCMD Commands:-

1. connect: –

It connects to the integration service with the following syntax,

pmcmd> connect -sv service_name -d domain_name -u user_name -p password

Ex: pmcmd> connect -sv nipuna (the integration service name) -d Domain-nipuna -u Administrator

2. startworkflow: –

It starts a workflow on the integration service with the following syntax,

pmcmd> startworkflow -f folder_name workflow_name

Ex: pmcmd> startworkflow -f "Batch 7 am" W-PMCMD

3.    setfolder: –

Activates the folder from which workflows can be run

pmcmd> setfolder folder_name

Ex: setfolder "Batch 7 am"

Note: Now a workflow can be started with the following syntax,

pmcmd> startworkflow workflow_name

Ex: pmcmd> startworkflow W-PMCMD

4. unsetfolder: –

Deactivates the currently active folder so that workflows can be run from another folder

pmcmd> unsetfolder

Ex: pmcmd> unsetfolder

5. pingservice: –

Verifies whether the integration service is alive or not

Ex: pmcmd> pingservice

6. getserviceproperties: – Provides the configuration details of the integration service

pmcmd> getserviceproperties

7. disconnect: –

Disconnects PMCMD from the integration service

8. exit: –

Disconnects PMCMD from the integration service and closes the PMCMD client program

*PMREP UTILITY: – It is a command-based client program that connects to the repository service to perform repository administrative tasks such as backup, restore etc.

1. connect: – Connects to the repository service with the following syntax,

pmrep> connect -r repository_name -d domain_name -n user_name -x password

Ex: – connect -r nipuna-rep -d Domain-nipuna -n Administrator -x Administrator

2. createfolder: – It creates a new folder in the repository

createfolder -n folder_name [-s]     (-n ----> normal folder, -s ----> shared folder)

Ex: –

1.    createfolder -n "Batch 4 pm"

2.    createfolder -n Venkat -s

3. deletefolder: –

deletefolder -n folder_name

Ex: – deletefolder -n "Batch 4 pm"

4. backup: – It takes the repository backup in a .rep file format

pmrep> backup -o <output_file>

Before taking the backup, you should close all the folders in the repository service

Ex: – backup -o C:\backup\test.rep


5. restore: – Restores the metadata from a .rep backup file

restore -u domain_user_name -p domain_password -i <input_file> [-y]

-y indicates a versioned repository (restores with object versioning enabled)

Note: – The repository database should be empty before restoring metadata from the [.rep] backup file

Ex: – pmrep> restore -u Administrator -p Administrator -i C:\backup\venkat.rep -y

6. deleteobject: – Deletes repository objects such as mappings, sessions, workflows etc.

Syntax: pmrep> deleteobject -o <object_type> -f <folder_name> -n <object_name>

Ex: pmrep> deleteobject -o session -f "Batch 7 am" -n s5

This command is not supported when versioning is on for the repository (nipuna-rep)

Note: – This command supports only unversioned repositories

exit: – Disconnects pmrep from the repository service and closes the client program


Exporting repository objects:-

The repository objects such as mappings, mapplets, sessions, workflows, worklets —-etc…,can be exported into a metadata file format called.xml


1. Open the client power center repository manager

2. From left window expand folder (Batch 7 am)

3. Expand mapping subfolder select all the mappings

4. From repository, menu selects export objects

5. Select the file directory (C :\ Backup)

6. Enter the file name (Batch 7 am –mappings) and click on save

7. Click on close


Importing repository objects: –

The repository objects can be imported from the .xml metadata file

Step-1: Create new folder

1.    From the client power center repository manager select folders menu click on create enter the folder name testing environment click ok

Step-2: Importing Objects

1. From left window select the folder (testing Environment)

2. From repository, menu clicks on import objects

3. Click on browse to select XML file in wizard window

4. Select the file directory (C:/Backup)

5. Select the file name (Batch 7 am –mapping)

6. Click ok

7. Click on ADD All >> to select all objects to import

8. Click on next

9. Select the destination folder (testing Env) Click on next

10. Click on next  —->Click on next , click on import —-> Click on done

11. From left window select the folder(Testing Env)

Right Click —–> Click on refresh


Pushdown optimization: –

A session option that allows you to push transformation logic to the source or target DB.

When you configure pushdown optimization, the IS (Integration Service) tries to push each T/R to the DB.

When you run a session configured for pushdown optimization, the IS analyzes the mapping and transformations to determine the T/R logic it can push to the DB.

The integration service converts the transformation logic into SQL queries which are pushed to the DB server (see the hedged illustration after the three pushdown types below). A session is configured with one of the following types of pushdown:

1.    To Source

2.    To target

3.    Full

1. To Source: – The IS pushes as much transformation logic as possible to the source DB

2. To Target: – The IS pushes as much T/R logic as possible to the target DB

3. Full: – With full pushdown optimization, the IS pushes as much T/R logic as possible to both the source & target DB
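As a hedged illustration only (the table name STG_EMP and the exact SQL are assumptions; the actual SQL generated by the integration service will differ), a full pushdown of a simple filter (DEPTNO = 30) plus expression (TAX = SAL * 0.15) mapping could amount to one statement run inside the database:

-- Illustrative only: filter + expression logic pushed to the database as one statement
INSERT INTO STG_EMP (EMPNO, ENAME, SAL, TAX)
SELECT EMPNO, ENAME, SAL, SAL * 0.15
FROM EMP
WHERE DEPTNO = 30;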

Pushdown optimization Limitations:-

1.    Rank T/R cannot be pushed

2.    Transaction control T/R

3.    Sorted aggregation



1.    Design a mapping with filter, rank and expression T/R

2.    Create a session ----> double click the session ----> select the Properties tab,

Attribute                             Value
Pushdown optimization                 Full

3.    Select the mapping tab ----> set the reader & writer connections with target load type Normal

4.    Click apply —–> Click ok —–>save the session

5.    Create & start workflow

Pushdown optimization viewer:-

Double click the session  —–> Select the mapping tab

from left window  ——–> Select pushdown optimization

Object sharing:-

1. Reuse an existing object

2. Decrease the deployment time

3. Increase the consistency in the development.

4. Objects can be shared using copy and shortcut

Differences between copy and shortcut:-

COPY                                                   SHORTCUT

1.    Copies an object to another folder               1. Dynamic link to an object in a (shared) folder

2.    Changes to the original object don't reflect     2. Dynamically reflects changes to the original object

3.    Duplicates space                                 3. Preserves space

4.    Created from unshared folders                    4. Created from shared folders

Procedure for copy:-

Step 1: Create a new folder using repository manager client

Step 2: Open the client power center designer, active the destination folder where you want to copy the objects

Step 3 : – Select source analyzer work space (destination folder)

Step 4: – From the source folder, expand Sources, drag the object and drop it on the source analyzer workspace ----> click on Yes

From repository menu click on save

Creation of the shortcut:-

Step-1 :

 Title: – Creation of shared folder.

1. Open the client power center repository manager

2. From folder menu ——> Select create

3. Enter the folder name: Sales

4. From options select [] allow shortcut

5. Click ok

Step 2: –Creation of objects in shared folder

1.    Open the client power center designer

2.    Activate the shared folder (sales)

3.    Create a source definition with the name Emp using source analyzer tool

4.    From repository menu —–> Click on save

Step 3: – Creation of normal folder using repository manager client

[ New folder name (home)]

Step 4: -Creation of short cut

1.    From the client power center, designer activate the destination folder

2.    Select source analyzer workspace of destination folder (home)

3.    From the shared folder ----> drag the source definition (EMP) and drop it on the source analyzer workspace

—–> Click on yes

4.    From repository menu ——> Click on save

Setup Environmental variables:-

1. Right click on my computer —–> Click on properties

2. Select advanced tab

3. Click on environment variables

4. From user variable for administrator—–>

Select-click on new

Variable name: INFA _Home

Variable value: C:\ Informatica \9.5.0

Click ok

5.From system variables —-> select the ‘path’ —–>

Click on Edit —–> Type path in notepad,

C: \ informatica\ 9.5.0\ server\bin

Click ok and click ok

Session Recovery:-

When you stop a session, or an error causes the session to stop, identify the reason for the failure and start the session using one of the following methods,

a. Restart the session if integration service has not issued at least one commit

b. Run the session in recovery mode if the integration service has issued at least one commit

When you run the session in recovery mode the integration service starts processing the data from the last checkpoint (commit point)


  1. Double click the session—->select properties tab,

Attribute        Value

Recovery strategy        Resume from the last checkpoint

Click apply —–>Click ok

2. From Repository menu ——> Click on save


Version control: –

1. By using version control we maintain the history of the metadata objects

2. A versioned repository stores multiple versions of an object

3. Each version is a separate object with a unique number

4. You can perform the following change management tasks to create and manage multiple versions of objects in the repository

1. Check in ----> read-only mode

2. Check out ----> editable mode


1. You must save an object before you can check it in

2. When you check in an object, the repository creates a new version of the object & assigns it a version number

3. The repository increments the version number when you check in an object


1. To edit an object, you must check out the object

2. When you check out an object, the repository obtains a write-intent lock on the object

3. No other users can edit the object while you have it checked out


  1. Design a mapping with the filter T/R and create the following filter condition

MOD (Emp no, 2) =0

2. From repository menu —-> Click on save

3. From versioning menu —–> Click on check in

4. Specify the check in comments

5. To handle the change management tasks ----> from versioning menu ----> click on Check Out

6. Enter the check-out comments (e.g. "load employees with odd employee numbers") and click OK

7. In the filter T/R change the filter condition to

MOD (Emp no, 2) != 0

Click OK ----> click OK

8. From repository menu ----> click on save

9. From versioning menu ----> click on Check In

10. From versioning menu ----> click on View History

Dynamic Lookup cache:-

1. The cache updates or changes dynamically when lookup on target table

2. The Dynamic lookup T/R allows for the synchronization of the target lookup table image in the memory with its physical table in the database

3. The dynamic lookup T/R or dynamic lookup cache is operated in only connected made (Connected lookup)

4. Dynamic lookup cache support only equality conditions (=conditions)

New lookup Row                             Description

0                                              The integration service doesn’t update or insert the row in the cache

1                                              The integration service inserts the row into the cache

2                                              The integration service updates the row in the cache

Key Point:-

1. The lookup T/R "Associated Port" matches a lookup input port with the corresponding port in the lookup cache

2. "Ignore null inputs for updates" should be checked for ports where null data in the input stream may overwrite the corresponding field in the lookup cache

3. "Ignore in comparison" should be checked for any port that is not to be compared






The dynamic lookup cache stores an image of the target table when we run the session; each source record is compared with the cache, and only the required records are loaded into the target. When the session succeeds the cache is erased. When a new record is inserted (or an existing one updated) in the target, the same change is applied, with a flag value, in the dynamic cache.

*Ports we don't care about can be ignored in the comparison; here Job and Sal keep changing, so we do not ignore those ports.

*What is dynamic lookup cache (or) dynamic lookup T/R?

*Can u convert lookup into dynamic lookup cache?

A. Yes we can convert, select the property in connected lookup

T/R ——>Dynamic lookup cache []


  1. Source table: Emp                target table: – Stg-EMp-Dynamic-cache

(Emp-key +all source columns)

2. Make sure that Emp-key is primary key column

3. Create a mapping with the name m-Dynamic-cache

4. Drop the source definition

5. Drop the target definition with the two instances ( 1 instance for insert, another for update)

6. Create T/R type lookup that lookup on target table (Stg-Emp- Dynamic-cache)

7. From source, Qualifier copies all the ports to the lookup

8. Double click the lookup T/R select the port tab

9. Uncheck the output ports for the ports which are copied from source qualifier

10. Select the condition tab

 Lookup Table column        Operator        Transformation port

Emp no                          =                   Emp no1

11. Select the properties tab,


Transformation Attribute            Value

Connection information                Oracle-Batch 7 am –DB

Dynamic lookup cache                                   []

Insert Else update                                       []

12. Select the ports tab; for the port named Emp-Key select the data type integer

Port Name            Associated Port
Ename                Ename1
Job                  Job1
MGR                  MGR1
Hiredate             Hiredate1
Sal                  Sal1
Comm                 Comm1
Deptno               Deptno1

Click apply ——>Click ok

13. Create the T/R type router

14. From lookup T/R copy all the output ports to the router T/R

15. Double click the router T/R ----> select the Groups tab ----> from toolbar click on add a new group

Group name            Group filter condition
Insert                NewLookupRow = 1
Update                NewLookupRow = 2


Apply —–> OK


  1. From Router T/R, from insert group connect ports to first target instance

Update existing records:-

1. Create the T/R type update strategy ----> from the router T/R, from the Update group ----> copy all the ports to the update strategy except the NewLookupRow port

2. Double click the update strategy ----> select the properties tab,

Transformation Attribute                Value
Update strategy expression              1 (DD_UPDATE)

Click apply ----> click OK

3. From the update strategy connect the ports to the 2nd target instance

4. Save the mapping

*The dynamic lookup cache is built on the target table, i.e. the dynamic lookup is on the target table

*A dynamic lookup cache is not possible on the source table

*A dynamic lookup T/R (cache) is done through a connected lookup; it is not possible through an unconnected lookup

*The dynamic lookup cache supports only the '=' match; it does not support <, >, <=, >=

*After the session succeeds, update & insert records in the source to test it:

Open the source account (Scott)

Insert a new record into Scott & restart the session; update an existing record in Scott & restart the session

Check the result in Batch 7 am (the target database)

*A static lookup cache is a read-only cache; we can't do any inserts & updates on it

*Using the dynamic lookup T/R (cache) we created the sequence numbers


Test cases: –

1. Data Availability: –

Description: – Make sure that data is available for extraction from source

Test Procedure:-

1.    Connect to the source database with valid username & password.

2.    Run the select query on database

3.    Verify that the data is available for extraction
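A hedged sketch of such a check (EMP as the assumed source table):

-- Connect as the source user, then confirm rows exist for extraction
SELECT COUNT(*) FROM EMP;   -- a non-zero count means data is available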

2. Data load /insert:-

Description: – Make sure that the target table should be inserted with records

Test procedure:-

1. Make sure that there are no records in the target table.

2. Truncate or delete the records if the target table is having records

3. Run the mapping

3. Count test(verify data loss)

Description: – Count the no.of records in source & target

Test Procedure:-

1. Write a select query to count the no. of records in the source

2. Write a select query to count the no. of records in the target

3. The source count should tie up with the target count (and vice versa)
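A hedged sketch of the count test (EMP as the assumed source and STG_EMP as the assumed target):

SELECT COUNT(*) FROM EMP;       -- source count
SELECT COUNT(*) FROM STG_EMP;   -- target count; the two counts should match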

4. Minus query test (verify data loss)

Test procedure:-

1. Write a select query to verify data loss with minus set operator

2. If the query returns ‘0’ rows then that indicates there is no data loss

3. Other than ‘0’ indicates the data loss
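A hedged sketch of the minus query test (Oracle syntax; the tables EMP and STG_EMP and the column list are assumptions):

SELECT EMPNO, ENAME, JOB, SAL, DEPTNO FROM EMP
MINUS
SELECT EMPNO, ENAME, JOB, SAL, DEPTNO FROM STG_EMP;
-- 0 rows returned means no data loss; any rows returned indicate data loss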





5. Data load /update:-

Test Procedure

1. Make sure that there are some records in the target table

2. Update a source table record which has been already loaded in the target

3. Run the mapping

4. The target table should be updated with the given value
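A hedged sketch of this test (the table names, the EMPNO value and the new salary are illustrative assumptions):

-- Update a source row that has already been loaded, then re-run the mapping
UPDATE EMP SET SAL = SAL + 500 WHERE EMPNO = 7369;
COMMIT;
-- After the run, confirm the target picked up the new value
SELECT SAL FROM STG_EMP WHERE EMPNO = 7369;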


6. Data load / incremental insert: –

Test procedure: –

1. Add a new record with the new values, in addition, to already existing records in the source

2. Run the mapping

3. The target table should be inserted with only new record
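A hedged sketch of this test (the inserted values are illustrative assumptions):

-- Add one new source row, then re-run the mapping; only this row should reach the target
INSERT INTO EMP (EMPNO, ENAME, JOB, SAL, DEPTNO)
VALUES (9999, 'TESTER', 'CLERK', 1000, 10);
COMMIT;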


7. Data compare test: –

Test procedure: –

1. Add a new record with the new values, in addition, to already existing records in the source

2. Run the mapping

3. Compare the target column values that should be the same as that of source column values


8. Column mapping test: –

Test procedure: –

1. Perform the manual check to verify source column mapping with target columns

2. Make sure that every source column having a right linkage with the corresponding target column


9. Sequence number generation test: –

Test description: –

1. Ensure that sequence number are generating for every source row and that should be unique

2. Verify the sequence numbers at initial load and incremental load

Test procedure:-

1. Make sure that the target table is empty

2. Run the mapping

3. Verify the sequence

4. Add a new record with the new values, in addition, to already existing records in the source

5. Run the mapping

6. Verify the sequence number


10. Duplicate test: –

Test Procedure: –

1. Write a select query to verify the duplicates

2. If the query returns result ‘0’ rows, that indicates that no duplicate rows

3. Other than ‘0’ indicates there are some duplicates rows

Query Pattern:-

Verify duplicates (this query does not delete duplicates):

SELECT EMPNO, ENAME, SAL, COUNT(*)
FROM EMP
GROUP BY EMPNO, ENAME, SAL
HAVING COUNT(*) > 1;
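As a hedged follow-up sketch (not part of the original query pattern), one common Oracle way to actually delete the duplicates that the query above only reports, keeping one row per combination:

DELETE FROM EMP
WHERE ROWID NOT IN (SELECT MIN(ROWID)
                    FROM EMP
                    GROUP BY EMPNO, ENAME, SAL);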


Identify target bottleneck:-

A target bottleneck can be identified by configuring the session that writes the data to a flat file target


1. Drop the index before loading the data into the target

2. Increase commit interval

3. Use bulk loading




(If the original session takes about 10 minutes and the flat-file test session also takes about 8-10 minutes, the problem is not in the target; the problem is in the source.)


Identify source bottleneck: –

Test procedure: –

1. In a test mapping, remove all the T/Rs

2. If the time taken to execute the test mapping is the same as the original mapping, then there is a source bottleneck


1. Create index at the source

2. Reduce the volume of data sent through the network (N/W) during extraction by using filters

Test Mapping:-





Identify mapping bottleneck: –

1. Keep a filter T/R just before the target definition and set the filter condition to FALSE, so that no data is loaded into the target

2. If the time it takes to execute the new session is the same as the original session, then there is a mapping bottleneck

Test mapping:-



Development (original session): 10 min

Testing (flat file target): 4-5 min

(Problem is in the original target table)

Development (original session): 10 min

Testing (flat file target): 10 min

(Problem is in the source table)



Partitioning: –

1. Partition points mark the boundaries between threads in a pipeline. The integration service redistributes rows of data at partition points. You can add partition points to increase the number of transformation threads and increase session performance.

2. When you configure a session to read a source database, the integration service creates a separate connection and SQL query to the source database for each partition

3. When you configure a session to load data to a relational target, the integration service creates a separate connection to the target database for each partition at the target instance

Types of partitions:-

1. Round Robin partition

2. Hash key partition

3. Range partition

4. Pass through partition (by default)

5. Database partition

1. Round Robin Partition:-

i. Distribute the rows evenly across all partition points




