Thursday, October 30, 2008

Data Lineage in Oracle BI EE

Earlier this year, I had the opportunity to work in a small Oracle BI EE project with a not so small source database. One of my tasks was to enable the project team members to easily identify objects with specific indications such as long-running queries or a high number of requests or a redundancy problem or the like.

To put it short, what they wanted was: Data Lineage for Oracle BI metadata
The design - and later implementation - included the following
  • Oracle BI Usage Tracking (standard module)
  • combined with presentation catalog metadata
  • combined with rpd metadata
There were some obstacles on the way, as usual.

First, the Oracle BI rpd file is, well, a file and the data is not accessible like in a relational database. A fact which has been subject to criticism in recent times. The same holds true for the presentation catalog which has its information spread across thousands of small xml files.

To overcome this, we used the Repository Documentation utility of the Oracle BI Administration Tool. This allows you to create a flat csv file with the dependencies of all objects from presentation layer, business model and mapping layer and physical layer. The screenshot shows the file after the insertion of a primary key column, which is a necessary task if you wish to connect to it as a data source in Oracle BI. In the project we used an import package to load the file into a MS SQL Server database.


You can do the same with the presentation catalog using the Create Report option in the Tools menu of the Catalog Manager.

Both tasks are purely manual and up to this moment I have not found an automation solution for it (if you did, please drop a comment), which is the main reason why data lineage for Oracle BI is a little bit cumbersome. Whenever the rpd or catalog content changes, an administrator must manually extract the csv files and (ideally) load them into tables in a relational database.

As one commenter to this post has found out, there is an execution plan named Data Lineage in the current DAC repository (as created by the installer for Oracle BI Applications 7.9.5). A short investigation reveals that there are indeed table definitions which supposedly hold the data from the rpd and presentation catalog. At the moment this is only in the DAC repository and there is no accompanying ETL mapping in the Informatica repository. I would be so glad to be wrong here - if so, please add your comments below.

Data Lineage subject area (inactive by default) in DAC

We will see what future versions of Oracle BI Applications bring. Maybe a mapping which loads the csv files into the tables along with a ready-to-use rpd file (just dreamin'...)

Now comes the knitting, which means that you have to create complex (physical) joins to glue the rpd metadata together with the presentation catalog data. For a first test you can use the presentation layer object names (table and column) to get first results in Answers. (This post does not deal with the creation of the business model and presentation layer in the rpd file to access the data).

The S_NQ_ACCT table holds the valuable data from Oracle BI Usage Tracking, it has a column which contains the request path which allows us to join it to the presentation catalog data.

Now we have a somewhat complete data lineage tool. Below is a screenshot from one of the first dashboard prototypes of the a/m project which allowed us to visualize the benefits of caching.


There would be more to tell and maybe I find the time to post more on this, but let's dream with me a little more...

What about bringing in the DAC repository, the Informatica repository, the Siebel CRM repository (right-click on a Siebel applet and see a report which describes all BC fields, columns and tables and their mapping to Oracle BI)...whoa...just woke up

have a nice day

4 comments:

Vestan Pance said...

"What about bringing in the DAC repository, the Informatica repository, the Siebel CRM repository (right-click on a Siebel applet and see a report which describes all BC fields, columns and tables and their mapping to Oracle BI)"

You can do this with Informatica Meatadata Manager

krishna said...

Integration between ODI to OBIEE11g successfully completed. All reports and Dashboards are working fine.
We are facing so many issues but finally we succeeded.Default all lineage reports are working .I have one question here,Could you please help us ASAP

Is it possible to create a report for the reports to select a column they see in the presentation catalog and show them the source column and table information and the transformations?

I am looking for a report where the users can enter the source table and column and get the presentation layer information including all the transformation it has gone through and vice versa.

Alexander Hansal said...

Hi krishna,

while your question is a bit unspecific, you should very well be able to produce any kind of 'report' once the data is available in Answers.

For the users to enter the information, you could use dashboard prompts or embedded prompts.

have a nice day

@lex

krishna said...

Thank You very much for your reply Lex.I will analyze more detail level, if I need any other information I will post my comment here.

Thank You