Data Model Source Meta Data Exposed

In this issue we’ll discuss Data Movement meta data in ERwin R8 and how to report on it using the Erwin SQL Query tool.

In Erwin R7 you had the luxury of using the Data Browser to report Data Source meta data. One would simply open any ERP file (you could even use STANDARD or REPORTS.ERP), define a NEW COLUMN level report, and complete the required fields (see below).

In the R7 ERwin Data Browser you create a New Column level report and select the Data Warehouse Source options at the bottom of the pop-up to report on source column meta data. Available up to version 7.3.X of ERwin

Results of executing the R7 Data Browser report to display Column Source meta data, including the Target Table and Column name, and the Source Schema, Table Name, Column Name, and Source Column datatype.

 

Note: In 7.3.X Erwin, if you define a TABLE level Data Browser report you’ll be able to see DATA WAREHOUSE RULE meta data which defines the population strategy and timing for the TABLE as a whole. We’ll pick this up in a future issue…In Erwin R8, you can achieve the same results, and with a little investigation of the ERwin meta model, you can extend the reporting capability beyond what was offered in R7. Let’s start with the meta model…
In the Sandhill 1-day ODBC class we cover in detail the R8 Erwin meta model abstraction and explain Erwin’s “dual objects” . Two of those dual objects you’ll need to know for Data Source reporting are ENTITY and ATTRIBUTE. You‘ll also need the meta model tables that begin with “DATA_SOURCE_”. You might build a Subject Area and Diagram including just these tables so you can focus on the meta data to be extracted.

Notice that the MODEL table is the parent of ENTITY, which is the parent of ATTRIBUTE (Entity owns ATTRBIUTE), and there is an identifying relation to DATA_SOURCES_REF. Notice also that MODEL is a parent of DATA_SOURCE_OBJECT which is parent of DATA_SOURCE_TABLE which is a parent of DATA_SOURCE_COLUMN, a very important table here in this discussion.

Start by reviewing the relations to the tables and note the table DATA_SOURCES_REF. This table, like all other tables named “REF” in the ERwin meta model, contains links or pointers to the OBJECT ID’s (ID@ and OWNER@ fields) in the parent and child tables that you’ll need. In this case DATA_SOURCES_REF links the ID of the COLUMN (ID@) to the DATA SOURCE COLUMN (VALUE@). So from this table we can gather target column and DATA SOURCE column mapping information and any other meta data we want to report.

Aove we have a query against the DATA_SOURCE_COLUMN table that reveals the source column meta data that you select in the Data Source Editor when building the mappings. Here we have selected the Table Name (owner_path), (source) Column Name, and Column (physical) Data Type.

Next we join the ATTRIBUTE table to the DATA SOURCE REF table to reveal the data source transform comment and the column meta data including target Column ID (id@), Target Physical (Column) Name, Target column Physical_Data_Type, and Source Column ID (value@)

Finally we pull in the Target Table Name (from the ENTITY table – remember dual objects) and pretty up the report-column headings. We now have a complete report of the Source Table/Column and Target Table/Column mapping and transform comments/instructions.

Here is the complete query that you can run against your own models and export to Excel or use Crystal Reports to create a more elaborate presentation. You’ll have to do a little extra work where multiple source columns are mapped, and there are a few ways to accomplish this…

 

Now that you have defined the Data Movement mapping for each of the Tables and Columns in the model, you can turn your attention to publishing this meta data using the SQL Query tool and a little knowledge of the ERwin meta model. You can use the grouping and subselect ability of Crystal to filter data from the final report, and you can publish the report to HTML using Crystal as well.

2 Responses to Data Model Source Meta Data Exposed

  1. Wasif says:

    Hi ,

    I have a case where multiple sources are populating the target table . In this case, if i add multiple columns , the query will generate multiple columns against that. You have already this case in your article. Please suggest other alternative ways of doing or if you have any article like this , that would be great.

    • steve says:

      Hi Wasif,

      Unfortunately, the nature of the data means that if you have more than one source column you will always have more than one row in the query result.

      Did you know that from version 9.6 you can now access the data movement mappings through the Report Designer (Tools–>Report Designer)?

Leave a Reply

Your email address will not be published. Required fields are marked *