by Deepak Vohra

 

Oracle Data Integrator is a tool for data transformation, validation, and integration based on a declarative design model and making use of a E-LT (Extract-Load Transform) architecture. ODI is efficient, and easy to develop with and maintain. ODI supports several RDBMSes including Oracle Database, IBM DB2, and also supports other types of data servers such as Hadoop HDFS, Hive, text file, LDAP and XML. We shall be discussing using ODI to integrate text file data into Oracle Database 12c. The ODI architecture is based on a central Oracle Data Integrator Repository, which stores all the information including metadata of applications and projects, and configuration information, and run logs. The ODI Repository is made of one Master Repository and one or more Work Repositories. A repository may be created either using the Repository Creation Utility (RCU) or with the ODI Studio.

This article has the following sections.

Pre-Requisite Software

Creating a User for ODI Main Repository

Creating Source and Target Data Stores

Installing Oracle Data Integrator 12c

Creating the Master Repository

Creating a Work Repository

Creating a Login

Creating the Topology for Source and Target

Creating the Design Models for Source and Target

Creating an Integration Project

Creating a Mapping from Source To Target

Running the Mapping

 

Pre-Requisite Software

Download and install the following software:

  • Java 7 or later (should not be OpenJDK)-JDK 1.7.0_51 used in the article
  • Oracle Database 12c (http://www.oracle.com/technetwork/database/enterprise-edition/downloads/index.html)
  • Oracle Data Integrator 12c (http://www.oracle.com/technetwork/middleware/data-integrator/downloads/index.html?ssSourceSiteId=otnru)

 Extract the ODI zip file ofm_odi_generic_12.1.2.0.0_disk1_1of1.zip to a directory.

 

Creating a User for ODI Main Repository

We need to create a Oracle Database table for the Main Repository. In SQL Plus run the following SQL script to create database table prod_odi_repo and grant the required permissions.

create user prod_odi_repo identified by oracle;

GRANT CONNECT, RESOURCE, CREATE ANY TABLE, CREATE SESSION TO prod_odi_repo;

 

Creating Source and Target Data Stores

As we shall be integrating a text file into Oracle Database 12c we need to create data stores for a text file and a Oracle Database table. Create a text file wlslog.txt in the

C:\Oracle\Middleware\Oracle_Home\odi\inputfile directory with the following tab delimited and header columned data. For variable width field values the last column index of each of the field values in a column must be within a common column index. With different field values ending at different column indexes the greatest column index is the common delimiting column index.

TIMESTAMP            CATEGORY     TYPE        SERVERNAME    CODE        MSG
Apr-8-2014-7:06:16-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000365    Server state changed to STANDBY
Apr-8-2014-7:06:17-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000365    Server state changed to STARTING
Apr-8-2014-7:06:18-PM-PDT    Notice        Log Management    AdminServer    BEA-170027    The Server has established connection with the Domain level Diagnostic Service successfully
Apr-8-2014-7:06:19-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000365    Server state changed to ADMIN
Apr-8-2014-7:06:20-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000365    Server state changed to RESUMING
Apr-8-2014-7:06:21-PM-PDT    Notice        Server            AdminServer    BEA-002613    Channel    Default is now listening on fe80:0:0:0:0:5efe:c0a8:147:7001 for protocols iiop t3ldap snmp http
Apr-8-2014-7:06:22-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000331    Started WebLogic Admin Server AdminServer for domain base_domain running in Development Mode
Apr-8-2014-7:06:23-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000365    Server state changed to RUNNING
Apr-8-2014-7:06:24-PM-PDT    Notice        WebLogicServer    AdminServer    BEA-000360    Server started in RUNNING mode

For the target Oracle Database table the following SQL script creates database table OE.WLSSERVER.

CREATE TABLE OE.wlsserver (timestamp VARCHAR2(55), category VARCHAR2(15), type VARCHAR2(55), servername VARCHAR2(15), code VARCHAR2(15), msg VARCHAR2(255));

Run the SQL script in SQL Plus to create the OE.WLSSERVER table. The table data structure may be listed with the DESC command.

  

Installing Oracle Data Integrator 12c

Next, we shall install Oracle Data Integrator 12c. Run the following command in Windows CMD shell from the directory in which the ODI zip is extracted.

java -jar odi_121200.jar

The Oracle Data Integrator Installation wizard gets started. Click on Next in Welcome.

  

In Installation Location select a directory for ORACLE_HOME. Click on Next.

  

In Installation Type select Enterprise Installation and click on Next.

  

In Prerequisite Checks a test gets run to verify if the system has the required prerequisites for OS certification, physical memory and Java version. If all the prerequisites are met click on Next.

  

In Installation Summary the ODI components to be installed are listed. Click on Install.

  

The installation starts and the Installation Progress is indicated as % completion.

  

When the installation is completed as indicated by 100% in the Installation Progress click on Next.

  

The ODI components installed get listed. Click on Finish.

  

In Confirm Import Preferences click on No.

  

Oracle Data Integrator 12c gets installed.

 

Creating the Master Repository

The Master Repository stores information about users, profiles and permissions. The Master Repository also stores topology information about data server definitions, physical schemas and logical schemas, technologies and contexts.

Click on New to create the Master Repository.

  

In New Gallery select ODI in Categories and Master Repository Creation Wizard in Items. Click on OK.

  

In Repository Connection specify the connection parameters to connect to Oracle Database 12c in Database Connection. For Technology select Oracle. In JDBC Driver field specify oracle.jdbc.OracleDriver. Specify the connection JDBC Url as jdbc:oracle:thin:@localhost:1521:ORCL. Specify User as PROD_ODI_REPO. Specify the Password for the User. Specify DBA User as system and specify the DBA Password. Click on Next.

  

In Authentication select Use ODI Authentication and select the default Supervisor User, which is SUPERVISOR. Specify the Supervisor Password and Confirm Password. Click on Next.

  

In Password Storage select Internal Password Storage and click on Finish.

  

A Master Repository Creation Wizard dialog indicates that “ODI is creating your master repository”.

  

When the master repository has been created another dialog indicates the same. Click on OK.

  

 

Creating a Work Repository

The developed objects are stored in one or more Work Repositories. A work repository stores the design models, data structures of the data stores including table and column definitions and field definitions. Information about projects, packages, procedures, folders and knowledge modules is also stored in the work repositories. Scenarios, scheduling information and logs are also stored in a work repository.

To create a work repository we need to disconnect from the Master Repository. Select ODI>Disconnect.

  

Select the Topology tab. In Repositories right-click on Work Repositories and select New Work Repository.

  

The Work Repository Creation Wizard gets started. In Specify ODI Work Repository connection properties select Technology as Oracle. Specify the JDBC Driver, the JDBC Url, and master repository user and password. Click on Test Connection to test the connection with the master repository user.

  

A Successful Connection dialog indicates the connection is established. Click on OK.

  

Click on Next in the Work Repository Creation Wizard.

  

A dialog prompts that because the use of localhost or 127.0.0.1 hostname is not recommended “Would you like to continue?” Click on Yes.

  

In Specify ODI Work Repository properties specify Name as WORKREP1 and specify a Password. Select Work Repository Type as Development. Click on Finish.

  

A dialog prompts that the work repository is being created.

  

When the work repository has been created another dialog prompts if a login for the work repository should be created. Click on No.

  

Creating a Login

After the master repository and the work repository have been created create a login for the Repositories. Click on Connect to Repository…

  

In Oracle Data Integrator Login click on “New” button.

  

In Repository Connection Information specify a Login Name and User (Supervisor) and Password. By default connection is established with master repository only as indicated by Master Repository Only in Work Repository Only header. Click on Test to test the connection.

  

A Successful Connection dialog indicates that the connection is successful.

  

Click on OK in Repository Connection Information.

  

Click on Ok in Oracle Data Integrator Login.

  

If the Master Repository Only is selected the Work repository and the Development environment is not connected to. In a later section we shall discuss connecting to the Work repository.

  

 

Creating the Topology for Source and Target

The topology describes the physical and logical architecture of the technologies and data types including the data servers and the schemas associated with the information systems. The physical architecture describes the components of the information systems such as a text file or a database table. The ODI components, such as a design data model, are not directly based on the physical architecture and the physical schemas but are based on a logical architecture and logical schema. A runtime context resolves a logical schema to a physical schema. A context establishes a basis for linking the physical components of the information systems with the logical components of the ODI. For example, the context could be based on the environment such as Test, Development and Production. A context could be a location of a physical resource such. The default context is Global.

A data server is the physical component that stores the structured data. A data server is linked to a single technology. For example for a text file the technology would be File and for Oracle Database the technology would be Oracle. The data store associated with File would be a text file and the data store associated with Oracle Database would be a database table.

The data stored in a data server is based on a technical logic described in a physical schema.

A logical schema describes the actual schema used by ODI in obtaining data for the integration. A logical schema could be linked to multiple physical schemas in different contexts. At runtime only one of the physical schemas is selected based on the context.

To create a topology for the Source and the target data stores the following is required for each.

  1. Create Data Server
  2. Create Physical Schema
  3. Create Logical Schema

To create a data server for the text file data store select Physical Architecture>File. Right-click on File>FILE_GENERIC and select Open.

  

The Data Server configuration wizard is opened. The Data Server Name is FILE_GENERIC by default. Specify Host as localhost. Specify Connection User and Password to the OS.

  

To create a physical schema linked to the data server right-click on FILE_GENERIC and select New Physical Schema.

  

In the Physical Schema wizard select the Directory (Schema) as the directory C:\Oracle\Middleware\Oracle_Home\odi\inputfile in which the wlslog.txt file is. Select the same directory for Directory (Work Schema). Click on Save to save the physical schema.

  

A physical schema gets created.

  

Next, create a logical schema associated with the physical schema. The logical schema is what the ODI makes use of in the integration of data. Select Logical Architecture>File and right-click and select New Logical Schema.

  

In Logical Schema wizard Definition specify a Name and select the physical schema created earlier for the Global Context Physical Schemas.

With the data server, physical schema and a logical schema for the source data store completed the Source topology has been created.

Next, create a topology for the target data store. Select Physical Architecture gain and select Oracle. Right-click on Oracle and select New Data Server.

  

In the Data Server configuration wizard specify a Name for the data server. The Technology is Oracle. Specify the User and Password to connect to Oracle Database.

  

Select the JDBC tab and specify the JDBC Driver and JDBC Url. Click on Test Connection to test the connection using the connection parameters.

  

In the Test Connection dialog click on Test with the default Physical Agent as Local.

  

A Successful Connection dialog message indicates a connection has been established.

Click on OK.

  

Next, create the physical schema for the Oracle Database data server. Right-click on the data server node and select New Physical Schema.

  

In the Physical Schema configuration wizard specify the Schema (Schema) and Schema (Work Schema) as OE. Click on Save to save the Physical Schema definition.

  

A physical schema for the Oracle technology data server gets created.

  

Next, create a logical schema associated with the physical schema. Right click on Logical Architecture>Oracle and select New Logical Schema.

  

In Logical Schema Definition specify a Name and select the Physical Schema for the Global Context.

  

Click on Save to save the Logical schema. A logical schema gets added to the Oracle technology in Logical Architecture.

  

The source and target topologies get completed.

 

Creating the Design Data Models for Source and Target

The topology only describes the source and target data servers and their physical and logical schemas, but not the data model for the source and the target. A design data model is a schema definition including the data store structures and columns and fields definitions. Next we shall create the data models for the source and the target.

Because we need access to the actual data objects for the data models we need to connect to the work repository. Click on Connect To Repository.

  

In the Oracle Data Integrator Login click on Edit.

  

In the Repository Connection Information the Oracle Data Integrator Connection and the Database Connection (Master Repository) parameters are the pre-specified as we had configured the login for the master repository earlier.

In Work Repository select the WORKREP1 work repository and click on OK.

  

Having connected to the work repository the wizards to create Projects and Models become available. In Models select New Model.

  

In the Model Definition specify a model Name and select Technology as Oracle. Select the Logical Schema for the Oracle technology created earlier. In Action Group select <Generic Action>.

  

Click on Save to save the model definition. Select the Reverse Engineer tab and select the Global Context and Types of objects to reverse engineer as Table, and click on Reverse Engineer button to get the metadata for the Oracle Database into the work repository.

  

The Oracle Database tables’ metadata including for the WLSSERVER table created earlier get imported into the work repository model.

  

Right-click on the WLSSERVER table node and select View Data.

  

The columns in the WLSSERVER table get displayed but without any data as we have not yet integrated the text file with Oracle Database

 

A data model (DB-OE) for the target data store gets created. Next, create a data model for the source data store. Select New Model Folder in Models

  

In Model Folder Definition specify a Name and click on Save.

  

A Model Folder gets created. Right-click on the Model Folder node and select New Model.

  

In the Definition for the Model specify a Name and select Technology as File. Select the Logical Schema for the File technology created earlier. In Action Group select Generic Action and click on Save.

  

Next, we need to configure a data store with the File data model. Right-click on the data model node and select New Datastore.

  

In the Datastore Definition specify a Name and select Datastore Type as Table. Select the Resource Name as wlslog.txt. Click on Save.

  

Select the Files tab and select File Format as Fixed. As the first line in the wlslog.txt file is the headers line specify Heading (Number of Lines) as 1. Select Field Separator as Tab. Select the Attributes tab.

  

The Attribute Setup Wizard gets started. In the Fields click on the column index in the header at which the first column field values end. A red dot gets added at the column index. All the field values for the first column must be within the selected column index. Select the first column field values and specify Attribute Name as TIMESTAMP and Datatype as String.

  

Similarly add a red dot at the column index at which the second column field values end, select the second column field values and specify the Attribute Name (CATEGORY) and Datatype for the second column. The attribute names should preferably be the same as the column names in the database table to which the text filed data is to be mapped and integrated as the mapping does not have to be customized with the same name attributes.

  

Similarly, select the third column and specify the Attribute Name (TYPE) and Datatype.

  

Similarly configure the SERVERNAME attribute. All the field values must be within the selected column width, though within a column the field values could be not aligned.

  

Similarly select the 5th column, specify Attribute Name as CODE and Datatype as String.

  

For the last column the field values do not have to be demarcated with a red dot at the column index at which the field values end. Click on OK.

  

The Attributes for the File technology data model get configured. Any attribute name and datatype not configured in the Attribute Setup Wizard may be configured in the data model Attributes tab table.

  

Click on Save to save the data model. Click on Reverse Engineer to reverse engineer the data from the text file to the work repository.

  

The wlslog.txt file data gets reverse-engineered into the work repository. Right-click on the Model node and select View Data to display the data reverse-engineered into the work repository.

  

The text file data gets displayed as imported into the work repository Model.

 

Creating an Integration Project

An ODI integration project implements the data integrations, validations and transformations making use of the topology and the data models. In the Designer tab select New Project in Projects.

  

In Project Definition specify a Project Name. Click on Save.

  

A project node gets added. Next, import the required Knowledge Modules for the integration project. A Knowledge Module is a code template used for a specific task such as reverse engineer to get metadata associated with a datastore or integrate data from a staging zone to a target data server.

Right-click on Knowledge Modules in the project node and select Import Knowledge Modules..…

  

In the Import Knowledge Modules (XML File) wizard select the $ODI_HOME/sdk/xml-reference as File import directory. Select the IKM SQL Incremental Update Knowledge Module.

  

Also select the LKM File to SQL Knowledge Module. Click on OK.

  

An Import report for the Knowledge modules imported gets generated. Click on Close.

  

The select Knowledge Modules get imported into the integration project.

 

Creating a Mapping from Source To Target

Mapping is the mapping of a source data store to a target data store including the transformations, constraints, and business rules applied in the integration.

To create a mapping right-click on the Mappings node in the project and select New Mapping.

  

In New Mapping specify a Name and click on OK.

  

A mapping definition gets created. Next, select and drag data model definitions to the mapping. Select the data model for the wlslog.txt text file, drag the data model definition and drop on the Mapping palette.

  

The data model definition for the text file data store gets added.

  

Similarly drag the data model definition for the target data store onto the Mapping palette.

  

The data model for the Oracle Database gets added to the Mapping. Next, we need to define the mapping between the two data models. Select the right handle on the WLS data model and drag to the left handle on the WLSSERVER data model.

  

In Attribute Matching select the default options, which are to Map Options By Name, Ignore Case and Auto Map. Click on OK.

  

A mapping from the attributes of the source data model to the attributes of the target data model gets defined. Click on Perform Layout to format the layout.

  

Click on Save to save the Mapping.

  

A new Mapping gets defined.

  

 

Running the Mapping

Next, we shall run the Mapping to actually integrate the text file data into the Oracle Database table OE.WLSSERVER. Right-click on the mapping node in Projects>Mappings>MapTextToOracleDB and select Run.

  

In Run select the default settings and click on OK.

  

A Session started dialog gets displayed. Click on OK.

  

The text file data gets integrated into Oracle Database. Click on Operator and select Date to find the completion status of the mapping.

  

In the Models right-click on the WLSSERVER table node for the Oracle technology model and select View Data.

  

The data integrated into the WLSSERVER table gets displayed.

  

Run a SELECT SQL query in SQL Plus to list the data integrated into Oracle Database.

  

The 9 rows of data integrated from the text file into Oracle Database get listed.

  

In this article we integrated text file data into Oracleatabase 12c using Oracle Data Integrator 12c.