Ab Initio From the Data Warehousing Perspective
Data Warehousing: Why did it arise?
Large corporations gathered huge amounts of data.
Soon they were data rich but information poor.
With a large body of disparate data, it was difficult to make informed business decisions.
Solution
Users wanted more control over their data.
Each department requested data specific to its own needs.
Departments needed varying amounts of data from the same source.
One solution to these problems: the information/data warehouse.
What is Data Warehousing?
Data warehousing is an architecture for storing, retrieving, and managing large amounts of any type of data.
Data warehousing is concerned with moving data from its current location into the data warehouse and transforming that data into information.
E.T.L: What is it?
E.T.L stands for Extraction, Transformation, and Loading.
This is the principle used in data warehousing to move data from its source into the warehouse.
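The three E.T.L stages can be sketched in a few lines of plain Python. This is only an illustration of the principle, not Ab Initio code: the sample CSV data, the `sales` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all assumptions made up for this example, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from an operational system.
SOURCE_CSV = """customer_id,name,amount
1, alice ,10.50
2,BOB,20.00
3,carol,not_a_number
"""


def extract(text):
    """Extraction: read raw rows from the source (here, an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: tidy names, convert amounts, reject unparseable rows."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip records whose amount is not a number
        name = row["name"].strip().title()
        cleaned.append((int(row["customer_id"]), name, amount))
    return cleaned


def load(rows, conn):
    """Loading: write the cleaned rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text=SOURCE_CSV):
    """Run extract, transform, and load, then return what was loaded."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    query = "SELECT customer_id, name, amount FROM sales ORDER BY customer_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Keeping each stage in its own function mirrors the way the stages are named separately in E.T.L: the source format, the cleaning rules, and the target table can each change without touching the other two.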
Characteristics of Data Warehousing
Application independent
Collected at any moment in a business cycle
Metadata has been created for it
Easily understood by a non-technical
Characteristics of data in data warehouse
Subject Oriented
Integrated
Non-volatile
Time Variant
Subject-Oriented
Focuses on entities rather than on processes.
A subject-oriented subset of a data warehouse, typically serving a single department, is called a data mart.
ETL Tools
Many tools are used in data warehousing for the purpose of ETL. Ab Initio is one of the major ETL tools.
What is Ab Initio?
A Latin phrase meaning “From First Principles”
An ETL tool developed by Ab Initio Software Corporation (http://www.abinitio.com)
Used in data warehousing, batch processing, and application integration
Why Ab Initio?
Achieving Scalability
Reduced Development Time
Managing Metadata
Integrating Other Applications
Features
Basic Components: Filter by Expression, Reformat, Sort, Join, Rollup, Dedup
Database Components: Input, Output and Update Table; db_config_utility
Built-in Functions: Ab Initio built-in functions can manipulate strings, dates, and numbers, and can access system properties
Vectors: An array of repeated elements of the same type
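To give a feel for what the basic components listed above do, the sketch below imitates each one with plain Python on a small in-memory data set. It is an analogy only, not Ab Initio code or its API: the sample `orders` and `customers` data, the field names, and the `pipeline` function are hypothetical, and real components operate on record streams inside a graph rather than on Python lists.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input records; one order appears twice on purpose.
orders = [
    {"order_id": 1, "cust": "C2", "amount": 30.0},
    {"order_id": 2, "cust": "C1", "amount": -5.0},
    {"order_id": 3, "cust": "C1", "amount": 12.0},
    {"order_id": 3, "cust": "C1", "amount": 12.0},
    {"order_id": 4, "cust": "C2", "amount": 8.0},
]
# Hypothetical second input, keyed by customer id.
customers = {"C1": "North", "C2": "South"}


def pipeline(orders, customers):
    # Filter by Expression: keep only records that satisfy a condition.
    kept = [o for o in orders if o["amount"] > 0]

    # Reformat: change the shape of each record (here, amount becomes cents).
    shaped = [
        {"order_id": o["order_id"], "cust": o["cust"],
         "cents": round(o["amount"] * 100)}
        for o in kept
    ]

    # Sort: order records by a key.
    key = itemgetter("cust", "order_id")
    ordered = sorted(shaped, key=key)

    # Dedup: keep the first record of each run of equal keys (needs sorted input).
    deduped = [next(group) for _, group in groupby(ordered, key=key)]

    # Join: combine each record with the matching record from a second input.
    joined = [
        {**r, "region": customers[r["cust"]]}
        for r in deduped
        if r["cust"] in customers
    ]

    # Rollup: aggregate records that share a key (total cents per region).
    totals = {}
    for r in joined:
        totals[r["region"]] = totals.get(r["region"], 0) + r["cents"]
    return totals


if __name__ == "__main__":
    print(pipeline(orders, customers))
```

Note that the dedup step relies on the preceding sort, because `groupby` only merges adjacent records with equal keys.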
Look Up Function
A built-in function, used within a transform function, that allows a transform component to retrieve records from a lookup file
The lookup file is held in main memory
Faster, as searching and retrieval are key based
The lookup file is not connected to other components in a graph
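The idea of a key-based, in-memory lookup can be sketched in Python as follows. This is a rough analogy under stated assumptions, not the Ab Initio lookup function itself: the `LookupFile` class, the `enrich` transform, the field names, and the exchange rates are all invented for illustration.

```python
class LookupFile:
    """Hypothetical stand-in for a lookup file: records indexed by key in memory."""

    def __init__(self, records, key):
        self._index = {}
        for rec in records:
            # Keep the first record seen for each key value.
            self._index.setdefault(rec[key], rec)

    def lookup(self, key_value, default=None):
        # Dictionary access is key based, so no scan over all records is needed.
        return self._index.get(key_value, default)


def enrich(record, rates):
    """A transform that calls the lookup to add a converted amount."""
    match = rates.lookup(record["currency"])
    if match is None:
        return {**record, "amount_usd": None}  # no matching lookup record
    return {**record, "amount_usd": record["amount"] * match["usd_rate"]}


# Made-up rates, chosen only to keep the arithmetic easy to follow.
rates = LookupFile(
    [{"currency": "AAA", "usd_rate": 2.0}, {"currency": "BBB", "usd_rate": 0.5}],
    key="currency",
)

if __name__ == "__main__":
    print(enrich({"id": 1, "currency": "AAA", "amount": 10.0}, rates))
    print(enrich({"id": 2, "currency": "ZZZ", "amount": 10.0}, rates))
```

The transform reaches the lookup data through a function call rather than through an input of its own, which loosely corresponds to the lookup file not being connected to other components.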
Alternatives to Ab Initio
Informatica and Ascential are alternatives to Ab Initio, but their main disadvantage is that they are harder to work with.
Highlights
Every plug-in facility available from industry leaders Informatica and Ascential is incorporated into Ab Initio
Fastest ETL: it was possible to extract 41 million rows of data from an Oracle 8i database (the Geneva billing system) in about 5.2 minutes
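To put the extraction figure in perspective, the short calculation below converts it into a rate. It uses only the two numbers quoted above; the function name `rows_per_second` is an assumption for this example.

```python
def rows_per_second(rows, minutes):
    """Average throughput for a job that moved `rows` rows in `minutes` minutes."""
    return rows / (minutes * 60)


# Figures quoted above: 41 million rows in about 5.2 minutes.
rate = rows_per_second(41_000_000, 5.2)

if __name__ == "__main__":
    print(f"{rate:,.0f} rows per second")
```

That works out to roughly 131,000 rows per second on average.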
Success Stories
Bank of Montreal: moves and analyzes 10 terabytes (TB) of data daily using Ab Initio
Premier Inc. (www.premierinc.com, health care services): successfully handled 14 TB of data using Ab Initio to achieve scalability and data quality
End
Thank You for your time!!