
Data Integration

18 August 2006

Data integration is the process of accumulating and combining data sets from disparate sources at various locations. This data can then be used for business intelligence, CRM, data mining, or other applications that involve analyzing data in order to make key decisions.

Data integration incorporates a series of processes: a sequence of applications that extract data from various sources, bring it to a data staging area, programmatically prepare it for migration into the data warehouse, and finally load it into the data warehouse and data marts.
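The sequence described above (extract, stage, prepare, load) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems are stand-in lists, the staging area is a plain in-memory list, and the "warehouse" is an in-memory SQLite database. The names `CRM_ROWS`, `BILLING_ROWS`, and `customer_fact` are hypothetical and do not come from any particular product.

```python
import sqlite3

# Hypothetical source systems, represented here as in-memory records.
CRM_ROWS = [{"id": 1, "name": " Alice ", "spend": "120.50"}]
BILLING_ROWS = [{"id": 2, "name": "Bob", "spend": "80"}]


def extract():
    """Pull raw records from each source, tagging where they came from."""
    for source, rows in (("crm", CRM_ROWS), ("billing", BILLING_ROWS)):
        for row in rows:
            yield {**row, "source": source}


def stage(records):
    """Copy raw records into a staging area (a plain list in this sketch)."""
    return list(records)


def prepare(staged):
    """Prepare staged rows for the warehouse: trim names, parse amounts."""
    return [
        (r["id"], r["name"].strip(), float(r["spend"]), r["source"])
        for r in staged
    ]


def load(conn, rows):
    """Load prepared rows into the (assumed) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_fact "
        "(id INTEGER PRIMARY KEY, name TEXT, spend REAL, source TEXT)"
    )
    conn.executemany("INSERT INTO customer_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run extract -> stage -> prepare -> load and return the loaded rows."""
    load(conn, prepare(stage(extract())))
    return conn.execute(
        "SELECT id, name, spend, source FROM customer_fact ORDER BY id"
    ).fetchall()
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` runs the whole chain once. A real deployment would typically replace each function with a separate job and a persistent staging store.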

Data integration usually involves operations such as data conversion, cleansing, formatting and aggregation. After the data is extracted, a number of transformations may be applied in preparation for consolidation. The data is subsequently loaded into data warehouses, data marts or dimensional data structures used for decision support systems or business intelligence systems.
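As a rough illustration of the four operations named above (conversion, cleansing, formatting, aggregation), the sketch below applies them in order to a handful of made-up sales records. The field names, the `dd/mm/yyyy` input date format, and the rule of dropping rows with a blank amount are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime

# Made-up raw records with inconsistent casing, whitespace and a blank value.
RAW = [
    {"region": " east ", "amount": "100.0", "date": "18/08/2006"},
    {"region": "EAST", "amount": "50.5", "date": "19/08/2006"},
    {"region": "west", "amount": "", "date": "19/08/2006"},
    {"region": "West", "amount": "25", "date": "20/08/2006"},
]


def convert(row):
    """Conversion: turn strings into typed values (None for a blank amount)."""
    amount = float(row["amount"]) if row["amount"].strip() else None
    day = datetime.strptime(row["date"], "%d/%m/%Y").date()
    return {"region": row["region"], "amount": amount, "date": day}


def cleanse(rows):
    """Cleansing: drop rows that lack a usable amount."""
    return [r for r in rows if r["amount"] is not None]


def format_row(row):
    """Formatting: normalize region names and emit ISO 8601 dates."""
    return {
        "region": row["region"].strip().title(),
        "amount": row["amount"],
        "date": row["date"].isoformat(),
    }


def aggregate(rows):
    """Aggregation: total amount per region."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def transform(raw):
    """Apply the four steps in sequence."""
    rows = [convert(r) for r in raw]
    rows = cleanse(rows)
    rows = [format_row(r) for r in rows]
    return aggregate(rows)
```

With the sample data, `transform(RAW)` groups the two "east" spellings together and skips the row with the blank amount.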

Key benefits of data integration:

- Availability of data
- Enhanced data quality
- Better manageability
- Improved decision making
- Higher return on investment (ROI)

Three Approaches to Data Integration

  1. Data Consolidation
  2. Data Federation
  3. Data Sharing (replication)

Data Consolidation

This approach consolidates heterogeneous data into a central database. It is the simplest form of data integration. Data consolidation provides fast application deployment, fast access to global data, and low administration costs.
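A minimal sketch of consolidation, assuming two heterogeneous sources (one CSV, one JSON) whose field names differ. Each source gets a small reader that maps its records onto a common shape, and everything is physically copied into one central table. The source payloads and the `customers` table are hypothetical.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources with different formats and field names.
CSV_SOURCE = "id,name\n1,Alice\n2,Bob\n"
JSON_SOURCE = '[{"customer_id": 3, "full_name": "Carol"}]'


def read_csv_source(text):
    """Map CSV rows onto the common (id, name) shape."""
    for row in csv.DictReader(io.StringIO(text)):
        yield int(row["id"]), row["name"]


def read_json_source(text):
    """Map JSON objects onto the common (id, name) shape."""
    for obj in json.loads(text):
        yield obj["customer_id"], obj["full_name"]


def consolidate(conn):
    """Copy every source into one central table."""
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    sources = ((read_csv_source, CSV_SOURCE), (read_json_source, JSON_SOURCE))
    for reader, payload in sources:
        conn.executemany("INSERT INTO customers VALUES (?, ?)", reader(payload))
    conn.commit()
    return conn
```

After `consolidate(sqlite3.connect(":memory:"))`, applications query a single table and no longer need to know that the data originated in two formats.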

In recent years many large-scale data integration projects have used the data consolidation approach. Many different types of data, including audio, video, XML, email, messages, etc., have been consolidated from various platforms (Windows, Linux, Solaris, HP-UX, AIX, Tru64, OpenVMS, OS/390, etc.) into centralized very large databases with proven scalability.

Data Federation

This approach federates data in multiple data stores into a single virtual database. Data federation hides the physical location of data from applications and provides access to both structured and unstructured data. It provides fast integration and supports integration of data that cannot be consolidated, such as data in legacy applications and data requiring local ownership. Data federation is often implemented via web services.
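To make the contrast with consolidation concrete, here is a small sketch of a federated view. It assumes two independent in-memory SQLite databases as stand-ins for remote stores (in practice these might sit behind web services), and a hypothetical `FederatedView` class that queries each store at request time instead of copying data centrally.

```python
import sqlite3


def make_store(table, rows):
    """Create one independent physical store holding a single table."""
    conn = sqlite3.connect(":memory:")
    # Table names are trusted constants in this sketch, never user input.
    conn.execute(f"CREATE TABLE {table} (id INTEGER, value TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return conn


class FederatedView:
    """Presents several physical stores as one logical collection."""

    def __init__(self):
        self._sources = []  # list of (connection, physical table name)

    def register(self, conn, table):
        """Add a store; callers of find() never see where it lives."""
        self._sources.append((conn, table))

    def find(self, record_id):
        """Query each source at request time; nothing is copied centrally."""
        results = []
        for conn, table in self._sources:
            cursor = conn.execute(
                f"SELECT id, value FROM {table} WHERE id = ?", (record_id,)
            )
            results.extend(cursor.fetchall())
        return results
```

A caller registers each store once and then uses `find` without referring to table names or connections. The data stays under the ownership of each source, which is the property the federation approach is meant to preserve.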

Data Sharing

This approach was traditionally implemented as replication or message queuing. It has evolved to include warehouse loading, event notification, workflow, and EAI. The data sharing approach lets data be shared between users, applications, and databases by moving or copying data as needed.
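The replication and message-queuing style of data sharing can be sketched as a primary store that records each change as a message, plus a routine that applies those messages to a replica. This is a simplified, single-process illustration: the `Primary` class and `apply_changes` function are hypothetical, and Python's standard `queue.Queue` stands in for a real message broker.

```python
import queue


class Primary:
    """A key-value store that publishes every change to a queue."""

    def __init__(self):
        self.data = {}
        self.changes = queue.Queue()

    def put(self, key, value):
        self.data[key] = value
        self.changes.put(("put", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.changes.put(("delete", key, None))


def apply_changes(changes, replica):
    """Drain queued change events and apply them to the replica in order.

    Returns the number of events applied.
    """
    applied = 0
    while True:
        try:
            op, key, value = changes.get_nowait()
        except queue.Empty:
            return applied
        if op == "put":
            replica[key] = value
        else:
            replica.pop(key, None)
        applied += 1
```

Because the replica only sees changes when `apply_changes` runs, it can lag behind the primary between runs. That lag is the usual trade-off of copying data as needed instead of querying it in place.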
