Data Integration
18 August 2006
Data integration is the process of collecting and combining data sets from disparate sources at various locations. The combined data can then be used for business intelligence, CRM, data mining, or other applications that analyze data in order to support key decisions.
Data integration incorporates a series of processes: a sequence of applications that extract data from various sources, bring it to a data staging area, programmatically prepare it for migration into the data warehouse, and finally load it into the data warehouse and data marts.
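The extract, stage, prepare, and load sequence described above can be sketched in a few lines. This is only a minimal illustration, assuming two hypothetical in-memory sources (`CRM_ROWS`, `BILLING_ROWS`) and made-up table names (`staging`, `warehouse`); a real pipeline would read from files, APIs, or operational databases.

```python
import sqlite3

# Hypothetical source data; the names and layout are assumptions for illustration.
CRM_ROWS = [("C001", " Alice ", "120.50"), ("C002", "Bob", "80")]
BILLING_ROWS = [("C003", "carol", "42.25")]


def extract():
    """Pull raw rows from each source, tagging each row with its origin."""
    for row in CRM_ROWS:
        yield ("crm", *row)
    for row in BILLING_ROWS:
        yield ("billing", *row)


def run_pipeline(conn):
    """Stage the raw rows, prepare them, and load them into the warehouse table."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE staging (source TEXT, customer_id TEXT, name TEXT, amount TEXT)"
    )
    cur.execute(
        "CREATE TABLE warehouse "
        "(customer_id TEXT PRIMARY KEY, name TEXT, amount REAL, source TEXT)"
    )

    # 1. Extract into the staging area exactly as received (everything is text).
    cur.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)", extract())

    # 2. Prepare: trim and normalize names, convert amounts to numbers.
    staged = cur.execute(
        "SELECT source, customer_id, name, amount FROM staging"
    ).fetchall()
    prepared = [
        (customer_id, name.strip().title(), float(amount), source)
        for source, customer_id, name, amount in staged
    ]

    # 3. Load the prepared rows into the warehouse.
    cur.executemany("INSERT INTO warehouse VALUES (?, ?, ?, ?)", prepared)
    conn.commit()
    return cur.execute(
        "SELECT customer_id, name, amount FROM warehouse ORDER BY customer_id"
    ).fetchall()
```

Keeping the staging table as raw text means a failed preparation step can be rerun without going back to the source systems.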
Data integration usually involves operations such as data conversion, cleansing, formatting and aggregation. After the data is extracted, a number of transformations may be applied in preparation for consolidation, and the data is subsequently loaded into data warehouses, data marts or dimensional data structures used by decision support or business intelligence systems. The following are some of the common tasks and functions of data integration:
- Data cleansing and enrichment – Profile, cleanse, augment, and monitor data to create consistent, reliable information.
- Extraction, transformation and loading (ETL) – Extract, transform and load data from across the enterprise to create consistent, accurate information.
- Data migration and synchronization – Capture and propagate data changes in real-time to ensure data integrity, consistency and credibility.
- Data federation – Query and use data across multiple systems without the physical movement of source data.
- Data warehousing – Build dimensional data structures for data warehouses, data marts, business intelligence and operational data stores.
- Data quality – Provide the infrastructure to maintain high-quality data in-house.
- Data consolidation – Consolidate complex, diverse, and large volumes of data into a new centralized system.
- Connectivity and metadata management – Leverage all data, regardless of source.
- Master data management – Creation of a unified view of enterprise data from multiple sources.
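As a rough illustration of the conversion, cleansing, formatting and aggregation operations listed above, the sketch below normalizes a handful of records and then totals them by region. The field names (`region`, `amount`, `date`) and the day/month/year input format are assumptions made for the example, not part of any particular tool.

```python
from collections import defaultdict
from datetime import datetime


def cleanse(record):
    """Return a normalized copy of a record, or None if it cannot be repaired."""
    region = (record.get("region") or "").strip().upper()
    if not region:
        return None  # no usable region: reject the record
    try:
        # Conversion: "1,200.00" -> 1200.0
        amount = float(str(record["amount"]).replace(",", ""))
        # Formatting: day/month/year -> ISO 8601 date string
        day = datetime.strptime(record["date"], "%d/%m/%Y").date().isoformat()
    except (KeyError, ValueError):
        return None  # missing field or unparseable value
    return {"region": region, "amount": amount, "date": day}


def aggregate_by_region(records):
    """Aggregation: sum the amounts of cleansed records per region."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["region"]] += rec["amount"]
    return dict(totals)


raw = [
    {"region": " emea ", "amount": "1,200.00", "date": "18/08/2006"},
    {"region": "EMEA", "amount": "300", "date": "19/08/2006"},
    {"region": "apac", "amount": "n/a", "date": "19/08/2006"},  # bad amount
    {"region": "", "amount": "50", "date": "19/08/2006"},       # missing region
]
clean = [c for c in map(cleanse, raw) if c is not None]
totals = aggregate_by_region(clean)
```

Rejected records are simply dropped here; a production process would more likely route them to an error table for review.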
Key benefits of data integration:
- Availability of data
- Enhanced data quality
- Better manageability
- Improved decision making
- Higher return on investment (ROI)
Three Approaches to Data Integration
- Data Consolidation
- Data Federation
- Data Sharing (replication)
Data Consolidation
This approach consolidates heterogeneous data into a central database. It is the simplest form of data integration. Data consolidation provides fast application deployment, fast access to global data, and low administration costs.
In recent years many large-scale data integration projects have used the data consolidation approach. Many different types of data, including audio, video, XML, email, messages, etc., have been consolidated from various platforms (Windows, Linux, Solaris, HPUX, AIX, Tru64, OpenVMS, OS/390, etc.) into a centralized Very Large Database with proven scalability.
Data Federation
This approach federates data in multiple data stores into a single virtual database. Data federation hides the physical location of data from applications and provides access to both structured and unstructured data. It provides fast integration and supports the integration of data that cannot be consolidated, such as data in legacy applications and data requiring local ownership. Data federation is often implemented via web services.
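The idea of a single virtual view over several stores can be shown with a small sketch. It assumes two independent SQLite databases that happen to share a hypothetical `customers` table; the class `FederatedCustomers` and the helper `make_source` are invented for this example. The point is that each query is sent to the sources at request time and nothing is copied into a central store.

```python
import sqlite3


def make_source(rows):
    """Create a stand-in for an independent data store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id TEXT, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


class FederatedCustomers:
    """A virtual view: callers query it without knowing where the data lives."""

    def __init__(self, sources):
        self.sources = sources  # mapping of source name -> open connection

    def find(self, name_prefix):
        """Run the same query against every source and merge the results."""
        results = []
        for source_name, conn in self.sources.items():
            cursor = conn.execute(
                "SELECT id, name FROM customers WHERE name LIKE ?",
                (name_prefix + "%",),
            )
            results.extend((source_name, cid, name) for cid, name in cursor)
        return sorted(results)


federation = FederatedCustomers({
    "legacy": make_source([("L1", "Ann"), ("L2", "Bob")]),
    "web": make_source([("W1", "Anna")]),
})
matches = federation.find("An")
```

In a real deployment each source would typically sit behind its own service or driver, but the calling application would still see only the `find` interface.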
Data Sharing
This approach was traditionally implemented as replication or message queuing. It has evolved to include warehouse loading, event notification, workflow, and enterprise application integration (EAI). The data sharing approach lets data be shared between users, applications, and databases by moving or copying it as needed.
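Replication through message queuing can be illustrated with a toy example. The sketch below assumes a hypothetical `Primary` store that publishes every change to its subscribers' queues, and an `apply_changes` helper that replays those messages on a replica; both names are invented here, and Python's in-process `queue.Queue` stands in for a real message broker.

```python
from queue import Queue


class Primary:
    """A key-value store that publishes each change to subscriber queues."""

    def __init__(self):
        self.data = {}
        self.subscribers = []

    def subscribe(self):
        """Register a new subscriber and return the queue it should read from."""
        changes = Queue()
        self.subscribers.append(changes)
        return changes

    def put(self, key, value):
        self.data[key] = value
        for changes in self.subscribers:
            changes.put(("put", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        for changes in self.subscribers:
            changes.put(("delete", key, None))


def apply_changes(replica, changes):
    """Drain the queue and replay each change, in order, on the replica dict."""
    while not changes.empty():
        op, key, value = changes.get()
        if op == "put":
            replica[key] = value
        else:
            replica.pop(key, None)
    return replica


primary = Primary()
change_queue = primary.subscribe()
primary.put("a", 1)
primary.put("b", 2)
primary.delete("a")
replica = apply_changes({}, change_queue)
```

Because the replica only sees changes made after it subscribed, a real system would also need an initial snapshot or a durable change log.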