Friday, September 26, 2008
These are few question which any Data warehouse consultant should be very clear in his mind.
ODS - also known as Operational Data Store. ODS is the first persistent data layer in the data warehouse layered architecture. In practice, ODS data model is kept very close to source data model. Then what is the value of ODS layer ? Data is extracted from source using very simple queries, so that there is minimum impact on the source database. Following are typical processing/task are performed on ODS
- Data Standardization
- Naming standardization
- Data Profiling
When data is extracted from the source, data standardization is performed to convert the data value to enterprise standard value ( It is recommended to store the original and standardize value in the ODS for any future reference). Different table column names are also standardized to enterprise standard name. This enable business user to get first preview of what is coming next in the data warehouse. This data store is also open for any data profiling activity for next data store, Integration.
Staging Area : Staging is transient data store which are used to load the data before loading into persistent data store. In staging area, data is prepared and made ready to load into persistent area. Table structure in Staging Area is kept same as persistent data store table structure. This help to move data from staging area to persistent data store effectively without keeping persistent data store locked for long time. Staging area is ETL/ELT requirement to enable effective controlled data load. Staging area is truncated after every successful data load in persistent data store. There are different approach used to load data from staging area to persistent data store
- ELT - use SQL to move data
- Partition pining - Move partition from Staging to Persistent Data Store.
- ELT - Standard extract and load.
For further discussion please write at contact_ak@yahoo.com ( Arvind Kumar)
Labels: ODS and Staging Area
