Wednesday, August 16, 2006

 

ETL Tool

It is interesting to see a wide variety of DWH and DI tools (ETL, data quality, data profiling, BI) coming together to make the customer's life easier. But this adds to what a developer needs to know. What does this mean for a DWH/DI architect or developer?

I started my career as a DWH consultant in 1999 and have watched the industry change since then. Fortunately or unfortunately, I started with Informatica 3.1 and Business Objects. The tools were not my choice; they were what the client demanded. I was a mainframe (COBOL, CICS, DB2) developer, and there was a need for a developer who could support the DWH initiative by extracting data from the mainframe. So I began by creating flat files (PowerExchange/Striva was not available at that time) and loading those files into the DWH using Informatica. Since I had loaded the data, I was the person with the best knowledge of it, and I ended up developing the Informatica mappings and then the BO reports on top of them.

Since then Informatica has shipped several major releases (4.x, 5.x, 6.x, 7.x). Informatica has made life easier for developers while at the same time posing some challenges for customers, such as performance. This is a challenge for any ETL or BI tool. The other big challenge is the ease of use of these tools. They are so user friendly that a person can start developing without much in-depth knowledge of the tool, which can sometimes result in very clumsy code.

There is a strong correlation between DWH, BI, data quality, and data profiling tools, and it makes sense to bring them all together. But this also means that the architect must clearly understand how they relate to each other. Until a few years ago, ETL tools mostly did batch processing to load the data warehouse. Now the demand is to get data to the end user as soon as possible, which pushes DWH and DI toward real time (right time!!). This in turn requires BI reporting to present the data more efficiently.

While time is an important criterion, data quality is also one of the most critical components in the life cycle. No customer wants to present incorrect data to their users. This brings data quality tools in line with ETL and BI.

To achieve all this, it is important to have a short development cycle for any new development. Data profiling plays a critical role here: it helps the data analyst capture data anomalies and the hidden business rules in the data.
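To give a rough feel for what profiling means in practice, here is a minimal sketch in Python. It is not how any particular profiling tool works; the function name `profile_column` and the sample `status` values are made up for illustration. It simply counts rows, missing values, and distinct values for one column, which is often enough to spot an anomaly or hint at an undocumented rule.

```python
from collections import Counter


def profile_column(values):
    """Return simple profiling statistics for one column of raw values."""
    total = len(values)
    # Assumption: None and blank strings both count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": total,
        "missing": total - len(present),
        "distinct": len(counts),
        # The most frequent values often reveal the implicit code set.
        "most_common": counts.most_common(3),
    }


# Hypothetical sample: a status column extracted from a source system.
status = ["A", "A", "I", "", "A", None, "X"]
report = profile_column(status)
print(report)
```

In this made-up sample, the lone "X" and the two missing values are the kind of thing an analyst would want to question before the data is loaded into the warehouse.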
