Monday, March 31, 2008
The recycle process is one of the buzzwords in any data warehouse or data integration project. It is very important to understand the recycle requirement before thinking about a solution. A poorly designed recycle solution can become a maintenance nightmare.
Recycle: This is the approach defined in an ETL process to handle and re-process rows that were rejected because of technical dependencies. A very simple example: the reference data load is delayed while the regular payload runs on schedule. This scenario can create many failed rows during reference code validation. The failure is not caused by any business error but purely by a technical dependency, so no business involvement is needed to resolve it. Such failed records are candidates for automatic recycling. The technical design should reprocess such failed rows at regular intervals and clean up the reject table.
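To make the distinction concrete, here is a minimal sketch of a validation step that tags each rejected row as recyclable or not. The field names (`country_code`, `amount`), the reason codes, and the `Reject` class are all hypothetical, chosen only for illustration; a real ETL tool would keep this in a reject table rather than in Python lists.

```python
from dataclasses import dataclass


@dataclass
class Reject:
    row: dict
    reason: str
    recyclable: bool  # True only for technical-dependency failures


def validate(rows, reference_codes):
    """Split rows into loaded rows and rejects, tagging each reject."""
    loaded, rejects = [], []
    for row in rows:
        if row["amount"] < 0:
            # Business error: someone has to look at it, so it is never recycled.
            rejects.append(Reject(row, "NEGATIVE_AMOUNT", recyclable=False))
        elif row["country_code"] not in reference_codes:
            # Technical dependency: the reference load may simply be late.
            rejects.append(Reject(row, "MISSING_REFERENCE", recyclable=True))
        else:
            loaded.append(row)
    return loaded, rejects


rows = [
    {"id": 1, "country_code": "US", "amount": 10},
    {"id": 2, "country_code": "IN", "amount": 20},
    {"id": 3, "country_code": "US", "amount": -5},
]

# The reference load is delayed: "IN" has not arrived yet.
loaded, rejects = validate(rows, reference_codes={"US"})
recycle_candidates = [r for r in rejects if r.recyclable]
```

Here only row 2 ends up as a recycle candidate. Row 3 is also rejected, but it stays out of the automatic recycle because waiting for reference data would never fix it.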
Recycle Process Approach: The idea is to design a process that picks up the failed records and brings them back into the mainstream process without any manual intervention. When failed rows are brought back into mainstream processing, care must be taken that they do not create duplicates there. Where a duplicate does arise, it needs to be detected and handled appropriately.
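A recycle pass along these lines might look like the sketch below. It assumes each row carries a business key (`id` here) and that the target can be looked up by that key; the names `recycle`, `target`, and `reject_table` are hypothetical. The duplicate rule used is "the row already in the mainstream wins", which is only one possible choice.

```python
def recycle(reject_table, target, reference_codes):
    """Re-validate rejected rows and load the ones that now pass.

    Returns (still_rejected, duplicates) so the caller can rewrite
    the reject table and report what was skipped.
    """
    still_rejected, duplicates = [], []
    for row in reject_table:
        if row["country_code"] not in reference_codes:
            # Reference data is still missing: keep the row for the next pass.
            still_rejected.append(row)
        elif row["id"] in target:
            # The mainstream already holds this key (for example, the source
            # re-sent the row), so loading it again would create a duplicate.
            duplicates.append(row)
        else:
            target[row["id"]] = row
    return still_rejected, duplicates


# Target keyed by business key; row 2 arrived through the mainstream meanwhile.
target = {
    1: {"id": 1, "country_code": "US", "amount": 10},
    2: {"id": 2, "country_code": "IN", "amount": 25},
}
reject_table = [
    {"id": 2, "country_code": "IN", "amount": 20},
    {"id": 3, "country_code": "IN", "amount": 30},
    {"id": 4, "country_code": "BR", "amount": 7},
]

# The delayed reference load has now delivered "IN", but not "BR".
reject_table, duplicates = recycle(reject_table, target, {"US", "IN"})
```

After this pass, row 3 is in the target, row 2 is reported as a duplicate and dropped from the reject table, and row 4 stays rejected for the next scheduled run. If the requirement is that the recycled row should overwrite the existing one, the duplicate branch would become an update instead of a skip.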
Labels: Recycle Process - Challenge

