Tuesday, December 17, 2019

Talend ETL - Lookup data for Insert, Update and delete

Here you will learn "How to use Lookup data for Insert, Update and Delete by using tMap?" in Talend Open Studio.

We have an ideal business scenario where business wants to ensure that lookup in output file, check if there is any change in the new input file and if the record doesn't exists in the output file, business needs to insert these records into the output file, if the row already exists then they needs to update the data in the output file.

We have fields such as unique key such as SalespersonId and Salesyear in the both files. So, we will just use tMap with inner join with new file as main row and previous one as lookup.

Our source and target are raw data files and using tFileInputDelimited, tFileOutputDelimited, tLogRow, tMap components in the example.

If we need to detect deleted records, you need another subjob where main row becomes the lookup, then rejected records are deleted ones. All the records of the lookup flow need to be loaded before processing each record of the main flow. 

Three types of lookup loading models are provided suiting various types of business requirement and the performance needs: Load once, Reload at each row, and Reload at each row (cache).

Used components to accomplished this jobs are -
tFileInputDelimated: We can use this component to read a file and separate fields contained in this file using a defined separator. It allows you to create a data flow.

tMap: tMap is an advanced component to transforms and routes data from single or multiple sources to single or multiple destinations, which integrates itself as plugin to Talend Studio. Possible uses are from a simple reorganization of fields to the most complex Jobs of data multiplexing or demultiplexing transformation, concatenation, inversion, filtering and more.

tLogRow: This component is used to monitor data processed and displays data or results in the Run console. This component can be used as intermediate step in a data flow or as a n end object in the Job flowchart.

tFileOutputDelimated: tFileOutputDelimited outputs data to a delimited file and This component writes a delimited file that holds data organized according to the defined schema. Use this component to write a delimited file and separate fields using a field separator value.

To watch a live demo, please check the below YouTube video -

To Learn more, please visit our YouTube channel at - 

To Learn more, please visit our Instagram account at -

To Learn more, please visit our twitter account at -

No comments:

Post a Comment