Thursday, July 4, 2019

Talend ETL - How to Aggregate Values and Sort Data

The tAggregateRow component receives a data flow and aggregates it based on one or more columns. Each output row contains the aggregation key and the results of the selected set operations (min, max, sum, etc.).
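To make the idea of an aggregation key and set operations concrete, here is a minimal plain-Java sketch of the same grouping logic. It is not the code Talend generates for tAggregateRow; the class AggregateSketch, the Row type, and the sample values are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggregateSketch {
    // Hypothetical input row: one key column and one numeric column.
    static final class Row {
        private final String key;
        private final int value;

        Row(String key, int value) {
            this.key = key;
            this.value = value;
        }

        String getKey() { return key; }
        int getValue() { return value; }
    }

    // Groups rows by the key column and computes count, min, max and sum per group.
    // A TreeMap is used only so that the groups print in a stable key order.
    static Map<String, IntSummaryStatistics> aggregate(List<Row> rows) {
        return rows.stream().collect(
            Collectors.groupingBy(Row::getKey, TreeMap::new,
                                  Collectors.summarizingInt(Row::getValue)));
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<Row> rows = Arrays.asList(new Row("a", 3), new Row("b", 5), new Row("a", 7));
        aggregate(rows).forEach((key, stats) ->
            System.out.println(key + " min=" + stats.getMin()
                + " max=" + stats.getMax() + " sum=" + stats.getSum()));
    }
}
```

Each map entry plays the role of one output row: the map key is the aggregation key, and the statistics object holds the results of the set operations for that group.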

If you use the Talend ETL tool to process your data and want to aggregate and sort the rows of an incoming data file, you can use the tAggregateRow component. It belongs to the Processing category of components, which perform all types of processing tasks on data flows, including aggregations.

Because this component processes a flow of data, it requires both an input and an output, so it is defined as an intermediary step. tAggregateRow is usually combined with the tSortRow component to sort the output data.


Aggregating values and sorting data - This example shows how to use Talend processing components to aggregate the users' comprehensive scores and then sort the aggregated scores by user name.

Steps to follow -
1. Create a Job for aggregating and sorting data
2. Create the raw file connection and its metadata
3. Configure tFileInputDelimited to use the delimited file metadata
4. Configure the tAggregateRow, tSortRow, and tLogRow components
5. Aggregate the rows with tAggregateRow, grouping by user name and selecting the score column as the value to aggregate, and make sure to ignore null values
6. Execute the Job to aggregate and sort the data
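For readers who think in code, the Job built in the steps above corresponds roughly to the plain-Java sketch below. It is not what Talend generates. The class ScoreJobSketch, the `name;score` line layout, the semicolon delimiter, the sample data, and the choice of sum as the aggregation function are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ScoreJobSketch {
    // Hypothetical input row: a user name plus a score that may be missing.
    static final class ScoreRow {
        final String name;
        final Integer score; // null when the score field is empty

        ScoreRow(String name, Integer score) {
            this.name = name;
            this.score = score;
        }
    }

    // Reading step (tFileInputDelimited analogue): parse "name;score" lines.
    static List<ScoreRow> parse(List<String> lines) {
        List<ScoreRow> rows = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(";", -1);
            boolean hasScore = fields.length > 1 && !fields[1].trim().isEmpty();
            Integer score = hasScore ? Integer.valueOf(fields[1].trim()) : null;
            rows.add(new ScoreRow(fields[0].trim(), score));
        }
        return rows;
    }

    // Aggregation step (tAggregateRow analogue): sum the scores per user name.
    // Null scores contribute nothing; in this sketch a user whose scores are
    // all null still gets a row, with a total of 0.
    static Map<String, Integer> aggregate(List<ScoreRow> rows) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (ScoreRow row : rows) {
            totals.merge(row.name, row.score == null ? 0 : row.score, Integer::sum);
        }
        return totals;
    }

    // Sorting step (tSortRow analogue): order the aggregated rows by user name.
    static List<Map.Entry<String, Integer>> sortByName(Map<String, Integer> totals) {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(totals.entrySet());
        sorted.sort(Map.Entry.comparingByKey());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample lines; "bob;" has an empty score and is treated as null.
        List<String> lines = Arrays.asList("bob;10", "alice;5", "bob;", "alice;7");
        // Output step (tLogRow analogue): print each aggregated, sorted row.
        for (Map.Entry<String, Integer> entry : sortByName(aggregate(parse(lines)))) {
            System.out.println(entry.getKey() + "|" + entry.getValue());
        }
    }
}
```

The three static methods mirror the component chain: one reads the rows, one groups and aggregates them, and one sorts the result, so each can be checked on its own, much like inspecting the flow between components in a Job.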
