Sometimes, we have the requirement to
pull the most recent updated record from our employee table or any customer
tables. This is necessary because we try to take care of every single action
which is performed through the business application. In this case, we have
multiple records for any customer or employee but we need to show only most
updated record of the employee/customer.
So, How can we do this in SQL? To
understand the scenario, we are using customer table where created date is used
to capture most updated records. In this case, we cannot use distinct keyword
to pull the data because our data would be like as given below:
Use tempdb
Go
------ Design Customer table with the help of CTE
------ Where created date will used to pull most recent
records
;with
cte as
(
select FirstName='abc', LastName='yxz',
emailId='abc@xyz.com', CreatedDate=convert(varchar(10),getdate()-5 ,101)
union
select FirstName='abc', LastName='yxz',
emailId='abc@xyz.com', CreatedDate=convert(varchar(10),getdate()-2,101)
union
select FirstName='cde', LastName='mno',
emailId='cde@xyz.com', CreatedDate=convert(varchar(10),getdate()-5,101)
)
---- Pull all records
select FirstName, LastName,
emailId, CreatedDate
from cte
---- Output for all records
|
In the above table, you can see that
there are duplicate records for abc customer due to having two created dates.
Now, we want to pull the most recent record for abc customer. To do this, we
will use the row_number() function with partition option of SQL to get the row id
against each customer as given below:
Use tempdb
Go
------ Design Customer table with the help of CTE
------ Where created date will used to pull most recent
records
;with
cte as
(
select FirstName='abc', LastName='yxz',
emailId='abc@xyz.com', CreatedDate=convert(varchar(10),getdate()-5 ,101)
union
select FirstName='abc', LastName='yxz',
emailId='abc@xyz.com', CreatedDate=convert(varchar(10),getdate()-2,101)
union
select FirstName='cde', LastName='mno',
emailId='cde@xyz.com', CreatedDate=convert(varchar(10),getdate()-5,101)
)
------ Pull all records and assign row id for each
customer
select FirstName, LastName, emailId, CreatedDate,
row_id from
(
select FirstName, LastName,
emailId, CreatedDate,
----- use order by CreatedDate desc
row_id=row_number() over (partition by emailId order by CreatedDate desc)
from cte
)tbl
---- Output for all records
|
In the above query, you can see that we
have use cte table as derived data table and add new column as row_id to store
data in created date as descending order because in this case we are forcing to
generate the row id for most recent created date for every customer and
partition by is used to start from 1 for every new customer.
Now,
we need to pull the records for each customer where row_id=1 as given
below:
Use tempdb
Go
------ Design Customer table with the help of CTE
------ Where created date will used to pull most recent
records
;with
cte as
(
select FirstName='abc', LastName='yxz',
emailId='abc@xyz.com', CreatedDate=convert(varchar(10),getdate()-5 ,101)
union
select FirstName='abc', LastName='yxz',
emailId='abc@xyz.com', CreatedDate=convert(varchar(10),getdate()-2,101)
union
select FirstName='cde', LastName='mno',
emailId='cde@xyz.com', CreatedDate=convert(varchar(10),getdate()-5,101)
)
------ Pull all records and assign row id for each
customer
select FirstName, LastName,
emailId, CreatedDate,
from
(
select FirstName, LastName,
emailId, CreatedDate,
----- use order by CreatedDate desc
row_id=row_number() over (partition by emailId order by CreatedDate desc)
from cte
)tbl
---- filter unique
data condition
---- and row_id=1 is the first record
where row_id=1
---- Output for all records
|
Conclusion
By using Row_Number()
fuction with partition, we can assign them some number based on our requirement
and filter all the unwanted data from the final output and store them in the
persistent data table.
No comments:
Post a Comment