65.9K
CodeProject is changing. Read more.
Home

How to fetch data from the previous or next rows in the resultset

starIconstarIconstarIconstarIconstarIcon

5.00/5 (4 votes)

Jul 27, 2011

CPOL

1 min read

viewsIcon

29654

downloadIcon

70

This tip demonstrates the use of LAG and LEAD functions in SQL Server.

One common problem is the need to have data on a result set row from another row in the same result set. For example, in order to calculate an interval between dates on two separate rows, typically a self join is needed. This requires to add the same table (or set) twice into the query. With a simple query, this won't be any problem but if the result set is created using a more complex query, creating the same complexity twice can cause problems and of course create performance problems.

In SQL Server 2012, two functions with windowing support are introduced to address this problem: LAG (to look backwards) and LEAD (to look forward). Both can be instructed to look over several records, not just the previous or next.

First we need a simple table with some test data:

CREATE TABLE LagTest (
   Event   VARCHAR(100)  NOT NULL,
   AddTime DATETIME      NOT NULL
);

And some data with random dates in the future:

INSERT INTO LagTest VALUES ('E#1', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#2', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#3', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#4', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#5', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#6', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#7', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#8', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#9', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);
INSERT INTO LagTest VALUES ('E#10', CURRENT_TIMESTAMP + ABS( RAND() ) * 10);

Now, in order to fetch the following information:

  • What row I'm on?
  • What was the previous event?
  • What's the datetime on previous row?
  • What's the interval in hours between the previous and this event?
  • What event pair is going to be next?

The query could be like:

SELECT sub.Event AS ActualRow,
       CASE
          WHEN ROW_NUMBER() OVER (ORDER BY sub.AddTime) = 1 THEN 'First event'
          ELSE sub.PrevEvent + ' -> ' + sub.Event
       END AS LookBack,
       sub.PrevTime,
       DATEDIFF( hour, sub.PrevTime, sub.AddTime)  AS IntervalInHours,
       CASE
          WHEN sub.NextEvent IS NOT NULL
             THEN 'Next: ' + sub.Event + ' -> ' + sub.NextEvent
          ELSE 'Last event'
       END AS LookForward
FROM   (   SELECT lt.Event,
                  lt.AddTime,
                  LAG(lt.Event, 1)   OVER (ORDER BY lt.AddTime) AS PrevEvent,
                  LAG(lt.AddTime, 1) OVER (ORDER BY lt.AddTime) AS PrevTime,
                  LEAD(lt.Event, 1)  OVER (ORDER BY lt.AddTime) AS NextEvent
           FROM   LagTest lt) AS sub

First, an inline view (named sub) is used to gather the data. Since LAG and LEAD are window functions, the ordering must be specified. In this example the AddTime column defines the order of the rows.

An example output from the inline view could look like the following:

Event AddTime                 PrevEvent PrevTime                NextEvent
----- -------                 --------- --------                ---------
E#5   2011-07-29 05:17:44.680 NULL      NULL                    E#1
E#1   2011-07-29 18:30:02.053 E#5       2011-07-29 05:17:44.680 E#4
E#4   2011-08-01 01:39:16.430 E#1       2011-07-29 18:30:02.053 E#10
E#10  2011-08-01 14:32:50.040 E#4       2011-08-01 01:39:16.430 E#2
E#2   2011-08-02 19:30:16.483 E#10      2011-08-01 14:32:50.040 E#8
E#8   2011-08-02 20:20:19.920 E#2       2011-08-02 19:30:16.483 E#9
E#9   2011-08-03 01:23:00.940 E#8       2011-08-02 20:20:19.920 E#6
E#6   2011-08-03 02:09:39.343 E#9       2011-08-03 01:23:00.940 E#7
E#7   2011-08-04 05:30:01.203 E#6       2011-08-03 02:09:39.343 E#3
E#3   2011-08-05 11:16:57.253 E#7       2011-08-04 05:30:01.203 NULL

The main query is basically just formatting, concatenation of event column data, interval calculation etc. The actual output would be like:

ActualRow LookBack    PrevTime                IntervalInHours LookForward
--------- --------    --------                --------------- -----------
E#5       First event NULL                    NULL            Next: E#5 -> E#1
E#1       E#5 -> E#1  2011-07-29 05:17:44.680 13              Next: E#1 -> E#4
E#4       E#1 -> E#4  2011-07-29 18:30:02.053 55              Next: E#4 -> E#10
E#10      E#4 -> E#10 2011-08-01 01:39:16.430 13              Next: E#10 -> E#2
E#2       E#10 -> E#2 2011-08-01 14:32:50.040 29              Next: E#2 -> E#8
E#8       E#2 -> E#8  2011-08-02 19:30:16.483 1               Next: E#8 -> E#9
E#9       E#8 -> E#9  2011-08-02 20:20:19.920 5               Next: E#9 -> E#6
E#6       E#9 -> E#6  2011-08-03 01:23:00.940 1               Next: E#6 -> E#7
E#7       E#6 -> E#7  2011-08-03 02:09:39.343 27              Next: E#7 -> E#3
E#3       E#7 -> E#3  2011-08-04 05:30:01.203 30              Last event

Having these two functions makes it easier to get data from next or previous rows.