Click here to Skip to main content
15,886,664 members
Articles / Database Development

Workaround for AT TIME ZONE Performance Issues in SQL Server Views & Dynamic SQL

Rate me:
Please Sign up or sign in to vote.
5.00/5 (1 vote)
6 Mar 2023CPOL5 min read 5.5K   5  
Workaround for performance issues with AT TIME ZONE
This article describes a workaround for AT TIME ZONE performance issues in SQL Server.

Introduction

In SQL Server 2016, a new AT TIME ZONE operator was introduced. It allows to convert datetime to the corresponding value in the target time zone. Queries with AT TIME ZONE run smoothly on small tables. But once you apply AT TIME ZONE on table column (not a scalar expression) on a huge table with millions of records, you'll get performance issues like those described in Jonathan Kehayias's article.

In my case, the problem was with long-running SQL Server views, not a dynamic SQL.

Custom views filtered millions of records by one or more datetime criteria - datetime column value must be within a floating time range. For example, Before_Now, After_Now, Today, Tomorrow, Yesterday, CurrentWeek, CurrentMonth, CurrentYear, Next_X_Hours, Previous_X_Hours, etc.

Those custom views could be divided into two types:

  1. time zone is specified as a scalar expression. For example, AT TIME ZONE 'Pacific Standard Time';
  2. time zone is stored in a separate table. For example, AT TIME ZONE (SELECT store_tz FROM [Stores] WHERE store_id = 123).

A typical CTE query used for pagination for the second type took 10x times more than for the first type. Some queries took even a few minutes.

I managed to decrease second type query time from minutes to hundreds of milliseconds by implementing a workaround described below.

Workaround Tips

First, let's define a Time Zone Period as period of time in a particular time zone during which offset to UTC is unchanged:

  • Time zone that observes Daylight Saving Time usually will have three periods per year (non-DST⇒DST⇒non-DST or DST⇒non-DST⇒DST).
  • It's better to merge neighboring periods with same offset into one long period.
  • Time zone that does not observe Daylight Saving Time will have one period that covers a few or more years.

We need time zone period information for correct filtering across date ranges especially when Daylight Saving Time starts or ends.

Tips/Tricks

  1. Cache time zone period information in a local table. Populate information not only for one current period but for the whole current year, previous and next years. The number of years depends on:
    1. how often do you want to update the cache table (each year/month/release, etc.)
    2. what datetime criteria do you use? For example, criteria 'datetime column value must be within next X years' requires to have cached time zone periods for current and X following years
  2. The cache table should have at least the following columns:
    • Time zone name (timezone_id)
    • Period Start Date in UTC (period_start_date_utc)
    • Period End Date in UTC (period_end_date_utc)
    • UTC Offset of specified period (offset_in_minutes). For example, -420 minutes for 'Pacific Standard Time' time zone during Daylight Saving Time period
    • UTC Offset difference between specified period and preceding period (delta_previous_offset_in_minutes). For example, -60/0/+60 minutes
    • UTC Offset difference between following period and specified period (delta_next_offset_in_minutes). For example, +60/0/-60 minutes
    • ...

    Two or more preceding/following periods may be stored in JSON datatype column.

  3. Additionally, store CHECKSUM(timezone_id) in separate [timezone_id_checksum] [int] NOT NULL column.
    Create CLUSTERED INDEX on this timezone_id_checksum column.
    This gives the best performance when filtering records by timezone -
    [timezone_id_checksum] = CHECKSUM(store_tz) is faster than [timezone_id] = [store_tz] because integer datatype has low space requirement (Integer vs Varchar(50))
  4. In WHERE statement combine criteria for neighboring periods of time. For example, criteria "processing_date must be within Today or Tomorrow".

    SQL
    WHERE
         (Today 12:00AM <= processing_date) AND 
         (processing_date <= Today 23:59:59PM)
         OR
         (Tomorrow 12:00AM <= processing_date) AND 
         (processing_date <= Tomorrow 23:59:59PM)

    may be rewritten as:

    SQL
    WHERE (Today 12:00AM <= processing_date) AND 
          (processing_date <= Tomorrow 23:59:59PM)
  5. If _timezone_period_cache table has hundreds of records, it's better to place the time zone information subquery before table joins.
    SQL
    SELECT O.*  
    FROM 
        (  --sub query
            ...
        ) STORE_TZ_INFO
    INNER JOIN [Orders] O (nolock) ON O.store_fk = STORE_TZ_INFO.store_pk

Test Tables

I had the following tables in my test database:

  • _timezone_period_cache with records for one 'Pacific Standard Time' time zone
  • Orders table with 2 mln records
  • Stores table with one record for 'Pacific Standard Time' time zone
  • Other tables: Clients, Products and OrderItems, but they were not used in my SQL Server Views

_timezone_period_cache table

SQL
CREATE TABLE [dbo].[_timezone_period_cache](
    [timezone_id] [varchar](50) NOT NULL,
    [offset_in_minutes] [int] NOT NULL,
    [timezone_id_checksum] [int] NOT NULL,
    [period_start_date_utc] [datetime2](7) NOT NULL,
    [period_end_date_utc] [datetime2](7) NOT NULL,
    [delta_previous_offset_in_minutes] [int] NOT NULL,
    [delta_next_offset_in_minutes] [int] NOT NULL
) ON [PRIMARY]
GO
--  create CLUSTERED INDEX
CREATE CLUSTERED INDEX idx_timezone_id_checksum
ON [_timezone_period_cache] (timezone_id_checksum);

timezone_period_cache table

Populate _timezone_period_cache table

When evaluating Time Zone Periods, it's important not to take the exact datetimeoffset when clocks turn backward (usually Daylight Saving Time ends), but a microsecond before and add later that microsecond to datetimeoffset in UTC. On the screenshot below, SQL Server returned two datetimeoffsets and their difference - it is 1 hour and 1 microsecond, not just 1 microsecond.

Image 2

For my test, I chose Pacific Standard Time time zone. Since I wanted to run tests for dates when Daylight Saving Time ends & starts, I populated records for 2022/2023 years in a script.

SQL
--2022

--2022-03-13 / 2022-11-06 DST UTC-7
INSERT INTO [dbo].[_timezone_period_cache]
           ([timezone_id]
           ,[offset_in_minutes]
           ,[timezone_id_checksum]
           ,[period_start_date_utc]
           ,[period_end_date_utc]
           ,[delta_previous_offset_in_minutes]
           ,[delta_next_offset_in_minutes]
           )
     VALUES
           ('Pacific Standard Time'
           , -7*60
           , CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
           ,
                DATEADD(
                    HOUR,
                    3,
                    CONVERT(datetime2, DATEFROMPARTS(2022, 03, 13))
                ) AT TIME ZONE 'Pacific Standard Time'  AT TIME ZONE 'UTC'
           ,
            DATEADD(
                MICROSECOND,
                1,
                DATEADD(
                    MICROSECOND,
                    -1,
                    DATEADD(HOUR,
                        2,
                        CONVERT(datetime2, DATEFROMPARTS(2022, 11, 6))
                    )
                ) AT TIME ZONE 'Pacific Standard Time'  AT TIME ZONE 'UTC'
            )
            , 60
            , -60
        )
GO
--2022-11-06 / 2023-03-12 No DST UTC-8
INSERT INTO [dbo].[_timezone_period_cache]
           ([timezone_id]
           ,[offset_in_minutes]
           ,[timezone_id_checksum]
           ,[period_start_date_utc]
           ,[period_end_date_utc]
           ,[delta_previous_offset_in_minutes]
           ,[delta_next_offset_in_minutes]
           )
     VALUES
           ('Pacific Standard Time'
           , -8*60
           , CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
           , DATEADD(
                MICROSECOND,
                1,
                DATEADD(
                    MICROSECOND,
                    -1,
                    DATEADD(HOUR,
                        2,
                        CONVERT(datetime2, DATEFROMPARTS(2022, 11, 6))
                    )
                ) AT TIME ZONE 'Pacific Standard Time'   AT TIME ZONE 'UTC'
            )
            , DATEADD(
                    HOUR,
                    3,
                    CONVERT(datetime2, DATEFROMPARTS(2023, 03, 12))
                ) AT TIME ZONE 'Pacific Standard Time'   AT TIME ZONE 'UTC'
            , -60
            , +60
        )
GO

--2023 year
--2023-03-12 / 2023-11-05 DST UTC-7
INSERT INTO [dbo].[_timezone_period_cache]
           ([timezone_id]
           ,[offset_in_minutes]
           ,[timezone_id_checksum]
           ,[period_start_date_utc]
           ,[period_end_date_utc]
           ,[delta_previous_offset_in_minutes]
           ,[delta_next_offset_in_minutes]
           )
     VALUES
           ('Pacific Standard Time'
           , -7*60
           , CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
           ,
                DATEADD(
                    HOUR,
                    3,
                    CONVERT(datetime2, DATEFROMPARTS(2023, 03, 12))
                ) AT TIME ZONE 'Pacific Standard Time'  AT TIME ZONE 'UTC'
           ,
            DATEADD(
                MICROSECOND,
                1,
                DATEADD(
                    MICROSECOND,
                    -1,
                    DATEADD(HOUR,
                        2,
                        CONVERT(datetime2, DATEFROMPARTS(2023, 11, 5))
                    )
                ) AT TIME ZONE 'Pacific Standard Time'  AT TIME ZONE 'UTC'
            )
            , +60
            , -60
        )
GO
--2023-11-05 / 2024-03-10 No DST UTC-8
INSERT INTO [dbo].[_timezone_period_cache]
           ([timezone_id]
           ,[offset_in_minutes]
           ,[timezone_id_checksum]
           ,[period_start_date_utc]
           ,[period_end_date_utc]
           ,[delta_previous_offset_in_minutes]
           ,[delta_next_offset_in_minutes]
           )
     VALUES
           ('Pacific Standard Time'
           , -8*60
           , CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
           , DATEADD(
                MICROSECOND,
                1,
                DATEADD(
                    MICROSECOND,
                    -1,
                    DATEADD(HOUR,
                        2,
                        CONVERT(datetime2, DATEFROMPARTS(2023, 11, 5))
                    )
                ) AT TIME ZONE 'Pacific Standard Time'   AT TIME ZONE 'UTC'
            )
            , DATEADD(
                    HOUR,
                    3,
                    CONVERT(datetime2, DATEFROMPARTS(2024, 03, 10))
                ) AT TIME ZONE 'Pacific Standard Time'   AT TIME ZONE 'UTC'
            , -60
            , +60
        )
GO

Orders table

SQL
CREATE TABLE [dbo].[Orders](
    [order_pk] [int] IDENTITY(1,1) NOT NULL,
    [client_fk] [int] NOT NULL,
    [store_fk] [int] NOT NULL,
    [created_date_store_local] [datetime2](7) NOT NULL,
    [shipping_date_store_local] [datetime2](7) NULL,
    [delivered_date_client_local] [datetime2](7) NULL,
 CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED
(
    [order_pk] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
        IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
        ALLOW_PAGE_LOCKS = ON, _
        OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

Orders table

Stores table

SQL
CREATE TABLE [dbo].[Stores](
             [store_pk] [int] IDENTITY(1,1) NOT NULL,
             [store_tz] [varchar](50) NOT NULL,
             [store_name] [nvarchar](100) NOT NULL,
         CONSTRAINT [PK_Stores] PRIMARY KEY CLUSTERED 
        (
            [store_pk] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, 
                OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
        ) ON [PRIMARY]

Stores table

Original View

For store located in San-Francisco ('Pacific Standard Time' time zone) v_Original view filters orders by DateTime criteria - column value should be in a time range from 24h before now to 24 hours after now (to get different values of GETUTCDATE(), I changed local SQL SERVER system time).

SQL
CREATE VIEW v_Original AS
          SELECT O.*
          FROM [Orders] O  
          INNER JOIN Stores S ON S.store_pk = O.store_fk 
          WHERE 
                  O.store_fk = 1
              AND
                  [created_date_store_local] 
                   AT TIME ZONE S.store_tz < DATEADD(day, +1, GETUTCDATE())
              AND 
                  [created_date_store_local] 
                   AT TIME ZONE S.store_tz > DATEADD(day, -1, GETUTCDATE())

Get Total Record Number Query

SQL
SELECT COUNT(*) FROM v_Original

CTE with Pagination

SQL
DECLARE @PageNumber int = 10
DECLARE @PageSize int = 100

;WITH v_Original AS
(
    SELECT *,
        ROW_NUMBER() OVER (ORDER BY order_pk) AS RowNumber
    FROM v_Original
)

SELECT * FROM v_Original
WHERE RowNumber BETWEEN ((@PageNumber - 1) * @PageSize) + 1
                AND (@PageNumber * @PageSize);

Optimized View

SQL
          CREATE VIEW v_Optimized AS
          SELECT O.*  
          FROM 
               (
                  SELECT tz_name
                      , utc_plus_tzOffset			
-- Evaluate lower time bound for [now - 24h < column] criterion
                      , DATEADD(minute
                          , IIF(
                              utc_plus_tzOffset_minus_24h < 
                              period_start_date_utc --if Date from past 
                                                    --is in preceding period
                              , -1 * delta_previous_offset_in_minutes
                              , 0
                            )
                          , utc_plus_tzOffset_minus_24h
                        ) AS utc_plus_tzOffset_minus_24h_adjusted
-- Evaluate upper time bound for [column < now + 24h] criterion
                      , DATEADD(minute
                          , IIF(
                              utc_plus_tzOffset_plus_24h > 
                              period_end_date_utc --if Date from future 
                                                  --is in next period
                              , delta_next_offset_in_minutes
                              , 0
                          )
                          , utc_plus_tzOffset_plus_24h
                          ) AS utc_plus_tzOffset_plus_24h_adjusted
                      , store_pk
                  FROM 
                  (
                      SELECT
                          tz_name
                          , utc_plus_tzOffset				
                          , store_pk
                          , DATEADD(day, -1, utc_plus_tzOffset) 
                                    AS utc_plus_tzOffset_minus_24h
                          , DATEADD(day, +1, utc_plus_tzOffset) 
                                    AS utc_plus_tzOffset_plus_24h
                          , period_start_date_utc, delta_previous_offset_in_minutes 
                          , period_end_date_utc, delta_next_offset_in_minutes
                      FROM 
                      (
                          SELECT timezone_id AS tz_name
            --   column AT TIME ZONE 'any timezone' < UtcNow ⇒
            --   column' < UtcNow ⇒
--  Since  column - column' = TimeZoneOffset 
            --   column' + (column - column') < UtcNow + TimeZoneOffset ⇒
            --   column < UtcNow + TimeZoneOffset
                              , DATEADD(MINUTE,  [offset_in_minutes], 
                                GETUTCDATE()) AS utc_plus_tzOffset		
                              , store_pk
                              , period_start_date_utc, delta_previous_offset_in_minutes
                              , period_end_date_utc, delta_next_offset_in_minutes
                          FROM 
                          (
                              SELECT timezone_id, [offset_in_minutes], store_pk
                                  , TZP.period_start_date_utc, _
                                    TZP.delta_previous_offset_in_minutes
                                  , TZP.period_end_date_utc, _
                                    TZP.delta_next_offset_in_minutes
                              FROM [_timezone_period_cache] TZP (nolock)
                              INNER JOIN Stores (nolock) ON (1=1)
                              WHERE                                 
--  If 'store_tz' time zone column type does not match  'timezone_id' 
--  column type - use CONVERT function
     -> CHECKSUM(CONVERT(varchar(50), store_tz_unicode))
                                  TZP.[timezone_id_checksum] = CHECKSUM(store_tz) 
                                  --  Find current TimeZonePeriod 
                                  --  for 'store_tz' time zone  
                                  AND [period_start_date_utc] < GETUTCDATE()
                                  AND [period_end_date_utc] > GETUTCDATE()				
                          ) TZI	
                      ) TZI2
                  ) TZI3
                ) STORE_TZ_INFO
            INNER JOIN [Orders] O (nolock) ON O.store_fk = STORE_TZ_INFO.store_pk 
            WHERE
                  store_fk = 1 
              AND
                  [created_date_store_local] >= utc_plus_tzOffset_minus_24h_adjusted  
              AND 
                  [created_date_store_local] <= utc_plus_tzOffset_plus_24h_adjusted

Get Total Record Number Query

SQL
SELECT COUNT(*) FROM v_Optimized

CTE with Pagination

SQL
DECLARE @PageNumber int = 10
DECLARE @PageSize int = 100

;WITH cte_optimized AS
(
    SELECT *,
        ROW_NUMBER() OVER (ORDER BY order_pk) AS RowNumber
    FROM v_Optimized
)

SELECT * FROM cte_optimized
WHERE RowNumber BETWEEN ((@PageNumber - 1) * @PageSize) + 1
                AND (@PageNumber * @PageSize);

Test Results

For the test, I used a desktop PC with 4 core CPU, 16 GB of RAM, and HDD. I changed the system time to November 5 and 6, 2022 respectively (days before and after DST change in 'Pacific Standard Time' time zone). As expected, all queries returned the same resultsets - it confirms the workaround is a completely working solution across date ranges.

First, I run queries when Orders table had primary key (PK) only. Later, I added a nonclustered index:

SQL
CREATE NONCLUSTERED INDEX [idx_Orders_created_date]
              ON [dbo].[Orders] ([store_fk],[created_date_store_local])

Here are the results:

SQL Server View \ query time count(*) CTE with pagination
PK only PK + nonclustered index PK only PK + nonclustered index
Original (uses AT TIME ZONE operator) 9082 ms 8358 ms 318 ms 324 ms
Optimized (uses _timezone_period_cache table) 1514 ms 23 ms 52 ms 53 ms

Conclusion

By using Time Zone Period cache table, SQL query performance increased from 6 to 360 times. With the proposed workaround, we can have the exact same resultset as by using AT TIME ZONE operator. This approach may be used not only in SQL Server views but in dynamic SQL also. Each set of fixed or floating time range criteria will have its own tricky solution with complex subqueries.

History

  • 6th March, 2023: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Poland Poland
I also wrote Online Json Editor & 'Json/Xml to Excel converter'. It exports data to excel in the same way as the ReportExporters library does - plain/hierarchical views.
www.json-xls.com

Comments and Discussions

 
-- There are no messages in this forum --