Click here to Skip to main content
15,867,686 members
Articles / Database Development

Indexed Views and How It Can Help in Improving Query Performance

Rate me:
Please Sign up or sign in to vote.
5.00/5 (8 votes)
23 Mar 2019CPOL6 min read 13.1K   12   2
A considerable method to speed up query running time

Introduction

SQL Server Views can be a great helpful tool for database developers, allowing us to encapsulate many complicated queries into one single query and help other developers in working with our database more easily. But if we just create a regular view, its mission is just help to make the use of a complicated SELECT query become faster and more readable and understandable. There is no performance improvement applied on that kind of view, as the view becomes a shortcut to retrieve data. Fortunately, now we can create an indexed view, which will help us to increase query performance, and save the cost. There are many benefits if we understand it well and manipulate it in the right way. But there is also no free-lunch, so what should we be concerned about? Where and when to use indexed views?

In this very basic article, I'm going to show you what is an indexed view and how to use it to improve query performance in SQL Server.

What Is a View?

View is an object in SQL Server that allows us to group a complex and complicated SELECT clause, with many sub clauses, like JOIN operator or aggregate function like SUM COUNT AVG,... into a single query. For a very common database for managing customers invoices, assuming we have a query with many JOIN clauses to list all order detail from customers who live in Danang city.

SQL
SELECT a.Id, a.FirstName, a.LastName, d.ProductId, d.ProductName
FROM Customers a JOIN Orders b
ON a.CustomerId = b.CustomerId
JOIN OrderDetails c 
ON b.OrderId = c.OrderId
JOIN Products d
ON c.ProductId = d.ProductId
WHERE a.City = 'Danang'

So wherever we want a list of order details from customers who live in Danang city, we have to run the query above. It's sometime not a short but very very long and complex query, and if the database engineer gives the complex query to a backend developer, he would get confused. In this case, the database engineer will create a view name vOrderDetail_DanangCity as below:

SQL
CREATE VIEW dbo.vOrderDetail_DanangCity AS
SELECT a.Id, a.FirstName, a.LastName, d.ProductId, d.ProductName
FROM Customers a JOIN Orders b 
ON a.CustomerId = b.CustomerId 
JOIN OrderDetails c 
ON b.OrderId = c.OrderId 
JOIN Products d 
ON c.ProductId = d.ProductId 
WHERE a.City = 'Danang'

So now the database engineer provides a view to backend developer, named vOrderDetail_DanangCity. The developer only needs to know what the view does, and doesn't care about how it does it. Anywhere he needs to use the view, he just:

SQL
SELECT * FROM dbo.vOrderDetail_DanangCity

Cool, now he can get the result using only 1-line query. But what happens inside the view? In this case, creating a view is just a shortcut, and when the view is called, the Database Management System (DBMS) still has to re-run the query above to get the result. It means many join clauses, sub clauses, and aggregate functions... would be executed again.

Indexed Views

Indexed Views (or Materialized Views in some books), is a view that has unique, clustered index on it. Let's create an indexed view to understand it clearly.

Assuming I have a simple database with the following relationship:

Image 1

One student can join many courses. One course can have many students joined. A student joining a course will have an examination and have a score. In the database, there are 2000 rows in Courses, 5000 rows in Students, and 10,000,000 rows in Student_Course. You can download this demo database backup file (bak) at the end of this article.

Assuming we have to list all courses that have at least 1 student joined, and the average score of that course.

SQL
SELECT crs.CourseName, AVG(std_crs.Score) AS Average
FROM dbo.Courses crs JOIN Student_Course std_crs
ON crs.Id = std_crs.CourseId
GROUP BY crs.CourseName

It takes 4 seconds to execute this query. We will use this query to create a view:

SQL
CREATE VIEW dbo.AverageScrores
WITH SCHEMABINDING
AS
SELECT crs.CourseName, SUM(std_crs.Score) AS TotalScore, COUNT_BIG(*) AS NumberOfRecords
FROM dbo.Courses crs JOIN dbo.Student_Course std_crs
ON crs.Id = std_crs.CourseId
GROUP BY crs.CourseName

Note that we will use SUM and COUNT_BIG(*) for this case. Creating indexed views in SQL Server has some restrictions and requirements, and we have to deal with it, in this case by turning AVG to SUM and COUNT, and in the future, we will calculate AVG = SUM / COUNT. We will not discuss these requirements here.

The WITH SCHEMABINDING is a mandatory requirement to create indexed view. It helps to protect the column appear in the view, you cannot alter or drop column from the base table. If you want, first you have to drop the view, alter or delete column, then re-create another view.

Finally, create a unique, clustered index on that view. We can create index on one or more columns.

SQL
CREATE UNIQUE CLUSTERED INDEX CIX_AverageScrores
ON dbo.AverageScrores(CourseName);

And now, we have an indexed view. When we create a Unique index on a view, we are "materializing" it.

  • If the view doesn't have any unique index, it is just a shortcut. When the view is called, DBMS will re-run the query to get the result. It means DBMS will run many join clauses, many complicated aggregate functions,... all over again.
  • If the view has an unique index, it is "materialized". One copy of this view is created. This materialized view is used to store the result of the query we used to create the view. When the view is called, instead of re-running the query with many complex clauses, DBMS will read from the materialized view.

Let's run the query below:

SQL
SELECT CourseName, TotalScore / NumberOfRecords AS Average FROM dbo.AverageScrores WITH (NOEXPAND)

We are now using the indexed view, by pointing out that the query should use WITH (NOEXPAND). WITH (NOEXPAND) will let the DBMS know that it is an indexed view and we want to use the result stored before. If you forget to add WITH (NOEXPAND), the DBMS would re-run the query.

The result appears after 00:00:00 seconds. Immediately.

Compare Indexed Views and Views

Run these 2 queries: The first query is the query we used to create the view. Second query is the indexed views.

SQL
SELECT crs.CourseName, SUM(std_crs.Score) / COUNT_BIG(*) AS Average
FROM dbo.Courses crs JOIN dbo.Student_Course std_crs
ON crs.Id = std_crs.CourseId
GROUP BY crs.CourseName
 
SELECT CourseName, TotalScore / NumberOfRecords AS Average FROM dbo.AverageScrores  WITH (NOEXPAND)

We can see the result is the same, there is no difference.

Image 2

But the time of execution is totally different. When we use indexed view, it's 00:00:00. When we use the based query, it's 4 seconds. This difference is because indexed views WITH NOEXPAND doesn't have to calculate the result again. It didn't join table and calculate aggregate function. It just reads the result stored in the view and shows it to us.

When to Use

Using indexed view seems to be a good choice. But is that a free-lunch?

No, it's not. In fact, it's an expensive lunch if we use them abusively. Whenever the based tables (the tables join in FROM clause when we create the view) has changed, DBMS not only has to update the table, re-index the view but also re-calculate the value stored. In this case, when some student joins any course, the DBMS has to re-calculate SUM and COUNT of that course, which is stored in materialized view. It also means the more complex the query is, the more complicated the view maintenance takes.

View maintenance is a big problem. What is the most effective way to maintain the result stored fastest without re-running the original query? Because of view maintenance problem, there are many restrictions and requirements for creating indexed view, for example, query with OUTER JOIN is not allowed.

So we know Indexed view will:

Make SELECT clause run faster.

Make INSERT, UPDATE, DELETE or some manipulating event become slower.

So use them wisely. Just apply indexed view on a query that we use very often, but the based tables are not updated frequently. In this case, the performance and effectiveness reach the maximum. Otherwise, be careful. You can make the system run slower than you expected if the view is not used very often, but the based tables have been updating time by time.

Image 3

You can download the demo database at the link below (I use SQL Server 2014):

History

  • 23rd March, 2019: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Team Leader
Vietnam Vietnam
I'm currently working in .NET technology, including Web and mobile app development.
My personal blog is at http://codingamazing.com

Comments and Discussions

 
QuestionImages missing Pin
Klaus Luedenscheidt23-Mar-19 19:50
Klaus Luedenscheidt23-Mar-19 19:50 
AnswerRe: Images missing Pin
Minh Tuan Do23-Mar-19 20:30
professionalMinh Tuan Do23-Mar-19 20:30 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.