Friday, May 6, 2016

SQL - Queries Tuning and Optimization Techniques

In SQL, it is very difficult to write complex SQL queries involving joins across many (at least 3-4) tables and involving several nested conditions because a SQL statement, once it reaches a certain level of complexity, is basically a little program in and of itself. 
A database index is a data structure that improves the speed of operations on a database table. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random look ups and efficient access of ordered records. Indexing is incredibly important when working with large tables, however, occasionally smaller tables should be indexed, if they are expected to grow.


Try to consistently indent and don't be afraid to use multiple lines. You don't have to write it all at once. Complex queries can sometimes just be a collection of simple queries. You need to follow some basic guidelines and Take the time to think these through such as-

  1. List all of the columns that are to be returned
  2. List all of the columns that are used in the WHERE clause
  3. List all of the columns used in the JOINs (if applicable)
  4. List all the tables used in JOINs (if applicable)
  5. Get the correct records selected first
  6. Save the complex calculations for last
  7. If you do use a Common Table Expression (CTE), be aware that the query only persists until the next query is run, so in some cases where you are using the CTE in multiple queries, it might be better for performance to use a temp table.


Once you have the above information organized into this easy-to-comprehend form, it is much easier to identify those columns that could potentially make use of indexes when executed.
It makes a big difference to really understand how the data is combined, selected, filtered, and output. Here, query Optimization tricks comes into the picture to increase the performance of the program or software. There are a lot of guideline points to tune your query which do work as the boost of the query performance.  These guideline points are mentioned below: 

  1. SET NOCOUNT ON at the beginning of each stored procedure you write. This statement should be included in every stored procedure, trigger, etc. that you write.
  2. The SQL query becomes faster if you use the actual columns names in SELECT statement instead of than '*'.
  3. HAVING clause is used to filter the rows after all the rows are selected if you are using aggregation functions. It is just like a filter. Do not use HAVING clause for any other purposes.
  4. It is the best practice to avoid sub queries in your SQL statement and try to minimize the number of subquery block in your query if possible.
  5. Use operator EXISTS, IN and table joins appropriately in your query. The reason is- Usually IN has the slowest performance 
  6. IN is efficient when most of the filter criteria are in the sub-query.  
  7. EXISTS is efficient when most of the filter criteria is in the main query.
  8. Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship.
  9. Be careful while using conditions in WHERE clause.
  10. To write queries which provide efficient performance follow the general SQL standard rules.
  11. Use single case for all SQL verbs
  12. Begin all SQL verbs on a new line
  13. Separate all words with a single space 
  14. Right or left aligning verbs within the initial SQL verb
  15. Indexes have the advantages as well as disadvantages as given below-
  16. Do not automatically add indexes on a table because it seems like the right thing to do. Only add indexes if you know that they will be used by the queries run against the table.
  17. Indexes should be measured on all columns that are frequently used in WHERE, ORDER BY, GROUP BY, TOP and DISTINCT clauses.
  18. Do not add more indexes on your OLTP tables to minimize the overhead that occurs with indexes during data modifications.
  19. Drop all those indexes that are not used by the Query Optimizer, generally.
  20. If possible, try to create indexes on columns that have integer values instead of characters. Integer values use less overhead than character values.
  21. To provide up-to-date statistics, the query optimizer needs to make smart query optimization decisions. You will generally want to leave the "Auto Update Statistics" database option on. This helps to ensure that the optimizer statistics are valid, ensuring that queries are properly optimized when they are run.
  22. If you want to boost the performance of a query that includes an AND operator in the WHERE clause, consider the following:
  23. Of the search criteria in the WHERE clause, at least one of them should be based on a highly selective column that has an index.
  24. If at least one of the search criteria in the WHERE clause is not highly selective, consider adding indexes to all of the columns referenced in the WHERE clause.
  25. If none of the columns in the WHERE clause are selective enough to use an index on their own, consider creating a covering index for this query.
  26. Queries that include either the DISTINCT or the GROUP BY clauses can be optimized by including appropriate indexes. Any of the following indexing strategies can be used:
  27. Include a covering, non-clustered index (covering the appropriate columns) of the DISTINCT or the GROUP BY clauses.
  28. Include a clustered index on the columns in the GROUP BY clause.
  29. Include a clustered index on the columns found in the SELECT clause.
  30. Adding appropriate indexes to queries that include DISTINCT or GROUP BY is most important for those queries that run often.
  31. When you need to execute a string of Transact-SQL, you should use the sp_executesql stored procedure instead of the EXECUTE statement.
  32. When calling a stored procedure from your application, it is important that you call it using its qualified name.
  33. Use stored procedures instead of views because they offer better performance and don't include code, variable or parameters that don't do anything.
  34. If possible, avoid using SQL Server cursors. They generally use a lot of SQL Server resources and reduce the performance and scalability of your applications.
  35. Instead of using temporary tables, consider using a derived table instead. A derived table is the result of using a SELECT statement in the FROM clause of an existing SELECT statement. By using derived tables instead of temporary tables, you can reduce I/O and often boost your application's performance.
  36. Don't use the NVARCHAR or NCHAR data types unless you need to store 16-bit character (Unicode) data. They take up twice as much space as VARCHAR or CHAR data types, increasing server I/O and wasting unnecessary space in your buffer cache.
  37. If you use the CONVERT function to convert a value to a variable length data type such as VARCHAR, always specify the length of the variable data type. If you do not, SQL Server assumes a default length of 30.
  38. If you are creating a column that you know will be subject to many sorts, consider making the column integer-based and not character-based. This is because SQL Server can sort integer data much faster than character data.
  39. Don't use ORDER BY in your SELECT statements unless you really need to, as it adds a lot of extra overhead. For example, perhaps it may be more efficient to sort the data at the client than at the server.
  40. Don't return more data (both rows and columns) from SQL Server than you need to the client or middle-tier and then further reduce the data to the data you really need at the client or middle-tier. This wastes SQL Server resources and network bandwidth.

Conclusion
To tune our SQL queries, understanding our database does play the most important role. In SQL, typically each table column has an associated data type. Text, Integer, Varchar, Date, and more, are typically available types for developers to choose from. When writing SQL statements, make sure you choose the proper data type for the column. Sometimes it's easier to break up sub groups into their own select statement. To write a query, we need to know about the actual need of the query and scope of the query also. 

49 comments:

  1. Can you prove this:
    The SQL query becomes faster if you use the actual columns names in SELECT statement instead of than '*'.

    ReplyDelete
    Replies
    1. Thats the first thing I thought also.

      Delete
    2. That's the are where common sense comes into play

      Delete
  2. There are a lot of presumptions in these tips which are not always correct. For example an IN statement is as fast as a exists when the column is not nullable.

    ReplyDelete
  3. I have found that complex queries always involve multiple tables. To get the query correct, I always write a sub query against each table that gives me exactly the data I need, either to return or to select. Sometimes I even define these as views so they can be independently verified and used in multiple places. Then I combine the subqueries/views into a straightforward join query that gives me the data I need. I never worry about performance until a correct result is achieved. I review the query execution plan if needed to see if additional indices are needed.

    I find the authors guidelines misleading because they do not include the use of views or the use of query execution plans and I find both of these to be essential to writing correct, performance queries.

    ReplyDelete
  4. Many of the tips above is prescribing one solution for all, which is not correct.

    Many of the tip can be proven to be wrong from query to query.

    Optimization tips must be based on Execution Plan, not on Query. The tips above are based on Query.

    ReplyDelete
  5. Very useful post. This is my first time i visit here. I found so many interesting stuff in your blog especially its discussion. Really its great article. Keep it up. SEO optimalisatie

    ReplyDelete
  6. You know your projects stand out of the herd. There is something special about them. It seems to me all of them are really brilliant! SEO

    ReplyDelete
  7. You know your projects stand out of the herd. There is something special about them. It seems to me all of them are really brilliant! SEO

    ReplyDelete
  8. To inquire about the enormous measure of information we need an expert and all around experienced Data Scientist, to end up one of the specialists among the Data Scientist swarm, you should upgrade your range of abilities in huge information in a propelled manner.ExcelR Data Science Courses

    ReplyDelete
  9. Extraordinary review, I am a major adherent to remarking on online journals to advise the blog scholars realize that they've added something advantageous to the internet!. SEO in New York

    ReplyDelete
  10. The other services are somehow averagely demanded. relevant comment

    ReplyDelete
  11. A IEEE project is an interrelated arrangement of exercises, having a positive beginning and end point and bringing about an interesting result in Engineering Colleges for a particular asset assignment working under a triple limitation - time, cost and execution. Final Year Project Domains for CSE In Engineering Colleges, final year IEEE Project Management requires the utilization of abilities and information to arrange, plan, plan, direct, control, screen, and assess a final year project for cse. The utilization of Project Management to accomplish authoritative objectives has expanded quickly and many engineering colleges have reacted with final year IEEE projects Project Centers in Chennai for CSE to help students in learning these remarkable abilities.



    Spring Framework has already made serious inroads as an integrated technology stack for building user-facing applications. Spring Framework Corporate TRaining the authors explore the idea of using Java in Big Data platforms.
    Specifically, Spring Framework provides various tasks are geared around preparing data for further analysis and visualization. Spring Training in Chennai


    ReplyDelete
  12. Great post i must say and thanks for the information. Education is definitely a sticky subject. However, is still among the leading topics of our time. I appreciate your post and look forward to more.

    digital marketing course

    For more info :
    ExcelR - Data Science, Data Analytics, Business Analytics Course Training in Mumbai

    304, 3rd Floor, Pratibha Building. Three Petrol pump, Opposite Manas Tower, LBS Rd, Pakhdi, Thane West, Thane, Maharashtra 400602
    18002122120

    ReplyDelete
  13. Amazing knowledge and I like to share this kind of information with my friends and hope they like it they why I do new york towing

    ReplyDelete
  14. Gangaur Realtech is a professionally managed organisation specializing in real estate services where integrated services are provided by professionals to its clients seeking increased value by owning, occupying or investing in real estate. mobile cleaner light

    ReplyDelete
  15. I don’t think many of websites provide this type of information. Bridal ring

    ReplyDelete
  16. A debt of gratitude is in order for the blog entry amigo! Keep them coming... Magic mirror hire South west

    ReplyDelete
  17. A debt of gratitude is in order for the blog entry amigo! Keep them coming... Magic mirror hire South west

    ReplyDelete
  18. I read a article under the same title some time ago, but this articles quality is much, much better. How you do this.. day trading

    ReplyDelete
  19. It is extremely nice to see the greatest details presented in an easy and understanding manner. Women's clothes

    ReplyDelete
  20. Great job for publishing such a beneficial web site. Your web log isn’t only useful but it is additionally really creative too. alvin russell

    ReplyDelete
  21. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Correlation vs Covariance
    Simple linear regression

    ReplyDelete
  22. Very interesting blog. Many blogs I see these days do not really provide anything that attracts others, but believe me the way you interact is literally awesome.You can also check my articles as well.

    Data Science In Banglore With Placements
    Data Science Course In Bangalore
    Data Science Training In Bangalore
    Best Data Science Courses In Bangalore
    Data Science Institute In Bangalore

    Thank you..

    ReplyDelete
  23. Awesome article. Very interesting to read this newsletter.I would like to thanks for the efforts you had made for scripting this tremendous article. This text resolved my all queries...topsoil supplier |

    ReplyDelete
  24. Such a very useful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article.

    Simple Linear Regression

    Correlation vs Covariance

    ReplyDelete
  25. You make so many great points here that I read your article a couple of times. Your views are in accordance with my own for the most part. This is great content for your readers. https://cbtemailextractor.com/

    ReplyDelete
  26. Thanks for taking the time to discuss this, I feel strongly about it and love learning more on this topic. If possible, as you gain expertise, would you mind updating your blog with extra information? It is extremely helpful for me. blog comment

    ReplyDelete
  27. We have sell some products of different custom boxes.it is very useful and very low price please visits this site thanks and please share this post with your friends. Backlinks

    ReplyDelete
  28. There is no dearth of Data Science course syllabus or resources. Learn the advanced data science course concepts and get your skills upgraded from the pioneers in Data Science.
    data science course bangalore
    data science course syllabus

    ReplyDelete
  29. When you use a genuine service, you will be able to provide instructions, share materials and choose the formatting style. Human Rights

    ReplyDelete
  30. very well explained. I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Logistic Regression explained
    Correlation vs Covariance
    Simple Linear Regression
    data science interview questions
    KNN Algorithm
    Bag of Words Python

    ReplyDelete
  31. When Google bots are crawling a website they look for external links point to other websites. If this links is “dofolow” then search engines will follows the link and so link juice gets passed. Links that count as points, pushing SEO link juice and boosting the page rank of the linked-to sites, helping them go higher in the SERPs as a result. Dofollow Backlinks

    ReplyDelete
  32. Outstanding blog appreciating your endless efforts in coming up with an extraordinary content. Which perhaps motivates the readers to feel excited in grasping the subject easily. This obviously makes every readers to thank the blogger and hope the similar creative content in future too.
    360DigiTMG Data Analytics Course

    ReplyDelete
  33. At DMC Cleaning & Maintenance we are committed to providing high-quality cleaning & maintenance services as well as excellent customer satisfaction.
    Professional Window Cleaning Services

    ReplyDelete
  34. I was basically inspecting through the web filtering for certain data and ran over your blog. I am flabbergasted by the data that you have on this blog. It shows how well you welcome this subject. Bookmarked this page, will return for extra. data science course in jaipur

    ReplyDelete
  35. Hi! This is my first visit to your blog! We are a team of volunteers and new initiatives in the same niche. Blog gave us useful information to work. You have done an amazing job!

    satta

    satta matta matka

    ReplyDelete

Popular Posts