Posted by rahmanagoro on October 25, 2012
I have always known about issues with table statistics on a SQL Server database, but actually seeing them cause performance problems is a bit of a different experience. I got a call from the support team saying that a report which they normally run on the database had been running for three hours and showed no sign of completing.
I then logged onto the system and ran some diagnostic queries to pull up the plan for the running query, and suddenly something didn't quite make sense to me. One of the tables, which I'm familiar with, seemed to be returning an estimated number of rows of one. At this stage I knew this wasn't right, as I'm quite familiar with the table and I know that it contains millions and millions of rows.
Even looking at a section of the query plan, I just knew it wasn’t right at all.
Straight away I updated statistics with a full scan on all the tables which had an estimate of one, re-ran the query, and this time it completed in around six minutes. One thing to learn from this post is that when you see an estimated number of rows of one, and you know that the table certainly has more than one row, it's usually an indication that the statistics are out of date. After updating the statistics we can see from the screenshot below that the pipes which flow from operator to operator are a lot bigger in size; this means that the number of rows being worked on is significantly higher.
· Always check statistics on the database.
· Ensure that auto update statistics is on for the database, unless you choose to manually run this yourself or the database is very large and manual statistics update is essential.
· Watch out for one row estimates on the query plan especially for large tables when you know the number of rows that ought to be returned is more than one.
· A full-scan statistics update may not be suitable for every environment; a sampling rate will normally also work, but tests will need to be carried out to establish which sample rate is suitable.
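The checks above can be sketched in T-SQL. This is a minimal example, assuming a hypothetical table dbo.BigTable; whether a full scan or a sample rate is appropriate depends on your environment:

```sql
-- When were the statistics on this table last updated?
SELECT s.name AS stat_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.BigTable');

-- Refresh the statistics with a full scan of the table
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;

-- Or, on a very large table, sample a percentage instead
UPDATE STATISTICS dbo.BigTable WITH SAMPLE 25 PERCENT;
```

A very old last_updated date on a large, frequently modified table is a good hint that the one-row estimates described above are statistics-related.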
Posted in Database development, SQL 2008, SQL Administration, Tips | Tagged: statistics
Posted by rahmanagoro on November 2, 2010
I have been using Microsoft Visual Studio 2010 within our shop for the management and version control of SQL databases; these databases are stored as projects under Visual Studio. It's an agile environment and releases can happen almost every weekend, or during the week. We also use SQL transactional replication in order to maintain a set of reporting database servers.
In the last couple of weeks I tried to do a deployment onto a publisher database and ran into an error with Visual Studio: it was trying to drop an object on the database whilst the object was being replicated. This obviously throws an exception in SQL Server.
Microsoft Visual Studio does not officially support SQL replication, and the only known workaround to the problem is to change the deployment properties to ignore verification checks. This can be done either from the GUI or by altering an XML file, which is what the GUI does behind the scenes.
You can get to the deployment properties by navigating to the following location from visual studio.
Project >> Properties >> Database deployment
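Before a deployment it can also help to know which objects are published, so you know in advance which drops will fail. A quick check, run against the publisher database, is the is_replicated flag in the catalog views:

```sql
-- Tables in the current database that are marked for replication;
-- dropping these will fail until they are removed from the publication
SELECT SCHEMA_NAME(t.schema_id) AS schema_name,
       t.name AS table_name
FROM sys.tables AS t
WHERE t.is_replicated = 1;
```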
Posted in Database development, Day to day queries, Management, Scripts, SQL 2008, SQL replication, Tips
Posted by rahmanagoro on November 2, 2010
Restoring a database whilst application connections to it persist. I have a .NET application which continually needs to query the database to fetch data, thereby keeping connections open; the problem is that I need to refresh the data in testing with recent data from production. If you try to restore the database, you get the error:
Msg 3101, Level 16, State 1, Line 1
Exclusive access could not be obtained because the database is in use.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
In order to get around this problem, what I tend to do is set the database offline, do the restore and set it back online. This prevents the application from continuously connecting to the database and thereby failing the restore. The scripts are shown on this post.
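A minimal sketch of this offline/restore/online approach is below; the database name MyAppDb and the backup path are placeholders for your own:

```sql
-- Take the database offline, killing existing application connections
ALTER DATABASE MyAppDb SET OFFLINE WITH ROLLBACK IMMEDIATE;
GO

-- Restore over the offline database
RESTORE DATABASE MyAppDb
FROM DISK = N'D:\Backups\MyAppDb.bak'
WITH REPLACE;
GO

-- Bring it back online for the application
ALTER DATABASE MyAppDb SET ONLINE;
GO
```

WITH ROLLBACK IMMEDIATE is what stops the in-use error: it rolls back open transactions and disconnects the sessions instead of waiting for them.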
Posted in Database development, Scripts, SQL High Availability
Posted by rahmanagoro on October 4, 2010
As the database administrator, I often have to deal with production problems. These problems can be classed under different categories; they include the following.
- Performance issues.
- Issues as a result of a release.
- Bugs within the SQL engine
- Faulty SQL code
- Proactive/reactive issues
The issues of performance problems and faulty SQL code can often be attributed to bad coding. Whilst I know that the job of a DBA itself can be hard, the code review process is one which is often ignored in most shops. I don't share that approach: I like to know what code is being written on the database servers which I manage. After all, if there are problems, the database administrator will be the one to be called at 02:45 in the early hours of the morning. I like to avoid such problems and also ensure that my environment is as stable as possible.
The art of performance tuning is not one that can be learnt from textbooks or just from reading on the internet; it's one that you gain largely from experience. I often gain such experience by dealing with production issues, attending seminars and, of course, a lot of reading, but I find that some of these problems can be avoided in the first place. That's where DBA code reviews come into play. Imagine a situation where one developer was corrected on how best to write a SQL query, and another developer is making the same mistake. So what do you then do? Do you go to every developer and start correcting them?
At a previous job, what I used to do is organise presentations once a month with my development team, often speaking about issues that I have either noticed or topics that the development team want me to speak about. As for code reviews, I came up with a checklist of items that I often look out for. They are as follows.
- Consistent naming convention should be used at all times, there are loads of naming conventions out there, pick one and stick to it.
- All objects especially stored procedures should have comments, information can include author, date, description of the object, changes etc.
- Error handling using TRY...CATCH. TRY...CATCH was introduced in SQL Server 2005 and is by far the best error handling method built into SQL.
- Use of schema names when referencing tables: specify Database.schemaname.objectname. This ensures that the SQL engine doesn't have to go looking for the schema which holds the object.
- Avoid the use of Select *, all column names must be explicitly specified. This will avoid unnecessary IO.
- Ensure that all tables have a primary key (preferably of integer type), and that all foreign keys are indexed with a non-clustered index.
- Avoid the placing of duplicate indexes on tables. As simple as it might sound, I have seen this in various places.
- Developers should make efforts to ensure that databases are normalised; a relationship diagram for the database would also be useful.
- Avoid server-side cursors and stick to SET-based operations. If row-by-row processing really is needed, use a WHILE loop instead of a cursor, and specify a primary key in your table; a very easy way to do this is using an identity column.
- Temporary tables and table variables: as a general rule of thumb, use a table variable for relatively small datasets and a temporary table for fairly large datasets. Temp tables maintain statistics and can have indexes, but as a downside they often cause recompilation; table variables don't maintain statistics, but they don't force a recompile either. This was demonstrated in a SQLBits 2009 session.
- When using nested stored procedures, ensure that the temporary table names are unique/different. I have seen this affect query plans in a very catastrophic manner.
- Avoid the use of functions in joins within SQL 2005/2008; from experience, performance tends to suffer as the dataset grows, mainly because the SQL optimiser doesn't know which index to use. Use a CTE or a derived table instead.
- Embrace the use of functions like OBJECT_ID to check for the existence of an object within stored procedures/scripts etc. This option is better than querying sysobjects, which can cause locking.
- When using ORDER BY within SQL, avoid ordering by ordinals, for example select name, surname from tblStudents order by 1,2. When using ORDER BY, ensure that the column names are specified; alias names should also be avoided, as this doesn't work in SQL 2008.
- Stored procedures that take in different combinations of parameters should use local variables in order to avoid parameter sniffing. When branching IF statements within a stored procedure process complicated logic, each IF branch should call another sub-procedure rather than a direct SQL statement. This allows the stored procedures to make proper use of the procedure cache, and prevents them generating inconsistent query plans due to their parameters.
- Tables that hold parameter information or lookups should have not null constraints defined.
- Developers should make efforts in identifying heavy queries within their applications, and review accordingly on how to optimise the query if needed.
- New tables to be sized appropriately with respect to the use of the table and the nature of the data it will store.
- Avoid the use of optimiser hints; in SQL 2008, hints should only be used in exceptional circumstances, i.e. maintaining a consistent plan guide etc. At all times the optimiser should be left to determine the best plan for all queries.
- Use SET NOCOUNT ON at the beginning of stored procedures; this improves performance by reducing network traffic.
- Developers should be careful with the use of custom data types, native SQL data types should be used where possible at all times.
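Several of the items on the checklist can be shown in a single sketch. The procedure below is a hypothetical example (dbo.usp_GetStudent and dbo.tblStudents are made-up names) illustrating a comment header, an OBJECT_ID existence check, SET NOCOUNT ON, schema-qualified names, an explicit column list, ORDER BY on named columns and TRY...CATCH error handling:

```sql
-- Author:      A. Developer
-- Date:        2012-10-25
-- Description: Returns a single student row by primary key.
IF OBJECT_ID('dbo.usp_GetStudent', 'P') IS NOT NULL
    DROP PROCEDURE dbo.usp_GetStudent;
GO

CREATE PROCEDURE dbo.usp_GetStudent
    @StudentId INT
AS
BEGIN
    SET NOCOUNT ON;  -- suppress "rows affected" messages, reducing network traffic

    BEGIN TRY
        -- Explicit column list and schema-qualified table name, no SELECT *
        SELECT s.StudentId, s.Name, s.Surname
        FROM dbo.tblStudents AS s
        WHERE s.StudentId = @StudentId
        ORDER BY s.Surname, s.Name;  -- named columns, never ordinals
    END TRY
    BEGIN CATCH
        -- Re-raise the original error details to the caller
        DECLARE @msg NVARCHAR(2048) = ERROR_MESSAGE();
        RAISERROR(@msg, 16, 1);
    END CATCH
END;
GO
```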
Posted in Database development, Professional Development, SQL Administration