As a database administrator, I often have to deal with production problems. These problems fall into several categories, including the following:
- Performance issues.
- Issues caused by a release.
- Bugs within the SQL engine.
- Faulty SQL code.
- Proactive/reactive issues.
Performance problems and faulty SQL code can often be attributed to bad coding. The job of a DBA is hard enough as it is, yet the code review process is one that most shops ignore. I don’t share that approach. I like to know what code is being written on the database servers I manage; after all, if there are problems, the database administrator is the one who gets called at 02:45 AM. I would rather avoid such problems and keep my environment as stable as possible.
The art of performance tuning cannot be learnt from textbooks or from reading on the internet alone; you gain it mostly through experience. I gain that experience by dealing with production issues, attending seminars and, of course, doing a lot of reading, but I find that some of these problems can be avoided in the first place. That’s where DBA code reviews come into play. Imagine a situation where one developer has been shown how best to write a SQL query, and another developer is making the same mistake. What do you do then? Do you go to every developer and correct them one by one?
At a previous job, I used to organise a presentation once a month for my development team, often speaking about issues I had noticed or topics the team wanted me to cover. For code reviews, I came up with a checklist of items that I look out for. They are as follows:
- A consistent naming convention should be used at all times. There are loads of naming conventions out there; pick one and stick to it.
- All objects, especially stored procedures, should have comments. Information can include the author, date, a description of the object, a change history, etc.
- Handle errors using TRY...CATCH. TRY...CATCH was introduced in SQL 2005 and is by far the best error handling method built into SQL.
- Use schema names when referencing tables; specify Database.schemaname.objectname. This saves the SQL engine from having to search for the schema that holds the object.
- Avoid the use of SELECT *; all column names must be explicitly specified. This avoids unnecessary IO.
- Ensure that all tables have a primary key (preferably of integer type), and that all foreign keys are indexed with a nonclustered index.
- Avoid placing duplicate indexes on tables. As simple as it might sound, I have seen this in various places.
- Developers should make an effort to ensure that databases are normalised. A relationship diagram for the database would also be useful.
- Avoid server-side cursors and stick to set-based operations. If row-by-row processing really is needed, use a WHILE loop instead and drive it from a primary key on your table; a very easy way to do this is with an identity column.
- Temporary tables and table variables: as a general rule of thumb, use a table variable for relatively small datasets and a temporary table for fairly large datasets. Temp tables maintain statistics and can have indexes, but the downside is that they often cause recompiles; table variables don’t maintain statistics, but they don’t force a recompilation. This was demonstrated in the SQL BITS 2009 session.
- When using nested stored procedures, ensure that the temporary table names are unique. I have seen clashing names affect query plans in a very catastrophic manner.
- Avoid the use of functions in joins in SQL 2005/2008. From experience, performance tends to suffer as the dataset grows, mainly because the SQL optimizer doesn’t know which index to use. Use a CTE or a derived table instead.
- Embrace the use of functions like OBJECT_ID to check for the existence of an object within stored procedures, scripts, etc. This is better than querying sysobjects, which can cause locking.
- When using ORDER BY, avoid ordering by ordinals, for example SELECT name, surname FROM tblStudents ORDER BY 1, 2. Ensure that the column names are specified. I also avoid alias names in ORDER BY, since some alias usages (for example, an alias qualified with a table name) are rejected in SQL 2008.
- Stored procedures that take in different combinations of parameters should use local variables in order to avoid parameter sniffing. When a stored procedure has branching IF statements that process complicated logic, each IF branch should call another sub-procedure rather than run a SQL statement directly. This allows the stored procedures to make use of the procedure cache, and prevents them from generating inconsistent query plans due to parameters.
- Tables that hold parameter information or lookups should have NOT NULL constraints defined.
- Developers should make an effort to identify heavy queries within their applications and review how to optimise them where needed.
- New tables should be sized appropriately for how the table will be used and the nature of the data it will store.
- Avoid the use of optimiser hints. In SQL 2008, hints should only be used in exceptional circumstances, for example when maintaining a consistent plan guide. Otherwise, the optimiser should be left to determine the best plan for all queries.
- Use SET NOCOUNT ON at the beginning of stored procedures; this improves performance by reducing network traffic.
- Developers should be careful with the use of custom data types; native SQL data types should be used wherever possible.
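To make a few of the checklist items concrete, here is a minimal sketch of what a stored procedure that passes review might look like. It is only an illustration, not a definitive template: the procedure name dbo.usp_GetStudentsBySurname, the StudentID column and the nvarchar(50) size are assumptions made up for the example, while tblStudents, name and surname are borrowed from the ORDER BY item above. It uses two-part names (schema.object) because the database name is unknown; the three-part form would simply prefix it.

```sql
-- Hypothetical objects: dbo.usp_GetStudentsBySurname, StudentID and the
-- nvarchar(50) size are assumptions for illustration only.

-- Existence check with OBJECT_ID rather than a query against sysobjects.
IF OBJECT_ID(N'dbo.usp_GetStudentsBySurname', N'P') IS NOT NULL
    DROP PROCEDURE dbo.usp_GetStudentsBySurname;
GO

/*
    Author:      <author name>
    Date:        <creation date>
    Description: Returns the students that have a given surname.
    Changes:     <date, author, summary of change>
*/
CREATE PROCEDURE dbo.usp_GetStudentsBySurname
    @Surname nvarchar(50)
AS
BEGIN
    -- Suppress the "rows affected" messages sent back to the client.
    SET NOCOUNT ON;

    -- Copy the parameter into a local variable, as the checklist suggests
    -- for avoiding parameter sniffing.
    DECLARE @LocalSurname nvarchar(50);
    SET @LocalSurname = @Surname;

    BEGIN TRY
        -- Explicit column list, schema-qualified table name, and
        -- ORDER BY on column names instead of ordinals.
        SELECT s.StudentID, s.name, s.surname
        FROM dbo.tblStudents AS s
        WHERE s.surname = @LocalSurname
        ORDER BY s.surname, s.name;
    END TRY
    BEGIN CATCH
        -- Capture the error details and re-raise them to the caller.
        DECLARE @ErrorMessage nvarchar(4000),
                @ErrorSeverity int,
                @ErrorState int;

        SELECT @ErrorMessage = ERROR_MESSAGE(),
               @ErrorSeverity = ERROR_SEVERITY(),
               @ErrorState = ERROR_STATE();

        RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);
    END CATCH
END;
GO
```

Calling it would look like EXEC dbo.usp_GetStudentsBySurname @Surname = N'Smith'; where the surname value is again just a placeholder. The syntax is kept to what SQL 2005/2008 supports, which is why the local variable is declared and assigned in two separate statements.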