Rahman Agoro, the frontline SQL DBA

Just another WordPress.com weblog

Archive for the ‘Management’ Category

SQL 2014 Replication Error The process could not execute ‘sp_replcmds’ on ‘xxxxxxxxxxxx’. (Source: MSSQL_REPL, Error number: MSSQL_REPL20011)

Posted by rahmanagoro on October 20, 2016


After spending some time away from SQL replication (SQL 2014), I find myself working with this brilliant piece of technology again. In this particular case I had to test, support and troubleshoot replication even without having any knowledge of its setup.

I had this error come up during one of my investigations when I check the log reader agent within SQL server replication monitor.

Error messages:

  • The process could not execute ‘sp_replcmds’ on ‘XXXServerXXX’. (Source: MSSQL_REPL, Error number: MSSQL_REPL20011)
    Get help: http://help/MSSQL_REPL20011
  • Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission. (Source: MSSQLServer, Error number: 15517)
    Get help: http://help/15517
  • The process could not execute ‘sp_replcmds’ on ‘XXXSERVERXXX’. (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)
    Get help: http://help/MSSQL_REPL22037

 

After doing some investigations, I found that the database in question didn’t have an owner specified.

I changed the owner of the database and hey pronto, the agent started to work again.

 

Use DatabaseName
GO
sp_changedbowner 'sa'
Advertisements

Posted in Management, SQL replication | Leave a Comment »

Database ‘DatabaseName’ is in transition. Try the statement later Msg 952, Level 16, State 1, Line 1

Posted by rahmanagoro on April 4, 2016


I bumped onto this error message when our monitoring tool started to report errors with a SQL agent job. On taking a close look at the error, I found that whenever I tried to connect to the database, I simply got the error:

Msg 952, Level 16, State 1, Line 1
Database 'DatabaseName' is in transition. Try the statement later.

Digging deeper into the issue and also the error information, i made some interesting discoveries. It appears that we have a blocking chain going on within the database, our environment is  such that database owners can choose to detach databases etc.

I can the following code to check for blocking:

select
 l.resource_type,
 l.request_mode,
 l.request_status,
 l.request_session_id,
 r.command,
 r.status,
 r.blocking_session_id,
 r.wait_type,
 r.wait_time,
 r.wait_resource,
 request_sql_text = st.text,
 s.program_name,
 most_recent_sql_text = stc.text
from sys.dm_tran_locks l
left join sys.dm_exec_requests r
on l.request_session_id = r.session_id
left join sys.dm_exec_sessions s
on l.request_session_id = s.session_id
left join sys.dm_exec_connections c
on s.session_id = c.session_id
outer apply sys.dm_exec_sql_text(r.sql_handle) st
outer apply sys.dm_exec_sql_text(c.most_recent_sql_handle) stc
where l.resource_database_id = db_id('DatabaseName')

It turned out that one user wanted to detach the database whilst another user just started to issue a query on the same database, this was the reason for the error and the change of the transition state. One way to avoid such issue before detaching a database is to take the following steps.

  • Personally, I do like using code/T-SQL to run commands rather then the good old GUI, I can see what i am doing and I can also see if my actions are getting stuck, rather than wondering if my Pc is slow etc.
  • Secondly, when detaching a database, if other connections can be rolled back. The code below can be issued.
ALTER DATABASE DatabaseName SET offline with rollback immediate

The issue

I could see the blocking chain, from the SQL query above which checks the locks. Multiple sessions wanted to connect to a database that was about to change its transition to offline and then detached.

What I simply did was kill one of the sessions which was causing the blocking.

 

 

 

Posted in Management, Scripts, SQL 2008, SQL Administration | Leave a Comment »

Cannot drop database XXXX because it is being used for replication.

Posted by rahmanagoro on March 15, 2016


If you have ever come across this error, its because replication has been setup on the database before and it needs to be removed before the database can be dropped.

In my case, I had the error when I tried to run the code below.

drop database TEST_FWXXX

Server: Msg 3724, Level 16, State 3, Line 1
Cannot drop the database XXX because it is being used for replication.

 

Please note that in this case, the publication and subscription had already been dropped. Simple to resolve, just remove the replication property on the DB by running the code below.

exec sp_removedbreplication ‘TEST_FWXXX’

After running this, I was then able to drop the database successfully.

 

Posted in Management, SQL replication | Leave a Comment »

Error: 17053, Severity: 16, State: 1 what does it mean ?

Posted by rahmanagoro on October 25, 2012


E:\DATA\Database.ndf: Operating system error 112(failed to retrieve text for this error. Reason: 15100) encountered.

I have seen this error message on one of our production systems, and wondering what the issue is. This error is being caused by the fact that one is trying to expand the size of the ndf file, and there just isn’t enough disk space on the actual drive or mount point.

To recreate the problem in SQL 2008, create an existing database. And then expand the size of one of the files to exceed the free available capacity on the server.

Posted in Management, SQL 2008, Tips | Leave a Comment »

Statistics again and we see same query with two different plans

Posted by rahmanagoro on October 25, 2012


Following on from my earlier post on the importance of keeping statistics up to date, I have come across another issue recently and this time, I have the luxury of actually showing the query plans too. Please note that I have used SQL Sentry plan explorer which you can download for free from http://www.sqlsentry.com

I had a call from one of my clients today complaining about bad performance on their database. The irony of this particular case is that I can see it myself by actually running the query on the database server and I could see that its taking a lot of time to complete for a fairly trivial query that shouldn’t really take that long to complete. I was lucky in this particular case that I also had a UAT system to run the query on as well, as it was a report which wasn’t really changing data, it made rerunning the report slightly less intrusive.

I proceeded onto UAT, ran the query and he pesto it completed within 9 seconds as opposed to 3 hours which it was taking on production. In my normal fashion, I started taking a look at the query plans and one thing which became obvious is the fact that the query plans are somewhat different on the 2 systems. Although the way the joins in the query have been written isn’t the best I have seen in my career but these things do happen. I looked through one of the tables and used the SQL Sentry query plan too to observe that the estimated rows on the table was somewhat wrong. I can also see that the wait stats on the query was CXPACKET waits, the query uses parallelism but I can’t really say it’s a bad thing.

The section of the plan below shows the table with the right number of row estimates for the statistics.

The query below shows the same query but with the statistics somewhat wrong.

Although looking at the query, the optimizer also recommended that an index be created on the table, I was hesitate to create the index as the query was working fine before, so why would it start to behave badly all of a sudden. After updating the statistics on the table, cleared the procedure cache and reran the query everything was fine once more. This just proves the importance of having up to date statistics and also the fact that when the optimizer recommends that you create indexes, it’s not always the answer to every performance problem.

Posted in Management, SQL 2008, SQL Administration, Tips | Leave a Comment »

Setting up WMI query and getting error SQLServerAgent Error: WMI error: 0x80041003.

Posted by rahmanagoro on June 13, 2011


I was trying to setup database mirroring alerts on one of my production database servers and I looked into the article on the MS site, see http://technet.microsoft.com/en-us/library/cc966392.aspx on mirroring alerting best practices. I then proceeded in trying to setup same, I ran the script below.

USE [msdb]
GO
EXEC msdb.dbo.sp_add_alert @name=N'Mirroring_Alert',
  @enabled=1,
  @delay_between_responses=1800,
  @include_event_description_in=0,
  @wmi_namespace=N'\\.\root\Microsoft\SqlServer\ServerEvents\MSSQLSERVER',
  @wmi_query=N'SELECT * FROM DATABASE_MIRRORING_STATE_CHANGE WHERE State = 5 ',
  @job_id=N'00000000-0000-0000-0000-000000000000'
GO
EXEC msdb.dbo.sp_add_notification @alert_name=N'Mirroring_Alert', @operator_name=N'DBA_GROUP', @notification_method = 1
GO

 

I then got the error message below.


Msg 22022, Level 16, State 1, Line 0
SQLServerAgent Error: WMI error: 0x80041003.
Msg 14511, Level 16, State 1, Procedure sp_verify_alert, Line 300
The @wmi_query could not be executed in the @wmi_namespace provided. Verify that an event class selected in the query exists in the namespace and that the query has the correct syntax.

 

I ran thesame code on my desktop and it was fine, but I was thinking what could be the problem here.

I checked everything to make sure that the syntax was right and there wasnt anything wrong, bumped across a few KB’s about thesame problem, but it was primarily being caused by server names being longer than 15 characters and this applied to SQL 2005, my SQL instance was SQL 2008 and I then thought what could be going wrong. After spending the hour battling this, and I have a rule that any problem taking longer than an hour to fix, I need to get my head cooled out, pop out for a 10 minute break and fresh air and then give it another go. It seem to work for me time and time again, I came back, thought !! Hang on a minute, lets check the SQL error log to see if there are any pointers to the root cause, on there I saw errors as follows.


Date  6/13/2011 11:45:33 AM
Log  SQL Server (Current - 6/13/2011 10:01:00 AM)

Source  Logon

Message
Login failed for user 'NT AUTHORITY\SYSTEM'. Reason: The account is disabled. [CLIENT: <local machine>]


Ahh, this could be it. When I setup SQL servers I often disable the builtin/Administrators login as this is a security risk. It looks like the WMI service is using the local system account and this is obviously disabled and thus causing the problem highlighted.

I changed the WMI service account such that its a domain account and permissioned it on SQL server and pesto !! The problem is solved.

Posted in Database Mirroring, Management, SQL 2008, SQL High Availability, Tips, WMI | 1 Comment »

SQL 2008 Management Datawarehouse Overview

Posted by rahmanagoro on April 13, 2011


 

With the release of SQL 2008, Microsoft has developed a warehouse framework for database administrators to have an idea of what is actually going in within SQL server. A lot of the times during my career, I have heard the phrase “The database was slow between 3AM and 4AM) This is especially true if you work within a global framework whereby the database server is serving requests 24/7. I would have users in the Asia pacific region using the database from around midnight GMT (UK time), the as they finish the UK users start around 8AM, and as they wrap up the US users kick in around 1PM BST. So it’s a never ending circle.

To buttress the point, management data warehouse gives me the high level and detailed view of what is happening on the database server, I like to use the waits and queues methodology to see what SQL server is waiting for, and to also keep tabs on my queues as well. If a process is queing up waiting for resources, then there must be a reason for it and as a database administrator, I want to check and understand why we have the waits and queues on the database server.

This post assumes that management data warehouse has already been installed on the database server, I am still running SQL 2008 RTM SP1 cumulative update 5 on my systems and till date, there is no way to remove management data warehouse once it has been installed. I installed management data warehouse on the dedicated DBA database on the production system, the database schema gets installed on the database as well.

To launch management data warehouse, follow the steps below.

Management data warehouse works by collecting information from a number of DMV’s and using connection points to upload the data onto the database and then store it within  data warehouse style tables.

 

Memory section

 

If you click on the memory graph, it drills down to show you the key memory counters. Here you can see things like page life expectancy, SQL server internal memory consumption by type

Reviewing waits, within management data warehouse you can check your wait statistics and further drill into what percentage the composition is.

Memory counters based on page life expentancy.

 

 

 

 

 

Disk IO statistics information can also be found below.

On the IO specific information, you can drill down on top queries based on duration, total IO, physical disk reads and also logical writes.

Posted in Management, SQL 2008, Tips, Uncategorized | Leave a Comment »

Move System model database in SQL 2008 R2 Failover Cluster

Posted by rahmanagoro on April 12, 2011


Move System database on a SQL 2008 R2 Failover Cluster

In order to do this, run the following steps

  • Modify the location of the system databases
  • Open up the failover cluster manager.
  • Run the code below with the SQL resource online

Alter database msdb modify file (name=msdbdata,filename='F:\Database\msdbdata.mdf')

Alter database model modify file (name=modellog,filename='L:\Logs\modellog.ldf')

  • Bring the SQL resource offline

  • Confirmation from the failover cluser manager

  • The resource has now been taken offline


  • Now move the files onto the new destination, i.e the location of where the system data and log file should be.
  • Bring the SQL service and sQL agent resource online.

Posted in Clustering, Management, SQL 2008, SQL Administration, SQL High Availability, Tips | Leave a Comment »

SQL Failover Clustering Requirements

Posted by rahmanagoro on April 12, 2011


When installing a SQL failover cluster, its a complicated process and we DBA’s are always looking for ways to make our lives easier. Also as a SQL failover installation would require cooperation between the following groups of people.

  1. Database Administrators
  2. Windows Administrators
  3. Network Administrators
  4. Storage Administrators.

I find that creating a small little spreadsheet and sending it around for everyone to fill in their details or perhaps state your requirements often helps and avoids delays and ambiguity.

I have put together a spreadsheet with same information as it might be of use to others.

SQL Failover Cluster_Requirements_v1.0

Posted in Clustering, Management, SQL 2008, SQL Administration, SQL High Availability | Leave a Comment »

Troubleshooting using default trace information

Posted by rahmanagoro on March 11, 2011


The default trace is built into SQL server, its one that often tells you operations that are happening on the database server which is often not provided within the SQL error logs.

I regularly query the default trace to pick up issues that might be arising on the database server, the scripts that I use to query the default trace can be found below. The default trace also allows me to create policies using policy based management within SQL 2008, we once had a deployment using Visual studio and for those who have used visual studio, it creatred databases as a project and this is used in storing database schema, database deployment, integration with SVN etc. At the end of the day, the repository was not upto date and Auto skrink was turned on with visual studio, I found this had been turned on with visual studio. This was revealed within the default trace and I was able to create a policy to check for this across the database estate.

To get the default trace file currently in use, run the query below.

SELECT *
FROM fn_trace_getinfo(default);

To group the various events to see what has been happening on the database server, you can run the query below.

--Auditing various event types in the trace file.

SELECT
trc_evnt.name, trc_evnt.category_id
,count(*) as occurence
FROM fn_trace_gettable('y:\databases\log_5170.trc', NULL) AS dflt_trc
INNER JOIN sys.trace_events AS trc_evnt
ON dflt_trc.EventClass = trc_evnt.trace_event_id
group by trc_evnt.name, trc_evnt.category_id
order by trc_evnt.name

You can change the default trace file to query the last five default traces, so for example y:\databases\log_5170.trc is the current trace, to check the one before this simply decrement the number by one, so run the query y:\databases\log_5169.trc.

I get information similar to the one below

Audit DBCC Event    8    9006
Audit Server Alter Trace Event    8    2
Hash Warning    3    2
Object:Created    5    644
Object:Deleted    5    201
Sort Warnings    3    1

I have used this method to troubleshoot various issues on the database such as the following:

missing statistics: I subesequently created and index or statistics on the column.

transaction logs growing every night: I changed the growth settings and expanded the size so as to remove the need for the growth.

Posted in Management, Scripts, SQL 2008, SQL Administration, Tips, Uncategorized | Leave a Comment »