There is a great deal of information about your SQL Server that may be of interest to you or your clients. I was recently asked how to gather some statistical information for a SQL Server and its environment by a client. The person asking the question was wondering whether to use Performance Monitor, stored procedures, or some other method to get the following information:
- # users connected to each database
- name of all databases
- size of each database on disk (physical size)
Most of this information can be gathered using Performance Monitor, with the exception of the first item. Performance Monitor provides total users connected to the server, but not by database. In addition, if a new database is added, the Performance Monitor chart or log settings must be altered to add the new database instance to the database size object.
I do use Performance Monitor heavily to monitor my database servers and see how they are performing. There are a number of data points that performance monitor provides that I cannot get from another source. There are a large number of articles written on how to use performance monitor and quite a few specific to SQL Server, so I will not delve into details here (If you need more information on Performance Monitor, see Brad McGehee's series on this topic). I view Performance Monitor more of a long term trending tool than a status tool as it is cumbersome to get the data from Performance Monitor into a format that is easily used (At least, for me it is a cumbersome process).
So that brings me to this article. I like getting a daily status from my servers. A quick, concise view of what has happened on the server over the last day. I try to keep this information to a minimum as I do not have the luxury of being able to examine in detail how my servers are doing everyday. Instead I only want errors or exceptions reported to me. The less information I see everyday, the better. Of course I like to keep track of things, but information management to prevent information overload is becoming more important as the number of servers grow in my organization.
What kind of data can we gather?
So, lets get started. How can I get information from my SQL Server? Well, let's pick a few data items to gather. How about the following list my client asked for?
- Total users per database
- Total size of each database
- Last Backup of each database
This is a pretty general list, though I would not usually get the last backup on my servers. Instead, I would have some task that alerted me if there was a problem with a backup, but for the example let's use this list. I will tackle these items one by one. This article will tackle item # 1 with #s 2 and 3 in follow-up articles.
Total Users Per Database
This is an interesting one, because the number of users in a database is a snapshot statistic. The number of users is only meaningful at some point in time because users log in and out of databases constantly. In order for this to have some meaning you want it to include a time stamp of when the count was made.
So, I still have not answered where we can get this information. Microsoft has a couple of ways in which this can be done. The recommended way to find out about users and databases is to use the sp_who or sp_who2 stored procedures. These procedures return a list of all users and the databases they are connected to. However, in order to get a count by database, we would need to trap this information in a temporary table and then run a query to select the information out of this table. The code for doing this is given below:
create table #users ( spid int, status char( 40), loginname char( 100), hostname char( 100), blk int, dbname char( 40), cmd varchar( 200) ) -- load the table insert #users exec sp_who -- get the totals select dbname, count(spid) from #users group by dbname drop table #users
Many people will realize that you could just as easily get this information by querying against sysprocesses in the following manner:
select d.name, count(spid) from sysprocesses s, sysdatabases d where s.dbid = d.dbid group by dbname
This seems shorter and cleaner, so why not do this? Well, I used to do this quite often in v6.5 to generate daily status information. When v7.0 was released, a number of my stored procedures stopped working after an upgrade. After digging in further I found that the structure of some of the system tables changed. As a result I was no longer getting the results that I expected, and in some cases the procedures would return errors.
MORAL: If you can possibly avoid it, do not use the system tables for queries. There is no guarantee that these will remain the same over time. Instead, Microsoft has done a good job providing views and stored procedures which you can query for information about the server.
The code batch above will return a snapshot of the count of users per database when it is run. However, I think this is of limited use. So here are a few enhancements that I would make to this batch. First I would make it a stored procedure so I can schedule it easily using the SQL Agent. I feel that if there are any scripts that you run on a regular basis, you are better off creating stored procedures out of them. This gives you a building block for stringing processes together.
My client wanted to know two things about this data. The maximum count of users for each database for the previous day, and a long term average. So next I would create a table in which to store the data. That way I could query this each day for the previous days maximum as well as store this for long term trends. This changed the query statement as shown below:
insert dbusage (dbname, usrcnt, entrydt) select dbname, count(spid), getdate() from #users group by dbname
To get the daily report back, we setup a procedure to send the results of the following query through email to the administrator:
select dbname, max( usrcnt) 'User_Count' from dbusage where entrydt > dateadd( day, case when datepart( dd, getdate()) = 2 then -3 else -1 end, getdate()) group by dbname
This returns the previous days (or previous 3 days if monday) count per database.
Now we can schedule the first code to run every minute or so during the day and store the information. There is a flaw in this system in that if a user logs in and then out of the database between executions of the task, we will not count them towards our maximum. I think this is a small price to pay given that if our performance monitor interval were long enough, say 30sec, the same thing could happen. We can also schedule the second code to run every business day at the same time so that the intervals between reports is consistent.
Next time I will tackle gathering the size of every database on the system dynamically.