Gathering Space Usage Statistics

Thursday Sep 30th 2004 by Gregory A. Larsen
Share:

As a DBA, one of the DBA tasks you will need to perform is disk space capacity planning. Gregory Larsen discusses how disk space usage information can be a valuable resource in helping with capacity planning.

As a DBA, you will need to perform a number of different tasks to keep your databases and servers up and running. One of those DBA tasks you will need to perform is disk space capacity planning. You need to make sure there is enough disk space, so when one of your databases needs to grow it can find the disk space it needs. Depending on the growth patterns of your databases, you will more than likely need to purchase additional disk space periodically. One way to estimate your future disk space needs accurately, is to track the growth rate of your databases over time. This article will discuss how disk space usage information can be a valuable resource in helping with capacity planning.

SQL Server does not provide a simple solution to gathering database space usage statistics. In order to gather disk space usage statistics you will need to use both documented and undocumented methods available within SQL Server. Prior to discussing how to gather space usage statistics, let's first discuss how statistics can be used to help you with your disk space capacity planning.

Suppose your manager comes to you and says he has some extra money to buy you some additional disk space for your server, although, he does not want to buy any more than you might need to handle the growth of all your databases for the next 18 months. He needs you to come up with an estimate on how much disk space you will need for the next 18 months. Your manager expects you to base your estimate on some hard, actual disk space usage statistics. If you have been gathering disk space usage space statistics for your databases overtime this would be a very easy task. However, if you do not have any hard disk space usage information then you would not be able to comply with your manager's request.

Let's see how disk space statistics can help you provide your manager with how much your databases will grow in the next 18 months. To calculate disk space growth you need to have some actual disk space usage information for two different points in time. Suppose you wrote down how much each database was using during the first week of June 2004. Then during the first week of August, you once again recorded the amount of disk space each of your databases was using. Now to calculate your disk space usage for the next 18 months you could use the following formula:

Disk Usage Estimate = ((SUM(ADB1+ADB2+ADB3+...+ADBn) - SUM(JDB1+JDB2+JDB3+...+JDBn))/2) * 18

Where Disk Usage Estimate is your calculated estimate of the amount of disk space needed for the next 18 months, ADBx is the amount of space used in August for DBx where x is a value between 1 and n, where n represent a different space usage amount for each database on your server. JDBx is the amount of space used in June for DBx where x is a value between 1 and n, where n represents a different space usage amount for each database on your server. The "2" in the above formula represents the number of months between the two different dates where disk space usage information was collected. By subtracting the amount of space used in August by the amount of space used in June then dividing that number by "2" gives you the average space used in a single month. By taking the calculated monthly space usage number and multiplying it by 18 gives you the disk space usage estimate for 18 months. Now that you know how to calculate the amount of disk space used between two dates, let's discuss how to capture disk space usage information by database for SQL Server.

Most people are familiar with how to use Enterprise Manager to determine how much disk space is used by a database. To use Enterprise Manager, click on the "View" menu item and then click on the "Taskpad" item from the view pull down menu. Doing this will display the "Space Allocated" information in the right pane of the Enterprise Manager display. On the "Space Allocated" display, you can see the amount of disk space used and allocated for both "Data" and the "Transaction log."

So how does Enterprise Manager come up with the data for this space usage display? If you turn on Profiler, then have Enterprise Manager display the "Taskpad" information, then the method that Enterprise Manager uses should be revealed. Profiler will show you that Enterprise manager uses two commands to get the hard disk space information for the "Space Allocated" display. These two commands are "DBCC sqlperf(logspace)" and "DBCC showfilestats". The "DBCC sqlperf(logspace)" command is documented in Books Online, but the "DBCC showfilestats" is an undocumented DBCC command. If you run these commands in Query Analyzer, you will find that Enterprise Manager manipulates the data returned in order to build the "Space Allocated display. Let's review how Enterprise manager uses these two commands to calculate the amount of disk space used and allocated to the Data and Transaction log files.

First, let's look at the output of the "DBCC sqlperf(logspace)" command. When you run this command, it returns a record set. The record set contains one row per database, and each row in the record set contains the following information: Database Name, Log Size (MB), Log Space Used (%), and Status. Using the information returned from the "DBCC sqlperf(logspace)" command you can determine the allocated and used disk space amount for each database's transaction log. The "Log Size (MB)" column identifies the number of megabytes allocated to the transaction log. So to calculate the amount of space that the actual transaction log is using, you need to use the following formula:

Transaction Log Space Used = Log Size (MB) * Log Space Used (%).

By taking the total size of the transaction log and multiplying it by the percentage of log space used, you can calculate the amount the actual transaction log is using.

Now let's look at how the "DBCC showfilestats" command can be used to get the amount of data space used by a database. If you run this command in Query Analyzer for you will find it returns the following information for the current database: Fileid, FileGroup, TotalExtents, UsedExtents, Name and FileName. The "TotalExtents" and "UsedExtents" columns can be used to calculate the amount of space allocated and used for data file for the current database. An "Extent" in SQL Server is 64 K. So the amount of space in megabytes allocated to Data file can be calculated by using the following formula:

Space Allocated for Data = TotalExtents * 64.0 / 1024.0

And the amount of space in megabytes used by Data can be calculated using the following formula:

Space Used for Data = UsedExtents * 64.0 / 1024.0

By using these two DBCC commands, you get the amount of space allocated and used for each of your databases. Now manually calculating the amount of space used for each of your database would be a labor-intensive process. This is why we have computers so we can automate these manual mathematical calculations. Let's review a process I built that automatically gathers this information.

I have built a process that uses both of these two DBCC commands described above and saves the spaces usage information into a SQL Server table. This way I can use this table to determine the growth rates of each of my databases over time. The process I built contains three pieces: a database table, a stored procedure, and a SQL Server Agent job. Below I have described each of these pieces and provided you with the code for each piece.

The database table is used to hold my space usage information over time for each database on my server. Each row in this table will contain the space usage information for a single database for a single reporting period. Here is the code to create this table. Please note I have created this table in my DBA database (The DBA database is used for all DBA related tables and code):

if exists (select * from dbo.sysobjects 
   where id = object_id(N'[dbo].[DBSTATS]') and 
   OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[DBSTATS]
GO	
USE DBA -- Database for Table
go
CREATE TABLE [dbo].[DBSTATS] (
	[ID] [int] IDENTITY (1, 1) NOT NULL ,
	[RECORD_TYPE] [int] NOT NULL ,
	[DBNAME] [char] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
	[DATA_SIZE] [decimal](9, 2) NULL ,
	[DATA_USED] [decimal](9, 2) NULL ,
	[LOG_SIZE] [decimal](9, 2) NULL ,
	[LOG_USED] [decimal](9, 2) NULL ,
	[STAT_DATE] [datetime] NULL 
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[DBSTATS] ADD 
	CONSTRAINT [DF_DBSTATS_STAT_DATE] DEFAULT (getdate()) FOR [STAT_DATE]
GO

I created a stored procedure named "usp_get_dbstats" that processes through the system tables and generates a script that will populate my database table (DBSTATS) with space usage information. Running this stored procedure does not populate rows of data in the DBSTATS table, but as I said, it only generates a script to populate this table. The generated script, when run, will look at space allocation information for each database on my server and insert one row into the DBSTATS table for each database on the server. Here is the code for this stored procedure:

if exists (select * from dbo.sysobjects where id = 
object_id(N'[dbo].[usp_get_dbstats]') and OBJECTPROPERTY(id, N'IsProcedure') = 1)
drop procedure [dbo].[usp_get_dbstats]
GO

SET QUOTED_IDENTIFIER OFF 
GO
SET ANSI_NULLS ON 
GO


CREATE procedure usp_get_dbstats as
--
-- This stored procedure was written by Greg Larsen 
-- Date: 05/21/2001
--
-- Description:
--  This stored procedure generates SQL commands that uses DBCC SQLPERF and 
--  undocumented DBCC SHOWFILESTATS.  These generated commands are used 
--  to gather database disk usage information. The code generated will take the 
--  statistics generated by the DBCC commands and store them in TEMPDB tables.  
--  The statistics in the TEMPDB tables are then selected, manipulated and 
--  eventually placed in table DBSTATS, which is in a database specified by variable @DBSTATS_DB.  
--  The DBSTATS table then can be used to 
--  perform space usage trend analysis.
--

-- Declare variable to hold default database were growth statics table DBSTATS will be stored.
declare @DBSTATS_DB char(3)
set @DBSTATS_DB = 'DBA'

--
-- Begin "Gather Log Space Usage Information" Section
--

-- Declare variable to hold DBCC command
print 'declare @cmd nvarchar(1024) '

-- If temporary table  to hold transaction log size information for all databases 
-- exists drop
   print 'if exists (select * from tempdb..sysobjects where id = object_id(N' + 
char(39) + '[tempdb]..[#tmplg]' + char(39) + '))'
   print 'drop table #tmplg'

-- Create temporary table to hold transaction log size information for all databases  
Print 'CREATE TABLE #tmplg'
print '('
print 'DBName varchar(32),'
print 'LogSize real,'
print 'LogSpaceUsed real,'
print 'Status int'
print ')'

-- Build command string to get transaction log size information
print 'SELECT @cmd = ' + char(39) + 'dbcc sqlperf (logspace)' + char(39)  

-- Execute command to get transaction log size information and records returned are to be 
-- inserted into temporary table #Tmplg
print  'INSERT INTO #Tmplg EXECUTE (@cmd)'

--
-- End "Gather Log Space Usage Information" Section
--

--
-- Begin "Create Temporary Statistics Holding Table" Section
--

-- Drop temporary table for holding data and log size information if it exists
   print 'if exists (select * from tempdb..sysobjects where id = object_id(N' + 
 char (39) + '[tempdb]..[#tmp_stats]' + char(39 ) + '))'
   print 'drop table #tmp_stats'


-- Create temporary table for holding data and log size information
print 'create table #tmp_stats ('
print 'totalextents int, '
print 'usedextents int,'
print 'dbname varchar(40),'
print 'logsize real,'
print 'logspaceused real'
print ')'
print 'go'  --
--End "Create Temporary Statistics Holding Table" Section
--

--
--Begin "Generate DBCC SHOWFILESTATS Commands" Section
--

-- Declare cursor for holding the names of all databases on the server
DECLARE AllDatabases CURSOR FOR

-- Get the names of all databases
SELECT name FROM master..sysdatabases 
  
-- Open the cursor that holds the names of all databases on the server
OPEN AllDatabases

-- Declare variable for holding the name of the current database being processed  
DECLARE @DB NVARCHAR(128)

--  Get the name of the first database
FETCH NEXT FROM AllDatabases INTO @DB

-- Process all databases until there are no more
WHILE (@@FETCH_STATUS = 0)

BEGIN
-- switch to the database where stats will be gathered
   print 'use [' + @DB + ']'
   print 'go'

-- drop temporary table that will store data space used information if it exists
   print 'if exists (select * from tempdb..sysobjects where id = object_id(N' + 
char(39) + '[tempdb]..[#tmp_sfs]' + char(39) + '))'
   print 'drop table #tmp_sfs'


-- create temporary table that will store data space used information
   print 'create table #tmp_sfs ('
   print 'fileid int,'
   print 'filegroup int, '
   print 'totalextents int, '
   print 'usedextents int,'
   print 'name varchar(1024),'
   print 'filename varchar(1024)'
   print ')'
   print 'go'

-- Declare variable to hold the DBCC command
   print 'declare @cmd nvarchar(1024)'

-- Set up command for undocumented DBCC SHOWFILESTATS command
   print 'set @cmd=' + char(39) + 'DBCC SHOWFILESTATS' + char(39)

-- execute DBCC SHOWFILESTATS command and place records returned into temporary table
   print 'insert into #tmp_sfs execute(@cmd)'

-- Declare variables to hold total log sizes for a database
   print 'declare @logsize real '
   print 'declare @logspaceused real '

-- Get the logsize of a database
   print 'select @logsize= logsize from #tmplg where dbname = ' + char(39) + @DB + char(39) 

-- Calculate the megabytes of space used in the log .  
   print 'select @logspaceused = (logsize*logspaceused)/100.0'
   print '      from #tmplg where dbname = ' + char(39) + @DB + char(39) 
-- build command to insert current space usages record for database into temporary table
   print 'set @cmd = ' + char(39) + 'insert into #tmp_stats' + char(39) + ' +' 
   print '     ' + char(39) + 
'(totalextents,usedextents,dbname,logsize,logspaceused)' + char(39) + ' +' 
   print '     ' + char(39) + ' select sum(totalextents), sum(usedextents),' + 
 char(39) + ' + char(39) + ' + char(39) + @DB + char(39) + '+ char(39) + ' + 
 char(39) + ',' + char(39) + ' + ' 
   print ' cast(@logsize as varchar) + ' + char(39) + ',' + char(39) + ' + 
   cast(@logspaceused as varchar) +' 
   print ' ' + char(39) + ' from #tmp_sfs' + char(39)

-- Insert current space usage record for database being process into temporary table
   print 'exec sp_executesql @cmd'

-- Get next database to process
   FETCH NEXT FROM AllDatabases INTO @DB
END --(@@FETCH_STATUS = 0)

--
--End "Generate DBCC SHOWFILESTATS Commands" Section
--

--
--Begin "Save Space Usage Statistics to DBSTATS Table"  Section
--

-- Insert a space usage statistics record for each database into the 
-- historical space usage table DBSTATS
print 'INSERT INTO ' + @DBSTATS_DB + '.dbo.DBSTATS '
print '  (RECORD_TYPE, DBNAME, DATA_SIZE, DATA_USED, LOG_SIZE, LOG_USED)'
print '    SELECT 1,dbname,totalextents*64/1024 , usedextents*64/1024 ,'
print '           logsize ,logspaceused from #tmp_stats'

--
--End "Save Space Usage Statistics to DBSTATS Table"  Section
--
  
-- close and deallocate cursor to hold name of all databases on server
CLOSE AllDatabases
DEALLOCATE AllDatabases

GO
SET QUOTED_IDENTIFIER OFF 
GO
SET ANSI_NULLS ON 
GO


The last piece of my process is a SQL Server agent job. This job is a multiple step job. The first step executes the usp_get_dbstats stored procedure and generates an output file that contains the script to populate the DBSTATS table. The second step of this job executes the generated script, which when executed inserts one row of space usage information into my DBSTATS table for each database on the server. This job can be run daily, week, monthly depending on how often you want to capture space usage statistics. Here is the code to create the SQL Server agent job:

BEGIN TRANSACTION            
  DECLARE @JobID BINARY(16)  
  DECLARE @ReturnCode INT    
  SELECT @ReturnCode = 0     
IF (SELECT COUNT(*) FROM msdb.dbo.syscategories WHERE name = N'[Uncategorized (Local)]') < 1 
  EXECUTE msdb.dbo.sp_add_category @name = N'[Uncategorized (Local)]'

  -- Delete the job with the same name (if it exists)
  SELECT @JobID = job_id     
  FROM   msdb.dbo.sysjobs    
  WHERE (name = N'Get DBSTATS')       
  IF (@JobID IS NOT NULL)    
  BEGIN  
  -- Check if the job is a multi-server job  
  IF (EXISTS (SELECT  * 
              FROM    msdb.dbo.sysjobservers 
              WHERE   (job_id = @JobID) AND (server_id <> 0))) 
  BEGIN 
    -- There is, so abort the script 
    RAISERROR (N'Unable to import job ''Get DBSTATS'' since there is already a 
multi-server job with this name.', 16, 1) 
    GOTO QuitWithRollback  
  END 
  ELSE 
    -- Delete the [local] job 
    EXECUTE msdb.dbo.sp_delete_job @job_name = N'Get DBSTATS' 
    SELECT @JobID = NULL
  END 

BEGIN 

  -- Add the job
  EXECUTE @ReturnCode = msdb.dbo.sp_add_job @job_id = @JobID OUTPUT , @job_name = 
N'Get DBSTATS', @owner_login_name = N'sa', @description = N'No description 
available.', @category_name = N'[Uncategorized (Local)]', @enabled = 1, 
@notify_level_email = 0, @notify_level_page = 0, @notify_level_netsend = 0, 
@notify_level_eventlog = 2, @delete_level= 0
  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback 

  -- Add the job steps
  EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 1, 
@step_name = N'generate TSQL script', @command = N'osql -E -Sserver1 -
Q"dba.dbo.usp_get_dbstats"  -o  c:\temp\get_dbstats.sql', @database_name = N'', 
@server = N'', @database_user_name = N'', @subsystem = N'CmdExec', 
@cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, 
@output_file_name = N'', @on_success_step_id = 0, @on_success_action = 3, 
@on_fail_step_id = 0, @on_fail_action = 2
  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback 
  EXECUTE @ReturnCode = msdb.dbo.sp_add_jobstep @job_id = @JobID, @step_id = 2, 
@step_name = N'get_stats', @command = N'osql -E -Sserver1 -i 
c:\temp\get_dbstats.sql    -o c:\temp\get_dbstats.rpt', @database_name = N'', 
@server = N'', @database_user_name = N'', @subsystem = N'CmdExec', 
@cmdexec_success_code = 0, @flags = 0, @retry_attempts = 0, @retry_interval = 1, 
@output_file_name = N'', @on_success_step_id = 0, @on_success_action = 1, 
@on_fail_step_id = 0, @on_fail_action = 2
  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback 
  EXECUTE @ReturnCode = msdb.dbo.sp_update_job @job_id = @JobID, @start_step_id = 1 

  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback 

  -- Add the job schedules
  EXECUTE @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id = @JobID, @name = 
N'Weekly on Sunday', @enabled = 1, @freq_type = 8, @active_start_date = 20010628, 
@active_start_time = 0, @freq_interval = 1, @freq_subday_type = 1, 
@freq_subday_interval = 0, @freq_relative_interval = 0, @freq_recurrence_factor = 
1, @active_end_date = 99991231, @active_end_time = 235959
  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback 

  -- Add the Target Servers
  EXECUTE @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @JobID, @server_name = N'(local)' 
  IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback 

END
COMMIT TRANSACTION          
GOTO   EndSave              
QuitWithRollback:
  IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION 
EndSave: 

Conclusion:

When you need to forecast the growth rate of all your databases on a given server, it is nice to have some disk space usage statistics. I hope that I have given you the groundwork for establishing your own method of gathering disk space usage information. I hope that before your manager comes to you with a question about the growth rate of your databases you will have implemented a process to gather disk space usage information.

» See All Articles by Columnist Gregory A. Larsen

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved