Monday, June 30, 2008

FAQ : Explain a scenario which supports vertical partitioning of a table in SQL Server.

The scenario differs according to the SQL Server version you have (SQL 7.0, SQL 2000, SQL 2005). I will cover SQL Server 2000 and 2005.

SQL Server 2000.
In SQL Server 2000 (and 2005) the IN_ROW_DATA portion of a row can be at most 8060 bytes. If you have a table with four VARCHAR(3000) columns, you can create the table, but an insert will fail if the row being inserted exceeds 8060 bytes. So what we generally do is vertically partition the table into two or more tables, as the requirement demands, and keep a ONE to ONE relation between all the tables. This is a valid reason to partition your table vertically. When you do vertical partitioning, try to keep the most commonly used, small-size columns in a single table.
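The split described above can be sketched as follows; the table and column names are purely illustrative, and the ONE to ONE relation is kept by making the second table's primary key a foreign key back to the first:

```sql
-- Hypothetical vertical partition of a wide table (SQL Server 2000).
-- Each VARCHAR(3000) column alone fits in a row, but several together
-- could exceed the 8060-byte row limit on insert.
CREATE TABLE ProductMain
(
    ProductID   INT NOT NULL PRIMARY KEY,
    ProductName VARCHAR(100) NOT NULL,
    ShortDesc   VARCHAR(3000) NULL      -- small, frequently used columns here
)

CREATE TABLE ProductDetail
(
    ProductID  INT NOT NULL PRIMARY KEY
               REFERENCES ProductMain (ProductID),  -- enforces ONE to ONE
    LongDesc   VARCHAR(3000) NULL,
    TechNotes  VARCHAR(3000) NULL
)
```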

SQL Server 2005
In SQL Server 2005, the above-mentioned SQL Server 2000 problem no longer exists because of a storage architecture change called ROW_OVERFLOW_DATA. That is, in SQL Server 2005 you can have a row that exceeds 8060 bytes, provided the columns are variable-length types (VARCHAR, NVARCHAR). What the database engine internally does is keep the variable-length column data in ROW_OVERFLOW_DATA pages. Precisely, the row-size limitation applies only to fixed-size columns like CHAR and NCHAR. So the SQL Server 2000 scenario of partitioning a table because the row size exceeds 8060 bytes is not valid in SQL Server 2005.
But there is still a valid reason to vertically partition a table in SQL Server 2005. If you are using the ONLINE INDEX feature of SQL Server 2005, you cannot have LOB data as part of the index at the leaf level, yet you may want the index because of the performance benefit it provides. In that case the best method is to partition the table vertically: keep the small, most frequently used columns in one table, and the columns that are referred to less frequently (like a detailed product description, which may not be asked for often by the user) in another. Since the first table then has no LOB data, you can use the ONLINE INDEX feature on it.
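As a sketch, assuming a table dbo.Product whose indexed columns contain no LOB data (the index and table names are hypothetical; ONLINE = ON also requires Enterprise Edition):

```sql
-- Hypothetical: this online rebuild succeeds because the index
-- contains no LOB columns. With a LOB column (e.g. VARCHAR(MAX))
-- in the index, ONLINE = ON raises an error in SQL Server 2005.
ALTER INDEX IX_Product_Name
ON dbo.Product
REBUILD WITH (ONLINE = ON)
```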

Sunday, June 29, 2008

Best Practices - Datatype Selection while designing tables

• If your database is to support a web-based application, it is better to go for Unicode datatypes (Unicode types like NCHAR and NVARCHAR take 2 bytes per character, whereas ASCII datatypes take 1 byte per character), because you may have to support different types of clients.
• If your application is multi-lingual, go for Unicode.
• If you are planning to include a CLR datatype (SQL Server 2005) in the database, go for Unicode datatypes instead of ASCII datatypes, because if a CLR datatype is going to consume the data, it must be in Unicode.
• For a numeric column, find the range the column is going to hold and then choose the datatype. For example, if you are sure the column cannot go above 255 (like DepartmentID in a small organization, which will probably not go beyond 10 or 20), it is recommended to choose the TINYINT datatype. Making all integer columns INT without analyzing the range they must support is not at all recommended from a storage perspective.
• Description/Comments/Remarks sorts of columns may or may not have data for every row, so it is better to go for variable-length datatypes like VARCHAR and NVARCHAR.
• If you know the column is not nullable and will contain more or less the same size of data, go for a fixed-width datatype like CHAR or NCHAR. Having said that, it is important to know that if you select a fixed-width datatype and the column is nullable, the column consumes its full space even when it holds no data (NULL).
• If the size of the column is less than 10 characters, use a fixed-width datatype like NCHAR or CHAR.
• I have seen many applications use DECIMAL to store currency data even though the application needs less precision than MONEY supports. My point is: use the MONEY datatype if four decimal places are enough.
• Use a UNIQUEIDENTIFIER column as the PK and clustered index only when it is unavoidable, because UNIQUEIDENTIFIER takes 16 bytes of space.
Note : The point I want to make here is, if you do a proper analysis of the data and then select the datatype, you can control the row, page, and table size, and hence improve performance.
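The guidelines above can be illustrated with a small table; every name here is hypothetical:

```sql
CREATE TABLE Employee
(
    EmployeeID   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    DepartmentID TINYINT NOT NULL,        -- range analyzed: never above 255
    CountryCode  CHAR(2) NOT NULL,        -- short, fixed width, not nullable
    EmployeeName NVARCHAR(100) NOT NULL,  -- Unicode for multi-lingual data
    Remarks      VARCHAR(500) NULL,       -- often empty, so variable width
    Salary       MONEY NOT NULL           -- 4 decimal places are enough
)
```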

FAQ : How to find the Index Creation /Rebuild Date in SQL Server

AFAIK, there is no system object that gives you the index creation date in SQL Server. If the clustered index is on the PK, the creation date can be obtained from sysobjects or sys.objects, but that is not always the case.

This query uses the STATS_DATE() function to get the date the statistics were last updated. It will not give you an accurate result if you update statistics explicitly. The logic behind the query is: when you rebuild indexes, the statistics are updated at the same time. So, if you are not explicitly updating statistics with the UPDATE STATISTICS tableName command, this query will give you the correct information.


--In SQL Server 2000
Select Name as IndexName,
STATS_DATE ( id , indid ) as IndexCreatedDate
From sysindexes where id=object_id('HumanResources.Employee')

-- In SQL Server 2005
Select Name as IndexName,
STATS_DATE ( object_id , index_id ) as IndexCreatedDate
From sys.indexes where object_id=object_id('HumanResources.Employee')

Sunday, June 22, 2008

SQL Server 2008 Express RC0 now available

Tuesday, June 10, 2008

Upgrading SQL Server from lower version to higher version

Upgrade methodology

(a) First step: before going for any upgrade, take a full backup of the existing databases, including the system databases. If anything goes wrong you should have a full backup to fall back on.
(b) Run Upgrade Advisor : You must run this tool before upgrading. It analyzes the existing databases on the lower version and points out potential problems, which may arise because a feature is removed or deprecated in the newer version. If Upgrade Advisor reports a critical problem, you can address it before upgrading.
(c) Once you find there are no major issues reported by Upgrade Advisor, you can plan your upgrade. You must document each and every step you follow. Also ensure that you have a rollback plan in case anything goes wrong. You must have a proper testing script as well.
(d) You have two choices in upgrade.
In-place upgrade : When you upgrade an earlier SQL Server release in place through the installer's upgrade process, all existing application connections remain the same because the server and server instance do not change. This approach requires a more thorough fallback plan and more testing. In an in-place upgrade, logins and users remain in sync, database connections remain the same for applications, and SQL Agent jobs and other functionality are upgraded concurrently during the installation. Note that several features, such as log shipping, replication, and clustered environments, have special upgrade considerations.

Side by side : This method is more or less a manual process, hence you have full control over it. You can test the upgrade (migration) by running a parallel system, prove it out, and then bring it online. The disadvantage is that you need additional resources (hardware/licenses). Side by side is generally better because the existing instance stays intact, but at times it may not be feasible because of hardware constraints or the instance name (for example, when the application demands the default instance). I would say this method is cleaner and more controllable.

(e) Once you have upgraded, test the system with the application under full load before going online.

Migrating Database in Side by Side Upgrade.
If you have chosen the side-by-side method, after installing the new version of SQL Server you must migrate the user databases from the older instance. There are three options available here.
(a) Backup/Restore : Best and easiest method. The benefit is that the source database can stay online; no downtime.
(b) Detach/Attach : You have downtime at the source, but it is an easy method.
(c) Copy Database Wizard : This tool internally does detach/attach only.

Migration of SQL Logins
This is one of the major disadvantages of a side-by-side upgrade. You must transfer the SQL logins explicitly, along with their passwords. Microsoft has provided a script for that. How to Transfer SQL Logins


Migration of SQL Scheduled Jobs
Simple: just script the jobs on the source server and run that script on the target server.

Migration of SSIS Packages
You can use the Save As option of an SSIS package: open the package, save it to the file system, and then migrate the file.

Tuesday, June 03, 2008

FAQ: How to see what is inside hidden SQL Server 2005 Resource Database (mssqlsystemresource) ?

We all know the master database of SQL Server 2000 was split into two databases in SQL Server 2005. One is kept as master, and the other is the Resource database, which is hidden. You can't see it in Management Studio or in any tool, but you can see the physical files in the Data (default) folder. The idea behind this separation of objects is to allow very fast and safe upgrades.

If you want to see what is inside this database follow these steps

(a) Stop SQL Server
(b) Copy/paste the mssqlsystemresource .mdf and .ldf files to a new location
(c) Create a new database by Attaching the files from the new location
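Step (c) can be written as a command like the following; the path and database name are illustrative, assuming the files were copied to C:\ResourceCopy in step (b):

```sql
-- Attach the copied files as an ordinary, visible user database
CREATE DATABASE ResourceCopy
ON (FILENAME = 'C:\ResourceCopy\mssqlsystemresource.mdf'),
   (FILENAME = 'C:\ResourceCopy\mssqlsystemresource.ldf')
FOR ATTACH
```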

Now you have the hidden database right in front….

FAQ : What is the difference between Lazywriter and Checkpoint in SQL Server

Lazywriter
The lazywriter thread sleeps for a specific interval of time. When it is restarted, it examines the size of the free buffer list. If the free buffer list is below a certain point, dependent on the size of the cache, the lazywriter thread scans the buffer cache to reclaim unused pages. It then writes dirty pages that have a reference count of 0. On the Windows 2000, Windows Server 2003, and Windows XP operating systems, most of the work populating the free buffer list and writing dirty pages is performed by the individual threads. The lazywriter thread typically has little to do.

Checkpoint
The checkpoint process also scans the buffer cache periodically and writes any dirty data pages to disk. The difference is that the checkpoint process does not put the buffer page back on the free list. The purpose of the checkpoint process is to minimize the number of dirty pages in memory to reduce the length of a recovery if the server fails. Its purpose is not to populate the free buffer list. Checkpoints typically find few dirty pages to write to disk, because most dirty pages are written to disk by the worker threads or the lazywriter thread in the period between two checkpoints

Refer
http://msdn.microsoft.com/en-us/library/aa175260(SQL.80).aspx
BOL
ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/udb9/html/4b0f27cd-f2ac-4761-8135-adc584bd8200.htm

How to assign the XML output of a SELECT * FROM YourTable FOR XML AUTO query to a variable

declare @DataXML xml
set @DataXML =(SELECT * FROM YourTable FOR XML AUTO, ELEMENTS)
select @DataXML

How to find the time taken (total elapsed time) by the query?

Though SQL Server provides the SET STATISTICS TIME option to find the total elapsed time taken by a query or an SP, I generally do not use it because, for a large number of statements, reading the STATISTICS TIME output is difficult. It is better to keep your own script to get the elapsed time. Here is the script:

DECLARE
@total_elapsed_time VARCHAR(100),
@start_time Datetime,
@complete_time Datetime

SELECT @start_time = GETDATE()
Print @start_time
-- Paste the query which you want to execute here
select * From sysobjects, sys.tables -- Replace this query with your query/SP

Set @complete_time=getdate()

SELECT
@total_elapsed_time = 'Total Elapsed Time (minutes:seconds) ' +
CONVERT(CHAR(3),
DATEDIFF(SS,@start_time,@complete_time)/60) +
':' +
CONVERT(CHAR(3),
DATEDIFF(SS,@start_time,@complete_time)%60)
print @total_elapsed_time

Note : If you want to know how long compilation and optimization took, use SET STATISTICS TIME.

Sunday, June 01, 2008

Cumulative update package 7 for SQL Server 2005 Service Pack 2

Check out Cumulative update package 7 for SQL Server 2005 Service Pack 2 here
http://support.microsoft.com/kb/949095/en-us.

Check the bug-fix list, and apply the cumulative hotfix in a production environment only if your application needs one of those fixes.

Friday, May 30, 2008

FAQ : SQL Server DBA as a career option

This is a very common question in SQL forums, hence I thought to blog my view.

If you have just passed out of college or are a newbie in the IT field, I would suggest going for the SQL Server MCTS (70-431) and MCITP DBA (70-443 and 70-444) certifications. Do not clear the certifications just for the sake of it: learn the techniques and best practices covered in the syllabus. You should also go through the virtual labs available on the Microsoft site. One great thing about Microsoft is that there are fantastic resources available on the internet free of cost; you just need an MSN Live ID to log in and use them. Also do not forget to download the SQL on-demand webcasts available on the Microsoft site. Mind you, the webcasts come from very experienced folks who have been working in this technology for decades, and also from the product teams who developed the products. Basically, reading, listening to experienced people (webcasts), and learning new technology will certainly give you a good launch in SQL Server. Always try to start with the newest product versions available in the market (at this point of time, 2005 and the 2008 CTP) so that you are a leader rather than a follower.

Those who are already in the IT field and want to switch to a DBA career can more or less follow the same steps mentioned above. The key point for a DBA, as in any field, is documentation: try to document day-to-day activities and be process oriented. A DBA job is risky and at the same time secure. Contradictory statements, right? Risky because you are working on production, and data is nothing less than god for a DBA; you could say data is your job, and if anything goes wrong, it may cost you that job. Secure, because database technology generally does not change drastically. Consider someone working in VB 6 who one fine day is told to work in .NET 2.0; he will surely have a tough time. But for a DBA, the jump from SQL Server 2000 to 2005 or 2008 may be a smooth drive compared to other technologies.


SQL Server 2008 Virtual Lab
http://technet.microsoft.com/en-us/cc164207.aspx
SQL Server 2005 Virtual Lab
http://technet.microsoft.com/en-us/bb499681.aspx
SQL Server 2000 Virtual Lab
http://technet.microsoft.com/en-us/bb499685.aspx

Microsoft ondemand Webcasts
http://www.microsoft.com/events/webcasts/ondemand.mspx

Information about all the certifications available for SQL Server 2005
http://madhuottapalam.blogspot.com/2007_10_01_archive.html

Wednesday, May 28, 2008

FAQ : What are the recommended events for a DTA / ITW workload

If you provide the same workload with different events and columns, DTA may give you a different report. For example, if you have not included the Duration column, DTA tunes the events in the order they appear in the workload; if the workload contains the Duration column, DTA tunes the events in descending order of duration. So, to get better results from DTA, include the recommended events and columns.

Events Recommended
RPC:Completed
RPC:Starting
SP:StmtCompleted
SP:StmtStarting
SQL:BatchCompleted
SQL:BatchStarting
SQL:StmtCompleted
SQL:StmtStarting

FAQ : How to run DBCC DBREINDEX against all the user tables in SQL Server

Simple: use the undocumented stored procedure sp_msforeachtable. Since it is undocumented, there is no guarantee that it will be available in all future versions, but it works in SQL Server 2000 and 2005.

Use YourdatabaseName
EXEC sp_msforeachtable 'DBCC DBREINDEX( ''?'')'

FAQ : How triggers in SQL Server 2005 impact Tempdb

In SQL Server 2005, triggers are implemented using the version store feature. In earlier versions (SQL Server 2000 / SQL 7.0), the trigger's Inserted and Deleted tables were built by reading the transaction log. But in SQL Server 2005, the inserted and deleted data is stored in tempdb as row versions. So, if you have many triggers in SQL Server 2005, they may impact tempdb performance.
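For example, in a trigger like the sketch below (all names hypothetical), every read of the inserted and deleted tables in SQL Server 2005 is served from the version store in tempdb rather than from the transaction log:

```sql
CREATE TRIGGER trg_Product_PriceAudit ON dbo.Product
AFTER UPDATE
AS
BEGIN
    -- inserted/deleted are materialized as row versions in tempdb
    INSERT INTO dbo.PriceAudit (ProductID, OldPrice, NewPrice)
    SELECT d.ProductID, d.Price, i.Price
    FROM deleted d
    JOIN inserted i ON i.ProductID = d.ProductID
END
```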

FAQ : How to truncate and shrink Transaction Log file in SQL Server

First of all, truncating the transaction log is not a recommended practice, but it is unavoidable if you have not kept a proper backup policy and recovery model for your database. It is always better to know the cause of, and prevention for, transaction log size issues. Refer to the following articles for more info:

Managing the Size of the Transaction Log File
http://msdn.microsoft.com/en-us/library/ms365418(SQL.100).aspx
Transaction Log Physical Architecture
http://msdn.microsoft.com/en-us/library/ms179355(SQL.100).aspx
Factors That Can Delay Log Truncation
http://msdn.microsoft.com/en-us/library/ms345414(SQL.100).aspx


Now coming to the point. If there is no space left on the drive where the log file is kept and the size of the transaction log file is not manageable, then it is better to shrink the log.

Broadly, you have two steps here.
(a) Mark the inactive part of the transaction log for release.
(b) Release the marked portion of the transaction log to the OS.

SQL Server 2005

-- Step 1 – Mark the inactive part of the log for release

Use YourDatabaseName
Go
Backup Log YourDatabaseName With Truncate_Only
GO

-- Step 2 - Release the marked space to OS


Declare @LogFileLogicalName sysname
select @LogFileLogicalName=Name from sys.database_files where Type=1
print @LogFileLogicalName

DBCC Shrinkfile(@LogFileLogicalName,100)

Note : If you have a single log file, the above script will work as-is. If you have multiple log files, change the script accordingly.

SQL Server 2000

-- Step 1 – Mark the inactive part of the log for release

Use YourDatabaseName
Go
Backup Log YourDatabaseName With Truncate_Only
GO

-- Step 2 - Release the marked space to OS

Declare @LogFileLogicalName sysname
select @LogFileLogicalName=Name from sysfiles where filename like '%.ldf'
print @LogFileLogicalName

DBCC Shrinkfile(@LogFileLogicalName,100)


Note : If you have a single log file and its extension is .LDF, the above script will work as-is. If you have multiple log files, change the script accordingly.

SQL Server 2008

In SQL Server 2008 this process has changed: BACKUP LOG ... WITH TRUNCATE_ONLY is no longer available. Just change the recovery model to SIMPLE and then use the DBCC SHRINKFILE command.

select name,recovery_model_desc from sys.databases
GO
Alter database YourDatabaseName Set Recovery Simple
GO
Declare @LogFileLogicalName sysname
select @LogFileLogicalName=Name from sys.database_files where Type=1
print @LogFileLogicalName

DBCC Shrinkfile(@LogFileLogicalName,100)

Tuesday, May 27, 2008

FAQ : How to search for an object in all the databases

SQL Server 2005

CREATE TABLE #TEMP (TABLENAME SYSNAME, OBJECTNAME SYSNAME,TYPE CHAR(10))

INSERT INTO #TEMP

EXEC SP_MSFOREACHDB "SELECT '?' DATABASENAME, NAME,TYPE FROM ?.SYS.ALL_OBJECTS WHERE NAME='YourSearchingObjectName'"

SELECT * FROM #TEMP

DROP TABLE #TEMP

SQL Server 2000

CREATE TABLE #TEMP (TABLENAME SYSNAME, OBJECTNAME SYSNAME,TYPE CHAR(10))

INSERT INTO #TEMP

EXEC SP_MSFOREACHDB "SELECT '?' DATABASENAME, NAME,XTYPE FROM ?..SYSOBJECTS WHERE NAME='YourSearchingObjectName'"

SELECT * FROM #TEMP

DROP TABLE #TEMP


Note : Replace "YourSearchingObjectName" in the query with the name of the object you are searching for.

FAQ : Index Scan Vs Seek in SQL Server

There are five logical/physical operators in SQL Server related to index scans, index seeks, and table scans:

(a) Table Scan
(b) Clustered Index Scan
(c) Clustered Index Seek
(d) Index Scan (Non-Clustered Index Scan)
(e) Index Seek (Non-Clustered Index Seek)

Table Scan
A Table Scan retrieves all rows from the table (if you have no WHERE condition). Basically, before returning the rows, it traverses all the data pages related to the table. If you have a WHERE condition, it still travels through all the pages, but only the rows that satisfy the condition are returned. A table scan happens when you do not have a clustered index on the table. In other words, a Clustered Index Scan (the clustered index is nothing but the data itself) and a Table Scan are effectively the same, because in both methods the system traverses all the data pages. Generally, you should avoid table scans.

Clustered Index Scan

A Clustered Index Scan is nothing but horizontally traversing the clustered index data pages. It returns all the rows from the clustered index (which is nothing but the data). If you have a WHERE condition, only the satisfying rows are returned, but the system still traverses all the data pages of the clustered index. Both Table Scan and Clustered Index Scan are generally considered bad, but at times, such as when the table is small and contains only a few rows, a table or clustered index scan may be fine.

Clustered Index Seek
A Clustered Index Seek traverses vertically, right down to the data page where the requested data is stored. Basically, any seek is a vertical traversal of the B-tree structure (as we all know, indexes are stored as B-trees in SQL Server). The system does a seek when it finds a useful index, and generally this happens for highly selective queries.

Index Scan or Non-Clustered Index Scan
As already said, a scan is a horizontal traversal of B-tree data pages, but in this case the system traverses an available non-clustered index. It is not the same as a Clustered Index Scan or a Table Scan. While reading an execution plan in SQL Server, you will see only "Index Scan", not "Non-Clustered Index Scan"; you should read Index Scan as Non-Clustered Index Scan.

Index Seek or Non-Clustered Index Seek
As already mentioned, a seek is a vertical traversal of the B-tree to the data page; an Index Seek is a vertical traversal of a non-clustered index. Generally, it is considered the best option for a highly selective query.
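To see the difference, compare the plans of queries like these (the Orders table, its clustered index on OrderID, and its non-clustered index on CustomerID are all assumed for illustration):

```sql
-- Clustered Index Seek: highly selective predicate on the clustering key
SELECT * FROM dbo.Orders WHERE OrderID = 42

-- Index Seek (non-clustered): selective predicate on an indexed column
SELECT OrderID FROM dbo.Orders WHERE CustomerID = 1001

-- Clustered Index Scan: no useful predicate, all data pages are read
SELECT * FROM dbo.Orders
```

Check the actual execution plan (Ctrl+M in Management Studio) to confirm which operator the optimizer chose.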

FAQ : What is the difference between DELETE TABLE and TRUNCATE TABLE commands

TRUNCATE

• Fewer transaction log entries, because TRUNCATE TABLE removes the data by deallocating the data pages used to store the table data and records only the page deallocations in the transaction log; hence TRUNCATE is fast
• TRUNCATE applies fewer table locks
• TRUNCATE is a DDL statement
• TRUNCATE releases the table's space back to the system
• TRUNCATE cannot have a WHERE condition
• TRUNCATE does not fire triggers
• TRUNCATE resets the identity to its seed value (if the table has one)
• TRUNCATE cannot be used against tables involved in transactional replication or merge replication
• TRUNCATE cannot be used against a table used in an indexed view
• TRUNCATE cannot be used against tables that are referenced by a FOREIGN KEY constraint
• TRUNCATE commands are not tracked by DDL triggers

Note : TRUNCATE can be rolled back if it is issued inside an explicit transaction. I have seen many places where it is mentioned that it cannot.
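A quick demonstration (the table name is illustrative):

```sql
CREATE TABLE dbo.TruncateDemo (ID INT)
INSERT INTO dbo.TruncateDemo VALUES (1)

BEGIN TRANSACTION
    TRUNCATE TABLE dbo.TruncateDemo
ROLLBACK TRANSACTION

-- The row is back: the page deallocations were undone by the rollback
SELECT COUNT(*) FROM dbo.TruncateDemo
DROP TABLE dbo.TruncateDemo
```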

DELETE

• The DELETE FROM Table command logs each record in the transaction log; hence DELETE is slow
• DELETE applies more locks to the table
• DELETE is a DML command
• DELETE removes the records but does not release the space back to the system
• DELETE can have WHERE conditions
• DELETE fires triggers
• DELETE does not reset the identity
• DELETE can be used against tables involved in transactional replication or merge replication
• DELETE can be used against tables referenced by a FOREIGN KEY constraint and tables involved in an indexed view

FAQ : How to move a physical file (MDF or LDF) from one location to another in SQL Server 2005

In SQL Server 2000, if you wanted to move a physical file of a database from one location to another, you had to detach and attach the database. In SQL Server 2005, this process has been made very simple: take the database offline, alter the file path using the ALTER DATABASE command, copy the database file to the new location manually, and finally bring the database online. Simple.

--Step 1 : Create the database and check the database file location
Create Database TestMoveFile
GO
Select * From Sys.master_files where database_id=db_id('TestMoveFile')
GO
--Step 2 : Alter Database and Set the db to offline
Alter Database TestMoveFile Set Offline
GO
-- Step 3 : Move the physical file to new location
--Move the file to new location using dos command or Windows GUI

--Step 4 : Alter the database file path using Alter Database command
Alter Database TestMoveFile Modify File(Name='TestMoveFile',FileName='c:\TestmoveFile.mdf')
Go

-- Step 5 : Set the database Online and check the file path
Alter database TestMoveFile Set Online
GO
Select * From Sys.master_files where database_id=db_id('TestMoveFile')

Go
Drop database TestMoveFile

FAQ : How can we know the progress of a maintenance command using DMVs

It is very common, when we run a DBCC maintenance command, to want to know how long it will take or what percentage of the process is complete. Here we go…


Select R.Session_id,R.Command,R.Percent_complete
From
sys.dm_exec_requests R
Inner Join
Sys.dm_exec_sessions S
on S.Session_id=R.Session_ID and S.IS_User_Process=1

Percent_complete is reported for DBCC CheckDB, DBCC CheckTable, DBCC CheckFilegroup,
DBCC IndexDefrag, and DBCC Shrinkfile.

Wednesday, May 21, 2008

FAQ : How to clear SQL Server cache / memory

Warning : Not to be used in production env.

In a development or testing environment it is very common, during performance tuning, to clear the cache to get a correct picture. It may also be required that only the cache related to a particular database be cleared. Here we go...

(a) Clear entire procedure and databuffer cache

Checkpoint -- Write dirty pages to disk
DBCC FreeProcCache -- Clear entire proc cache
DBCC DropCleanBuffers -- Clear entire data cache

(b) Clear only a particular db procedure cache using undocumented DBCC command

Declare @DBID int
Select @DBID = db_id('YourDBname')
DBCC FLUSHPROCINDB(@DBID) -- Undocumented DBCC command to clear only one db's proc cache

FAQ - How to use Stored Procedure in Select Query

Very common question in T-SQL Forums

The requirement is to use stored procedure output (EXEC sp) in a SELECT query or in a join. There is a method, though it may not be a recommended one: you can create a loopback linked server pointing to your own server and use OPENQUERY to extract the data by executing the stored procedure.

LoopBack Linked Server

Linked servers can be defined to point back (loop back) to the server on which they are defined. Loopback servers are most useful when testing an application that uses distributed queries on a single server network.

Eg.

SELECT * FROM OPENROWSET
('SQLOLEDB','Server=(local);TRUSTED_CONNECTION=YES;','set fmtonly off exec master.dbo.sp_who')

OR

sp_addlinkedserver @server = N'MyLink',
@srvproduct = N' ',
@provider = N'SQLNCLI',
@datasrc = N'MyServer',
@catalog = N'AdventureWorks'

Note :
@server = N'MyLink' : This cannot be your server name; it is just a name and can be anything other than your actual server name. If you give your server name for this parameter, you will get an error as follows:
Msg 15028, Level 16, State 1, Procedure sp_MSaddserver_internal, Line 89
The server 'LHI-115' already exists.

@datasrc = N'MyServer' : This parameter value has to be your server name or IP.

@catalog =N'AdventureWorks' : This is the database in which the Stored procedure exists.


OPENQUERY to extract data from the loopback linked server

Select * from openquery([YourLoopbackServerName],'exec AdventureWorks.dbo.sptest')