Monday, June 30, 2008

FAQ : Explain a scenario which supports vertical partitioning of a table in SQL Server.

The scenario may be different according to the SQL Server Version you have (SQL 7.0, SQL 2000, SQL 2005). I will be covering SQL Server 2000 and 2005

SQL Server 2000.
In SQL Server (in 2000 and 2005) the IN_ROW_DATA or the row size of a table can have only max 8060 bytes. If you have a table which contains 4 columns of varchar (3000) then though you can create the table but if the data being inserted is more than 8060 bytes then the insert will fail. So what we generally do is, we vertically partition the table to two or more table as per the requirement and keep ONE to ONE relation between all the tables. So this is a valid reason to partition your table vertically. When you do vertical partitioning, try to keep most commonly used small size column in single table.

SQL Server 2005
In SQL Server 2005, the above mentioned problem of SQL Server 2000 is not there because of the storage architecture change called ROW_OVERFLOW_DATA. Ie. In SQL Server 2005 you can have a row which exceed 8060 bytes provided the columns are variable types(Varchar,nvarchar). What Database engine internally does is, it keep all the variable datatypes (varchar, nvarchar )columns data in ROW_OVERFLOW_DATA page. Precisely, the Row size limitation is only applicable to fixed size columns like CHAR, NCHAR. So SQL Server 2000 scenario of partitioning table because of the row size exceeds 8060 bytes is not valid in SQL Server 2005.
But there is a valid reason to do vertical partition of the table in SQL Server 2005. If you are using ONLINE INDEX feature of SQL Server 2005 then you cannot use LOB data as a part of index at the leaf level. And you want to use the LOB column in Index because of the performance benefit it provides. In that case best method is to partition the table vertically in such a way that, keep the small ,mostly used columns in single table (like product detailed description may not be asked frequently by the user) and the columns those are referred in less frequency in another table. Since you do not have LOB data in the table you can use ONLINE INDEX feature in the table.

Sunday, June 29, 2008

Best Practices - Datatype Selection while designing tables

• If your database is to support web-based application better to go for UNICODE (Unicode like nchar, nvarchar 2 bytes per char where as ASCII datatypes takes 1 bytes per char) datatypes because you may be going to support different types of clients.
• If your application is multi-lingual go for UNICODE.
• If you are planning to include CLRDatatype (SQL Server 2005) in the database go for UNICODE Datatypes instead of ASCII Datatypes, because, if CLRDatatype is going to consume the data then it must be in UNICODE.
• For numeric column, find the range that column is going to have and then choose the datatype. For eg. If you are sure that the column cannot have more than 255 like DepartmentID in a small organization may not go probably beyond 10 or 20. In such cases it is recommended to choose TinyINT datatype. Generally keeping all integer columns type INT without analyzing the range that going to support is not at all recommended from storage perspective.
• Description /Comments /Remarks sort of columns may or may not have data for all the rows. So it is better to go for Variable datatypes like Varchar ,Nvarchar.
• If you know the column is not nullable and it may contain more or less the same size of the data then for sure go for Fixed datatype like CHAR or NCHAR. Having said that it is important to know that, if you select fixed datatypes and if the column is nullable then if you donot have any data (null) then also the column will consume the space.
• If the size of the column is less than 10 char , use fixed width datatypes like NCHAR or CHAR.
• I have seen in many applications use Decimal to store currency kind of data though the application need less precision which can be supported by money. So, my point is, use Money datatype if you need only 4 precision.
• Use UniqueIdentitifier column as PK and CI or so only when it is unavoidable because UniqueIdentitifier takes 16 Bytes of the space.
Note : The point I want to make here is, if you do a proper analysis of the data and then select the datatype then you can control the row, page, table size and hence increase the performance.

FAQ : How to find the Index Creation /Rebuild Date in SQL Server

AFAIK, there is no system object gives you the information of Index creation Date in SQL Server. If the Clustered index is on PK then the creation date can get from sysobjects or sys.objects. But that is not the case always.

This query which uses STATS_DATE() function to get the STATISTICS updated date. This will not give you accurate result if you are updating STATISTICS explicitly. The logic behind the query is, if you rebuild indexes the STATISTICS are also being updated at the same time. So if you are not explicitly updating STATISTICS using UPDATE STATISTICS tableName command , then this query will give you the correct information

--In SQL Server 2000
Select Name as IndexName,
STATS_DATE ( id , indid ) as IndexCreatedDate
From sysindexes where id=object_id('HumanResources.Employee')

-- In SQL Server 2005
Select Name as IndexName,
STATS_DATE ( object_id , index_id ) as IndexCreatedDate
From sys.indexes where object_id=object_id('HumanResources.Employee')

Sunday, June 22, 2008

SQL Server 2008 Express RC0 now available

Tuesday, June 10, 2008

Upgrading SQL Server from lower version to higher version

Upgrade methodology

(a) First step, before going for any up gradation, you should take full backup of the existing databases including system databases. If anything goes wrong you should have a full backup to fall upon
(b) Run Upgrade advisor : You must run this tool before going for up-gradation. This tool will analyze the existing database in lower version and suggest the potential problems , which may be because of the feature is removed or deprecated in the newer version. Once the upgrade advisor projected any critical problem , you may address the problem before going for up-gradation.
(c) Once you found that there is no major issues reported by Upgrade Advisor, you can plan your up-gradation. You must document each and every step you follow. Also ensure that you have a rollback plan incase anything goes wrong. You must have a proper testing script as well.
(d) You have two choices in upgrade.
In Place upgrade : When you upgrade an earlier SQL Server release in-place through the install upgrade process, all existing application connections remain the same because the server and server instance do not change. This approach requires a more thorough fallback plan and testing. By performing an in-place upgrade, logins and users remain in-sync, database connections remain the same for applications, and SQL Agent jobs and other functionality is concurrently upgraded during the installation. Note that several features, such as log shipping, replication, and cluster environments, have special upgrade considerations.

Side By Side : This method is more or less manual process hence you have full control over the process. You can test the upgrade (migration) process by running parallel system , test it and once proven bring it online. But the disadvantage here is, you have to have additional resource (Hardware/licenses ) . Side by side will be always better since you have the existing instance intact. But at time it may not be feasible because of hardware constraint or Instance Name (like the application demands Default Instance. I would say this method is more clean and controllable.

(e) Once you have upgraded test the system with the application with full load before going online.

Migrating Database in Side by Side Upgrade.
(a) If you have chosen side by side method, after installation of new version of sql server you must migrate the user database from older version sql server instance. There are three options available here.
(a) Backup /Restore : Best and easy method. Benefit is source database can still be online , no downtime
(b) Detach / Attach : You have down time at the source but easy method.
(c) Copy Database Wizard : This tool internally does detach/attach only.

Migration of SQL Logins
This is one of the major disadvantage of Side by Side up-gradation. You must transfer the SQL Logins explicitly. You must transfer the login with the password. Microsoft has provided the script for that. How to Transfer SQL Logins

Migration of SQL Scheduled Jobs
Simple , just need to create the script from source and run that script in target server.

Migration of SSIS Packages
You can use save as option of SSIS package . You can open the package and save the package as file system and then migrate.

Tuesday, June 03, 2008

FAQ: How to see what is inside hidden SQL Server 2005 Resource Database (mssqlsystemresource) ?

We all know the master database in SQL Server 2000 was split into two databases in SQL Server 2005. One kept as Master only and the other database is Resource database which is hidden. You can’t see that in Management studio or in any tool. You can see the physical files in Data ( default )folder. The idea behind this separation of objects is to allow very fast and safe upgrades.

If you want to see what is inside this database follow these steps

(a) Stop SQL Server
(b) Copy /paste mssqlsystemresource mdf and ldf files to a new location
(c) Create a new database by Attaching the files from the new location

Now you have the hidden database right in front….

FAQ : What is the difference between Lazywriter and Checkpoint in SQL Server

The lazywriter thread sleeps for a specific interval of time. When it is restarted, it examines the size of the free buffer list. If the free buffer list is below a certain point, dependent on the size of the cache, the lazywriter thread scans the buffer cache to reclaim unused pages. It then writes dirty pages that have a reference count of 0. On the Windows 2000, Windows Server 2003, and Windows XP operating systems, most of the work populating the free buffer list and writing dirty pages is performed by the individual threads. The lazywriter thread typically has little to do.

The checkpoint process also scans the buffer cache periodically and writes any dirty data pages to disk. The difference is that the checkpoint process does not put the buffer page back on the free list. The purpose of the checkpoint process is to minimize the number of dirty pages in memory to reduce the length of a recovery if the server fails. Its purpose is not to populate the free buffer list. Checkpoints typically find few dirty pages to write to disk, because most dirty pages are written to disk by the worker threads or the lazywriter thread in the period between two checkpoints


How to assign XML output of select * FROM Yourtable FOR XML AUTO query to a variable

declare @DataXML xml
select @DataXML

How to find the time taken (total elapsed time) by the query?

Though SQL Server provides SET STATISTICS TIME option to find the total elapsed time taken by a query or a sp, I generally do not use this because, for large number of statement reading the output of Statistics time will be difficult. It is better to keep your own script to get the elapsed time. Here is the script

@total_elapsed_time VARCHAR(100),
@start_time Datetime,
@complete_time Datetime

SELECT @start_time = GETDATE()
Print @start_time
-- here Paste the query which you want to execute
select *From sysobjects ,sys.tables – Replace this query with your query /sp

Set @complete_time=getdate()

@total_elapsed_time = 'Total Elapsed Time (minutes:seconds) ' +
DATEDIFF(SS,@start_time,@complete_time)/60) +
':' +
print @total_elapsed_time

Note : If you want to know how long compilation and optimization took then use SET STATISTICS TIME

Sunday, June 01, 2008

Cumulative update package 7 for SQL Server 2005 Service Pack 2

Check out Cumulative update package 7 for SQL Server 2005 Service Pack 2 here

Check the bug fix list and if your application need that bug fix then only apply cummulative hotfix in Production env.