Thursday, November 29, 2012

SQL SERVER – Difference Between Union vs. Union All – Optimal Performance Comparison

More than a year ago I wrote the article SQL SERVER – Union vs. Union All – Which is better for performance? I have received many requests to update this article. Rather than revising the already published article, I am rewriting it here with additional information.
UNION
The UNION command is used to combine the result sets of two queries into a single result set. When using the UNION command, the corresponding selected columns need to be of compatible data types. With UNION, only distinct rows are returned.
UNION ALL
The UNION ALL command is equivalent to the UNION command, except that UNION ALL returns all rows.
The difference between UNION and UNION ALL is that UNION ALL does not eliminate duplicate rows; it simply returns every row from every table that fits your query and combines them into one result set.
A UNION statement effectively performs a SELECT DISTINCT on the result set. If you know that all the records returned by your union are unique, use UNION ALL instead; it gives faster results.
Run the following script in SQL Server Management Studio to see the difference between UNION ALL and UNION. Download the complete script from here.
/* Declare First Table */
DECLARE @Table1 TABLE (ColDetail VARCHAR(10))
INSERT INTO @Table1
SELECT 'First'
UNION ALL
SELECT 'Second'
UNION ALL
SELECT 'Third'
UNION ALL
SELECT 'Fourth'
UNION ALL
SELECT 'Fifth'
/* Declare Second Table */
DECLARE @Table2 TABLE (ColDetail VARCHAR(10))
INSERT INTO @Table2
SELECT 'First'
UNION ALL
SELECT 'Third'
UNION ALL
SELECT 'Fifth'
/* Check the data using SELECT */
SELECT * FROM @Table1
SELECT * FROM @Table2
/* UNION ALL */
SELECT * FROM @Table1
UNION ALL
SELECT * FROM @Table2
/* UNION */
SELECT * FROM @Table1
UNION
SELECT * FROM @Table2
GO
In our example we have two tables: @Table1 and @Table2.
Now let us run UNION ALL and UNION together and compare the result sets as well as the execution plans of the two queries. You can always turn on the actual execution plan using CTRL+M.
We can see from the result sets that UNION ALL returns every row from both tables, while UNION retrieves only the DISTINCT rows from both tables.
Additionally, when comparing the execution plans of UNION ALL and UNION, it is also quite clear that UNION ALL is far less expensive than UNION, as it does not include the Distinct Sort operation.
Let me know what you think about this article. If you have any suggestions for improvement, please let me know and I will update the article accordingly.
Reference : Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – 2012 – All Download Links in Single Page – SQL Server 2012

SQL Server 2012 RTM has just been announced, and recently I wrote about all the SQL Server 2012 certifications on a single page. As feedback, I received suggestions to have a single page where everything about SQL Server 2012 is listed. I will keep this page updated as new updates are announced.
Microsoft SQL Server 2012 Evaluation
Microsoft SQL Server 2012 enables a cloud-ready information platform that will help organizations unlock breakthrough insights across the organization.

Microsoft SQL Server 2012 Express
Microsoft SQL Server 2012 Express is a powerful and reliable free data management system that delivers a rich and reliable data store for lightweight Web sites and desktop applications.

Microsoft SQL Server 2012 Feature Pack
The Microsoft SQL Server 2012 Feature Pack is a collection of stand-alone packages which provide additional value for Microsoft SQL Server 2012.

Microsoft SQL Server 2012 Report Builder
Report Builder provides a productive report-authoring environment for IT professionals and power users. It supports the full capabilities of SQL Server 2012 Reporting Services.

Microsoft SQL Server 2012 Master Data Services Add-in For Microsoft Excel
The Master Data Services Add-in for Excel gives multiple users the ability to update master data in a familiar tool without compromising the data’s integrity in Master Data Services.

Microsoft SQL Server 2012 Performance Dashboard Reports
The SQL Server 2012 Performance Dashboard Reports are Reporting Services report files designed to be used with the Custom Reports feature of SQL Server Management Studio.

Microsoft SQL Server 2012 PowerPivot for Microsoft Excel® 2010
Microsoft PowerPivot for Microsoft Excel 2010 provides ground-breaking technology: fast manipulation of large data sets, streamlined integration of data, and the ability to effortlessly share your analysis through Microsoft SharePoint.

Microsoft SQL Server 2012 Reporting Services Add-in for Microsoft SharePoint Technologies 2010
The SQL Server 2012 Reporting Services Add-in for Microsoft SharePoint 2010 technologies allows you to integrate your reporting environment with the collaborative SharePoint 2010 experience.

Microsoft SQL Server 2012 Semantic Language Statistics
The Semantic Language Statistics Database is a required component for the Statistical Semantic Search feature in Microsoft SQL Server 2012.

Microsoft® SQL Server 2012 FileStream Driver – Windows Logo Certification
Catalog file for the Microsoft SQL Server 2012 FileStream Driver that is certified for Windows Server 2008 R2. It meets Microsoft standards for compatibility and recommended practices with the Windows Server 2008 R2 operating system.

Microsoft SQL Server StreamInsight 2.0
Microsoft StreamInsight is Microsoft’s Complex Event Processing technology to help businesses create event-driven applications and derive better insights by correlating event streams from multiple sources with near-zero latency.

Microsoft JDBC Driver 4.0 for SQL Server
Download the Microsoft JDBC Driver 4.0 for SQL Server, a Type 4 JDBC driver that provides database connectivity through the standard JDBC application program interfaces (APIs) available in Java Platform, Enterprise Edition 5 and 6.

Data Quality Services Performance Best Practices Guide
This guide focuses on a set of best practices for optimizing performance of Data Quality Services (DQS).

Microsoft Drivers 3.0 for SQL Server for PHP
The Microsoft Drivers 3.0 for SQL Server for PHP provide connectivity to Microsoft SQL Server from PHP applications.

Product Documentation for Microsoft SQL Server 2012 for firewall and proxy restricted environments
The Microsoft SQL Server 2012 setup installs only the Help Viewer; it does not install any documentation. All of the SQL Server documentation is available online.
Reference: Pinal Dave (http://blog.sqlauthority.com)

What are slowly changing dimensions (SCD)?



SCD is the abbreviation of Slowly Changing Dimensions. SCD applies to cases where the value of an attribute for a record varies over time.
There are three different types of SCD.
1) SCD1 : The new record replaces the original record. Only one record exists in the database – the current data.
2) SCD2 : A new record is added into the customer dimension table. Two records exist in the database – the current data and the previous history data.
3) SCD3 : The original record is modified to include the new data. One record exists in the database – the new information is stored alongside the old information in the same row.
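As an illustration, an SCD2 change can be handled by expiring the current row and inserting a new one. The table and column names below (DimCustomer, IsCurrent, and so on) are assumptions for the sketch, not part of any standard schema:

```sql
-- Hypothetical SCD2 customer dimension with validity dates and a current-row flag
-- Step 1: expire the existing current row for customer 42
UPDATE dbo.DimCustomer
SET EndDate = GETDATE(), IsCurrent = 0
WHERE CustomerID = 42 AND IsCurrent = 1;

-- Step 2: insert the new version as the current row
INSERT INTO dbo.DimCustomer (CustomerID, City, StartDate, EndDate, IsCurrent)
VALUES (42, 'London', GETDATE(), NULL, 1);
```

After these two statements, the table holds both the historical row (with an EndDate) and the new current row, which is exactly the "current data and previous history data" described for SCD2.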

(Read more here http://bit.ly/sql-interview)

Tuesday, November 27, 2012

SQL SERVER – Sends backups to a Network Folder, FTP Server, Dropbox, Google Drive or Amazon S3

Let me tell you about one of the most useful SQL tools that every DBA should use – SQLBackupAndFTP. I have been using this tool since 2009, and it is the first program I install on a SQL Server. Download the free version, spend a minute on configuration, and your daily backups are safe in the cloud.
In summary, SQLBackupAndFTP
  • Creates SQL Server database and file backups on schedule
  • Compresses and encrypts the backups
  • Sends backups to a network folder, FTP Server, Dropbox, Google Drive or Amazon S3
  • Sends email notifications of job’s success or failure
SQLBackupAndFTP comes in Free and Paid versions (starting from $29) – see version comparison. Free version is fully functional for unlimited ad hoc backups or for scheduled backups of up to two databases – it will be sufficient for many small customers.
What has impressed me from the beginning is that I understood how it works and was able to configure the job from a single form (see Image 1 – Main form above):
  1. Connect to your SQL Server and select the databases to be backed up
  2. Click “Add backup destination” to configure where backups should go to (network, FTP Server, Dropbox, Google Drive or Amazon S3)
  3. Enter your email to receive email confirmations
  4. Set the time to start daily full backups (or go to Settings if you need Differential or  Transaction Log backups on a flexible schedule)
  5. Press “Run Now” button to test
You can get to this form by clicking the “Settings” button in the “Schedule” section. Select which types of backups you want and how often to run them, and you will see the scheduled backups in the “Estimated backup plan” list.
A detailed tutorial is available on the developer’s website.
The SQLBackupAndFTP setup also gives you the option to install “One-Click SQL Restore” (you can install it stand-alone too) – a basic tool for restoring just full backups.
However basic, you can drag and drop onto it the zip file created by SQLBackupAndFTP: it unzips the BAK file if necessary, connects to the SQL Server on startup, selects the right database, and is even smart enough to restart the server to drop open connections if necessary – very handy for developers who need to restore databases often.
You may ask why this tool is better than the maintenance tasks available in SQL Server. While maintenance tasks are easy to set up, SQLBackupAndFTP is still easier, and it integrates compression, encryption, FTP, cloud storage and email notification into one solution, which makes it superior to maintenance tasks in every aspect.
On the flip side, SQLBackupAndFTP is not the fanciest tool for managing backups or checking their health. It only works reliably on local SQL Server instances; in other words, it has to be installed on the SQL Server machine itself. For remote servers it uses scripting, which is less reliable. This limitation is actually inherent in SQL Server itself, as the BACKUP DATABASE command creates the backup not on the client, but on the server.
This tool is compatible with almost all known SQL Server versions. It works with SQL Server 2008 (all versions) and many of the previous versions. It is especially useful for SQL Server Express 2005 and SQL Server Express 2008, as they lack built-in tools for backup.
I strongly recommend this tool to all the DBAs. They must absolutely try it as it is free and does exactly what it promises. You can download your free copy of the tool from here.
Please share your experience about using this tool. I am eager to receive your feedback regarding this article.
Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Changing Default Installation Path for SQL Server

Earlier I wrote a blog post, SQL SERVER – Move Database Files MDF and LDF to Another Location, in which we discussed how to change the location of the MDF and LDF files after a database has been created. I had mentioned that we would discuss how to change the default location of the database files, so that we do not have to move the database after it has been created in the wrong location.
The ideal scenario would be to specify this default location of the database files when SQL Server Installation was performed. If you have already installed SQL Server there is an easy way to solve this problem. This will not impact any database created before the change, it will only affect the default location of the database created after the change.
To change the default location of the SQL Server Installation follow the steps mentioned below:
In SQL Server Management Studio, right-click the server >> click Properties >> go to the Database Settings screen.
Here you can change the default location of the database files. All future databases created after the setting is changed will go to this new location.
You can also make the same change with T-SQL; here is the code.
USE [master]
GO
EXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer', N'DefaultData', REG_SZ, N'F:\DATA'
GO
EXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE', N'Software\Microsoft\MSSQLServer\MSSQLServer', N'DefaultLog', REG_SZ, N'F:\DATA'
GO
What best practices do you follow with regard to the default file location for your databases? I am interested to know them.
Reference : Pinal Dave (http://blog.SQLAuthority.com)

What is SQL Profiler ?



SQL Profiler is a graphical tool that allows system administrators to monitor events in an instance of Microsoft SQL Server. You can capture and save data about each event to a file or SQL Server table to analyze later. For example, you can monitor a production environment to see which stored procedures are hampering performance by executing too slowly.

Use SQL Profiler to monitor only the events in which you are interested. If traces are becoming too large, you can filter them based on the information you want, so that only a subset of the event data is collected. Monitoring too many events adds overhead to the server and the monitoring process and can cause the trace file or trace table to grow very large, especially when the monitoring process takes place over a long period of time.

(Read more here http://bit.ly/sql-interview)

Monday, November 26, 2012

SQL SERVER – Find Weekend and Weekdays from Datetime in SQL Server 2012

Yesterday we had the very first SQL Bangalore User Group meeting, and I was asked the following question right after the session.
“How do we know if today is a weekend or weekday using SQL Server Functions?”
Well, I assume most of us are using SQL Server 2012, so I will suggest the following solution using SQL Server 2012's CHOOSE function. It is as simple as this:
SELECT GETDATE() Today,
DATENAME(dw, GETDATE()) DayofWeek,
CHOOSE(DATEPART(dw, GETDATE()), 'WEEKEND','Weekday','Weekday','Weekday','Weekday','Weekday','WEEKEND') WorkDay
GO


You can use the CHOOSE function on a table as well. Here is a quick example of the same.
USE AdventureWorks2012
GO
SELECT A.ModifiedDate,
DATENAME(dw, A.ModifiedDate) DayofWeek,
CHOOSE(DATEPART(dw, A.ModifiedDate), 'WEEKEND','Weekday','Weekday','Weekday','Weekday','Weekday','WEEKEND') WorkDay
FROM [Person].[Address] A
GO


If you are using an earlier version of SQL Server, you can use a CASE statement instead of the CHOOSE function.
Please read my earlier article which discusses the CHOOSE function and CASE statements: Logical Function – CHOOSE() – A Quick Introduction.
Reference:  Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Denali – Logical Function – CHOOSE() – A Quick Introduction

In SQL Server Denali, there are two new logical functions being introduced, namely CHOOSE() and IIF().
Today we will quickly take a look at the CHOOSE() function. This function is very simple: it returns the value at the specified index from a list of values. If the index is numeric but not an integer, it is converted to an integer. If the index is greater than the number of elements in the list, the function returns NULL.
Now let us look at these examples showing how CHOOSE() works:
Example 1: CHOOSE Usage
SELECT CHOOSE ( 0, 'TRUE', 'FALSE', 'Unknown' ) AS Returns_Null;
SELECT CHOOSE ( 1, 'TRUE', 'FALSE', 'Unknown' ) AS Returns_First;
SELECT CHOOSE ( 2, 'TRUE', 'FALSE', 'Unknown' ) AS Returns_Second;
SELECT CHOOSE ( 3, 'TRUE', 'FALSE', 'Unknown' ) AS Returns_Third;
SELECT CHOOSE ( 4, 'TRUE', 'FALSE', 'Unknown' ) AS Result_NULL;
You can see that when the index is zero or greater than the number of elements in the list, CHOOSE returns NULL rather than raising an error.
Example 2: Usage of CHOOSE when Index is Not Integer
SELECT CHOOSE ( 1.1, 'TRUE', 'FALSE', 'Unknown' ) AS Returns_First;
SELECT CHOOSE ( 2.9, 'TRUE', 'FALSE', 'Unknown' ) AS Returns_Second;
You can see that the float value is automatically truncated to an integer and the corresponding list value is selected – note that 2.9 selects the second element, not the third.
Example 3: Usage of CHOOSE with Table
The CHOOSE() function can be very useful when used against a table as well. In the following example, I am trying to figure out whether a day is a weekend or a weekday using the CHOOSE function. There are other ways to figure that out, but here I am using it to demonstrate the usage of the CHOOSE function.
USE AdventureWorks2008R2
GO
SELECT A.ModifiedDate,
DATEPART(dw, A.ModifiedDate) DayofWeek,
DATENAME(dw, A.ModifiedDate) DayofWeek,
CHOOSE(DATEPART(dw, A.ModifiedDate), 'WEEKEND','Weekday','Weekday','Weekday','Weekday','Weekday','WEEKEND') WorkDay
FROM [Person].[Address] A
INNER JOIN [Person].[StateProvince] SP
ON SP.StateProvinceID = A.StateProvinceID
GO
In the following example, we see the result of the query listed above.
Example 4: Usage of CHOOSE with Table and CASE statement and performance comparison
Let us re-write the above query with CASE statement first.
USE AdventureWorks2008R2
GO
SELECT A.ModifiedDate,
DATEPART(dw, A.ModifiedDate) DayofWeek,
DATENAME(dw, A.ModifiedDate) DayofWeek,
CASE DATEPART(dw, A.ModifiedDate)
WHEN 1 THEN 'WEEKEND'
WHEN 2 THEN 'Weekday'
WHEN 3 THEN 'Weekday'
WHEN 4 THEN 'Weekday'
WHEN 5 THEN 'Weekday'
WHEN 6 THEN 'Weekday'
WHEN 7 THEN 'WEEKEND'
END WorkDay
FROM [Person].[Address] A
INNER JOIN [Person].[StateProvince] SP
ON SP.StateProvinceID = A.StateProvinceID
GO
The following image demonstrates that the results from both the queries are the same.
Now let us compare the execution plans of both queries – the CHOOSE function query and the CASE statement query.
When the execution plans are compared, it is very clear that the cost of both queries is the same, and the plans look similar. Let us now examine the properties of the Compute Scalar operator in both execution plans.
When the properties of the Compute Scalar operator are examined, it is clear that both queries end up using a CASE statement internally. Just like the IIF function, the CHOOSE function is shorthand for a CASE statement.
Reference:  Pinal Dave (http://blog.SQLAuthority.com)

Sunday, November 25, 2012

How to implement one-to-one, one-to-many and many-to-many relationships while designing tables?


One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships. One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships.

Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.
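For example, a hypothetical Students/Courses schema (all names here are illustrative) sketches the junction-table approach:

```sql
-- Two parent tables and a junction table implementing many-to-many
CREATE TABLE Students (StudentID INT PRIMARY KEY, Name VARCHAR(100));
CREATE TABLE Courses  (CourseID  INT PRIMARY KEY, Title VARCHAR(100));

CREATE TABLE StudentCourses (
    StudentID INT REFERENCES Students(StudentID),
    CourseID  INT REFERENCES Courses(CourseID),
    PRIMARY KEY (StudentID, CourseID)  -- composite key built from both parent keys
);
```

Each row in StudentCourses records one enrollment, so a student can appear with many courses and a course with many students.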

(Read more here http://bit.ly/sql-interview)

What is Trigger?


A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored in and managed by the DBMS. Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion. A trigger cannot be called or executed directly; the DBMS automatically fires the trigger as a result of a data modification to the associated table. Triggers can be viewed as similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-driven and are not attached to a specific table as triggers are. Stored procedures are explicitly executed by invoking a CALL to the procedure, while triggers are implicitly executed. In addition, triggers can also execute stored procedures.
Nested Trigger: A trigger can also contain INSERT, UPDATE and DELETE logic within itself, so when the trigger is fired because of a data modification, it can cause another data modification, thereby firing another trigger. A trigger that contains data modification logic within itself is called a nested trigger.
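As a sketch of the idea (the Employees and EmployeeAudit tables are hypothetical), an AFTER UPDATE trigger might look like this:

```sql
-- Audit trigger: records which employee rows were updated and when
CREATE TRIGGER trgEmployeeUpdate
ON Employees
AFTER UPDATE
AS
BEGIN
    INSERT INTO EmployeeAudit (EmployeeID, ChangedOn)
    SELECT EmployeeID, GETDATE()
    FROM inserted;  -- "inserted" is the pseudo-table holding the new row values
END
GO
```

The trigger is never called explicitly; any UPDATE against Employees fires it automatically.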

(Read more here http://bit.ly/sql-interview)

Monday, November 19, 2012

SQL SERVER – Using RAND() in User Defined Functions (UDF)

Here is the question I received in email.
“Pinal,
I am writing a function where we need to generate a random password. While writing the T-SQL I faced the following issue. Every time I try to use the RAND() function in my User Defined Function I get the following error:
Msg 443, Level 16, State 1, Procedure RandFn, Line 7
Invalid use of a side-effecting operator ‘rand’ within a function.
Here is the simplified T-SQL code of the function which I am using:
CREATE FUNCTION RandFn()
RETURNS INT
AS
BEGIN
DECLARE @rndValue INT
SET @rndValue = RAND()
RETURN @rndValue
END
GO
I must use a UDF, so is there any workaround to use the RAND function in a UDF?”
Here is the workaround for using RAND() in a UDF. The scope of this blog post is not to discuss the advantages or disadvantages of random functions, but simply to show how RAND() can be used in a UDF.
The RAND() function is not allowed directly in a UDF, so we have to find an alternate way to use it. This can be achieved by creating a VIEW that uses RAND() and using that VIEW in the UDF. Here are the step-by-step instructions.
Create a VIEW using RAND function.
CREATE VIEW rndView
AS
SELECT RAND() rndResult
GO

Create a UDF using the same VIEW.
CREATE FUNCTION RandFn()
RETURNS DECIMAL(18,18)
AS
BEGIN
DECLARE @rndValue DECIMAL(18,18)
SELECT @rndValue = rndResult
FROM rndView
RETURN @rndValue
END
GO

Now execute the UDF and it will work fine and return a random result.
SELECT dbo.RandFn()
GO
In the T-SQL world, I have noticed that there is more than one solution to every problem. Is there a better solution to this question? Please post it as a comment and I will include it with due credit.

Reference: Pinal Dave (http://blog.SQLAuthority.com)



 Responses

  1. You can also use a stored procedure with an output parameter:
    Create procedure dbo.usp_Randomnumber @Somenumber float OUTPUT
    as
    Begin
    Set @Somenumber = rand()
    End
    Go
    Declare @Number float
    Exec dbo.usp_Randomnumber @Number OUTPUT
    Select @Number

What is Identity ?


Identity (or AutoNumber) is a column that automatically generates numeric values. A start and increment value can be set, but most DBAs leave these at 1. A GUID column also generates automatically assigned values, but these cannot be controlled. Identity/GUID columns do not need to be indexed.
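A minimal sketch (the table name is assumed) showing an identity column in action:

```sql
-- IDENTITY(seed, increment): values here start at 1 and increase by 1
CREATE TABLE Orders (
    OrderID INT IDENTITY(1,1) PRIMARY KEY,
    Product VARCHAR(50)
);
INSERT INTO Orders (Product) VALUES ('Widget');  -- OrderID is generated automatically
SELECT SCOPE_IDENTITY();  -- returns the last identity value generated in this scope
```

SCOPE_IDENTITY() is generally preferred over @@IDENTITY because it is not affected by identity values generated inside triggers.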

(Read more here http://bit.ly/sql-interview)

SQL SERVER – Removing Leading Zeros From Column in Table

Some questions surprise me and make me write code I have never explored before. Today was a similar experience. I have often received questions about how to preserve leading zeros in SQL Server while displaying them in SSMS or another application, and I have written articles on this subject over here.
Today I received a very different question, where the user wanted to remove leading zeros and white space. I am using the same sample sent by the user in this example.
USE tempdb
GO
-- Create sample table
CREATE TABLE Table1 (Col1 VARCHAR(100))
INSERT INTO Table1 (Col1)
SELECT '0001'
UNION ALL
SELECT '000100'
UNION ALL
SELECT '100100'
UNION ALL
SELECT '000 0001'
UNION ALL
SELECT '00.001'
UNION ALL
SELECT '01.001'
GO
-- Original data
SELECT * FROM Table1
GO
-- Remove leading zeros
SELECT SUBSTRING(Col1, PATINDEX('%[^0 ]%', Col1 + ' '), LEN(Col1))
FROM Table1
GO
-- Clean up
DROP TABLE Table1
GO

Here is the result set of the above script. It removes any leading zeros or spaces and displays the number accordingly.
This is a very generic problem, and I am confident there are alternate solutions to it as well. If you have an alternate solution, or can suggest sample data that the proposed SUBSTRING solution does not handle, I will be glad to include it in a follow-up blog post with due credit.
Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL Server: Database Basics November 19th, 2012 Vinod Kumar


Database learning can never stop. Based on the previous posts on the basics, I saw a few comments asking to cover some of the basics in one-liner mode. In this blog post we will talk about the database fundamentals: the system databases, the database files, and filegroups. I get the opportunity to talk to a lot of developers who come from competing platforms and often ask these fundamental questions. I am sure this blog post will help them understand these fundamentals.
System Databases
  1. Master: composed of system tables that keep track of server installation as a whole and all other databases that are eventually created. Master DB has system catalogs that keep info about disk space, file allocations and usage, configuration settings, endpoints, logins, etc.
  2. Model: the template database. It gets cloned when a new database is created. Any changes that one would like applied by default to new databases should be made here.
  3. Tempdb: re-created every time SQL Server instance is restarted. Holds intermediate results created internally by SQL Server during query processing and sorting, maintaining row versions, etc. Recreated from the model database. Sizing and configuration of tempdb is critical for SQL Server performance.
  4. Resource [hidden database]: stores executable system objects such as stored system procedures and functions. Allows for very fast and safe upgrades.
  5. MSDB: used by the SQL Server Agent service and other companion services. Used for backups, replication tasks, Service Broker, supports jobs, alerts, log shipping, policies, database mail and recovery of damaged pages.
Database Files
  1. Primary data files: every database must have at least one primary data file that keeps track of all the rest of the files in the database. Has the extension .mdf.
  2. Secondary data files: a database may have zero or more secondary data files. Has the extension .ndf.
  3. Log files: every database has at least one log file that contains information necessary to recover all transactions in a database. Has the extension .ldf.
Creating a Database
  1. New user database files must be at least 3 MB or larger including the transaction log
  2. The default size of the data file is the size of the primary data file of the model database (2 MB) and the default size of the log file is 0.5 MB
  3. If LOG ON is not specified but data files are specified during a create database, the size of the log file is 25% of the sum of the sizes of all the data files.
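As a sketch (the database name, sizes, and file paths are assumptions), these defaults can be overridden by specifying file sizes explicitly at creation time:

```sql
-- Explicit data and log file sizes; paths are hypothetical
CREATE DATABASE SalesDB
ON PRIMARY (
    NAME = SalesDB_Data,
    FILENAME = 'F:\DATA\SalesDB.mdf',
    SIZE = 10MB,         -- initial size instead of the model-database default
    FILEGROWTH = 5MB,    -- how automatic expansion happens
    MAXSIZE = 500MB      -- upper limit on the file size
)
LOG ON (
    NAME = SalesDB_Log,
    FILENAME = 'F:\DATA\SalesDB.ldf',
    SIZE = 5MB
)
```

The FILEGROWTH and MAXSIZE properties shown here are the same ones discussed in the expansion section below.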
Expanding or Shrinking a Database
  • Automatic File Expansion:
    1. The file property FILEGROWTH determines how automatic expansion happens
    2. The file property MAXSIZE sets the upper limit on the size
  • Manual File Expansion: use the ALTER DATABASE command with the MODIFY FILE option to change the SIZE property and increase the database file size
  • Fast File Initialization: adds space to the data file without filling the newly added space with zeros. New disk content is overwritten as new data is written to the files. Security is managed through the Windows security setting SE_MANAGE_VOLUME_NAME
  • Automatic Shrinkage:
    1. Same as doing DBCC SHRINKDATABASE (dbname, 25), which leaves 25% free space in the database after the shrink
    2. A thread performs autoshrink as often as every 30 minutes; it is very resource intensive
  • Manual Shrinkage: use DBCC SHRINKDATABASE if you want to shrink.
  • I highly recommend not shrinking the database.
  • Filegroups
    1. You can group the data files for a database into filegroups for allocation and administration purposes.
    2. Filegroups improve performance by controlling the placement of data and indexes into specific filegroups on specific drives or volumes.
    3. The filegroup containing the primary data file is called the primary filegroup; there is only one primary filegroup.
    4. Default filegroup: there is at least one filegroup with the DEFAULT property, which can be changed by the DBA.
    5. Use cases when -not- to use filegroups:
      1. A DBA might decide to spread out the I/O for a database: the easiest way is to create a database file on a RAID device.
      2. A DBA might want multiple files, perhaps to create a database that uses more space than is available on a single drive: this can be accomplished by doing CREATE DATABASE with a list of files on separate drives.
    6. Use cases when you want to use filegroups:
      1. A DBA might want to have different tables assigned to different drives, or to use the table and index partitioning feature in SQL Server.
    7. Benefits:
      1. Allows backup of parts of the database.
      2. If a critical table is created on its own filegroup, it can be backed up by backing up the selected filegroups.
      3. Same for restoration: the database can be online as soon as the primary filegroup is restored, but only objects on the restored filegroups will be available.
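A sketch of the filegroup ideas above, with assumed database names, drive letters, and paths:

```sql
-- Database with a secondary filegroup on a separate (hypothetical) drive
CREATE DATABASE ArchiveDB
ON PRIMARY
    (NAME = ArchiveDB_Data,    FILENAME = 'F:\DATA\ArchiveDB.mdf'),
FILEGROUP FG_History
    (NAME = ArchiveDB_History, FILENAME = 'G:\DATA\ArchiveDB_History.ndf')
LOG ON
    (NAME = ArchiveDB_Log,     FILENAME = 'F:\DATA\ArchiveDB.ldf')
GO
USE ArchiveDB
GO
-- Place a table on the secondary filegroup
CREATE TABLE dbo.OrderHistory (OrderID INT) ON FG_History
GO
-- Back up only that filegroup (partial backup of the database)
BACKUP DATABASE ArchiveDB
    FILEGROUP = 'FG_History'
    TO DISK = 'F:\Backup\FG_History.bak'
```

This is how the "backup of parts of the database" benefit is realized in practice: only the filegroup holding the critical table is backed up.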
    Given the nature of content, also look at the other topics we have discussed before.
    Concurrency Basics
    Locking Basics
    Transaction Log Basics

    Can a stored procedure call itself (i.e., a recursive stored procedure)? How many levels of SP nesting are possible?


    Yes. Because Transact-SQL supports recursion, you can write stored procedures that call themselves. Recursion can be defined as a method of problem solving wherein the solution is arrived at by repetitively applying it to subsets of the problem. A common application of recursive logic is to perform numeric computations that lend themselves to repetitive evaluation by the same processing steps. Stored procedures are nested when one stored procedure calls another or executes managed code by referencing a CLR routine, type, or aggregate. You can nest stored procedures and managed code references up to 32 levels.
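A minimal sketch of a recursive stored procedure (the procedure name is illustrative):

```sql
-- Factorial via a procedure that calls itself; nesting is capped at 32 levels
CREATE PROCEDURE dbo.usp_Factorial (@n INT, @result BIGINT OUTPUT)
AS
BEGIN
    IF @n <= 1
        SET @result = 1;
    ELSE
    BEGIN
        DECLARE @m INT = @n - 1, @prev BIGINT;
        EXEC dbo.usp_Factorial @m, @prev OUTPUT;  -- the recursive self-call
        SET @result = @n * @prev;
    END
END
GO
DECLARE @f BIGINT;
EXEC dbo.usp_Factorial 5, @f OUTPUT;
SELECT @f AS Factorial5;  -- 120
```

Because of the 32-level nesting limit, this particular sketch would fail for @n greater than 32; an iterative loop is the safer choice for deep recursion.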

    (Read more here http://bit.ly/sql-interview)

    What are the authentication modes in SQL Server? How can it be changed?

    Windows mode and mixed mode (SQL & Windows).
    To change authentication mode in SQL Server click Start, Programs, Microsoft SQL Server and click SQL Enterprise Manager to run SQL Enterprise Manager from the Microsoft SQL Server program group. Select the server then from the Tools menu select SQL Server Configuration Properties, and choose the Security page.
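On later versions you can also check the current mode with T-SQL via the SERVERPROPERTY function:

```sql
-- 1 = Windows Authentication only; 0 = Mixed mode
SELECT CASE CAST(SERVERPROPERTY('IsIntegratedSecurityOnly') AS INT)
           WHEN 1 THEN 'Windows Authentication mode'
           WHEN 0 THEN 'Mixed mode (SQL Server and Windows)'
       END AS AuthenticationMode;
```

Changing the mode itself still requires the management tools (or a registry change) followed by a service restart.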
    (Read more here http://bit.ly/sql-interview)

    Sunday, November 18, 2012

    SQL Server: Locking basics

    Vinod Kumar
    The learning journey can never stop. As we learn the basic building blocks of SQL Server, there is always a need for refreshers, and this blog is on the same lines. I would urge you to read the blog on Concurrency over at Pinal’s site and then continue reading below, because the basics of locking are incomplete if we don’t understand the basics of concurrency. Just as traffic signals regulate traffic to make sure there are no gridlocks or long waits for commuters, and parallel lanes of traffic are analogous to concurrency, SQL Server uses locks as a mechanism to bring consistency while at the same time enabling concurrency in the system. Locks are inevitable in any database world and are also fundamental building blocks.
    1. Locks are applied in both optimistic and pessimistic concurrency models.
    2. Locking is used for maintaining consistency for concurrent transactions.
    3. Read operations acquire shared locks and write operations acquire exclusive locks, while update locks are acquired during the initial portion of an update operation, when SQL Server is searching for the data to update.
    4. SQL Server releases locks automatically, manages compatibility between the various locking modes, resolves deadlocks, and performs lock escalation to reduce overhead.
    5. SQL Server controls locks on tables, on the pages of the table, on index keys and on individual rows of data based on the operation done.
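To observe the lock modes described below on a live system, you can query the sys.dm_tran_locks dynamic management view (available in SQL Server 2005 and later):

```sql
-- Locks currently held or requested in the current database
SELECT resource_type,   -- OBJECT, PAGE, KEY, RID, ...
       request_mode,    -- S, X, U, IS, IX, Sch-S, ...
       request_status   -- GRANT, WAIT or CONVERT
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID();
```

Running this from one session while another session holds an open transaction is an easy way to watch shared, exclusive, and intent locks appear and disappear.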
    Lock Modes
    1. Shared Locks
      1. Automatically acquired while reading data.
      2. This lock can be on a page, table, index key or individual row based on the isolation level.
      3. Multiple processes can hold shared locks on the same data and read from it too.
      4. No other process can get an exclusive lock on data that has a shared lock. Neither can a process acquire a shared lock when there is an exclusive lock already on the table.
      5. In the default isolation level, shared locks are not held for the duration of the transaction; they are released as soon as the read is done.
    2. Exclusive Locks
      1. This lock is acquired on data when modifications are done by DML statements like INSERT, UPDATE or DELETE operation.
      2. At any point in time exclusive locks can be held by only one process.
      3. While an exclusive lock is held, no other process can acquire a lock of any kind on that data.
      4. Exclusive locks are taken for the duration of the transaction length as the process can either commit or rollback.
    3. Update Locks
      1. This is a hybrid of exclusive and shared locks.
      2. This lock is taken when SQL Server searches the table to find the resource that needs to be modified.
      3. This can also be initiated using query hints, although doing so is generally not advisable.
      4. An update lock by itself is not sufficient to change data; SQL Server must convert it to an exclusive lock to modify the data.
      5. Used as an intermediate lock before escalating to exclusive locks.
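The query hints mentioned above can be sketched with the UPDLOCK table hint. This is a minimal, hedged example; the table dbo.LockDemo and its columns are illustrative, not from the original post:

```sql
BEGIN TRAN;
-- U lock on the qualifying row: other readers can still take S locks,
-- but no other process can take a U or X lock until we finish
SELECT Col1
FROM dbo.LockDemo WITH (UPDLOCK)
WHERE Id = 1;

-- SQL Server converts the U lock to an X lock for the actual change
UPDATE dbo.LockDemo SET Col1 = 'Changed' WHERE Id = 1;
COMMIT TRAN;
```

The hint reserves the right to update the row without blocking concurrent readers, which is exactly the intermediate role described above.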
    4. Intent Locks
      1. As the name suggests, these are pre-qualifiers to shared/update/exclusive locks.
      2. Since SQL Server can acquire locks at different levels of granularity, an intent lock on a higher-level resource (such as a table) signals that locks are held, or about to be acquired, on lower-level resources (such as pages or rows) within it.
    3. This is an intermediate locking state and not to be confused as a separate locking mode.
    5. Special Lock Modes
    1. Schema stability locks – at the time of compilations, this lock makes sure that no other process is modifying the schema of the underlying objects.
    2. Schema modification locks – acquired when a table’s structure is being modified.
    3. Bulk update locks – locks taken for BULK INSERT command or bcp utility operation. This lock can also be turned on using the table hints.
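The table hint mentioned for bulk update locks can be sketched with BULK INSERT and TABLOCK. The file path and staging table below are illustrative assumptions:

```sql
-- TABLOCK requests a BU lock, letting multiple bulk loads into the
-- same heap proceed in parallel while blocking other access
BULK INSERT dbo.StagingTable
FROM 'C:\data\load.csv'
WITH (TABLOCK, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
```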
    6. Conversion Locks
      1. As the name suggests, these locks occur during a conversion from one mode to another. There are three types, SIX, SIU and UIX, of which SIX is the most common; the details are below.
    7. Key-Range Locks
    1. Locks taken in serializable isolation level for locking ranges of data.
    2. Can be taken only on keys.
  • Let us look at the common abbreviations and locking mode descriptions:
    1. S – Shared: Allows processes to read concurrently but cannot change the resource locked.
    2. X – Exclusive: Prevents other processes from modifying or reading data in the locked resource.
    3. U – Update: Prevents external processes from acquiring an update or exclusive lock.
    4. IS – Intent shared: Acquired on a higher-level resource when a lower-level resource is locked with a shared lock.
    5. IU – Intent update: Acquired on a higher-level resource when a lower-level resource is locked with an update lock.
    6. IX – Intent exclusive: Acquired on a higher-level resource when a lower-level resource is locked with an exclusive lock.
    7. SIX – Shared with intent exclusive: Indicates a shared lock on a resource that also has a page or row within it locked exclusively.
    8. SIU – Shared with intent update: Indicates a shared lock on a resource that also has a page or row within it locked with an update lock.
    9. UIX – Update with intent exclusive: Indicates an update lock on a resource that also has a page or row within it locked exclusively.
    10. Sch-S – Schema stability: Used to show a query using the object is being compiled at that moment.
    11. Sch-M – Schema modification: Used to indicate the table’s structure is being updated at this moment.
    12. BU – Bulk update: Lock acquired when bulk copy operation (like bcp) is copying data into a table.
    Lock Granularity
    1. SQL Server can lock resources at the table, page, partition or row level.
    2. If locks are escalated, SQL Server can also lock a single partition of a table or index.
    3. SQL Server can also lock index keys and ranges of index keys if needed.
    4. The DMV sys.dm_tran_locks keeps track of each lock and contains information about the resource being locked and an identifier for the specific resource.
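A minimal sketch of querying this DMV for the locks held by the current session while a transaction is open:

```sql
-- Inspect the locks held by the current session
SELECT resource_type,       -- DATABASE, OBJECT, PAGE, KEY, RID, ...
       resource_database_id,
       request_mode,        -- S, X, U, IS, IX, SIX, Sch-S, ...
       request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;
```

Running this inside an open transaction that has modified data will show the granularities and modes discussed above.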
    Key Locks
    1. SQL Server's use of key locks depends on the isolation level of the transaction.
      1. If the isolation level is Read Committed, Repeatable Read or Snapshot, SQL Server locks the actual index keys accessed while processing the query.
      2. If the isolation level is Serializable, to counter phantom reads, key-range locks are acquired for the range of values queried.
      3. In the case of tables with a clustered index, the leaf nodes contain the data, and hence the locks are acquired at that level.
  • There are 9 types of key-range locks:
    1. RangeS-S
    2. RangeS-U
    3. RangeIn-Null
    4. RangeX-X
    5. RangeIn-S
    6. RangeIn-U
    7. RangeIn-X
    8. RangeX-S
    9. RangeX-U 
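A hedged sketch of observing key-range locks under serializable isolation; the table dbo.Orders and its indexed OrderDate column are illustrative assumptions:

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
-- The range of keys covered by this predicate is protected with
-- key-range locks so that no phantom rows can be inserted into it
SELECT COUNT(*)
FROM dbo.Orders
WHERE OrderDate BETWEEN '20121101' AND '20121130';

-- While the transaction is open, the key-range requests are visible
SELECT request_mode
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID AND resource_type = 'KEY';
COMMIT TRAN;
```

Note that key-range locking here relies on an index on OrderDate; without one, SQL Server falls back to a table lock, as mentioned later in the serializable discussion.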
    Other Locking Assets
    1. We saw locks on objects, pages, keys, partitions and rows. There are other resources SQL Server can lock too.
    2. Extents are units of 64KB disk space that can be locked. This is a special type of lock that is taken when an object needs to expand and a new extent is allocated.
    3. Connections generally hold a lock on at least one database object: a session will have at least one lock with resource_type = DATABASE for as long as it is connected.
    4. You occasionally have locks on individual partitions, indicated in the lock metadata as HOBT locks. These occur when locks are escalated and escalation to the partition level is allowed.
    Lock Duration
    1. The length of time that a lock is held depends primarily on the mode of the lock and transaction isolation level in effect.
    2. Shared locks in the Read Committed and Snapshot isolation levels are released as soon as SQL Server has read and processed the locked data.
    3. Shared locks behave the same as exclusive locks in the Repeatable Read and Serializable isolation levels: they are not released until the end of the transaction.
    4. Update locks are also held until the end of the transaction, unless the lock has been promoted to an exclusive lock, in which case it behaves like an exclusive lock.
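The duration rules above can be observed with two sessions; a minimal sketch, where the table dbo.LockDemo is illustrative:

```sql
-- Session 1: the X lock taken here is held until COMMIT/ROLLBACK
BEGIN TRAN;
UPDATE dbo.LockDemo SET Col1 = 'Changed' WHERE Id = 1;

-- Session 2 (separate window): under the default Read Committed level
-- this SELECT blocks until Session 1 ends its transaction
SELECT Col1 FROM dbo.LockDemo WHERE Id = 1;

-- Session 1: end the transaction and release the lock
COMMIT TRAN;
```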
    Lock Ownership
    1. Lock duration is directly affected by lock ownership
    2. Ownership == scope of the lock
    3. Four types of lock scopes
      1. Transactions – Durations depend on isolation level and lock mode (shared locks and update/exclusive locks)
      2. Cursors – requested explicitly when a cursor is declared. If a cursor is opened using a locking mode of SCROLL_LOCKS, a cursor lock is held on every row fetched until the next row is fetched or the cursor is closed. Even if the transaction commits before the next fetch, the cursor lock is not released
      3. Transaction_Workspaces – acquired every time a database is accessed and the resource associated with these locks is always a database. A workspace holds database locks for sessions – usually there is one workspace per session and all DATABASE locks acquired in the session are kept in the same workspace object. In distributed transactions, multiple sessions are enlisted into the same workspace so they share the database locks
      4. Sessions – must be requested explicitly using the sp_getapplock procedure, and apply only to APPLICATION locks. Its duration is until the session disconnects or the lock is released explicitly
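A session-scoped application lock via sp_getapplock can be sketched as follows; the resource name 'MyAppResource' is arbitrary:

```sql
-- Take an exclusive application lock owned by the session
EXEC sp_getapplock
    @Resource  = 'MyAppResource',
    @LockMode  = 'Exclusive',
    @LockOwner = 'Session';

-- ... protected work here ...

-- Release explicitly; otherwise the lock is held until disconnect
EXEC sp_releaseapplock
    @Resource  = 'MyAppResource',
    @LockOwner = 'Session';
```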
    Thanks for reading this far. I know there is more to learn and write about: Lock Timeouts, Locking Hints, Version Stores, Snapshot Isolation Level, Row Versioning, Deadlocks, Lock Escalations, Row-level and Page-level Locking, Lock Blocks, internal locking and more! I will surely keep you posted in future posts.

    SQL SERVER – Concurrency Basics – Guest Post by Vinod Kumar

    This guest post is by Vinod Kumar. Vinod Kumar has worked with SQL Server extensively since joining the industry over a decade ago. Having worked on various versions from SQL Server 7.0 and Oracle 7.3 to other database technologies, he now works with the Microsoft Technology Center (MTC) as a Technology Architect.
    Let us read the blog post in Vinod’s own voice.

    Learning is always fun when it comes to SQL Server, and revisiting the basics can be even more fun. I have written about Transaction Logs and recovery on my blog, and simplifying the basics is always a challenge. In the real world we see checks and queues for every process: railway reservations, banks, customer support and so on all use lines and queues to serve everyone. The shorter the queue, the higher the efficiency of the system (a.k.a. the higher the concurrency). Every database implements such checks, like locking and blocking mechanisms, in a way that facilitates higher concurrency. In this post, let us talk about concurrency and the various aspects one needs to know about concurrency inside SQL Server. Let us learn the concepts as one-liners:
    • Concurrency can be defined as the ability of multiple processes to access or change shared data at the same time.
    • The greater the number of concurrent user processes that can be active without interfering with each other, the greater the concurrency of the database system.
    • Concurrency is reduced when a process that is changing data prevents other processes from reading that data or when a process that is reading data prevents other processes from changing that data.
    • Concurrency is also affected when multiple processes are attempting to change the same data simultaneously.
    • Two approaches to managing concurrent data access:
      • Optimistic Concurrency Model
      • Pessimistic Concurrency Model

    Concurrency Models

    • Pessimistic Concurrency
      • Default behavior: acquire locks to block access to data that another process is using.
      • Assumes that enough data modification operations are in the system that any given read operation is likely affected by a data modification made by another user (assumes conflicts will occur).
      • Avoids conflicts by acquiring a lock on data being read so no other processes can modify that data.
      • Also acquires locks on data being modified so no other processes can access the data for either reading or modifying.
      • Readers block writers, and writers block both readers and writers.
    • Optimistic Concurrency
      • Assumes that there are sufficiently few conflicting data modification operations in the system that any single transaction is unlikely to modify data that another transaction is modifying.
      • Default behavior of optimistic concurrency is to use row versioning to allow data readers to see the state of the data before the modification occurs.
      • Older versions of the data are saved so a process reading data can see the data as it was when the process started reading and not affected by any changes being made to that data.
      • Processes modifying the data are unaffected by processes reading the data, because the reader is accessing a saved version of the data rows.
      • Readers do not block writers and writers do not block readers, but writers can and will block writers.

    Transaction Processing

    • A transaction is the basic unit of work in SQL Server.
    • A transaction consists of SQL commands that read and update the database, but the update is not considered final until a COMMIT command is issued (at least for an explicit transaction, whose start is marked with BEGIN TRAN and whose end is marked by COMMIT TRAN or ROLLBACK TRAN).
    • Transactions must exhibit all the ACID properties of a transaction.
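An explicit transaction as described above can be sketched as follows; the dbo.Account table is an illustrative assumption:

```sql
BEGIN TRAN;

UPDATE dbo.Account SET Balance = Balance - 100 WHERE AccountId = 1;
UPDATE dbo.Account SET Balance = Balance + 100 WHERE AccountId = 2;

-- Neither update is final until the COMMIT; a ROLLBACK TRAN here
-- would undo both statements as a single unit of work
COMMIT TRAN;
```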

    ACID Properties

    • Transaction processing must guarantee the consistency and recoverability of SQL Server databases.
    • Ensures all transactions are performed as a single unit of work regardless of hardware or system failure.
    • A – Atomicity, C – Consistency, I – Isolation, D – Durability
      • Atomicity: Each transaction is treated as all or nothing – it either commits or aborts.
      • Consistency: ensures that a transaction won’t allow the system to arrive at an incorrect logical state – the data must always be logically correct.  Consistency is honored even in the event of a system failure.
      • Isolation: separates concurrent transactions from the updates of other incomplete transactions. SQL Server accomplishes isolation among transactions by locking data or creating row versions.
      • Durability: After a transaction commits, the durability property ensures that the effects of the transaction persist even if a system failure occurs. If a system failure occurs while a transaction is in progress, the transaction is completely undone, leaving no partial effects on data.

    Transaction Dependencies

    • In addition to supporting all four ACID properties, a transaction might exhibit a few other behaviors (known as dependency problems or consistency problems).
      • Lost Updates: Occur when two processes read the same data and both manipulate the data, changing its value and then both try to update the original data to the new value. The second process might overwrite the first update completely.
      • Dirty Reads: Occurs when a process reads uncommitted data. If one process has changed data but not yet committed the change, another process reading the data will read it in an inconsistent state.
      • Non-repeatable Reads: A read is non-repeatable if a process might get different values when reading the same data in two reads within the same transaction. This can happen when another process changes the data in between the reads that the first process is doing.
      • Phantoms: Occurs when membership in a set changes. It occurs if two SELECT operations using the same predicate in the same transaction return a different number of rows.
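A dirty read, for instance, can be sketched with two sessions; the dbo.Account table is illustrative:

```sql
-- Session 1: change data but do not commit yet
BEGIN TRAN;
UPDATE dbo.Account SET Balance = 0 WHERE AccountId = 1;

-- Session 2: reads the uncommitted value under READ UNCOMMITTED
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT Balance FROM dbo.Account WHERE AccountId = 1;

-- Session 1: roll back; Session 2 has now read data that was never
-- logically part of the database
ROLLBACK TRAN;
```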

    Isolation Levels

    • SQL Server supports 5 isolation levels that control the behavior of read operations.
    • Read Uncommitted
      • All behaviors except for lost updates are possible.
      • Implemented by allowing the read operations to not take any locks, and because of this, it won’t be blocked by conflicting locks acquired by other processes. The process can read data that another process has modified but not yet committed.
      • When using the read uncommitted isolation level and scanning an entire table, SQL Server can decide to do an allocation order scan (in page-number order) instead of a logical order scan (following page pointers). If another process doing concurrent operations changes data and moves rows to a new location in the table, the allocation order scan can end up reading the same row twice.
      • Also can happen if you have read a row before it is updated and then an update moves the row to a higher page number than your scan encounters later.
      • Performing an allocation order scan under Read Uncommitted can also cause you to miss a row completely. This can happen when a row on a high page number that hasn't been read yet is updated and moved to a lower page number that has already been read.
    • Read Committed
      • Two varieties of read committed isolation: optimistic and pessimistic (default).
      • Ensures that a read never reads data that another application hasn’t committed.
      • If another transaction is updating data and holds exclusive locks on it, your transaction has to wait for those locks to be released. Your transaction must also place shared locks on the data it visits, which means that data might be unavailable to others. A shared lock doesn't prevent others from reading the data, but it prevents them from updating it.
      • Read committed (snapshot) ensures that an operation never reads uncommitted data, but not by forcing other processes to wait. SQL Server generates a version of the changed row with its previous committed values. Data being changed is still locked but other processes can see the previous versions of the data as it was before the update operation began.
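The optimistic (snapshot) variety of read committed is a database-level setting; a sketch, assuming a database named SampleDB:

```sql
-- Enable read committed snapshot: readers then see the last committed
-- version of a row instead of waiting on a writer's exclusive lock
-- (requires that no other connections are active in the database)
ALTER DATABASE SampleDB SET READ_COMMITTED_SNAPSHOT ON;

-- Revert to the pessimistic (locking) default
ALTER DATABASE SampleDB SET READ_COMMITTED_SNAPSHOT OFF;
```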
    • Repeatable Read
      • This is a Pessimistic isolation level.
      • Ensures that if a transaction revisits data or a query is reissued the data doesn’t change.
      • That is, issuing the same query twice within a transaction cannot pick up any changes to data values made by another user's transaction, because no changes can be made by other transactions. However, this level does allow phantom rows to appear.
      • Preventing non-repeatable reads is a desirable safeguard, but the cost is that all shared locks in a transaction must be held until the completion of the transaction.
    • Snapshot
      • Snapshot Isolation (SI) is an optimistic isolation level.
      • Allows for processes to read older versions of committed data if the current version is locked.
      • Difference between snapshot and read committed has to do with how old the older versions have to be.
      • It’s possible to have two transactions executing simultaneously that give us a result that is not possible in any serial execution.
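Snapshot isolation must be allowed at the database level before a session can request it; a sketch, assuming a database named SampleDB and the illustrative dbo.Account table:

```sql
ALTER DATABASE SampleDB SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In a session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
-- All reads in this transaction see the committed state of the data
-- as of the moment the transaction first accessed data
SELECT * FROM dbo.Account;
COMMIT TRAN;
```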
    • Serializable
      • This is the strongest of the pessimistic isolation levels.
      • Adds to the repeatable read isolation level by ensuring that if a query is reissued, rows have not been added in the interim; i.e., phantoms do not appear.
      • Preventing phantoms is another desirable safeguard, but the cost of this extra safeguard is similar to that of repeatable read: all shared locks in a transaction must be held until the transaction completes.
      • In addition, the serializable isolation level requires that you lock not only data that has been read but also data that doesn't exist. For example, if a SELECT returned no rows, you want it to return no rows when the query is reissued. This is implemented in SQL Server by a special kind of lock called the key-range lock.
      • Key-range locks require that there be an index on the column that defines the range of values. If there is no index on the column, serializable isolation requires a table lock.
      • Gets its name from the fact that running multiple serializable transactions at the same time is equivalent of running them one at a time.
    Now that we understand the basics of what concurrency is, the subsequent blog posts will try to bring out the basics around locking, blocking, deadlocks because they are the fundamental blocks that make concurrency possible.
    Now if you are with me, let us continue learning with SQL Server Locking Basics.
    Reference: Pinal Dave (http://blog.sqlauthority.com)

    SQL SERVER – Retrieving Random Rows from Table Using NEWID()

    I have previously written about how to get random rows from SQL Server.
    However, I have not blogged about the following trick before. Let me share the trick here as well. You can retrieve random rows using the following methods.
    USE AdventureWorks2012
    GO
    -- Method 1
    SELECT TOP 100 *
    FROM Sales.SalesOrderDetail
    ORDER BY NEWID()
    GO
    -- Method 2
    SELECT TOP 100 *
    FROM Sales.SalesOrderDetail
    ORDER BY CHECKSUM(NEWID())
    GO


    You will notice that using NEWID() in the ORDER BY clause returns random rows in the result set. How many of you knew this trick? You can run the above script multiple times and it will give random rows every single time.
    Note: This method can be very resource intensive for large result sets.
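For large tables, one lighter-weight (though less uniformly random) alternative is TABLESAMPLE, which samples whole pages rather than sorting every row:

```sql
-- Returns roughly 1 percent of the table's pages; every row on a
-- sampled page is returned, so the sample is not row-level random
SELECT *
FROM Sales.SalesOrderDetail TABLESAMPLE (1 PERCENT);
```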
    Reference: Pinal Dave (http://blog.sqlauthority.com)