Common programming Concepts: 2012

Thursday, 14 June 2012

What is XML DataType in SQL DataType ?

XML has been widely adopted as a platform-independent mechanism for representing data, and is also commonly used to exchange data between disparate and loosely-coupled systems, such as B2B applications and workflow solutions. More recently, XML has been used to represent semi-structured (as well as unstructured) data such as documents and emails. If information in these models has to be queried, then XML is probably the simplest way to represent such information.

For example, if documents are represented in XML, it is very easy to write a DOM or XPATH query to extract the contents of a section titled, for example, "Recipe". Many applications also require other pieces of information that typically reside in a database. For example, consider an application that allows a user to personalize the look and feel of an application. It is not uncommon to store the preferences of the user against the user record in the database. XML lends itself well to storing such unstructured information to cope with an ever expanding list of user preferences.

When we store this kind of semi-structured or unstructured information as XML in the database, it is not always feasible to extract the content to the application tier and then process the XML using XML parsers. If the database platform is able to provide native XML processing capabilities then that is a massive advantage. In that way we can also take advantage of other database capabilities such as query optimization, indexing, backups and, most importantly of all, interoperability with relational data.

The XML Data Type

The XML data type is a built-in data type similar to other built-in types in SQL Server 2005. You can define XML columns, XML variables, XML parameters for procedures and functions, and can also return XML as return values from functions. Internally, the XML data type is stored in a proprietary binary format. The maximum permissible size of the XML content is 2GB.

Defining XML Variables and Columns

To declare a variable of type XML is as simple as this:

DECLARE @xmlDoc XML

In order to create an XML column (using untyped XML – more on this later), we could use the following:

CREATE TABLE rssFeeds
(
  feedName VARCHAR(20) NOT NULL PRIMARY KEY,
  generatedDate DATETIME DEFAULT GETDATE(),
  feedXML XML
)

Loading Data into XML Columns and Variables

Having defined an XML variable or column, we can store information in it via the following methods:

Assigning a string to the XML variable / column
Using the output of the SELECT…FOR XML clause
Using bulkload

Simple String Assignment

We can load a string into our @xmlDoc variable as follows:

SET @xmlDoc = '<contact><name>Bob Smith</name></contact>'
SELECT @xmlDoc

Note that string content is implicitly converted to the XML type. You can also explicitly convert the string to XML as shown:

SET @xmlDoc = CAST ('<contact><name>Bob Smith</name></contact>' AS XML)
SELECT @xmlDoc

In each example, we then select this XML content. If you run this example in SQL Server Management Studio with the "results to grid" option enabled, the XML output will be as shown below:

Notice that the XML content is hyperlinked and if you click this link, a new window opens and shows the entire XML document. This is a great improvement over SQL Server 2000 Query Analyzer, which would show the results of XML queries as a single line of output and also truncate the results if the result size is greater than 8192 bytes.

We can insert regular string content into the feedXML column of our rssFeeds table using the following method:

INSERT INTO rssFeeds (feedName, feedXML) VALUES
  (‘MSDN’, ‘<feedXML/>’)

Using FOR XML

In the following example, we initialize the @xmlDOC variable using a SELECT…FOR XML statement:

USE AdventureWorks
GO
DECLARE @xmlDoc XML
SET @xmlDoc =
  (
  SELECT name, salesYTD FROM sales.salesterritory
  FOR XML AUTO, ELEMENTS
  )
SELECT @xmlDoc

The output of the above command is as follows (only partial results are shown. To see the full results, you need to click on the hyperlink in Management Studio):

Streaming results from FOR XML statements directly into XML variables is a great improvement over SQL Server 2000 (which did not allow this). Also note that the above example returns an XML fragment and not an XML document. If you need to return an XML document, you can use the new ROOT directive in the FOR XML command as shown:

USE AdventureWorks
GO
DECLARE @xmlDoc XML
SET @xmlDoc =
  (
  SELECT name, salesYTD FROM sales.salesterritory
  FOR XML AUTO, ROOT ('territories'), ELEMENTS
  )
SELECT @xmlDoc

Executing the above command produces the following output (again, only partial results are shown):

Bulkloading XML data using the OPENROWSET Function

SQL Server 2005 provides extensions to the OPENROWSET function to bulk load an XML document from the file system. The following example shows how to bulkload our @xmlDoc variable using this technique:

DECLARE @xmlDoc XML
SET @xmlDoc = (
  SELECT * FROM OPENROWSET (
    BULK 'C:\Temp\rss.xml', SINGLE_CLOB
  ) AS xmlData
)
SELECT @xmlDoc

The RSS.XML file that is referred to in the above query is a sample RSS XML that I generated from the MSDN site. To do this, you can query the MSDN RSS site at: http://msdn.microsoft.com/rss.xml and then save results as XML from Internet Explorer.

The output of the above command is as follows:

Similarly, the following code uses the OPENROWSET command to load an XML document into our feedXMLcolumn:

GO
INSERT INTO rssFeeds (feedName, feedXML)
SELECT 'MSDNRSS', xmlData FROM
  (
  SELECT * FROM OPENROWSET
  (BULK 'C:\Temp\rss.xml', SINGLE_CLOB)
  AS xmlData
  ) AS feed (xmlData)
GO
SELECT * FROM rssFeeds

The following output is generated:

Typing XML

Up to this point, the XML content that we have used is referred to as untyped XML. An untyped XML variable/column is one that is not associated with a schema. If we associate a schema with an XML type, it is then referred to astyped XML. Once an XML variable/column is typed using a schema, any data that is inserted into it is automatically validated against the schema, and SQL Server 2005 raises an error if the data does not conform to the defined schema rules.

It is usually recommended to associate schemas to XML content in SQL Server, since query optimization can use the information and also it provides better operational semantics. For example, we can perform arithmetic operations on numbers, but not strings. If no schema is attached to an XML, then all values are considered to be strings. MSDN provides further information about schemas and their maintenance.

Creating typed XML involves two basic steps: 1) Create an XML schema and 2) Register it in the database.

Creating a Schema

Using tools such as Visual Studio 2005 and XML Spy, creating a schema is pretty simple. The following link contains some excellent tutorials on this topic: Creating Schemas in Visual Studio. Using the tool of your choice, create the following schema definition:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema
   id="Contacts"
   targetNamespace="http://www.srisamp.com/contacts.xsd"
   elementFormDefault="qualified"
   xmlns="http://www.srisamp.com/contacts.xsd"
   xmlns:mstns="http://www.srisamp.com/contacts.xsd"
   xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:complexType name="contact">
        <xs:sequence>
            <xs:element name="name" type="xs:string" />
        </xs:sequence>
    </xs:complexType>
    <xs:element name="contacts">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="contact" type="contact" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

This schema defines an XML document of the form:

<contacts><contact><name></name></contact></contacts>

Save this schema in a file called ContactsSchema.xsd in the C:\Temp folder.

Registering a Schema in SQL Server 2005

You register a schema in SQL Server 2005 using the CREATE XML SCHEMA COLLECTION command. The following script loads our XML schema into the database.

IF EXISTS (SELECT * FROM sys.xml_schema_collections
      WHERE [name] = 'MyContacts')
   DROP XML SCHEMA COLLECTION MyContacts
GO
DECLARE @contactsSchema XML
SET @contactsSchema =
   (
      SELECT * FROM OPENROWSET
         (BULK 'C:\Temp\ContactsSchema.xsd', SINGLE_CLOB)
            AS xmlData
   )
CREATE XML SCHEMA COLLECTION MyContacts AS @contactsSchema
GO

In the following code, we type an XML variable to the MyContacts schema and then assign both valid and invalid content to the XML (by invalid, we mean that the content does not adhere to the schema):

DECLARE @xmlDoc XML (MyContacts)
BEGIN TRY
   -- This entry conforms to our schema rules
   SET @xmlDoc = '
      <contacts xmlns="http://www.srisamp.com/contacts.xsd">
         <contact>
            <name>Bob Smith</name>
         </contact>
      </contacts>
   '
   SELECT @xmlDoc
 
   -- This one does not conform to the schema
   SET @xmlDoc = '
      <contacts xmlns="http://www.srisamp.com/contacts.xsd">
         <contact>
            <names>Bob Smith</names>
         </contact>
      </contacts>
   '
   SELECT @xmlDoc
END TRY
BEGIN CATCH
   PRINT 'Error at line '+CAST(error_line() AS VARCHAR)+'. Message: '+error_message()
END CATCH

Note how we have declared the XML variable. The schema reference is specified in parentheses after the variable. We have also used the new TRY…CATCH error handling syntax of SQL Server 2005 to capture the error and display the corresponding message.

What is covering index ?

The term covering index does not mean a separate kind of index having a different internal structure. Rather, this term is used to describe a certain technique that is used to improve performance. It is discussed here because, as will be shown later, for most practical purposes, a clustering index can be thought of as a covering index.
Let's consider a simple but typical example. Suppose that we need to join two tables—Master, defined as (MasterId int primary key, Data int), and Detail, defined as (DetailId int primary key, Data int, MasterId int)—and select columns (MasterID, Detail.Data) where Master.Data satisfies a certain condition. The condition can be anything that is built directly on the value of Master.Data—for example, Data = @data, Data between @start and @end, or Data in (1, 2, 3). Let's also assume that, besides primary key indexes, the Master table has an index idxMaster built on (Data, MasterID), and the Detail table has an indexidxDetail built on (MasterID). The query plan will look as shown in Figure

What does that mean, exactly? Let's analyze it step by step:

The query engine searches idxMaster, to find the entries that satisfy our condition for Master.Data, and it builds a list of corresponding Master.MasterIDvalues. This is fast, because:
- Index pages are sorted by Master.Data, so the engine can jump directly to leaf index pages that satisfy our condition.
- Those leaf index pages already contain values of Master.MasterId, so there is no need to jump somewhere else to read them.
The engine selects entries from idxDetail having values of Detail.MasterID that match the list of Master.MasterID built during Step 1. This can be done by simply looping through the list of Master.MasterID and looking up each corresponding entry in idxDetail, or by using more complicated hash or mergealgorithms. The query engine determines which algorithm to use in each particular case, based on the relative cost of each algorithm. This, in turn, depends on the following criteria:
- Selectivity of the index
- Estimated number of lookups
- Estimated number of rows to find.
For the low relative values of the second and third criteria, the loop algorithm is preferred, whereas for higher values, the hash/merge algorithms become more efficient. The output of this step is a list of (MasterID, Detail:FID:PN:RN) values (see Figure 1).
A list of heap pointers (FID:PN:RN), also known as bookmarks, is used to read randomly located heap data pages of the Detail table, and to retrieve values of the Data field in order to compile the requested result set (MasterID, Detail.Data). This step is called bookmark lookup, and, in our case, it is the most expensive one. The cost of bookmark lookup can be anywhere from 90 to 99 percent of the total query cost. This means that eliminating this step can improve the performance ten-fold or even one-hundred-fold. The cost of bookmark lookup depends on the fragmentation of data pages, the number of data pages to read, that availability of read-ahead optimization, the IO-to-CPU cost ratio, and other factors.

Please note that, in this example, bookmark lookup has only been performed on the Detail table, not on the Master table. This is because the required output fieldMasterID was a part of the index. This leads us to the definition of covering index: A covering index is the index that contains all output fields required by the operation performed on that index. Please note that if the Master table contained more fields, and at least one of these fields was present in the SELECT clause, the query engine would have to perform bookmark lookup on the Master table, and therefore idxMaster could not serve as a covering index anymore. That is why it is not possible to tell whether the index is covering without taking into consideration the query it is used in, and even the particular execution plan.

It is very easy now to devise a way to improve the performance of our query. Adding Detail.Data as a second field to idxDetail eliminates the need for bookmark lookup and considerably improves the performance.

Good candidates for covering indexes are junction tables (tables that exist to resolve many-to-many relationships). In the simplest case of a many-to-many relationship between T1 and T2, it is enough to have two indexes, (T1_Id, T2_Id) and (T2_Id, T1_Id), in order to save the query engine from ever doing bookmark lookups on this table.

What is GETDATE() ,SYSDATETIME(), GETUTCDATE(), SYSUTCDATETIME() ?

GETDATE() : Its precision is in milliseconds.
SYSDATETIME() : Its precision is in nanseconds.
GETUTCDATETIME() : Returns data as UTC time ( Coordinated Universal Time ) and its precision is in milliseconds.
SYSUTCDATETIME() : Returns data as UTC time ( Coordinated Universal Time ) and its precision is in nanoseconds..

What is Bookmark Lookups in SQl Sever ?

One of the major overheads associated with the use of non-clustered indexes is the cost of bookmark lookups. Bookmark lookups are a mechanism to navigate from a non-clustered index row to the actual data row in the base table (clustered index) and can be very expensive when dealing with large number of rows.

When a small number of rows are requested by a query, the SQL Server optimizer will try to use a non-clustered index on the column or columns contained in the WHERE clause to retrieve the data requested by the query. If the query requests data from columns not contained in the non-clustered index, SQL Server must go back to the data pages to obtain the data in those columns. It doesn’t matter if the table contains a clustered index or not, the query will still have to return to the table or clustered index to retrieve the data.

Bookmark lookups require data page access in addition to the index page access needed to filter the table data rows. Because this requires the query to access two sets of pages instead of only one, the number of logical READS performed by the query will increase. If the data pages are not in the buffer cache, additional I/O operations will be required. And in the case of most large tables, the index page and the corresponding data pages are not usually located close to each other on the disk.

These additional requirements for logical READS and physical I/O can cause bookmark lookups to become quite costly. While this cost may be acceptable in the case of small result sets, this cost becomes increasingly prohibitive in the case of larger and larger result sets. In fact, as the result sets become larger and larger, the optimizer may consider the costs of the bookmark lookups to be too much and discard the non-clustered index and simply perform a table scan instead.

Example of a Bookmark Lookup
SET STATISTICS PROFILE ON
GO

USE pubs
GO

--Find phone number for White, Johnson
SELECT phone
FROM dbo.authors
WHERE au_lname = 'White'
AND au_fname = 'Johnson'
GO

Execution Plan (abridged)
Rows Executes StmtText
----------- ----------- ----------------------------------------------------------------------------------
1 1 SELECT [phone]=[phone] FROM [dbo].[authors]
1 1 |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors]))
1 1 |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind

Because both the au_lname and the au_fname are contained in a non-clustered index, the optimizer can use the non-clustered index to filter the rows contained in the table to return only the phone number requests. However, because the phone column in the authors table is not contained in the index or another non-clustered index, the optimizer must return to the authors table in order to return the matching phone number creating a bookmark lookup.

Finding the offending column(s)
In order to resolve the bookmark lookup, you must find the column or columns that cause the bookmark lookup. To find offending columns look for the index usage in the execution plan to find what index is utilized by the optimizer for the query.

Execution Plan (abridged)
StmtText
----------- ----------- ----------------------------------------------------------------------------------
|--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]), SEEK:([authors].[au_lname]='White' AND [authors].[au_fname]='Johnson') ORDERED FORWARD)

In this case we see that the authors.aunmind index is being used by the optimizer for the query. A quick check of the columns included in the index using sp_helpindex on the authors table will show that the index consists of the au_lname and au_fname columns.

Index_name index_description index_keys
aunmind nonclustered located on PRIMARY au_lname, au_fname

A review of the execution plan OutputList column reveals that the phone column is only remaining column being requested by the query.

Execution Plan (abridged)
OutputList
----------------------------------------------------------------------------------
[authors].[phone]

Since the phone column is not in the index, you can deduct that the phone column is the offending column in this case.

Resolving bookmark lookups
Once you discover the columns responsible for a bookmark lookup, you will need to consider one of four methods that are available to resolve the bookmark lookup.

Create a covering index
Remove the offending column
Convert a non-clustered index into a clustered index

Create a covering index
Given the example listed earlier in this section, if the following covering index had been created, the result would be the removal of the bookmark lookup from the execution plan.

CREATE NONCLUSTERED INDEX ncl_authors_phone ON authors(au_lname, au_fname, phone)
GO

Execution Plan (Abridged)
SELECT [phone]=[phone] FROM [dbo].[authors] WHERE [au_lname]=@1 AND [au_fname]=@2 |--Index Seek(OBJECT:([pubs].[dbo].[authors].[ncl_authors_phone]), SEEK:([authors].[au_lname]=[@1] AND [authors].[au_fname]=[@2]) ORDERED FORWARD)

Remove the offending column
In the simple query below, the developer returned all the columns from the authors table when all the query asked for was the ID of the author.

SET STATISTICS PROFILE ON
GO

USE pubs
GO

--Find ID number for White, Johnson
SELECT *
FROM dbo.authors
WHERE au_lname = 'White'
AND au_fname = 'Johnson'
GO

Execution Plan (abridged)
StmtText
----------- ----------- ----------------------------------------------------------------------------------
SELECT * FROM [dbo].[authors] WHERE [au_lname]=@1 AND [au_fname]=@2 |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[authors])) |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]), SEEK:([authors].[au_lname]='White' AND [authors].[au_fname]='Johnson') ORDERED FORWARD)

Removing the additional, unneeded columns and only returning the au_id column will remove the bookmark lookup in this case

SET STATISTICS PROFILE ON
GO

USE pubs
GO

--Find ID number for White, Johnson
SELECT au_id
FROM dbo.authors
WHERE au_lname = 'White'
AND au_fname = 'Johnson'
GO

Execution Plan (abridged)
StmtText
----------- ----------- ----------------------------------------------------------------------------------
SELECT [au_id]=[au_id] FROM [dbo].[authors] WHERE [au_lname]=@1 AND [au_fname]=@2 |--Index Seek(OBJECT:([pubs].[dbo].[authors].[aunmind]), SEEK:([authors].[au_lname]=[@1] AND [authors].[au_fname]=[@2]) ORDERED FORWARD)

Bookmark lookups are often caused by additional columns being returned in the data set “just in case” they are needed at a later date. Developers should strive to only include columns in their result sets which are needed for the defined query requirements. Additional columns can always be added at a later date.

Convert a non-clustered index into a clustered index
When developers are faced with bookmark lookups that cannot be removed with the other choices described above, an alternative choice would be to convert an existing index being used by the query into a clustered index. Converting an existing index into a clustered index will place all the columns of the table in the index and prevent the need for a bookmark lookup.

SET STATISTICS PROFILE ON
GO

USE pubs
GO

--Find information for White, Johnson
SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12))
FROM dbo.employee
WHERE emp_id = 'PMA42628M'
GO

Execution Plan (abridged)
StmtText
----------- ----------- ----------------------------------------------------------------------------------
SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12)) FROM dbo.employee
WHERE emp_id = 'PMA42628M'
|--Compute Scalar(DEFINE:([Expr1002]=[employee].[fname]+' '+[employee].[lname]+' Hire Date:
'+Convert([employee].[hire_date])))
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([pubs].[dbo].[employee]))
|--Index Seek(OBJECT:([pubs].[dbo].[employee].[PK_emp_id]),
SEEK:([employee].[emp_id]='PMA42628M') ORDERED FORWARD)

To resolve the bookmark lookup, the developer can change the existing clustered index on the lname, fname, and minit columns into a non-clustered index.

--change original clustered index into a non-clustered index
DROP INDEX employee.employee_ind
GO

CREATE INDEX employee_ind ON employee(lname,fname,minit) 
GO

Once the clustered index has been changed into a non-clustered index, a new clustered index can be built on the emp_id column to resolve the bookmark lookup. In this particular case the emp_id is the PRIMARY KEY of the table, so instead of an index, the developer needs to recreate a clustered PRIMARY KEY.

--Create new clustered index
--Drop CONSTRAINT
ALTER TABLE employee
DROP CONSTRAINT PK_emp_id
GO

--Recreate CONSTRAINT
ALTER TABLE employee
ADD CONSTRAINT PK_emp_id PRIMARY KEY  CLUSTERED  (emp_id) 
GO

--Test removal of bookmark lookup 
--Find information for White, Johnson
SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12)) 
FROM dbo.employee
WHERE emp_id = 'PMA42628M'
GO

Execution Plan (abridged) 
StmtText                                                                             
----------- ----------- ---------------------------------------------------------------------
SELECT fname + ' ' + lname + ' Hire Date: ' + CAST(hire_date AS VARCHAR(12))  FROM dbo.employee
                                                                 WHERE emp_id = 'PMA42628M'
  |--Compute Scalar(DEFINE:([Expr1002]=[employee].[fname]+' '+[employee].[lname]+' Hire Date: 
                                                   '+Convert([employee].[hire_date]))) 
       |--Clustered Index Seek(OBJECT:([pubs].[dbo].[employee].[PK_emp_id]), 
                           SEEK:([employee].[emp_id]='PMA42628M') ORDERED FORWARD)

While converting a non-clustered index into a clustered index is a possible solution to bookmark lookups, often applications depend on the current clustered index and this solution will be almost impossible to implement in the real world.

Thursday, 17 May 2012

Bind asp GridView using jquery.

I found a very interesting article define how to bind asp gridview using jqyery ajax() method. It fetched data in the Json format and then binds the gridview.
For more details click here..

Enjoy jQuery..

Tuesday, 15 May 2012

Fun Queries ..!!

Run the below queries in the SQL Server Mgmt studio and see the result..make sure you run after any select statement

select @@ERROR

select @@ROWCOUNT

select @@CPU_BUSY

select @@LANGUAGE

select @@SERVERNAME

select @@SPID

select @@PROCID

select @@TEXTSIZE

select @@VERSION

TRY CATCH Block in SQL Server ?

I found a very good topic on TRY CATCH block usage in SQL Server ..
Please click here to read....

Thanks

What is STUFF and REPLACE Functon in SQL Server ?

STUFF() can be used to stuff a string into another string. It inserts the string at a given position, and deletes the number of characters specified from the original string.

DECLARE @string1 VARCHAR(20) = 'Microsoft Server'

DECLARE @string2 VARCHAR(20) = 'SQL Server 2005'

SELECT @string1 + ' -> ' + STUFF(@string1, 11, 0, 'SQL ')

AS 'String 1',

@string2 + ' -> ' + STUFF(@string2, 15, 1, '8 R2')

AS 'String 2'

Result :

String 1 String 2
Microsoft Server -> Microsoft SQL Server SQL Server 2005 -> SQL Server 2008 R2

REPLACE() replaces all the specified characters with new characters.

DECLARE @string3 VARCHAR(35) = 'sql 2005, sql 2008, sql 2008 r2'

SELECT @string3, REPLACE(@string3,'sql','SQL')

Result Set:

———————————– ———————————–

sql 2005, sql 2008, sql 2008 r2 SQL 2005, SQL 2008, SQL 2008 r2

(1 row(s) affected)

However, it is not limited to same number of characters:

DECLARE @string3 VARCHAR(35) = '2008 R2'

SELECT @string3, REPLACE(@string3,'20','SQL Server 2')

Result Set:

————– ————————

2008 R2 SQL Server 208 R2

(1 row(s) affected)

What is Sql Injection ?

SQL injection is a technique used to take advantage of non-validated input vulnerabilities to pass SQL commands through a Web application for execution by a backend database. Attackers take advantage of the fact that programmers often chain together SQL commands with user-provided parameters, and can therefore embed SQL commands inside these parameters. The result is that the attacker can execute arbitrary SQL queries and/or commands on the backend database server through the Web application.

Details

Databases are fundamental components of Web applications. Databases enable Web applications to store data, preferences and content elements. Using SQL, Web applications interact with databases to dynamically build customized data views for each user. A common example is a Web application that manages products. In one of the Web application's dynamic pages (such as ASP), users are able to enter a product identifier and view the product name and description. The request sent to the database to retrieve the product's name and description is implemented by the following SQL statement.

SELECT ProductName, ProductDescription 
FROM Products 
WHERE ProductNumber = ProductNumber

Typically, Web applications use string queries, where the string contains both the query itself and its parameters. The string is built using server-side script languages such as ASP, JSP and CGI, and is then sent to the database server as a single SQL statement. The following example demonstrates an ASP code that generates a SQL query.

sql_query= "
SELECT ProductName, ProductDescription 
FROM Products 
WHERE ProductNumber = " & Request.QueryString("ProductID")

The call Request.QueryString("ProductID") extracts the value of the Web form variable ProductID so that it can be appended as the SELECT condition.

When a user enters the following URL:

http://www.mydomain.com/products/products.asp?productid=123

The corresponding SQL query is executed:

SELECT ProductName, ProductDescription 
FROM Products 
WHERE ProductNumber = 123

An attacker may abuse the fact that the ProductID parameter is passed to the database without sufficient validation. The attacker can manipulate the parameter's value to build malicious SQL statements. For example, setting the value "123 OR 1=1" to the ProductID variable results in the following URL:

http://www.mydomain.com/products/products.asp?productid=123 or 1=1

The corresponding SQL Statement is:

SELECT ProductName, Product Description
FROM Products
WHERE ProductNumber = 123 OR 1=1

This condition would always be true and all ProductName and ProductDescription pairs are returned. The attacker can manipulate the application even further by inserting malicious commands. For example, an attacker can request the following URL:

http://www.mydomain.com/products/products.asp?productid=123; DROP 
TABLE Products

In this example the semicolon is used to pass the database server multiple statements in a single execution. The second statement is "DROP TABLE Products" which causes SQL Server to delete the entire Products table.

An attacker may use SQL injection to retrieve data from other tables as well. This can be done using the SQL UNION SELECT statement. The UNION SELECT statement allows the chaining of two separate SQL SELECT queries that have nothing in common. For example, consider the following SQL query:

SELECT ProductName, ProductDescription 
FROM Products 
WHERE ProductID = '123' UNION SELECT Username, Password FROM Users;

The result of this query is a table with two columns, containing the results of the first and second queries, respectively. An attacker may use this type of SQL injection by requesting the following URL:

http://www.mydomain.com/products/products.asp?productid=123 UNION 
SELECT user-name, password FROM USERS

The security model used by many Web applications assumes that an SQL query is a trusted command. This enables attackers to exploit SQL queries to circumvent access controls, authentication and authorization checks. In some instances, SQL queries may allow access to host operating system level commands. This can be done using stored procedures. Stored procedures are SQL procedures usually bundled with the database server. For example, the extended stored procedure xp_cmdshell executes operating system commands in the context of a Microsoft SQL Server. Using the same example, the attacker can set the value of ProductID to be "123;EXEC master..xp_cmdshell dir--", which returns the list of files in the current directory of the SQL Server process.

Prevention

The most common way of detecting SQL injection attacks is by looking for SQL signatures in the incoming HTTP stream. For example, looking for SQL commands such as UNION, SELECT or xp_. The problem with this approach is the very high rate of false positives. Most SQL commands are legitimate words that could normally appear in the incoming HTTP stream. This will eventually case the user to either disable or ignore any SQL alert reported. In order to overcome this problem to some extent, the product must learn where it should and shouldn't expect SQL signatures to appear. The ability to discern parameter values from the entire HTTP request and the ability to handle various encoding scenarios are a must in this case.

Monday, 7 May 2012

What is DeadLock ?

A deadlock is a situation wherein two transactions wait for each other to give up their respective locks.

When this happens, the SQL Server ends the deadlock by automatically choosing one and aborting the process, allowing the other process to continue. The aborted transaction is rolled back and an error message is sent to the user of the aborted process. Generally, the transaction that requires the least amount of overhead to rollback is the transaction that is aborted.

eg.> Transaction A attempts to update table 1 and subsequently read/update data from table 2, whereas transaction B attempts to update table 2 and subsequently read/update data from table 1. In such situations, transaction A holds locks that transaction B needs to complete its task and vice versa; neither transaction can complete until the other transaction releases locks.

I found a very good artical regarding Deadlocks in SQL you can also enjoy the same here.

What is Blocking in SQL Server ?

Blocking in SQL Server is a scenario where one connection to SQL Server locks one or more records, and a second connection to SQL Server requires a conflicting lock type on the record or records locked by the first connection. This causes the second connection to wait until the first connection releases its locks. By default, a connection will wait an unlimited amount of time for the blocking lock to go away.
Blocking often happens on the server when the system is under heavy transactional workload. The way to resolver blocking is to identify the blocking statements which will then allow optimization of the clocking statements i.e. Re-writing T-SQL ,Indexing etc.

Tuesday, 1 May 2012

What is B-TREE in SQL Server ?

A B-tree is a method of placing and locating files (called records or keys) in a database (The meaning of the letter B has not been explicitly defined.) The B-tree algorithm minimizes the number of times a medium must be accessed to locate a desired record, thereby speeding up the process.

Indexes are organized as B-trees. Each page in an index B-tree is called an index node. The top node of the B-tree is called the root node. The bottom level of nodes in the index is called the leaf nodes. Any index levels between the root and the leaf nodes are collectively known as intermediate levels.

B-trees are preferred when decision points, called nodes, are on hard disk rather than in random-access memory (RAM). It takes thousands of times longer to access a data element from hard disk as compared with accessing it from RAM, because a disk drive has mechanical parts, which read and write data far more slowly than purely electronic media. B-trees save time by using nodes with many branches (called children), compared with binary trees, in which each node has only two children. When there are many children per node, a record can be found by passing through fewer nodes than if there are two children per node. A simplified example of this principle is shown below.

In a practical B-tree, there can be thousands, millions, or billions of records. Not all leaves necessarily contain a record, but at least half of them do. The difference in depth between binary-tree and B-tree schemes is greater in a practical database than in the example illustrated here, because real-world B-trees are of higher order (32, 64, 128, or more). Depending on the number of records in the database, the depth of a B-tree can and often does change. Adding a large enough number of records will increase the depth; deleting a large enough number of records will decrease the depth. This ensures that the B-tree functions optimally for the number of records it contains.

Monday, 30 April 2012

What is View in SQL ?

In SQL, a view is a virtual table based on the result-set of an SQL statement.

A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.

You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.

SQL CREATE VIEW Syntax

CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition

A View is a "Virtual Table". It is not like a simple table, but is a virtual table which contains columns and data from different tables (may be one or more tables). A View does not contain any data, it is a set of queries that are applied to one or more tables that is stored within the database as an object. After creating a view from some table(s), it used as a reference of those tables and when executed, it shows only those data which are already mentioned in the query during the creation of the View.

Use of a View

Views are used as security mechanisms in databases. Because it restricts the user from viewing certain column and rows. Views display only those data which are mentioned in the query, so it shows only data which is returned by the query that is defined at the time of creation of the View. The rest of the data is totally abstract from the end user.

Along with security, another advantage of Views is data abstraction because the end user is not aware of all the data in a table.

Views are used for several different reasons:

To hide data complexity. Instead of forcing your users to learn the T-SQL JOIN syntax you might wish to provide a view that runs a commonly requested SQL statement.
To protect the data. If you have a table containing sensitive data in certain columns, you might wish to hide those columns from certain groups of users. For instance, customer names, addresses and their social security numbers might all be stored in the same table; however, for lower level employees like shipping clerks, you can create a view that only displays customer name and address. You can grant permissions to a view without allowing users to query the underlying tables. There are a couple of ways you might want to secure your data:
1. Create a view to allow reading of only certain columns from a table. A common example of this would be the salary column in the employee table. You might not want all personnel to be able to read manager's or each other's salary. This is referred to as partitioning a table vertically and is accomplished by specifying only the appropriate columns in the CREATE VIEW statement.
2. Create a view to allow reading only certain rows from a table. For instance, you might have a view for department managers. This way, each manager can provide raises only to the employees of his or her department. This is referred to as horizontal partitioning and is accomplished by providing a WHERE clause in the SELECT statement that creates a view.
Enforcing some simple business rules. For example, if you wish to generate a list of customers that need to receive the fall catalog, you can create a view of customers that have previously bought your shirts during the fall.
Data exports with BCP. If you are using BCP to export your SQL Server data into text files, you can format the data through views since BCP's formatting ability is quite limited.
Customizing data. If you wish to display some computed values or column names formatted differently than the base table columns, you can do so by creating views.

Disadvantages

Even though views can be a great tool for securing and customizing data, they can be slow. Indeed, they are not any faster than the query that defines them. With SQL Server 2000, indexed views (also referred to as "materialized" views) are supported to overcome this limitation.

What is index in SQL Server ?

Index in sql is created on existing tables to retrieve the rows quickly.

When there are thousands of records in a table, retrieving information will take a long time. Therefore indexes are created on columns which are accessed frequently, so that the information can be retrieved quickly. Indexes can be created on a single column or a group of columns. When a index is created, it first sorts the data and then it assigns a ROWID for each row.

Syntax to create Index

CREATE INDEX index_name
ON table_name (column_name)

One of the most important routes to high performance in a SQL Server database is the index. Indexes speed up the querying process by providing swift access to rows in the data tables, similarly to the way a book’s index helps you find information quickly within that book. In this article, I provide an overview of SQL Server indexes and explain how they’re defined within a database and how they can make the querying process faster. Most of this information applies to indexes in both SQL Server 2005 and 2008; the basic structure has changed little from one version to the next. In fact, much of the information also applies to SQL Server 2000. This does not mean there haven’t been changes. New functionality has been added with each successive version; however, the underlying structures have remained relatively the same. So for the sake of brevity, I stick with 2005 and 2008 and point out where there are differences in those two versions.

Index Structures

Indexes are created on columns in tables or views. The index provides a fast way to look up data based on the values within those columns. For example, if you create an index on the primary key and then search for a row of data based on one of the primary key values, SQL Server first finds that value in the index, and then uses the index to quickly locate the entire row of data. Without the index, a table scan would have to be performed in order to locate the row, which can have a significant effect on performance.

You can create indexes on most columns in a table or a view. The exceptions are primarily those columns configured with large object (LOB) data types, such as image, text, and varchar(max). You can also create indexes on XML columns, but those indexes are slightly different from the basic index and are beyond the scope of this article. Instead, I'll focus on those indexes that are implemented most commonly in a SQL Server database.

An index is made up of a set of pages (index nodes) that are organized in a B-tree structure. This structure is hierarchical in nature, with the root node at the top of the hierarchy and the leaf nodes at the bottom

When a query is issued against an indexed column, the query engine starts at the root node and navigates down through the intermediate nodes, with each layer of the intermediate level more granular than the one above. The query engine continues down through the index nodes until it reaches the leaf node. For example, if you’re searching for the value 123 in an indexed column, the query engine would first look in the root level to determine which page to reference in the top intermediate level. In this example, the first page points the values 1-100, and the second page, the values 101-200, so the query engine would go to the second page on that level. The query engine would then determine that it must go to the third page at the next intermediate level. From there, the query engine would navigate to the leaf node for value 123. The leaf node will contain either the entire row of data or a pointer to that row, depending on whether the index is clustered or nonclustered.

Clustered Indexes

A clustered index stores the actual data rows at the leaf level of the index. Returning to the example above, that would mean that the entire row of data associated with the primary key value of 123 would be stored in that leaf node. An important characteristic of the clustered index is that the indexed values are sorted in either ascending or descending order. As a result, there can be only one clustered index on a table or view. In addition, data in a table is sorted only if a clustered index has been defined on a table.

Note: A table that has a clustered index is referred to as a clustered table. A table that has no clustered index is referred to as a heap.

Nonclustered Indexes

Unlike a clustered indexed, the leaf nodes of a nonclustered index contain only the values from the indexed columns and row locators that point to the actual data rows, rather than contain the data rows themselves. This means that the query engine must take an additional step in order to locate the actual data.

A row locator’s structure depends on whether it points to a clustered table or to a heap. If referencing a clustered table, the row locator points to the clustered index, using the value from the clustered index to navigate to the correct data row. If referencing a heap, the row locator points to the actual data row.

Nonclustered indexes cannot be sorted like clustered indexes; however, you can create more than one nonclustered index per table or view. SQL Server 2005 supports up to 249 nonclustered indexes, and SQL Server 2008 support up to 999. This certainly doesn’t mean you should create that many indexes. Indexes can both help and hinder performance, as I explain later in the article.

In addition to being able to create multiple nonclustered indexes on a table or view, you can also add included columns to your index. This means that you can store at the leaf level not only the values from the indexed column, but also the values from non-indexed columns. This strategy allows you to get around some of the limitations on indexes. For example, you can include non-indexed columns in order to exceed the size limit of indexed columns (900 bytes in most cases).

Index Types

In addition to an index being clustered or nonclustered, it can be configured in other ways:

Composite index: An index that contains more than one column. In both SQL Server 2005 and 2008, you can include up to 16 columns in an index, as long as the index doesn’t exceed the 900-byte limit. Both clustered and nonclustered indexes can be composite indexes.
Unique Index: An index that ensures the uniqueness of each value in the indexed column. If the index is a composite, the uniqueness is enforced across the columns as a whole, not on the individual columns. For example, if you were to create an index on the FirstName and LastName columns in a table, the names together must be unique, but the individual names can be duplicated.

A unique index is automatically created when you define a primary key or unique constraint:

Primary key: When you define a primary key constraint on one or more columns, SQL Server automatically creates a unique, clustered index if a clustered index does not already exist on the table or view. However, you can override the default behavior and define a unique, nonclustered index on the primary key.
Unique: When you define a unique constraint, SQL Server automatically creates a unique, nonclustered index. You can specify that a unique clustered index be created if a clustered index does not already exist on the table.

Covering index: A type of index that includes all the columns that are needed to process a particular query. For example, your query might retrieve the FirstName and LastName columns from a table, based on a value in the ContactID column. You can create a covering index that includes all three columns

To see demo please see Index demo

Also see very good explanation of the clustered and non- clustered index here