Ranga Babu
database level collation

Overview of the Collate SQL command

October 22, 2019 by

In this article, we will review the Collate SQL command. First, let us see what collation in the SQL Server is.

Collation is a set of rules that tell database engine how to compare and sort the character data in SQL Server.

Collation can be set at different levels in SQL Server. Below are the three levels:

  • SQL Server level
  • Database level
  • Column level

SQL Server level

Collation setting at SQL Server level can be specified while installing SQL Server. Please refer to the below image:

COLLATE SQL command overview

The collation of the SQL Server is set to SQL_Latin1_General_CP1_CI_AS.

Break down of the collation setting is as below:

  • SQL – All SQL Server collations will have the prefix SQL
  • Latin1_General – represents the sort rule, CI means case insensitive and AS means accent sensitive

Execute the following T-SQL script to know the collation of the SQL Server instance:

You can also use the stored procedure sp_helpsort to know the collation of the SQL server instance.

Database level

By default, all system and user databases will inherit the collation setting that was specified at the SQL Server level during installation. The below image shows the default collation inherited from the server level when collation is not specified explicitly while creating a database. Here the collation of SQL Server instance is SQL_Latin1_General_CP1_CI_AS which is the same as the database:

collation setting at database level

We can also specify the collation of a database while creating the database using the Collate SQL command. Use the below T-SQL script which creates the database with collation SQL_Latin1_General_CP1_CS_AS:

Please refer to the below image that shows the database is created with the collation specified using the Collate SQL command:

database level collation

You can also modify the database collation after creating the database. In this case, use the below T-SQL scripts which modifies the collation setting of a database:

Execute the following T-SQL script to know the collation of the database. In this case, the “Sample” is the name of the database. Replace it with yours:

Column level

By default, a new character type column inherits the collation of the database unless you specify the collation explicitly while creating the table.

To create a column with a different collation you can specify collation using Collate SQL command. Please refer to the below T-SQL script which creates a column with the specified collation:

Collation setting can be specified on the below character column types:

  • varchar
  • nvarchar
  • char
  • nchar
  • text
  • ntext

To view the collation of a column on the table, execute the following stored procedure with the table name as a parameter:

column level collation

Examples

Let us create a table and insert sample data for the demo purpose:

Now let us query the Product table. Please refer to the below image it returns two rows and ignores case as the collation is case insensitive. i.e. SQL_Latin1_General_CP1_CI_AS:

select statement with out collate SQL command

Let’s explicitly specify the collation using the Collate SQL command in the Select statement:

Check out the below image that shows only one row in the result set as we have specified the collation in the select statement which is case sensitive:

select statement with collate sql command

Let’s execute the following query that returns all the three records as the collation specified is case insensitive and accent insensitive:

case insensitive and accent insensitive collation

Joining the two columns of a different collation

For the demo purpose, I am creating two tables Products and ProductDesc with different collation on the column Pcode. Use the below T-SQL script:

Let’s join these two tables on the column Pcode. It throws an error:

Msg 468, Level 16, State 9, Line 22
Cannot resolve the collation conflict between “SQL_Latin1_General_CP850_CI_AI” and “SQL_Latin1_General_CP1_CI_AS” in the equal to operation.

This is because the collation of the column Pcode in both tables are different:

joining two columns with different collation

To make the join work, you must use the Collate SQL command in the Select statement as shown in the below T-SQL script:

Tempdb collation

The collation of the tempdb will be the same as the collation of the SQL Server instance. In some cases, we may use different collations at the database level and server level. In this case, we may run to issues if we use temporary tables and making joins with the tables in the database.

For example, the collation of SQL Server instance is SQL_Latin1_General_CP1_CI_AS and I am creating a database with different collation, i.e. SQL_Latin1_General_CP437_BIN. Now create a table with a column of type varchar which will inherit the database collation SQL_Latin1_General_CP437_BIN:

Now create a temp table with a column of type varchar which will inherit the collation of tempdb database i.e. SQL_Latin1_General_CP1_CI_AS:

Now joining these tables on column Pcode will throw an error as the collations are different:

collation on temp table

In these cases, we must explicitly specify the collation using Collate SQL command while creating a temp table in stored procedures or T-SQL batch.

It is better to use the same collation at the database level and SQL Server level in order to avoid such issues while performing operations on temp tables or cross-database operations when the two databases have a different collation.

Conclusion

In this article, we explored the collation setting in SQL Server at different levels and querying tables using the Collate SQL command. In case you have any questions, feel free to ask in the comment section below.

Ranga Babu
168 Views