Ahmad Yaseen

SQL Server Data Type Conversion Methods and performance comparison

October 3, 2017 by

When you define SQL Server database tables, local variables, expressions or parameters, you should specify what kind of data will be stored in those objects, such as text data, numbers, money or dates. This attribute is called the Data Type. SQL Server provides us with a big library of system data types that define all types of data that can be used with SQL Server, from which we can choose the data type that is suitable for the data we will store in that object. You can also define your own customized user defined data type using T-SQL script. SQL Server data types can be categorized into seven main categories:

  • Exact Numeric types, such as INT, BIGINT and numeric.
  • Approximate Numeric types, such as FLOAT and REAL.
  • Date and Time types, such as DATE, TIME and DATETIME.
  • Character Strings, such as CHAR and VARCHAR.
  • Unicode Character Strings, such as NCHAR and NVARCHAR.
  • Binary Strings, such as BINARY and VARBINARY.
  • Other Data Types, such as CURSOR, XML and Spatial Geography Types.

When two expressions with different data types are combined together by an operator, they may not always play well together. They may be automatically, aka implicitly, converted into the suitable data type, making the generated combination better, without raising any errors. But you may face other cases, in which you need to manually, aka explicitly, perform a suitable data type conversion, in order to avoid the combination error. To perform the explicit data conversion, SQL Server provides us with three main conversion functions; CAST, CONVERT and PARSE. The main concern in this article is to compare the performance of these conversion methods and see which method is the best to use. Before going through the performance comparison between the three conversion functions, we will briefly describe these functions.

CAST

The cast function is an ANSI SQL-92 compliant function that is used to convert an expression from one data type to another. You need to provide the CAST function with the expression that will be converted and the target data type that the provided value will be converted to. Being supported by ANSI SQL Standard, it is a preferable choice to convert between the different data types, especially when this conversion query will be executed on different DBMS such as Oracle or MySQL. The syntax that is used for the CAST function is shown below:

CONVERT

The CONVERT functions is a non- ANSI SQL-92 compliant function that is used to convert the provided expression datatype to another data type with formatting. You need to provide the CONVERT function with the expression that will be converted and the target data type that the provided value will be converted to, in addition to the style. The style is an integer expression that specifies the format that the provided expression will be translated to using the CONVERT function, which is supported for some conversions, such as conversion between the text and date time values. Specifying the style helps in formatting the converted expression into a more readable format. The syntax that is used for the CAST function is shown below:

PARSE

The PARSE function is used to convert the provided string expression into the requested data type. You need to provide the PARSE function with the string value that will be converted, the target data type that the provided value will be converted to, in addition to the culture, which is an optional string that specifies the culture, any language supported by the .NET Framework, that the provided string value will be formatted in, with the current session language as the default culture value. The culture can be provided with the optional USING clause. The syntax that is used for the CAST function is shown below:

Performance Comparison

Let us see practically how we can use these three conversion functions, and the performance differences between it. For this purpose, we will create the below simple table and fill it with 20K records using the T-SQL script below:

The table is ready now. In our demo here, we will perform the conversion operations on three columns; EmpID, AvgValue, and EmpDate. And for each column, we will do the conversion three times, using the three conversion methods described previously. We will start with converting the EmpID column into INT data type using the CAST, CONVERT and PARSE commands after enabling the TIME statistics for performance comparison purposes using the T-SQL script below:

The three commands will convert the EmpID column into INT successfully and will return the same result. Checking the TIME statistics result, you will see that the CAST conversion function has the least execution time and the PARSE function has the largest execution time, which is about 3 times the execution time for the CAST function, in addition to the large CPU time consumed by the PARSE function, opposite to the other conversion functions that have no CPU time consumed. The TIME statistics is shown below:

Let us now move to converting the AvgValue column into DECIMAL data type. Again, we will use the three conversion functions described previously, as shown in the T-SQL script below:

You can derive the same previous result from the TIME statistics below; the CAST function is the fastest function to convert the provided expression into the DECIMAL data type, and the PARSE function is the slowest, which is 2 times the time consumed by the CAST function. Another thing to concentrate on is that the execution time for three functions increased when converting to DECIMAL data type compared with converting to INT data type. The TIME statistics is shown below:

What about converting to DATETIME data type? Converting the EmpDate column to DATETIME data type using the three conversion functions:

Extra overhead will be noticed from the execution time increase and the CPU time increase when converting to DATETIME data type using the three conversion functions, keeping the previous rank of the CAST function as the fastest and the PARSE as the slowest. What we need to concentrate on also here is the execution time of the PARSE function which is 12 times the time consumed by the CAST function, and the CPU time of it is about 120 times the CPU time consumed by the CAST function. The TIME statistics shown below tell you not to take the risk of using the PARSE function to convert into DATETIME data type:

Conversion Exception Handling

If you include the previous conversion functions within your script to perform data type conversion operations that may fail due to data issue, the query will throw an error and fail, rolling back the parent transaction that it runs inside. SQL Server 2012 introduced new conversion functions that have the same functionality as the previous three converting functions, in addition to data type conversion exception handling mechanism to handle the errors generated by invalid conversions. The three newly introduced conversion functions are: TRY_CAST, TRY_CONVERT and TRY_PARSE. The functions try to convert the provided expression to the requested data type. If the conversion succeeds, it will return the expression value after conversion. If an error occurs, it will return NULL value. The statements used for the three new conversion functions, which is very similar to the previous ones, are shown below:

Let us truncate the testing table and insert new two records; a normal record and another record with data that can’t be converted to the target data types, using the T-SQL script below:

Performing the same previous data conversion of the three columns using the CAST, CONVERT and PARSE functions:

All conversion queries will fail;

  • converting the EmpID into INT will fail due to the ‘K’ character in the second row,
  • converting the AvgValue into DECIMAL will fail due to the ‘i’ character in the second row
  • and converting the EmpDate into DATE TIME data type will fail due to the ‘F’ character in the second row.

You can see that the whole conversation operations failed due to the data issue in the second-row values, although the first row values can be converted with no issue. The generated error message is shown below:

Trying to perform the conversion operations using the new conversion commands; TRY_CAST, TRY_CONVERT and TRY_PARSE commands, using the T-SQL script below:

The query will be executed successfully; the correct data in the first row will be converted successfully with no issue, and converting the second row’s data will return NULL due to the invalid data conversation error. In this way, the data conversion error will be handled within the code without stopping the overall script. The returned result from the previous script will be as shown below:

Conclusion

SQL Server provides us with different ways to convert the data type of an expression into another data type. Starting from SQL Server 2012, three new functions are introduced, that can be also used for data types conversion, in addition to the data conversion exception handling mechanism that makes the data conversion process more optimal.

The available commands to perform that are; CAST, CONVERT and PARSE. In this article, we described briefly the differences between these three functions, how to use it and finally compare their performance. In our demo, that is performed using three different data types conversions, it can be found that CAST is the fastest function that can be used to convert the data type of the provided value, and the PARSE function is the slowest.

Useful Links


Ahmad Yaseen

Ahmad Yaseen

Ahmad Yaseen is a SQL Server database administration leader at Aramex International Company with a bachelor’s degree in computer engineering as well as .NET development experience.

He is a Microsoft Certified Professional with a good experience in SQL server development, administration, performance tuning, monitoring and high availability and disaster recovery technologies. Also, he is contributing with his SQL tips in many blogs

View all posts by Ahmad Yaseen
Ahmad Yaseen
SQL Database development

About Ahmad Yaseen

Ahmad Yaseen is a SQL Server database administration leader at Aramex International Company with a bachelor’s degree in computer engineering as well as .NET development experience. He is a Microsoft Certified Professional with a good experience in SQL server development, administration, performance tuning, monitoring and high availability and disaster recovery technologies. Also, he is contributing with his SQL tips in many blogs View all posts by Ahmad Yaseen

317 Views