Jefferson Elias

SQL Server DateTime data type considerations and limitations

May 11, 2017 by

Introduction

As a DBA or SQL developer, we are used to dealing with data for many kinds of applications. In the life of a DBA, it’s common to collect values for miscellaneous aspects (like CPU, RAM or disk usage) and store these values into a dedicated table designed with a column that refers to the moment of collection.

Here is an example of such a table:

CollectionId CollectionTime CPU Usage RAM Usage Disk Reads Disk Writes Disk I/O

These timed collections provide the ability to create a set of charts with X axis as time and Y axis as the collected value or an aggregation of these values. Based on the frequency of collection, we might need to keep track of either the date only (for daily collection) or a more complete format till millisecond (or lower) because we would want to correlate that information with data from another collection.

So, based on the aim of the process, we have different date storage needs. That’s the reason why Microsoft provides different time data types that we can CONVERT to a string data type like VARCHAR in order to store or display it in a readable format.

You will find below a list of those data types:

Data type Comments
Time Only keeps track of nanoseconds in a day. There is no reference to a date.
Is stored using between 3 and 5 bytes.
Date Only keeps track of days, starting 01/01/01 to 31/12/9999.
Takes 3 bytes for storage.
SmallDateTime Can be used to store dates between 01/01/1900 and 06/06/2079.
Also allow to keep track of time information till seconds.
Takes 4 bytes for storage.
DateTime Is used to store date and time between 01/01/1753 to 31/12/9999.
Minimum unit of time it keeps is milliseconds with an accuracy of 3.33 ms.
Takes 8 bytes for storage.
DateTime2 Is the most complete data type that is a real combination of Date and Time data types.
For this reason, it takes between 6 and 8 bytes for storage.
DateTimeOffset Can be seen as a DateTime2 data type, but that takes timezone into account.

So, as you can see above, Microsoft offers many possibilities and depending on our needs, we might use one in preference to the others.

Most of us would choose DateTime as a suitable candidate because we might think that keeping down to the millisecond is enough for our collections. Those who step to that conclusion and don’t go to check its documentation page on Microsoft’s website won’t stand a chance to read following warning:

So, while this data type is still available, it can be considered as depreciated and other data types should be preferred to that one.

Despite this warning, DateTime data type seems to be (still) very often used by developers, even in newer applications, scripts or stored procedures (and I was not apart from them). The only reason that could sound a good excuse for this is to keep compatibility with SQL Server 2005 but, except in some situations (migration tool), it’s a false good reason because SQL Server 2005 is not supported anymore by Microsoft. Attentive guys would also notice that Microsoft even keeps using this data type as input to its built-in functions as DATEADD or DATEPART.

I agree that it is not because Microsoft fires a note telling to use another data type (and does not so) that we might follow his advises but my recent experience made me write this article to provide other reasons, so that every reader actually understand the fundamental reasons why Microsoft almost depreciated DateTime data type.

Actually, if you look closely at DateTime definition, you see an accuracy of 0.003 seconds. While this simple statement does not seem harmful, it can cause you headaches during development when you your application, script or stored procedure tackles the field of milliseconds…

This article will present some limitations that can be encountered when using DateTime data type and we will even see a trial to bypass these limitations that finally leads to the same conclusion: DateTime should be avoided in new developments.

In the following sections, we will first review the definition of DateTime data type and how it’s stored internally. Then we will present the situation that made me willing to share my experience and so write this article and review the steps I made to come to the final conclusion to change to DateTime2 data type.

DateTime data type considerations

Definition and storage overview

If we go on technet, Microsoft defines it as a date combined with a time of day with fractional seconds that is based on a 24-hour clock.

As explained above, this data type allows us to store dates from 01/01/1753 to 31/12/9999 and time information from 00:00:00.000 to 23:59:59.997. It’s internally represented as a 8 bytes value where first four bytes are used to represent the date part while last four are there for time part. So, we can see this representation as a floating-point value.

We could think that Microsoft stores date and time as two concatenated integer from a particular value. We could expect it’s not the case as that 3 milliseconds accuracy wouldn’t even exist as an the maximum value of an integer is 2 147 483 647, which is far bigger than the number of milliseconds in a day is 1000*60*60*24 = 86 400 000.

Anyway, the better way to know is to test and check. To do so, we will use CONVERT built-in function extensively.

Storage of the Date part of a DateTime

First, we will check the way SQL Server stores the date part of a DateTime.

Here is a sample result when running the following query. Instead of using GETDATE() function, I preferred taking a fixed value to be sure to run following statements and compare results.

This statement will give you back a 0x0000A76600F9C8AE value.

If we take up only the date value, with following query :

This query gives back the first 4 bytes of previous value: 0x0000A766. We deduct that 00F9C8AE is the time value for 15:09:26.128.

Now, what’s the corresponding integer value of the first four bytes?

And the value is 42854. Would this possibly be the number of days since 01/01/1900?

Running previous query will provide a YES answer.

There is one last question for this tour : does a 01/01/1900 value for a DateTime really imply a value of 0 in its storage?

If we run following query, we will get back a value of 0x0000.

Now, we can conclude that, actually, the date part of a DateTime is a signed integer from 01/01/1900.

Storage of the time part of a DateTime

Now, we will consider how SQL Server handles the time part of a DateTime. We will modify the second query in previous subsection and proceed the same way.

As expected, we get back following value: 0x00F9C8AE.

If we convert it to integer, what does it give?

It returns 16369838. Well this number does not seem to be the number of seconds since 00:00:00.000 as:

  • 15 hours = 540 000 000 ms
  • 9 minutes = 540 000 ms
  • 26 seconds = 26 000 ms
  • 128 milliseconds = 128 milliseconds
  • Total is : 54 566 128 milliseconds

If we divide the total above by 3.33, we will get a value near the one we got back from the last query. So, for an obscure reason, SQL Server « counts » by 3.33 milliseconds for DateTimes.

DateTime limitations by example

From now on, we will focus on practical cases where DateTime is a limitation and should absolutely be avoided.

First, let’s present the example that drove me to write this article.

My plan was to take back contents of a data source (log file, SQL Server Error Log, Extended Events…) and load it into a SQL Server table. When it would be done, I could generate reports on these data and even correlate each record with the results of the importation of another data source.

This process had to be designed so that it could run regularly. To do so, it would store timestamp value of last imported record at the end of execution. Once another run is launched, it would take back this timestamp and filter out data from its data source so that it only takes care of records after that timestamp.

This design is quite simple actually and, I’m pretty sure, is intensely used as it’s the basics of an incremental ETL process.

Now, let’s review some of the limitations that I faced.

DateTime transformation: Data alteration

The contents of the log was prepended with timing information with a millisecond precision. Even aware of the advice of Microsoft, I naively chose to ignore it and took the DateTime data type to store the information. But thanks to the accuracy, some of my data was different from the source…

You will find below a simple query that simulates the outcome of a load. We first define some string values representing a DateTime in milliseconds, using Dates CTE, and use this CTE to convert these string values to its DateTime data type version. It also compares source and generated DateTimes to provide either « success » or « data loss » outcome.

Here is a screen capture of query results :

It’s a 2/5 success for this simple example… Why?

Because of this 3.33 accuracy that actually leads to round values and only keep track of 3 values in milliseconds: 0, 3 and 7.

Incremental load: False comparison results

For the same reason as above, comparing two DateTimes with a precision to the milliseconds is not acceptable as you can see with the results of following query:

The results:

Workarounds won’t work

When we face a problem, it’s common to keep trying to make it work. We can be very inventive and we could imagine different solutions.

When I faced this DateTime problem, I figured out that comparison was not reliable when there was only one millisecond that separates two DateTimes. So I tried to run following statement:

As I so it did not give me the expected results, I even tried to manipulate a string version to add 1 millisecond with following code then compare.

Obviously, no matter the solution we can imagine, this would not work due to limitations and I came to conclusion that I had to read documentation.

Conclusion

It’s now time to conclude. In this article, we’ve seen that Microsoft provides several data types to store date and time information. Among them, there is DateTime data type. This data type is extensively used, even inside SQL Server, but it has by design some limitations that everyone need to be well aware of.

Due to that limitation, when we need a precision lower than seconds, it’s an absolute necessity to use DateTime2 over DateTime data type.

References


Jefferson Elias

Jefferson Elias

Living in Belgium, I obtained a master degree in Computer Sciences in 2011 at the University of Liege.

I'm one of the rare guys out there who started to work as a DBA immediately after his graduation. So, I work at the university hospital of Liege since 2011. Initially involved in Oracle Database administration (which are still under my charge), I had the opportunity to learn and manage SQL Server instances in 2013. Since 2013, I've learned a lot about SQL Server in administration and development.

I like the job of DBA because you need to have a general knowledge in every field of IT. That's the reason why I won't stop learning (and share) the products of my learnings.

View all posts by Jefferson Elias
Jefferson Elias
SQL Database development

About Jefferson Elias

Living in Belgium, I obtained a master degree in Computer Sciences in 2011 at the University of Liege. I'm one of the rare guys out there who started to work as a DBA immediately after his graduation. So, I work at the university hospital of Liege since 2011. Initially involved in Oracle Database administration (which are still under my charge), I had the opportunity to learn and manage SQL Server instances in 2013. Since 2013, I've learned a lot about SQL Server in administration and development. I like the job of DBA because you need to have a general knowledge in every field of IT. That's the reason why I won't stop learning (and share) the products of my learnings. View all posts by Jefferson Elias

232 Views