Dangerous implementations of time representation in today's programming languages

Many programming languages have some kind of object to represent time. In PHP it's DateTimeImmutable, in JavaScript it's Date, and Python has its datetime.datetime. All of them were designed so that a timezone is part of their time representation. And what's dangerous is that they expose it (mostly directly via a method).

In this article, I will explain why it poses a danger to you and propose a technique to avoid it.

en 8. 5. 2018
PHP

How it works now in most programming languages

I'll be using PHP as an example but this applies to many other languages as well.

PHP has the DateTimeImmutable object. It contains a representation of date and time, made up of datetime components (years, months, days, hours, minutes, seconds, microseconds) and a timezone. The values of the components are relative to that timezone.

This is what it looks like:

object(DateTimeImmutable)#1 (3) {
  ["date"]=> string(26) "2018-04-23 11:46:13.854892"
  ["timezone_type"]=> int(3)
  ["timezone"]=> string(10) "US/Pacific"
}

As everybody knows, you can change the timezone used in this representation by:

  1. Passing it as the second argument to the constructor
  2. Using the date_default_timezone_set('Europe/Prague') function
  3. Setting the directive date.timezone = "Europe/Prague" in your php.ini configuration file
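
As a rough illustration of the first two options (the datetime string is made up for this example), the same wall-clock value ends up denoting two different instants depending on which timezone lands in the object:

```php
<?php
// 1. Explicit timezone passed as the second constructor argument.
$explicit = new DateTimeImmutable('2018-04-23 11:46:13', new DateTimeZone('US/Pacific'));

// 2. Runtime default, used whenever no timezone is passed explicitly.
date_default_timezone_set('Europe/Prague');
$implicit = new DateTimeImmutable('2018-04-23 11:46:13');

// 3. The php.ini directive date.timezone provides the same default at startup;
//    date_default_timezone_set() overrides it for the running script.

// Same components, different timezones: the instants are 9 hours apart.
var_dump($explicit->getTimestamp() - $implicit->getTimestamp()); // int(32400)
```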

Having the timezone inside of the object is necessary. It's essential for comparison and conversion into the timestamp.
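
A small sketch of why the timezone is needed internally (the values are chosen for illustration): two objects with different components and different timezones compare as equal because they describe the same instant.

```php
<?php
$prague = new DateTimeImmutable('2018-05-05 16:45:55', new DateTimeZone('Europe/Prague'));
$utc    = new DateTimeImmutable('2018-05-05 14:45:55', new DateTimeZone('UTC'));

// Comparison takes the timezone into account, so these are the same moment.
var_dump($prague == $utc); // bool(true)

// The conversion to a Unix timestamp needs the timezone as well.
var_dump($prague->getTimestamp() === $utc->getTimestamp()); // bool(true)
```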

So far, so good. But things are about to get serious.

The problem

Contextual usage of timezone from the object

We have discussed how timezone works inside the datetime object to represent its value. But this object also exposes this information via the getTimezone() method. And this is dangerous.

The timezone alone, outside of the object, has no meaning. Whenever we use this information for a different purpose (outside of the DateTimeImmutable object), we rely on some contextual variable which can (and one day surely will) change. This leads to really nasty bugs.

This also applies to timezones in other representations, such as the ISO 8601 format.

Conclusion 1: We should never use the getTimezone() method in code. If a timezone has semantic meaning in our code, we process it separately from the datetime representation: we name the variable properly and store it on its own.
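
One possible shape for this, as a sketch only: the UserPreferences class and its displayTimeZone() method are hypothetical names invented for this example, not part of PHP or of any library.

```php
<?php
// Hypothetical domain object: the user's timezone is a named value of its own.
final class UserPreferences
{
    /** @var DateTimeZone */
    private $displayTimeZone;

    public function __construct(DateTimeZone $displayTimeZone)
    {
        $this->displayTimeZone = $displayTimeZone;
    }

    public function displayTimeZone(): DateTimeZone
    {
        return $this->displayTimeZone;
    }
}

// The instant of the event. Whatever timezone this object carries is an
// implementation detail and is never read back via getTimezone().
$orderedAt = new DateTimeImmutable('@1524509605');

// The meaningful timezone comes from the domain.
$preferences = new UserPreferences(new DateTimeZone('Europe/Prague'));
$shown = $orderedAt->setTimezone($preferences->displayTimeZone());

echo $shown->format('Y-m-d H:i:s'), PHP_EOL; // 2018-04-23 20:53:25
```

If the server default or the timezone inside $orderedAt changes, the output stays the same, because the timezone used for display is supplied explicitly.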

Hidden usage of timezone information

There is another dangerous method: format(). It exposes timezone information from the object in a much more dangerous way.

$date = new DateTimeImmutable(); // default timezone: Europe/Prague
echo $date->format('Y-m-d H:i:s');
// 2018-05-05 16:45:55

$date = $date->setTimezone(new DateTimeZone('UTC'));
echo $date->format('Y-m-d H:i:s');
// 2018-05-05 14:45:55

Whenever we use the format() method, we rely on the information about timezone inside of the object. As explained above, it has no meaning outside of the object. In this case we've represented time in the correct timezone only by chance. This creates space for bugs and makes localization of our application really painful.

In smaller applications, this is very common. And it only "works" because:

  • There is no need to use multiple time zones.
  • The application, the server and the database are all configured to use the same timezone.

But once we start scaling our application and some of the following scenarios happen:

  • The page is now visited by customers from different parts of the world
  • The complexity of the domain grows and new use cases become a thing
  • The application uses other sources of data, rather than just one MySQL database (external APIs, …)

it's very easy to forget something, somewhere… 🐛

Conclusion 2: Use the format() method in your code only once to create the formatting service. This service takes the desired timezone explicitly as an argument (as shown in the example below).

final class DateTimeFormatter
{
    public function formatInTimezone(DateTimeImmutable $dateTime, DateTimeZone $timeZone, string $format): string
    {
        // Convert to the explicitly requested timezone before formatting.
        $dateTimeInTimezone = $dateTime->setTimezone($timeZone);
        return $dateTimeInTimezone->format($format);
    }
}

By following these simple conclusions we can avoid bugs and scalability issues and get more robust code at little cost.

Not convinced?

Did you know this?

$date = DateTime::createFromFormat('U', '1524509605', new DateTimeZone('Europe/Prague'));
var_dump($date);

It will NOT create an object with the desired Europe/Prague timezone, but with UTC instead!

From the documentation:

The timezone parameter and the current timezone are ignored when the time parameter either contains a UNIX timestamp (e.g. 946684800) or specifies a timezone (e.g. 2010-01-28T15:00:00+02:00).

Besides the usual coding mistakes, this is how you end up with a totally unexpected timezone in the object.
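
A minimal sketch of one way around this, assuming the goal is simply an object in Europe/Prague: create the object from the timestamp first, then convert it explicitly with setTimezone(). The instant does not change, only the components.

```php
<?php
// The timezone argument is ignored here because the input is a Unix timestamp.
$fromTimestamp = DateTimeImmutable::createFromFormat(
    'U',
    '1524509605',
    new DateTimeZone('Europe/Prague')
);
echo $fromTimestamp->format('Y-m-d H:i:s P'), PHP_EOL; // 2018-04-23 18:53:25 +00:00

// Explicit conversion gives the components in the desired timezone.
$inPrague = $fromTimestamp->setTimezone(new DateTimeZone('Europe/Prague'));
echo $inPrague->format('Y-m-d H:i:s P'), PHP_EOL; // 2018-04-23 20:53:25 +02:00

// Both objects still describe the same instant.
var_dump($fromTimestamp == $inPrague); // bool(true)
```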

Proposal

This article proposes the following practices:

  • Never use the getTimezone() or format() methods on a DateTimeImmutable object. Introduce a service responsible for formatting datetime into an explicitly provided timezone.
  • Code MUST work correctly with any DateTimeImmutable containing any timezone.
  • Avoid MySQL Y-m-d H:i:s representations without a timezone that depend on server configuration. Use formats containing a timezone (or timestamp).
  • Always use ISO 8601 in your APIs. It's a standard and is human readable.
  • Keep timezone values that have meaning (for example timezone of the user) in your application domain separated.
  • Use UTC everywhere (servers, configuration) as it's not affected by daylight saving time.
  • Regardless of the previous point, your application MUST NOT rely on this setting.
  • There are some cases when you need to know the timezone to which the datetime components are related, because yes, a timezone's offset can change. [1]
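
To make the points about timezone-less strings and ISO 8601 more concrete, here is a small sketch. The server default is deliberately set to a "wrong" value, and the dates are invented for the example; DateTimeInterface::ATOM is PHP's built-in ISO 8601 compatible format constant.

```php
<?php
// Pretend the server default differs from what the data was written with.
date_default_timezone_set('US/Pacific');

// A string with an offset is unambiguous regardless of configuration.
$instant = new DateTimeImmutable('2018-05-05T16:45:55+02:00');

// Serialize for an API: convert explicitly, then emit the offset with the value.
$payload = $instant->setTimezone(new DateTimeZone('UTC'))->format(DateTimeInterface::ATOM);
echo $payload, PHP_EOL; // 2018-05-05T14:45:55+00:00

// The round trip keeps the instant intact.
var_dump($instant == new DateTimeImmutable($payload)); // bool(true)

// The same components without a timezone are read in the server default
// and silently become a different instant.
$ambiguous = new DateTimeImmutable('2018-05-05 16:45:55');
var_dump($instant == $ambiguous); // bool(false)
```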

Sources

  1. How to save datetimes for future events - (when UTC is not the right answer)