udatetime: a fast RFC3339 compliant date-time Python library
Working with date-time formats can be pretty upsetting because of the variety of formats people come up with. Date-times are used everywhere, not just in logging or in metadata on database entries, and they are pretty important. That’s why I encourage developers to use the ISO 8601 derived RFC3339 standard in their projects.
RFC3339 date-time: 2016-07-18T12:58:26.485897+02:00
The RFC3339 specification offers the following advantages:
- Defined date, time, timezone, date-time format
- 4 digit year
- Fractional seconds
- Human readable
- No redundant information like weekday name
- Simple specification
- Machine readable
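To make the "defined format" and "machine readable" points concrete, here is a minimal sketch of a shape check for an RFC3339 date-time using only Python's re module. The names RFC3339_SHAPE and looks_like_rfc3339 are made up for this illustration and are not part of any library. The check only covers the layout, not whether the values are in range.

```python
import re

# Shape of an RFC3339 date-time: full-date "T" partial-time time-offset.
# RFC3339_SHAPE and looks_like_rfc3339 are illustrative names only.
RFC3339_SHAPE = re.compile(
    r'[0-9]{4}-[0-9]{2}-[0-9]{2}'      # 4 digit year, month, day
    r'[Tt]'                            # date/time separator
    r'[0-9]{2}:[0-9]{2}:[0-9]{2}'      # hour, minute, second
    r'(?:\.[0-9]+)?'                   # optional fractional seconds
    r'(?:[Zz]|[+-][0-9]{2}:[0-9]{2})'  # "Z" or a numeric UTC offset
    r'\Z'                              # nothing may follow the offset
)


def looks_like_rfc3339(value):
    """Return True if value has the RFC3339 date-time layout.

    Only the layout is checked, not value ranges (month 13 would pass).
    """
    return RFC3339_SHAPE.match(value) is not None
```

With this sketch, a string such as '2016-07-18T12:58:26.485897+02:00' is accepted, while a two digit year or a space before the offset is rejected.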
Having a date-time standard is nice, but using Python’s datetime library to parse or format an RFC3339 date-time string, or even to create a datetime object in UTC or the local timezone, can be painful and slowwwww. That’s why I decided to implement a Python 2 library to deal with such tasks. The library is called udatetime and is available on GitHub or PyPI.
$ pip install udatetime
The goal of the library is to be fast and handy with RFC3339 date-time formatted strings. The average performance increase of udatetime compared to the equivalent datetime code is 76%. Because it uses the Python 2 CPython API and POSIX features, the library is currently only supported on POSIX systems and is not compatible with Python 3 or PyPy. I’m working on cross-platform and PyPy support. Help with the library is greatly appreciated.
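For a sense of what the plain datetime route involves, here is a minimal sketch of parsing an RFC3339 string with the standard library only, handling the UTC offset by hand. It is an illustration, not udatetime's implementation. It assumes Python 3 (for datetime.timezone), and the helper name parse_rfc3339_stdlib is hypothetical. It only accepts the exact layout used in this post: fractional seconds present and a numeric +HH:MM or -HH:MM offset.

```python
from datetime import datetime, timedelta, timezone


def parse_rfc3339_stdlib(value):
    """Parse 'YYYY-MM-DDTHH:MM:SS.ffffff+HH:MM' into an aware datetime.

    Hypothetical helper for illustration: it handles only this layout
    (fractional seconds required, numeric offset, no 'Z' suffix).
    """
    # Split off the trailing offset, e.g. '+02:00'.
    body, sign, offset = value[:-6], value[-6], value[-5:]
    if sign not in '+-':
        raise ValueError('expected a numeric UTC offset: %r' % value)

    # Parse the date and time without the offset.
    naive = datetime.strptime(body, '%Y-%m-%dT%H:%M:%S.%f')

    # Convert the offset by hand and attach it as a fixed timezone.
    hours, minutes = offset.split(':')
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    if sign == '-':
        delta = -delta
    return naive.replace(tzinfo=timezone(delta))
```

Even this narrow version needs string slicing, a separate offset conversion, and a second step to attach the timezone, which is the kind of overhead a single parsing call avoids.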
Benchmark
The benchmark setup is as follows.
from datetime import datetime
import udatetime

RFC3339_DATE = '2016-07-18'
RFC3339_TIME = '12:58:26.485897+02:00'
RFC3339_DATE_TIME = RFC3339_DATE + 'T' + RFC3339_TIME
RFC3339_DATE_TIME_DTLIB = RFC3339_DATE_TIME[:-6]  # datetime can't parse timezones through strptime
DATE_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'
DATETIME_OBJ = datetime.strptime(RFC3339_DATE_TIME_DTLIB, DATE_TIME_FORMAT)


def benchmark_parse():
    def datetime_strptime():
        datetime.strptime(RFC3339_DATE_TIME_DTLIB, DATE_TIME_FORMAT)

    def udatetime_parse():
        udatetime.from_string(RFC3339_DATE_TIME)

    return (datetime_strptime, udatetime_parse)


def benchmark_format():
    def datetime_strftime():
        DATETIME_OBJ.strftime(DATE_TIME_FORMAT)

    def udatetime_format():
        udatetime.to_string(DATETIME_OBJ)

    return (datetime_strftime, udatetime_format)


def benchmark_utcnow():
    def datetime_utcnow():
        datetime.utcnow()

    def udatetime_utcnow():
        udatetime.utcnow()

    return (datetime_utcnow, udatetime_utcnow)


def benchmark_now():
    def datetime_now():
        datetime.now()

    def udatetime_now():
        udatetime.now()

    return (datetime_now, udatetime_now)


def benchmark_utcnow_to_string():
    def datetime_utcnow_to_string():
        datetime.utcnow().strftime(DATE_TIME_FORMAT)

    def udatetime_utcnow_to_string():
        udatetime.utcnow_to_string()

    return (datetime_utcnow_to_string, udatetime_utcnow_to_string)


def benchmark_now_to_string():
    def datetime_now_to_string():
        datetime.now().strftime(DATE_TIME_FORMAT)

    def udatetime_now_to_string():
        udatetime.now_to_string()

    return (datetime_now_to_string, udatetime_now_to_string)
If you like, you can run the benchmark yourself by running the bench.py script from the repository.
The results of 1 million executions and 3 repeats look like this.
- benchmark_parse: datetime.strptime(RFC3339_DATE_TIME_DTLIB, DATE_TIME_FORMAT) vs udatetime.from_string(RFC3339_DATE_TIME)
- benchmark_format: DATETIME_OBJ.strftime(DATE_TIME_FORMAT) vs udatetime.to_string(DATETIME_OBJ)
- benchmark_now: datetime.now() vs udatetime.now()
- benchmark_utcnow: datetime.utcnow() vs udatetime.utcnow()
- benchmark_now_to_string: datetime.now().strftime(DATE_TIME_FORMAT) vs udatetime.now_to_string()
- benchmark_utcnow_to_string: datetime.utcnow().strftime(DATE_TIME_FORMAT) vs udatetime.utcnow_to_string()
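The bench.py script itself is not reproduced in this post. As a rough idea of how "1 million executions and 3 repeats" could be measured, here is a minimal sketch built on the standard timeit module. The helpers run_benchmark and percent_faster are hypothetical names for this illustration. The post does not say how the 76% average was computed, so percent_faster shows just one common definition (reduction in run time relative to the baseline), which is an assumption.

```python
import timeit
from datetime import datetime

DATE_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'


def datetime_now_to_string():
    # Standard library baseline, as in the benchmark setup.
    datetime.now().strftime(DATE_TIME_FORMAT)


def run_benchmark(funcs, number=1000000, repeat=3):
    """Time each callable and return {function name: best time in seconds}.

    Each callable is executed `number` times per run, and the run is
    repeated `repeat` times; the fastest run is kept.
    """
    results = {}
    for func in funcs:
        timings = timeit.repeat(func, number=number, repeat=repeat)
        results[func.__name__] = min(timings)
    return results


def percent_faster(baseline, candidate):
    """Reduction in run time relative to the baseline, in percent.

    Assumed definition for illustration; the original figure may have
    been computed differently.
    """
    return (baseline - candidate) / baseline * 100.0
```

Passing a pair such as the one returned by benchmark_now_to_string() to run_benchmark would give two timings that can then be compared with percent_faster.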