Python 3 type checking and data validation with type hints
Type hints are one of my favorite features in Python 3. Function annotations were introduced in Python 3.0 with PEP 3107 – Function Annotations, and were later built upon by PEP 484 – Type Hints in Python 3.5 and PEP 526 – Syntax for Variable Annotations in Python 3.6.
Before that (Python < 3), Python had no syntax for declaring the types of variables, classes, methods or functions. The type of a variable comes from the object the variable points to. Type declarations can still be useful when you need them, and they make your code more explicit. I work a lot on APIs and need to ensure that only validated data is accepted. This is where type hints come in handy, with a little extra work though, because even with annotated variables Python will not enforce those types out of the box. For static type checking of your code, have a look at mypy. There is also an Atom plugin called linter-mypy. This article covers a simple example of type checking at run time.
def addition(number: int, other_number: int) -> int:
    return number + other_number

mynumber: int = 1
myother_number: int = 1
print(addition(mynumber, myother_number))
# -> 2

mystring: str = 'Hello'
myother_string: str = ' World!'
print(addition(mystring, myother_string))
# -> Hello World!
The syntax is pretty straightforward and, as you can see, it’s possible to pass two strings into addition(int, int), which means type checking is not enforced. To change that, we need to find out whether a variable or a parameter is annotated, and with which type.
At the time of writing (Python 3.6.5) the syntax for variable annotations is implemented, but the function used to inspect annotations, get_type_hints(...), has not yet been enhanced to fully support PEP 526. That’s why I will not cover variable type inspection here.
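For reference, PEP 526 does record class-level (and module-level) annotations in an __annotations__ mapping, so they can at least be read directly. A minimal sketch, assuming a hypothetical Point class that is not part of the examples in this article:

```python
class Point:
    x: int             # annotated without a value
    y: int = 0         # annotated with a default value
    label = 'origin'   # not annotated, so it is missing from the mapping

# Read the recorded annotations directly from the class
print(Point.__annotations__)
# -> {'x': <class 'int'>, 'y': <class 'int'>}
```

Annotations on local variables inside a function are not stored anywhere, so this only helps for classes and modules.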
from typing import get_type_hints

def addition(number: int, other_number: int) -> int:
    return number + other_number

print(get_type_hints(addition))
# -> {'number': <class 'int'>, 'other_number': <class 'int'>, 'return': <class 'int'>}
The function get_type_hints(obj[, globals[, locals]]) inspects a module, class, method or function and returns its annotated parameter types as well as the return type. Based on this information we can validate the input arguments to addition(...).
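One detail worth knowing is that get_type_hints(...) also resolves annotations written as strings, while the raw __annotations__ attribute returns them unchanged. A minimal sketch, assuming a hypothetical scale function that is not part of the original example:

```python
from typing import get_type_hints

# Hypothetical function whose annotations are written as strings
def scale(value: "float", factor: "int" = 2) -> "float":
    return value * factor

raw = scale.__annotations__        # strings, exactly as written
resolved = get_type_hints(scale)   # strings evaluated to real classes

print(raw)
# -> {'value': 'float', 'factor': 'int', 'return': 'float'}
print(resolved)
# -> {'value': <class 'float'>, 'factor': <class 'int'>, 'return': <class 'float'>}
```

This is why the validation code in this article relies on get_type_hints(...) rather than on __annotations__: isinstance(...) needs real classes, not strings.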
from typing import get_type_hints

def validate_input(obj, **kwargs):
    hints = get_type_hints(obj)
    # iterate all type hints
    for attr_name, attr_type in hints.items():
        if attr_name == 'return':
            continue
        if not isinstance(kwargs[attr_name], attr_type):
            raise TypeError(
                'Argument %r is not of type %s' % (attr_name, attr_type)
            )

def addition(number: int, other_number: int) -> int:
    validate_input(addition, number=number, other_number=other_number)
    return number + other_number

print(addition(1, 2))
# -> 3
print(addition(1, '2'))
# -> TypeError: Argument 'other_number' is not of type <class 'int'>
validate_input(...) now ensures that only valid data types are passed into addition(...). However, calling validate_input(...) explicitly in every function is not very “Pythonic”; a decorator would be much nicer.
from typing import get_type_hints
from functools import wraps
from inspect import getfullargspec

def validate_input(obj, **kwargs):
    hints = get_type_hints(obj)
    # iterate all type hints
    for attr_name, attr_type in hints.items():
        if attr_name == 'return':
            continue
        if not isinstance(kwargs[attr_name], attr_type):
            raise TypeError(
                'Argument %r is not of type %s' % (attr_name, attr_type)
            )

def type_check(decorator):
    @wraps(decorator)
    def wrapped_decorator(*args, **kwargs):
        # translate *args into **kwargs
        func_args = getfullargspec(decorator)[0]
        kwargs.update(dict(zip(func_args, args)))
        validate_input(decorator, **kwargs)
        return decorator(**kwargs)
    return wrapped_decorator

@type_check
def addition(number: int, other_number: int) -> int:
    return number + other_number

print(addition(1, 2))
# -> 3
print(addition(1, '2'))
# -> TypeError: Argument 'other_number' is not of type <class 'int'>
Using a decorator improves readability and decouples the data validation step from the function logic. Maybe you noticed the step where *args are translated to **kwargs. As get_type_hints(...) returns a dict mapping argument names to data types, we need the same structure for the arguments passed into addition(...). If we just passed in *args, the arg -> value mapping would be lost and we wouldn’t know which value belongs to which argument.
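The same name -> value mapping can also be built with inspect.signature(...), which binds positional and keyword arguments the way a real call would. This is a sketch of an alternative and not what the decorator above does; the helper name to_kwargs is made up for illustration:

```python
from inspect import signature

def to_kwargs(func, *args, **kwargs):
    """Hypothetical helper: map every passed value to its parameter name."""
    # bind() raises TypeError if the arguments do not fit the signature
    bound = signature(func).bind(*args, **kwargs)
    return dict(bound.arguments)

def addition(number: int, other_number: int) -> int:
    return number + other_number

print(to_kwargs(addition, 1, 2))
# -> {'number': 1, 'other_number': 2}
print(to_kwargs(addition, 1, other_number=2))
# -> {'number': 1, 'other_number': 2}
```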
Of course, this kind of type checking is only a very simple example of how you could approach it. It doesn’t cover more complex scenarios like default values or nested data types.
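To give an idea of what handling those scenarios might look like, here is a rough sketch that fills in default values and checks a few nested hints. It assumes Python 3.8 or newer (for typing.get_origin and typing.get_args), and the names matches and total are hypothetical. It only understands plain classes, Union/Optional, List[...] and Dict[...]; functions that take *args or **kwargs are not handled.

```python
from functools import wraps
from inspect import signature
from typing import Any, List, Union, get_args, get_origin, get_type_hints

def matches(value, hint):
    """Hypothetical helper: True if value fits hint (deliberately limited)."""
    if hint is Any:
        return True
    origin = get_origin(hint)
    if origin is None:
        # plain class such as int or str
        return isinstance(value, hint)
    args = get_args(hint)
    if origin is Union:
        # Optional[X] is Union[X, None]; any member may match
        return any(matches(value, arg) for arg in args)
    if origin is list and args:
        return isinstance(value, list) and all(
            matches(item, args[0]) for item in value
        )
    if origin is dict and args:
        return isinstance(value, dict) and all(
            matches(k, args[0]) and matches(v, args[1])
            for k, v in value.items()
        )
    # any other generic: only check the container type itself
    return isinstance(value, origin)

def type_check(func):
    hints = get_type_hints(func)
    sig = signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()  # omitted parameters get their default values
        for name, value in bound.arguments.items():
            if name in hints and not matches(value, hints[name]):
                raise TypeError(
                    'Argument %r is not of type %s' % (name, hints[name])
                )
        return func(*bound.args, **bound.kwargs)
    return wrapper

@type_check
def total(numbers: List[int], start: int = 0) -> int:
    return sum(numbers, start)

print(total([1, 2, 3]))
# -> 6
print(total([1, 2], start=10))
# -> 13
```

Calling total([1, '2']) raises a TypeError because one list element is not an int. Note that checking every element makes validation cost grow with the size of the data, which may or may not be acceptable for an API.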