Gradual Type System
- Is Optional
- Be default, the type checker does not emit warnings for no type hints.
- Does not catch type errors in runtime
- Type hints are only checked by static type checkers, linters, and IDEs to raise warning.
- Does not enhance performance
Gradual Type Hint with Mypy
Mypy is a popular Python type checker. You can install Mypy with
1
pip install mypy
messages.py
1
2
3
4
5
6
7
8
9
def show_count(count, word):
if count == 1:
return f"1 {word}"
count_str = str(count) if count else 'no'
return f"{count_str} {word}s"
print(show_count(99, "bird")) # 99 birds
print(show_count(1, "bird")) # 1 bird
print(show_count(0, "bird")) # no birds
Type check messages.py
by
1
mypy messages.py
By default, mypy does not enforce type hints for functions.
Make Mypy More Strict
--disallow-untyped-defs
option--disallow-incomplete-defs
1 2 3 4 5
def show_count(count, word) -> str: if count == 1: return f"1 {word}" count_str = str(count) if count else 'no' return f"{count_str} {word}s"
Instead of providing command-line options, you can provide mypy.ini
file for settings.
1
2
3
python_version = 3.9
warn_unused_configs = True
disallow_incomplete_defs = True
A Default Parameter Value
The show_count
function in messages.py
does not allow plural noun such as “children”.
Let’s add a test code.
messages_test.py
1
2
3
def test_irregular() -> None:
got = show_count(2, 'child', 'children')
assert got == '2 children'
If we run mypy messages_test.py
, Mypy detects the error
Don’t forget to add the return type hint, otherwise Mypy will not check it
flake8 and blue for Code Style
A recommended code style for type hints are
- No space b/w the parameter name and the
:
, one space after the:
- Spaces on both sides of
=
that precedes a default param value
- Use tools like flake8 and blue to make enforce standardized code style.
Using None as Default
1
2
3
4
5
6
7
def show_count(count: int, singular: str, plural: str = '') -> str:
if count == 1:
return f'1 {singular}'
count_str = str(count) if count else 'no'
if not plural:
plural = singular + 's'
return f'{count_str} {plural}'
The plural
parameter has a default argument ''
.
However, in other contexts, especially when the optional parameter is a mutable type. Then, None
is the only sensible default.
To have None
as the default parameter,
1
2
3
4
from typing import Optional
def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:
...
Optional[str]
means plural can bestr
orNone
Types Are Defined by Supported Operations
- It’s more useful to consider the set of supported operations as the defining characteristic of a type
Simple Types and Classes
- Simple types can be directly used in type hints
int
float
str
bytes
- Concrete classes from the standard library, external packages, or user defined can be used in type hints
- ex)
def func(p: Person, d: Duck)
- ex)
Among classes, consistent-with is defined like subtype-of
- A subclass is consistent-with all its superclasses
One exception:
int
is consistent-withcomplex
“Practically beats purity” - There’s no nominal subtype relationship b/w
int
,float
, andcomplex
. However, PEP 484 declares thatint
is consistent-withfloat
, andfloat
is consistent-withcomplex
. It makes sense becauseint
implements all operations offloat
andint
implements additional ones such as&, |, <<
, etc. Fori=3
,i.real = 3
,i.imag = 0
.
Optional and Union Types
Optional[str]
is a shortcut forUnion[str, None]
: means it could bestr
orNone
Better Syntax for Optional & Union in Python 3.10
We can write
str | bytes
instead ofUnion[str, bytes]
since Python 3.10. The|
operator also works withisinstance
andissubclass
. (ex.isinstance(x, int | str)
)
Avoid functions that return
Union
typesThey put an extra burden on the user - forcing them to check the type of the returned value at runtime to know what to do with it!
Nested
Union
types have the same effect as a flattenedUnion
.
Union[A, B, Union[C, D, E]]
is the same asUnion[A, B, C, D, E]
Union
is more useful with types that are not consistent among themselves
Union[int, float]
is redundant sinceint
is consistent-withfloat
.- We can just use
float
to annotate the parameter since it acceptsint
as well.
Generic Collections
Generic types can be declared with type params to specify the type of items they can handle
1
2
def tokenize(text: str) -> list[str]:
reeturn text.upper().split()
list
can be parametrized to constrain the type of elements in ittokenize
returns alist
where every item is of typestr
- PEP 585 lists collections from the standard library accepting generic type hints. They only show collections that use the simplest form like
container[item]
list
set
frozenset
collections.deque
abc.Container
abc.Collection
abc.Sequence
abc.Set
abc.MutableSequence
abc.MutableSet
- We’ll see later types that support more complex type hints like
tuple
andmapping
For Python 3.7 and 3.8, you need a
__future__
import to make[]
notation work with built-in collections such aslist
.
1 2 3 from __future__ import annotations def tokenize(text: str) -> list[str]: return text.upper().split()For Python 3.5 and 3.6, you must use
typing
module
1 2 3 from typing import List def tokenize(text: str) -> List[str]: return text.upper().split()
Tuple Types
There’re 3 ways to annotate tuple types
- Tuples as records
- Use the
tuple
built-in and declare the types of the fields within[]
- Ex)
tuple[str, float, str]
to accept a tuple with city name, population, and country:('Seoul', 24.25, 'Korea')
- Use the
- Tuples as records with named fields
- To annotate a tuple with many fields, or specific types of tuple your code uses in different places, use typing.NamedTuple
1 2 3 4 5 6 7 8 9
from typing import NamedTuple PRECISION = 9 class Coordinate(NamedTuple): lat: float lon: float def geohash(lat_lon: Coordinate) -> str: return gh.encode(*lat_lon, PRECISION)
typeing.NamedTuple
is a factory fortuple
subclasses, soCoordinate
is consistent-withtuple[float, float]
but the reverse is not true sinceCoordinate
has extra methods added byNamedTuple
like._asdict()
or user-defined functions.- In practice, it’s allowed to pass
Coordinate
instace to the following function
1 2
def display(lat_lon: tuple[float, float]) -> str: ...
- Tuples as immutable sequences
- To annotate tuples with unspecified length that are used as immutable lists, you must specify a single type, followed by
...
.tuple[int, ...]
is a tuple withint
items- The ellipsis indicates that any number of elements >= 1 is acceptable
- (There’s no way to specify types for arbitrary length tuple of course)
- So
def func(x: tuple[Any, ...])
is the same asdef fun(x: tuple)
- To annotate tuples with unspecified length that are used as immutable lists, you must specify a single type, followed by
Generic Mappings
- Generic mapping types are annotated as
MappingType[KeyType, ValueType]
. - For Python >= 3.9, the built-in
dict
and the mapping types incollections
andcollections.abc
accept that notation. - For earlier versions, you must use
typing.Dict
and other mapping types fromtying
module.
Abstract Base Classes
“Be conservative in what you send, be liberal in what you accept” - Postel’s law, a.k.a. the Robustness Principle
1
2
3
from collections.abc import Mapping
def name2hex(name: str, color_map: Mapping[str, int]) -> str:
...
Ideally, a function should accept arguments of abstract types
- Ideally, a function should accept arguments of abstract types (or
typing
equivalent), not concrete types to give more flexibility to the caller - Using
abc.Mapping
, the caller can pass an instance ofdict
,defaultdict
,ChainMap
,UserDict
subclass or any other type that is a subtype-ofMapping
.
In contrast, let’s see what happens if the argument accepts a concrete type.
1
2
def name2hex(name: str, color_map: dict[str, int]) -> str:
...
- Now,
color_map
must be adict
or one of its subtypes such asdefaultdict
orOrderedDict
. - A subclass of
collections.UserDict
would not pass the type check such as Mypy since it’s not a subclass ofdict
. dict
andcollections.UserDict
are siblings - both are subclasses ofabc.MutableMapping
- Hence, it’s better to use
abc.Mapping
orabc.MutableMapping
in type hints, instead ofdict
- Moreover, if the
name2hex
does not mutatecolor_map
, then the most accurate type hint isabc.Mapping
- The caller doesn’t need to provide an object that implements methods like
setdefault
,pop
,update
- they’re part ofMutableMapping
interface , but not ofMapping
- “Be liberal in what you accept”
- The caller doesn’t need to provide an object that implements methods like
A function should return concrete types
1
2
def tokenize(text: str) -> list[str]:
return text.upper().split()
Iterable
The
typing.List
documentations recommends to useSequence
andIterable
for function parameter type hints
1
2
3
4
5
6
from collections.abc import Iterable
FromTo = tuple[str, str]
def zip_replace(text: str, changes: Iterable[FromTo]) -> str:
...
1
2
3
4
zip_replace(
"abc",
[('a', '4'), ('e', '5')]
)
FromTo
is a type alias fotuple[str, str]
(more readable)
Explicit TypeAlias in Python 3.10
PEP 613 introduced a special type
TypeAlias
to make the assignments that create type aliases more visible and easier to type check.
1 2 from typing import TypeAlias FromTo: TypeAlias - tuple[str, str]
Stub Files and the Typeshed Project
As of Python 3.10, the standard library has no annotations. But Mypy, PyCharm, etc can find the type hints in the Typeshet project, in the form of stub files, special source files with
.pyi
extension, that have annotated function and method signature just like header files in C.
abc.Iterable
vs abc.Sequence
1
2
3
4
from collections.abc import Sequence
def func1(sequence: Sequence[str]) -> int:
return len(sequence)
1
2
3
4
5
6
from collections.abc import Iterable
def func2(lst: Iterable[int]) -> int:
_sum = 0
for e in lst:
_sum += e
return _sum
Iterable
func1
must iterate over the entire Iterable to return a result.- Given an endless iterable such as
itertools.cycle
would consume all memory
- Given an endless iterable such as
- Despite this potential danger, it’s common to offer functions that accept
Iterable
to allow the caller to provide the input data as generator to save a lot of memory
Sequence
func2
must acceptSequence
because it must get thelen()
of the input.
Parameterized Generics and TypeVar
A parametrized generic is a generic type, written as list[T]
T
: Type variable that will be bound to a specific type with each usage
1
2
3
4
5
6
7
8
9
10
from collections.abc import Sequence
from random import shuffle
from typing import TypeVar
T = TypeVar('T')
def sample(population: Sequence[T], size: int) -> list[T]:
result = list(population)
shuffle(result)
return result[:size]
- In the example,
T
is a type variable. - If
population
is a sequence withint
, then the return type islist[int]
- If
population
is a sequence withstr
, then the return type islist[str]
Let’s take another example.
1
2
3
4
5
6
7
8
from collections import Counter
from collections.abc import Iterable
def mode(data: Iterable[float]) -> float:
pairs = Counter(data).most_common(1)
if len(pairs) == 0:
raise ValueError('no mode for empty data')
return pairs[0][0]
We have a problem here. Some users might want to find the mode among int
or other numerical types (Complex, etc).
Let’s improve it using TypeVar
.
1
2
3
4
5
6
7
8
9
10
11
from collections import Counter
from collections.abc import Iterable
from typing import TypeVar
T = TypeVar('T')
def mode(data: Iterable[T]) -> T:
pairs = Counter(data).most_common(1)
if len(pairs) == 0:
raise ValueError('no mode for empty data')
return pairs[0][0]
In the improved version, the type parameter T
can be any type including the unhashables which collections.Counter
cannot handle.
Hence, we need to restrict the possible types assigned to T
.
Restricted TypeVar
1
2
3
4
5
6
from typing import TypeVar
NumberT = TypeVar('NumberT', float, Decimal, Fraction)
def mode(data: Iterable[NumberT]) -> NumberT:
...
It’s better than before. But we still have a problem. Some users might want to pass str
or tuple
types to find the mode. Then, the name NumberT
is very misleading.
Bounded TypeVar
To solve that problem, we can use bounded TypeVar
1
2
3
4
5
6
7
from collections.abc import Iterable, Hashable
from typing import TypeVar
HashableT = TypeVar('HashableT', bound=Hashable)
def mode(data: Iterable[HashableT]) -> HashableT:
...
A bounded type variable will be set to the inferred type of the expression as long as the inferred type is consistent-with the boundary declared in the bound=...
keyword.
Static Protocols
In Python, a protocol definition, similar to interfaces in Go, is written as typing.Protocol
subclass.
- A protocol type is defined by specifying one or more methods
- And the type-checker verifies that those methods are implemented where that protocol type is required.
- In short, a protocol defines an interface that a type-checker can verify
Classes that implement a protocol do not need to inherit, register, or declare any relationship with the class that defines the protocol.
Let’s look at an example.
Suppose top(it, n)
is a function that returns the largest n
elements of the given iterable it
.
1
2
3
4
5
6
7
8
9
10
def top(series: Iterable[T], length: int) -> list[T]:
ordered = sorted(series, reverse=True)
return ordered[:length]
>>> top([4, 1, 5, 2, 6, 7, 3], 3)
[7, 6, 5]
>>> l = 'mango pear apple kiwi banana'.split()
>>> top(l, 3)
['pear', 'mango', 'kiwi']
Now the problem is: “How to constrain T?”
It cannot be any type such as Any
or object
because series
must work with sorted
.
sortes
actually acceptsIterable[Any]
, but it’s just because the optional paramkey
takes a compare function for sorting.- What if you give
sorted
a list of plain objects without providingkey
argument? It will throw an error. - The error message shows that
sorted
uses<
operator.
More specifically, T
type param should be limited to types that implement __lt__
.
We can deal with this problem using typing.Protocol
1
2
3
4
from typing import Protocol, Any
class SupportsLessThan(Protocol):
def __lt__(self, other: Any) -> bool: ...
- A protocol is a subclass of
typing.Protocol
- The body of the protocol has one or more method definitions, with
...
in their bodies.
A type T
is consistent-with a protocol P
if T
implements all the methods defined in P
, with matching type signatures
Now let’s use it to complete top
function.
1
2
3
4
5
6
7
8
9
10
from collections.abc import Iterable
from typing import TypeVar
from comparable import SupportsLessThan
LT = TypeVar('LT', bound=SupportsLessThan) # using bound TypeVar
def top(series: Iterable[LT], length: int) -> list[LT]:
ordered = sorted(series, reverse=True)
return ordered[:length]