Data types#
This page covers Python's data types and the properties that distinguish them.
Basic datatypes#
The table below lists the basic data types built into Python. It is only a brief overview; see the specific page for each type for more information.
Data Type | Mutable | Collection | Ordered | Description |
---|---|---|---|---|
int | No | No | - | Integer values (e.g., 1, -10) |
float | No | No | - | Floating-point numbers (e.g., 3.14) |
str | No | Yes | Yes | Strings (e.g., "hello") |
bool | No | No | - | Boolean values (True or False) |
list | Yes | Yes | Yes | Lists (e.g., [1, 2, 3]) |
tuple | No | Yes | Yes | Tuples (e.g., (1, 2, 3)) |
dict | Yes | Yes | Yes (>=3.7) | Dictionaries (e.g., {"key": "value"}) |
set | Yes | Yes | No | Sets (e.g., {1, 2, 3}) |
frozenset | No | Yes | No | Immutable sets (e.g., frozenset([1, 2, 3])) |
NoneType | No | No | - | Represents the absence of a value |
The table above mentions many Python data types. The following sections describe more precisely the properties that distinguish them.
Mutable#
The defining feature of mutable data types is that their contents can be changed in place, without creating a new object.
The following example shows how to add another element to the Python list. The same list now has different contents - that’s why it’s mutable.
original_list = [1, 2, 3]
original_list.append(4)
original_list
[1, 2, 3, 4]
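As a small sketch of what "the same list" means here, the snippet below compares the object's `id` before and after the append and checks a second name bound to the same list. The names `numbers`, `alias`, and `same_object` are illustrative and do not come from the original example.

```python
# Appending mutates the existing list object; no new object is created.
numbers = [1, 2, 3]
id_before = id(numbers)
alias = numbers            # a second name bound to the same list object

numbers.append(4)

same_object = id(numbers) == id_before
print(same_object)         # True: the identity did not change
print(alias)               # [1, 2, 3, 4]: the alias sees the change too
```

Because both names refer to one object, a change made through either name is visible through the other.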
However, when you assign a new value to a variable that holds an immutable object, such as an integer, you are not changing the original integer object. Instead, the variable name is rebound to a different integer object. Integers are immutable in Python: once an integer object is created, its value cannot be changed.
The following example shows that each time you change the value of an integer variable (or a variable of any other immutable type), the name refers to a different object.
a = 5
print(id(a))
a = 7
print(id(a))
129449186738544
129449186738608
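Another way to observe immutability, sketched below, is to attempt an in-place modification and watch it fail. The variables `point`, `greeting`, and `errors` are made up for this illustration.

```python
# Immutable objects reject in-place modification with a TypeError.
point = (1, 2, 3)
greeting = "hello"

errors = []
for target in (point, greeting):
    try:
        target[0] = 0      # item assignment is not supported by tuple or str
    except TypeError:
        errors.append(type(target).__name__)

print(errors)              # ['tuple', 'str']

# "Changing" an immutable value really builds a new object.
shouted = greeting.upper()
print(greeting, shouted)   # hello HELLO
```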
Collection#
In Python, a “collection” refers to a group of multiple elements that are stored together and can be manipulated as a unit. Collections are fundamental data structures that allow you to manage and organize data efficiently.
Each collection type has its own time complexity for each operation. Check out “Time complexity” for more information.
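One possible way to test this property in code is the `collections.abc.Collection` abstract base class, which covers objects that support `len()`, iteration, and the `in` operator. The sketch below assumes that this is an acceptable working definition of "collection" for this page; the sample values are arbitrary.

```python
from collections.abc import Collection

samples = [[1, 2], (1, 2), {1, 2}, {"a": 1}, "ab", 42, 3.14, None]

# Map each sample's type name to whether it counts as a collection.
is_collection = {type(s).__name__: isinstance(s, Collection) for s in samples}
for name, flag in is_collection.items():
    print(f"{name:10} {flag}")

# Every collection supports these three operations.
basket = {"apple", "pear"}
print(len(basket))          # 2
print("pear" in basket)     # True
print(sorted(basket))       # ['apple', 'pear']
```

Under this definition `str` counts as a collection (of characters), while `int`, `float`, and `None` do not.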
Ordered#
Some collections in Python preserve the order of their elements and others do not.
An ordered collection keeps its elements in a defined order. The following example shows that a list retains the order specified when it was created, which is what makes it ordered.
print(['a', 'b', 'c', 'd', 'e'])
print(['e', 'd', 'c', 'b', 'a'])
['a', 'b', 'c', 'd', 'e']
['e', 'd', 'c', 'b', 'a']
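The same example with a set, which is unordered. A set ignores the order in which its elements were written and stores them according to their hash values, so both literals print in the same arrangement. For strings, that arrangement can differ between interpreter runs, because string hashing is randomized per process by default.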
print({'a', 'b', 'c', 'd', 'e'})
print({'e', 'd', 'c', 'b', 'a'})
{'c', 'd', 'a', 'b', 'e'}
{'c', 'd', 'a', 'b', 'e'}
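A short sketch of two practical consequences of ordering: equality comparison takes order into account for lists but not for sets, and a dictionary (Python 3.7 and later) iterates over its keys in insertion order. The names used here are illustrative only.

```python
# Order is part of a list's value, but not of a set's.
lists_equal = [1, 2, 3] == [3, 2, 1]
sets_equal = {1, 2, 3} == {3, 2, 1}
print(lists_equal)   # False
print(sets_equal)    # True

# A dict (Python 3.7+) iterates in insertion order.
ages = {}
ages["zoe"] = 31
ages["adam"] = 25
ages["mia"] = 40
key_order = list(ages)
print(key_order)     # ['zoe', 'adam', 'mia']
```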
Datetime#
Datetime data types are commonly used but can be tricky. This section covers the datetime data types in core Python. For details, check the corresponding page in the official Python documentation or the special page on this site.
The following classes are available for working with datetime data:
Class | Description |
---|---|
date | Represents a date (year, month, and day) without time information. |
time | Represents a time (hour, minute, second, microsecond) without any date. |
datetime | Combines both date and time information (year, month, day, hour, minute, second, microsecond). |
timedelta | Represents a duration, i.e., the difference between two dates or times. |
tzinfo | A base class for dealing with time zones, used to handle timezone conversions. |
timezone | A subclass of tzinfo |
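To give a feel for the time zone classes, here is a minimal sketch that converts a timezone-aware `datetime` from UTC to a fixed offset. The +03:00 offset and the variable names are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime: it carries its own UTC offset.
utc_moment = datetime(2022, 10, 20, 12, 0, tzinfo=timezone.utc)

# A fixed-offset time zone (the +03:00 offset is an arbitrary example).
plus_three = timezone(timedelta(hours=3))
local_moment = utc_moment.astimezone(plus_three)

print(local_moment.isoformat())     # 2022-10-20T15:00:00+03:00
print(local_moment == utc_moment)   # True: both denote the same instant
```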
The following cells show what are, in my experience, the most commonly used features of these classes:
Calculating periods between dates (time deltas).
Formatting dates to specific formats.
The following cell creates two date objects:
from datetime import date
begin = date(2022, 10, 20)
end = date(2022, 3, 4)
begin, end
(datetime.date(2022, 10, 20), datetime.date(2022, 3, 4))
We can easily compute the period between them just by using the - operator.
begin - end
datetime.timedelta(days=230)
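A `timedelta` also works in the other direction: it can be added to a date or converted to other units. The sketch below reuses the 230-day period from above; the names `start`, `period`, and `finish` are illustrative.

```python
from datetime import date, timedelta

start = date(2022, 3, 4)
period = timedelta(days=230)

# Adding a duration to a date yields another date.
finish = start + period
print(finish)                      # 2022-10-20

# A duration can be expressed in seconds or divided by another duration.
seconds = period.total_seconds()
full_weeks = period // timedelta(weeks=1)
print(seconds)                     # 19872000.0
print(full_weeks)                  # 32
```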
The following cell uses the strftime method to format the date as a string. Because a date object carries no time information, the time directives render as midnight.
begin.strftime('%a %d %b %Y, %I:%M%p')
'Thu 20 Oct 2022, 12:00AM'
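The reverse operation, parsing a string back into an object, is available as `datetime.strptime`. The sketch below uses a format made only of numeric directives so that the result does not depend on the locale; the format string and variable names are assumptions chosen for this example.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M"             # numeric directives only
moment = datetime(2022, 10, 20, 14, 30)

text = moment.strftime(fmt)        # datetime -> str
parsed = datetime.strptime(text, fmt)   # str -> datetime

print(text)                        # 2022-10-20 14:30
print(parsed == moment)            # True
```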
Subtypes#
Python supports the concept of subtyping. Formally, we say that a type T is a subtype of U if the following two conditions hold:
Every value of type T is also a valid value of type U.
Every operation (method or function) that can be performed on type U can also be performed on type T, with T maintaining all the guarantees of U.
Consider an example where T=bool and U=int. Anything you can do with an int you can also do with a bool.
print(sum([True, False, True, False]))
print(True**False)
2
1
You can check whether T is a subtype of U using the function issubclass(<T>, <U>), which returns True if T is a subtype of U.
print("Is bool subclass of int -", issubclass(bool, int))
print("Is list subclass of int -", issubclass(list, int))
Is bool subclass of int - True
Is list subclass of int - False
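As a complementary sketch, the inheritance chain behind this result can be inspected through the `__mro__` attribute, and `isinstance` applies the same relationship to values rather than types. The variable names are illustrative.

```python
# The method resolution order lists a class followed by its base classes.
bool_chain = bool.__mro__
print(bool_chain)            # (<class 'bool'>, <class 'int'>, <class 'object'>)

# isinstance follows the same subtype relationship for values.
true_is_int = isinstance(True, int)
one_is_bool = isinstance(1, bool)
print(true_is_int)           # True: every bool is an int
print(one_is_bool)           # False: not every int is a bool
```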
The following code generates a table in which element \(r_{ij}\) marks whether the type in the \(i\)-th row is a subtype of the type in the \(j\)-th column.
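import collections.abc
import numbers

from IPython.display import HTML

# Types to compare pairwise: concrete built-ins and abstract base classes.
my_types = [
    bool, int, float, complex, numbers.Number,
    list, bytearray, tuple, bytes, set, frozenset, dict,
    collections.abc.MutableSequence,
    collections.abc.Sequence, collections.abc.Set,
    collections.abc.Mapping
]


def cell_wrapper(content, color):
    # Wrap content in a centered table cell with the given background color.
    return f"<td style='background:{color};text-align:center'>{content}</td>"


# issubclasses[i][j] is the rendered cell for issubclass(my_types[i], my_types[j]);
# the diagonal (a type compared with itself) is shown as a gray dash.
issubclasses = [
    [
        (
            cell_wrapper('✓', "green")
            if issubclass(t1, t2)
            else cell_wrapper('x', "red")
        )
        if t1 != t2 else cell_wrapper('-', "gray")
        for t2 in my_types
    ]
    for t1 in my_types
]


def type_ecraniser(s):
    # Strip the "<class '...'>" wrapper from the string form of a type.
    replacements = {'<', '>', 'class', "'"}
    for symbol in replacements:
        s = s.replace(symbol, '')
    return s.strip()


# Header row: an empty corner cell followed by one column per type.
header = "".join([
    "<th>" + type_ecraniser(str(t)) + "</th>"
    for t in [""] + my_types
])
header = "<tr>" + header + "</tr>"

# Body rows: the row's type name followed by its comparison cells.
content = "".join([
    (
        "<tr>" +
        f"<td>{type_ecraniser(str(my_types[i]))}</td>" +
        "".join(row) +
        "</tr>"
    )
    for i, row in enumerate(issubclasses)
])
HTML(f"<table>{header + content}</table>")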
Type | bool | int | float | complex | numbers.Number | list | bytearray | tuple | bytes | set | frozenset | dict | collections.abc.MutableSequence | collections.abc.Sequence | collections.abc.Set | collections.abc.Mapping |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
bool | - | ✓ | x | x | ✓ | x | x | x | x | x | x | x | x | x | x | x |
int | x | - | x | x | ✓ | x | x | x | x | x | x | x | x | x | x | x |
float | x | x | - | x | ✓ | x | x | x | x | x | x | x | x | x | x | x |
complex | x | x | x | - | ✓ | x | x | x | x | x | x | x | x | x | x | x |
numbers.Number | x | x | x | x | - | x | x | x | x | x | x | x | x | x | x | x |
list | x | x | x | x | x | - | x | x | x | x | x | x | ✓ | ✓ | x | x |
bytearray | x | x | x | x | x | x | - | x | x | x | x | x | ✓ | ✓ | x | x |
tuple | x | x | x | x | x | x | x | - | x | x | x | x | x | ✓ | x | x |
bytes | x | x | x | x | x | x | x | x | - | x | x | x | x | ✓ | x | x |
set | x | x | x | x | x | x | x | x | x | - | x | x | x | x | ✓ | x |
frozenset | x | x | x | x | x | x | x | x | x | x | - | x | x | x | ✓ | x |
dict | x | x | x | x | x | x | x | x | x | x | x | - | x | x | x | ✓ |
collections.abc.MutableSequence | x | x | x | x | x | x | x | x | x | x | x | x | - | ✓ | x | x |
collections.abc.Sequence | x | x | x | x | x | x | x | x | x | x | x | x | x | - | x | x |
collections.abc.Set | x | x | x | x | x | x | x | x | x | x | x | x | x | x | - | x |
collections.abc.Mapping | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | - |
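Some check marks in this table do not come from ordinary inheritance. Abstract base classes such as `collections.abc.MutableSequence` and `numbers.Number` can accept "virtual" subclasses through registration, so `issubclass` returns True even though the abstract class is absent from the concrete type's `__mro__`. The sketch below illustrates this; the `Money` class is hypothetical and exists only for the demonstration.

```python
import numbers
from collections.abc import MutableSequence

# list passes the check, yet it does not inherit from MutableSequence.
list_is_subclass = issubclass(list, MutableSequence)
list_inherits = MutableSequence in list.__mro__
print(list_is_subclass)   # True
print(list_inherits)      # False


class Money:
    """Hypothetical class used only to demonstrate registration."""

    def __init__(self, cents):
        self.cents = cents


before = issubclass(Money, numbers.Number)
numbers.Number.register(Money)    # declare Money a virtual subclass
after = issubclass(Money, numbers.Number)
print(before, after)      # False True
```

Registration only affects `issubclass` and `isinstance` checks; it does not add any methods to the registered class.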