Tensor#

Here is a description of the basic data type in PyTorch: torch.Tensor.

import torch
from math import prod

Create tensor#

Torch has tons of methods to create tensors. This page lists the ones I know of so far.


The most straightforward way is the torch.Tensor constructor. Called with a shape, it allocates an uninitialized tensor, so the values are whatever happened to be in memory.

torch.Tensor(3,2,5)
tensor([[[ 0.0000e+00,  0.0000e+00,  1.4013e-45,  0.0000e+00,  1.4013e-45],
         [ 0.0000e+00,  9.1084e-44,  0.0000e+00, -3.7852e+06,  3.3707e-41]],

        [[-7.6466e+07,  3.3707e-41,  4.4842e-44,  0.0000e+00,  4.4842e-44],
         [ 0.0000e+00,  6.3884e-27,  3.3703e-41,  0.0000e+00,  1.4013e-45]],

        [[ 1.3004e-42,  0.0000e+00,  1.1210e-43,  0.0000e+00,  6.4326e-27],
         [ 3.3703e-41,  4.2427e-08,  1.2964e+16,  2.1707e-18,  7.0952e+22]]])
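
In contrast, torch.tensor (lowercase) builds a tensor from existing data rather than allocating one by shape:

torch.tensor([[1., 2.], [3., 4.]])
tensor([[1., 2.],
        [3., 4.]])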

Dimensionality#

One of the most important properties of a tensor is its dimensionality. Torch provides a set of tools to manage tensor dimensionality; find out more on the dedicated page.

This section overviews the tools for working with the dimensionality of tensors. They are listed in the following table:

| Function/Method | Description |
|---|---|
| shape | Returns the shape (dimensions) of a tensor as a torch.Size (a tuple subclass). |
| reshape | Changes the shape of a tensor while preserving its data. The number of elements must remain the same. |
| transpose | Swaps two dimensions of a tensor. Useful for switching axes, e.g. turning rows into columns. |
| squeeze | Removes dimensions of size 1 from a tensor. |
| unsqueeze | Adds a dimension of size 1 at the specified position, effectively increasing the tensor's rank. |
| pad | Adds padding to a tensor along specified dimensions. Useful in tasks like image processing. |


The following cell shows the usage of torch.Tensor.shape to print the dimensionality of a tensor and torch.Tensor.reshape to change it.

test_tensor = torch.zeros(2, 5, 3)
print('torch.Tensor.shape', test_tensor.shape)
test_tensor.reshape(6, 5)
torch.Tensor.shape torch.Size([2, 5, 3])
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
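
For completeness, here is a quick sketch of the remaining tools from the table; note that pad lives in torch.nn.functional rather than on the tensor itself.

import torch.nn.functional as F

t = torch.zeros(2, 1, 3)
print(t.transpose(0, 2).shape)  # swap the first and last dimensions
print(t.squeeze(1).shape)       # remove the size-1 dimension
print(t.unsqueeze(0).shape)     # add a new leading dimension
print(F.pad(t, (1, 1)).shape)   # pad the last dimension by 1 on each side
torch.Size([3, 1, 2])
torch.Size([2, 3])
torch.Size([1, 2, 1, 3])
torch.Size([2, 1, 5])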

Indexing#

Indexing in torch supports all the classic concepts, just like in numpy or pandas, but there are some features specific to torch. Find out more on the dedicated page.


The following example shows the most basic indexing methods in Torch. Along the first dimension it takes all available elements, along the second it takes the slice 0:5:2, and along the last dimension it takes the elements given in a list.

dimensionality = (2, 6, 6)
experimental = torch.arange(prod(dimensionality)).reshape(dimensionality)

print("Original tensor")
print(experimental)
print("Sliced tensor")
print(experimental[:, 0:5:2, [5,1,4]])
Original tensor
tensor([[[ 0,  1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10, 11],
         [12, 13, 14, 15, 16, 17],
         [18, 19, 20, 21, 22, 23],
         [24, 25, 26, 27, 28, 29],
         [30, 31, 32, 33, 34, 35]],

        [[36, 37, 38, 39, 40, 41],
         [42, 43, 44, 45, 46, 47],
         [48, 49, 50, 51, 52, 53],
         [54, 55, 56, 57, 58, 59],
         [60, 61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70, 71]]])
Sliced tensor
tensor([[[ 5,  1,  4],
         [17, 13, 16],
         [29, 25, 28]],

        [[41, 37, 40],
         [53, 49, 52],
         [65, 61, 64]]])
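
Boolean masks work as well, just as in numpy: indexing with a mask returns a flattened tensor of the selected elements.

experimental[experimental > 60]
tensor([61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71])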

Element-wise operations#

There is a class of operations in PyTorch that are applied element by element; we'll call them element-wise operations.

The following table lists the most common ones together with their corresponding operators.

| Operation | Function | Operator |
|---|---|---|
| Addition | torch.add(tensor1, tensor2) | + |
| Subtraction | torch.sub(tensor1, tensor2) | - |
| Multiplication | torch.mul(tensor1, tensor2) | * |
| Division | torch.div(tensor1, tensor2) | / |
| Equality | torch.eq(tensor1, tensor2) | == |
| Inequality | torch.ne(tensor1, tensor2) | != |
| Greater Than | torch.gt(tensor1, tensor2) | > |
| Less Than | torch.lt(tensor1, tensor2) | < |
| Greater or Equal | torch.ge(tensor1, tensor2) | >= |
| Less or Equal | torch.le(tensor1, tensor2) | <= |
| Logical AND | torch.logical_and(tensor1, tensor2) | & |
| Logical OR | torch.logical_or(tensor1, tensor2) | \| |
| Logical XOR | torch.logical_xor(tensor1, tensor2) | ^ |
| Logical NOT | torch.logical_not(tensor) | ~ |
| Exponentiation | torch.pow(tensor1, tensor2) | ** |
| Square Root | torch.sqrt(tensor) | N/A |


As a brief review, consider two matrices \(A = [a_{ij}]_{n \times m}\) and \(B = [b_{ij}]_{n \times m}\).

A = torch.randint(-5,5, [4,5])
A
tensor([[-5, -2,  4,  0, -4],
        [-1, -4,  1, -5, -5],
        [-4,  1,  4, -5, -5],
        [-2,  2,  0,  1,  0]])
B = torch.randint(-5,5, [4,5])
B
tensor([[-1, -3,  1, -4, -4],
        [-4,  4, -4, -4, -3],
        [ 3, -2,  0, -1,  0],
        [ 4, -2,  0,  2,  1]])

By applying the + operator to the matrices we get the matrix \(\left[a_{ij} + b_{ij}\right]_{n \times m}\), i.e. the operation was applied element by element.

A + B
tensor([[-6, -5,  5, -4, -8],
        [-5,  0, -3, -9, -8],
        [-1, -1,  4, -6, -5],
        [ 2,  0,  0,  3,  1]])
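
Comparison operators are element-wise too: applying > to the same matrices yields the boolean matrix \(\left[a_{ij} > b_{ij}\right]_{n \times m}\).

A > B
tensor([[False,  True,  True,  True, False],
        [ True, False,  True, False, False],
        [False,  True,  True, False, False],
        [False,  True, False, False, False]])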

Broadcasting#

Element-wise operations in PyTorch support the incredibly convenient concept of broadcasting, allowing you to apply these operations to tensors of different shapes.


As an example, consider two tensors with different dimensionality, zero_tensor and arange_tensor.

zero_tensor = torch.zeros(3,4)
zero_tensor
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
arange_tensor = torch.arange(4)
arange_tensor
tensor([0, 1, 2, 3])

By applying the + operator to them, we get a result where arange_tensor was added element-wise to each row of zero_tensor.

zero_tensor + arange_tensor
tensor([[0., 1., 2., 3.],
        [0., 1., 2., 3.],
        [0., 1., 2., 3.]])
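
Broadcasting aligns trailing dimensions, so, for example, a column of shape (3, 1) and a row of shape (4,) combine into a (3, 4) matrix:

torch.arange(3).reshape(3, 1) + torch.arange(4)
tensor([[0, 1, 2, 3],
        [1, 2, 3, 4],
        [2, 3, 4, 5]])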

Algebraic operations#

The following table lists algebraic operations on torch.Tensor.

| Operation | Function | Description |
|---|---|---|
| Matrix Multiplication | torch.matmul() | Matrix multiplication |
| | tensor1 @ tensor2 | Matrix multiplication using the @ operator |
| Singular Value Decomposition | torch.svd() | Singular value decomposition (deprecated) |
| | torch.linalg.svd() | SVD with advanced options |
| Eigenvalues and Eigenvectors | torch.linalg.eig() | Compute eigenvalues and eigenvectors |
| | torch.eig() | Legacy variant (removed in recent PyTorch releases) |
| Matrix Inversion | torch.linalg.inv() | Matrix inversion |
| | torch.inverse() | Matrix inversion (alias of torch.linalg.inv) |
| Matrix Norms | torch.norm() | Compute the norm of a tensor (deprecated) |
| | torch.linalg.norm() | Norm with advanced options |
| Determinants | torch.det() | Compute the determinant of a matrix |
| | torch.linalg.det() | Determinant with advanced options |
| Matrix Trace | torch.trace() | Compute the trace of a matrix |
| Eigenvalues | torch.linalg.eigvals() | Compute the eigenvalues of a square matrix |
| Matrix Rank | torch.linalg.matrix_rank() | Compute the rank of a matrix |
| Cholesky Decomposition | torch.linalg.cholesky() | Cholesky decomposition |
| QR Decomposition | torch.linalg.qr() | QR decomposition |
| Solving Linear Systems | torch.linalg.solve() | Solve a system of linear equations |
| | torch.linalg.lstsq() | Solve a least-squares problem |
| Kronecker Product | torch.kron() | Compute the Kronecker product |
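
As a brief sketch of a couple of rows from the table, the following cells multiply a matrix by a vector and then solve the corresponding linear system \(Mx = v\).

M = torch.tensor([[2., 1.], [0., 3.]])
v = torch.tensor([[1.], [2.]])
M @ v
tensor([[4.],
        [6.]])
torch.linalg.solve(M, v)
tensor([[0.1667],
        [0.6667]])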

Data type#

Torch has its own system of data types. Here is a table that describes the available dtypes.

| Type | Description |
|---|---|
| torch.float16 / torch.half | 16-bit half-precision floating point |
| torch.float32 / torch.float | 32-bit single-precision floating point |
| torch.float64 / torch.double | 64-bit double-precision floating point |
| torch.int8 | 8-bit signed integer |
| torch.int16 / torch.short | 16-bit signed integer |
| torch.int32 / torch.int | 32-bit signed integer |
| torch.int64 / torch.long | 64-bit signed integer |
| torch.uint8 | 8-bit unsigned integer |
| torch.bool | Boolean type |
| torch.complex64 | 64-bit complex number (32-bit real and imaginary parts) |
| torch.complex128 | 128-bit complex number (64-bit real and imaginary parts) |


You can get the type of a tensor through its dtype attribute.

torch.Tensor(3, 3).dtype
torch.float32

Note that many functions in torch have a dtype parameter. By passing one of these torch dtype objects as the argument, we get a tensor of that specific dtype.

torch.tensor([1,2,3], dtype=torch.float16)
tensor([1., 2., 3.], dtype=torch.float16)
torch.zeros((3,3), dtype=torch.float16)
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float16)
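
An existing tensor can also be converted to another dtype with the to method (or shortcuts such as float() and long()):

torch.tensor([1, 2, 3]).to(torch.float64)
tensor([1., 2., 3.], dtype=torch.float64)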

In-place methods#

Some methods of torch.Tensor change the values of the tensor in place instead of returning a new tensor. Such methods typically have an underscore (_) at the end of their names.

| Method | Description |
|---|---|
| add_ | Adds the input tensor to the current tensor in place. |
| addcmul_ | Performs an element-wise multiplication of two tensors and adds the result to the current tensor in place. |
| addcdiv_ | Performs an element-wise division of two tensors and adds the result to the current tensor in place. |
| bernoulli_ | Fills the tensor with random draws from the Bernoulli distribution in place. |
| clamp_ | Clamps all elements of the tensor to the specified range in place. |
| copy_ | Copies data from another tensor into the current tensor in place. |
| div_ | Divides the current tensor by the input tensor in place. |
| fill_ | Fills the tensor with the specified value in place. |
| index_add_ | Adds values to the tensor at specified indices in place. |
| index_fill_ | Fills the tensor at specified indices with the given value in place. |
| index_copy_ | Copies values from another tensor into the current tensor at specified indices, in place. |
| masked_fill_ | Fills elements of the tensor where the mask is True with the specified value, in place. |
| masked_scatter_ | Copies values into the tensor where the mask is True, in place. |
| neg_ | Negates the tensor's values in place. |
| normal_ | Fills the tensor with random numbers from a normal distribution in place. |
| relu_ | Applies the ReLU activation function in place. |
| renorm_ | Renormalizes the tensor along a specified dimension in place. |
| scatter_ | Writes values into the tensor at indices specified by an index tensor, in place. |
| set_ | Sets the tensor's storage, size, and strides from another tensor, in place. |
| sigmoid_ | Applies the sigmoid function in place. |
| sub_ | Subtracts the input tensor from the current tensor in place. |
| t_ | Transposes the tensor in place (2D tensors only). |
| transpose_ | Transposes the tensor along specified dimensions in place. |
| trunc_ | Truncates the tensor's values toward zero to integers, in place. |
| zero_ | Sets all elements of the tensor to zero in place. |

Here is an example of applying the ReLU transformation to a tensor in place.

my_tensor = torch.arange(-1,1,0.1)
my_tensor.relu_()
my_tensor
tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
        0.0000, 0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000,
        0.8000, 0.9000])
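
Note the difference from the regular methods: add returns a new tensor and leaves the original untouched, while add_ mutates it.

t = torch.ones(3)
t.add(5)   # returns a new tensor; t is unchanged
t.add_(5)  # modifies t itself
t
tensor([6., 6., 6.])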

Aggregations#

Torch provides several aggregation methods, which reduce a tensor of numbers to a single value. The following table shows some of these methods.

| Method | Description |
|---|---|
| min() | Returns the minimum value of all elements in the tensor. |
| max() | Returns the maximum value of all elements in the tensor. |
| argmin() | Returns the index of the minimum value. |
| argmax() | Returns the index of the maximum value. |
| sum() | Returns the sum of all elements in the tensor. |
| mean() | Returns the mean of all elements in the tensor. |
| median() | Returns the median of the elements in the tensor. |
| prod() | Returns the product of all elements in the tensor. |
| std() | Returns the standard deviation of the tensor elements. |
| var() | Returns the variance of the tensor elements. |
| all() | Tests whether all elements evaluate to True. |
| any() | Tests whether any element evaluates to True. |


The following cell generates a Torch matrix where each row has a larger mean than the previous one. We’ll use this tensor to explore the principles of aggregation methods in Torch.

example_tensor = torch.cat(
    [torch.normal(i, 3, [1, 20]) for i in range(20)],
    dim=0
)

Applying the sum method to the array returns a zero-dimensional tensor representing the sum of all elements in the tensor.

example_tensor.sum()
tensor(3794.5530)

When the dim parameter is specified, the aggregation occurs along the chosen axis. The following cell sums each row.

example_tensor.sum(dim=1)
tensor([-16.4698,   9.2714,  31.5841,  59.7604,  78.6830, 102.7903, 139.8927,
        122.9686, 162.2701, 190.2476, 211.2355, 211.3985, 243.5579, 275.2428,
        262.9107, 298.6805, 310.6557, 327.2797, 371.0268, 401.5668])

As a result, we obtained an array whose elements roughly increase from one to the next; this aligns with how the data was generated.
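
The same dim convention applies to the other methods in the table. For example, max(dim=0) returns the per-column maxima together with the row indices where they occur (the exact values vary from run to run, since the data is random):

example_tensor.max(dim=0)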

Concatenation/splitting#

There is a bunch of functions that allow you to treat a set of tensors as one unified tensor or, vice versa, divide one big tensor into smaller pieces. Each of these functions behaves a bit differently. Find out more on the dedicated page.

| Function | Description |
|---|---|
| torch.stack | Concatenates tensors along a new dimension. |
| torch.cat | Concatenates tensors along an existing dimension. |
| torch.vstack | Stacks tensors vertically (along the first dimension). |
| torch.hstack | Stacks tensors horizontally (along the second dimension, or the first for 1D tensors). |
| torch.dstack | Stacks tensors along a third dimension (2D tensors are stacked along depth). |
| torch.chunk | Splits a tensor into a specified number of chunks along a given dimension. |
| torch.split | Splits a tensor into sub-tensors based on a size or list of sizes along a specified dimension. |
| torch.Tensor.repeat | Repeats a tensor along specified dimensions to increase its size. |


As an example, consider the tensor created in the following cell.

input = torch.randn([3, 3])
input
tensor([[ 0.7793, -0.2321, -1.1366],
        [ 0.4977, -0.3425,  1.0177],
        [ 0.1320,  0.8735, -1.2108]])

Using torch.chunk, the tensor can be transformed into a tuple of smaller tensors, where each one is a row of the original tensor.

chunks = torch.chunk(input, chunks=3)
chunks
(tensor([[ 0.7793, -0.2321, -1.1366]]),
 tensor([[ 0.4977, -0.3425,  1.0177]]),
 tensor([[ 0.1320,  0.8735, -1.2108]]))

Using torch.stack, we can join the smaller tensors back into one tensor, but now along a new axis.

torch.stack(chunks)
tensor([[[ 0.7793, -0.2321, -1.1366]],

        [[ 0.4977, -0.3425,  1.0177]],

        [[ 0.1320,  0.8735, -1.2108]]])
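
In contrast to torch.stack, torch.cat joins the chunks along the existing first axis, recovering the original shape:

torch.cat(chunks)
tensor([[ 0.7793, -0.2321, -1.1366],
        [ 0.4977, -0.3425,  1.0177],
        [ 0.1320,  0.8735, -1.2108]])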

Gather#

The torch.gather function selects values from the input tensor by indices and lets you specify the form of the output.

Some notation is required for a more precise description:

  • \(X\): the input tensor, of shape \((s_0, s_1, \dots, s_{n-1})\).

  • \(I\), the index tensor, and \(O\), the output tensor: both of shape \((s'_0, s'_1, \dots, s'_{n-1})\).

  • \(dim\): the dimension of the input into which the indices from \(I\) are substituted.

The operation can be written as:

\[O \left[i_0, i_1, \dots, i_{n-1} \right] = X\left[ i_0, i_1, \dots, i_{dim-1}, I\left[i_0, i_1, \dots, i_{n-1} \right], i_{dim+1}, \dots, i_{n-1} \right]\]

So the values from \(I\) are substituted into the \(dim\)-th index of \(X\), and this forms \(O\).


Consider an example:

\[X = \left( \begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array} \right), \quad I = \left( \begin{array}{cc} 0 & 1 \\ 1 & 2 \end{array} \right), \quad dim = 1\]
X = torch.Tensor([
    [1,2,3],
    [4,5,6]
])
I = torch.tensor([
    [0, 1],
    [1, 2]
])

torch.gather(X, 1, I)
tensor([[1., 2.],
        [5., 6.]])

Since \(dim=1\), every index of \(I\) is substituted as the second index of \(X\), producing \(O\) with the same shape as \(I\):

\[O = \left( \begin{array}{cc} X[0, I[0,0]] & X[0, I[0,1]] \\ X[1, I[1,0]] & X[1, I[1,1]] \end{array} \right) = \left( \begin{array}{cc} X[0, 0] & X[0, 1] \\ X[1, 1] & X[1, 2] \end{array} \right) = \left( \begin{array}{cc} 1 & 2 \\ 5 & 6 \end{array} \right)\]
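
For contrast, with \(dim=0\) the indices select rows instead of columns, i.e. \(O[i, j] = X[I[i, j], j]\):

torch.gather(X, 0, torch.tensor([[0, 1, 0]]))
tensor([[1., 5., 3.]])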