Tensor#

Here is a description of the basic data type in PyTorch: torch.Tensor.

import torch
from math import prod

Create tensor#

Torch has tons of methods to create tensors. This page lists the ones I know of so far.


The most straightforward way is the torch.Tensor constructor. Called with a shape, it allocates an uninitialized tensor, so the values are whatever happened to be in memory.

torch.Tensor(3,2,5)
tensor([[[ 0.0000e+00,  0.0000e+00,  1.4013e-45,  0.0000e+00,  1.4013e-45],
         [ 0.0000e+00,  9.1084e-44,  0.0000e+00, -3.7852e+06,  3.3707e-41]],

        [[-7.6466e+07,  3.3707e-41,  4.4842e-44,  0.0000e+00,  4.4842e-44],
         [ 0.0000e+00,  6.3884e-27,  3.3703e-41,  0.0000e+00,  1.4013e-45]],

        [[ 1.3004e-42,  0.0000e+00,  1.1210e-43,  0.0000e+00,  6.4326e-27],
         [ 3.3703e-41,  4.2427e-08,  1.2964e+16,  2.1707e-18,  7.0952e+22]]])
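
In contrast, torch.tensor (lowercase) builds a tensor from existing data rather than allocating one by shape:

torch.tensor([[1., 2.], [3., 4.]])
tensor([[1., 2.],
        [3., 4.]])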

Dimensionality#

One of the most important properties of a tensor is its dimensionality. Torch provides a set of tools to manage tensor dimensionality; find out more on the dedicated page.

This section overviews the tools for working with the dimensionality of tensors. They are listed in the following table:

| Function/Method | Description |
|---|---|
| shape | Returns the shape (dimensions) of a tensor as a torch.Size (a tuple subclass). |
| reshape | Changes the shape of a tensor while preserving its data. The number of elements must remain the same. |
| transpose | Swaps two dimensions of a tensor. Useful for switching axes, e.g. turning rows into columns. |
| squeeze | Removes dimensions of size 1 from a tensor. |
| unsqueeze | Adds a dimension of size 1 at the specified position, effectively increasing the tensor's rank. |
| pad | Adds padding to a tensor along specified dimensions. Useful in tasks like image processing. |


The following cell shows the usage of torch.Tensor.shape to print the dimensionality of a tensor and torch.Tensor.reshape to change it.

test_tensor = torch.zeros(2, 5, 3)
print('torch.Tensor.shape', test_tensor.shape)
test_tensor.reshape(6, 5)
torch.Tensor.shape torch.Size([2, 5, 3])
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
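
For completeness, here is a quick sketch of the remaining tools from the table; note that pad lives in torch.nn.functional rather than on the tensor itself.

import torch.nn.functional as F

t = torch.zeros(2, 1, 3)
print(t.transpose(0, 2).shape)  # swap the first and last dimensions
print(t.squeeze(1).shape)       # remove the size-1 dimension
print(t.unsqueeze(0).shape)     # add a new leading dimension
print(F.pad(t, (1, 1)).shape)   # pad the last dimension by 1 on each side
torch.Size([3, 1, 2])
torch.Size([2, 3])
torch.Size([1, 2, 1, 3])
torch.Size([2, 1, 5])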

Indexing#

Indexing in torch supports all the classic concepts, just like in numpy or pandas, but there are some features specific to torch. Find out more on the dedicated page.


The following example shows the most basic indexing methods in Torch. Along the first dimension it takes all available elements, along the second it takes the slice 0:5:2, and along the last dimension it takes the elements given in a list.

dimensionality = (2, 6, 6)
experimental = torch.arange(prod(dimensionality)).reshape(dimensionality)

print("Original tensor")
print(experimental)
print("Sliced tensor")
print(experimental[:, 0:5:2, [5,1,4]])
Original tensor
tensor([[[ 0,  1,  2,  3,  4,  5],
         [ 6,  7,  8,  9, 10, 11],
         [12, 13, 14, 15, 16, 17],
         [18, 19, 20, 21, 22, 23],
         [24, 25, 26, 27, 28, 29],
         [30, 31, 32, 33, 34, 35]],

        [[36, 37, 38, 39, 40, 41],
         [42, 43, 44, 45, 46, 47],
         [48, 49, 50, 51, 52, 53],
         [54, 55, 56, 57, 58, 59],
         [60, 61, 62, 63, 64, 65],
         [66, 67, 68, 69, 70, 71]]])
Sliced tensor
tensor([[[ 5,  1,  4],
         [17, 13, 16],
         [29, 25, 28]],

        [[41, 37, 40],
         [53, 49, 52],
         [65, 61, 64]]])
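
Boolean masks work as well, just as in numpy: indexing with a mask returns a flattened tensor of the selected elements.

experimental[experimental > 60]
tensor([61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71])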

Element-wise operations#

There is a class of operations in PyTorch that are applied element by element; we'll call them element-wise operations.

The following table lists the most common ones together with their corresponding operators.

| Operation | Function | Operator |
|---|---|---|
| Addition | torch.add(tensor1, tensor2) | + |
| Subtraction | torch.sub(tensor1, tensor2) | - |
| Multiplication | torch.mul(tensor1, tensor2) | * |
| Division | torch.div(tensor1, tensor2) | / |
| Equality | torch.eq(tensor1, tensor2) | == |
| Inequality | torch.ne(tensor1, tensor2) | != |
| Greater Than | torch.gt(tensor1, tensor2) | > |
| Less Than | torch.lt(tensor1, tensor2) | < |
| Greater or Equal | torch.ge(tensor1, tensor2) | >= |
| Less or Equal | torch.le(tensor1, tensor2) | <= |
| Logical AND | torch.logical_and(tensor1, tensor2) | & |
| Logical OR | torch.logical_or(tensor1, tensor2) | \| |
| Logical XOR | torch.logical_xor(tensor1, tensor2) | ^ |
| Logical NOT | torch.logical_not(tensor) | ~ |
| Exponentiation | torch.pow(tensor1, tensor2) | ** |
| Square Root | torch.sqrt(tensor) | N/A |


As a brief review, consider two matrices \(A = [a_{ij}]_{n \times m}\) and \(B = [b_{ij}]_{n \times m}\).

A = torch.randint(-5,5, [4,5])
A
tensor([[-5, -2,  4,  0, -4],
        [-1, -4,  1, -5, -5],
        [-4,  1,  4, -5, -5],
        [-2,  2,  0,  1,  0]])
B = torch.randint(-5,5, [4,5])
B
tensor([[-1, -3,  1, -4, -4],
        [-4,  4, -4, -4, -3],
        [ 3, -2,  0, -1,  0],
        [ 4, -2,  0,  2,  1]])

By applying the + operator to the matrices we get the matrix \(\left[a_{ij} + b_{ij}\right]_{n \times m}\), i.e. the operation was applied element by element.

A + B
tensor([[-6, -5,  5, -4, -8],
        [-5,  0, -3, -9, -8],
        [-1, -1,  4, -6, -5],
        [ 2,  0,  0,  3,  1]])
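
Comparison operators are element-wise too: applying > to the same matrices yields the boolean matrix \(\left[a_{ij} > b_{ij}\right]_{n \times m}\).

A > B
tensor([[False,  True,  True,  True, False],
        [ True, False,  True, False, False],
        [False,  True,  True, False, False],
        [False,  True, False, False, False]])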

Broadcasting#

Element-wise operations in PyTorch support the incredibly convenient concept of broadcasting, allowing you to apply these operations to tensors of different shapes.


As an example, consider two tensors with different dimensionality, zero_tensor and arange_tensor.

zero_tensor = torch.zeros(3,4)
zero_tensor
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
arange_tensor = torch.arange(4)
arange_tensor
tensor([0, 1, 2, 3])

By applying the + operator to them, we get a result where arange_tensor was added element-wise to each row of zero_tensor.

zero_tensor + arange_tensor
tensor([[0., 1., 2., 3.],
        [0., 1., 2., 3.],
        [0., 1., 2., 3.]])
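
Broadcasting aligns trailing dimensions, so, for example, a column of shape (3, 1) and a row of shape (4,) combine into a (3, 4) matrix:

torch.arange(3).reshape(3, 1) + torch.arange(4)
tensor([[0, 1, 2, 3],
        [1, 2, 3, 4],
        [2, 3, 4, 5]])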

Algebraic operations#

The following table lists algebraic operations on torch.Tensor.

| Operation | Function | Description |
|---|---|---|
| Matrix Multiplication | torch.matmul() | Matrix multiplication |
| | tensor1 @ tensor2 | Matrix multiplication using the @ operator |
| Singular Value Decomposition | torch.svd() | Singular value decomposition (deprecated) |
| | torch.linalg.svd() | SVD with advanced options |
| Eigenvalues and Eigenvectors | torch.linalg.eig() | Compute eigenvalues and eigenvectors |
| | torch.eig() | Legacy variant (removed in recent PyTorch releases) |
| Matrix Inversion | torch.linalg.inv() | Matrix inversion |
| | torch.inverse() | Matrix inversion (alias of torch.linalg.inv) |
| Matrix Norms | torch.norm() | Compute the norm of a tensor (deprecated) |
| | torch.linalg.norm() | Norm with advanced options |
| Determinants | torch.det() | Compute the determinant of a matrix |
| | torch.linalg.det() | Determinant with advanced options |
| Matrix Trace | torch.trace() | Compute the trace of a matrix |
| Eigenvalues | torch.linalg.eigvals() | Compute the eigenvalues of a square matrix |
| Matrix Rank | torch.linalg.matrix_rank() | Compute the rank of a matrix |
| Cholesky Decomposition | torch.linalg.cholesky() | Cholesky decomposition |
| QR Decomposition | torch.linalg.qr() | QR decomposition |
| Solving Linear Systems | torch.linalg.solve() | Solve a system of linear equations |
| | torch.linalg.lstsq() | Solve a least-squares problem |
| Kronecker Product | torch.kron() | Compute the Kronecker product |
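
As a brief sketch of a couple of rows from the table, the following cells multiply a matrix by a vector and then solve the corresponding linear system \(Mx = v\).

M = torch.tensor([[2., 1.], [0., 3.]])
v = torch.tensor([[1.], [2.]])
M @ v
tensor([[4.],
        [6.]])
torch.linalg.solve(M, v)
tensor([[0.1667],
        [0.6667]])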

Data type#

Torch has its own system of data types. Here is a table that describes the available dtypes.

| Type | Description |
|---|---|
| torch.float16 / torch.half | 16-bit half-precision floating point |
| torch.float32 / torch.float | 32-bit single-precision floating point |
| torch.float64 / torch.double | 64-bit double-precision floating point |
| torch.int8 | 8-bit signed integer |
| torch.int16 / torch.short | 16-bit signed integer |
| torch.int32 / torch.int | 32-bit signed integer |
| torch.int64 / torch.long | 64-bit signed integer |
| torch.uint8 | 8-bit unsigned integer |
| torch.bool | Boolean type |
| torch.complex64 | 64-bit complex number (32-bit real and imaginary parts) |
| torch.complex128 | 128-bit complex number (64-bit real and imaginary parts) |


You can get the type of a tensor through its dtype attribute.

torch.Tensor(3, 3).dtype
torch.float32

Note that many functions in torch have a dtype parameter. By passing one of these torch dtype objects as the argument, we get a tensor of that specific dtype.

torch.tensor([1,2,3], dtype=torch.float16)
tensor([1., 2., 3.], dtype=torch.float16)
torch.zeros((3,3), dtype=torch.float16)
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float16)
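
An existing tensor can also be converted to another dtype with the to method (or shortcuts such as float() and long()):

torch.tensor([1, 2, 3]).to(torch.float64)
tensor([1., 2., 3.], dtype=torch.float64)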

In-place methods#

Some methods of torch.Tensor change the values of the tensor in place instead of returning a new tensor. Such methods typically have an underscore (_) at the end of their names.

| Method | Description |
|---|---|
| add_ | Adds the input tensor to the current tensor in place. |
| addcmul_ | Performs an element-wise multiplication of two tensors and adds the result to the current tensor in place. |
| addcdiv_ | Performs an element-wise division of two tensors and adds the result to the current tensor in place. |
| bernoulli_ | Fills the tensor with random draws from the Bernoulli distribution in place. |
| clamp_ | Clamps all elements of the tensor to the specified range in place. |
| copy_ | Copies data from another tensor into the current tensor in place. |
| div_ | Divides the current tensor by the input tensor in place. |
| fill_ | Fills the tensor with the specified value in place. |
| index_add_ | Adds values to the tensor at specified indices in place. |
| index_fill_ | Fills the tensor at specified indices with the given value in place. |
| index_copy_ | Copies values from another tensor into the current tensor at specified indices, in place. |
| masked_fill_ | Fills elements of the tensor where the mask is True with the specified value, in place. |
| masked_scatter_ | Copies values into the tensor where the mask is True, in place. |
| neg_ | Negates the tensor's values in place. |
| normal_ | Fills the tensor with random numbers from a normal distribution in place. |
| relu_ | Applies the ReLU activation function in place. |
| renorm_ | Renormalizes the tensor along a specified dimension in place. |
| scatter_ | Writes values into the tensor at indices specified by an index tensor, in place. |
| set_ | Sets the tensor's storage, size, and strides from another tensor, in place. |
| sigmoid_ | Applies the sigmoid function in place. |
| sub_ | Subtracts the input tensor from the current tensor in place. |
| t_ | Transposes the tensor in place (2D tensors only). |
| transpose_ | Transposes the tensor along specified dimensions in place. |
| trunc_ | Truncates the tensor's values toward zero to integers, in place. |
| zero_ | Sets all elements of the tensor to zero in place. |

Here is an example of applying the ReLU transformation to a tensor in place.

my_tensor = torch.arange(-1,1,0.1)
my_tensor.relu_()
my_tensor
tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
        0.0000, 0.0000, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000,
        0.8000, 0.9000])
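
Note the difference from the regular methods: add returns a new tensor and leaves the original untouched, while add_ mutates it.

t = torch.ones(3)
t.add(5)   # returns a new tensor; t is unchanged
t.add_(5)  # modifies t itself
t
tensor([6., 6., 6.])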

Aggregations#

Torch provides several aggregation methods, which reduce a tensor of numbers to a single value. The following table shows some of these methods.

| Method | Description |
|---|---|
| min() | Returns the minimum value of all elements in the tensor. |
| max() | Returns the maximum value of all elements in the tensor. |
| argmin() | Returns the index of the minimum value. |
| argmax() | Returns the index of the maximum value. |
| sum() | Returns the sum of all elements in the tensor. |
| mean() | Returns the mean of all elements in the tensor. |
| median() | Returns the median of the elements in the tensor. |
| prod() | Returns the product of all elements in the tensor. |
| std() | Returns the standard deviation of the tensor elements. |
| var() | Returns the variance of the tensor elements. |
| all() | Tests whether all elements evaluate to True. |
| any() | Tests whether any element evaluates to True. |


The following cell generates a Torch matrix where each row has a larger mean than the previous one. We’ll use this tensor to explore the principles of aggregation methods in Torch.

example_tensor = torch.cat(
    [torch.normal(i, 3, [1, 20]) for i in range(20)],
    dim=0
)

Applying the sum method to the array returns a zero-dimensional tensor representing the sum of all elements in the tensor.

example_tensor.sum()
tensor(3794.5530)

When the dim parameter is specified, the aggregation occurs along the chosen axis. The following cell sums each row.

example_tensor.sum(dim=1)
tensor([-16.4698,   9.2714,  31.5841,  59.7604,  78.6830, 102.7903, 139.8927,
        122.9686, 162.2701, 190.2476, 211.2355, 211.3985, 243.5579, 275.2428,
        262.9107, 298.6805, 310.6557, 327.2797, 371.0268, 401.5668])

As a result, we obtained an array whose elements roughly increase from one to the next; this aligns with how the data was generated.
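
The same dim convention applies to the other methods in the table. For example, max(dim=0) returns the per-column maxima together with the row indices where they occur (the exact values vary from run to run, since the data is random):

example_tensor.max(dim=0)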

Concatenation/splitting#

There is a bunch of functions that allow you to treat a set of tensors as one unified tensor or, vice versa, divide one big tensor into smaller pieces. Each of these functions behaves a bit differently. Find out more on the dedicated page.

| Function | Description |
|---|---|
| torch.stack | Concatenates tensors along a new dimension. |
| torch.cat | Concatenates tensors along an existing dimension. |
| torch.vstack | Stacks tensors vertically (along the first dimension). |
| torch.hstack | Stacks tensors horizontally (along the second dimension, or the first for 1D tensors). |
| torch.dstack | Stacks tensors along a third dimension (2D tensors are stacked along depth). |
| torch.chunk | Splits a tensor into a specified number of chunks along a given dimension. |
| torch.split | Splits a tensor into sub-tensors based on a size or list of sizes along a specified dimension. |
| torch.Tensor.repeat | Repeats a tensor along specified dimensions to increase its size. |


As an example, consider the tensor created in the following cell.

input = torch.randn([3, 3])
input
tensor([[ 0.7793, -0.2321, -1.1366],
        [ 0.4977, -0.3425,  1.0177],
        [ 0.1320,  0.8735, -1.2108]])

Using torch.chunk, the tensor can be transformed into a tuple of smaller tensors, where each one is a row of the original tensor.

chunks = torch.chunk(input, chunks=3)
chunks
(tensor([[ 0.7793, -0.2321, -1.1366]]),
 tensor([[ 0.4977, -0.3425,  1.0177]]),
 tensor([[ 0.1320,  0.8735, -1.2108]]))

Using torch.stack, we can join the smaller tensors back into one tensor, but now along a new axis.

torch.stack(chunks)
tensor([[[ 0.7793, -0.2321, -1.1366]],

        [[ 0.4977, -0.3425,  1.0177]],

        [[ 0.1320,  0.8735, -1.2108]]])
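
In contrast to torch.stack, torch.cat joins the chunks along the existing first axis, recovering the original shape:

torch.cat(chunks)
tensor([[ 0.7793, -0.2321, -1.1366],
        [ 0.4977, -0.3425,  1.0177],
        [ 0.1320,  0.8735, -1.2108]])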

Gather#

The torch.gather function selects values from the input tensor by indices and lets you specify the form of the output.

Some notation is required for a more precise description:

  • \(X\): the input tensor, of shape \((s_0, s_1, \dots, s_{n-1})\).

  • \(I\), the index tensor, and \(O\), the output tensor: both of shape \((s'_0, s'_1, \dots, s'_{n-1})\).

  • \(dim\): the dimension of the input into which the indices from \(I\) are substituted.

The operation can be written as:

\[O \left[i_0, i_1, \dots, i_{n-1} \right] = X\left[ i_0, i_1, \dots, i_{dim-1}, I\left[i_0, i_1, \dots, i_{n-1} \right], i_{dim+1}, \dots, i_{n-1} \right]\]

So the values from \(I\) are substituted into the \(dim\)-th index of \(X\), and this forms \(O\).


Consider an example:

\[X = \left( \begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array} \right), \quad I = \left( \begin{array}{cc} 0 & 1 \\ 1 & 2 \end{array} \right), \quad dim = 1\]
X = torch.Tensor([
    [1,2,3],
    [4,5,6]
])
I = torch.tensor([
    [0, 1],
    [1, 2]
])

torch.gather(X, 1, I)
tensor([[1., 2.],
        [5., 6.]])

Since \(dim=1\), every index of \(I\) is substituted as the second index of \(X\), producing \(O\) with the same shape as \(I\):

\[O = \left( \begin{array}{cc} X[0, I[0,0]] & X[0, I[0,1]] \\ X[1, I[1,0]] & X[1, I[1,1]] \end{array} \right) = \left( \begin{array}{cc} X[0, 0] & X[0, 1] \\ X[1, 1] & X[1, 2] \end{array} \right) = \left( \begin{array}{cc} 1 & 2 \\ 5 & 6 \end{array} \right)\]
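
For contrast, with \(dim=0\) the indices select rows instead of columns, i.e. \(O[i, j] = X[I[i, j], j]\):

torch.gather(X, 0, torch.tensor([[0, 1, 0]]))
tensor([[1., 5., 3.]])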