Learn about Tensors and how to use them in one of the most famous machine learning libraries, PyTorch
One of the most important libraries in the Deep Learning field (and, incidentally, the one ChatGPT was built on) is pytorch. Along with the TensorFlow framework, pytorch is one of the most famous neural network training frameworks available to software developers and data scientists. Apart from its usability and simple API, it excels in flexibility and memory usage, making it extremely fast at multi-dimensional calculations (one of the major components behind backpropagation, the important technique used to optimize a Neural Network's weights). These details make it one of the libraries most sought after by companies when it comes to building Deep Learning models.
In this blog post, we're going to check some basic operations using pytorch and understand how we can work with the tensor object! Tensors are mathematical representations of data that are commonly addressed by different names:
- 1-element Tensor: commonly called a scalar, it consists of a single mathematical value.
- 1-Dimensional Tensor: consisting of n elements, these are normally called 1-D vectors and store different mathematical elements along a single dimension.
- 2-Dimensional Tensor: commonly called matrices, these are able to store data in two dimensions. Think of a normal SQL table or an Excel spreadsheet.
- 3-Dimensional Tensors and beyond: data organized with this dimensionality is normally harder to visualize; these objects are generally called n-dimensional tensors (previewed in code below).
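As a quick preview of these four cases (a minimal sketch; we'll build up each one step by step below), the ndim property tells these objects apart:
import torch
torch.tensor(5).ndim           # 0 dimensions: a scalar
torch.tensor([1, 2]).ndim      # 1 dimension: a 1-D vector
torch.tensor([[1, 2]]).ndim    # 2 dimensions: a matrix
torch.ones(2, 2, 2).ndim       # 3 dimensions: a 3-D tensor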
With this small introduction to the mathematical concepts, let's explore how to use pytorch's tensors in Python!
The Tensor object
As we've described, the tensor object is a mathematical generalization of n-dimensional objects that can expand to virtually any dimension. Although in the context of Deep Learning tensors are generally multidimensional, we can also create single-element tensors (normally called scalars) using torch (although named pytorch, we use the name torch to manipulate the library in Python).
If tensors are the central object in torch (or pytorch), how can we create them in the library?
Super easy! Let’s create our first single-element tensor:
import torch
scalar = torch.tensor(5)
Our scalar
object contains a single number — 5. Let’s visualize our tensor below by calling it in the Python console:
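scalar
# tensor(5)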
Fact 1: torch.tensor is used to create tensor objects.
Of course, we are not tied to single-element tensors only; we can also create 1-dimensional objects with multiple elements. Let's pass a list inside the torch.tensor call and see how that goes:
vector = torch.tensor([7, 7])
vector
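# tensor([7, 7])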
Our object vector now contains two elements along a single dimension. Think of it as data containing a single row or a single column.
Having “dimensions” allows us to access interesting properties of our tensor, for example ndim:
vector.ndim
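# 1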
Fact 2: tensor.ndim is used to obtain the number of dimensions of a tensor object.
In our case, the vector object only has a single dimension. How can we know how many elements our tensor object has? By using another property, shape!
vector.shape
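# torch.Size([2])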
Fact 3: tensor.shape is used to obtain the shape of a tensor object.
Our tensor object contains two elements in a single dimension. We’ll see how this output compares to multidimensional objects.
torch tensors also have a data type attached to them. To know which one, we can use:
vector.dtype
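# torch.int64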
Fact 4: tensor.dtype outputs the data type of our tensor.
Our tensor contains data in int64 format.
Let’s now expand our object into a 2-D tensor:
matrix = torch.tensor([[10.0, 20.0],
[30.0, 40.0]])
matrix
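# tensor([[10., 20.],
#         [30., 40.]])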
Let's see some properties of our matrix object:
print(matrix.ndim)
print(matrix.shape)
print(matrix.dtype)
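# 2
# torch.Size([2, 2])
# torch.float32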
Our matrix object contains data in float32 dtype, in two dimensions with 2 elements each.
To finish our exploration of creating tensors, let's see how we can generate random tensors using torch.rand:
torch.rand(size=(4, 4))
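# returns a 4x4 tensor with values drawn uniformly from [0, 1)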
For example, in the code above, we are generating a 4 by 4 matrix using torch.rand. This is a very common operation in the context of deep learning (for example, generating random neural network layer weights to optimize later).
Tensor Operations
Let's now see how we can perform operations with our tensors. If you're already familiar with numpy, this should be pretty easy! Starting with a simple add operation:
tensor = torch.tensor([1, 2, 3])
tensor + 20
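# tensor([21, 22, 23])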
Adding a scalar to a tensor is easy — just use the normal mathematical operation! Can you guess how you can multiply a tensor by a scalar?
Easy!
tensor * 10
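# tensor([10, 20, 30])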
You can also use the abstraction torch.multiply:
torch.multiply(tensor, 10)
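# tensor([10, 20, 30]), the same result as the * operator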
Two of the most common operations with tensors are the Hadamard product and the dot product, with the latter being widely used in the Attention mechanism.
Let’s create two 2-D tensors to check these operations:
tensor_1 = torch.tensor([[1,2,3],[2,3,4]])
tensor_2 = torch.tensor([[1,2],[2,3],[3,4]])
To perform the Hadamard product, the tensor shapes must match. Let's compute the product of tensor_1 with itself:
# Hadamard product
tensor_1 * tensor_1
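# tensor([[ 1,  4,  9],
#         [ 4,  9, 16]])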
In the case of the dot product, the inner dimensions of the tensors must match. Let's multiply tensor_1 (a 2x3 tensor) by tensor_2 (a 3x2 tensor):
torch.matmul(tensor_1, tensor_2)
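# tensor([[14, 20],
#         [20, 29]])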
We can also use the elegant @ operator, which does just the same:
tensor_1 @ tensor_2
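# same result as torch.matmul above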
Tensor Indexing
For our final examples, let’s see how we can pluck certain elements from our tensors. For these examples, we’ll use:
indexing_example = torch.tensor([[10,20,30],[40,50,60],[70,80,90]])
indexing_example
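# tensor([[10, 20, 30],
#         [40, 50, 60],
#         [70, 80, 90]])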
Indexing in pytorch is similar to other Python objects. Let's try to index the first row:
indexing_example[0,:]
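# tensor([10, 20, 30])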
Using index 0 in the [] gives us the ability to extract the first row of the object. The : symbol enables us to extract all elements from a certain dimension. In our case, we want all elements from the columns (the 2nd dimension).
Can you guess how to extract the first column? Just switch the position of the indices!
indexing_example[:,0]
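# tensor([10, 40, 70])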
For more complex objects, we can use the same logic. Let's try to index an element from a 3D tensor:
indexing_example_3d = torch.tensor([[[10,20,30],[40,50,60],[70,80,90]], [[100,200,300],[400,500,600],[700,800,900]]])
indexing_example_3d
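# a tensor with shape torch.Size([2, 3, 3]): 2 matrices, each with 3 rows and 3 columns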
How can we extract the element “100” from this tensor? Let’s see, we want:
- First row
- First column
- Second matrix
Using indexing logic, we can do this easily:
indexing_example_3d[1,0,0]
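# tensor(100)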
In torch, the index ordering for 3D objects is the following: matrix, row, column.
Can you try to index a 4D object?
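As a hint, a 4-D object just adds one more leading index, so the ordering becomes group, matrix, row, column. Here is a minimal sketch (the indexing_example_4d tensor below is a made-up example built with torch.zeros):
indexing_example_4d = torch.zeros(size=(2, 2, 3, 3))  # hypothetical 4-D tensor: 2 groups of 2 matrices, each 3x3
indexing_example_4d[0, 1, 2, 2]  # first group, second matrix, third row, third column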
Bonus — Where is the tensor stored?
One of the advantages of using torch over other array libraries (such as numpy) is the ability to store our tensors on the gpu. This will be particularly useful if we need to speed up neural network calculations.
By default, your tensors are stored on the cpu (and most computers only have a cpu available), but you can send your tensors to the gpu by doing the following:
device = "cuda" if torch.cuda.is_available() else "cpu"
If torch.cuda.is_available() finds a CUDA-compatible NVIDIA gpu on your machine, device will be set to "cuda", letting you send your tensor to it.
Imagining you have a tensor stored in an object named tensor, you can use the .to method to send it to the device:
tensor_on_gpu = tensor.to(device)
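To confirm where a tensor lives, you can inspect its device attribute (a quick sanity check, assuming the snippet above ran):
tensor_on_gpu.device
# device(type='cuda', index=0) if a gpu was found, device(type='cpu') otherwise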
Conclusion
Thank you for taking the time to read this post! Working with tensors is extremely fun and definitely gives you a solid foundation for working with advanced neural networks.
The torch API is extremely elegant and easy to visualize. Later, you can use these tensors to train Neural Networks (something I'll show in the next blog posts of this series). Additionally, learning a bit of linear algebra as you go will be extremely helpful for learning other Data Science and Machine Learning algorithms.
The inspiration for this post came from an excellent free course on the PyTorch topic, which I definitely recommend. At DareData, we've been involved in a lot of Deep Learning projects, and I can't stress enough how important this course has been in training our people in this Machine Learning paradigm and all the frameworks associated with it.
In the next post, we'll take a look at training a Linear Regression model using torch. Stay tuned!
If you would like to know more about how we use torch in our projects, or would like to train your team, contact me at ivo@daredata.engineering. I look forward to speaking with you!