NeuZephyr
Simple DL Framework
Represents a matrix multiplication operation node in a computational graph. More...


Public Member Functions

  MatMulNode(Node *input_left, Node *input_right)
      Constructor to initialize a MatMulNode for matrix multiplication.
  void forward() override
      Forward pass for the MatMulNode to perform matrix multiplication.
  void backward() override
      Backward pass for the MatMulNode to propagate gradients.

Public Member Functions inherited from nz::nodes::Node

  virtual void print(std::ostream &os) const
      Prints the type, data, and gradient of the node.
  void dataInject(Tensor::value_type *data, bool grad = false) const
      Injects data into the relevant tensor object, optionally setting its gradient requirement.
  template<typename Iterator>
  void dataInject(Iterator begin, Iterator end, const bool grad = false) const
      Injects data from an iterator range into the output tensor of the node, optionally setting its gradient requirement.
  void dataInject(const std::initializer_list<Tensor::value_type> &data, bool grad = false) const
      Injects data from a std::initializer_list into the output tensor of the node, optionally setting its gradient requirement.
Represents a matrix multiplication operation node in a computational graph.
The MatMulNode class performs matrix multiplication between two input tensors. It implements the matrix multiplication operation in the forward pass, and propagates the gradients during the backward pass. This node is typically used to represent fully connected layers or other linear algebraic operations in a neural network or computational graph. The node now leverages Tensor Cores for efficient half-precision matrix multiplication, improving performance during forward and backward passes.
Key features:

- The forward() method computes the matrix multiplication of the two input tensors and stores the result in the output tensor. The computation is accelerated using Tensor Cores with half-precision (FP16) to speed up matrix multiplication.
- The backward() method propagates the gradients from the output tensor to the input tensors using the chain rule of calculus.
- This class is part of the nz::nodes namespace and is used for matrix operations in a computational graph.
Constructor to initialize a MatMulNode for matrix multiplication.
This constructor initializes a MatMulNode that performs matrix multiplication between the outputs of two input nodes. It verifies that the shapes of the two input tensors are compatible for matrix multiplication: the number of columns of the left input tensor must match the number of rows of the right input tensor. If the shapes do not match, an exception is thrown. The constructor also initializes the output tensor with the appropriate shape and sets the requires_grad flag based on the input tensors' gradient-tracking requirements.

Parameters
  input_left   A pointer to the first input node. Its output tensor is the left operand of the matrix multiplication.
  input_right  A pointer to the second input node. Its output tensor is the right operand of the matrix multiplication.

The constructor checks that the number of columns in the left input tensor (input_left->output->shape()[1]) matches the number of rows in the right input tensor (input_right->output->shape()[0]), as required for matrix multiplication. The output tensor is created with shape (input_left->output->shape()[0], input_right->output->shape()[1]), and requires_grad is set to true if either input tensor requires gradients.

Exceptions
  std::invalid_argument  If the shapes of the input tensors are not compatible for matrix multiplication.

Note
  The requires_grad flag for the output tensor is set based on the gradient requirements of the input tensors.
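The shape check described above can be sketched as a small standalone helper. This is an illustrative reconstruction, not the actual NeuZephyr code; the function name and the {rows, cols} shape representation are assumptions.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Sketch of the compatibility check MatMulNode's constructor performs.
// A shape is modeled as {rows, cols}; (M x K) * (K x N) -> (M x N).
std::array<std::size_t, 2> matmulOutputShape(const std::array<std::size_t, 2>& left,
                                             const std::array<std::size_t, 2>& right) {
    if (left[1] != right[0]) {
        // Mirrors the std::invalid_argument the constructor throws on mismatch.
        throw std::invalid_argument("MatMulNode: left columns must equal right rows");
    }
    return {left[0], right[1]};
}
```

For example, multiplying a (2 x 3) tensor by a (3 x 4) tensor yields a (2 x 4) output, while a (2 x 3) by (4 x 4) pair throws.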
Backward pass for the MatMulNode to propagate gradients.
The backward() method computes the gradients of the input tensors with respect to the output tensor for the matrix multiplication operation. During the backward pass, the gradients of the output tensor are propagated back to the two input tensors. The gradient computation follows the chain rule of calculus.
Specifically:

- For the left input tensor (A), the gradient is computed as dA = dC * B^T, where dC is the gradient of the output tensor and B^T is the transpose of the right input tensor.
- For the right input tensor (B), the gradient is computed as dB = A^T * dC, where A^T is the transpose of the left input tensor and dC is the gradient of the output tensor.

These gradients are computed on the GPU using the GeneralMatrixMul CUDA kernel, which parallelizes the matrix operations.

Note
  Gradients are propagated to an input tensor only if it requires them (i.e., requiresGrad() is true). The GeneralMatrixMul kernel is used for efficient gradient computation on the GPU.

Implements nz::nodes::Node.
Definition at line 169 of file Nodes.cu.
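The two gradient formulas (dA = dC * B^T and dB = A^T * dC) can be verified with a small CPU reference sketch. The function name and the flat row-major layout are assumptions for illustration; the real implementation dispatches the GeneralMatrixMul CUDA kernel on the GPU.

```cpp
#include <vector>

// CPU reference for the gradients MatMulNode::backward() computes.
// C = A * B with A (M x K), B (K x N), and upstream gradient dC (M x N).
// All matrices are flat, row-major vectors.
void matmulBackward(const std::vector<float>& A, const std::vector<float>& B,
                    const std::vector<float>& dC,
                    std::vector<float>& dA, std::vector<float>& dB,
                    int M, int K, int N) {
    dA.assign(M * K, 0.0f);  // dA = dC * B^T, shape (M x K)
    dB.assign(K * N, 0.0f);  // dB = A^T * dC, shape (K x N)
    for (int m = 0; m < M; ++m)
        for (int k = 0; k < K; ++k)
            for (int n = 0; n < N; ++n) {
                dA[m * K + k] += dC[m * N + n] * B[k * N + n];  // dC * B^T
                dB[k * N + n] += A[m * K + k] * dC[m * N + n];  // A^T * dC
            }
}
```

Note the shapes work out by construction: dA matches A and dB matches B, which is exactly what the chain rule requires for accumulation into the input tensors' gradients.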

Forward pass for the MatMulNode to perform matrix multiplication.
The forward() method computes the matrix multiplication between the two input tensors using CUDA, and stores the result in the output tensor. The matrix multiplication is performed using the GeneralMatrixMul kernel on the GPU, which efficiently computes the product of the two matrices in parallel.
This method is called during the forward pass of the neural network. It calculates the matrix product of the left input tensor (inputs[0]) and the right input tensor (inputs[1]), and stores the result in the output tensor. The shape of the output tensor is determined by the number of rows in the left input tensor and the number of columns in the right input tensor.
Note
  - The GeneralMatrixMul kernel performs the matrix multiplication using parallel computation on the GPU.
  - The operation computes M = A * B, where A is the left input tensor and B is the right input tensor.
  - The block size (TILE_SIZE) and grid size are chosen to ensure efficient GPU parallelization of the operation.

Implements nz::nodes::Node.
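As a reference for what the GeneralMatrixMul kernel computes, here is a minimal CPU sketch of the same product M = A * B. The function name and flat row-major layout are assumptions; the actual kernel tiles the computation across GPU thread blocks (TILE_SIZE) and may use FP16 Tensor Cores.

```cpp
#include <vector>

// CPU reference for MatMulNode::forward(): C (M x N) = A (M x K) * B (K x N).
// Flat, row-major storage throughout.
std::vector<float> matmulForward(const std::vector<float>& A,
                                 const std::vector<float>& B,
                                 int M, int K, int N) {
    std::vector<float> C(M * N, 0.0f);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k)
                acc += A[m * K + k] * B[k * N + n];  // dot product of row m and column n
            C[m * N + n] = acc;
        }
    return C;
}
```

The GPU version produces the same values but assigns each output tile to a thread block, loading TILE_SIZE x TILE_SIZE sub-matrices into shared memory to reduce global-memory traffic.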