Represents a matrix multiplication operation node in a computational graph.

Public Member Functions
	MatMulNode (Node *input_left, Node *input_right)
	Constructor to initialize a `MatMulNode` for matrix multiplication.

void	forward () override
	Forward pass for the `MatMulNode` to perform matrix multiplication.

void	backward () override
	Backward pass for the `MatMulNode` to propagate gradients.

Public Member Functions inherited from nz::nodes::Node
virtual void	print (std::ostream &os) const
	Prints the type, data, and gradient of the node.

void	dataInject (Tensor::value_type *data, bool grad=false) const
	Injects data into a relevant tensor object, optionally setting its gradient requirement.

template<typename Iterator >
void	dataInject (Iterator begin, Iterator end, const bool grad=false) const
	Injects data from an iterator range into the output tensor of the InputNode, optionally setting its gradient requirement.

void	dataInject (const std::initializer_list< Tensor::value_type > &data, bool grad=false) const
	Injects data from a std::initializer_list into the output tensor of the Node, optionally setting its gradient requirement.

Detailed Description

Represents a matrix multiplication operation node in a computational graph.

The MatMulNode class performs matrix multiplication between two input tensors. It implements the matrix multiplication operation in the forward pass, and propagates the gradients during the backward pass. This node is typically used to represent fully connected layers or other linear algebraic operations in a neural network or computational graph. The node now leverages Tensor Cores for efficient half-precision matrix multiplication, improving performance during forward and backward passes.

Key features:

Forward Pass: The forward() method computes the matrix multiplication of two input tensors and stores the result in the output tensor. The computation is accelerated using Tensor Cores with half-precision (FP16) to speed up matrix multiplication operations.
Backward Pass: The backward() method propagates the gradients from the output tensor to the input tensors using the chain rule of calculus.
Shape Check: The constructor ensures that the number of columns in the left input tensor matches the number of rows in the right input tensor, as required for matrix multiplication.

This class is part of the nz::nodes namespace and is used for matrix operations in a computational graph.

Note

The left input tensor's number of columns must match the right input tensor's number of rows.
The matrix multiplication operation in this node uses Tensor Cores for faster computation using half-precision floating-point arithmetic (FP16).

Usage Example:

// Example 1: Creating and using a MatMulNode
InputNode input1({3, 2}, true);  // Create the first input node with shape {3, 2}
input1.output->fill(1.0f);  // Fill the tensor with value 1.0
 
InputNode input2({2, 3}, true);  // Create the second input node with shape {2, 3}
input2.output->fill(2.0f);  // Fill the tensor with value 2.0
 
MatMulNode matmul_node(&input1, &input2);  // Create a MatMulNode using the two input nodes
matmul_node.forward();  // Perform the forward pass: output = input1 * input2
matmul_node.backward();  // Perform the backward pass: propagate gradients
 
std::cout << "Output: " << *matmul_node.output << std::endl;  // Print the output tensor
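
As a cross-check on the example above, the expected values can be worked out on the CPU. The sketch below is a plain row-major reference product; `matmul_reference` is a hypothetical helper written for illustration, not part of NeuZephyr, and it does not use Tensor Cores or FP16.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a NeuZephyr API): row-major CPU reference for C = A * B.
// A is m x k, B is k x n, and the returned C is m x n.
std::vector<float> matmul_reference(const std::vector<float>& a,
                                    const std::vector<float>& b,
                                    std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            for (std::size_t j = 0; j < n; ++j) {
                // Accumulate the contribution of inner index p to C[i][j].
                c[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    return c;
}
```

With a {3, 2} input filled with 1.0 and a {2, 3} input filled with 2.0, every entry of the {3, 3} product is 1.0 * 2.0 + 1.0 * 2.0 = 4.0. These values are exactly representable in FP16, so a half-precision result should match.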

See also: forward() for the forward pass computation method; backward() for the backward pass gradient propagation method.

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/11/29

Definition at line 1060 of file Nodes.cuh.

Constructor & Destructor Documentation

◆ MatMulNode()

nz::nodes::calc::MatMulNode::MatMulNode	(	Node *	input_left,
		Node *	input_right )

Constructor to initialize a MatMulNode for matrix multiplication.

This constructor initializes a MatMulNode, which performs matrix multiplication between the outputs of two input nodes. It checks that the shapes of the two input tensors are compatible: the number of columns of the left input tensor must match the number of rows of the right input tensor. If the shapes do not match, an exception is thrown. The constructor also initializes the output tensor with the appropriate shape based on the input tensors and sets the requires_grad flag based on the input tensors' gradient tracking requirements.

Parameters

input_left	A pointer to the first input node. Its `output` tensor is used for the matrix multiplication.
input_right	A pointer to the second input node. Its `output` tensor is used for the matrix multiplication.

The constructor checks that the number of columns in the left input tensor (input_left->output->shape()[1]) matches the number of rows in the right input tensor (input_right->output->shape()[0]), as required for matrix multiplication. The output tensor is created with the shape (input_left->output->shape()[0], input_right->output->shape()[1]), and the requires_grad flag is set to true if either of the input tensors requires gradients.

Exceptions

std::invalid_argument If the shapes of the input tensors are not compatible for matrix multiplication.

Note

The left input tensor's column count must match the right input tensor's row count for matrix multiplication.
The constructor ensures that the output tensor has the correct shape to hold the result of the matrix multiplication.
The requires_grad flag for the output tensor is set based on the gradient requirements of the input tensors.
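
The checks described above can be sketched in isolation. This is a minimal illustration of the rule, not the constructor's actual code: `Shape2D`, `matmul_output_shape`, and `matmul_requires_grad` are hypothetical names, and a shape is assumed to be a {rows, cols} pair.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a 2-D tensor shape: {rows, cols}.
using Shape2D = std::array<std::size_t, 2>;

// Returns the output shape of left * right.
// Throws std::invalid_argument when left columns differ from right rows.
Shape2D matmul_output_shape(const Shape2D& left, const Shape2D& right) {
    if (left[1] != right[0]) {
        throw std::invalid_argument("matmul: left columns must equal right rows");
    }
    return {left[0], right[1]};
}

// The output tracks gradients if either input does.
bool matmul_requires_grad(bool left_requires_grad, bool right_requires_grad) {
    return left_requires_grad || right_requires_grad;
}
```

For example, shapes {3, 2} and {2, 3} yield an output shape of {3, 3}, while {3, 2} and {3, 2} are rejected because 2 != 3.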

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/11/29

Definition at line 148 of file Nodes.cu.

Member Function Documentation

◆ backward()

void nz::nodes::calc::MatMulNode::backward ( )

override virtual

Backward pass for the MatMulNode to propagate gradients.

The backward() method computes the gradients for the two input tensors of the matrix multiplication from the gradient of the output tensor. During the backward pass, the output tensor's gradient is propagated back to both input tensors, following the chain rule of calculus.

Specifically:

For the left input tensor (A), the gradient is computed as dA = dC * B^T, where dC is the gradient of the output tensor and B^T is the transpose of the right input tensor.
For the right input tensor (B), the gradient is computed as dB = A^T * dC, where A^T is the transpose of the left input tensor and dC is the gradient of the output tensor.

These gradients are computed on the GPU using CUDA kernels (GeneralMatrixMul), which parallelize the matrix operations.

Note

The gradients for both input tensors are computed only if they require gradients (i.e., requiresGrad() is true).
The gradients are computed using the transposes of the input matrices and propagated through the network.
The GeneralMatrixMul kernel is used for efficient gradient computation on the GPU.
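
The two formulas, dA = dC * B^T and dB = A^T * dC, can be checked with a small CPU sketch. `grad_left` and `grad_right` are hypothetical helpers for illustration only; they assume row-major storage and are not the GeneralMatrixMul kernel. The transposes are folded into the indexing instead of being materialized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dA = dC * B^T. dC is m x n, B is k x n, so dA is m x k (same shape as A).
std::vector<float> grad_left(const std::vector<float>& dc,
                             const std::vector<float>& b,
                             std::size_t m, std::size_t k, std::size_t n) {
    assert(dc.size() == m * n && b.size() == k * n);
    std::vector<float> da(m * k, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < n; ++j)
                da[i * k + p] += dc[i * n + j] * b[p * n + j];  // B^T[j][p] == B[p][j]
    return da;
}

// dB = A^T * dC. A is m x k, dC is m x n, so dB is k x n (same shape as B).
std::vector<float> grad_right(const std::vector<float>& a,
                              const std::vector<float>& dc,
                              std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && dc.size() == m * n);
    std::vector<float> db(k * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p)
            for (std::size_t j = 0; j < n; ++j)
                db[p * n + j] += a[i * k + p] * dc[i * n + j];  // A^T[p][i] == A[i][p]
    return db;
}
```

Note that each gradient has the shape of the tensor it belongs to: dA matches A (m x k) and dB matches B (k x n). With A of shape {3, 2} filled with 1.0, B of shape {2, 3} filled with 2.0, and dC filled with 1.0, every entry of dA is 3 * 2.0 = 6.0 and every entry of dB is 3 * 1.0 = 3.0.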

See also: forward() for the forward pass computation method.

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/11/29

Implements nz::nodes::Node.

Definition at line 169 of file Nodes.cu.

◆ forward()

void nz::nodes::calc::MatMulNode::forward ( )

override virtual

Forward pass for the MatMulNode to perform matrix multiplication.

The forward() method computes the matrix multiplication between the two input tensors using CUDA, and stores the result in the output tensor. The matrix multiplication is performed using the GeneralMatrixMul kernel on the GPU, which efficiently computes the product of the two matrices in parallel.

This method is called during the forward pass of the neural network. It calculates the matrix product of the left input tensor (inputs[0]) and the right input tensor (inputs[1]), and stores the result in the output tensor. The shape of the output tensor is determined by the number of rows in the left input tensor and the number of columns in the right input tensor.

Note

The kernel GeneralMatrixMul performs the matrix multiplication using parallel computation on the GPU.
The matrix multiplication is performed as C = A * B, where A is the left input tensor, B is the right input tensor, and C is the output tensor.
The block size (TILE_SIZE) and grid size are chosen to ensure efficient GPU parallelization of the operation.
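
To illustrate how a tile size relates to the grid size, the following host-side sketch emulates tiling with ordinary loops. It is not the GeneralMatrixMul kernel: `kTileSize`, `tiles_for`, and `matmul_tiled` are hypothetical names, and the value 16 is an assumption, since the library's actual TILE_SIZE is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed tile edge length; the library's TILE_SIZE value is not documented here.
constexpr std::size_t kTileSize = 16;

// Ceiling division: number of tiles needed to cover `extent` elements.
// A CUDA launch typically derives each grid dimension from the block size this way.
constexpr std::size_t tiles_for(std::size_t extent) {
    return (extent + kTileSize - 1) / kTileSize;
}

// Row-major C = A * B computed tile by tile. A is m x k, B is k x n.
std::vector<float> matmul_tiled(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t ti = 0; ti < tiles_for(m); ++ti) {
        for (std::size_t tj = 0; tj < tiles_for(n); ++tj) {
            for (std::size_t tp = 0; tp < tiles_for(k); ++tp) {
                // Clamp tile bounds so partial tiles at the edges stay in range.
                const std::size_t i_end = std::min(m, (ti + 1) * kTileSize);
                const std::size_t j_end = std::min(n, (tj + 1) * kTileSize);
                const std::size_t p_end = std::min(k, (tp + 1) * kTileSize);
                for (std::size_t i = ti * kTileSize; i < i_end; ++i)
                    for (std::size_t p = tp * kTileSize; p < p_end; ++p)
                        for (std::size_t j = tj * kTileSize; j < j_end; ++j)
                            c[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    return c;
}
```

Under the assumed tile size of 16, a 17 x 33 output needs 2 x 3 tiles, and the bounds clamping plays the role of the out-of-range guard that a GPU kernel needs for threads in partial tiles.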

See also: backward() for the backward pass gradient propagation method.

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/11/29

Implements nz::nodes::Node.

Definition at line 165 of file Nodes.cu.

The documentation for this class was generated from the following files:

D:/Users/Mgepahmge/Documents/C Program/NeuZephyr/include/NeuZephyr/Nodes.cuh
D:/Users/Mgepahmge/Documents/C Program/NeuZephyr/src/Nodes.cu
