Expands tensors with batch size 1 to arbitrary batch dimensions through data replication.

Public Member Functions
	ExpandNode (Node *input, Tensor::size_type newBatch)
	Constructs an ExpandNode object.

void	forward () override
	Performs the forward propagation for the ExpandNode.

void	backward () override
	Performs the backward propagation for the ExpandNode.

Public Member Functions inherited from nz::nodes::Node
virtual void	print (std::ostream &os) const
	Prints the type, data, and gradient of the node.

void	dataInject (Tensor::value_type *data, bool grad=false) const
	Injects data into a relevant tensor object, optionally setting its gradient requirement.

template<typename Iterator >
void	dataInject (Iterator begin, Iterator end, const bool grad=false) const
	Injects data from an iterator range into the output tensor of the InputNode, optionally setting its gradient requirement.

void	dataInject (const std::initializer_list< Tensor::value_type > &data, bool grad=false) const
	Injects data from a std::initializer_list into the output tensor of the Node, optionally setting its gradient requirement.

Detailed Description

Expands tensors with batch size 1 to arbitrary batch dimensions through data replication.

The ExpandNode class enables batch dimension expansion by replicating single-instance input data across the batch dimension. This operation is particularly useful for converting single-sample processing networks into batch-processing configurations without modifying core network architecture.

Core operational characteristics:

Batch Dimension Expansion: Duplicates input data along batch dimension to create specified batch size.
Memory Efficiency: Allocates the output tensor once in the constructor; forward() and backward() reuse that storage without further allocation.
Gradient Aggregation: Implements gradient summation across batch dimension during backward pass.
Device Consistency: Maintains computation device context (CPU/GPU) during expansion operations.
Input Validation: Enforces pre-condition of batch_size=1 for input tensors.

Implementation mechanics:

Forward Propagation: Launches the Expand kernel, which replicates the single input sample into every batch slot of the output tensor.
Backward Propagation: Launches the Compress kernel, which sums the output gradients across the expanded batch dimension into the input gradient.
CUDA Execution: Both kernels run on the GPU with a grid/block configuration sized to the number of elements in the output tensor.
Shape Preservation: Maintains non-batch dimensions identical to input tensor.
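
As a rough illustration of what batch replication produces, the following host-side sketch copies one flattened sample into every batch slot. It is not the library's Expand kernel: the name expandBatch is hypothetical, and a flat row-major std::vector<float> stands in for the tensor storage. Each output element at flat index idx reads from input index idx % sampleSize, which is the kind of mapping a one-thread-per-output-element kernel would plausibly use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for batch expansion (not the NeuZephyr kernel).
// `sample` holds one input sample (batch size 1), flattened in row-major order.
std::vector<float> expandBatch(const std::vector<float>& sample, std::size_t newBatch) {
    const std::size_t sampleSize = sample.size();
    std::vector<float> out(sampleSize * newBatch);
    // One loop iteration per output element, mirroring one GPU thread per element.
    for (std::size_t idx = 0; idx < out.size(); ++idx) {
        out[idx] = sample[idx % sampleSize];
    }
    return out;
}
```

Calling expandBatch on a three-element sample with newBatch = 2 yields six elements: the sample followed by a second copy of itself. Non-batch dimensions are untouched because only the leading dimension grows.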

Common application scenarios:

Converting single-sample inference networks to batch processing mode.
Data augmentation through batch-wise replication of prototype samples.
Model ensemble techniques requiring multiple copies of base input.

Critical operational constraints:

Input Batch Requirement: Input tensor must have batch_size=1 (first dimension size=1).
Memory Considerations: The output tensor is allocated at the full expanded size (newBatch times the input size) in the constructor.
Gradient Scaling: The input gradient is the sum of the output gradients over the expanded batch dimension, so its magnitude grows with the batch size.
Device Compatibility: Input tensor and expansion operation must reside on same computation device.

Warning

Input validation fails if the input batch size != 1; the constructor throws std::invalid_argument.
Physical memory consumption increases proportionally with newBatch in the backward pass.

Note

Data replication is carried out by the Expand kernel when forward() is called, not at construction time.
For physical data duplication, combine with CopyNode before ExpandNode.
Gradient computation maintains mathematical equivalence to manual batch duplication.
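
The equivalence follows from the chain rule: if every output slot y_b (b = 0 .. B-1) is a copy of the input x, then dL/dx is the sum over b of dL/dy_b. The sketch below shows that reduction on the host. It is an illustration under assumptions, not the library's Compress kernel: sumOverBatch is a hypothetical name, tensors are flat row-major vectors, and the result is returned in a fresh zero-initialized vector rather than accumulated into an existing gradient buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for the backward reduction (not the Compress kernel).
// `outGrad` holds newBatch * sampleSize gradient values in row-major order.
// Assumes outGrad.size() is a multiple of sampleSize.
std::vector<float> sumOverBatch(const std::vector<float>& outGrad, std::size_t sampleSize) {
    std::vector<float> inGrad(sampleSize, 0.0f);
    for (std::size_t idx = 0; idx < outGrad.size(); ++idx) {
        // Every batch copy of element (idx % sampleSize) contributes to the same input gradient.
        inGrad[idx % sampleSize] += outGrad[idx];
    }
    return inGrad;
}
```

With two batch slots carrying gradients {1, 2, 3} and {10, 20, 30}, the input gradient is {11, 22, 33}, the same result obtained by duplicating the input manually and adding the per-copy gradients.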

See also: RepeatNode (for non-batch dimension replication operations); Tensor::expand() (underlying tensor expansion mechanism)

Usage Example:

```cpp
// Create single-sample input node
InputNode input({1, 1, 256, 256}, true); // Batch 1, 256x256 image

// Expand to batch size 16
ExpandNode expander(&input, 16);
expander.forward();

// Verify expanded shape
std::cout << "Expanded tensor shape: "
          << expander.output->shape() << std::endl; // [16, 1, 256, 256]

// Backward pass handling
expander.backward();
```

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2023/10/16

Definition at line 3536 of file Nodes.cuh.

Constructor & Destructor Documentation

◆ ExpandNode()

nz::nodes::calc::ExpandNode::ExpandNode	(	Node *	input,
		Tensor::size_type	newBatch )

Constructs an ExpandNode object.

Parameters

input	A pointer to the input Node. Memory location: host.
newBatch	The new batch size for the output tensor. Memory location: host.

Returns: None

This function constructs an ExpandNode object. It first checks whether the batch size of the input tensor is 1. If not, it throws a std::invalid_argument exception. Otherwise, it adds the input node to the list of inputs, creates a new output tensor with the specified new batch size and the same non-batch dimensions as the input tensor, and sets the node type to "Expand".

Memory management strategy: The function allocates the output tensor using std::make_shared. The memory is managed by the smart pointer and freed when the last reference to it is released.

Exception handling mechanism: If the batch size of the input tensor is not 1, the function throws a std::invalid_argument exception with an appropriate error message.
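
The validation and shape computation described here can be sketched in isolation. The helper below is hypothetical (expandedShape is not part of the library, and the error message is made up); it only mirrors the documented behavior: reject any input whose first dimension is not 1, otherwise replace the batch dimension with newBatch and keep the remaining dimensions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper mirroring the documented constructor checks (not library code).
std::vector<std::size_t> expandedShape(const std::vector<std::size_t>& inputShape,
                                       std::size_t newBatch) {
    // The documented precondition: the batch (first) dimension must be exactly 1.
    if (inputShape.empty() || inputShape[0] != 1) {
        throw std::invalid_argument("ExpandNode sketch: input batch size must be 1");
    }
    std::vector<std::size_t> outShape = inputShape;
    outShape[0] = newBatch;  // only the batch dimension changes
    return outShape;
}
```

For an input shape of {1, 1, 256, 256} and newBatch = 16, this returns {16, 1, 256, 256}; an input shape such as {2, 3} raises std::invalid_argument.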

Exceptions

std::invalid_argument If the input tensor's batch size is not 1.

Note

Ensure that the input node pointer is valid and points to a properly initialized node.
The new batch size should be a valid non-negative value.

```cpp
InputNode input({1, 1, 256, 256}, true); // input batch size must be 1
Tensor::size_type newBatchSize = 10;
ExpandNode expandNode(&input, newBatchSize);
```

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/07/15

Definition at line 573 of file Nodes.cu.

Member Function Documentation

◆ backward()

void nz::nodes::calc::ExpandNode::backward ( )

override virtual

Performs the backward propagation for the ExpandNode.

Parameters

None

Returns: None

This function performs the backward propagation of the ExpandNode. It first checks whether the input tensor requires gradient computation. If it does, it calculates the number of elements in a single input sample (excluding the batch dimension) and the total number of elements in the output tensor. Then, it configures the CUDA grid and block dimensions for parallel execution. Finally, it calls the Compress kernel, which reduces the output gradients over the batch dimension into the input gradient; this is the reverse of the forward expansion.

Memory management strategy: The function does not allocate or free any memory directly. It relies on the memory already allocated for the input and output gradient tensors.

Exception handling mechanism: There is no explicit exception handling in this function. However, the Compress kernel call may encounter CUDA errors such as invalid grid/block dimensions or device issues.

Exceptions

None

Note

Ensure that the CUDA device is properly initialized before calling this function.
The Compress kernel function should be implemented correctly to handle the gradient compression operation.
The time complexity of the compression operation depends on the implementation of the Compress kernel, but in general, it has a linear time complexity O(n), where n is the total number of elements in the output tensor.

```cpp
InputNode input({1, 1, 256, 256}, true); // batch size 1, gradient required
ExpandNode expandNode(&input, 16);
expandNode.forward();
expandNode.backward();
```

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/07/15

Implements nz::nodes::Node.

Definition at line 594 of file Nodes.cu.

◆ forward()

void nz::nodes::calc::ExpandNode::forward ( )

override virtual

Performs the forward propagation for the ExpandNode.

Parameters

None

Returns: None

This function performs the forward propagation of the ExpandNode. It first calculates the number of elements in a single input sample (excluding the batch dimension) and the total number of elements in the output tensor. Then, it configures the CUDA grid and block dimensions for parallel execution. Finally, it calls the Expand kernel to perform the actual expansion on the device.

Memory management strategy: The function does not allocate or free any memory directly. It relies on the memory allocated for the input and output tensors in the constructor.

Exception handling mechanism: There is no explicit exception handling in this function. However, the Expand kernel call may encounter CUDA errors such as invalid grid/block dimensions or device issues.
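
The grid and block configuration mentioned above usually comes down to a ceiling division over the total element count. The sketch below shows that arithmetic on the host. It is an assumption-laden illustration: LaunchConfig and configureLaunch are hypothetical names, and 256 threads per block is an assumed value, not one taken from Nodes.cu.

```cpp
#include <cstddef>

// Hypothetical launch-configuration helper (not library code).
struct LaunchConfig {
    std::size_t grid;   // number of blocks
    std::size_t block;  // threads per block
};

// threadsPerBlock = 256 is an assumed default for illustration only.
LaunchConfig configureLaunch(std::size_t totalElements, std::size_t threadsPerBlock = 256) {
    // Ceiling division so a final, partially filled block is still launched.
    const std::size_t grid = (totalElements + threadsPerBlock - 1) / threadsPerBlock;
    return {grid, threadsPerBlock};
}
```

For the 16 x 1 x 256 x 256 output used in the class example (1,048,576 elements), this gives 4096 blocks of 256 threads. When the element count is not a multiple of the block size, a kernel launched this way needs a bounds check on the thread index so that surplus threads in the last block do nothing.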

Exceptions

None

Note

Ensure that the CUDA device is properly initialized before calling this function.
The Expand kernel function should be implemented correctly to handle the expansion operation.
The time complexity of the expansion operation depends on the implementation of the Expand kernel, but in general, it has a linear time complexity O(n), where n is the total number of elements in the output tensor.

```cpp
InputNode input({1, 1, 256, 256}, true); // batch size 1
ExpandNode expandNode(&input, 16);
expandNode.forward();
```

Author: Mgepahmge (https://github.com/Mgepahmge)

Date: 2024/07/15

Implements nz::nodes::Node.

Definition at line 585 of file Nodes.cu.

The documentation for this class was generated from the following files:

D:/Users/Mgepahmge/Documents/C Program/NeuZephyr/include/NeuZephyr/Nodes.cuh
D:/Users/Mgepahmge/Documents/C Program/NeuZephyr/src/Nodes.cu
