Tensor API Reference
The Tensor API provides the core data structures and operations for Tofu. Tensors are multi-dimensional arrays that support automatic differentiation when used with the Graph API.
Table of Contents
- Data Structures
- Creation Functions
- Shape Operations
- Mathematical Operations
- Element-wise Operations
- Reductions
- Activation Functions
- Utilities
Data Structures
tofu_tensor
The core tensor structure representing a multi-dimensional array.
struct tofu_tensor {
tofu_dtype dtype; // Data type (TOFU_FLOAT, TOFU_INT32, etc.)
int len; // Total number of elements
int ndim; // Number of dimensions
int *dims; // Array of dimension sizes
void *data; // Pointer to data buffer
struct tofu_tensor *owner; // Data owner (NULL if self-owned)
void *backend_data; // Backend-specific data
};
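The fields can be read directly when inspecting a tensor. A minimal sketch (assuming <stdio.h> for printf and a tensor created with tofu_tensor_zeros, documented below):
tofu_tensor *t = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
printf("ndim=%d len=%d dims=[%d, %d]\n",
       t->ndim, t->len, t->dims[0], t->dims[1]);
// Prints: ndim=2 len=12 dims=[3, 4]
tofu_tensor_free_data_too(t);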
Data Types (tofu_dtype)
Supported tensor data types:
- TOFU_FLOAT - 32-bit floating point (most common for neural networks)
- TOFU_DOUBLE - 64-bit floating point
- TOFU_INT32 - 32-bit signed integer
- TOFU_INT64 - 64-bit signed integer
- TOFU_INT16 - 16-bit signed integer
- TOFU_INT8 - 8-bit signed integer
- TOFU_UINT32 - 32-bit unsigned integer
- TOFU_UINT64 - 64-bit unsigned integer
- TOFU_UINT16 - 16-bit unsigned integer
- TOFU_UINT8 - 8-bit unsigned integer
- TOFU_BOOL - Boolean type
Element-wise Operations (tofu_elew_op)
- TOFU_MUL - Multiplication (*)
- TOFU_DIV - Division (/)
- TOFU_SUM - Addition (+)
- TOFU_SUB - Subtraction (-)
- TOFU_MAX - Element-wise maximum
- TOFU_MIN - Element-wise minimum
- TOFU_POW - Power (^)
Creation Functions
tofu_tensor_create
Create a tensor with an existing data buffer.
tofu_tensor *tofu_tensor_create(void *data, int ndim, const int *dims, tofu_dtype dtype);
Parameters:
- data - Pointer to data buffer (cannot be NULL)
- ndim - Number of dimensions (must be > 0 and <= TOFU_MAXDIM = 8)
- dims - Array of dimension sizes, length must be ndim
- dtype - Data type (TOFU_FLOAT, TOFU_INT32, etc.)
Returns: Pointer to newly allocated tensor (caller owns, must call tofu_tensor_free)
Ownership:
- The tensor does NOT take ownership of the data buffer
- Caller must manage data lifetime and free both tensor and data separately
- Even when passed to tofu_graph_param(), the caller still owns the tensor
Example:
float data[] = {1.0f, 2.0f, 3.0f, 4.0f};
int dims[] = {2, 2};
tofu_tensor *t = tofu_tensor_create(data, 2, dims, TOFU_FLOAT);
// Use tensor...
tofu_tensor_free(t); // Free tensor structure
// data is still valid - free manually if needed
Notes:
- Typical pattern: create tensor → use in graph → tofu_graph_free() → tofu_tensor_free() → free data
- Violating preconditions triggers assert() and crashes
See also: tofu_tensor_zeros, tofu_tensor_create_with_values
tofu_tensor_create_with_values
Create a tensor with heap-allocated copy of provided values.
tofu_tensor *tofu_tensor_create_with_values(const float *values, int ndim, const int *dims);
Parameters:
- values - Array of initial values (cannot be NULL)
- ndim - Number of dimensions (must be > 0 and <= TOFU_MAXDIM)
- dims - Array of dimension sizes, length must be ndim
Returns: Pointer to newly allocated tensor with copied data (caller owns, must call tofu_tensor_free_data_too)
Important:
- Creates heap-allocated copy of values (safe for gradients)
- DO NOT use compound literals like (float[]){1.0f} as they create stack memory
- The number of values must match the product of dims
- Caller must call tofu_tensor_free_data_too to free both tensor and data
Example:
float values[] = {1.0f, 2.0f, 3.0f, 4.0f};
int dims[] = {2, 2};
tofu_tensor *t = tofu_tensor_create_with_values(values, 2, dims);
// Use tensor...
tofu_tensor_free_data_too(t); // Free both tensor and data
tofu_tensor_zeros
Create a zero-initialized tensor with allocated data buffer.
tofu_tensor *tofu_tensor_zeros(int ndim, const int *dims, tofu_dtype dtype);
Parameters:
- ndim - Number of dimensions (must be > 0 and <= TOFU_MAXDIM)
- dims - Array of dimension sizes, length must be ndim
- dtype - Data type (TOFU_FLOAT, TOFU_INT32, etc.)
Returns: Pointer to newly allocated zero-filled tensor (caller owns, must call tofu_tensor_free_data_too)
Ownership:
- Allocates both tensor structure and data buffer
- Caller must call tofu_tensor_free_data_too to free both
- Even when passed to tofu_graph_param(), the caller still owns the tensor
Example:
int dims[] = {3, 4};
tofu_tensor *t = tofu_tensor_zeros(2, dims, TOFU_FLOAT);
// All elements are 0.0f
// Use tensor...
tofu_tensor_free_data_too(t); // Free both tensor and data
See also: tofu_tensor_create, tofu_tensor_clone
tofu_tensor_clone
Create a deep copy of a tensor.
tofu_tensor *tofu_tensor_clone(const tofu_tensor *src);
Parameters:
- src - Source tensor to clone (cannot be NULL)
Returns: Pointer to newly allocated tensor (caller owns, must call tofu_tensor_free_data_too)
Behavior:
- Creates both new tensor structure and new data buffer
- Copies all data from source to new tensor
- Preserves shape and data type
Example:
tofu_tensor *original = tofu_tensor_zeros(2, (int[]){2, 3}, TOFU_FLOAT);
tofu_tensor *copy = tofu_tensor_clone(original);
// copy is independent of original
tofu_tensor_free_data_too(copy);
tofu_tensor_free_data_too(original);
tofu_tensor_repeat
Create a tensor by repeating data multiple times.
tofu_tensor *tofu_tensor_repeat(const tofu_tensor *src, int times);
Parameters:
- src - Source tensor to repeat (cannot be NULL)
- times - Number of repetitions (must be > 0)
Returns: Pointer to newly allocated tensor (caller owns, must call tofu_tensor_free_data_too)
Behavior:
- Creates a new tensor with size = src->len * times
- Repeats source data sequentially
Example:
float data[] = {1.0f, 2.0f};
tofu_tensor *t = tofu_tensor_create(data, 1, (int[]){2}, TOFU_FLOAT);
tofu_tensor *repeated = tofu_tensor_repeat(t, 3);
// repeated contains: [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
tofu_tensor_free_data_too(repeated);
tofu_tensor_free(t);
tofu_tensor_arange
Create a 1-D tensor with evenly spaced values (similar to NumPy arange).
tofu_tensor *tofu_tensor_arange(double start, double stop, double step, tofu_dtype dtype);
Parameters:
- start - Starting value (inclusive)
- stop - Ending value (exclusive)
- step - Step size between values
- dtype - Data type for the resulting tensor
Returns: Pointer to newly allocated 1-D tensor (caller owns, must call tofu_tensor_free_data_too)
Behavior:
- Creates values [start, start+step, start+2*step, ..., stop)
- Number of elements = ceil((stop - start) / step)
Example:
tofu_tensor *t = tofu_tensor_arange(0.0, 5.0, 1.0, TOFU_FLOAT);
// t contains: [0.0, 1.0, 2.0, 3.0, 4.0]
tofu_tensor_free_data_too(t);
See also: tofu_tensor_rearange for in-place filling
tofu_tensor_rearange
Fill existing tensor with evenly spaced values (in-place arange).
void tofu_tensor_rearange(tofu_tensor *src, double start, double stop, double step);
Parameters:
- src - Tensor to fill (cannot be NULL)
- start - Starting value (inclusive)
- stop - Ending value (exclusive)
- step - Step size between values
Behavior:
- Fills the tensor with [start, start+step, start+2*step, ...]
- Number of values written is min(tensor size, ceil((stop-start)/step))
- Modifies tensor data in-place
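Example (a minimal sketch, based on the fill semantics above):
tofu_tensor *t = tofu_tensor_zeros(1, (int[]){5}, TOFU_FLOAT);
tofu_tensor_rearange(t, 0.0, 5.0, 1.0);
// t now contains: [0.0, 1.0, 2.0, 3.0, 4.0]
tofu_tensor_free_data_too(t);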
Cleanup Functions
tofu_tensor_free
Free tensor structure (does NOT free data buffer).
void tofu_tensor_free(tofu_tensor *t);
Parameters:
- t - Tensor to free (can be NULL; no-op if NULL)
Behavior:
- Frees only the tensor structure and dims array
- Does NOT free the data buffer - caller must free data separately
- Safe to call even if the tensor was used with tofu_graph_param()
- Call AFTER tofu_graph_free() if the tensor was used with a graph
Example:
float data[4] = {1.0f, 2.0f, 3.0f, 4.0f};
tofu_tensor *t = tofu_tensor_create(data, 1, (int[]){4}, TOFU_FLOAT);
tofu_tensor_free(t); // Free tensor structure only
// data is still valid
See also: tofu_tensor_free_data_too
tofu_tensor_free_data_too
Free both tensor structure and data buffer.
void tofu_tensor_free_data_too(tofu_tensor *t);
Parameters:
- t - Tensor to free (can be NULL; no-op if NULL)
Behavior:
- Frees both the tensor and its associated data buffer
- Only use if the tensor owns its data (created with tofu_tensor_zeros, tofu_tensor_clone, etc.)
- Do NOT use if the tensor was created with tofu_tensor_create() (use tofu_tensor_free instead)
- Safe to call if the tensor was used with tofu_graph_param()
- Call AFTER tofu_graph_free() if the tensor was used with a graph
Example:
tofu_tensor *t = tofu_tensor_zeros(2, (int[]){2, 3}, TOFU_FLOAT);
tofu_tensor_free_data_too(t); // Free both tensor and data
Warning: Using this on tensors created with tofu_tensor_create() will cause undefined behavior!
Shape Operations
tofu_tensor_size
Get total number of elements in tensor.
size_t tofu_tensor_size(tofu_tensor *t);
Parameters:
- t - Tensor (cannot be NULL)
Returns: Total element count (product of all dimensions)
Example:
tofu_tensor *t = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
size_t size = tofu_tensor_size(t); // Returns 12
tofu_tensor_reshape
Reshape tensor to new dimensions (view operation, no data copy).
tofu_tensor *tofu_tensor_reshape(tofu_tensor *src, int ndim, const int *dims);
Parameters:
- src - Source tensor (cannot be NULL)
- ndim - Number of dimensions for the reshaped tensor
- dims - Array of new dimension sizes
Returns: New tensor structure sharing data with source (caller owns, must call tofu_tensor_free)
Behavior:
- Does NOT copy data - result shares memory with source
- Only changes shape metadata, not data layout
- Source must outlive result tensor
- Product of dims must equal tofu_tensor_size(src)
Warning: Do NOT call tofu_tensor_free_data_too on the reshaped view - this would free the shared data while the source tensor still references it! Only use tofu_tensor_free on views.
Example:
tofu_tensor *t = tofu_tensor_zeros(1, (int[]){12}, TOFU_FLOAT);
tofu_tensor *reshaped = tofu_tensor_reshape(t, 2, (int[]){3, 4});
// reshaped is a view of t with shape [3, 4]
tofu_tensor_free(reshaped); // Free view
tofu_tensor_free_data_too(t); // Free original
See also: tofu_tensor_reshape_src for in-place reshape
tofu_tensor_reshape_src
Reshape tensor in-place (modifies source tensor metadata).
void tofu_tensor_reshape_src(tofu_tensor *src, int ndim, const int *dims);
Parameters:
- src - Tensor to reshape (cannot be NULL)
- ndim - Number of dimensions for the reshaped tensor
- dims - Array of new dimension sizes
Behavior:
- Modifies src tensor structure in-place
- Does NOT copy or reallocate data
- Only changes shape metadata
- Product of dims must equal tofu_tensor_size(src)
Example:
tofu_tensor *t = tofu_tensor_zeros(1, (int[]){12}, TOFU_FLOAT);
tofu_tensor_reshape_src(t, 2, (int[]){3, 4});
// t now has shape [3, 4]
tofu_tensor_transpose
Transpose tensor by permuting dimensions.
tofu_tensor *tofu_tensor_transpose(const tofu_tensor *src, tofu_tensor *dst, const int *axes);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axes - Permutation array (can be NULL for reverse order)
Returns: Result tensor (caller owns if dst was NULL)
Behavior:
- If axes is NULL, reverses dimension order (e.g., [2,3,4] → [4,3,2])
- If axes is non-NULL, permutes according to axes (e.g., axes=[1,0] swaps dims)
- For a 2-D matrix, axes=NULL transposes (rows ↔ columns)
Example:
// Matrix transpose
tofu_tensor *matrix = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
tofu_tensor *transposed = tofu_tensor_transpose(matrix, NULL, NULL);
// transposed has shape [4, 3]
// Custom permutation
int axes[] = {2, 0, 1};
tofu_tensor *t3d = tofu_tensor_zeros(3, (int[]){2, 3, 4}, TOFU_FLOAT);
tofu_tensor *permuted = tofu_tensor_transpose(t3d, NULL, axes);
// permuted has shape [4, 2, 3]
tofu_tensor_slice
Extract slice from tensor (copies data).
tofu_tensor *tofu_tensor_slice(const tofu_tensor *src, tofu_tensor *dst,
int axis, int start, int len);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which to slice
- start - Starting index along axis
- len - Length of slice
Returns: Result tensor (caller owns if dst was NULL)
Preconditions:
- axis < src->ndim
- start >= 0 and start + len <= src->dims[axis]
- If dst is non-NULL, it must have the correct shape for the slice
Example:
tofu_tensor *t = tofu_tensor_arange(0.0, 10.0, 1.0, TOFU_FLOAT);
tofu_tensor *slice = tofu_tensor_slice(t, NULL, 0, 2, 5);
// slice contains: [2.0, 3.0, 4.0, 5.0, 6.0]
See also: tofu_tensor_slice_nocopy for view without copying
tofu_tensor_slice_nocopy
Create view of tensor slice (no data copy).
tofu_tensor *tofu_tensor_slice_nocopy(tofu_tensor *src, tofu_tensor *dst,
int axis, int start, int len);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which to slice
- start - Starting index along axis
- len - Length of slice
Returns: Result tensor sharing data with source (caller owns if dst was NULL)
Behavior:
- Does NOT copy data - result shares memory with source
- Modifying result will modify source tensor
- Source must outlive result tensor
Warning: This is a view operation - changes affect the original tensor!
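Example (a sketch; since the slice shares data with the source, free the view with tofu_tensor_free only, as with reshape views):
tofu_tensor *t = tofu_tensor_arange(0.0, 10.0, 1.0, TOFU_FLOAT);
tofu_tensor *view = tofu_tensor_slice_nocopy(t, NULL, 0, 2, 5);
// view refers to elements [2.0 .. 6.0] of t; writes through view modify t
tofu_tensor_free(view);        // free only the view structure
tofu_tensor_free_data_too(t);  // free the source tensor and its data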
tofu_tensor_concat
Concatenate two tensors along specified axis.
tofu_tensor *tofu_tensor_concat(const tofu_tensor *src1, const tofu_tensor *src2,
tofu_tensor *dst, int axis);
Parameters:
- src1 - First tensor (cannot be NULL)
- src2 - Second tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which to concatenate
Returns: Result tensor (caller owns if dst was NULL)
Preconditions:
- All dimensions except axis must match between src1 and src2
Behavior:
- Result dims[axis] = src1->dims[axis] + src2->dims[axis]
Example:
tofu_tensor *a = tofu_tensor_zeros(2, (int[]){2, 3}, TOFU_FLOAT);
tofu_tensor *b = tofu_tensor_zeros(2, (int[]){2, 3}, TOFU_FLOAT);
tofu_tensor *concat = tofu_tensor_concat(a, b, NULL, 0);
// concat has shape [4, 3]
Mathematical Operations
tofu_tensor_matmul
Compute matrix multiplication with broadcasting.
tofu_tensor *tofu_tensor_matmul(const tofu_tensor *src1, const tofu_tensor *src2,
tofu_tensor *dst);
Parameters:
- src1 - Left operand tensor (cannot be NULL)
- src2 - Right operand tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
Returns: Result tensor (caller owns if dst was NULL)
Preconditions:
- For 1-D @ 1-D: src1->dims[0] must equal src2->dims[0]
- For 2-D and higher: src1->dims[src1->ndim-1] must equal src2->dims[src2->ndim-2]
Behavior:
- 1-D @ 1-D: Dot product → scalar
- 2-D @ 2-D: Standard matrix multiplication
- N-D @ 1-D: Matrix-vector (drops last dim)
- 1-D @ N-D: Vector-matrix (drops first dim)
- N-D @ N-D: Batch matmul with broadcasting
Example:
// Matrix multiplication
tofu_tensor *A = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
tofu_tensor *B = tofu_tensor_zeros(2, (int[]){4, 5}, TOFU_FLOAT);
tofu_tensor *C = tofu_tensor_matmul(A, B, NULL);
// C has shape [3, 5]
// Batch matrix multiplication
tofu_tensor *batch_A = tofu_tensor_zeros(3, (int[]){2, 3, 4}, TOFU_FLOAT);
tofu_tensor *batch_B = tofu_tensor_zeros(3, (int[]){2, 4, 5}, TOFU_FLOAT);
tofu_tensor *batch_C = tofu_tensor_matmul(batch_A, batch_B, NULL);
// batch_C has shape [2, 3, 5]
Notes:
- Most commonly used operation for neural networks
- Broadcasts batch dimensions automatically
See also: tofu_tensor_inner for inner product
tofu_tensor_inner
Compute inner product (sum-product over last axes).
tofu_tensor *tofu_tensor_inner(const tofu_tensor *src1, const tofu_tensor *src2,
tofu_tensor *dst);
Parameters:
- src1 - First tensor (cannot be NULL)
- src2 - Second tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
Returns: Result tensor (caller owns if dst was NULL)
Preconditions:
- src1->dims[src1->ndim-1] must equal src2->dims[src2->ndim-1]
Behavior:
- 1-D × 1-D: Dot product → scalar
- 2-D × 2-D: result[i,j] = sum(a[i,:] * b[j,:])
- N-D × N-D: Cartesian product of non-last dimensions
- Output shape: (*a.shape[:-1], *b.shape[:-1])
Example:
tofu_tensor *a = tofu_tensor_arange(0.0, 3.0, 1.0, TOFU_FLOAT); // [0, 1, 2]
tofu_tensor *b = tofu_tensor_arange(1.0, 4.0, 1.0, TOFU_FLOAT); // [1, 2, 3]
tofu_tensor *result = tofu_tensor_inner(a, b, NULL);
// result = 0*1 + 1*2 + 2*3 = 8.0
See also: tofu_tensor_matmul, tofu_tensor_outer
tofu_tensor_outer
Compute outer product (cartesian product without summation).
tofu_tensor *tofu_tensor_outer(const tofu_tensor *src1, const tofu_tensor *src2,
tofu_tensor *dst);
Parameters:
- src1 - First tensor (cannot be NULL)
- src2 - Second tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
Returns: Result tensor (caller owns if dst was NULL)
Behavior:
- Flattens both input tensors
- Computes: result[i,j] = a[i] * b[j]
- Always produces 2-D output
- Output shape: [a.size, b.size], where size is the total element count
Example:
tofu_tensor *a = tofu_tensor_arange(0.0, 3.0, 1.0, TOFU_FLOAT); // [0, 1, 2]
tofu_tensor *b = tofu_tensor_arange(1.0, 3.0, 1.0, TOFU_FLOAT); // [1, 2]
tofu_tensor *result = tofu_tensor_outer(a, b, NULL);
// result shape [3, 2]:
// [[0, 0],
// [1, 2],
// [2, 4]]
Element-wise Operations
tofu_tensor_elew
Apply element-wise binary operation with broadcasting.
tofu_tensor *tofu_tensor_elew(const tofu_tensor *src1, const tofu_tensor *src2,
tofu_tensor *dst, tofu_elew_op elew_op);
Parameters:
- src1 - First tensor (cannot be NULL)
- src2 - Second tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- elew_op - Operation to apply (TOFU_MUL, TOFU_DIV, TOFU_SUM, TOFU_SUB, TOFU_POW, etc.)
Returns: Result tensor (caller owns if dst was NULL)
Preconditions:
- src1 and src2 must be broadcastable (NumPy rules)
Operations:
- TOFU_MUL - Element-wise multiplication (*)
- TOFU_DIV - Element-wise division (/)
- TOFU_SUM - Element-wise addition (+)
- TOFU_SUB - Element-wise subtraction (-)
- TOFU_POW - Element-wise power (^)
- TOFU_MAX - Element-wise maximum
- TOFU_MIN - Element-wise minimum
Example:
tofu_tensor *a = tofu_tensor_arange(1.0, 5.0, 1.0, TOFU_FLOAT); // [1, 2, 3, 4]
tofu_tensor *b = tofu_tensor_arange(2.0, 6.0, 1.0, TOFU_FLOAT); // [2, 3, 4, 5]
tofu_tensor *sum = tofu_tensor_elew(a, b, NULL, TOFU_SUM);
// sum = [3, 5, 7, 9]
tofu_tensor *prod = tofu_tensor_elew(a, b, NULL, TOFU_MUL);
// prod = [2, 6, 12, 20]
// Broadcasting example
tofu_tensor *matrix = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
float scalar_data[] = {2.0f};
tofu_tensor *scalar = tofu_tensor_create(scalar_data, 1, (int[]){1}, TOFU_FLOAT);
tofu_tensor *scaled = tofu_tensor_elew(matrix, scalar, NULL, TOFU_MUL);
// All elements of matrix multiplied by 2.0
See also: tofu_tensor_elew_param, tofu_tensor_elew_broadcast
tofu_tensor_elew_param
Apply element-wise operation between tensor and scalar.
tofu_tensor *tofu_tensor_elew_param(const tofu_tensor *src, double param,
tofu_tensor *dst, tofu_elew_op elew_op);
Parameters:
- src - Source tensor (cannot be NULL)
- param - Scalar parameter
- dst - Destination tensor (can be NULL to allocate new)
- elew_op - Operation to apply
Returns: Result tensor with same shape as src (caller owns if dst was NULL)
Behavior:
- Applies the operation element-wise: op(tensor_element, param)
Example:
tofu_tensor *t = tofu_tensor_arange(1.0, 5.0, 1.0, TOFU_FLOAT); // [1, 2, 3, 4]
tofu_tensor *scaled = tofu_tensor_elew_param(t, 2.0, NULL, TOFU_MUL);
// scaled = [2, 4, 6, 8]
tofu_tensor *shifted = tofu_tensor_elew_param(t, 10.0, NULL, TOFU_SUM);
// shifted = [11, 12, 13, 14]
tofu_tensor *squared = tofu_tensor_elew_param(t, 2.0, NULL, TOFU_POW);
// squared = [1, 4, 9, 16]
tofu_tensor_elew_broadcast
Apply element-wise operation with automatic broadcasting.
tofu_tensor *tofu_tensor_elew_broadcast(const tofu_tensor *src1, const tofu_tensor *src2,
tofu_tensor *dst, tofu_elew_op elew_op);
Parameters:
- src1 - First tensor (cannot be NULL)
- src2 - Second tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- elew_op - Operation to apply
Returns: Result tensor with broadcast shape (caller owns if dst was NULL)
Notes:
- Automatically broadcasts inputs to compatible shape
- Equivalent to tofu_tensor_elew but with explicit broadcast handling
- Follows NumPy broadcasting rules
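Example (a sketch; adds a row vector to each row of a matrix under the NumPy-style rules above):
tofu_tensor *matrix = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
float bias_data[] = {1.0f, 2.0f, 3.0f, 4.0f};
tofu_tensor *bias = tofu_tensor_create(bias_data, 1, (int[]){4}, TOFU_FLOAT);
tofu_tensor *out = tofu_tensor_elew_broadcast(matrix, bias, NULL, TOFU_SUM);
// out has shape [3, 4]; every row equals [1.0, 2.0, 3.0, 4.0]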
Reductions
tofu_tensor_sumreduce
Reduce tensor along axis using sum operation.
tofu_tensor *tofu_tensor_sumreduce(const tofu_tensor *src, tofu_tensor *dst, int axis);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which to reduce
Returns: Result tensor with dims[axis] removed (caller owns if dst was NULL)
Behavior:
- Output shape: src->dims with dims[axis] removed
- Computes the sum of all elements along the specified axis
Example:
tofu_tensor *t = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
// Fill with 1.0
for (int i = 0; i < 12; i++) {
float val = 1.0f;
TOFU_TENSOR_DATA_FROM(t, i, val, TOFU_FLOAT);
}
tofu_tensor *row_sum = tofu_tensor_sumreduce(t, NULL, 1);
// row_sum has shape [3], each element = 4.0
tofu_tensor *col_sum = tofu_tensor_sumreduce(t, NULL, 0);
// col_sum has shape [4], each element = 3.0
See also: tofu_tensor_meanreduce, tofu_tensor_maxreduce
tofu_tensor_meanreduce
Reduce tensor along axis using mean operation.
tofu_tensor *tofu_tensor_meanreduce(const tofu_tensor *src, tofu_tensor *dst, int axis);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which to reduce
Returns: Result tensor with dims[axis] removed (caller owns if dst was NULL)
Behavior:
- Output shape: src->dims with dims[axis] removed
- Computes the arithmetic mean of all elements along the specified axis
Example:
tofu_tensor *t = tofu_tensor_arange(0.0, 12.0, 1.0, TOFU_FLOAT);
tofu_tensor_reshape_src(t, 2, (int[]){3, 4});
tofu_tensor *row_mean = tofu_tensor_meanreduce(t, NULL, 1);
// row_mean has shape [3]
// row_mean[0] = mean([0,1,2,3]) = 1.5
// row_mean[1] = mean([4,5,6,7]) = 5.5
// row_mean[2] = mean([8,9,10,11]) = 9.5
tofu_tensor_maxreduce
Reduce tensor along axis using max operation.
tofu_tensor *tofu_tensor_maxreduce(const tofu_tensor *src, tofu_tensor *dst,
tofu_tensor *arg, int axis);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- arg - Argmax indices tensor (can be NULL if indices are not needed)
- axis - Axis along which to reduce
Returns: Result tensor with dims[axis] removed (caller owns if dst was NULL)
Behavior:
- Output shape: src->dims with dims[axis] removed
- If arg is non-NULL, fills it with the indices of the maximum values
Example:
float data[] = {3.0f, 1.0f, 4.0f, 1.0f, 5.0f, 9.0f};
tofu_tensor *t = tofu_tensor_create(data, 2, (int[]){2, 3}, TOFU_FLOAT);
tofu_tensor *max_vals = tofu_tensor_maxreduce(t, NULL, NULL, 1);
// max_vals = [4.0, 9.0]
tofu_tensor *indices = tofu_tensor_zeros(1, (int[]){2}, TOFU_INT32);
tofu_tensor_maxreduce(t, max_vals, indices, 1); // reuse max_vals as dst
// indices = [2, 2] (position of max in each row)
tofu_tensor_sub_broadcast
Subtract reduced tensor from source with broadcasting.
tofu_tensor *tofu_tensor_sub_broadcast(const tofu_tensor *src, const tofu_tensor *reduced,
tofu_tensor *dst, int axis);
Parameters:
- src - Source tensor (cannot be NULL)
- reduced - Reduced tensor to subtract (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which the reduction was performed
Returns: Result tensor with same shape as src (caller owns if dst was NULL)
Preconditions:
- reduced->ndim must equal src->ndim - 1 (one dimension removed)
Behavior:
- Broadcasts reduced tensor back along axis and subtracts
- Useful for normalization operations (subtract mean, etc.)
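Example (a sketch; centers each row by subtracting its mean, per the behavior above):
tofu_tensor *t = tofu_tensor_arange(0.0, 12.0, 1.0, TOFU_FLOAT);
tofu_tensor_reshape_src(t, 2, (int[]){3, 4});
tofu_tensor *row_mean = tofu_tensor_meanreduce(t, NULL, 1);              // shape [3]
tofu_tensor *centered = tofu_tensor_sub_broadcast(t, row_mean, NULL, 1);
// centered has shape [3, 4]; e.g. row 0 = [0,1,2,3] - 1.5 = [-1.5, -0.5, 0.5, 1.5]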
Activation Functions
tofu_tensor_lrelu
Apply Leaky ReLU activation function.
tofu_tensor *tofu_tensor_lrelu(const tofu_tensor *src, tofu_tensor *dst, float negslope);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- negslope - Slope for negative values (typically 0.01)
Returns: Result tensor with same shape as src (caller owns if dst was NULL)
Behavior:
- Computes: x if x >= 0, else negslope * x
- Equivalent to standard ReLU when negslope = 0
Example:
float data[] = {-2.0f, -1.0f, 0.0f, 1.0f, 2.0f};
tofu_tensor *t = tofu_tensor_create(data, 1, (int[]){5}, TOFU_FLOAT);
tofu_tensor *relu = tofu_tensor_lrelu(t, NULL, 0.0f);
// relu = [0.0, 0.0, 0.0, 1.0, 2.0]
tofu_tensor *leaky = tofu_tensor_lrelu(t, NULL, 0.01f);
// leaky = [-0.02, -0.01, 0.0, 1.0, 2.0]
Note: For use in computation graphs with automatic differentiation, use tofu_graph_relu() instead.
tofu_tensor_softmax
Apply softmax activation along specified axis.
tofu_tensor *tofu_tensor_softmax(const tofu_tensor *src, tofu_tensor *dst, int axis);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- axis - Axis along which to apply softmax
Returns: Result tensor with same shape as src (caller owns if dst was NULL)
Behavior:
- Computes: exp(x_i) / sum(exp(x_j)) along axis
- Uses a numerically stable implementation (subtracts the max before exp)
- Output values sum to 1.0 along specified axis
Example:
float logits[] = {1.0f, 2.0f, 3.0f};
tofu_tensor *t = tofu_tensor_create(logits, 1, (int[]){3}, TOFU_FLOAT);
tofu_tensor *probs = tofu_tensor_softmax(t, NULL, 0);
// probs ≈ [0.09, 0.24, 0.67] (sums to 1.0)
Note: For use in computation graphs with automatic differentiation, use tofu_graph_softmax() instead.
tofu_tensor_layer_norm
Apply layer normalization with learnable affine transform.
tofu_tensor *tofu_tensor_layer_norm(const tofu_tensor *src, tofu_tensor *dst,
const tofu_tensor *gamma, const tofu_tensor *beta,
int axis, double eps);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- gamma - Scale parameter tensor (can be NULL for no scaling)
- beta - Shift parameter tensor (can be NULL for no shift)
- axis - Axis along which to normalize
- eps - Small constant for numerical stability (typically 1e-5)
Returns: Result tensor with same shape as src (caller owns if dst was NULL)
Behavior:
- Normalizes: (x - mean) / sqrt(variance + eps)
- Then applies: gamma * normalized + beta (if gamma/beta are non-NULL)
- If gamma/beta are NULL, only normalization is applied
Example:
tofu_tensor *x = tofu_tensor_zeros(2, (int[]){2, 4}, TOFU_FLOAT);
float gamma_data[] = {1.0f, 1.0f, 1.0f, 1.0f};
float beta_data[] = {0.0f, 0.0f, 0.0f, 0.0f};
tofu_tensor *gamma = tofu_tensor_create(gamma_data, 1, (int[]){4}, TOFU_FLOAT);
tofu_tensor *beta = tofu_tensor_create(beta_data, 1, (int[]){4}, TOFU_FLOAT);
tofu_tensor *normalized = tofu_tensor_layer_norm(x, NULL, gamma, beta, 1, 1e-5);
Utilities
tofu_tensor_issameshape
Check if two tensors have the same shape.
int tofu_tensor_issameshape(const tofu_tensor *t1, const tofu_tensor *t2);
Parameters:
- t1 - First tensor (cannot be NULL)
- t2 - Second tensor (cannot be NULL)
Returns: 1 if same shape, 0 otherwise
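Example (a minimal sketch):
tofu_tensor *a = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
tofu_tensor *b = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
tofu_tensor *c = tofu_tensor_zeros(2, (int[]){4, 3}, TOFU_FLOAT);
int same = tofu_tensor_issameshape(a, b); // Returns 1
same = tofu_tensor_issameshape(a, c);     // Returns 0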
tofu_tensor_isbroadcastable
Check if two tensors can be broadcast together (NumPy semantics).
int tofu_tensor_isbroadcastable(const tofu_tensor *t1, const tofu_tensor *t2);
Parameters:
- t1 - First tensor (cannot be NULL)
- t2 - Second tensor (cannot be NULL)
Returns: 1 if broadcastable, 0 otherwise
Broadcasting Rules:
- Arrays with fewer dimensions are prepended with size-1 dimensions
- Size-1 dimensions are stretched to match the other array
- Dimensions must match or one must be 1
Example:
tofu_tensor *a = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
tofu_tensor *b = tofu_tensor_zeros(1, (int[]){4}, TOFU_FLOAT);
int can_broadcast = tofu_tensor_isbroadcastable(a, b); // Returns 1
tofu_tensor *c = tofu_tensor_zeros(1, (int[]){3}, TOFU_FLOAT);
can_broadcast = tofu_tensor_isbroadcastable(a, c); // Returns 0
tofu_tensor_broadcast_to
Broadcast tensor to specified shape (NumPy semantics).
tofu_tensor *tofu_tensor_broadcast_to(const tofu_tensor *src, tofu_tensor *dst,
int ndim, const int *dims);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- ndim - Number of dimensions for the target shape
- dims - Target dimension sizes
Returns: Result tensor with target shape (caller owns if dst was NULL)
Preconditions:
- src must be broadcastable to target shape (NumPy rules)
Behavior:
- Follows NumPy broadcasting rules
- Size-1 dimensions are stretched to match target
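Example (a sketch; stretches a 1-D tensor across rows under the rules above):
tofu_tensor *row = tofu_tensor_arange(0.0, 4.0, 1.0, TOFU_FLOAT); // [0, 1, 2, 3]
tofu_tensor *tiled = tofu_tensor_broadcast_to(row, NULL, 2, (int[]){3, 4});
// tiled has shape [3, 4]; each row is [0.0, 1.0, 2.0, 3.0]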
tofu_tensor_print
Print tensor to stdout with custom format.
void tofu_tensor_print(const tofu_tensor *t, const char *fmt);
Parameters:
- t - Tensor to print (cannot be NULL)
- fmt - Format string for each element (e.g., "%.6f", "%d")
Example:
tofu_tensor *t = tofu_tensor_arange(0.0, 6.0, 1.0, TOFU_FLOAT);
tofu_tensor_reshape_src(t, 2, (int[]){2, 3});
tofu_tensor_print(t, "%.1f");
// Output:
// [[0.0, 1.0, 2.0],
// [3.0, 4.0, 5.0]]
See also: tofu_tensor_fprint for printing to arbitrary stream, tofu_tensor_save for saving to file
tofu_tensor_fprint
Print tensor to file stream with custom format.
void tofu_tensor_fprint(FILE *stream, const tofu_tensor *t, const char *fmt);
Parameters:
- stream - File stream to write to (cannot be NULL)
- t - Tensor to print (cannot be NULL)
- fmt - Format string for each element
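Example (a minimal sketch; writes to stderr instead of stdout, in the same layout as tofu_tensor_print):
tofu_tensor *t = tofu_tensor_arange(0.0, 4.0, 1.0, TOFU_FLOAT);
tofu_tensor_fprint(stderr, t, "%.1f");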
tofu_tensor_save
Save tensor to file with custom format.
int tofu_tensor_save(const char *file_name, const tofu_tensor *t, const char *fmt);
Parameters:
- file_name - Path to output file (cannot be NULL)
- t - Tensor to save (cannot be NULL)
- fmt - Format string for each element
Returns: 0 on success, non-zero on error
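Example (a sketch; the file name is illustrative):
tofu_tensor *t = tofu_tensor_arange(0.0, 6.0, 1.0, TOFU_FLOAT);
if (tofu_tensor_save("tensor.txt", t, "%.3f") != 0) {
    // handle write error
}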
tofu_tensor_convert
Convert tensor to different data type.
tofu_tensor *tofu_tensor_convert(const tofu_tensor *src, tofu_tensor *dst,
tofu_dtype dtype_d);
Parameters:
- src - Source tensor (cannot be NULL)
- dst - Destination tensor (can be NULL to allocate new)
- dtype_d - Target data type
Returns: Result tensor with same shape as src but different dtype (caller owns if dst was NULL)
Behavior:
- Converts each element to target type with appropriate casting
- May lose precision (e.g., float to int truncates)
Example:
float data[] = {1.7f, 2.3f, 3.9f};
tofu_tensor *floats = tofu_tensor_create(data, 1, (int[]){3}, TOFU_FLOAT);
tofu_tensor *ints = tofu_tensor_convert(floats, NULL, TOFU_INT32);
// ints = [1, 2, 3]
tofu_tensor_index
Convert multi-dimensional coordinates to flat index.
int tofu_tensor_index(const tofu_tensor *t, int *coords);
Parameters:
- t - Tensor (cannot be NULL)
- coords - Array of coordinates, length must be t->ndim
Returns: Flat index into tensor data array
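Example (a sketch for a [3, 4] tensor; assumes the row-major layout implied by the flat indexing used elsewhere in this API):
tofu_tensor *t = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
int coords[] = {1, 2};
int flat = tofu_tensor_index(t, coords); // flat == 6 under row-major layout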
tofu_tensor_coords
Convert flat index to multi-dimensional coordinates.
void tofu_tensor_coords(const tofu_tensor *t, int index, int *coords);
Parameters:
- t - Tensor (cannot be NULL)
- index - Flat index into the tensor data array
- coords - Output array for coordinates, length must be t->ndim
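Example (a sketch; the inverse of the tofu_tensor_index example above, under the same row-major assumption):
int out_coords[2];
tofu_tensor_coords(t, 6, out_coords);
// out_coords == {1, 2} for the [3, 4] tensor above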
Common Patterns
Working with Tensor Memory
// Pattern 1: User manages data buffer
float data[4] = {1.0f, 2.0f, 3.0f, 4.0f};
tofu_tensor *t = tofu_tensor_create(data, 1, (int[]){4}, TOFU_FLOAT);
// Use tensor...
tofu_tensor_free(t);
// data is still valid
// Pattern 2: Library manages data buffer
tofu_tensor *t = tofu_tensor_zeros(1, (int[]){4}, TOFU_FLOAT);
// Use tensor...
tofu_tensor_free_data_too(t);
// Both tensor and data are freed
Accessing Tensor Elements
tofu_tensor *t = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
// Read element at index i
float value;
TOFU_TENSOR_DATA_TO(t, i, value, TOFU_FLOAT);
// Write element at index i
value = 42.0f;
TOFU_TENSOR_DATA_FROM(t, i, value, TOFU_FLOAT);
// Copy element from src[si] to dst[di]
TOFU_TENSOR_DATA_ASSIGN(dst, di, src, si);
Broadcasting Example
// Add scalar to matrix (broadcasting)
tofu_tensor *matrix = tofu_tensor_zeros(2, (int[]){3, 4}, TOFU_FLOAT);
tofu_tensor *result = tofu_tensor_elew_param(matrix, 5.0, NULL, TOFU_SUM);
// Add vector to matrix rows (broadcasting)
tofu_tensor *row_vec = tofu_tensor_zeros(1, (int[]){4}, TOFU_FLOAT);
tofu_tensor *row_result = tofu_tensor_elew_broadcast(matrix, row_vec, NULL, TOFU_SUM);