API Reference
This section provides a comprehensive reference for all functions, classes, and modules in the Riemann library.
Tensor Operations
Tensor Creation Functions
- riemann.tensor(data, dtype=None, requires_grad=False)
Create a tensor from data.
- Parameters:
data (array_like) – Data to initialize the tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Tensor containing the given data
- Return type:
riemann.TN
- riemann.zeros(*shape, dtype=None, requires_grad=False)
Create a tensor filled with zeros.
- Parameters:
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Tensor filled with zeros
- Return type:
riemann.TN
- riemann.ones(*shape, dtype=None, requires_grad=False)
Create a tensor filled with ones.
- Parameters:
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Tensor filled with ones
- Return type:
riemann.TN
- riemann.empty(*shape, dtype=None, requires_grad=False)
Create an uninitialized tensor.
- Parameters:
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Uninitialized tensor
- Return type:
riemann.TN
- riemann.full(*shape, fill_value, dtype=None, requires_grad=False)
Create a tensor filled with a specific value.
- Parameters:
fill_value (scalar) – Value to fill the tensor with
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Tensor filled with the specified value
- Return type:
riemann.TN
- riemann.eye(n, m=None, dtype=None, requires_grad=False)
Create a 2D tensor with ones on the diagonal and zeros elsewhere.
- Parameters:
n (int) – Number of rows
m (int, optional) – Number of columns (default: n)
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
2D tensor with ones on the diagonal
- Return type:
riemann.TN
- riemann.zeros_like(tsr, dtype=None, requires_grad=False)
Create a zero tensor with the same shape as the input tensor.
- Parameters:
tsr (riemann.TN) – Reference tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Zero tensor with the same shape as the input tensor
- Return type:
riemann.TN
- riemann.ones_like(tsr, dtype=None, requires_grad=False)
Create a tensor of ones with the same shape as the input tensor.
- Parameters:
tsr (riemann.TN) – Reference tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Tensor of ones with the same shape as the input tensor
- Return type:
riemann.TN
- riemann.empty_like(tsr, dtype=None, requires_grad=False)
Create an uninitialized tensor with the same shape as the input tensor.
- Parameters:
tsr (riemann.TN) – Reference tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Uninitialized tensor with the same shape as the input tensor
- Return type:
riemann.TN
- riemann.full_like(tsr, fill_value, dtype=None, requires_grad=False)
Create a tensor with the same shape as the input tensor, filled with a specific value.
- Parameters:
tsr (riemann.TN) – Reference tensor
fill_value (scalar) – Value to fill the tensor with
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Tensor with the same shape as the input tensor, filled with the specified value
- Return type:
riemann.TN
Random Number Generation
- riemann.rand(*size, requires_grad=False, dtype=None)
Create a tensor filled with random numbers from a uniform distribution over [0, 1).
- Parameters:
requires_grad (bool, optional) – Whether to track operations on this tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
- Returns:
Tensor filled with random values
- Return type:
riemann.TN
- riemann.randn(*size, requires_grad=False, dtype=None)
Create a tensor filled with random numbers from a standard normal distribution.
- Parameters:
requires_grad (bool, optional) – Whether to track operations on this tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
- Returns:
Tensor filled with random values
- Return type:
riemann.TN
- riemann.randint(low, high, size, requires_grad=False, dtype=int64)
Create a tensor filled with random integers from low (inclusive) to high (exclusive).
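- Parameters:
low (int) – Lower bound (inclusive)
high (int) – Upper bound (exclusive)
size (tuple of integers) – Shape of the output tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor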
- Returns:
Tensor filled with random integers
- Return type:
riemann.TN
- riemann.randperm(n, requires_grad=False, dtype=int64)
Create a tensor containing numbers from 0 to n-1 in random order.
- Parameters:
n (int) – Upper bound (exclusive)
requires_grad (bool, optional) – Whether to track operations on this tensor
dtype (numpy.dtype, optional) – Expected data type of the tensor
- Returns:
Tensor containing randomly permuted integers
- Return type:
riemann.TN
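To illustrate what "numbers from 0 to n-1 in random order" means, the sketch below builds such a permutation with Python's standard random module rather than with Riemann itself. The helper name randperm_sketch and its seed argument are hypothetical; how Riemann generates or seeds its permutation is not stated in this reference.

```python
import random

def randperm_sketch(n, seed=None):
    """Return the integers 0..n-1 in random order (standard-library stand-in)."""
    rng = random.Random(seed)   # local generator, so global random state is untouched
    values = list(range(n))
    rng.shuffle(values)         # shuffles the list in place
    return values

perm = randperm_sketch(5, seed=0)
print(perm)          # some ordering of 0, 1, 2, 3, 4
print(sorted(perm))  # [0, 1, 2, 3, 4]
```

Every integer appears exactly once, so sorting the result always recovers the original range.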
- riemann.normal(mean, std, size, dtype=None)
Create a tensor filled with random numbers from a normal distribution.
- Parameters:
mean (float) – Mean of the normal distribution
std (float) – Standard deviation of the normal distribution
dtype (numpy.dtype, optional) – Expected data type of the tensor
- Returns:
Tensor filled with random values
- Return type:
riemann.TN
Random Seed Control
Sequence and Range Functions
- riemann.arange(start, end=None, step=1.0, dtype=None, requires_grad=False)
Create a 1-D tensor of evenly spaced values from start to end with step.
- Parameters:
start (float) – Start value
end (float, optional) – End value (exclusive)
step (float, optional) – Spacing between values
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
1-D tensor containing evenly spaced values
- Return type:
riemann.TN
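The half-open interval used by arange can be sketched in plain Python. This is an illustration only: arange_sketch is a hypothetical helper, and the single-argument behaviour (treating the value as end and starting from 0) is an assumption based on the common NumPy and PyTorch convention, not something this reference confirms.

```python
import math

def arange_sketch(start, end=None, step=1.0):
    # Assumption: with a single argument, the range runs from 0 up to that value.
    if end is None:
        start, end = 0.0, start
    if step == 0:
        raise ValueError("step must be non-zero")
    count = max(0, math.ceil((end - start) / step))
    # Compute each value from its index to avoid accumulating rounding error.
    return [start + i * step for i in range(count)]

print(arange_sketch(5))           # [0.0, 1.0, 2.0, 3.0, 4.0]
print(arange_sketch(1, 2, 0.25))  # [1.0, 1.25, 1.5, 1.75]
```

In both calls the end value itself is excluded, matching the "End value (exclusive)" description above.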
- riemann.linspace(start, end, steps=100, endpoint=True, dtype=None, requires_grad=False)
Create a 1-D tensor of evenly spaced values within a given interval.
- Parameters:
start (float) – Start value
end (float) – End value
steps (int, optional) – Number of samples to generate
endpoint (bool, optional) – Whether to include the end value
dtype (numpy.dtype, optional) – Expected data type of the tensor
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
1-D tensor containing evenly spaced values
- Return type:
riemann.TN
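The effect of the endpoint flag is easiest to see on a small interval. The following pure-Python sketch assumes the NumPy convention (steps - 1 gaps when the end value is included, steps gaps when it is not); linspace_sketch is a hypothetical helper and may differ from Riemann's implementation in details such as dtype handling.

```python
def linspace_sketch(start, end, steps=100, endpoint=True):
    if steps <= 0:
        return []
    if steps == 1:
        return [float(start)]
    # With endpoint=True the last sample equals `end`, so there are steps - 1 gaps;
    # with endpoint=False the interval is divided into `steps` gaps.
    gaps = steps - 1 if endpoint else steps
    delta = (end - start) / gaps
    return [start + i * delta for i in range(steps)]

print(linspace_sketch(0.0, 1.0, steps=5))                  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(linspace_sketch(0.0, 1.0, steps=4, endpoint=False))  # [0.0, 0.25, 0.5, 0.75]
```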
Tensor Attributes
- riemann.TN.dtype()
Return the data type of the tensor.
- Returns:
Data type of the tensor
- Return type:
numpy.dtype
- riemann.TN.real()
Return the real part of a complex tensor.
- Returns:
Tensor containing the real parts
- Return type:
riemann.TN
- riemann.TN.imag()
Return the imaginary part of a complex tensor.
- Returns:
Tensor containing the imaginary parts
- Return type:
riemann.TN
- riemann.TN.shape()
Return the shape of the tensor.
- Returns:
Tuple of tensor dimensions
- Return type:
tuple
- riemann.TN.ndim()
Return the number of dimensions of the tensor.
- Returns:
Number of dimensions of the tensor
- Return type:
int
- riemann.TN.device()
Return the device where the tensor is located.
- Returns:
Device object where the tensor is located
- Return type:
- riemann.TN.is_cuda()
Check if the tensor is on a CUDA device.
- Returns:
True if the tensor is on a CUDA device, False otherwise
- Return type:
bool
- riemann.TN.is_cpu()
Check if the tensor is on a CPU device.
- Returns:
True if the tensor is on a CPU device, False otherwise
- Return type:
bool
- riemann.TN.is_leaf()
Check if the tensor is a leaf node.
- Returns:
True if the tensor is a leaf node, False otherwise
- Return type:
bool
- riemann.TN.is_floating_point()
Check if the tensor is of floating point type.
- Returns:
True if the tensor is of floating point type, False otherwise
- Return type:
bool
- riemann.TN.is_complex()
Check if the tensor is of complex type.
- Returns:
True if the tensor is of complex type, False otherwise
- Return type:
bool
- riemann.TN.isreal()
Determine if tensor elements are real numbers.
- Returns:
Boolean tensor, True indicates the corresponding element is real
- Return type:
riemann.TN
- riemann.TN.isinf()
Determine if tensor elements are infinity.
- Returns:
Boolean tensor, True indicates the corresponding element is infinity
- Return type:
riemann.TN
- riemann.TN.isnan()
Determine if tensor elements are NaN (Not a Number).
- Returns:
Boolean tensor, True indicates the corresponding element is NaN
- Return type:
riemann.TN
- riemann.TN.conj()
Return the complex conjugate of the tensor.
- Returns:
Tensor containing conjugate elements
- Return type:
riemann.TN
- riemann.TN.size(dim=None)
Return the size of the tensor.
- riemann.TN.numel()
Return the total number of elements in the tensor.
- Returns:
Number of elements in the tensor
- Return type:
int
Tensor Shape Operations
- riemann.reshape(input, shape)
Return a tensor with the same data but different shape.
- Parameters:
input (riemann.TN) – Input tensor
shape (tuple of integers) – New shape
- Returns:
Tensor with the new shape
- Return type:
riemann.TN
- riemann.squeeze(input, dim=None)
Remove dimensions of size 1 from the tensor shape.
- Parameters:
input (riemann.TN) – Input tensor
dim (int, optional) – Dimension to squeeze
- Returns:
Tensor with squeezed dimensions
- Return type:
riemann.TN
- riemann.unsqueeze(input, dim)
Insert a dimension of size 1 at the specified position.
- Parameters:
input (riemann.TN) – Input tensor
dim (int) – Dimension to expand
- Returns:
Tensor with expanded dimensions
- Return type:
riemann.TN
- riemann.transpose(input, dim0, dim1)
Swap two dimensions of the tensor.
- riemann.TN.mT
Matrix transpose, i.e., transpose between the last two dimensions of the tensor.
For a row vector (1, n), transpose to column vector (n, 1); for a column vector (n, 1), transpose to row vector (1, n).
- Returns:
Transposed tensor
- Return type:
riemann.TN
- riemann.is_contiguous(input)
Check if the tensor memory is contiguous.
- Parameters:
input (riemann.TN) – Input tensor
- Returns:
Whether the memory is contiguous
- Return type:
bool
- riemann.contiguous(input)
Return a tensor with contiguous memory.
- Parameters:
input (riemann.TN) – Input tensor
- Returns:
Tensor with contiguous memory
- Return type:
riemann.TN
- riemann.gather(input, dim, index)
Gather elements from the tensor along a specified dimension.
- Parameters:
input (riemann.TN) – Input tensor
dim (int) – Dimension to gather
index (riemann.TN) – Index tensor
- Returns:
Gathered tensor
- Return type:
riemann.TN
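The indexing rule behind gather is easier to follow on a 2-D example. The sketch below assumes the PyTorch-style convention (for dim 0, out[i][j] = input[index[i][j]][j]; for dim 1, out[i][j] = input[i][index[i][j]]) and works on nested lists; gather_2d_sketch is a hypothetical helper, not part of Riemann.

```python
def gather_2d_sketch(data, dim, index):
    """Gather from a 2-D nested list; the output has the same shape as `index`."""
    if dim not in (0, 1):
        raise ValueError("this sketch handles 2-D inputs only")
    out = []
    for i, row in enumerate(index):
        out_row = []
        for j, k in enumerate(row):
            # The index value replaces the coordinate along `dim`.
            out_row.append(data[k][j] if dim == 0 else data[i][k])
        out.append(out_row)
    return out

data = [[1, 2], [3, 4]]
print(gather_2d_sketch(data, 1, [[0, 0], [1, 0]]))  # [[1, 1], [4, 3]]
print(gather_2d_sketch(data, 0, [[0, 1], [1, 0]]))  # [[1, 4], [3, 2]]
```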
- riemann.scatter(input, dim, index, src)
Scatter elements from the source tensor into the target tensor along a specified dimension.
- Parameters:
input (riemann.TN) – Target tensor
dim (int) – Dimension to scatter
index (riemann.TN) – Index tensor
src (riemann.TN) – Source tensor
- Returns:
Scattered tensor
- Return type:
riemann.TN
- riemann.broadcast_to(input, size)
Broadcast the tensor to a new shape.
- Parameters:
input (riemann.TN) – Input tensor
size (tuple of integers) – Target shape
- Returns:
Broadcasted tensor
- Return type:
riemann.TN
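Whether a tensor can be broadcast to a target shape depends only on the two shapes. The check below is a sketch of the usual NumPy-style rule (align shapes from the trailing dimension; each source dimension must equal the target dimension or be 1). It is assumed, not confirmed, that Riemann applies the same rule; can_broadcast_to is a hypothetical helper.

```python
def can_broadcast_to(shape, target):
    """Return True if `shape` can be broadcast to `target` under the NumPy-style rule."""
    if len(shape) > len(target):
        return False
    # Align shapes from the trailing dimension and compare pairwise.
    for src, dst in zip(reversed(shape), reversed(target)):
        if src != dst and src != 1:
            return False
    return True

print(can_broadcast_to((3, 1), (2, 3, 4)))  # True
print(can_broadcast_to((3, 2), (2, 3, 4)))  # False
```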
- riemann.flip(input, dims)
Reverse the order of elements along specified dimensions.
- riemann.split(ts, split_indices, dim=0)
Split a tensor into multiple sub-tensors.
- riemann.stack(tensors, dim=0)
Stack tensors along a new dimension.
- Parameters:
tensors (Sequence of riemann.TN) – Sequence of tensors to stack
dim (int, optional) – Dimension to insert
- Returns:
Stacked tensor
- Return type:
riemann.TN
- riemann.cat(tensors, dim=0)
Concatenate tensors along an existing dimension.
- Parameters:
tensors (Sequence of riemann.TN) – Sequence of tensors to concatenate
dim (int, optional) – Dimension along which to concatenate
- Returns:
Concatenated tensor
- Return type:
riemann.TN
- riemann.concatenate(tensors, dim=0)
Concatenate tensors along an existing dimension.
- Parameters:
tensors (Sequence of riemann.TN) – Sequence of tensors to concatenate
dim (int, optional) – Dimension along which to concatenate
- Returns:
Concatenated tensor
- Return type:
riemann.TN
- riemann.vstack(tensors)
Stack tensors vertically (row-wise).
- Parameters:
tensors (Sequence of riemann.TN) – Sequence of tensors to stack
- Returns:
Vertically stacked tensor
- Return type:
riemann.TN
- riemann.hstack(tensors)
Stack tensors horizontally (column-wise).
- Parameters:
tensors (Sequence of riemann.TN) – Sequence of tensors to stack
- Returns:
Horizontally stacked tensor
- Return type:
riemann.TN
- riemann.dstack(tensors)
Stack tensors in depth (along dimension 2).
1D tensors are first reshaped to (1, N, 1) and 2D tensors to (M, N, 1); all inputs are then stacked along dimension 2.
- Parameters:
tensors (Sequence of riemann.TN) – Sequence of tensors to stack
- Returns:
Depth-stacked tensor
- Return type:
riemann.TN
- riemann.tensor_split(input, indices_or_sections, dim=0)
Split a tensor into multiple sub-tensors along a specified dimension.
When indices_or_sections is an integer, it specifies the number of sections to split the tensor into. When indices_or_sections is a list, it specifies the indices at which to split.
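The two forms of indices_or_sections can be sketched on a plain list. The handling of sizes that do not divide evenly (earlier pieces receive the extra elements) is an assumption borrowed from PyTorch's tensor_split and may not match Riemann; tensor_split_sketch is a hypothetical helper.

```python
def tensor_split_sketch(values, indices_or_sections):
    n = len(values)
    if isinstance(indices_or_sections, int):
        sections = indices_or_sections
        if sections <= 0:
            raise ValueError("number of sections must be positive")
        base, extra = divmod(n, sections)
        # Assumption: the first `extra` pieces receive one additional element.
        sizes = [base + 1] * extra + [base] * (sections - extra)
        bounds, pos = [], 0
        for size in sizes[:-1]:
            pos += size
            bounds.append(pos)
    else:
        # A list gives the split positions directly.
        bounds = list(indices_or_sections)
    edges = [0] + bounds + [n]
    return [values[a:b] for a, b in zip(edges[:-1], edges[1:])]

data = [0, 1, 2, 3, 4, 5, 6]
print(tensor_split_sketch(data, 3))       # [[0, 1, 2], [3, 4], [5, 6]]
print(tensor_split_sketch(data, [1, 5]))  # [[0], [1, 2, 3, 4], [5, 6]]
```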
- riemann.vsplit(input, indices_or_sections)
Split a tensor vertically (along dimension 0) into multiple sub-tensors.
- riemann.hsplit(input, indices_or_sections)
Split a tensor horizontally (along dimension 1) into multiple sub-tensors.
- riemann.dsplit(input, indices_or_sections)
Split a tensor with three or more dimensions in depth (along dimension 2) into multiple sub-tensors.
Tensor Operators
The Riemann framework supports arithmetic, comparison, bitwise, and in-place operators on tensors. These operators act directly on tensor objects, support automatic differentiation, and follow Python’s operator precedence rules.
Arithmetic Operators
- __add__(other)
Tensor addition operator, equivalent to +.
- __radd__(other)
Reverse tensor addition operator, used when the left operand is not a tensor.
- __sub__(other)
Tensor subtraction operator, equivalent to -.
- __rsub__(other)
Reverse tensor subtraction operator, used when the left operand is not a tensor.
- __mul__(other)
Tensor multiplication operator, equivalent to *.
- __rmul__(other)
Reverse tensor multiplication operator, used when the left operand is not a tensor.
- __matmul__(other)
Tensor matrix multiplication operator, equivalent to @.
- Parameters:
other (riemann.TN) – Another tensor
- Returns:
Matrix multiplication result tensor
- Return type:
riemann.TN
- Raises:
RuntimeError – If either operand is a scalar
- __rmatmul__(other)
Reverse tensor matrix multiplication operator, used when the left operand is not a tensor.
- __truediv__(other)
Tensor division operator, equivalent to /.
- __rtruediv__(other)
Reverse tensor division operator, used when the left operand is not a tensor.
- __pow__(other)
Tensor power operator, equivalent to **.
- __rpow__(other)
Reverse tensor power operator, used when the left operand is not a tensor.
- __pos__()
Tensor unary plus operator, equivalent to +.
- Returns:
Original tensor
- Return type:
riemann.TN
- __neg__()
Tensor unary minus operator, equivalent to -.
- Returns:
Negated result tensor
- Return type:
riemann.TN
Comparison Operators
- __lt__(other)
Tensor less than operator, equivalent to <.
- __le__(other)
Tensor less than or equal operator, equivalent to <=.
- __gt__(other)
Tensor greater than operator, equivalent to >.
- __ge__(other)
Tensor greater than or equal operator, equivalent to >=.
- __eq__(other)
Tensor equality operator, equivalent to ==.
Bitwise Operators
- __and__(other)
Tensor bitwise AND operator, equivalent to &.
- Parameters:
other (riemann.TN or int) – Another tensor or scalar value
- Returns:
Bitwise AND result tensor
- Return type:
riemann.TN
- __or__(other)
Tensor bitwise OR operator, equivalent to |.
- Parameters:
other (riemann.TN or int) – Another tensor or scalar value
- Returns:
Bitwise OR result tensor
- Return type:
riemann.TN
- __xor__(other)
Tensor bitwise XOR operator, equivalent to ^.
- Parameters:
other (riemann.TN or int) – Another tensor or scalar value
- Returns:
Bitwise XOR result tensor
- Return type:
riemann.TN
- __invert__()
Tensor bitwise NOT operator, equivalent to ~.
- Returns:
Bitwise NOT result tensor
- Return type:
riemann.TN
In-place Operators
- __iadd__(other)
Tensor in-place addition operator, equivalent to +=.
- Parameters:
other (riemann.TN or int or float or complex) – Another tensor or scalar value
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- Raises:
RuntimeError – If the tensor is a leaf node that requires gradients
- __isub__(other)
Tensor in-place subtraction operator, equivalent to -=.
- Parameters:
other (riemann.TN or int or float or complex) – Another tensor or scalar value
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- Raises:
RuntimeError – If the tensor is a leaf node that requires gradients
- __imul__(other)
Tensor in-place multiplication operator, equivalent to *=.
- Parameters:
other (riemann.TN or int or float or complex) – Another tensor or scalar value
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- Raises:
RuntimeError – If the tensor is a leaf node that requires gradients
- __itruediv__(other)
Tensor in-place division operator, equivalent to /=.
- Parameters:
other (riemann.TN or int or float or complex) – Another tensor or scalar value
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- Raises:
RuntimeError – If the tensor is a leaf node that requires gradients
- __ipow__(other)
Tensor in-place power operator, equivalent to **=.
- Parameters:
other (riemann.TN or int or float or complex) – Exponent tensor or scalar value
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- Raises:
RuntimeError – If the tensor is a leaf node that requires gradients
Mathematical Operations
- riemann.matmul(input, other)
Matrix multiplication of two tensors.
- Parameters:
input (riemann.TN) – First tensor
other (riemann.TN) – Second tensor
- Returns:
Matrix product of the tensors
- Return type:
riemann.TN
- riemann.dot(x, y)
Calculate the dot product of two tensors.
- Parameters:
x (riemann.TN) – First tensor
y (riemann.TN) – Second tensor
- Returns:
Dot product result
- Return type:
riemann.TN
- riemann.outer(x, y)
Calculate the outer product of two tensors.
- Parameters:
x (riemann.TN) – First tensor
y (riemann.TN) – Second tensor
- Returns:
Outer product result
- Return type:
riemann.TN
- riemann.cross(input, other, dim=-1)
Calculate the cross product of two tensors.
- Parameters:
input (riemann.TN) – First tensor
other (riemann.TN) – Second tensor
dim (int, optional) – Dimension for cross product calculation, defaults to last dimension
- Returns:
Cross product result
- Return type:
riemann.TN
- riemann.einsum(equation, *operands)
Compute tensor operations using Einstein summation convention.
- Parameters:
equation (str) – Einstein summation equation string, e.g., “ij,jk->ik”
operands (riemann.TN) – Input tensor sequence
- Returns:
Result tensor
- Return type:
riemann.TN
- Raises:
ValueError – If equation format is invalid or tensor dimensions don’t match
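The example equation "ij,jk->ik" is ordinary matrix multiplication: the index j appears in both operands but not in the output, so it is summed over. The sketch below spells that out with explicit loops on nested lists; einsum_ij_jk_ik is a hypothetical helper that handles only this one equation and says nothing about how Riemann parses equations in general.

```python
def einsum_ij_jk_ik(a, b):
    """Evaluate "ij,jk->ik" for nested lists by summing over the shared index j."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("dimension j does not match between operands")
    return [[sum(a[i][j] * b[j][k] for j in range(inner)) for k in range(cols)]
            for i in range(rows)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(einsum_ij_jk_ik(a, b))  # [[19, 22], [43, 50]]
```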
- riemann.trace(input)
Return the sum of the elements of the diagonal of the input 2-D matrix.
- Parameters:
input (riemann.TN) – Input 2-D tensor
- Returns:
Scalar tensor with sum of diagonal elements
- Return type:
riemann.TN
- Raises:
ValueError – If input is not a 2-D tensor
- riemann.kron(input, other, *, out=None)
Compute the Kronecker product of input and other.
- Parameters:
input (riemann.TN) – First input tensor
other (riemann.TN) – Second input tensor
out (riemann.TN, optional) – Optional output tensor for storing result
- Returns:
Kronecker product result tensor
- Return type:
riemann.TN
- Raises:
ValueError – If the two tensors have different number of dimensions
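For two matrices, the Kronecker product replaces each entry of the first operand with that entry times the whole second operand. A minimal pure-Python sketch for the 2-D case follows; kron_2d_sketch is a hypothetical helper and does not cover the out argument or higher-dimensional inputs.

```python
def kron_2d_sketch(a, b):
    """Kronecker product of two 2-D nested lists."""
    p, q = len(b), len(b[0])
    rows, cols = len(a) * p, len(a[0]) * q
    # Entry (r, c) is a[r // p][c // q] * b[r % p][c % q].
    return [[a[r // p][c // q] * b[r % p][c % q] for c in range(cols)]
            for r in range(rows)]

print(kron_2d_sketch([[1, 2]], [[1, 0], [0, 1]]))  # [[1, 0, 2, 0], [0, 1, 0, 2]]
```

An input of shape (m, n) combined with one of shape (p, q) yields a result of shape (m * p, n * q).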
- riemann.sum(x, dim=None, keepdim=False)
Calculate the sum of elements across dimensions.
- riemann.prod(x, dim=None, keepdim=False)
Calculate the product of elements across dimensions.
- riemann.mean(x, dim=None, keepdim=False)
Calculate the mean of elements across dimensions.
- riemann.var(x, dim=None, unbiased=True, keepdim=False)
Calculate the variance of elements across dimensions.
- Parameters:
- Returns:
Variance of elements
- Return type:
riemann.TN
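The unbiased flag most likely selects between the two standard variance estimators. Assuming that unbiased=True divides by n - 1 (Bessel's correction) and unbiased=False divides by n, the difference can be checked with Python's statistics module; this is an illustration of the two formulas, not of Riemann's code.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Assumption: unbiased=True corresponds to the sample variance (divide by n - 1),
# and unbiased=False to the population variance (divide by n).
sample_var = statistics.variance(data)
population_var = statistics.pvariance(data)

print(population_var)  # 4.0
print(sample_var)      # 32 / 7, roughly 4.571
```

The standard deviation under each convention is the square root of the corresponding variance.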
- riemann.std(x, dim=None, unbiased=True, keepdim=False)
Calculate the standard deviation of elements across dimensions.
- Parameters:
- Returns:
Standard deviation of elements
- Return type:
riemann.TN
- riemann.norm(x, p='fro', dim=None, keepdim=False)
Calculate the norm of the tensor.
- riemann.max(x, dim=None, keepdim=False, *, out=None)
Calculate the maximum value of elements across dimensions.
- riemann.min(x, dim=None, keepdim=False, *, out=None)
Calculate the minimum value of elements across dimensions.
- riemann.argmax(x, dim=None, keepdim=False, *, out=None)
Calculate the indices of maximum values across dimensions.
- riemann.argmin(x, dim=None, keepdim=False, *, out=None)
Calculate the indices of minimum values across dimensions.
- riemann.abs(x)
Calculate the absolute value of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Absolute value of each element
- Return type:
riemann.TN
- riemann.sqrt(x)
Calculate the square root of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Square root of each element
- Return type:
riemann.TN
- riemann.pow(input, exponent)
Raise each element to a power.
- Parameters:
input (riemann.TN) – Input tensor
exponent (riemann.TN or scalar) – Exponent value
- Returns:
Power of the input tensor
- Return type:
riemann.TN
- riemann.log(x)
Calculate the natural logarithm of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Natural logarithm of each element
- Return type:
riemann.TN
- riemann.log1p(x)
Calculate the natural logarithm of one plus each element, i.e. log(1 + x).
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Natural logarithm of (1 + x) for each element
- Return type:
riemann.TN
- riemann.log2(x)
Calculate the base-2 logarithm of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Base-2 logarithm of each element
- Return type:
riemann.TN
- riemann.log10(x)
Calculate the base-10 logarithm of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Base-10 logarithm of each element
- Return type:
riemann.TN
- riemann.exp(x)
Calculate the exponential of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Exponential of each element
- Return type:
riemann.TN
- riemann.exp2(x)
Calculate 2 raised to the power of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
2 raised to the power of each element
- Return type:
riemann.TN
- riemann.square(x)
Calculate the square of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Square of each element
- Return type:
riemann.TN
- riemann.sin(x)
Calculate the sine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Sine of each element
- Return type:
riemann.TN
- riemann.cos(x)
Calculate the cosine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Cosine of each element
- Return type:
riemann.TN
- riemann.tan(x)
Calculate the tangent of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Tangent of each element
- Return type:
riemann.TN
- riemann.cot(x)
Calculate the cotangent of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Cotangent of each element
- Return type:
riemann.TN
- riemann.sec(x)
Calculate the secant of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Secant of each element
- Return type:
riemann.TN
- riemann.csc(x)
Calculate the cosecant of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Cosecant of each element
- Return type:
riemann.TN
- riemann.asin(x)
Calculate the arcsine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Arcsine of each element
- Return type:
riemann.TN
- riemann.acos(x)
Calculate the arccosine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Arccosine of each element
- Return type:
riemann.TN
- riemann.atan(x)
Calculate the arctangent of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Arctangent of each element
- Return type:
riemann.TN
- riemann.sinh(x)
Calculate the hyperbolic sine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Hyperbolic sine of each element
- Return type:
riemann.TN
- riemann.cosh(x)
Calculate the hyperbolic cosine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Hyperbolic cosine of each element
- Return type:
riemann.TN
- riemann.tanh(x)
Calculate the hyperbolic tangent of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Hyperbolic tangent of each element
- Return type:
riemann.TN
- riemann.coth(x)
Calculate the hyperbolic cotangent of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Hyperbolic cotangent of each element
- Return type:
riemann.TN
- riemann.sech(x)
Calculate the hyperbolic secant of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Hyperbolic secant of each element
- Return type:
riemann.TN
- riemann.csch(x)
Calculate the hyperbolic cosecant of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Hyperbolic cosecant of each element
- Return type:
riemann.TN
- riemann.arcsinh(x)
Calculate the inverse hyperbolic sine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Inverse hyperbolic sine of each element
- Return type:
riemann.TN
- riemann.arccosh(x)
Calculate the inverse hyperbolic cosine of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Inverse hyperbolic cosine of each element
- Return type:
riemann.TN
- riemann.arctanh(x)
Calculate the inverse hyperbolic tangent of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Inverse hyperbolic tangent of each element
- Return type:
riemann.TN
- riemann.ceil(x)
Round up each element to the smallest integer greater than or equal to the element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Ceil of each element
- Return type:
riemann.TN
- riemann.floor(x)
Round down each element to the largest integer less than or equal to the element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Floor of each element
- Return type:
riemann.TN
- riemann.round(x)
Round each element to the nearest integer.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Rounded tensor
- Return type:
riemann.TN
- riemann.trunc(x)
Discard the fractional part of each element, keeping the integer part.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Truncated tensor
- Return type:
riemann.TN
- riemann.sign(x)
Calculate the sign of each element.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Sign of each element
- Return type:
riemann.TN
- riemann.where(cond, x=None, y=None)
Select elements from x or y based on condition.
- Parameters:
cond (riemann.TN) – Condition tensor
x (riemann.TN, optional) – Tensor to select when condition is True
y (riemann.TN, optional) – Tensor to select when condition is False
- Returns:
Tensor composed of elements selected from x or y
- Return type:
riemann.TN
- riemann.clamp(x, min=None, max=None, out=None)
Clamp all elements within a specified range.
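Clamping limits every element to a closed range. The sketch below assumes that a bound left as None is simply not applied, which is what the optional min and max defaults suggest; clamp_sketch is a hypothetical helper, and its arguments are named lo and hi only to avoid shadowing Python's built-in min and max.

```python
def clamp_sketch(values, lo=None, hi=None):
    """Limit each element to [lo, hi]; a bound of None is not applied."""
    out = []
    for v in values:
        if lo is not None and v < lo:
            v = lo
        if hi is not None and v > hi:
            v = hi
        out.append(v)
    return out

print(clamp_sketch([-2, 0.5, 3], lo=0, hi=1))  # [0, 0.5, 1]
print(clamp_sketch([-2, 0.5, 3], lo=0))        # [0, 0.5, 3]
```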
- riemann.masked_fill(input, mask, value)
Fill values into the tensor according to a mask.
- Parameters:
input (riemann.TN) – Input tensor
mask (riemann.TN) – Mask tensor, same shape as input tensor
value (scalar) – Value to fill
- Returns:
Filled tensor
- Return type:
riemann.TN
- riemann.maximum(input, other)
Calculate the element-wise maximum of two tensors.
- Parameters:
input (riemann.TN) – First input tensor
other (riemann.TN) – Second input tensor
- Returns:
Tensor composed of element-wise maximum values
- Return type:
riemann.TN
- riemann.minimum(input, other)
Calculate the element-wise minimum of two tensors.
- Parameters:
input (riemann.TN) – First input tensor
other (riemann.TN) – Second input tensor
- Returns:
Tensor composed of element-wise minimum values
- Return type:
riemann.TN
- riemann.diagonal(input, offset=0, dim1=-2, dim2=-1)
Return the diagonal of the tensor.
- riemann.diag(input, offset=0)
Return the diagonal of a 2D tensor or construct a diagonal matrix.
- Parameters:
input (riemann.TN) – Input tensor
offset (int, optional) – Offset of the diagonal
- Returns:
Diagonal of the tensor or diagonal matrix
- Return type:
riemann.TN
- riemann.fill_diagonal(input, value, offset=0, dim1=-2, dim2=-1)
Fill the diagonal of the tensor with a specified value.
- Parameters:
- Returns:
Tensor with filled diagonal
- Return type:
riemann.TN
- riemann.batch_diag(v)
Return the batch diagonal of the tensor.
- Parameters:
v (riemann.TN) – Input tensor
- Returns:
Batch diagonal of the tensor
- Return type:
riemann.TN
- riemann.nonzero(input, *, as_tuple=False)
Return the indices of non-zero elements.
- riemann.tril(input_tensor, diagonal=0)
Return the lower triangular part of the matrix.
- Parameters:
input_tensor (riemann.TN) – Input tensor
diagonal (int, optional) – Diagonal offset
- Returns:
Lower triangular part of the matrix
- Return type:
riemann.TN
- riemann.triu(input_tensor, diagonal=0)
Return the upper triangular part of the matrix.
- Parameters:
input_tensor (riemann.TN) – Input tensor
diagonal (int, optional) – Diagonal offset
- Returns:
Upper triangular part of the matrix
- Return type:
riemann.TN
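The diagonal offset shared by tril and triu can be illustrated on a 3 x 3 matrix. The sketch assumes the NumPy and PyTorch convention in which diagonal=0 is the main diagonal, positive values lie above it, and negative values lie below it; tril_sketch and triu_sketch are hypothetical helpers operating on nested lists.

```python
def tril_sketch(matrix, diagonal=0):
    """Keep entries on or below the chosen diagonal; zero the rest."""
    return [[x if j - i <= diagonal else 0 for j, x in enumerate(row)]
            for i, row in enumerate(matrix)]

def triu_sketch(matrix, diagonal=0):
    """Keep entries on or above the chosen diagonal; zero the rest."""
    return [[x if j - i >= diagonal else 0 for j, x in enumerate(row)]
            for i, row in enumerate(matrix)]

m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(tril_sketch(m))              # [[1, 0, 0], [4, 5, 0], [7, 8, 9]]
print(triu_sketch(m, diagonal=1))  # [[0, 2, 3], [0, 0, 6], [0, 0, 0]]
```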
- riemann.cumsum(input, dim, *, dtype=None)
Calculate the cumulative sum of a tensor along a specified dimension.
- Parameters:
input (riemann.TN) – Input tensor
dim (int) – Dimension along which to compute cumulative sum
dtype (riemann.dtype, optional) – Data type of the output tensor
- Returns:
Cumulative sum result
- Return type:
riemann.TN
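A cumulative sum replaces each element with the running total up to and including that position along the chosen dimension. The sketch below shows both directions for a 2-D nested list using itertools.accumulate; cumsum_2d_sketch is a hypothetical helper and ignores the dtype argument.

```python
from itertools import accumulate

def cumsum_2d_sketch(rows, dim):
    """Running totals of a 2-D nested list along dim 0 (down) or dim 1 (across)."""
    if dim == 1:
        return [list(accumulate(row)) for row in rows]
    if dim == 0:
        # Accumulate each column, then transpose back to row-major layout.
        columns = [list(accumulate(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("this sketch handles dim 0 or 1 only")

data = [[1, 2, 3], [4, 5, 6]]
print(cumsum_2d_sketch(data, 1))  # [[1, 3, 6], [4, 9, 15]]
print(cumsum_2d_sketch(data, 0))  # [[1, 2, 3], [5, 7, 9]]
```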
- riemann.unique(input, sorted=True, return_inverse=False, return_counts=False, dim=None)
Return the unique elements of a tensor.
- Parameters:
input (riemann.TN) – Input tensor
sorted (bool, optional) – Whether to sort the unique values
return_inverse (bool, optional) – Whether to return inverse indices
return_counts (bool, optional) – Whether to return counts of each unique value
dim (int, optional) – Dimension along which to find unique values, default is None (flattened)
- Returns:
Unique values, or tuple if return_inverse or return_counts is specified
- Return type:
riemann.TN or tuple
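The optional outputs of unique are related in a simple way: the inverse indices map each input element to its position among the unique values, and the counts say how often each unique value occurs. The sketch below shows this for a flat list (the dim=None case), assuming the conventional meaning of these outputs; unique_sketch is a hypothetical helper that always sorts.

```python
from collections import Counter

def unique_sketch(values, return_inverse=False, return_counts=False):
    uniques = sorted(set(values))
    result = [uniques]
    if return_inverse:
        position = {v: i for i, v in enumerate(uniques)}
        # inverse[k] is the index into `uniques` that reconstructs values[k].
        result.append([position[v] for v in values])
    if return_counts:
        counts = Counter(values)
        result.append([counts[v] for v in uniques])
    return result[0] if len(result) == 1 else tuple(result)

print(unique_sketch([3, 1, 3, 2, 1]))  # [1, 2, 3]
print(unique_sketch([3, 1, 3, 2, 1], return_inverse=True, return_counts=True))
# ([1, 2, 3], [2, 0, 2, 1, 0], [2, 1, 2])
```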
- riemann.broadcast_tensors(*tensors)
Broadcast multiple tensors to the same shape.
- Parameters:
tensors (riemann.TN) – Sequence of tensors to broadcast
- Returns:
List of broadcasted tensors
- Return type:
list of riemann.TN
- riemann.repeat(input, repeats, dim=None)
Repeat tensor elements along a specified dimension.
Comparison Operations
- riemann.equal(a, b)
Calculate element-wise equality.
- Parameters:
a (riemann.TN) – First tensor
b (riemann.TN) – Second tensor
- Returns:
Boolean tensor indicating equality
- Return type:
riemann.TN
- riemann.not_equal(a, b)
Calculate element-wise inequality.
- Parameters:
a (riemann.TN) – First tensor
b (riemann.TN) – Second tensor
- Returns:
Boolean tensor indicating inequality
- Return type:
riemann.TN
- riemann.allclose(a, b, rtol=1e-5, atol=1e-8, equal_nan=False)
Return True if two tensors are element-wise equal within a tolerance.
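The defaults rtol=1e-5 and atol=1e-8 match those of NumPy's allclose, which tests |a - b| <= atol + rtol * |b| element by element. Assuming Riemann uses the same formula (this reference does not state it), the comparison can be sketched for flat lists of floats as follows; allclose_sketch is a hypothetical helper.

```python
import math

def allclose_sketch(a, b, rtol=1e-5, atol=1e-8, equal_nan=False):
    """Assumed rule: |x - y| <= atol + rtol * |y| for every pair (x, y)."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            # NaNs only match each other, and only when equal_nan is set.
            if not (equal_nan and math.isnan(x) and math.isnan(y)):
                return False
            continue
        if x == y:  # also covers matching infinities
            continue
        if math.isinf(x) or math.isinf(y):
            return False
        if abs(x - y) > atol + rtol * abs(y):
            return False
    return True

print(allclose_sketch([1.0, 2.0], [1.0, 2.00001]))  # True
print(allclose_sketch([1.0, 2.0], [1.0, 2.001]))    # False
```

Note that the formula is not symmetric in its arguments: the relative tolerance is scaled by the second operand.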
- riemann.isinf(x)
Element-wise test for infinity.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Boolean tensor indicating infinity
- Return type:
riemann.TN
- riemann.isnan(x)
Element-wise test for NaN.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Boolean tensor indicating NaN
- Return type:
riemann.TN
- riemann.isreal(x)
Element-wise test for real numbers.
- Parameters:
x (riemann.TN) – Input tensor
- Returns:
Boolean tensor indicating real numbers
- Return type:
riemann.TN
Sorting Operations
- riemann.sort(input, dim=-1, descending=False, stable=False, *, out=None)
Sort tensor elements along a given dimension.
- Parameters:
- Returns:
Sorted tensor and indices
- Return type:
riemann.TN, riemann.TN
- riemann.argsort(input, dim=-1, descending=False, stable=False, *, out=None)
Return indices that would sort the tensor along a given dimension.
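The link between sort and argsort can be sketched for a flat list: the indices returned by argsort, applied to the input, give the sorted values. The helpers below are hypothetical and use Python's built-in sorted(), which is always stable, so they correspond to stable=True.

```python
def argsort_1d(values, descending=False):
    """Indices that would sort a flat list."""
    # sorted() is stable, so equal elements keep their original relative order.
    return sorted(range(len(values)), key=lambda i: values[i], reverse=descending)

def sort_1d(values, descending=False):
    """Return (sorted values, indices), mirroring the two results of sort."""
    indices = argsort_1d(values, descending)
    return [values[i] for i in indices], indices

sorted_values, indices = sort_1d([30, 10, 20])
# sorted_values == [10, 20, 30], indices == [1, 2, 0]
```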
In-place Operations
- riemann.TN.setat_(index, val)
In-place set values at specified positions in the tensor.
- Parameters:
index (int, slice, tuple, or array) – Index specifying positions to set values
val (riemann.TN, numpy.ndarray, list, or scalar) – Values to set
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.addat_(index, val)
In-place add values to specified positions in the tensor.
- Parameters:
index (int, slice, tuple, or array) – Index specifying positions to operate
val (riemann.TN, numpy.ndarray, list, or scalar) – Values to add
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.subat_(index, val)
In-place subtract values from specified positions in the tensor.
- Parameters:
index (int, slice, tuple, or array) – Index specifying positions to operate
val (riemann.TN, numpy.ndarray, list, or scalar) – Values to subtract
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.mulat_(index, val)
In-place multiply specified positions in the tensor by given values.
- Parameters:
index (int, slice, tuple, or array) – Index specifying positions to operate
val (riemann.TN, numpy.ndarray, list, or scalar) – Values to multiply
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.divat_(index, val)
In-place divide specified positions in the tensor by given values.
- Parameters:
index (int, slice, tuple, or array) – Index specifying positions to operate
val (riemann.TN, numpy.ndarray, list, or scalar) – Divisor
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.powat_(index, val)
In-place exponentiate specified positions in the tensor.
- Parameters:
index (int, slice, tuple, or array) – Index specifying positions to operate
val (riemann.TN, numpy.ndarray, list, or scalar) – Exponent
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.scatter_(dim, index, src=None, *, value=None)
In-place fill values into the tensor according to indices.
- riemann.TN.scatter_add_(dim, index, src)
In-place accumulate values into the tensor according to indices.
- Parameters:
dim (int) – Dimension along which to index
index (riemann.TN) – Index tensor
src (riemann.TN) – Source tensor providing values to accumulate
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.requires_grad_(requires_grad=True)
In-place set whether the tensor requires gradient calculation.
- Parameters:
requires_grad (bool, optional) – Whether to require gradient calculation
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.add_(other)
In-place addition operation.
- Parameters:
other (riemann.TN, numpy.ndarray, list, or scalar) – Value to add to the current tensor
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.sub_(other)
In-place subtraction operation.
- Parameters:
other (riemann.TN, numpy.ndarray, list, or scalar) – Value to subtract from the current tensor
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.mul_(other)
In-place multiplication operation.
- Parameters:
other (riemann.TN, numpy.ndarray, list, or scalar) – Value to multiply with the current tensor
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.div_(other)
In-place division operation.
- Parameters:
other (riemann.TN, numpy.ndarray, list, or scalar) – Divisor
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.pow_(other)
In-place power operation.
- Parameters:
other (riemann.TN, numpy.ndarray, list, or scalar) – Exponent
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.detach_()
In-place detach the tensor from the computation graph.
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.copy_(src)
In-place copy source tensor to current tensor.
- Parameters:
src (riemann.TN, numpy.ndarray, list, or scalar) – Source tensor
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.zero_()
In-place set all elements of the tensor to 0.
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.fill_(value)
In-place fill all elements of the tensor with a specified value.
- Parameters:
value (riemann.TN, numpy.ndarray, list, or scalar) – Fill value
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.TN.clamp_(min=None, max=None)
In-place clamp tensor elements within a specified range.
- riemann.masked_fill_(input, mask, value)
In-place version of masked_fill: fills values into the tensor according to a mask.
- Parameters:
input (riemann.TN) – Input tensor
mask (riemann.TN) – Mask tensor, same shape as input tensor
value (scalar) – Value to fill
- Returns:
In-place modified tensor
- Return type:
riemann.TN
- riemann.fill_diagonal_(input, value, offset=0, dim1=-2, dim2=-1)
In-place version of fill_diagonal.
- Parameters:
- Returns:
Input tensor with filled diagonal
- Return type:
riemann.TN
Gather and Scatter Functions
- riemann.TN.gather(dim, index)
Gather elements according to specified dimension and indices.
- Parameters:
dim (int) – Gathering dimension
index (riemann.TN) – Index tensor
- Returns:
Gathered tensor
- Return type:
riemann.TN
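For a 2-D input, gathering can be sketched with nested lists. The sketch assumes PyTorch-style semantics (for dim=1, out[i][j] = input[i][index[i][j]]; for dim=0, out[i][j] = input[index[i][j]][j]), which this reference does not state explicitly. gather_2d is a hypothetical helper.

```python
def gather_2d(rows, dim, index):
    """Gather from a 2-D nested list along dim, using an index of the output's shape."""
    if dim == 0:
        return [[rows[index[i][j]][j] for j in range(len(index[i]))]
                for i in range(len(index))]
    if dim == 1:
        return [[rows[i][index[i][j]] for j in range(len(index[i]))]
                for i in range(len(index))]
    raise ValueError("dim must be 0 or 1 for a 2-D input")

data = [[10, 11, 12],
        [20, 21, 22]]
picked = gather_2d(data, dim=1, index=[[2, 0], [1, 1]])  # [[12, 10], [21, 21]]
```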
- riemann.TN.scatter(dim, index, src=None, *, value=None)
Fill values into a new tensor according to indices.
- riemann.TN.scatter_(dim, index, src=None, *, value=None)
In-place fill values into the tensor according to indices.
- riemann.TN.scatter_add(dim, index, src)
Accumulate values into a new tensor according to indices.
- Parameters:
dim (int) – Dimension along which to index
index (riemann.TN) – Index tensor
src (riemann.TN) – Source tensor providing values to accumulate
- Returns:
New tensor with accumulated values
- Return type:
riemann.TN
- riemann.TN.scatter_add_(dim, index, src)
In-place accumulate values into the tensor according to indices.
- Parameters:
dim (int) – Dimension along which to index
index (riemann.TN) – Index tensor
src (riemann.TN) – Source tensor providing values to accumulate
- Returns:
In-place modified tensor
- Return type:
riemann.TN
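The accumulation performed by scatter_add can be sketched for the 1-D case: each source value is added at the position named by the matching index entry, so repeated indices accumulate. The helper scatter_add_1d is hypothetical and assumes PyTorch-style semantics; like scatter_add, it returns a new list, whereas scatter_add_ would modify the tensor itself.

```python
def scatter_add_1d(target, index, src):
    """Return a copy of target with src[i] added at position index[i]."""
    out = list(target)
    for i, s in zip(index, src):
        out[i] += s
    return out

base = [0, 0, 0, 0]
result = scatter_add_1d(base, index=[1, 3, 1], src=[5, 7, 2])  # [0, 7, 0, 7]
```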
Data Conversion
- riemann.from_numpy(arr, requires_grad=False)
Convert a NumPy array to a Riemann tensor.
- Parameters:
arr (numpy.ndarray) – Input NumPy array
requires_grad (bool, optional) – Whether to track operations on this tensor
- Returns:
Riemann tensor
- Return type:
riemann.TN
- riemann.item(tensor)
Convert a single-element tensor to a Python scalar.
- riemann.TN.tolist()
Convert the tensor to a Python list.
- riemann.TN.numpy()
Convert the tensor to a NumPy array.
- Returns:
NumPy array
- Return type:
numpy.ndarray
- riemann.TN.to(*args, **kwargs)
Convert the tensor to a specified data type and/or device.
- riemann.TN.type(dtype=None)
Return or convert the data type of the tensor.
- Parameters:
dtype (dtype, optional) – Data type, if None return current data type
- Returns:
Current data type if dtype is None, otherwise the tensor converted to dtype
- Return type:
dtype or riemann.TN
- riemann.TN.type_as(other)
Convert the tensor to the same data type as another tensor.
- Parameters:
other (riemann.TN) – Reference tensor for target data type
- Returns:
Tensor converted to the data type of other
- Return type:
riemann.TN
- riemann.TN.bool()
Convert the tensor to boolean type.
- Returns:
Boolean type tensor
- Return type:
riemann.TN
- riemann.TN.float()
Convert the tensor to single-precision floating point type (float32).
- Returns:
float32 type tensor
- Return type:
riemann.TN
- riemann.TN.double()
Convert the tensor to double-precision floating point type (float64).
- Returns:
float64 type tensor
- Return type:
riemann.TN
Copy Functions
- riemann.clone(tensor)
Return a copy of the tensor.
- Parameters:
tensor (riemann.TN) – Input tensor
- Returns:
Tensor copy
- Return type:
riemann.TN
- riemann.TN.copy()
Return a copy of the tensor, not sharing memory and not dependent on the original tensor.
- Returns:
Tensor copy
- Return type:
riemann.TN
- riemann.detach(tensor)
Detach the tensor from the computation graph, stopping gradient tracking.
- Parameters:
tensor (riemann.TN) – Input tensor
- Returns:
Detached tensor
- Return type:
riemann.TN
Data Types
Predefined Data Types
- riemann.is_numeric_array(numpy_arr)
Check if a NumPy array has a numeric data type.
- Parameters:
numpy_arr (numpy.ndarray) – The NumPy array to check
- Returns:
Whether the array has a numeric data type
- Return type:
bool
- riemann.is_number(v)
Check if a value is a numeric type.
- Parameters:
v (Any) – The value to check
- Returns:
Whether the value is a numeric type
- Return type:
bool
- riemann.is_float_or_complex(dtype)
Check if a data type is a floating point or complex number type.
- Parameters:
dtype (numpy.dtype) – The data type to check
- Returns:
Whether the data type is a floating point or complex number type
- Return type:
bool
Data Type Inference
- riemann.infer_data_type(v)
Infer an appropriate data type from Python values, NumPy arrays, or collections of values.
- Parameters:
v (Any) – The value or collection of values from which to infer the data type
- Returns:
The inferred data type
- Return type:
Gradient Mode Control
- riemann.is_grad_enabled()
Get the gradient computation state for the current thread.
- Returns:
The current gradient computation mode (True for enabled, False for disabled)
- Return type:
bool
- riemann.no_grad(func=None)
Context manager to temporarily disable gradient computation.
Can also be used as a function decorator, disabling gradient tracking for all computations within the decorated function.
- Parameters:
func (callable, optional) – If provided, no_grad is applied to this function as a decorator
- Returns:
If func is not provided, returns a context manager instance; if func is provided, returns the decorated function
Example:
# Used as a context manager
with riemann.no_grad():
    # Computations within this block will not track gradients
    output = model(input_data)

# Used as a decorator
@riemann.no_grad
def inference(x):
    # Computations within this function will not track gradients
    return model(x)
- riemann.enable_grad(func=None)
Context manager to temporarily enable gradient computation.
Can also be used as a function decorator, ensuring that computations within the decorated function track gradients.
- Parameters:
func (callable, optional) – If provided, enable_grad is applied to this function as a decorator
- Returns:
If func is not provided, returns a context manager instance; if func is provided, returns the decorated function
Example:
# Used as a context manager
with riemann.enable_grad():
    # Computations within this block will track gradients
    output = model(input_data)
    loss = loss_fn(output, target)
    loss.backward()

# Used as a decorator
@riemann.enable_grad
def train_step(x, y):
    # Computations within this function will track gradients
    pred = model(x)
    loss = loss_fn(pred, y)
    loss.backward()
    return loss
- riemann.set_grad_enabled(mode=True, func=None)
Context manager to set the gradient computation mode.
Similar to PyTorch’s set_grad_enabled(), it explicitly enables or disables gradient computation. It can be used as either a context manager or a decorator.
- Parameters:
mode (bool) – If True, enables gradient computation; if False, disables gradient computation
func (callable, optional) – The function to wrap when used as a decorator
- Returns:
If func is None, returns a context manager instance; if the func parameter is provided, returns the wrapped function
Example:
# Used as a context manager
with riemann.set_grad_enabled(False):
    # Computations within this block will not track gradients
    output = model(input_data)

with riemann.set_grad_enabled(True):
    # Computations within this block will track gradients
    output = model(input_data)
    loss = loss_fn(output, target)
    loss.backward()

# Used as a decorator
@riemann.set_grad_enabled(False)
def inference(x):
    return model(x)

@riemann.set_grad_enabled(True)
def train(x, y):
    pred = model(x)
    loss = loss_fn(pred, y)
    loss.backward()
    return loss
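How one object can serve as both a context manager and a decorator, with per-thread state, can be sketched with the standard library alone. This is not Riemann's implementation, which this reference does not show. The names _state and set_grad_enabled_sketch are hypothetical, and the thread-local flag is an assumption based on is_grad_enabled() reporting the state "for the current thread".

```python
import functools
import threading

_state = threading.local()  # hypothetical per-thread storage for the flag

def is_grad_enabled():
    # Gradients are assumed to be enabled by default in every thread.
    return getattr(_state, "enabled", True)

class set_grad_enabled_sketch:
    """Hypothetical stand-in showing the context-manager/decorator dual use."""

    def __init__(self, mode=True):
        self.mode = mode

    def __enter__(self):
        self.previous = is_grad_enabled()
        _state.enabled = self.mode
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Restore the previous mode even if the block raised.
        _state.enabled = self.previous
        return False

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # A fresh instance per call keeps nested and recursive calls independent.
            with set_grad_enabled_sketch(self.mode):
                return func(*args, **kwargs)
        return wrapper

with set_grad_enabled_sketch(False):
    inside = is_grad_enabled()   # False inside the block
after = is_grad_enabled()        # True again after the block

@set_grad_enabled_sketch(False)
def inference():
    return is_grad_enabled()

decorated = inference()          # False while the decorated function runs
```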
Serialization
- riemann.save(obj, f, pickle_module=None, pickle_protocol=2, use_new_zipfile_serialization=True)
Save an object to a disk file.
This function uses pickle serialization to save Riemann tensors, parameters, modules, or any Python objects to a disk file.
- Parameters:
obj (Any) – The object to save. Can be a tensor, parameter, module, or any picklable object
f (str, os.PathLike, or file-like object) – File path or file-like object to write to
pickle_module (Any, optional) – Module to use for pickling (default: pickle)
pickle_protocol (int, optional) – Pickle protocol version (default: 2)
use_new_zipfile_serialization (bool, optional) – Whether to use zip-based serialization (default: True)
- Example:
>>> import riemann as rm
>>> # Save a tensor
>>> tensor = rm.randn(3, 4)
>>> rm.save(tensor, 'tensor.pt')
>>>
>>> # Save a module
>>> model = rm.nn.Linear(10, 5)
>>> rm.save(model.state_dict(), 'model_weights.pt')
>>>
>>> # Save multiple objects
>>> rm.save({
...     'model': model.state_dict(),
...     'optimizer_state': optimizer.state_dict(),
...     'epoch': 10
... }, 'checkpoint.pt')
- riemann.load(f, map_location=None, pickle_module=None, **pickle_load_args)
Load an object from a disk file.
This function uses pickle deserialization to load Riemann tensors, parameters, modules, or any Python objects from a disk file.
- Parameters:
f (str, os.PathLike, or file-like object) – File path or file-like object to read from
map_location (Any, optional) – Function or dictionary for remapping storage locations
pickle_module (Any, optional) – Module to use for unpickling (default: pickle)
**pickle_load_args – Additional arguments passed to pickle.load
- Returns:
The loaded object
- Example:
>>> import riemann as rm
>>> # Load a tensor
>>> tensor = rm.load('tensor.pt')
>>>
>>> # Load model weights
>>> state_dict = rm.load('model_weights.pt')
>>> model.load_state_dict(state_dict)
>>>
>>> # Load a checkpoint
>>> checkpoint = rm.load('checkpoint.pt')
>>> model.load_state_dict(checkpoint['model'])
>>> optimizer.load_state_dict(checkpoint['optimizer_state'])
>>> epoch = checkpoint['epoch']
CUDA Support
- class riemann.cuda.Device(device='cpu')
Represents a device (CPU or CUDA GPU).
- Parameters:
device (str or int) – Device type or index. Can be:
- String: ‘cpu’, ‘cuda’ or ‘cuda:0’, ‘cuda:1’
- Integer: CUDA device index
- Example:
>>> import riemann as rm
>>> # Create a CPU device
>>> cpu_device = rm.Device('cpu')
>>> # Create a CUDA device
>>> cuda_device = rm.Device('cuda')
>>> # Create a specific CUDA device
>>> cuda_device_1 = rm.Device('cuda:1')
>>> # Create a CUDA device by type and index
>>> cuda_device_2 = rm.Device('cuda', 2)
- __enter__()
Enter the device context.
- __exit__(exc_type, exc_val, exc_tb)
Exit the device context.
- __eq__(other)
Compare with another device.
- __str__()
Return the string representation of the device.
- __repr__()
Return the official string representation of the device.
- riemann.cuda.is_available()
Check if CUDA is available.
- Returns:
True if CUDA is available, False otherwise
- Return type:
bool
- riemann.cuda.device_count()
Return the number of available CUDA devices.
- Returns:
Number of available CUDA devices
- Return type:
int
- riemann.cuda.current_device()
Return the index of the current CUDA device.
- Returns:
Index of the current CUDA device
- Return type:
int
- riemann.cuda.get_device_name(device_idx)
Return the name of the CUDA device at the given index.
- riemann.cuda.set_device(device_idx)
Set the current CUDA device.
- Parameters:
device_idx (int) – CUDA device index to set as current
- riemann.cuda.empty_cache()
Clear the CUDA cache.
- riemann.cuda.synchronize(device=None)
Wait for all operations on the current CUDA device to complete.
- riemann.cuda.is_in_cuda_context()
Check if the current thread is in a CUDA device context.
- Returns:
True if in a CUDA device context, False otherwise
- Return type:
bool
- riemann.memory_allocated(device_idx=None)
Return the amount of memory allocated on the given CUDA device.
- riemann.get_default_device()
Get the default device for tensor creation.
- Returns:
The default device
- Return type:
- riemann.set_default_device(device)
Set the default device for tensor creation.
- Parameters:
device (str, int, or Device) – The device to set as default. Can be:
- String: ‘cpu’, ‘cuda’ or ‘cuda:0’, ‘cuda:1’
- Integer: CUDA device index
- Device object
- Example:
>>> import riemann as rm
>>> rm.get_default_device()
device(type='cpu', index=None)
>>> rm.set_default_device('cuda')
>>> rm.get_default_device()
device(type='cuda', index=0)
>>> rm.set_default_device('cuda:1')
>>> rm.get_default_device()
device(type='cuda', index=1)
Automatic Differentiation
Gradient Computation
- riemann.autograd.backward(self, gradient=None, retain_graph=False, create_graph=False)
Perform reverse-mode automatic differentiation (backpropagation).
Starting from the current tensor, propagate gradients backward through the computation graph, computing and storing gradients for all leaf nodes or intermediate nodes with retains_grad=True.
- Parameters:
self (riemann.TN) – The tensor that triggers backpropagation
gradient (riemann.TN or None, optional) – Gradient of the output tensor, defaults to None
retain_graph (bool, optional) – Accepted for PyTorch compatibility; Riemann backpropagation does not rely on it
create_graph (bool, optional) – Whether to create a computation graph during gradient computation, set to True for higher-order derivative calculations
- riemann.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=False, create_graph=False, allow_unused=False)
Compute and return the gradients of outputs with respect to inputs.
This is the core gradient computation function in the Riemann framework. Similar to the backward() method, but it directly returns the computed gradient tensors instead of storing them in the .grad attribute of input tensors. This makes it more suitable for advanced gradient computation scenarios, such as calculating Jacobian matrices, Hessian matrices, etc.
- Parameters:
outputs (riemann.TN) – Output tensor(s) for which to compute gradients
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for which to compute gradients
grad_outputs (riemann.TN or None, optional) – Gradient(s) of the output tensor(s), defaults to None
retain_graph (bool, optional) – Accepted for PyTorch compatibility; Riemann backpropagation does not rely on it
create_graph (bool, optional) – Whether to create a computation graph during gradient computation
allow_unused (bool, optional) – Whether to allow unused inputs
- Returns:
Tuple of gradient tensors corresponding to the inputs
- Return type:
tuple of riemann.TN
- riemann.autograd.higher_order_grad(outputs, inputs, n, create_graph=False)
Compute the n-th order derivative of a scalar tensor output with respect to each tensor in inputs.
This function computes higher-order derivatives by recursively calling grad(). For each input tensor, it computes the n-th order derivative and returns a tuple of derivatives corresponding to the input list.
- Parameters:
outputs (riemann.TN) – Scalar tensor output for which to compute higher-order derivatives
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for which to compute higher-order derivatives
n (int) – Order of the derivative, must be a non-negative integer
create_graph (bool, optional) – Whether to create a computation graph during gradient computation
- Returns:
Tuple of n-th order derivative tensors corresponding to the inputs
- Return type:
tuple of riemann.TN
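The idea of reaching an n-th order derivative by applying a first-order derivative n times can be sketched without autograd, using polynomials stored as coefficient lists. All names below are hypothetical; this only illustrates the repeated-differentiation pattern and is not how Riemann computes gradients.

```python
def poly_derivative(coeffs):
    """Derivative of a polynomial, where coeffs[k] is the coefficient of x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def nth_derivative(coeffs, n):
    """Apply the first-order derivative n times (n = 0 returns the polynomial itself)."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    for _ in range(n):
        coeffs = poly_derivative(coeffs)
    return coeffs

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

# f(x) = x**3, so f''(x) = 6x and f''(2) = 12
second = evaluate(nth_derivative([0, 0, 0, 1], 2), 2.0)
```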
- riemann.autograd.gradcheck(func, inputs, eps=1e-6, atol=1e-5, rtol=1e-3, raise_exception=True, check_sparse_nnz=False, fast_mode=False)
Verify the correctness of gradient computation for a given function by comparing numerical and analytical gradients.
This function computes numerical gradients using finite differences by adding small perturbations to input parameters, and compares them with analytical gradients computed using automatic differentiation.
- Parameters:
func (callable) – Function for which to verify gradients
inputs (tuple of riemann.TN) – Tuple of input tensors for testing
eps (float, optional) – Small perturbation for numerical gradient computation
atol (float, optional) – Absolute error tolerance
rtol (float, optional) – Relative error tolerance
raise_exception (bool, optional) – Whether to raise an exception if gradient check fails
check_sparse_nnz (bool, optional) – Whether to check sparse tensor non-zero elements (not supported in current version)
fast_mode (bool, optional) – Whether to use fast mode (only check the first element)
- Returns:
True if gradient check passes, False otherwise
- Return type:
bool
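The comparison that a gradient check performs can be sketched for a scalar function of one variable. The sketch assumes a central-difference estimate and an allclose-style tolerance test; whether Riemann uses central or forward differences is not stated here. gradcheck_sketch is a hypothetical helper that takes the analytical derivative as a plain Python function instead of using autograd.

```python
def gradcheck_sketch(func, analytic_grad, x, eps=1e-6, atol=1e-5, rtol=1e-3):
    """Compare a central-difference estimate of func'(x) with analytic_grad(x)."""
    numeric = (func(x + eps) - func(x - eps)) / (2 * eps)
    analytic = analytic_grad(x)
    return abs(numeric - analytic) <= atol + rtol * abs(analytic)

ok = gradcheck_sketch(lambda x: x ** 3, lambda x: 3 * x ** 2, 2.0)   # correct derivative passes
bad = gradcheck_sketch(lambda x: x ** 3, lambda x: 2 * x ** 2, 2.0)  # wrong derivative fails
```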
- riemann.track_grad(grad_func)
Create a gradient tracking decorator for adding automatic differentiation support to functions.
This decorator factory receives a gradient function and returns a decorator that can convert ordinary tensor operation functions into functions that support automatic differentiation. It automatically creates backpropagation functions and manages gradient computation graph construction.
- Parameters:
grad_func (callable) – Gradient computation function that receives the same input parameters as the forward function and returns a tuple containing gradients (partial derivatives) for each input tensor. Elements in the tuple must correspond one-to-one with the input tensors of the forward function. For tensors that don’t require gradients, the corresponding gradient value should be None.
- Returns:
A decorator function for wrapping forward computation functions to support automatic differentiation
- Return type:
callable
Example:
import numpy as np
from riemann import tensor, track_grad

# Define single-input derivative function (d/dx log(x) = 1/x)
def _log_derivative(x):
    return (1. / x.conj(),)

# Use track_grad decorator to create automatic differentiation-supported log function
@track_grad(_log_derivative)
def mylog(x):
    return tensor(np.log(x.data))

# Use automatic differentiation-supported log function
x = tensor(2., requires_grad=True)
y = mylog(x)
y.backward()
print(f'x.grad = {x.grad}')  # Output: x.grad = 0.5

# Define multi-input derivative function (d/dx (x + y) = 1, d/dy (x + y) = 1)
def _add_derivative(x, y):
    return (tensor(1.), tensor(1.))

# Use track_grad decorator to create automatic differentiation-supported addition function
@track_grad(_add_derivative)
def myadd(x, y):
    return tensor(x.data + y.data)

# Use automatic differentiation-supported addition function
x = tensor(2., requires_grad=True)
y = tensor(3., requires_grad=True)
z = myadd(x, y)
z.backward()
print(f'x.grad = {x.grad}')  # Output: x.grad = 1.0
print(f'y.grad = {y.grad}')  # Output: y.grad = 1.0
- class riemann.autograd.Function
Base class for custom gradient implementations in the Riemann framework, designed with an interface similar to PyTorch’s torch.autograd.Function.
To use this class, inherit from it and implement the forward and backward static methods:
- forward: Perform forward computation, return output tensor(s)
- backward: Receive output gradient(s), return input gradient(s)
Example:
from riemann.autograd import Function

class MyFunction(Function):
    @staticmethod
    def forward(ctx, input1, input2):
        # Save the inputs needed to compute gradients later
        ctx.save_for_backward(input1, input2)
        output = input1 * input2
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input1, input2 = ctx.saved_tensors
        # Product rule: d(input1 * input2) = input2 * d(input1) + input1 * d(input2)
        grad_input1 = grad_output * input2
        grad_input2 = grad_output * input1
        return grad_input1, grad_input2
Functional Differentiation
- riemann.autograd.functional.jacobian(func, inputs, create_graph=False, strict=True)
Compute the Jacobian matrix of a function.
This function computes the Jacobian matrix of a given function at the input point. It supports single or multiple inputs and single or multiple outputs, and maintains compatibility with PyTorch’s jacobian function behavior.
- Parameters:
func (callable) – Function for which to compute the Jacobian matrix
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for the function
create_graph (bool, optional) – Whether to create a computation graph during gradient computation
strict (bool, optional) – Whether to strictly follow PyTorch’s behavior specifications
- Returns:
Jacobian matrix representation corresponding to the input/output types
- Return type:
riemann.TN or list/tuple of riemann.TN
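The layout of a Jacobian (one row per output, one column per input) can be sketched numerically with plain lists. The helper numerical_jacobian is hypothetical and uses central differences; riemann.autograd.functional.jacobian presumably relies on automatic differentiation instead, so this serves only as a reference for what the entries mean.

```python
def numerical_jacobian(func, x, eps=1e-6):
    """J[i][j] = d func(x)[i] / d x[j], estimated by central differences."""
    m = len(func(x))
    jac = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        plus = list(x)
        plus[j] += eps
        minus = list(x)
        minus[j] -= eps
        f_plus, f_minus = func(plus), func(minus)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac

def f(x):
    return [x[0] * x[1], x[0] + x[1]]

J = numerical_jacobian(f, [2.0, 3.0])  # approximately [[3, 2], [1, 1]]
```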
- riemann.autograd.functional.hessian(func, inputs, create_graph=False, strict=True)
Compute the Hessian matrix of a function.
This function computes the Hessian matrix of a given function at the input point, which is the Jacobian matrix of the gradient. It supports single or multiple inputs and maintains compatibility with PyTorch’s hessian function behavior.
- Parameters:
func (callable) – Scalar-valued function for which to compute the Hessian matrix
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for the function
create_graph (bool, optional) – Whether to create a computation graph during gradient computation
strict (bool, optional) – If True, raises an error when output is detected to be independent of input
- Returns:
Hessian matrix representation corresponding to the input types
- Return type:
riemann.TN or list/tuple of riemann.TN
- riemann.autograd.functional.jvp(func, inputs, v=None, create_graph=False, strict=False)
Compute the Jacobian-Vector Product (JVP).
- Parameters:
func (callable) – Function for which to compute JVP
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for the function
v (riemann.TN or list/tuple of riemann.TN, optional) – Vector to multiply with the Jacobian matrix
create_graph (bool, optional) – Whether to create a computation graph during gradient computation, for higher-order derivative calculations
strict (bool, optional) – Whether to raise an error for unused inputs
- Returns:
Function output and JVP value
- Return type:
tuple of (riemann.TN, riemann.TN or list/tuple of riemann.TN)
- riemann.autograd.functional.vjp(func, inputs, v=None, create_graph=False, strict=False)
Compute the Vector-Jacobian Product (VJP).
- Parameters:
func (callable) – Function for which to compute VJP
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for the function
v (riemann.TN or list/tuple of riemann.TN, optional) – Vector to multiply with the Jacobian matrix
create_graph (bool, optional) – Whether to create a computation graph during gradient computation, for higher-order derivative calculations
strict (bool, optional) – Whether to raise an error for unused inputs
- Returns:
Function output and VJP value
- Return type:
tuple of (riemann.TN, riemann.TN or list/tuple of riemann.TN)
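The difference between the two products is the side on which the vector multiplies the Jacobian: a JVP computes J·v with v shaped like the input, while a VJP computes vᵀ·J with v shaped like the output. The sketch below uses an explicit Jacobian stored as nested lists; both helpers are hypothetical, and an autodiff library would normally avoid building J at all.

```python
def jvp_from_matrix(J, v):
    """J @ v: v has one entry per input, the result has one entry per output."""
    return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]

def vjp_from_matrix(J, v):
    """v^T @ J: v has one entry per output, the result has one entry per input."""
    return [sum(v[i] * J[i][j] for i in range(len(J))) for j in range(len(J[0]))]

# Jacobian of f(x) = [x0 * x1, x0 + x1, x0] at x = (2, 3): 3 outputs, 2 inputs
J = [[3.0, 2.0],
     [1.0, 1.0],
     [1.0, 0.0]]
jvp = jvp_from_matrix(J, [1.0, 0.0])       # [3.0, 1.0, 1.0]
vjp = vjp_from_matrix(J, [1.0, 0.0, 0.0])  # [3.0, 2.0]
```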
- riemann.autograd.functional.hvp(func, inputs, v, create_graph=False, strict=False)
Compute the Hessian-Vector Product (HVP).
- Parameters:
func (callable) – Scalar-valued function for which to compute HVP
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for the function
v (riemann.TN or list/tuple of riemann.TN) – Vector to multiply with the Hessian matrix
create_graph (bool, optional) – Whether to create a computation graph during gradient computation, for higher-order derivative calculations
strict (bool, optional) – Whether to raise an error for unused inputs
- Returns:
Function output and HVP value
- Return type:
tuple of (riemann.TN, riemann.TN or list/tuple of riemann.TN)
- riemann.autograd.functional.vhp(func, inputs, v, create_graph=False, strict=False)
Compute the Vector-Hessian Product (VHP).
- Parameters:
func (callable) – Scalar-valued function for which to compute VHP
inputs (riemann.TN or list/tuple of riemann.TN) – Input tensor(s) or list/tuple of input tensors for the function
v (riemann.TN or list/tuple of riemann.TN) – Vector to multiply with the Hessian matrix
create_graph (bool, optional) – Whether to create a computation graph during gradient computation, for higher-order derivative calculations
strict (bool, optional) – Whether to raise an error for unused inputs
- Returns:
Function output and VHP value
- Return type:
tuple of (riemann.TN, riemann.TN or list/tuple of riemann.TN)
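A Hessian-vector product can be approximated without forming the Hessian by differencing the gradient along v: H·v ≈ (∇f(x + εv) − ∇f(x − εv)) / (2ε). The sketch below is a numerical stand-in with a hand-written gradient, not Riemann's autodiff-based computation; hvp_finite_difference and grad_f are hypothetical names. For a twice continuously differentiable f the Hessian is symmetric, so HVP and VHP give the same vector.

```python
def hvp_finite_difference(grad, x, v, eps=1e-5):
    """Approximate H(x) @ v by a central difference of the gradient along v."""
    plus = [xi + eps * vi for xi, vi in zip(x, v)]
    minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(gp - gm) / (2 * eps) for gp, gm in zip(grad(plus), grad(minus))]

# f(x) = x0**2 * x1, so grad f = [2*x0*x1, x0**2]
# and the Hessian is [[2*x1, 2*x0], [2*x0, 0]]
def grad_f(x):
    return [2 * x[0] * x[1], x[0] ** 2]

hv = hvp_finite_difference(grad_f, [1.0, 2.0], [1.0, 1.0])  # approximately [6, 2]
```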
- riemann.autograd.functional.derivative(func, create_graph=False)
Compute the derivative function of a function.
This function returns a new function that, when called, computes the derivative of the original function func at the input point. func may take one or more tensor inputs and return one or more tensors or scalars. Internally, the derivative is computed with the jacobian function.
- Parameters:
func (callable) – Function to differentiate
create_graph (bool, optional) – Whether to create a computation graph during gradient computation, defaults to False
- Returns:
Derivative function that accepts the same inputs as the original function
- Return type:
callable
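The "function in, derivative function out" shape of this API can be sketched with a closure. The sketch uses a central-difference estimate on Python floats rather than the jacobian-based computation described above, and derivative_sketch is a hypothetical name.

```python
def derivative_sketch(func, eps=1e-6):
    """Return a new function that estimates func'(x) at any point x."""
    def dfunc(x):
        return (func(x + eps) - func(x - eps)) / (2 * eps)
    return dfunc

d_square = derivative_sketch(lambda x: x * x)
slope = d_square(3.0)  # approximately 6.0
```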
Context Managers
- riemann.no_grad()
Context manager to disable gradient computation. Operations within this context will not be recorded in the computation graph.
- riemann.enable_grad()
Context manager to enable gradient computation.
Linear Algebra Module
The riemann.linalg module provides various linear algebra operations, including matrix multiplication, decomposition, and solving.
Matrix Operations
- riemann.linalg.matmul(a, b)
Compute the matrix product of two tensors.
- Parameters:
a (riemann.TN) – First input tensor
b (riemann.TN) – Second input tensor
- Returns:
Matrix product result
- Return type:
riemann.TN
- riemann.linalg.cross(a, b, dim=-1)
Compute the cross product (vector product) of two tensors.
- Parameters:
a (riemann.TN) – First input tensor
b (riemann.TN) – Second input tensor
dim (int, optional) – Dimension along which to compute cross product, default is -1
- Returns:
Cross product result
- Return type:
riemann.TN
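For reference, the cross product of two 3-component vectors follows the standard component formula sketched below. The helper cross3 is hypothetical and works on plain lists; in riemann.linalg.cross the dim argument selects which dimension holds the three components.

```python
def cross3(a, b):
    """Cross product of two 3-component vectors given as sequences."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

z = cross3([1, 0, 0], [0, 1, 0])  # [0, 0, 1]: x cross y gives z
```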
Norm Computation
- riemann.linalg.norm(A, ord=None, dim=None, keepdim=False)
Compute the norm of a tensor or matrix.
- Parameters:
- Returns:
Norm value
- Return type:
riemann.TN
- riemann.linalg.vector_norm(x, ord=2, dim=None, keepdim=False)
Compute the vector norm.
- riemann.linalg.matrix_norm(A, ord='fro', dim=(-2, -1), keepdim=False)
Compute the matrix norm.
- Parameters:
- Returns:
Matrix norm value
- Return type:
riemann.TN
- riemann.linalg.cond(A, p=None)
Compute the condition number of a matrix.
- riemann.linalg.svdvals(A)
Compute the singular values of a matrix.
- Parameters:
A (riemann.TN) – Input matrix
- Returns:
Singular values
- Return type:
riemann.TN
Matrix Decomposition
- riemann.linalg.det(A)
Compute the determinant of a matrix.
- Parameters:
A (riemann.TN) – Input matrix
- Returns:
Determinant value
- Return type:
riemann.TN
- riemann.linalg.inv(A)
Compute the inverse of a square matrix.
- Parameters:
A (riemann.TN) – Input square matrix
- Returns:
Inverse matrix
- Return type:
riemann.TN
- riemann.linalg.skew(A)
Compute the skew-symmetric part of a matrix.
- Parameters:
A (riemann.TN) – Input matrix
- Returns:
Skew-symmetric matrix
- Return type:
riemann.TN
- riemann.linalg.svd(A, full_matrices=True)
Compute the singular value decomposition (SVD) of a matrix.
Eigenvalue Decomposition
- riemann.linalg.eig(A)
Compute the eigenvalues and eigenvectors of a square matrix.
- Parameters:
A (riemann.TN) – Input square matrix
- Returns:
Tuple of (eigenvalues, eigenvectors)
- Return type:
- riemann.linalg.eigh(A, UPLO='L')
Compute the eigenvalues and eigenvectors of a Hermitian (or real symmetric) matrix.
Linear Equation Solving
- riemann.linalg.lstsq(A, b, rcond=None)
Compute the least-squares solution.
- Parameters:
A (riemann.TN) – Coefficient matrix
b (riemann.TN) – Right-hand side vector or matrix
rcond (float, optional) – Singular value threshold
- Returns:
Least-squares solution
- Return type:
riemann.TN
- riemann.linalg.lu(A, pivot=True)
Compute the LU decomposition of a matrix.
- riemann.linalg.solve(A, b)
Solve the linear equation system Ax = b.
- Parameters:
A (riemann.TN) – Coefficient matrix
b (riemann.TN) – Right-hand side vector or matrix
- Returns:
Solution vector or matrix
- Return type:
riemann.TN
- riemann.linalg.qr(A, mode='reduced')
Compute the QR decomposition of a matrix.
- riemann.linalg.cholesky(A, upper=False)
Compute the Cholesky decomposition of a positive-definite matrix.
- Parameters:
A (riemann.TN) – Input positive-definite matrix
upper (bool, optional) – Whether to return upper triangular matrix, default is False (lower)
- Returns:
Cholesky factor
- Return type:
riemann.TN
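For readers unfamiliar with the factorization, here is a minimal pure-Python sketch of the textbook Cholesky algorithm for a real symmetric positive-definite matrix. It returns the lower factor L with A = L L^T. It is not Riemann's implementation, and the name cholesky_lower is hypothetical. Under the usual convention, upper=True would correspond to the transpose of this factor.

```python
import math

def cholesky_lower(A):
    """Return lower-triangular L such that A = L L^T (A given as list of rows)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already computed parts of rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive-definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

A = [[4.0, 2.0], [2.0, 3.0]]
L = cholesky_lower(A)
# Rebuild A from the factor to check the decomposition.
rebuilt = [[sum(L[i][k] * L[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
```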
Neural Network Modules
Base Classes
Container Modules
- class riemann.nn.Sequential(*modules)
Container that applies a sequence of modules in order.
- Parameters:
modules (riemann.Module) – Modules to apply, passed as positional arguments in the order they should run
- class riemann.nn.ModuleList(modules=None)
Container class for storing a list of modules.
This container allows storing multiple modules in list form and provides convenient access and iteration methods. All submodules are properly registered to appear in the parameter list.
- Parameters:
modules (list of riemann.Module, optional) – List of modules for initialization
- class riemann.nn.ModuleDict(modules=None)
Container class for storing a dictionary of modules.
This container allows storing modules using string keys and provides dictionary-like access methods. All submodules are properly registered.
- Parameters:
modules (dict of {str: riemann.Module}, optional) – Dictionary of modules for initialization
- class riemann.nn.ParameterList(parameters=None)
Container class for storing a list of parameters.
This container allows storing multiple parameters in list form. All parameters are properly registered to appear in the parameter list.
- Parameters:
parameters (list of riemann.Parameter, optional) – List of parameters for initialization
- class riemann.nn.ParameterDict(parameters=None)
Container class for storing a dictionary of parameters.
This container allows storing parameters using string keys and provides dictionary-like access methods. All parameters are properly registered.
- Parameters:
parameters (dict of {str: riemann.Parameter}, optional) – Dictionary of parameters for initialization
Linear Layers
Convolutional Layers
- class riemann.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
1D convolutional layer.
Applies convolution operations to 1D inputs, extracting features and generating new feature maps.
- Parameters:
in_channels (int) – Number of input channels
out_channels (int) – Number of output channels
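kernel_size (int or tuple) – Size of the convolution kernel
stride (int or tuple, optional) – Stride of the convolution
padding (int or tuple, optional) – Zero-padding added to all sides of the input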
dilation (int or tuple, optional) – Spacing between kernel elements
groups (int, optional) – Grouping between input and output channels
bias (bool, optional) – Whether to use bias term
padding_mode (str, optional) – Padding mode
- class riemann.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
2D convolutional layer.
Applies convolution operations to 2D inputs, extracting image features and generating new feature maps.
- Parameters:
in_channels (int) – Number of input channels
out_channels (int) – Number of output channels
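kernel_size (int or tuple) – Size of the convolution kernel
stride (int or tuple, optional) – Stride of the convolution
padding (int or tuple, optional) – Zero-padding added to all sides of the input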
dilation (int or tuple, optional) – Spacing between kernel elements
groups (int, optional) – Grouping between input and output channels
bias (bool, optional) – Whether to use bias term
padding_mode (str, optional) – Padding mode
- class riemann.nn.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
3D convolutional layer.
Applies convolution operations to 3D inputs, commonly used for feature extraction in video, volumetric data, etc.
Pooling Layers
- class riemann.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
1D max pooling layer.
Applies max pooling to 1D inputs, used for extracting key features from sequence data and reducing data dimensions.
- Parameters:
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window
padding (int or tuple, optional) – Zero-padding added to all sides of the input
dilation (int or tuple, optional) – Spacing between pooling window elements
return_indices (bool, optional) – Whether to return indices of maximum values
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
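The interaction of kernel_size, stride, padding, dilation, and ceil_mode can be sketched with the usual pooling output-length formula. The function below is a hypothetical helper, not part of Riemann. It assumes that stride=None means "use kernel_size" and it ignores any extra rule an implementation may apply to windows that start inside the right padding.

```python
def pool_output_length(length, kernel_size, stride=None, padding=0,
                       dilation=1, ceil_mode=False):
    """Output length of a 1D pooling window sliding over `length` elements."""
    if stride is None:
        stride = kernel_size  # assumed default: non-overlapping windows
    span = length + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        steps = -(-span // stride)  # ceiling division on integers
    else:
        steps = span // stride      # floor division
    return steps + 1

print(pool_output_length(10, kernel_size=3, stride=2))                  # 4
print(pool_output_length(10, kernel_size=3, stride=2, ceil_mode=True))  # 5
```

With ceil_mode=True the last, partially covered window is kept, which is why the second call yields one more output element.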
- class riemann.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
2D max pooling layer.
Applies max pooling to 2D inputs, commonly used for feature extraction and dimension reduction in image data.
- Parameters:
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window
padding (int or tuple, optional) – Zero-padding added to all sides of the input
dilation (int or tuple, optional) – Spacing between pooling window elements
return_indices (bool, optional) – Whether to return indices of maximum values
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
- class riemann.nn.MaxPool3d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
3D max pooling layer.
Applies max pooling to 3D inputs, commonly used for feature extraction in video, volumetric data, etc.
- Parameters:
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window
padding (int or tuple, optional) – Zero-padding added to all sides of the input
dilation (int or tuple, optional) – Spacing between pooling window elements
return_indices (bool, optional) – Whether to return indices of maximum values
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
- class riemann.nn.AvgPool1d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
1D average pooling layer.
Applies average pooling to 1D inputs, used for smoothing sequence data features and reducing data dimensions.
- Parameters:
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window
padding (int or tuple, optional) – Zero-padding added to all sides of the input
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
count_include_pad (bool, optional) – Whether to include zero-padding when calculating average
divisor_override (int, optional) – If specified, will be used as denominator
- class riemann.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
2D average pooling layer.
Applies average pooling to 2D inputs, commonly used for feature smoothing and dimension reduction in image data.
- Parameters:
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window
padding (int or tuple, optional) – Zero-padding added to all sides of the input
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
count_include_pad (bool, optional) – Whether to include zero-padding when calculating average
divisor_override (int, optional) – If specified, will be used as denominator
- class riemann.nn.AvgPool3d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
3D average pooling layer.
Applies average pooling to 3D inputs, commonly used for feature smoothing and dimension reduction in video, volumetric data, etc.
- Parameters:
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window
padding (int or tuple, optional) – Zero-padding added to all sides of the input
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
count_include_pad (bool, optional) – Whether to include zero-padding when calculating average
divisor_override (int, optional) – If specified, will be used as denominator
Adaptive Pooling Layers
- class riemann.nn.AdaptiveAvgPool1d(output_size)
1D adaptive average pooling layer.
Automatically computes pooling kernel size and stride based on the specified output size, ensuring the output dimensions are always fixed.
- class riemann.nn.AdaptiveAvgPool2d(output_size)
2D adaptive average pooling layer.
Automatically computes pooling kernel size and stride based on the specified output size, commonly used to convert feature maps of arbitrary sizes to fixed dimensions.
- class riemann.nn.AdaptiveAvgPool3d(output_size)
3D adaptive average pooling layer.
Automatically computes pooling kernel size and stride based on the specified output size, commonly used for feature extraction in volumetric data.
- class riemann.nn.AdaptiveMaxPool1d(output_size, return_indices=False)
1D adaptive max pooling layer.
Automatically computes pooling kernel size and stride based on the specified output size, ensuring the output dimensions are always fixed.
- class riemann.nn.AdaptiveMaxPool2d(output_size, return_indices=False)
2D adaptive max pooling layer.
Automatically computes pooling kernel size and stride based on the specified output size, commonly used to convert feature maps of arbitrary sizes to fixed dimensions.
- class riemann.nn.AdaptiveMaxPool3d(output_size, return_indices=False)
3D adaptive max pooling layer.
Automatically computes pooling kernel size and stride based on the specified output size, commonly used for feature extraction in volumetric data.
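One common way to derive the pooling windows from output_size is sketched below for the 1D case. The boundary rule (start = floor(i * L / out), end = ceil((i + 1) * L / out)) is the convention used by PyTorch. That Riemann uses the same boundaries is an assumption, and both helper names are hypothetical.

```python
def adaptive_windows(input_size, output_size):
    """Return (start, end) index pairs, one window per output element."""
    windows = []
    for i in range(output_size):
        start = (i * input_size) // output_size
        end = -(-((i + 1) * input_size) // output_size)  # ceiling division
        windows.append((start, end))
    return windows

def adaptive_avg_pool1d(values, output_size):
    """Average each window so the result always has `output_size` elements."""
    return [sum(values[s:e]) / (e - s)
            for s, e in adaptive_windows(len(values), output_size)]

print(adaptive_windows(10, 4))  # [(0, 3), (2, 5), (5, 8), (7, 10)]
print(adaptive_avg_pool1d([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3))  # [1.5, 3.5, 5.5]
```

When the input size is not a multiple of the output size, neighbouring windows may overlap, as in the first call.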
Normalization Layers
- class riemann.nn.BatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
1D batch normalization layer.
Normalizes each channel of 2D (N, C) or 3D (N, C, L) input tensors to have zero mean and unit variance, which typically improves training convergence and model generalization.
- Parameters:
num_features (int) – Number of features (channel dimension)
eps (float, optional) – Small value to avoid division by zero
momentum (float, optional) – Momentum for running statistics
affine (bool, optional) – Whether to include learnable affine parameters
track_running_stats (bool, optional) – Whether to track running mean and variance
- class riemann.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1)
2D batch normalization layer.
- class riemann.nn.BatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
3D batch normalization layer.
Normalizes the channel dimension of 5D input tensors (N, C, D, H, W) to have zero mean and unit variance, improving training convergence and model generalization.
- Parameters:
num_features (int) – Number of features (channel dimension)
eps (float, optional) – Small value to avoid division by zero
momentum (float, optional) – Momentum for running statistics
affine (bool, optional) – Whether to include learnable affine parameters
track_running_stats (bool, optional) – Whether to track running mean and variance
- class riemann.nn.LayerNorm(normalized_shape, eps=1e-05, affine=True, device=None, dtype=None)
Layer normalization layer, normalizes specified dimensions.
Compatible with torch.nn.LayerNorm. Normalizes the specified dimensions of the input tensor to have zero mean and unit variance.
- Parameters:
normalized_shape (int or tuple) – Integer or tuple specifying dimensions to normalize
eps (float, optional) – Small value added to variance to avoid division by zero
affine (bool, optional) – Whether to include learnable affine parameters (gamma and beta)
device (optional) – Device for parameters and buffers
dtype (optional) – Data type for parameters and buffers
- class riemann.nn.Flatten(start_dim=1, end_dim=-1)
Layer that flattens tensor dimensions, merging all dimensions from start_dim to end_dim (inclusive) into a single dimension.
Typically used after convolutional layers and before fully connected layers to flatten multi-dimensional convolution results into 1D vectors.
Activation Function Modules
- class riemann.nn.ReLU(inplace=False)
ReLU activation function
Applies the rectified linear unit function element-wise: ReLU(x) = max(0, x)
- Parameters:
inplace (bool, optional) – Whether to perform operation in-place
- class riemann.nn.LeakyReLU(negative_slope=0.01, inplace=False)
Leaky ReLU activation function
Applies the leaky rectified linear unit function element-wise: LeakyReLU(x) = max(0, x) + negative_slope * min(0, x)
- class riemann.nn.RReLU(lower=0.125, upper=0.3333333333333333, inplace=False)
Randomized Leaky ReLU activation function
Applies the randomized leaky rectified linear unit function element-wise
- class riemann.nn.PReLU(num_parameters=1, init=0.25)
Parametric ReLU activation function
Applies the parametric rectified linear unit function element-wise: PReLU(x) = max(0, x) + a * min(0, x), where a is a learnable parameter
- class riemann.nn.Sigmoid
Sigmoid activation function
Applies the sigmoid function element-wise, mapping values to the (0, 1) range
- class riemann.nn.Tanh
Tanh activation function
Applies the hyperbolic tangent function element-wise, mapping values to the (-1, 1) range
- class riemann.nn.Softmax(dim=None)
Softmax activation function
Applies the softmax function along the specified dimension
- Parameters:
dim (int, optional) – Dimension to apply softmax
- class riemann.nn.LogSoftmax(dim=None)
Log-Softmax activation function
Applies the log-softmax function along the specified dimension
- Parameters:
dim (int, optional) – Dimension to apply log-softmax
- class riemann.nn.GELU
Gaussian Error Linear Unit activation function
Applies the Gaussian Error Linear Unit function element-wise: GELU(x) = x * Φ(x), where Φ is the cumulative distribution function of the standard normal distribution
- class riemann.nn.Softplus(beta=1, threshold=20)
Softplus activation function
Applies the Softplus activation function element-wise: Softplus(x) = (1 / beta) * log(1 + exp(beta * x))
Dropout Layers
- class riemann.nn.Dropout(p=0.5)
Dropout layer for preventing overfitting.
- Parameters:
p (float, optional) – Dropout probability
- class riemann.nn.Dropout2d(p=0.5, inplace=False)
2D dropout layer for preventing overfitting.
During training, randomly zeroes entire channels of the input tensor with probability p, and scales remaining channels by 1/(1-p). During evaluation, no operation is performed.
- class riemann.nn.Dropout3d(p=0.5, inplace=False)
3D dropout layer for preventing overfitting.
During training, randomly zeroes entire channels of the input tensor with probability p, and scales remaining channels by 1/(1-p). During evaluation, no operation is performed.
Embedding Layer
- class riemann.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, dtype=None, device=None)
Embedding layer that converts integer indices to dense vectors.
The embedding layer is a fundamental component in neural networks for handling categorical features and sequence data.
- Parameters:
num_embeddings (int) – Number of embedding vectors, i.e., vocabulary size
embedding_dim (int) – Dimension of each embedding vector
padding_idx (int, optional) – If specified, embedding vectors at this index do not participate in gradient computation and remain unchanged during training
max_norm (float, optional) – If specified, all embedding vectors with norm exceeding max_norm will be renormalized to max_norm
norm_type (float, optional) – p-value for norm calculation, defaults to 2 (L2 norm)
scale_grad_by_freq (bool, optional) – If True, gradients will be scaled by frequency of each word in mini-batch
sparse (bool, optional) – If True, gradient of weight will be a sparse tensor
dtype (np.dtype, optional) – Data type for embedding weights
device (str|int|Device, optional) – Device for embedding weights
Loss Function Modules
- class riemann.nn.L1Loss(size_average=None, reduce=None, reduction='mean')
Mean absolute error loss, computes absolute error between input and target values.
- class riemann.nn.MSELoss(size_average=None, reduce=None, reduction='mean')
Mean squared error loss, computes squared error between input and target values.
- class riemann.nn.NLLLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
Negative log likelihood loss, used for probability prediction in classification tasks.
- class riemann.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
Cross entropy loss, combines LogSoftmax and NLLLoss in one class, commonly used for multi-class classification tasks.
- class riemann.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')
Binary cross entropy loss, computes binary classification error between target and output.
- class riemann.nn.BCEWithLogitsLoss(weight=None, size_average=None, reduce=None, reduction='mean')
Binary cross entropy loss with logits, computes binary cross entropy directly on input logits.
- class riemann.nn.HuberLoss(delta=1.0, size_average=None, reduce=None, reduction='mean')
Huber loss function, uses squared error when the absolute error is at most delta, otherwise uses a linear error term scaled by delta.
- class riemann.nn.SmoothL1Loss(beta=1.0, size_average=None, reduce=None, reduction='mean')
Smooth L1 loss, combines advantages of L1 and L2 losses, uses quadratic loss for small errors and linear loss for large errors.
Transformer Modules
- class riemann.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None)
Multi-head attention mechanism, allows the model to attend to information from different representation subspaces.
- Parameters:
embed_dim (int) – Dimension of input and output vectors, must be divisible by num_heads
num_heads (int) – Number of attention heads
dropout (float, optional) – Dropout probability for attention weights
bias (bool, optional) – Whether to add bias to projection layers
add_bias_kv (bool, optional) – Whether to add learnable bias to key and value sequences
add_zero_attn (bool, optional) – Whether to add a column of zeros to attention weights
kdim (int, optional) – Dimension of key vectors, defaults to embed_dim
vdim (int, optional) – Dimension of value vectors, defaults to embed_dim
batch_first (bool, optional) – Whether input/output shape is (batch, seq, feature) instead of (seq, batch, feature)
device (optional) – Device for tensors
dtype (optional) – Data type for tensors
- class riemann.nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=2048, dropout=0.1, activation='relu', layer_norm_eps=1e-05, batch_first=False, norm_first=False, bias=True, device=None, dtype=None)
Single layer of Transformer encoder, consisting of self-attention mechanism and feed-forward network.
- Parameters:
d_model (int) – Dimension of input and output features
nhead (int) – Number of attention heads
dim_feedforward (int, optional) – Dimension of feed-forward hidden layer
dropout (float, optional) – Dropout probability
activation (str, optional) – Activation function type, 'relu' or 'gelu'
layer_norm_eps (float, optional) – Epsilon value for layer normalization
batch_first (bool, optional) – Whether input/output shape is (batch, seq, feature)
norm_first (bool, optional) – Whether to use Pre-LN mode
bias (bool, optional) – Whether to add bias to linear layers
device (optional) – Device for tensors
dtype (optional) – Data type for tensors
- class riemann.nn.TransformerDecoderLayer(d_model, nhead, dim_feedforward=2048, dropout=0.1, activation='relu', layer_norm_eps=1e-05, batch_first=False, norm_first=False, bias=True, device=None, dtype=None)
Single layer of Transformer decoder, consisting of self-attention, cross-attention, and feed-forward network.
- Parameters:
d_model (int) – Dimension of input and output features
nhead (int) – Number of attention heads
dim_feedforward (int, optional) – Dimension of feed-forward hidden layer
dropout (float, optional) – Dropout probability
activation (str, optional) – Activation function type, 'relu' or 'gelu'
layer_norm_eps (float, optional) – Epsilon value for layer normalization
batch_first (bool, optional) – Whether input/output shape is (batch, seq, feature)
norm_first (bool, optional) – Whether to use Pre-LN mode
bias (bool, optional) – Whether to add bias to linear layers
device (optional) – Device for tensors
dtype (optional) – Data type for tensors
- class riemann.nn.TransformerEncoder(encoder_layer, num_layers, norm=None, enable_nested_tensor=True, mask_check=True)
Transformer encoder consisting of N stacked TransformerEncoderLayer layers.
- Parameters:
encoder_layer (TransformerEncoderLayer) – Single encoder layer instance to be cloned
num_layers (int) – Number of encoder layers
norm (Module, optional) – Final layer normalization
enable_nested_tensor (bool, optional) – Whether to enable nested tensor optimization (interface compatibility only)
mask_check (bool, optional) – Whether to perform mask checking (interface compatibility only)
- class riemann.nn.TransformerDecoder(decoder_layer, num_layers, norm=None)
Transformer decoder consisting of N stacked TransformerDecoderLayer layers.
- Parameters:
decoder_layer (TransformerDecoderLayer) – Single decoder layer instance to be cloned
num_layers (int) – Number of decoder layers
norm (Module, optional) – Final layer normalization
- class riemann.nn.Transformer(d_model=512, nhead=8, num_encoder_layers=6, num_decoder_layers=6, dim_feedforward=2048, dropout=0.1, activation='relu', custom_encoder=None, custom_decoder=None, layer_norm_eps=1e-05, batch_first=False, norm_first=False, bias=True, device=None, dtype=None)
Complete Transformer architecture containing both encoder and decoder.
- Parameters:
d_model (int, optional) – Dimension of encoder/decoder inputs
nhead (int, optional) – Number of attention heads
num_encoder_layers (int, optional) – Number of encoder layers
num_decoder_layers (int, optional) – Number of decoder layers
dim_feedforward (int, optional) – Dimension of feed-forward network
dropout (float, optional) – Dropout value
activation (str, optional) – Activation function, 'relu' or 'gelu'
custom_encoder (Module, optional) – Custom encoder module
custom_decoder (Module, optional) – Custom decoder module
layer_norm_eps (float, optional) – Epsilon value for layer normalization
batch_first (bool, optional) – Whether input/output shape is (batch, seq, feature)
norm_first (bool, optional) – Whether to perform LayerNorm before attention and feed-forward
bias (bool, optional) – Whether linear and LayerNorm layers learn additive bias
device (optional) – Device for tensors
dtype (optional) – Data type for tensors
Functional Interface
The riemann.nn.functional module provides functional implementations of various neural network operations.
Linear Functions
- riemann.nn.functional.linear(input, weight, bias=None)
Applies a linear transformation: y = xA^T + b, where A is weight and b is bias
- Parameters:
input (riemann.TN) – Input tensor with shape (*, in_features)
weight (riemann.TN) – Weight tensor with shape (out_features, in_features)
bias (riemann.TN, optional) – Bias tensor with shape (out_features). Default: None
- Returns:
Output tensor with shape (*, out_features)
- Return type:
riemann.TN
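As a minimal sketch of the formula y = xA^T + b, the following pure-Python function applies the transformation to a single input vector. It does not call Riemann. The list-based representation and the helper name linear_1d are assumptions made for illustration.

```python
def linear_1d(x, weight, bias=None):
    """Compute y = x A^T + b for one input vector.

    x has in_features entries, weight has out_features rows of in_features
    entries each, and bias (if given) has out_features entries.
    """
    if any(len(row) != len(x) for row in weight):
        raise ValueError("each weight row must have in_features entries")
    # Row i of weight produces output feature i (this is the A^T in the formula).
    out = [sum(xi * wi for xi, wi in zip(x, row)) for row in weight]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out

y = linear_1d([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], bias=[0.5, 0.5, 0.5])
print(y)  # [1.5, 2.5, 3.5]
```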
Activation Functions
- riemann.nn.functional.sigmoid(input)
Applies element-wise sigmoid function: sigmoid(x) = 1 / (1 + exp(-x))
- Parameters:
input (riemann.TN) – Input tensor
- Returns:
Output tensor
- Return type:
riemann.TN
- riemann.nn.functional.silu(input)
Applies Sigmoid Linear Unit (SiLU) activation function: silu(x) = x * sigmoid(x)
- Parameters:
input (riemann.TN) – Input tensor
- Returns:
Output tensor
- Return type:
riemann.TN
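The two formulas above can be sketched on plain Python floats as follows. The branch on the sign of x is a common way to avoid overflow in exp for large negative inputs. Whether Riemann evaluates sigmoid this way is not stated in this reference, so treat it as an illustration only.

```python
import math

def sigmoid(x):
    """sigmoid(x) = 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return z / (1.0 + z)

def silu(x):
    """silu(x) = x * sigmoid(x)."""
    return x * sigmoid(x)

print(sigmoid(0.0))  # 0.5
print(silu(0.0))     # 0.0
```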
- riemann.nn.functional.tanh(input)
Applies hyperbolic tangent activation function: tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
- Parameters:
input (riemann.TN) – Input tensor
Dropout Functions
- riemann.nn.functional.dropout(input, p=0.5, training=True, inplace=False)
During training, randomly zeroes elements of the input tensor with probability p, and scales remaining elements by 1/(1-p). During evaluation, no operation is performed.
- riemann.nn.functional.dropout2d(input, p=0.5, training=True, inplace=False)
During training, randomly zeroes entire channels of the input tensor with probability p, and scales remaining channels by 1/(1-p). During evaluation, no operation is performed.
- riemann.nn.functional.dropout3d(input, p=0.5, training=True, inplace=False)
During training, randomly zeroes entire channels of the input tensor with probability p, and scales remaining channels by 1/(1-p). During evaluation, no operation is performed.
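The 1/(1-p) scaling described above ("inverted dropout") keeps the expected value of each element unchanged. The sketch below shows the element-wise case on a plain list. The function name dropout_list and the rng argument are hypothetical and only serve to make the example reproducible. The channel-wise variants would draw one keep/drop decision per channel instead of per element.

```python
import random

def dropout_list(values, p=0.5, training=True, rng=None):
    """Zero each element with probability p and rescale the survivors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        return list(values)           # evaluation mode: identity
    if p == 1.0:
        return [0.0 for _ in values]  # everything is dropped
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

x = [1.0] * 1000
out = dropout_list(x, p=0.25, rng=random.Random(0))
# Each output is either 0.0 or 1.0 / 0.75; the mean stays close to 1.0.
```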
Normalization Functions
- riemann.nn.functional.batch_norm(input, running_mean=None, running_var=None, weight=None, bias=None, training=False, momentum=0.1, eps=1e-5)
Applies batch normalization to the input tensor.
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C), (N, C, L), (N, C, H, W) or (N, C, D, H, W)
running_mean (riemann.TN, optional) – Running mean with shape (C,)
running_var (riemann.TN, optional) – Running variance with shape (C,)
weight (riemann.TN, optional) – Learnable scaling parameter γ with shape (C,)
bias (riemann.TN, optional) – Learnable offset parameter β with shape (C,)
training (bool, optional) – Whether in training mode
momentum (float, optional) – Momentum for running statistics
eps (float, optional) – Small constant for numerical stability
- Returns:
Normalized tensor with same shape as input
- Return type:
riemann.TN
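To make the roles of training, momentum, and eps concrete, here is a pure-Python sketch for an (N, C) input stored as a list of rows. It follows the convention used by PyTorch (biased variance for normalization, unbiased variance for the running estimate, running = (1 - momentum) * running + momentum * batch). That Riemann follows the same convention is an assumption, and the name batch_norm_nc is hypothetical.

```python
import math

def batch_norm_nc(x, running_mean, running_var, weight=None, bias=None,
                  training=False, momentum=0.1, eps=1e-5):
    """Batch normalization for an (N, C) input given as a list of rows.

    running_mean and running_var are lists of length C and are updated
    in place when training is True.
    """
    n, c = len(x), len(x[0])
    if training:
        mean = [sum(row[j] for row in x) / n for j in range(c)]
        # Biased variance (divide by N) is used to normalize the batch.
        var = [sum((row[j] - mean[j]) ** 2 for row in x) / n for j in range(c)]
        for j in range(c):
            # Assumed convention: the running variance tracks the unbiased estimate.
            unbiased = var[j] * n / (n - 1) if n > 1 else var[j]
            running_mean[j] = (1 - momentum) * running_mean[j] + momentum * mean[j]
            running_var[j] = (1 - momentum) * running_var[j] + momentum * unbiased
    else:
        mean, var = running_mean, running_var
    out = []
    for row in x:
        normed = [(row[j] - mean[j]) / math.sqrt(var[j] + eps) for j in range(c)]
        if weight is not None:
            normed = [v * g for v, g in zip(normed, weight)]
        if bias is not None:
            normed = [v + b for v, b in zip(normed, bias)]
        out.append(normed)
    return out

running_mean, running_var = [0.0, 0.0], [1.0, 1.0]
y = batch_norm_nc([[1.0, 10.0], [3.0, 30.0]], running_mean, running_var, training=True)
# Each column of y is approximately [-1, 1]; the running statistics moved
# one momentum step toward the batch statistics.
```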
- riemann.nn.functional.layer_norm(input, normalized_shape, weight=None, bias=None, eps=1e-05)
Applies layer normalization to specified dimensions of the input tensor.
- Parameters:
input (riemann.TN) – Input tensor
normalized_shape (int or tuple) – Integer or tuple specifying dimensions to normalize
weight (riemann.TN, optional) – Optional weight tensor (γ) for affine transformation
bias (riemann.TN, optional) – Optional bias tensor (β) for affine transformation
eps (float, optional) – Small value added to variance to avoid division by zero
- Returns:
Normalized tensor with same shape as input
- Return type:
riemann.TN
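A minimal sketch of the computation for the simplest case, where normalized_shape covers only the last dimension and the input is a single row, might look like this. It uses the biased variance, as layer normalization is usually defined. The helper name layer_norm_row is hypothetical and the code does not call Riemann.

```python
import math

def layer_norm_row(row, weight=None, bias=None, eps=1e-5):
    """Normalize one row to zero mean and unit variance, then apply γ and β."""
    n = len(row)
    mean = sum(row) / n
    var = sum((v - mean) ** 2 for v in row) / n  # biased variance
    out = [(v - mean) / math.sqrt(var + eps) for v in row]
    if weight is not None:
        out = [o * g for o, g in zip(out, weight)]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out

y = layer_norm_row([1.0, 2.0, 3.0])
# y is approximately [-1.2247, 0.0, 1.2247]
```

Unlike batch normalization, the statistics are computed per sample, so the result does not depend on the other elements of the batch.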
Embedding Functions
- riemann.nn.functional.embedding(input, weight, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False)
Looks up embedding vectors for input indices from an embedding matrix.
- Parameters:
input (riemann.TN) – Tensor containing indices with arbitrary shape
weight (riemann.TN) – Embedding matrix with shape (num_embeddings, embedding_dim)
padding_idx (int, optional) – If specified, embedding vectors at this index do not participate in gradient computation and remain unchanged during training
max_norm (float, optional) – If specified, all embedding vectors with norm exceeding max_norm will be renormalized to max_norm
norm_type (float, optional) – p-value for norm calculation, defaults to 2 (L2 norm)
scale_grad_by_freq (bool, optional) – If True, gradients will be scaled by frequency of each word in mini-batch
sparse (bool, optional) – If True, gradient of weight will be a sparse tensor
- Returns:
Output tensor with shape (*, embedding_dim), where * is the shape of input
- Return type:
riemann.TN
- riemann.nn.functional.softmax(input, dim)
Applies softmax function along the specified dimension
- Parameters:
input (riemann.TN) – Input tensor
dim (int) – Dimension to compute softmax
- Returns:
Output tensor
- Return type:
riemann.TN
- riemann.nn.functional.log_softmax(input, dim=-1)
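Applies the log-softmax function, log(softmax(x)), along the specified dimension; computing it directly is more numerically stable than taking the log of softmax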
- Parameters:
input (riemann.TN) – Input tensor
dim (int, optional) – Dimension to compute log_softmax
- Returns:
Output tensor
- Return type:
riemann.TN
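The stability remark can be illustrated with the usual "subtract the maximum" trick, sketched here for a single 1D list. This is a generic technique and not necessarily how Riemann implements softmax or log_softmax. The function names are hypothetical.

```python
import math

def log_softmax_1d(xs):
    """log(softmax(x)) computed via a shifted log-sum-exp."""
    m = max(xs)
    # Shifting by the maximum keeps every exp argument <= 0, so exp cannot overflow.
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

def softmax_1d(xs):
    """softmax(x) recovered by exponentiating the log-softmax."""
    return [math.exp(v) for v in log_softmax_1d(xs)]

probs = softmax_1d([1.0, 2.0, 3.0])
big = log_softmax_1d([1000.0, 1000.0])  # a naive exp(1000.0) would overflow
```

For the second call both entries are approximately -log(2), whereas evaluating exp(1000.0) directly raises OverflowError in Python.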
- riemann.nn.functional.relu(input)
Applies rectified linear unit activation function: relu(x) = max(0, x)
- Parameters:
input (riemann.TN) – Input tensor
- Returns:
Output tensor
- Return type:
riemann.TN
- riemann.nn.functional.leaky_relu(input, alpha=0.01)
Applies leaky rectified linear unit activation function
- Parameters:
input (riemann.TN) – Input tensor
alpha (float, optional) – Slope of the negative region
- Returns:
Output tensor
- Return type:
riemann.TN
- riemann.nn.functional.prelu(input, alpha)
Applies parametric rectified linear unit activation function
- Parameters:
input (riemann.TN) – Input tensor
alpha (riemann.TN) – Learnable parameter tensor
- Returns:
Output tensor
- Return type:
riemann.TN
- riemann.nn.functional.rrelu(input, lower=1.0 / 8.0, upper=1.0 / 3.0, training=True)
Applies randomized rectified linear unit activation function
- Parameters:
- Returns:
Output tensor
- Return type:
riemann.TN
- riemann.nn.functional.gelu(input)
Applies Gaussian Error Linear Unit activation function
- Parameters:
input (riemann.TN) – Input tensor
- Returns:
Output tensor
- Return type:
riemann.TN
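Using the definition GELU(x) = x * Φ(x) given for riemann.nn.GELU, the exact form can be sketched with the error function from the standard library. Whether Riemann evaluates the exact form or a tanh-based approximation is not stated here, so this is only an illustration of the definition.

```python
import math

def gelu(x):
    """GELU(x) = x * Φ(x), with Φ the standard normal CDF."""
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi

print(gelu(0.0))  # 0.0
```

For large positive x the function approaches x, and for large negative x it approaches 0.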
- riemann.nn.functional.softplus(input, beta=1.0, threshold=20.0)
Applies Softplus activation function: softplus(x) = (1 / beta) * log(1 + exp(beta * x))
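The role of threshold is not described above. A common interpretation, assumed in the sketch below, is that the function falls back to the identity when beta * x exceeds threshold, which avoids overflow in exp. The code operates on a single float and does not call Riemann.

```python
import math

def softplus(x, beta=1.0, threshold=20.0):
    """softplus(x) = (1 / beta) * log(1 + exp(beta * x))."""
    if beta * x > threshold:
        return x  # assumed behaviour: linear above the threshold
    return math.log1p(math.exp(beta * x)) / beta

print(softplus(50.0))    # 50.0
print(softplus(1000.0))  # 1000.0
```

At x = 0 the result is log(2) / beta, and for large negative inputs it approaches 0.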
Loss Functions
- riemann.nn.functional.l1_loss(input, target, size_average=None, reduce=None, reduction='mean')
Compute L1 (absolute error) loss
- Parameters:
input (riemann.TN) – Input tensor
target (riemann.TN) – Target tensor
size_average (bool, optional) – Deprecated
reduce (bool, optional) – Deprecated
reduction (str, optional) – Specifies the reduction to apply to the output
- Returns:
Loss value
- Return type:
riemann.TN
- riemann.nn.functional.smooth_l1_loss(input, target, size_average=None, reduce=None, reduction='mean', beta=1.0)
Compute smooth L1 loss
- Parameters:
input (riemann.TN) – Input tensor
target (riemann.TN) – Target tensor
size_average (bool, optional) – Deprecated
reduce (bool, optional) – Deprecated
reduction (str, optional) – Specifies the reduction to apply to the output
beta (float, optional) – Threshold at which the loss function changes from quadratic to linear
- Returns:
Loss value
- Return type:
riemann.TN
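The role of beta can be sketched with the standard smooth L1 definition: 0.5 * d^2 / beta when the absolute error d is below beta, and d - 0.5 * beta otherwise. The code below works on flat lists, assumes reduction accepts 'none', 'mean', and 'sum', and uses the hypothetical name smooth_l1. That Riemann matches this definition exactly is an assumption.

```python
def smooth_l1(inputs, targets, beta=1.0, reduction="mean"):
    """Smooth L1 loss on two equally long lists of floats."""
    losses = []
    for x, t in zip(inputs, targets):
        d = abs(x - t)
        # Quadratic near zero, linear beyond beta; the two pieces meet at d == beta.
        losses.append(0.5 * d * d / beta if d < beta else d - 0.5 * beta)
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError("reduction must be 'none', 'mean' or 'sum'")

print(smooth_l1([0.5, 3.0], [0.0, 0.0], reduction="none"))  # [0.125, 2.5]
print(smooth_l1([0.5, 3.0], [0.0, 0.0]))                    # 1.3125
```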
- riemann.nn.functional.cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', label_smoothing=0.0)
Compute cross entropy loss
- Parameters:
input (riemann.TN) – Input tensor
target (riemann.TN) – Target tensor
weight (riemann.TN, optional) – Manual scaling weight for each class
size_average (bool, optional) – Deprecated
ignore_index (int, optional) – Specifies target value to ignore
reduce (bool, optional) – Deprecated
reduction (str, optional) – Specifies the reduction to apply to the output
label_smoothing (float, optional) – Amount of label smoothing
- Returns:
Loss value
- Return type:
riemann.TN
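Since CrossEntropyLoss is described as combining LogSoftmax and NLLLoss, the core computation can be sketched as the negative log-softmax value at the target class, averaged over the targets that are not ignored. The sketch assumes that input holds raw logits of shape (N, C) and that target holds class indices. It omits weight and label_smoothing, and the name cross_entropy_mean is hypothetical.

```python
import math

def cross_entropy_mean(logits, targets, ignore_index=-100):
    """Mean cross entropy over rows whose target is not ignore_index."""
    total, count = 0.0, 0
    for row, t in zip(logits, targets):
        if t == ignore_index:
            continue  # ignored targets contribute neither loss nor count
        m = max(row)
        lse = m + math.log(sum(math.exp(v - m) for v in row))  # stable log-sum-exp
        total += lse - row[t]  # equals -log_softmax(row)[t]
        count += 1
    return total / count if count else float("nan")

loss = cross_entropy_mean([[0.0, 0.0], [5.0, 1.0]], [1, -100])
# Only the first row counts, so the loss is log(2).
```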
- riemann.nn.functional.binary_cross_entropy_with_logits(input, target, weight=None, size_average=None, reduce=None, reduction='mean', pos_weight=None)
Compute binary cross entropy loss with logits
- Parameters:
input (riemann.TN) – Input tensor
target (riemann.TN) – Target tensor
weight (riemann.TN, optional) – Manual scaling weight for each batch element
size_average (bool, optional) – Deprecated
reduce (bool, optional) – Deprecated
reduction (str, optional) – Specifies the reduction to apply to the output
pos_weight (riemann.TN, optional) – Weight of positive class
- Returns:
Loss value
- Return type:
riemann.TN
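Working on logits rather than probabilities allows a numerically stable formulation. The scalar sketch below uses the standard stable form and shows how a pos_weight factor would scale the positive term. It is an assumption that Riemann computes the loss this way, and the function name is hypothetical.

```python
import math

def bce_with_logits(x, y, pos_weight=1.0):
    """Binary cross entropy for one logit x and one target y in [0, 1]."""
    log_weight = 1.0 + (pos_weight - 1.0) * y
    # log(1 + exp(-x)) written so that exp never overflows
    softplus_neg_x = math.log1p(math.exp(-abs(x))) + max(-x, 0.0)
    return (1.0 - y) * x + log_weight * softplus_neg_x
```

With pos_weight=1.0 this equals -(y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))), but it stays finite for large |x|.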
- riemann.nn.functional.huber_loss(input, target, delta=1.0, size_average=None, reduce=None, reduction='mean')
Compute Huber loss
- Parameters:
input (riemann.TN) – Input tensor
target (riemann.TN) – Target tensor
delta (float, optional) – Threshold at which the loss function changes from quadratic to linear
size_average (bool, optional) – Deprecated
reduce (bool, optional) – Deprecated
reduction (str, optional) – Specifies the reduction to apply to the output
- Returns:
Loss value
- Return type:
riemann.TN
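A minimal sketch of the conventional Huber definition with 'mean' reduction, assuming Riemann follows it (this reference does not state the formula). The function name is hypothetical.

```python
def huber(input, target, delta=1.0):
    """Mean Huber loss over two equal-length lists of floats."""
    losses = []
    for x, y in zip(input, target):
        d = abs(x - y)
        if d <= delta:
            losses.append(0.5 * d * d)                # quadratic zone
        else:
            losses.append(delta * (d - 0.5 * delta))  # linear zone
    return sum(losses) / len(losses)
```

Under these conventional definitions, Huber loss with a given delta equals delta times the smooth L1 loss with beta set to the same value.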
- riemann.nn.functional.nll_loss(input, target, weight=None, ignore_index=-100, reduction='mean')
Compute negative log likelihood loss
- Parameters:
input (riemann.TN) – Input tensor
target (riemann.TN) – Target tensor
weight (riemann.TN, optional) – Manual scaling weight for each class
ignore_index (int, optional) – Specifies target value to ignore
reduction (str, optional) – Specifies the reduction to apply to the output
- Returns:
Loss value
- Return type:
riemann.TN
Convolution Functions
- riemann.nn.functional.conv1d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
Apply 1D convolution to input signals
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C_in, L_in)
weight (riemann.TN) – Weight tensor with shape (C_out, C_in/groups, K)
bias (riemann.TN, optional) – Bias tensor with shape (C_out). Default: None
stride (int or tuple, optional) – Stride of the convolution
padding (int or tuple, optional) – Zero-padding added to both sides of the input
dilation (int or tuple, optional) – Spacing between kernel elements
groups (int, optional) – Number of blocked connections from input channels to output channels
- Returns:
Output tensor with shape (N, C_out, L_out)
- Return type:
riemann.TN
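The output length L_out is not spelled out above. The sketch below uses the standard convolution arithmetic, which Riemann is assumed (not confirmed) to follow; the helper name is hypothetical. The same formula applies per spatial dimension for conv2d and conv3d.

```python
def conv_out_len(l_in, kernel, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one dimension."""
    effective_kernel = dilation * (kernel - 1) + 1  # span covered by a dilated kernel
    return (l_in + 2 * padding - effective_kernel) // stride + 1
```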
- riemann.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
Apply 2D convolution to input images
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C_in, H_in, W_in)
weight (riemann.TN) – Weight tensor with shape (C_out, C_in/groups, K_h, K_w)
bias (riemann.TN, optional) – Bias tensor with shape (C_out). Default: None
stride (int or tuple, optional) – Stride of the convolution
padding (int or tuple, optional) – Zero-padding added to both sides of the input
dilation (int or tuple, optional) – Spacing between kernel elements
groups (int, optional) – Number of blocked connections from input channels to output channels
- Returns:
Output tensor with shape (N, C_out, H_out, W_out)
- Return type:
riemann.TN
- riemann.nn.functional.conv3d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
Apply 3D convolution to input volumes
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C_in, D_in, H_in, W_in)
weight (riemann.TN) – Weight tensor with shape (C_out, C_in/groups, K_d, K_h, K_w)
bias (riemann.TN, optional) – Bias tensor with shape (C_out). Default: None
stride (int or tuple, optional) – Stride of the convolution
padding (int or tuple, optional) – Zero-padding added to all sides of the input
dilation (int or tuple, optional) – Spacing between kernel elements
groups (int, optional) – Number of blocked connections from input channels to output channels
- Returns:
Output tensor with shape (N, C_out, D_out, H_out, W_out)
- Return type:
riemann.TN
Pooling Functions
- riemann.nn.functional.max_pool1d(input, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False, return_indices=False)
Apply 1D max pooling to input signals
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C, L_in)
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window. Default: kernel_size
padding (int or tuple, optional) – Zero-padding added to both sides of the input
dilation (int or tuple, optional) – Spacing between pooling window elements
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
return_indices (bool, optional) – Whether to return indices of maximum values
- Returns:
Output tensor with shape (N, C, L_out), or tuple (TN, TN) if return_indices is True
- Return type:
riemann.TN or tuple
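To make the effect of ceil_mode concrete, here is a small sketch of the usual output-length arithmetic for pooling along one dimension. It assumes Riemann follows the standard formula; real implementations may also drop a final window that would start entirely inside the padding, which this sketch ignores. The helper name is hypothetical.

```python
def pool_out_len(l_in, kernel_size, stride=None, padding=0, dilation=1,
                 ceil_mode=False):
    """Output length of a pooling window sliding along one dimension."""
    stride = kernel_size if stride is None else stride  # documented default
    span = l_in + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        steps = -(-span // stride)  # ceiling division on integers
    else:
        steps = span // stride      # floor division
    return steps + 1
```

With ceil_mode enabled, a trailing partial window produces one extra output element instead of being discarded.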
- riemann.nn.functional.max_pool2d(input, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False, return_indices=False)
Apply 2D max pooling to input images
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C, H_in, W_in)
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window. Default: kernel_size
padding (int or tuple, optional) – Zero-padding added to both sides of the input
dilation (int or tuple, optional) – Spacing between pooling window elements
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
return_indices (bool, optional) – Whether to return indices of maximum values
- Returns:
Output tensor with shape (N, C, H_out, W_out), or tuple (TN, TN) if return_indices is True
- Return type:
riemann.TN or tuple
- riemann.nn.functional.max_pool3d(input, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False, return_indices=False)
Apply 3D max pooling to input volume data
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C, D_in, H_in, W_in)
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window. Default: kernel_size
padding (int or tuple, optional) – Zero-padding added to all sides of the input
dilation (int or tuple, optional) – Spacing between pooling window elements
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
return_indices (bool, optional) – Whether to return indices of maximum values
- Returns:
Output tensor with shape (N, C, D_out, H_out, W_out), or tuple (TN, TN) if return_indices is True
- Return type:
riemann.TN or tuple
- riemann.nn.functional.avg_pool1d(input, kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
Apply 1D average pooling to input signals
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C, L_in)
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window. Default: kernel_size
padding (int or tuple, optional) – Zero-padding added to both sides of the input
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
count_include_pad (bool, optional) – Whether to include zero-padding when calculating average
divisor_override (int, optional) – If specified, will be used as denominator
- Returns:
Output tensor with shape (N, C, L_out)
- Return type:
riemann.TN
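The interaction between count_include_pad and divisor_override is easiest to see on a single channel. The plain-Python sketch below assumes the common semantics (divisor_override wins, otherwise padded zeros are either counted or excluded) and ignores ceil_mode; it is an illustration, not Riemann's implementation.

```python
def avg_pool1d(xs, kernel_size, stride=None, padding=0,
               count_include_pad=True, divisor_override=None):
    """Average pooling over a flat list of floats (one channel)."""
    stride = kernel_size if stride is None else stride
    padded = [0.0] * padding + list(xs) + [0.0] * padding
    out = []
    for start in range(0, len(padded) - kernel_size + 1, stride):
        window = padded[start:start + kernel_size]
        if divisor_override is not None:
            divisor = divisor_override
        elif count_include_pad:
            divisor = kernel_size
        else:
            # count only the positions that fall on real input
            lo = max(start, padding)
            hi = min(start + kernel_size, padding + len(xs))
            divisor = hi - lo
        out.append(sum(window) / divisor)
    return out
```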
- riemann.nn.functional.avg_pool2d(input, kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
Apply 2D average pooling to input images
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C, H_in, W_in)
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window. Default: kernel_size
padding (int or tuple, optional) – Zero-padding added to both sides of the input
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
count_include_pad (bool, optional) – Whether to include zero-padding when calculating average
divisor_override (int, optional) – If specified, will be used as denominator
- Returns:
Output tensor with shape (N, C, H_out, W_out)
- Return type:
riemann.TN
- riemann.nn.functional.avg_pool3d(input, kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
Apply 3D average pooling to input volume data
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C, D_in, H_in, W_in)
kernel_size (int or tuple) – Size of the pooling window
stride (int or tuple, optional) – Stride of the pooling window. Default: kernel_size
padding (int or tuple, optional) – Zero-padding added to all sides of the input
ceil_mode (bool, optional) – Whether to use ceiling instead of floor to compute output shape
count_include_pad (bool, optional) – Whether to include zero-padding when calculating average
divisor_override (int, optional) – If specified, will be used as denominator
- Returns:
Output tensor with shape (N, C, D_out, H_out, W_out)
- Return type:
riemann.TN
Utility Functions
- riemann.nn.functional.one_hot(target, num_classes)
Convert category indices to one-hot encoded tensors
- Parameters:
target (riemann.TN) – Target tensor with shape (N, *)
num_classes (int) – Number of classes
- Returns:
One-hot encoded tensor with shape (N, *, num_classes)
- Return type:
riemann.TN
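For a 1-D list of class indices, one-hot encoding can be sketched in plain Python as follows. This only illustrates the shape change from (N,) to (N, num_classes); the helper operates on lists, not on riemann.TN tensors.

```python
def one_hot(indices, num_classes):
    """Map each class index to a row with a single 1 at that index."""
    return [[1 if c == i else 0 for c in range(num_classes)] for i in indices]
```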
- riemann.nn.functional.unfold(input, kernel_size, dilation=1, padding=0, stride=1)
Extract sliding local blocks from batched input tensor
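- Parameters:
input (riemann.TN) – Input tensor
kernel_size (int or tuple) – Sliding block size
dilation (int or tuple, optional) – Spacing between kernel elements
padding (int or tuple, optional) – Zero-padding added to both sides of the input
stride (int or tuple, optional) – Stride of the sliding block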
- Returns:
Unfolded tensor with shape (N, C * kernel_size[0] * kernel_size[1], L)
- Return type:
riemann.TN
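The last dimension L of the unfolded tensor is the number of sliding blocks. Assuming a 4-D input of shape (N, C, H, W) and the standard sliding-window arithmetic (neither is stated explicitly in this reference), the output shape can be computed as in the sketch below; the helper names are hypothetical.

```python
def _pair(v):
    """Expand an int to a (v, v) tuple; pass tuples through."""
    return (v, v) if isinstance(v, int) else tuple(v)

def unfold_shape(n, c, h, w, kernel_size, dilation=1, padding=0, stride=1):
    """Shape of the unfolded tensor for an (N, C, H, W) input."""
    k, d, p, s = map(_pair, (kernel_size, dilation, padding, stride))
    blocks = 1
    for size, ki, di, pi, si in zip((h, w), k, d, p, s):
        # number of block positions along this dimension
        blocks *= (size + 2 * pi - di * (ki - 1) - 1) // si + 1
    return (n, c * k[0] * k[1], blocks)
```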
- riemann.nn.functional.fold(input, output_size, kernel_size, dilation=1, padding=0, stride=1)
Fold the unfolded tensor back to its original shape
- Parameters:
input (riemann.TN) – Input tensor with shape (N, C * kernel_size[0] * kernel_size[1], L)
output_size (int or tuple) – Output tensor size (H, W)
kernel_size (int or tuple) – Sliding block size
dilation (int or tuple, optional) – Spacing between kernel elements
padding (int or tuple, optional) – Zero-padding added to both sides of the input
stride (int or tuple, optional) – Stride of the sliding block
- Returns:
Folded tensor with shape (N, C, H, W)
- Return type:
riemann.TN
- riemann.nn.functional.unfold2d(input, kernel_size, dilation=1, padding=0, stride=1)
Extract sliding local blocks from 2D input tensor (2D-specific version of unfold)
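- Parameters:
input (riemann.TN) – 2D input tensor
kernel_size (int or tuple) – Sliding block size
dilation (int or tuple, optional) – Spacing between kernel elements
padding (int or tuple, optional) – Zero-padding added to both sides of the input
stride (int or tuple, optional) – Stride of the sliding block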
- Returns:
Unfolded tensor with shape (N, C * kernel_size[0] * kernel_size[1], L)
- Return type:
riemann.TN
- riemann.nn.functional.unfold3d(input, kernel_size, dilation=1, padding=0, stride=1)
Extract sliding local blocks from 3D input tensor (3D-specific version of unfold)
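- Parameters:
input (riemann.TN) – 3D input tensor
kernel_size (int or tuple) – Sliding block size
dilation (int or tuple, optional) – Spacing between kernel elements
padding (int or tuple, optional) – Zero-padding added to all sides of the input
stride (int or tuple, optional) – Stride of the sliding block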
- Returns:
Unfolded tensor with shape (N, C * kernel_size[0] * kernel_size[1] * kernel_size[2], L)
- Return type:
riemann.TN
Datasets
Dataset Classes
- class riemann.utils.Dataset
Abstract base class for datasets, defining the standard interface that all datasets must implement.
- __len__()
Return the number of samples in the dataset.
- __getitem__(index)
Get a single sample from the dataset at the given index.
- class riemann.utils.TensorDataset(*tensors)
Dataset wrapping one or more tensors. Samples are indexed along the first dimension of each tensor.
- Parameters:
*tensors (riemann.TN) – Variable number of tensors; all must have the same size in the first dimension
- __len__()
Return the size of the dataset, which is the size of the first dimension of the tensors.
- __getitem__(index)
Get sample data at the specified index.
Data Loaders
- class riemann.utils.DataLoader(dataset, batch_size=1, shuffle=False, num_workers=0, collate_fn=None, drop_last=False)
Efficient data loader supporting batch processing, data shuffling, and multi-process loading.
- Parameters:
dataset (riemann.utils.Dataset) – Dataset to load data from
batch_size (int, optional) – Size of each batch, defaults to 1
shuffle (bool, optional) – Whether to shuffle the data at the beginning of each epoch, defaults to False
num_workers (int, optional) – Number of worker processes for data loading, 0 means loading in the main process, defaults to 0
collate_fn (callable, optional) – Batch processing function for combining samples into batches, defaults to default_collate
drop_last (bool, optional) – Whether to drop the last incomplete batch if dataset size is not divisible by batch size, defaults to False
- __len__()
Return the number of batches in the data loader.
- __iter__()
Return an iterator for the data loader.
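How batch_size and drop_last determine the value returned by __len__() can be sketched as follows. This is the usual arithmetic for batched loaders, assumed rather than confirmed for Riemann; the helper names are hypothetical.

```python
import math

def num_batches(num_samples, batch_size=1, drop_last=False):
    """Number of batches yielded per epoch."""
    if drop_last:
        return num_samples // batch_size       # discard the incomplete tail
    return math.ceil(num_samples / batch_size)  # keep a shorter final batch

def batch_indices(num_samples, batch_size=1, drop_last=False):
    """Yield lists of sample indices, one list per batch (no shuffling)."""
    for start in range(0, num_samples, batch_size):
        batch = list(range(start, min(start + batch_size, num_samples)))
        if drop_last and len(batch) < batch_size:
            return
        yield batch
```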
Dataset Utility Functions
- riemann.utils.default_collate(batch)
Default batch processing function that converts a batch of sample data into tensor format suitable for model input.
- Parameters:
batch (list) – List of samples in a batch, each sample can be various data types
- Returns:
Batch data combined according to input type
- riemann.utils.clip_grad_norm_(parameters, max_norm, norm_type=2.0, error_if_nonfinite=False)
Clip gradients by norm.
- Parameters:
parameters (Iterable[riemann.TN]) – Collection of parameters whose gradients need to be clipped
max_norm (float or int) – Maximum norm of gradients
norm_type (float or int, optional) – Type of norm, defaults to 2 (L2 norm)
error_if_nonfinite (bool, optional) – Whether to throw an error if gradients contain non-finite values (such as NaN or inf), defaults to False
- Returns:
Gradient norm before clipping
- Return type:
float
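Norm-based clipping rescales all gradients by one common factor so that their combined norm does not exceed max_norm. The sketch below works on nested Python lists instead of riemann.TN parameters, and the small constant added to the denominator is an assumption borrowed from common implementations.

```python
def clip_grad_norm(grads, max_norm, norm_type=2.0):
    """Clip a list of gradient lists in place; return the pre-clipping norm."""
    flat = [g for grad in grads for g in grad]
    if norm_type == float("inf"):
        total = max(abs(g) for g in flat)
    else:
        total = sum(abs(g) ** norm_type for g in flat) ** (1.0 / norm_type)
    scale = max_norm / (total + 1e-6)  # epsilon guards against division by zero
    if scale < 1.0:                    # only ever shrink, never enlarge
        for grad in grads:
            for i in range(len(grad)):
                grad[i] *= scale
    return total
```

Because every gradient is multiplied by the same factor, the direction of the overall update is preserved; only its length changes.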
- riemann.utils.clip_grad_value_(parameters, clip_value, error_if_nonfinite=False)
Clip gradients by value.
- Parameters:
parameters (Iterable[riemann.TN]) – Collection of parameters whose gradients need to be clipped
clip_value (float or int) – Threshold value for gradient clipping
error_if_nonfinite (bool, optional) – Whether to throw an error if gradients contain non-finite values (such as NaN or inf), defaults to False
Vision
Datasets
- class riemann.vision.datasets.MNIST(root, train=True, transform=None, target_transform=None)
MNIST dataset class for loading and processing the MNIST handwritten digit dataset.
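- Parameters:
root (str) – Root directory of the dataset
train (bool) – Whether to load the training set, defaults to True
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets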
- __len__()
Return the number of samples in the dataset.
- __getitem__(index)
Get a single sample from the dataset at the given index.
- class riemann.vision.datasets.EasyMNIST(root, train=True, onehot_label=True, download=False)
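Subclass of MNIST that normalizes, standardizes, and flattens the image data during initialization, and converts labels to either one-hot encodings or scalar tensors.
- Parameters:
root (str) – Root directory of the dataset
train (bool) – Whether to load the training set, defaults to True
onehot_label (bool) – Whether to one-hot encode labels instead of converting them to scalar tensors, defaults to True
download (bool, optional) – Whether to download the dataset if not found, defaults to False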
- __len__()
Return the size of the dataset.
- __getitem__(index)
Get sample data at the specified index.
- class riemann.vision.datasets.FashionMNIST(root, train=True, transform=None, target_transform=None, download=False)
Fashion-MNIST dataset class for loading and processing the Fashion-MNIST fashion product dataset.
- Parameters:
root (str) – Root directory of the dataset
train (bool) – Whether to load the training set, defaults to True
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
download (bool, optional) – Whether to download the dataset if not found, defaults to False
- classes
List of class names: ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
- __len__()
Return the number of samples in the dataset.
- __getitem__(index)
Get a single sample from the dataset at the given index.
- class riemann.vision.datasets.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)
CIFAR-10 dataset class for loading and processing the CIFAR-10 image dataset.
- Parameters:
root (str) – Root directory of the dataset
train (bool) – Whether to load the training set, defaults to True
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
download (bool, optional) – Whether to download the dataset if not found, defaults to False
- __len__()
Return the size of the dataset.
- __getitem__(index)
Get sample data at the specified index.
- class riemann.vision.datasets.Flowers102(root, split='train', transform=None, target_transform=None, download=False)
Oxford 102 Flower dataset class for loading and processing the flower classification dataset.
- Parameters:
root (str) – Root directory of the dataset
split (str, optional) – Dataset split (‘train’, ‘val’, or ‘test’), defaults to ‘train’
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
download (bool, optional) – Whether to download the dataset if not found, defaults to False
- __len__()
Return the size of the dataset.
- __getitem__(index)
Get sample data at the specified index.
- class riemann.vision.datasets.OxfordIIITPet(root, split='trainval', target_types='category', transform=None, target_transform=None, download=False)
Oxford-IIIT Pet dataset class for loading and processing the pet classification dataset.
- Parameters:
root (str) – Root directory of the dataset
split (str, optional) – Dataset split (‘trainval’ or ‘test’), defaults to ‘trainval’
target_types (str or list, optional) – Type of target (‘category’, ‘binary-category’, or ‘segmentation’), defaults to ‘category’
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
download (bool, optional) – Whether to download the dataset if not found, defaults to False
- __len__()
Return the size of the dataset.
- __getitem__(index)
Get sample data at the specified index.
- class riemann.vision.datasets.LFWPeople(root, split='10fold', image_set='funneled', transform=None, target_transform=None, download=False)
LFW People dataset class for loading and processing the face recognition dataset.
- Parameters:
root (str) – Root directory of the dataset
split (str, optional) – Dataset split (‘10fold’, ‘train’, or ‘test’), defaults to ‘10fold’
image_set (str, optional) – Image alignment type (‘original’, ‘funneled’, or ‘deepfunneled’), defaults to ‘funneled’
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
download (bool, optional) – Whether to download the dataset if not found, defaults to False
- classes
List of person names.
- __len__()
Return the size of the dataset.
- __getitem__(index)
Get sample data at the specified index.
- class riemann.vision.datasets.SVHN(root, split='train', transform=None, target_transform=None, download=False)
SVHN (Street View House Numbers) dataset class for loading and processing the digit recognition dataset.
- Parameters:
root (str) – Root directory of the dataset
split (str, optional) – Dataset split (‘train’, ‘test’, or ‘extra’), defaults to ‘train’
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
download (bool, optional) – Whether to download the dataset if not found, defaults to False
- __len__()
Return the size of the dataset.
- __getitem__(index)
Get sample data at the specified index.
- class riemann.vision.datasets.DatasetFolder(root, loader, extensions=None, transform=None, target_transform=None, is_valid_file=None, allow_empty=False)
Generic folder dataset class for loading custom datasets from folders.
- Parameters:
root (str) – Root directory path of the dataset
loader (callable) – Image loading function
extensions (tuple, optional) – Tuple of allowed file extensions
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
is_valid_file (callable, optional) – Function to validate if a file is valid
allow_empty (bool) – Whether to allow empty folders, defaults to False
- classes
List of class names.
- class_to_idx
Dictionary mapping class names to indices.
- __len__()
Return the number of samples in the dataset.
- __getitem__(index)
Get a single sample from the dataset at the given index.
- class riemann.vision.datasets.ImageFolder(root, transform=None, target_transform=None, loader=None, is_valid_file=None)
Image folder dataset class, inherited from DatasetFolder, for loading image datasets from folders.
- Parameters:
root (str) – Root directory path of the dataset
transform (callable, optional) – Transformation function applied to images
target_transform (callable, optional) – Transformation function applied to targets
loader (callable, optional) – Image loading function, defaults to PIL Image loader
is_valid_file (callable, optional) – Function to validate if a file is valid
- classes
List of class names.
- class_to_idx
Dictionary mapping class names to indices.
- __len__()
Return the number of samples in the dataset.
- __getitem__(index)
Get a single sample from the dataset at the given index.
Image Transforms
- class riemann.vision.transforms.Transform
Base class for all transformation classes.
- __call__(img)
Execute the transformation.
- class riemann.vision.transforms.Compose(transforms)
Combine multiple transformations into a single transformation.
- Parameters:
transforms (list of Transform objects) – List of transformations to combine
- class riemann.vision.transforms.ToTensor
Convert PIL image or NumPy array to TN tensor.
- class riemann.vision.transforms.ToPILImage
Convert TN tensor or NumPy array to PIL image.
- class riemann.vision.transforms.Normalize(mean, std, inplace=False)
Normalize tensor using mean and standard deviation.
- Parameters:
mean (sequence) – Mean for each channel
std (sequence) – Standard deviation for each channel
inplace (bool, optional) – Whether to perform operation in-place, defaults to False
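Per-channel normalization computes (x - mean[c]) / std[c] for every value x in channel c. The sketch below applies this to an image stored as a list of channels, each a flat list of floats; it illustrates the arithmetic only and does not use riemann.TN.

```python
def normalize(channels, mean, std):
    """Return a new image with each channel shifted by its mean and scaled by its std."""
    return [
        [(x - m) / s for x in channel]
        for channel, m, s in zip(channels, mean, std)
    ]
```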
- class riemann.vision.transforms.Resize(size, interpolation=BILINEAR)
Resize PIL image.
- class riemann.vision.transforms.CenterCrop(size)
Center crop.
- class riemann.vision.transforms.RandomHorizontalFlip(p=0.5)
Random horizontal flip.
- Parameters:
p (float, optional) – Flip probability, defaults to 0.5
- class riemann.vision.transforms.RandomVerticalFlip(p=0.5)
Random vertical flip.
- Parameters:
p (float, optional) – Flip probability, defaults to 0.5
- class riemann.vision.transforms.RandomRotation(degrees, resample=NEAREST, expand=False, center=None)
Random rotation.
- Parameters:
degrees (int or tuple) – Rotation angle range. If int, select from (-degrees, degrees). If (min, max), select from (min, max).
resample (int, optional) – Resampling method, defaults to NEAREST
expand (bool, optional) – Whether to expand image to accommodate rotation, defaults to False
center (tuple, optional) – Rotation center, defaults to image center
- class riemann.vision.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)
Random color transformation.
- class riemann.vision.transforms.Grayscale(num_output_channels=1)
Convert image to grayscale.
- Parameters:
num_output_channels (int) – Number of output channels, 1 or 3, defaults to 1
- class riemann.vision.transforms.RandomGrayscale(p=0.1)
Randomly convert to grayscale.
- Parameters:
p (float, optional) – Probability of converting to grayscale, defaults to 0.1
- class riemann.vision.transforms.RandomCrop(size, padding=None)
Crop image at random position.
- class riemann.vision.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), interpolation=BILINEAR)
Random crop and resize.
- Parameters:
size (int or tuple) – Target size. If int, resize to square (size, size). If (h, w), resize to this size.
scale (tuple, optional) – Crop area ratio range relative to original image, defaults to (0.08, 1.0)
ratio (tuple, optional) – Crop aspect ratio range, defaults to (3/4, 4/3)
interpolation (int, optional) – Interpolation method, defaults to BILINEAR
- class riemann.vision.transforms.FiveCrop(size)
Crop the image into its four corners and the center.
- class riemann.vision.transforms.TenCrop(size, vertical_flip=False)
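Crop the image into its four corners and the center, plus a flipped copy of each (horizontal by default, vertical if vertical_flip is True).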
- class riemann.vision.transforms.Pad(padding, fill=0, padding_mode='constant')
Padding.
- Parameters:
- class riemann.vision.transforms.Lambda(lambd)
Use user-defined lambda function as transformation.
- Parameters:
lambd (function) – Lambda function
- class riemann.vision.transforms.PILToTensor
Convert PIL Image to tensor (without scaling).
- class riemann.vision.transforms.ConvertImageDtype(dtype)
Convert image data type.
- Parameters:
dtype (numpy.dtype) – Target data type
- class riemann.vision.transforms.GaussianBlur(kernel_size, sigma=(0.1, 2.0))
Apply Gaussian blur to image.
- class riemann.vision.transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=NEAREST, fillcolor=0)
Random affine transformation.
- class riemann.vision.transforms.RandomPerspective(distortion_scale=0.5, p=0.5, interpolation=BILINEAR, fill=0)
Random perspective transformation.
- class riemann.vision.transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)
Random erasing for data augmentation.
- class riemann.vision.transforms.AutoAugment(policy=AutoAugmentPolicy.IMAGENET)
Automatic data augmentation based on a learned policy.
- Parameters:
policy (AutoAugmentPolicy) – Augmentation policy
- class riemann.vision.transforms.RandAugment(num_ops=2, magnitude=9, num_magnitude_bins=31, interpolation=BILINEAR, fill=None)
Random data augmentation.
- class riemann.vision.transforms.TrivialAugmentWide(num_magnitude_bins=31, interpolation=BILINEAR, fill=None)
Wide range simple augmentation.
- class riemann.vision.transforms.SanitizeBoundingBox(labels_format='xyxy', min_size=1)
Sanitize bounding boxes.
- class riemann.vision.transforms.Invert
Invert colors.
- class riemann.vision.transforms.Posterize(bits)
Reduce color bits.
- Parameters:
bits (int) – Number of bits to keep
- class riemann.vision.transforms.Solarize(threshold)
Invert pixels above threshold.
- Parameters:
threshold (int) – Threshold value
- class riemann.vision.transforms.Equalize
Histogram equalization.
- class riemann.vision.transforms.AutoContrast
Auto contrast adjustment.
- class riemann.vision.transforms.Sharpness(sharpness_factor)
Sharpness adjustment.
- Parameters:
sharpness_factor (float) – Sharpness factor
- class riemann.vision.transforms.Brightness(brightness_factor)
Brightness adjustment.
- Parameters:
brightness_factor (float) – Brightness factor
- class riemann.vision.transforms.Contrast(contrast_factor)
Contrast adjustment.
- Parameters:
contrast_factor (float) – Contrast factor
Optimization
Optimizers
- class riemann.optim.Optimizer(params, defaults)
Base class for all optimizers.
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
defaults (Dict[str, Any]) – Default hyperparameters for the optimizer
- step(closure=None)
Perform a single optimization step
- Parameters:
closure (callable, optional) – A closure that reevaluates the model and returns the loss
- Returns:
Loss value if closure is provided, otherwise None
- Return type:
float or None
- zero_grad(set_to_none=False)
Set the gradients of all optimized parameters to zero
- Parameters:
set_to_none (bool, optional) – Whether to set gradients to None instead of zero
- add_param_group(param_group)
Add a parameter group to the optimizer
- Parameters:
param_group (Dict[str, Any]) – Parameter group to add
- class riemann.optim.GD(params, lr=0.01, weight_decay=0.0)
Gradient Descent optimizer
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
weight_decay (float, optional) – Weight decay (L2 regularization) coefficient
- step()
Perform a single optimization step
- class riemann.optim.SGD(params, lr=0.01, momentum=0.0, weight_decay=0.0, dampening=0.0, nesterov=False)
Stochastic Gradient Descent optimizer
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
momentum (float, optional) – Momentum factor
weight_decay (float, optional) – Weight decay (L2 regularization) coefficient
dampening (float, optional) – Dampening for momentum
nesterov (bool, optional) – Whether to enable Nesterov momentum
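The roles of momentum, dampening, and nesterov are easiest to see in a scalar update. The sketch below follows the widely used formulation, in which the momentum buffer starts at the first gradient; this reference does not state Riemann's exact update rule, so treat the details as assumptions. The function name is hypothetical.

```python
def sgd_step(p, g, buf, lr=0.01, momentum=0.0, weight_decay=0.0,
             dampening=0.0, nesterov=False):
    """One SGD update for a scalar parameter. Returns (new_p, new_buf)."""
    g = g + weight_decay * p      # L2 regularization folded into the gradient
    if momentum != 0.0:
        if buf is None:
            buf = g               # first step: buffer starts at the gradient
        else:
            buf = momentum * buf + (1.0 - dampening) * g
        g = g + momentum * buf if nesterov else buf
    return p - lr * g, buf
```

Pass buf=None on the first call and feed the returned buffer back in on later calls.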
- class riemann.optim.Adam(params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, amsgrad=False)
Adam (Adaptive Moment Estimation) optimizer
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
betas (Tuple[float, float], optional) – Coefficients used for computing running averages of gradient and its square
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Weight decay (L2 regularization) coefficient
amsgrad (bool, optional) – Whether to use the AMSGrad variant
- step()
Perform a single optimization step
- class riemann.optim.Adagrad(params, lr=0.01, lr_decay=0.0, weight_decay=0.0, initial_accumulator_value=0.0, eps=1e-10)
Adagrad (Adaptive Gradient Algorithm) optimizer
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
lr_decay (float, optional) – Learning rate decay
weight_decay (float, optional) – Weight decay (L2 regularization) coefficient
initial_accumulator_value (float, optional) – Initial value for the accumulator
eps (float, optional) – Term added to the denominator to improve numerical stability
- step()
Perform a single optimization step
- class riemann.optim.LBFGS(params, lr=1.0, max_iter=20, max_eval=None, tolerance_grad=1e-05, tolerance_change=1e-09, history_size=100, line_search_fn=None)
L-BFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno) optimizer
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
max_iter (int, optional) – Maximum number of iterations per optimization step
max_eval (int, optional) – Maximum number of function evaluations per optimization step
tolerance_grad (float, optional) – Gradient tolerance for convergence
tolerance_change (float, optional) – Parameter change tolerance for convergence
history_size (int, optional) – Update history size
line_search_fn (callable, optional) – Line search function
- class riemann.optim.AdamW(params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0.01, amsgrad=False)
AdamW (Adam with Weight Decay) optimizer
A variant of Adam that decouples weight decay from the gradient-based update: the decay is applied directly to the parameters instead of being added to the gradients. This keeps the decay from being rescaled by Adam's adaptive per-parameter step sizes, which is the side effect of gradient-based (L2) weight decay in Adam.
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
betas (Tuple[float, float], optional) – Coefficients used for computing running averages of gradient and its square
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Decoupled weight decay coefficient
amsgrad (bool, optional) – Whether to use the AMSGrad variant
- step()
Perform a single optimization step
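The difference between Adam-style and AdamW-style weight decay can be sketched for a single scalar parameter. The code follows the standard published update rules and omits amsgrad; it is an assumption that Riemann implements exactly this. The function name and the decoupled flag are hypothetical.

```python
import math

def adam_step(p, g, m, v, t, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
              weight_decay=0.0, decoupled=False):
    """One Adam (decoupled=False) or AdamW (decoupled=True) update.

    t is the 1-based step count. Returns (new_p, new_m, new_v).
    """
    b1, b2 = betas
    if decoupled:
        p = p * (1.0 - lr * weight_decay)  # AdamW: shrink the weight directly
    else:
        g = g + weight_decay * p           # Adam: decay enters through the gradient
    m = b1 * m + (1.0 - b1) * g            # running average of the gradient
    v = b2 * v + (1.0 - b2) * g * g        # running average of its square
    m_hat = m / (1.0 - b1 ** t)            # bias correction
    v_hat = v / (1.0 - b2 ** t)
    return p - lr * m_hat / (math.sqrt(v_hat) + eps), m, v
```

With a zero gradient, the decoupled variant shrinks the weight by the factor (1 - lr * weight_decay), while the coupled variant moves it by roughly lr regardless of how small weight_decay is, because the adaptive normalization rescales the decay term.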
- class riemann.optim.RMSprop(params, lr=1e-2, alpha=0.99, eps=1e-8, weight_decay=0, momentum=0, centered=False)
RMSprop (Root Mean Square Propagation) optimizer
An adaptive learning rate optimizer particularly suitable for recurrent neural networks (RNNs). It adjusts the learning rate for each parameter by maintaining a moving average of squared gradients.
- Parameters:
params (Iterable[riemann.TN or riemann.nn.Parameter] or List[Dict[str, Any]]) – Iterator of parameters to optimize or list of dictionaries defining parameter groups
lr (float, optional) – Learning rate
alpha (float, optional) – Smoothing constant used for computing the exponential moving average of squared gradients
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Weight decay (L2 regularization) coefficient
momentum (float, optional) – Momentum factor
centered (bool, optional) – Whether to use centered RMSprop, which normalizes the gradient by an estimate of its variance (computed from an additional moving average of gradients)
- step()
Perform a single optimization step
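The moving-average scaling described above can be sketched for a single scalar parameter. This is an illustration of the standard RMSprop rule under stated assumptions, not Riemann's implementation: the helper name `rmsprop_step` and its state arguments are hypothetical, and `weight_decay` and `momentum` are left out for brevity.

```python
import math

def rmsprop_step(param, grad, square_avg, lr=1e-2, alpha=0.99, eps=1e-8,
                 grad_avg=None):
    """One RMSprop update for a scalar parameter (hypothetical helper).

    Pass a float for grad_avg to use the centered variant.
    """
    # Exponential moving average of squared gradients.
    square_avg = alpha * square_avg + (1.0 - alpha) * grad * grad
    if grad_avg is not None:
        # Centered variant: also track the mean gradient and
        # normalize by the estimated variance.
        grad_avg = alpha * grad_avg + (1.0 - alpha) * grad
        variance = max(square_avg - grad_avg * grad_avg, 0.0)
        denom = math.sqrt(variance) + eps
    else:
        denom = math.sqrt(square_avg) + eps
    param = param - lr * grad / denom
    return param, square_avg, grad_avg

# Plain and centered updates from the same starting point.
p_plain, sq, _ = rmsprop_step(param=1.0, grad=2.0, square_avg=0.0)
p_centered, sq_c, g_c = rmsprop_step(param=1.0, grad=2.0, square_avg=0.0,
                                     grad_avg=0.0)
```

Because the centered denominator subtracts the squared mean gradient, it is never larger than the plain one, so the centered step is at least as large for the same gradient.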
Learning Rate Schedulers
- class riemann.optim.lr_scheduler.LRScheduler(optimizer, last_epoch=-1, verbose=False)
Base class for all learning rate schedulers
- Parameters:
optimizer (riemann.optim.Optimizer) – Optimizer whose learning rate will be adjusted
last_epoch (int, optional) – Index of the last epoch
verbose (bool, optional) – Whether to print learning rate updates
- step(epoch=None)
Perform a single scheduler step
- Parameters:
epoch (int, optional) – Current epoch index
- get_lr()
Compute the learning rate for the current epoch
- Returns:
Learning rate for each parameter group
- Return type:
List[float]
- get_last_lr()
Return the last computed learning rate
- Returns:
Learning rate for each parameter group
- Return type:
List[float]
- class riemann.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1, verbose=False)
Decays the learning rate by gamma every step_size epochs
- Parameters:
optimizer (riemann.optim.Optimizer) – Optimizer whose learning rate will be adjusted
step_size (int) – Period of learning rate decay
gamma (float, optional) – Multiplicative factor of learning rate decay
last_epoch (int, optional) – Index of the last epoch
verbose (bool, optional) – Whether to print learning rate updates
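The step schedule can be written in closed form. The sketch below assumes the usual convention that the rate is multiplied by `gamma` once for every completed block of `step_size` epochs; `step_lr` is a hypothetical helper, not a Riemann function.

```python
def step_lr(base_lr, epoch, step_size, gamma=0.1):
    """Closed-form StepLR value at a given epoch (hypothetical helper)."""
    # epoch // step_size counts how many decay periods have elapsed.
    return base_lr * gamma ** (epoch // step_size)

# Learning rates for the first seven epochs with a decay every 3 epochs.
rates = [step_lr(0.1, epoch, step_size=3) for epoch in range(7)]
```

Here the rate stays at its base value for epochs 0 to 2, drops by a factor of ten for epochs 3 to 5, and drops again at epoch 6.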
- class riemann.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1, verbose=False)
Decays the learning rate by gamma at specified milestones
- Parameters:
optimizer (riemann.optim.Optimizer) – Optimizer whose learning rate will be adjusted
milestones (List[int]) – Epoch indices at which the learning rate is multiplied by gamma
gamma (float, optional) – Multiplicative factor of learning rate decay
last_epoch (int, optional) – Index of the last epoch
verbose (bool, optional) – Whether to print learning rate updates
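A milestone schedule amounts to counting how many milestones have been reached. The sketch below assumes that a milestone takes effect at the epoch equal to its index; `multi_step_lr` is a hypothetical helper, not a Riemann function.

```python
from bisect import bisect_right

def multi_step_lr(base_lr, epoch, milestones, gamma=0.1):
    """Closed-form MultiStepLR value at a given epoch (hypothetical helper)."""
    # Number of milestones that are less than or equal to the current epoch.
    reached = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** reached

# Decay at epochs 2 and 5.
rates = [multi_step_lr(0.1, epoch, milestones=[2, 5]) for epoch in range(7)]
```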
- class riemann.optim.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1, verbose=False)
Decays the learning rate by gamma every epoch
- Parameters:
optimizer (riemann.optim.Optimizer) – Optimizer whose learning rate will be adjusted
gamma (float) – Multiplicative factor of learning rate decay
last_epoch (int, optional) – Index of the last epoch
verbose (bool, optional) – Whether to print learning rate updates
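Applying the factor `gamma` once per epoch gives a geometric sequence. The sketch below assumes that convention; `exponential_lr` is a hypothetical helper, not a Riemann function.

```python
def exponential_lr(base_lr, epoch, gamma):
    """Closed-form ExponentialLR value at a given epoch (hypothetical helper)."""
    # One multiplication by gamma per elapsed epoch.
    return base_lr * gamma ** epoch

# Learning rates for the first four epochs with a 10% decay per epoch.
rates = [exponential_lr(0.1, epoch, gamma=0.9) for epoch in range(4)]
```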
- class riemann.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1, verbose=False)
Anneals the learning rate using a cosine function
- Parameters:
optimizer (riemann.optim.Optimizer) – Optimizer whose learning rate will be adjusted
T_max (int) – Maximum number of iterations
eta_min (float, optional) – Minimum learning rate
last_epoch (int, optional) – Index of the last epoch
verbose (bool, optional) – Whether to print learning rate updates
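A common form of cosine annealing moves the rate from its base value down to `eta_min` along half a cosine period over `T_max` epochs. The sketch below assumes that closed form for epochs between 0 and `T_max`; `cosine_annealing_lr` is a hypothetical helper, not a Riemann function.

```python
import math

def cosine_annealing_lr(base_lr, epoch, T_max, eta_min=0.0):
    """Closed-form cosine-annealed rate for 0 <= epoch <= T_max
    (hypothetical helper)."""
    # The cosine factor falls smoothly from 1 at epoch 0 to 0 at epoch T_max.
    cosine = (1.0 + math.cos(math.pi * epoch / T_max)) / 2.0
    return eta_min + (base_lr - eta_min) * cosine

# Rates over a 10-epoch schedule with a floor of 0.001.
rates = [cosine_annealing_lr(0.1, epoch, T_max=10, eta_min=0.001)
         for epoch in range(11)]
```

The schedule starts at the base rate, passes through the midpoint between the base rate and `eta_min` halfway through, and ends at `eta_min`.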
- class riemann.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=False, threshold=1e-4, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-8)
Reduces the learning rate when a metric has stopped improving
- Parameters:
optimizer (riemann.optim.Optimizer) – Optimizer whose learning rate will be adjusted
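mode (str, optional) – One of ‘min’ or ‘max’. In ‘min’ mode the learning rate is reduced when the monitored metric stops decreasing; in ‘max’ mode, when it stops increasing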
factor (float, optional) – Multiplicative factor of learning rate reduction
patience (int, optional) – Number of epochs with no improvement after which learning rate will be reduced
verbose (bool, optional) – Whether to print learning rate updates
threshold (float, optional) – Threshold for measuring the new optimum
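threshold_mode (str, optional) – One of ‘rel’ or ‘abs’: whether threshold is interpreted relative to the best value seen so far or as an absolute difference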
cooldown (int, optional) – Number of epochs to wait before resuming normal operation after learning rate has been reduced
min_lr (float or List[float], optional) – Minimum learning rate
eps (float, optional) – Minimal decay applied to lr
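How `patience`, `threshold`, `cooldown`, and `min_lr` interact can be sketched with a small stand-alone tracker. This is a toy model, not Riemann's implementation: the class `PlateauTracker` is hypothetical, it tracks a single learning rate instead of an optimizer, it omits `eps` and `verbose`, and it assumes the PyTorch-style convention that the rate drops once the number of epochs without improvement exceeds `patience`.

```python
class PlateauTracker:
    """Toy stand-in for plateau-based learning rate reduction."""

    def __init__(self, lr, mode="min", factor=0.1, patience=10,
                 threshold=1e-4, threshold_mode="rel", cooldown=0, min_lr=0.0):
        self.lr = lr
        self.mode = mode
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.threshold_mode = threshold_mode
        self.cooldown = cooldown
        self.min_lr = min_lr
        self.best = float("inf") if mode == "min" else float("-inf")
        self.bad_epochs = 0
        self.cooldown_left = 0

    def _is_better(self, value):
        # An improvement must beat the best value by more than the threshold.
        if self.mode == "min":
            if self.threshold_mode == "rel":
                return value < self.best * (1.0 - self.threshold)
            return value < self.best - self.threshold
        if self.threshold_mode == "rel":
            return value > self.best * (1.0 + self.threshold)
        return value > self.best + self.threshold

    def step(self, metric):
        if self._is_better(metric):
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.cooldown_left > 0:
            # Bad epochs are not counted while cooling down.
            self.cooldown_left -= 1
            self.bad_epochs = 0
        if self.bad_epochs > self.patience:
            # Reduce the rate, but never below min_lr.
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.cooldown_left = self.cooldown
            self.bad_epochs = 0
        return self.lr

# A validation loss that stalls for three epochs, then improves.
tracker = PlateauTracker(lr=0.1, factor=0.5, patience=2)
history = [tracker.step(loss) for loss in [1.0, 1.0, 1.0, 1.0, 0.5]]
```

In this run the first value sets the baseline, the next three count as epochs without improvement, and the rate is halved on the third of them because the count then exceeds `patience=2`.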