Module warm.functional
Wraps various torch.nn modules in a functional interface.
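For orientation, here is a minimal end-to-end sketch of how these functions are meant to be used. It assumes the PyWarm convention that warm.up traces the module once with a dummy input so the functional calls can create their parameters lazily; the network and shapes are purely illustrative.

```python
import torch
import torch.nn as nn
import warm
import warm.functional as W

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Trace forward once so W.conv / W.linear can register parameters.
        warm.up(self, [1, 1, 28, 28])

    def forward(self, x):
        x = W.conv(x, size=16, kernel=3)  # (Batch, 16, 26, 26)
        x = torch.relu(x)
        x = x.flatten(1)                  # (Batch, 16*26*26)
        return W.linear(x, size=10)       # (Batch, 10)

net = Net()
logits = net(torch.randn(1, 1, 28, 28))
```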
Functions
batch_norm
def batch_norm(x, **kw)

x: Tensor; 2d or more, with shape (Batch, Channel, *), where * means any number of additional dimensions.
**kw: dict; Any additional kwargs are passed down to torch.nn.BatchNormNd, where N can be 1, 2, or 3, as well as to warm.engine.forward. Refer to their docs for details. Some of the additional BatchNorm arguments: eps, momentum, affine, track_running_stats.
return: Tensor; Same shape as the input x.
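A sketch (inside the warmified forward from the module-level example; it assumes N is inferred from the input rank, so a 3d input maps to BatchNorm1d):

```python
x = torch.randn(4, 8, 32)           # (Batch, Channel, Length) -> BatchNorm1d
y = W.batch_norm(x, momentum=0.05)  # momentum is forwarded to BatchNormNd
assert y.shape == x.shape
```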
conv
def conv(x, size, kernel, init_weight=None, init_bias=None, bias=True, **kw)

x: Tensor; With shape (Batch, Channel, *), where * can be 1d, 2d, or 3d. If x is 3d, its shape is (Batch, Channel, Length); if 4d, (Batch, Channel, Height, Width); if 5d, (Batch, Channel, Depth, Height, Width).
size: int; Size of hidden filters, and the size of the output channel.
kernel: int or tuple; Size of the convolution kernel.
init_weight: None, str, callable, or 2-tuple; Initialization specification for the weight tensor. If a str, it should name one of the initialization functions in torch.nn.init. If a callable, it is applied to the weight tensor directly, i.e. spec(weight). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(weight, **kwargs). Default: None, in which case the weight tensor is initialized using torch.nn.ConvNd's default scheme.
init_bias: None, str, callable, or 2-tuple; Same as init_weight, but for the bias tensor.
bias: bool; If True, adds a learnable bias to the output. Default: True.
**kw: dict; Any additional kwargs are passed down to torch.nn.ConvNd, where N can be 1, 2, or 3, as well as to warm.engine.forward. Refer to their docs for details. Some of the additional ConvNd arguments: stride, padding, dilation, groups.
return: Tensor; With shape (Batch, Size, *), where * is 1d, 2d, or 3d, matching x.
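A sketch of a 1d convolution (same assumed context as the module-level example; padding is one of the ConvNd arguments forwarded via **kw):

```python
x = torch.randn(4, 8, 100)                   # (Batch, Channel, Length)
y = W.conv(x, size=16, kernel=5, padding=2)  # (4, 16, 100); padding keeps Length
```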
dropout
def dropout(x, rate=0.5, by_channel=False, **kw)

During training, randomly zeros part of the input tensor x with probability rate.

x: Tensor; Can be of any shape if by_channel is false, or 2d and up if by_channel is true.
rate: float; The probability of dropout. Default: 0.5.
by_channel: bool; If true, entire channels are dropped out (all 'D' dimensions will be 0 if x is 'BCD'). Setting by_channel to true requires x to be 2d or more.
inplace: bool; If true, the operation is performed in place and the input x is altered.
return: Tensor; Same shape as x.
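A sketch (same assumed context):

```python
x = torch.randn(4, 8, 32)
y = W.dropout(x, rate=0.2)                   # zeros ~20% of elements in training
z = W.dropout(x, rate=0.2, by_channel=True)  # zeros whole channels instead
```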
embedding
def embedding(x, size, vocabulary=None, **kw)

The input is usually a tensor of indices (integers), and the output maps each index to a dense vector. The output therefore always has one more dimension than the input: for input with shape (*), the output has shape (*, size). Any shape specifications in the kwargs are ignored.

x: Tensor; Contains indices into the vocabulary. Will be converted to a LongTensor of integers. Can be of any shape.
size: int; The size of each embedding vector.
vocabulary: int or None; The size of the embedding vocabulary, i.e. the max number of unique indices in x. By default it is set to max(x)-min(x)+1.
**kw: dict; Any additional kwargs are passed down to torch.nn.Embedding, as well as to warm.engine.forward.
return: Tensor; With the embedding dimension appended to the shape of x, i.e. shape (*, Size), where * is the shape of x.
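A sketch (same assumed context; token values illustrative):

```python
tokens = torch.tensor([[1, 2, 3], [4, 5, 0]])        # (Batch=2, Length=3)
vecs = W.embedding(tokens, size=64, vocabulary=100)  # (2, 3, 64)
```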
gru
def gru(*arg, **kw)

Takes the same arguments as lstm, except that the recurrent state is a single hidden tensor: a GRU has no cell state.

x: Tensor or tuple; If a tuple, must be of format (x, h_0), where x is a 3d tensor with shape (Batch, Channel, Length).
size: int; Size of hidden features, and the size of the output channel.
init_weight_hh: None, str, callable, or 2-tuple; Initialization specification for the hidden-hidden weight tensor. If a str, it should name one of the initialization functions in torch.nn.init. If a callable, it is applied to the weight tensor directly, i.e. spec(weight). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(weight, **kwargs). Default: 'orthogonal_'.
init_weight_ih: None, str, callable, or 2-tuple; Initialization specification for the input-hidden weight tensor. Default: None, in which case the weight tensor is initialized using torch.nn.GRU's default scheme.
init_bias_hh: None, str, callable, or 2-tuple; Initialization specification for the hidden-hidden bias tensor. Default: None, in which case the bias tensor is initialized using torch.nn.GRU's default scheme.
init_bias_ih: None, str, callable, or 2-tuple; Initialization specification for the input-hidden bias tensor. Default: None, in which case the bias tensor is initialized using torch.nn.GRU's default scheme.
bias: bool; If False, the layer does not use bias_ih and bias_hh. Default: True.
num_layers: int; Number of recurrent layers. Default: 1.
tuple_out: bool; If True, the returned value is a tuple (out, h_n). Default: False.
**kw: dict; Any additional kwargs are passed down to torch.nn.GRU, as well as to warm.engine.forward. Refer to their docs for details. Some of the additional GRU arguments: dropout, bidirectional, batch_first.
return: Tensor or tuple; If tuple_out is set to true, returns (out, h_n), otherwise just out. out has shape (Batch, Size*Directions, Length), where Directions = 2 if bidirectional else 1. h_n is the hidden states, with shape (num_layers*Directions, Batch, Size).
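A sketch (same assumed context):

```python
x = torch.randn(4, 8, 20)                # (Batch, Channel, Length)
out, h_n = W.gru(x, 32, tuple_out=True)  # out: (4, 32, 20); h_n: (1, 4, 32)
```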
identity
def identity(x, *arg, **kw)

Returns x unchanged. Any additional arguments are accepted and ignored, which makes this useful as a drop-in no-op.
layer_norm
def layer_norm(x, dim=1, **kw)

x: Tensor; Can be of any shape.
dim: int or list of int; Dimensions to be normalized. Default: 1.
**kw: dict; Any additional kwargs are passed down to torch.nn.LayerNorm, as well as to warm.engine.forward.
return: Tensor; Same shape as x.
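A sketch (same assumed context):

```python
x = torch.randn(4, 8, 32)
y = W.layer_norm(x)  # normalizes over dim 1 (Channel); same shape as x
```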
linear
def linear(x, size, init_weight=None, init_bias=None, bias=True, **kw)

x: Tensor; 2d or more, with shape (Batch, Channel, *), where * means any number of additional dimensions.
size: int; Size of hidden features, and the size of the output channel.
init_weight: None, str, callable, or 2-tuple; Initialization specification for the weight tensor. If a str, it should name one of the initialization functions in torch.nn.init. If a callable, it is applied to the weight tensor directly, i.e. spec(weight). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(weight, **kwargs). Default: None, in which case the weight tensor is initialized using torch.nn.Linear's default scheme.
init_bias: None, str, callable, or 2-tuple; Same as init_weight, but for the bias tensor.
bias: bool; If True, adds a learnable bias to the output. Default: True.
**kw: dict; Any additional kwargs are passed down to warm.engine.forward. Refer to its docs for details.
return: Tensor; With shape (Batch, Size, *), where * matches the additional dimensions of x.
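A sketch (same assumed context; the init_weight string names a torch.nn.init function):

```python
x = torch.randn(4, 8, 32)                           # (Batch, Channel, Length)
y = W.linear(x, 16, init_weight='xavier_uniform_')  # (4, 16, 32)
```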
lstm
def lstm(x, size, init_weight_hh='orthogonal_', init_weight_ih=None, init_bias_hh=None, init_bias_ih=None, bias=True, num_layers=1, **kw)

x: Tensor or tuple; If a tuple, must be of format (x, (h_0, c_0)), where x is a 3d tensor with shape (Batch, Channel, Length).
size: int; Size of hidden features, and the size of the output channel.
init_weight_hh: None, str, callable, or 2-tuple; Initialization specification for the hidden-hidden weight tensor. If a str, it should name one of the initialization functions in torch.nn.init. If a callable, it is applied to the weight tensor directly, i.e. spec(weight). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(weight, **kwargs). Default: 'orthogonal_'.
init_weight_ih: None, str, callable, or 2-tuple; Initialization specification for the input-hidden weight tensor. Default: None, in which case the weight tensor is initialized using torch.nn.LSTM's default scheme.
init_bias_hh: None, str, callable, or 2-tuple; Initialization specification for the hidden-hidden bias tensor. Default: None, in which case the bias tensor is initialized using torch.nn.LSTM's default scheme.
init_bias_ih: None, str, callable, or 2-tuple; Initialization specification for the input-hidden bias tensor. Default: None, in which case the bias tensor is initialized using torch.nn.LSTM's default scheme.
bias: bool; If False, the layer does not use bias_ih and bias_hh. Default: True.
num_layers: int; Number of recurrent layers. Default: 1.
tuple_out: bool; If True, the returned value is a tuple (out, (h_n, c_n)). Default: False.
**kw: dict; Any additional kwargs are passed down to torch.nn.LSTM, as well as to warm.engine.forward. Refer to their docs for details. Some of the additional LSTM arguments: dropout, bidirectional, batch_first.
return: Tensor or tuple; If tuple_out is set to true, returns (out, (h_n, c_n)), otherwise just out. out has shape (Batch, Size*Directions, Length), where Directions = 2 if bidirectional else 1. h_n is the hidden states, with shape (num_layers*Directions, Batch, Size). c_n is the cell states, with shape (num_layers*Directions, Batch, Size).
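A sketch (same assumed context):

```python
x = torch.randn(4, 8, 20)  # (Batch, Channel, Length)
out, (h_n, c_n) = W.lstm(x, 32, num_layers=2, tuple_out=True)
# out: (4, 32, 20); h_n and c_n: (2, 4, 32)
```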
transformer
def transformer(x, y=None, num_encoder=6, num_decoder=6, num_head=8, mask=None, causal=False, in_shape='BCD', **kw)

This layer covers the functionality of Transformer, TransformerEncoder, and TransformerDecoder. See torch.nn.Transformer for more details.

x: Tensor; The source sequence, with shape (Batch, Channel, LengthX). Channel usually comes from an embedding.
y: None or Tensor; The target sequence, with shape (Batch, Channel, LengthY). If not given, defaults to x.
num_encoder: int; Number of encoder layers. Set to 0 to disable the encoder and use only the decoder. Default: 6.
num_decoder: int; Number of decoder layers. Set to 0 to disable the decoder and use only the encoder. Default: 6.
num_head: int; Number of heads for multi-headed attention. Default: 8.
mask: None or dict; Keys are among: src_mask, tgt_mask, memory_mask, src_key_padding_mask, tgt_key_padding_mask, memory_key_padding_mask. See the forward method of torch.nn.Transformer for details.
causal: bool; If true, adds causal masks to source and target, so that the current value depends only on the past, not the future, of the sequence. Default: false.
**kw: dict; Any additional kwargs are passed down to torch.nn.Transformer, as well as to warm.engine.forward.
return: Tensor; Same shape as y if num_decoder > 0, otherwise same shape as x.
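A sketch (same assumed context; Channel plays the role of the model dimension and must be divisible by num_head):

```python
src = torch.randn(2, 512, 10)               # (Batch, Channel, LengthX)
tgt = torch.randn(2, 512, 7)                # (Batch, Channel, LengthY)
out = W.transformer(src, tgt, causal=True)  # (2, 512, 7), same shape as tgt
```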