Module warm.functional

Wraps around various torch.nn Modules to fit into a functional interface.
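A minimal usage sketch of how the functions below are typically combined: layers are called as plain functions inside a module's forward pass, while parameter creation is handled by the warm engine. The model class, the input shape, and the warm.up call are illustrative assumptions, not part of this module's documented API.

    import torch
    import torch.nn as nn
    import warm
    import warm.functional as W

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            # Assumed engine call: traces forward once so parameters can be created and registered.
            warm.up(self, [1, 1, 28, 28])

        def forward(self, x):
            x = W.conv(x, size=8, kernel=3)   # (Batch, 8, 26, 26)
            x = W.batch_norm(x)
            x = x.mean(dim=(2, 3))            # global average pool -> (Batch, 8)
            return W.linear(x, size=10)       # (Batch, 10)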

Functions


batch_norm

def batch_norm(
    x,
    **kw )
Batch Normalization layer.

  • x: Tensor; 2d or more, with shapes (Batch, Channel, *) where * means any number of additional dimensions.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.BatchNormNd, where N can be 1, 2 or 3, as well as warm.engine.forward. Refer to their docs for details. Some of the additional BatchNorm arguments: eps, momentum, affine, track_running_stats.
  • return: Tensor; Same shape as input x.
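A shape sketch for batch_norm; the import alias W, the example tensor, and the assumption that the call runs where the warm engine can create parameters (as in the module-level sketch above) are illustrative.

    import torch
    import warm.functional as W

    x = torch.randn(4, 8, 32)            # (Batch, Channel, Length): 3d input uses BatchNorm1d
    y = W.batch_norm(x, momentum=0.05)   # extra KWargs such as momentum go to torch.nn.BatchNormNd
    # y.shape == (4, 8, 32), same as x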

conv

def conv(
    x,
    size,
    kernel,
    init_weight=None,
    init_bias=None,
    bias=True,
    **kw )
Convolution layer.

  • x: Tensor; With shape (Batch, Channel, *) where * can be 1d, 2d or 3d. If x is 3d, shapes are (Batch, Channel, Length). If 4d, shapes are (Batch, Channel, Height, Width). If 5d, shapes are (Batch, Channel, Depth, Height, Width).
  • size: int; Size of hidden filters, and size of the output channel.
  • kernel: int or tuple; Size of the convolution kernel.
  • init_weight: None or str or callable; Initialization specification for the weight tensor. If a str, should be the name of one of the initialization functions in torch.nn.init. If a callable, it will be applied to the tensor directly, i.e. spec(tensor). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(tensor, **kwargs). Default: None, and the weight tensor is initialized using torch.nn.ConvNd's default scheme.
  • init_bias: None or str or callable; Same as init_weight, but for the bias tensor.
  • bias: bool; If True, adds a learnable bias to the output. Default: True.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.ConvNd, where N can be 1, 2 or 3, as well as warm.engine.forward. Refer to their docs for details. Some of the additional ConvNd arguments: stride, padding, dilation, groups.
  • return: Tensor; With shape (Batch, Size, *) where * can be 1d, 2d or 3d, depending on x.
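A shape sketch, assuming the same W alias and engine setup as above; padding here is one of the KWargs forwarded to torch.nn.Conv2d.

    import torch
    import warm.functional as W

    x = torch.randn(4, 3, 32, 32)                # (Batch, Channel, Height, Width) -> Conv2d
    y = W.conv(x, size=16, kernel=3, padding=1)  # 16 output channels, 3x3 kernel
    # y.shape == (4, 16, 32, 32): Channel becomes size; spatial dims follow the usual conv rules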

dropout

def dropout(
    x,
    rate=0.5,
    by_channel=False,
    **kw )
Dropout layer.

During training, randomly zeroes elements of the input tensor x with probability rate.

  • x: Tensor; Can be of any shape if by_channel is false, or 2d and up if by_channel is true.
  • rate: float; The probability of dropout. Default 0.5.
  • by_channel: bool; If true, entire channels are dropped out (all of the 'D' dimensions are zeroed if x is 'BCD'). Requires x to be 2d or more.
  • inplace: bool; If true, the operation will be in-place and the input x will be altered.
  • return: Tensor; Same shape as x.
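A sketch of the two dropout modes, with the same assumed W alias; dropout only takes effect while the parent module is in training mode.

    import torch
    import warm.functional as W

    x = torch.randn(4, 8, 32)                     # (Batch, Channel, Length)
    y = W.dropout(x, rate=0.2)                    # element-wise dropout
    z = W.dropout(x, rate=0.2, by_channel=True)   # zeroes whole channels at a time
    # y.shape == z.shape == x.shape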

embedding

def embedding(
    x,
    size,
    vocabulary=None,
    **kw )
Embedding layer.

The input is usually a tensor of indices (integers), and each index is mapped to a dense vector, so the output has one more dimension than the input.

Note: The output of this function always has one more dimension than the input. For input with shape (*), the output will be (*, size). Any shape specifications in the KWargs are ignored.

  • x: Tensor; Contains indices into the vocabulary. Will be converted to LongTensor of integers. Can be of any shape.
  • size: int; The size of embedding vector.
  • vocabulary: int or None; The size of vocabulary of embedding, or max number of unique indices in x. By default it is set to max(x)-min(x)+1.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.Embedding, as well as warm.engine.forward.
  • return: Tensor; With the embedded dim appended to the shape of x. Thus with shape (*, Size), where * is the shape of x.
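A shape sketch, assuming the W alias as above; the indices and the vocabulary size are made up for illustration.

    import torch
    import warm.functional as W

    x = torch.tensor([[3, 1, 4, 1], [5, 9, 2, 6]])   # (Batch, Length) of indices
    y = W.embedding(x, size=16, vocabulary=10)       # 10 possible indices, 16-dim vectors
    # y.shape == (2, 4, 16): the embedding dim is appended to the shape of x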

gru

def gru(
    *arg,
    **kw )
Gated Recurrent Unit layer.

  • x: Tensor or tuple; If tuple, must be of format (x, h_0), where x is a 3d tensor, with shapes (Batch, Channel, Length).
  • size: int; Size of hidden features, and size of the output channel.
  • init_weight_hh: None or str or callable; Initialization specification for the hidden-hidden weight tensor. If a str, should be the name of one of the initialization functions in torch.nn.init. If a callable, it will be applied to the tensor directly, i.e. spec(tensor). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(tensor, **kwargs). Default: 'orthogonal_'.
  • init_weight_ih: None or str or callable; Initialization specification for the input-hidden weight tensor. Default: None, and the weight tensor is initialized using torch.nn.GRU's default scheme.
  • init_bias_hh: None or str or callable; Initialization specification for the hidden-hidden bias tensor. Default: None, and the bias tensor is initialized using torch.nn.GRU's default scheme.
  • init_bias_ih: None or str or callable; Initialization specification for the input-hidden bias tensor. Default: None, and the bias tensor is initialized using torch.nn.GRU's default scheme.
  • bias: bool; If False, then the layer does not use bias_ih and bias_hh. Default: True.
  • num_layers: int; Number of the recurrent layers. Default: 1.
  • tuple_out: bool; If True, the returned value will be a tuple (out, h_n). Default: False.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.GRU, as well as warm.engine.forward. Refer to their docs for details. Some of the additional GRU arguments: dropout, bidirectional, batch_first.
  • return: Tensor or tuple; If tuple_out is set to true, will return (out, h_n), otherwise just out. out has shape (Batch, Size, Length*Directions), where Directions = 2 if bidirectional else 1. h_n is the hidden states with shape (num_layers*Directions, Batch, Size).
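A shape sketch, again assuming the W alias and an engine context as in the module-level example.

    import torch
    import warm.functional as W

    x = torch.randn(4, 8, 50)       # (Batch, Channel, Length)
    out = W.gru(x, size=32)         # out.shape == (4, 32, 50)
    out2 = W.gru(x, size=32, num_layers=2, bidirectional=True)  # extra KWargs forwarded to torch.nn.GRU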

identity

def identity(
    x,
    *arg,
    **kw )
Identity layer that returns the first input and ignores the remaining arguments.


layer_norm

def layer_norm(
    x,
    dim=1,
    **kw )
Layer Normalization.

  • x: Tensor; Can be of any shape.
  • dim: int or list of int; Dimensions to be normalized. Default: 1.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.LayerNorm, as well as warm.engine.forward.
  • return: Tensor; Same shape as x.
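A shape sketch with the same assumed W alias; dim selects which dimensions are normalized.

    import torch
    import warm.functional as W

    x = torch.randn(4, 8, 32)         # (Batch, Channel, Length)
    y = W.layer_norm(x)               # normalize over the Channel dimension (dim=1)
    z = W.layer_norm(x, dim=[1, 2])   # normalize over Channel and Length jointly
    # y.shape == z.shape == x.shape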

linear

def linear(
    x,
    size,
    init_weight=None,
    init_bias=None,
    bias=True,
    **kw )
Linear transformation layer.

  • x: Tensor; 2d or more, with shapes (Batch, Channel, *) where * means any number of additional dimensions.
  • size: int; Size of hidden features, and size of the output channel.
  • init_weight: None or str or callable; Initialization specification for the weight tensor. If a str, should be the name of one of the initialization functions in torch.nn.init. If a callable, it will be applied to the tensor directly, i.e. spec(tensor). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(tensor, **kwargs). Default: None, and the weight tensor is initialized using torch.nn.Linear's default scheme.
  • init_bias: None or str or callable; Same as init_weight, but for the bias tensor.
  • bias: bool; If True, adds a learnable bias to the output. Default: True.
  • **kw:dict; Any additional KWargs are passed down to warm.engine.forward. Refer to its docs for details.
  • return: Tensor; With shape (Batch, Size, *) where * can be 1d, 2d or 3d, depending on x.
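A shape sketch under the same assumed setup; 'xavier_uniform_' is one example of an initializer name from torch.nn.init.

    import torch
    import warm.functional as W

    x = torch.randn(4, 8, 32)                                 # (Batch, Channel, Length)
    y = W.linear(x, size=64)                                  # y.shape == (4, 64, 32)
    z = W.linear(x, size=64, init_weight='xavier_uniform_')   # named initializer for the weight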

lstm

def lstm(
    x,
    size,
    init_weight_hh='orthogonal_',
    init_weight_ih=None,
    init_bias_hh=None,
    init_bias_ih=None,
    bias=True,
    num_layers=1,
    **kw )
Long Short Term Memory layer.

  • x: Tensor or tuple; If tuple, must be of format (x, (h_0, c_0)), where x is a 3d tensor, with shapes (Batch, Channel, Length).
  • size: int; Size of hidden features, and size of the output channel.
  • init_weight_hh: None or str or callable; Initialization specification for the hidden-hidden weight tensor. If a str, should be the name of one of the initialization functions in torch.nn.init. If a callable, it will be applied to the tensor directly, i.e. spec(tensor). If a 2-tuple, it must be of format (callable, kwargs), i.e. callable(tensor, **kwargs). Default: 'orthogonal_'.
  • init_weight_ih: None or str or callable; Initialization specification for the input-hidden weight tensor. Default: None, and the weight tensor is initialized using torch.nn.LSTM's default scheme.
  • init_bias_hh: None or str or callable; Initialization specification for the hidden-hidden bias tensor. Default: None, and the bias tensor is initialized using torch.nn.LSTM's default scheme.
  • init_bias_ih: None or str or callable; Initialization specification for the input-hidden bias tensor. Default: None, and the bias tensor is initialized using torch.nn.LSTM's default scheme.
  • bias: bool; If False, then the layer does not use bias_ih and bias_hh. Default: True.
  • num_layers: int; Number of the recurrent layers. Default: 1.
  • tuple_out: bool; If True, the returned value will be a tuple (out, (h_n, c_n)). Default: False.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.LSTM, as well as warm.engine.forward. Refer to their docs for details. Some of the additional LSTM arguments: dropout, bidirectional, batch_first.
  • return: Tensor or tuple; If tuple_out is set to true, will return (out, (h_n, c_n)), otherwise just out. out has shape (Batch, Size, Length*Directions), where Directions = 2 if bidirectional else 1. h_n is the hidden states with shape (num_layers*Directions, Batch, Size). c_n is the cell states with shape (num_layers*Directions, Batch, Size).
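A shape sketch, with the same assumed W alias and engine context.

    import torch
    import warm.functional as W

    x = torch.randn(4, 8, 50)                  # (Batch, Channel, Length)
    out = W.lstm(x, size=32)                   # out.shape == (4, 32, 50)
    out, (h_n, c_n) = W.lstm(x, size=32, num_layers=2, tuple_out=True)
    # h_n.shape == c_n.shape == (2, 4, 32)     # (num_layers*Directions, Batch, Size)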

transformer

def transformer(
    x,
    y=None,
    num_encoder=6,
    num_decoder=6,
    num_head=8,
    mask=None,
    causal=False,
    in_shape='BCD',
    **kw )
Transformer layer.

This layer covers functionality of Transformer, TransformerEncoder, and TransformerDecoder. See torch.nn.Transformer for more details.

  • x: Tensor; The source sequence, with shape (Batch, Channel, LengthX). Channel is usually from embedding.
  • y: None or Tensor; The target sequence, also with shape (Batch, Channel, LengthY). If not given, defaults to x.
  • num_encoder: int; Number of encoder layers. Set to 0 to disable encoder and use only decoder. Default 6.
  • num_decoder: int; Number of decoder layers. Set to 0 to disable decoder and use only encoder. Default 6.
  • num_head: int; Number of heads for multi-headed attention. Default 8.
  • mask: None or dict; Keys are among: src_mask, tgt_mask, memory_mask, src_key_padding_mask, tgt_key_padding_mask, memory_key_padding_mask. See the forward method of torch.nn.Transformer for details.
  • causal: bool; Default: False. If true, adds causal masks to the source and target so that the current position in each sequence depends only on past positions, not future ones.
  • **kw: dict; Any additional KWargs are passed down to torch.nn.Transformer, as well as warm.engine.forward.
  • return: Tensor; Same shape as y, if num_decoder > 0. Otherwise same shape as x.
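A final shape sketch under the same assumptions; the channel size of 512 and the default of 8 heads are illustrative (Channel must be divisible by num_head, as required by torch.nn.Transformer).

    import torch
    import warm.functional as W

    src = torch.randn(4, 512, 10)                # (Batch, Channel, LengthX), e.g. the output of embedding
    tgt = torch.randn(4, 512, 7)                 # (Batch, Channel, LengthY)
    y = W.transformer(src, tgt, causal=True)     # y.shape == (4, 512, 7), same as tgt
    enc = W.transformer(src, num_decoder=0)      # encoder-only: enc.shape == src.shape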