# Module warm.functional

Wraps various `torch.nn` modules to fit into a functional interface.

## Functions

### batch_norm

```
def batch_norm(x, **kw)
```

- `x: Tensor`; 2d or more, with shape `(Batch, Channel, *)`, where `*` means any number of additional dimensions.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.BatchNormNd`, where N can be 1, 2, or 3, as well as to `warm.engine.forward`. Refer to their docs for details. Some of the additional BatchNorm arguments: `eps, momentum, affine, track_running_stats`.
- `return: Tensor`; Same shape as input `x`.
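
The choice of N in `torch.nn.BatchNormNd` follows from the number of dimensions of `x`. The sketch below shows one plausible mapping, based on the input shapes that the `torch.nn` BatchNorm classes accept. The helper `batch_norm_variant` is hypothetical and not part of `warm.functional`; how the library actually dispatches is an assumption here.

```python
def batch_norm_variant(shape):
    """Return (class name, num_features) for an input of the given shape.

    Hypothetical helper for illustration only; not part of warm.functional.
    """
    ndim = len(shape)
    if ndim < 2:
        raise ValueError('batch_norm needs at least (Batch, Channel)')
    if ndim > 5:
        raise ValueError('torch.nn offers BatchNorm for at most 3 extra dimensions')
    # BatchNorm1d accepts both (Batch, Channel) and (Batch, Channel, Length),
    # so a 2d input also maps to N = 1.
    n = max(ndim - 2, 1)
    # The channel dimension is assumed to be dimension 1, as documented above.
    return f'BatchNorm{n}d', shape[1]


print(batch_norm_variant((8, 3, 32, 32)))
```

The second element of the result is the channel count, which is the `num_features` argument that the `torch.nn` BatchNorm classes expect.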

### conv

```
def conv(
    x,
    size,
    kernel,
    init_weight=None,
    init_bias=None,
    bias=True,
    **kw)
```

- `x: Tensor`; With shape `(Batch, Channel, *)`, where `*` can be 1d, 2d, or 3d. If `x` is 3d, the shape is `(Batch, Channel, Length)`. If 4d, the shape is `(Batch, Channel, Height, Width)`. If 5d, the shape is `(Batch, Channel, Depth, Height, Width)`.
- `size: int`; Size of hidden filters, and size of the output channel.
- `kernel: int or tuple`; Size of the convolution kernel.
- `init_weight: None or str or callable`; Initialization specification for the weight tensor. If a `str`, it should be the name of one of the initialization functions contained in `torch.nn.init`. If a `callable`, it will be applied to the tensor directly, i.e. `spec(x)`, where `x` here is the tensor being initialized. If a 2-`tuple`, it must be of format `(callable, kwargs)`, i.e. `callable(x, **kwargs)`. Default: `None`, in which case the weight tensor is initialized using the default scheme of `torch.nn.ConvNd`.
- `init_bias: None or str or callable`; Same as `init_weight`, but for the bias tensor.
- `bias: bool`; If `True`, adds a learnable bias to the output. Default: `True`.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.ConvNd`, where N can be 1, 2, or 3, as well as to `warm.engine.forward`. Refer to their docs for details. Some of the additional ConvNd arguments: `stride, padding, dilation, groups`.
- `return: Tensor`; With shape `(Batch, Size, *)`, where `*` can be 1d, 2d, or 3d, depending on `x`.
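
The trailing `*` dimensions of the result depend on `kernel` and on the `stride`, `padding`, and `dilation` arguments. As a rough guide, the sketch below applies the output-size formula given in the `torch.nn.ConvNd` documentation. The helper `conv_output_shape` is hypothetical, and it assumes that these arguments reach `torch.nn.ConvNd` unchanged and that `padding` is numeric rather than a string mode.

```python
def conv_output_shape(in_shape, size, kernel, stride=1, padding=0, dilation=1):
    """Predict the output shape of a convolution on a (Batch, Channel, *) input.

    Hypothetical helper for illustration only; not part of warm.functional.
    """
    batch, _channel, *spatial = in_shape
    n = len(spatial)
    if n not in (1, 2, 3):
        raise ValueError('expected a 3d, 4d, or 5d input shape')

    def per_dim(value):
        # An int applies to every spatial dimension; a tuple gives one value each.
        return tuple(value) if isinstance(value, (tuple, list)) else (value,) * n

    k, s, p, d = (per_dim(v) for v in (kernel, stride, padding, dilation))
    # Output-size formula from the torch.nn.ConvNd docs, applied per dimension.
    out = tuple(
        (length + 2 * p[i] - d[i] * (k[i] - 1) - 1) // s[i] + 1
        for i, length in enumerate(spatial))
    return (batch, size, *out)


print(conv_output_shape((8, 3, 32, 32), size=16, kernel=3))
```

The input channel count does not affect the result shape; only `size` sets the output channel dimension.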

### dropout

```
def dropout(
    x,
    rate=0.5,
    by_channel=False,
    **kw)
```

During training, randomly zeros part of the input tensor `x`, with probability `rate`.

- `x: Tensor`; Can be of any shape if `by_channel` is false, or 2d and up if `by_channel` is true.
- `rate: float`; The probability of dropout. Default: 0.5.
- `by_channel: bool`; If true, will drop out entire channels (all `'D'` dimensions will be 0 if `x` is `'BCD'`). Setting `by_channel` to true requires `x` to be 2d or more.
- `inplace: bool`; If true, the operation will be in-place and the input `x` will be altered.
- `return: Tensor`; Same shape as `x`.
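
To make the difference between element-wise and channel-wise dropout concrete, here is a minimal pure-Python sketch on nested lists laid out as `(Batch, Channel, D)`. The function `dropout_bcd` and its `rng` parameter are hypothetical. Surviving values are scaled by `1 / (1 - rate)`, which is what the `torch.nn` dropout modules do during training; that `warm.functional.dropout` inherits this scaling is an assumption.

```python
import random


def dropout_bcd(x, rate=0.5, by_channel=False, rng=None):
    """Training-mode dropout on a nested list shaped (Batch, Channel, D).

    Hypothetical helper for illustration only; not part of warm.functional.
    """
    if not 0 <= rate < 1:
        raise ValueError('rate must be in [0, 1)')
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - rate)  # keeps the expected value of each element unchanged
    out = []
    for sample in x:
        new_sample = []
        for channel in sample:
            if by_channel:
                # One decision per channel: the whole 'D' row is kept or zeroed.
                keep = rng.random() >= rate
                new_sample.append([v * scale if keep else 0.0 for v in channel])
            else:
                # One decision per element.
                new_sample.append(
                    [v * scale if rng.random() >= rate else 0.0 for v in channel])
        out.append(new_sample)
    return out


x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
print(dropout_bcd(x, rate=0.5, by_channel=True, rng=random.Random(0)))
```

With `by_channel=True`, each channel in the result is either all zeros or a scaled copy of the input channel.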

### embedding

```
def embedding(
    x,
    size,
    vocabulary=None,
    **kw)
```

The input is usually a list of indices (integers), and the output is a dense matrix which maps indices to dense vectors. Thus the output will have 1 more dimension than the input.

**Note**: The output of this function always has one more dimension than the input. For input with shape `(*)`, the output will be `(*, size)`. Any shape specifications in the keyword arguments are ignored.

- `x: Tensor`; Contains indices into the vocabulary. Will be converted to a `LongTensor` of integers. Can be of any shape.
- `size: int`; The size of the embedding vector.
- `vocabulary: int or None`; The size of the embedding vocabulary, or the maximum number of unique indices in `x`. By default it is set to `max(x)-min(x)+1`.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.Embedding`, as well as to `warm.engine.forward`.
- `return: Tensor`; With the embedded dimension appended to the shape of `x`. Thus with shape `(*, Size)`, where `*` is the shape of `x`.
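
The default vocabulary size and the resulting shape can be worked out by hand. The sketch below does this on plain nested lists; `default_vocabulary` and `embedded_shape` are hypothetical helpers that only restate the rules documented above.

```python
def flatten(nested):
    """Yield the scalar items of an arbitrarily nested list or tuple."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def default_vocabulary(indices):
    """Documented default for `vocabulary`: max(x) - min(x) + 1."""
    flat = list(flatten(indices))
    return max(flat) - min(flat) + 1


def embedded_shape(in_shape, size):
    """The embedding dimension is appended to the input shape: (*) -> (*, size)."""
    return (*in_shape, size)


indices = [[3, 7], [5, 4]]
print(default_vocabulary(indices))     # 7 - 3 + 1
print(embedded_shape((2, 2), size=16))
```

Note that when the smallest index is not 0, the default can be smaller than the largest index plus one. A plain `torch.nn.Embedding` requires every index to be below its `num_embeddings`, so passing `vocabulary` explicitly may be the safer choice in that situation; whether the library compensates for this is not stated here.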

### gru

```
def gru(*arg, **kw)
```

- `x: Tensor or tuple`; If a tuple, it must be of format `(x, (h_0, c_0))`, where `x` is a 3d tensor with shape `(Batch, Channel, Length)`.
- `size: int`; Size of hidden features, and size of the output channel.
- `init_weight_hh: None or str or callable`; Initialization specification for the hidden-hidden weight tensor. If a `str`, it should be the name of one of the initialization functions contained in `torch.nn.init`. If a `callable`, it will be applied to the tensor directly, i.e. `spec(x)`, where `x` here is the tensor being initialized. If a 2-`tuple`, it must be of format `(callable, kwargs)`, i.e. `callable(x, **kwargs)`. Default: `'orthogonal_'`.
- `init_weight_ih: None or str or callable`; Initialization specification for the input-hidden weight tensor. Default: `None`, in which case the weight tensor is initialized using the default scheme of `torch.nn.GRU`.
- `init_bias_hh: None or str or callable`; Initialization specification for the hidden-hidden bias tensor. Default: `None`, in which case the bias tensor is initialized using the default scheme of `torch.nn.GRU`.
- `init_bias_ih: None or str or callable`; Initialization specification for the input-hidden bias tensor. Default: `None`, in which case the bias tensor is initialized using the default scheme of `torch.nn.GRU`.
- `bias: bool`; If `False`, then the layer does not use `bias_ih` and `bias_hh`. Default: `True`.
- `num_layers: int`; Number of recurrent layers. Default: 1.
- `tuple_out: bool`; If `True`, the returned value will be a tuple `(out, (h_n, c_n))`. Default: `False`.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.GRU`, as well as to `warm.engine.forward`. Refer to their docs for details. Some of the additional GRU arguments: `dropout, bidirectional, batch_first`.
- `return: Tensor or tuple`; If `tuple_out` is set to true, returns `(out, (h_n, c_n))`, otherwise just `out`. `out` has shape `(Batch, Size, Length*Directions)`, where Directions = 2 if `bidirectional` else 1. `h_n` is the hidden states, with shape `(num_layers*Directions, Batch, Size)`. `c_n` is the cell states, with shape `(num_layers*Directions, Batch, Size)`.

### identity

```
def identity(x, *arg, **kw)
```

### layer_norm

```
def layer_norm(x, dim=1, **kw)
```

- `x: Tensor`; Can be of any shape.
- `dim: int or list of int`; Dimensions to be normalized. Default: 1.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.LayerNorm`, as well as to `warm.engine.forward`.
- `return: Tensor`; Same shape as `x`.
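
For intuition, the following pure-Python sketch normalizes a `(Batch, Channel)` nested list over dimension 1, which matches the default `dim=1`. It uses the formula from the `torch.nn.LayerNorm` documentation (biased variance, `eps=1e-5`) and leaves out the learnable affine parameters. The function names are hypothetical.

```python
import math


def layer_norm_1d(values, eps=1e-5):
    """Normalize one list of values to zero mean and (near) unit variance."""
    n = len(values)
    mean = sum(values) / n
    # Biased variance (divide by n), as in torch.nn.LayerNorm.
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm_bc(x, eps=1e-5):
    """Apply layer_norm_1d over dimension 1 of a (Batch, Channel) nested list."""
    return [layer_norm_1d(row, eps) for row in x]


print(layer_norm_bc([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]))
```

Each row is normalized independently of the others, so the result does not depend on the batch size. A constant row maps to zeros, because `eps` keeps the denominator non-zero.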

### linear

```
def linear(
    x,
    size,
    init_weight=None,
    init_bias=None,
    bias=True,
    **kw)
```

- `x: Tensor`; 2d or more, with shape `(Batch, Channel, *)`, where `*` means any number of additional dimensions.
- `size: int`; Size of hidden features, and size of the output channel.
- `init_weight: None or str or callable`; Initialization specification for the weight tensor. If a `str`, it should be the name of one of the initialization functions contained in `torch.nn.init`. If a `callable`, it will be applied to the tensor directly, i.e. `spec(x)`, where `x` here is the tensor being initialized. If a 2-`tuple`, it must be of format `(callable, kwargs)`, i.e. `callable(x, **kwargs)`. Default: `None`, in which case the weight tensor is initialized using the default scheme of `torch.nn.Linear`.
- `init_bias: None or str or callable`; Same as `init_weight`, but for the bias tensor.
- `bias: bool`; If `True`, adds a learnable bias to the output. Default: `True`.
- `**kw: dict`; Any additional keyword arguments are passed down to `warm.engine.forward`. Refer to its docs for details.
- `return: Tensor`; With shape `(Batch, Size, *)`, where `*` can be 1d, 2d, or 3d, depending on `x`.
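
The `init_weight` and `init_bias` specifications accept four forms: `None`, a `str`, a callable, or a `(callable, kwargs)` tuple. The sketch below shows how such a specification could be resolved. It is a hypothetical stand-in, not the library's code: `apply_init` is an invented name, plain lists replace tensors, and the `registry` dict plays the role of `torch.nn.init`. The toy initializers borrow the names of `torch.nn.init.zeros_` and `torch.nn.init.constant_`.

```python
def zeros_(t):
    """Toy in-place initializer: fill the list with zeros."""
    t[:] = [0.0] * len(t)
    return t


def constant_(t, val):
    """Toy in-place initializer: fill the list with `val`."""
    t[:] = [val] * len(t)
    return t


def apply_init(tensor, spec, registry):
    """Resolve an initialization spec and apply it to `tensor`.

    Hypothetical helper for illustration only; `registry` stands in for torch.nn.init.
    """
    if spec is None:
        return tensor                    # keep the default initialization
    if isinstance(spec, str):
        return registry[spec](tensor)    # look the function up by name
    if callable(spec):
        return spec(tensor)              # spec(x)
    if isinstance(spec, tuple) and len(spec) == 2:
        fn, kwargs = spec
        return fn(tensor, **kwargs)      # callable(x, **kwargs)
    raise TypeError(f'unsupported init spec: {spec!r}')


registry = {'zeros_': zeros_, 'constant_': constant_}
print(apply_init([1.0, 2.0], (constant_, {'val': 0.5}), registry))
```

The tuple form is the one to reach for when the initializer needs extra arguments, since the plain callable form receives only the tensor.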

### lstm

```
def lstm(
    x,
    size,
    init_weight_hh='orthogonal_',
    init_weight_ih=None,
    init_bias_hh=None,
    init_bias_ih=None,
    bias=True,
    num_layers=1,
    **kw)
```

- `x: Tensor or tuple`; If a tuple, it must be of format `(x, (h_0, c_0))`, where `x` is a 3d tensor with shape `(Batch, Channel, Length)`.
- `size: int`; Size of hidden features, and size of the output channel.
- `init_weight_hh: None or str or callable`; Initialization specification for the hidden-hidden weight tensor. If a `str`, it should be the name of one of the initialization functions contained in `torch.nn.init`. If a `callable`, it will be applied to the tensor directly, i.e. `spec(x)`, where `x` here is the tensor being initialized. If a 2-`tuple`, it must be of format `(callable, kwargs)`, i.e. `callable(x, **kwargs)`. Default: `'orthogonal_'`.
- `init_weight_ih: None or str or callable`; Initialization specification for the input-hidden weight tensor. Default: `None`, in which case the weight tensor is initialized using the default scheme of `torch.nn.LSTM`.
- `init_bias_hh: None or str or callable`; Initialization specification for the hidden-hidden bias tensor. Default: `None`, in which case the bias tensor is initialized using the default scheme of `torch.nn.LSTM`.
- `init_bias_ih: None or str or callable`; Initialization specification for the input-hidden bias tensor. Default: `None`, in which case the bias tensor is initialized using the default scheme of `torch.nn.LSTM`.
- `bias: bool`; If `False`, then the layer does not use `bias_ih` and `bias_hh`. Default: `True`.
- `num_layers: int`; Number of recurrent layers. Default: 1.
- `tuple_out: bool`; If `True`, the returned value will be a tuple `(out, (h_n, c_n))`. Default: `False`.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.LSTM`, as well as to `warm.engine.forward`. Refer to their docs for details. Some of the additional LSTM arguments: `dropout, bidirectional, batch_first`.
- `return: Tensor or tuple`; If `tuple_out` is set to true, returns `(out, (h_n, c_n))`, otherwise just `out`. `out` has shape `(Batch, Size, Length*Directions)`, where Directions = 2 if `bidirectional` else 1. `h_n` is the hidden states, with shape `(num_layers*Directions, Batch, Size)`. `c_n` is the cell states, with shape `(num_layers*Directions, Batch, Size)`.
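
The return shapes depend on `num_layers` and `bidirectional` together. The small helper below only restates the shapes documented above, so it is a bookkeeping aid rather than a description of the implementation; the name `lstm_output_shapes` is hypothetical.

```python
def lstm_output_shapes(batch, length, size, num_layers=1, bidirectional=False):
    """Return the documented shapes of `out`, `h_n`, and `c_n`.

    Hypothetical helper for illustration only; it restates the documentation.
    """
    directions = 2 if bidirectional else 1
    state = (num_layers * directions, batch, size)
    return {
        'out': (batch, size, length * directions),
        'h_n': state,
        'c_n': state,  # the cell state has the same shape as the hidden state
    }


print(lstm_output_shapes(batch=4, length=10, size=32, num_layers=2, bidirectional=True))
```

A helper like this can be handy for checking the sizes of `h_0` and `c_0` before passing them in the `(x, (h_0, c_0))` tuple form.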

### transformer

```
def transformer(
    x,
    y=None,
    num_encoder=6,
    num_decoder=6,
    num_head=8,
    mask=None,
    causal=False,
    in_shape='BCD',
    **kw)
```

This layer covers the functionality of `Transformer`, `TransformerEncoder`, and `TransformerDecoder`. See `torch.nn.Transformer` for more details.

- `x: Tensor`; The source sequence, with shape `(Batch, Channel, LengthX)`. `Channel` is usually from embedding.
- `y: None or Tensor`; The target sequence, also with shape `(Batch, Channel, LengthY)`. If not given, defaults to `x`.
- `num_encoder: int`; Number of encoder layers. Set to 0 to disable the encoder and use only the decoder. Default: 6.
- `num_decoder: int`; Number of decoder layers. Set to 0 to disable the decoder and use only the encoder. Default: 6.
- `num_head: int`; Number of heads for multi-headed attention. Default: 8.
- `mask: None or dict`; Keys are among: `src_mask`, `tgt_mask`, `memory_mask`, `src_key_padding_mask`, `tgt_key_padding_mask`, `memory_key_padding_mask`. See the `forward` method of `torch.nn.Transformer` for details.
- `causal: bool`; If true, adds causal masks to the source and target, so that the current value depends only on the past, not the future, in the sequences. Default: false.
- `**kw: dict`; Any additional keyword arguments are passed down to `torch.nn.Transformer`, as well as to `warm.engine.forward`.
- `return: Tensor`; Same shape as `y` if `num_decoder` > 0, otherwise same shape as `x`.
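
A causal mask blocks every position from attending to later positions. The sketch below builds such a mask as nested lists, in both the boolean convention (where `True` marks a blocked position, as in the attention masks of `torch.nn.Transformer`) and the additive float convention (where blocked positions hold `-inf`). The function names are hypothetical, and the exact mask the library builds when `causal=True` is an assumption.

```python
def causal_mask_bool(length):
    """Square mask where entry [i][j] is True when position i must not see j."""
    return [[j > i for j in range(length)] for i in range(length)]


def causal_mask_float(length):
    """Additive variant: -inf for blocked positions, 0.0 elsewhere."""
    return [[float('-inf') if j > i else 0.0 for j in range(length)]
            for i in range(length)]


for row in causal_mask_bool(3):
    print(row)
```

Row `i` of the mask allows positions `0..i` and blocks the rest, so the first position sees only itself and the last position sees the whole sequence.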