Module bastionlab.torch.psg.nn
Functions
expanded_convolution(conv_fn: Callable, tuple_fn: Callable[[~T], Tuple[int, ...]]) ‑> Callable
Classes
ConvLinear(in_features: int, out_features: int, max_batch_size: int, bias: bool = True, device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

Linear layer with expanded weights that internally uses an expanded 1D convolution.
Refer to the documentation of the convolution layers (Conv1d, Conv2d, Conv3d) for more about the internals, and to PyTorch's Linear layer documentation for more about the parameters and their usage.
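As a rough illustration of why a linear layer can be expressed as a 1D convolution, the sketch below treats each input feature as a channel of a length-1 signal and each weight row as a kernel of size 1. This is only an assumption about the general idea; the variable names are hypothetical and the actual ConvLinear internals may differ.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, in_features, out_features = 4, 3, 2

x = torch.randn(batch, in_features)
weight = torch.randn(out_features, in_features)
bias = torch.randn(out_features)

# Input becomes (N, C_in, L=1); weight becomes (C_out, C_in, K=1).
conv_out = F.conv1d(x.unsqueeze(-1), weight.unsqueeze(-1), bias).squeeze(-1)

# Reference result from the ordinary linear operator.
linear_out = F.linear(x, weight, bias)
```

Both tensors have shape (batch, out_features) and agree up to floating-point error.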
Initializes internal Module state, shared by both nn.Module and ScriptModule.
Ancestors (in MRO)
 torch.nn.modules.module.Module
Class variables
dump_patches: bool
training: bool
Methods
forward(self, x) ‑> Callable[..., Any]

Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Embedding(num_embeddings: int, embedding_dim: int, max_batch_size: int, padding_idx: Optional[int] = None, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

Embedding layer with expanded weights to be used with DPSGD.
Weights are expanded to max_batch_size so that autodiff computes the per-sample gradients needed by the DPSGD algorithm.
An embedding layer is essentially a lookup table that internally stores all the vectors of the vocabulary and returns the vector associated with each input index. To compute per-sample gradients, we "copy" the lookup table as many times as the maximum number of samples in a batch. Before the lookup, each input index is offset by its sample number times the vocabulary size, so that each sample uses a different "copy" of the lookup table.
The copy of the lookup table is itself free, as we only use an expanded view (similar to broadcasting). The runtime cost is low as well, since we only need to remap the input indices.
Refer to the PyTorch documentation for more on how to use the various parameters: https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html#torch.nn.Embedding.
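The index-offset scheme can be sketched in plain PyTorch as follows. This is a minimal illustration rather than the library's implementation: all variable names are hypothetical, and the reshape used here materializes a contiguous copy of the expanded table, which a real implementation may avoid.

```python
import torch

torch.manual_seed(0)
vocab_size, dim, max_batch_size = 5, 3, 4

# Original lookup table: one row per vocabulary entry.
weight = torch.randn(vocab_size, dim, requires_grad=True)

# Expanded view: one "copy" of the table per sample (stride 0, no extra memory).
expanded = weight.unsqueeze(0).expand(max_batch_size, vocab_size, dim)

# A batch of 4 samples, each holding 2 token indices.
x = torch.tensor([[0, 1], [1, 1], [4, 2], [0, 3]])

# Offset each sample's indices by sample_number * vocab_size.
offsets = torch.arange(x.shape[0]).unsqueeze(1) * vocab_size
flat_table = expanded.reshape(-1, dim)  # (max_batch_size * vocab_size, dim)
out = flat_table[x + offsets]           # same values as weight[x]

# Gradient with respect to the expanded table: one slice per sample.
(per_sample_grad,) = torch.autograd.grad(out.sum(), expanded)
```

Here per_sample_grad has shape (max_batch_size, vocab_size, dim), and summing it over the first dimension gives the ordinary gradient of the lookup table.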
Ancestors (in MRO)
 torch.nn.modules.sparse.Embedding
 torch.nn.modules.module.Module
Class variables
embedding_dim: int
max_norm: Optional[float]
norm_type: float
num_embeddings: int
padding_idx: Optional[int]
scale_grad_by_freq: bool
sparse: bool
weight: torch.Tensor
Methods
extra_repr(self)

Set the extra representation of the module.
To print customized extra information, you should reimplement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x: torch.Tensor) ‑> torch.Tensor

Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
LayerNorm(normalized_shape: Union[int, List[int], torch.Size], max_batch_size: int, eps: float = 1e-05, elementwise_affine: bool = True, device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

LayerNorm layer with expanded weights to be used with DPSGD.
Weights are expanded to max_batch_size so that autodiff computes the per-sample gradients needed by the DPSGD algorithm.
Expansion is made without copying or allocating more memory, as expanded weights are just a view on the original weights (similar to broadcasting).
This comes at no additional cost during the forward pass, as LayerNorm involves an elementwise affine operation that can be applied directly to the expanded weights with proper views.
Refer to the PyTorch documentation for more on how to use the various parameters: https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html#torch.nn.LayerNorm.
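A minimal sketch of the elementwise affine step with expanded parameters is shown below. It assumes a 2D input of shape (batch, features) and uses hypothetical names (gamma, beta); it is meant to illustrate the idea, not to reproduce the library's code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
features, max_batch_size = 3, 4

# Affine parameters of the layer (scale and shift).
gamma = torch.ones(features, requires_grad=True)
beta = torch.zeros(features, requires_grad=True)

# Expanded views: one row per sample, no extra memory.
gamma_exp = gamma.unsqueeze(0).expand(max_batch_size, features)
beta_exp = beta.unsqueeze(0).expand(max_batch_size, features)

x = torch.randn(max_batch_size, features)

# Normalize without affine parameters, then apply the expanded affine step.
normalized = F.layer_norm(x, (features,))
out = normalized * gamma_exp + beta_exp

# Gradients with respect to the expanded views are per-sample gradients.
g_gamma, g_beta = torch.autograd.grad(out.sum(), [gamma_exp, beta_exp])
```

With this loss, row n of g_gamma equals the normalized input of sample n, and every entry of g_beta is 1.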
Ancestors (in MRO)
 torch.nn.modules.normalization.LayerNorm
 torch.nn.modules.module.Module
Class variables
elementwise_affine: bool
eps: float
normalized_shape: Tuple[int, ...]
Methods
extra_repr(self)

Set the extra representation of the module.
To print customized extra information, you should reimplement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x: torch.Tensor) ‑> torch.Tensor

Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Linear(in_features: int, out_features: int, max_batch_size: int, bias: bool = True, device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

Linear layer with expanded weights to be used with DPSGD.
Weights are expanded to max_batch_size so that autodiff computes the per-sample gradients needed by the DPSGD algorithm.
Expansion is made without copying or allocating more memory, as expanded weights are just a view on the original weights (similar to broadcasting).
However, this implies that the forward pass is performed with einsum, which may slightly decrease the performance of the computation.
Refer to the PyTorch documentation for more on how to use the various parameters: https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear.
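The einsum-based forward pass with expanded weights might look like the sketch below. The einsum subscripts and variable names are assumptions made for illustration; the library's actual expression is not shown in this documentation.

```python
import torch

torch.manual_seed(0)
in_features, out_features, max_batch_size = 3, 2, 4

weight = torch.randn(out_features, in_features, requires_grad=True)

# Expanded view of the weights: one (out, in) slice per sample, no copy.
expanded = weight.unsqueeze(0).expand(max_batch_size, -1, -1)

x = torch.randn(max_batch_size, in_features)

# Sample n is multiplied by its own slice expanded[n].
out = torch.einsum("ni,noi->no", x, expanded)

# Differentiating with respect to the expanded view yields per-sample gradients.
loss = out.pow(2).sum()
(per_sample_grad,) = torch.autograd.grad(loss, expanded)
```

The output matches x @ weight.T, and per_sample_grad has shape (max_batch_size, out_features, in_features); summing it over the first dimension recovers the usual weight gradient.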
Ancestors (in MRO)
 torch.nn.modules.linear.Linear
 torch.nn.modules.module.Module
Class variables
in_features: int
out_features: int
weight: torch.Tensor
Methods
extra_repr(self)

Set the extra representation of the module.
To print customized extra information, you should reimplement this method in your own modules. Both single-line and multi-line strings are acceptable.
forward(self, x: torch.Tensor) ‑> torch.Tensor

Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Conv1d(in_channels: int, out_channels: int, kernel_size: ~T, max_batch_size: int, stride: ~T = 1, padding: Union[str, ~T] = 0, dilation: ~T = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

Convolutional layer with expanded weights to be used with DPSGD.
Weights are expanded to the provided max_batch_size so that autodiff computes the per-sample gradients needed by the DPSGD algorithm.
Expansion is made without copying or allocating more memory over the lifetime of the model, as expanded weights are just a view on the original weights (similar to broadcasting).
However, weights are briefly reallocated during the forward pass, as the forward computation needs them in a contiguous format. As layers are typically used one after the other, the overall memory impact is negligible.
To speed up the forward pass with expanded weights, we use grouped convolutions with a number of groups equal to the number of samples: the convolution operator uses one kernel group per sample (which makes sample computations independent), and the weights of these groups are shared thanks to the expansion.
Refer to the PyTorch documentation for more on how to use the various parameters:
1D: https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html#torch.nn.Conv1d
2D: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d
3D: https://pytorch.org/docs/stable/generated/torch.nn.Conv3d.html#torch.nn.Conv3d
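The grouped-convolution trick can be sketched for the 1D case as follows. This is an illustrative reconstruction under simple assumptions (no bias, no padding, the layer's own groups parameter equal to 1, hypothetical variable names), not the library's implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, c_in, c_out, k, length = 4, 2, 3, 3, 8

weight = torch.randn(c_out, c_in, k, requires_grad=True)

# Expanded view: one kernel set per sample, sharing the same storage.
expanded = weight.unsqueeze(0).expand(n, c_out, c_in, k)

x = torch.randn(n, c_in, length)

# Fold the batch into the channel dimension and use one group per sample.
# The reshape of the expanded weights makes the short-lived contiguous copy.
out = F.conv1d(
    x.reshape(1, n * c_in, length),
    expanded.reshape(n * c_out, c_in, k),
    groups=n,
).reshape(n, c_out, -1)

# Per-sample gradients are read off the expanded view.
(per_sample_grad,) = torch.autograd.grad(out.sum(), expanded)
```

The result matches F.conv1d(x, weight), while per_sample_grad has shape (n, c_out, c_in, k), one kernel gradient per sample.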
Ancestors (in MRO)
 bastionlab.torch.psg.nn._ConvNd
 torch.nn.modules.conv._ConvNd
 torch.nn.modules.module.Module
Class variables
bias: Optional[torch.Tensor]
dilation: Tuple[int, ...]
groups: int
in_channels: int
kernel_size: Tuple[int, ...]
out_channels: int
output_padding: Tuple[int, ...]
padding: Union[str, Tuple[int, ...]]
padding_mode: str
stride: Tuple[int, ...]
transposed: bool
weight: torch.Tensor
Methods
forward(self, x: torch.Tensor) ‑> torch.Tensor
Conv2d(in_channels: int, out_channels: int, kernel_size: ~T, max_batch_size: int, stride: ~T = 1, padding: Union[str, ~T] = 0, dilation: ~T = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

Convolutional layer with expanded weights to be used with DPSGD.
Weights are expanded to the provided max_batch_size so that autodiff computes the per-sample gradients needed by the DPSGD algorithm.
Expansion is made without copying or allocating more memory over the lifetime of the model, as expanded weights are just a view on the original weights (similar to broadcasting).
However, weights are briefly reallocated during the forward pass, as the forward computation needs them in a contiguous format. As layers are typically used one after the other, the overall memory impact is negligible.
To speed up the forward pass with expanded weights, we use grouped convolutions with a number of groups equal to the number of samples: the convolution operator uses one kernel group per sample (which makes sample computations independent), and the weights of these groups are shared thanks to the expansion.
Refer to the PyTorch documentation for more on how to use the various parameters:
1D: https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html#torch.nn.Conv1d
2D: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d
3D: https://pytorch.org/docs/stable/generated/torch.nn.Conv3d.html#torch.nn.Conv3d
Ancestors (in MRO)
 bastionlab.torch.psg.nn._ConvNd
 torch.nn.modules.conv._ConvNd
 torch.nn.modules.module.Module
Class variables
bias: Optional[torch.Tensor]
dilation: Tuple[int, ...]
groups: int
in_channels: int
kernel_size: Tuple[int, ...]
out_channels: int
output_padding: Tuple[int, ...]
padding: Union[str, Tuple[int, ...]]
padding_mode: str
stride: Tuple[int, ...]
transposed: bool
weight: torch.Tensor
Methods
forward(self, x: torch.Tensor) ‑> torch.Tensor
Conv3d(in_channels: int, out_channels: int, kernel_size: ~T, max_batch_size: int, stride: ~T = 1, padding: Union[str, ~T] = 0, dilation: ~T = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device: Union[torch.device, str, ForwardRef(None)] = None, dtype: Optional[torch.dtype] = None)

Convolutional layer with expanded weights to be used with DPSGD.
Weights are expanded to the provided max_batch_size so that autodiff computes the per-sample gradients needed by the DPSGD algorithm.
Expansion is made without copying or allocating more memory over the lifetime of the model, as expanded weights are just a view on the original weights (similar to broadcasting).
However, weights are briefly reallocated during the forward pass, as the forward computation needs them in a contiguous format. As layers are typically used one after the other, the overall memory impact is negligible.
To speed up the forward pass with expanded weights, we use grouped convolutions with a number of groups equal to the number of samples: the convolution operator uses one kernel group per sample (which makes sample computations independent), and the weights of these groups are shared thanks to the expansion.
Refer to the PyTorch documentation for more on how to use the various parameters:
1D: https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html#torch.nn.Conv1d
2D: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d
3D: https://pytorch.org/docs/stable/generated/torch.nn.Conv3d.html#torch.nn.Conv3d
Ancestors (in MRO)
 bastionlab.torch.psg.nn._ConvNd
 torch.nn.modules.conv._ConvNd
 torch.nn.modules.module.Module
Class variables
bias: Optional[torch.Tensor]
dilation: Tuple[int, ...]
groups: int
in_channels: int
kernel_size: Tuple[int, ...]
out_channels: int
output_padding: Tuple[int, ...]
padding: Union[str, Tuple[int, ...]]
padding_mode: str
stride: Tuple[int, ...]
transposed: bool
weight: torch.Tensor
Methods
forward(self, x: torch.Tensor) ‑> torch.Tensor