We recall that a transformer is usually composed of the following components:

On this page, we go into each of these components in detail.

The embedding and positional encoding layer

Encoders and decoders

Skip connections

Self-attention vs. encoder-decoder attention

Multi-head attention (and masking)

Generating the output
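Before diving into each section, the components listed above can be sketched as a single data flow: embeddings with positional information feed the encoder, the decoder attends over the encoder's output, and a final step generates the output. The sketch below is illustrative only; every function is a placeholder standing in for the real layer, and none of the names come from this page.

```python
# Structural sketch (assumption: all names are illustrative placeholders,
# not the implementation described in the later sections).

def embed_and_encode_position(token_ids):
    # Placeholder: the real layer maps token ids to vectors
    # and adds positional information.
    return [float(t) for t in token_ids]

def encoder(x):
    # Placeholder for the encoder stack
    # (self-attention + feed-forward, with skip connections).
    return x

def decoder(x, memory):
    # Placeholder for the decoder stack (masked self-attention, then
    # encoder-decoder attention over the encoder output `memory`).
    return memory

def generate_output(x):
    # Placeholder for the final projection that generates the output.
    return x

def transformer(src_ids, tgt_ids):
    # The pipeline order covered on this page, end to end.
    memory = encoder(embed_and_encode_position(src_ids))
    return generate_output(decoder(embed_and_encode_position(tgt_ids), memory))

print(transformer([1, 2, 3], [0]))  # → [1.0, 2.0, 3.0]
```

Each placeholder corresponds to one of the sections below, in the same order the data passes through the model.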

