DSWoK — Data Science Well of Knowledge
#transformer
3 notes · co-occurs with 4 tags · last updated May 18, 2026
Co-tags
#nlp 3
#architecture 3
#attention 2
#algorithm 1
Notes tagged #transformer
01
Attention
The original paper. Attention is a mechanism that lets neural networks focus on specific parts of an input sequence.
May 18, 2026
Deep Learning
02
BERT
Most of the information is available in the BERT paper. Key details: multi-head attention, Transformer encoder.
May 18, 2026
NLP
03
Transformer
The Transformer was first introduced in the paper "Attention Is All You Need"; BERT was published soon after.
May 18, 2026
NLP