Problem 4 - Olympiad

What is the primary purpose of the Transformer architecture's self-attention mechanism?

Correct: C

Self-attention computes pairwise interactions between all positions in a single layer, letting the model relate distant words directly instead of compressing information through a chain of recurrent or convolutional steps.
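
The pairwise interaction described above can be sketched as a minimal single-head, unmasked scaled dot-product attention in NumPy. The projection matrices here are random placeholders standing in for trained parameters, purely for illustration:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (n, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project inputs to queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # one score for every pair of positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V                            # each output mixes all positions at once

# Toy example: 5 positions, model dimension 4 (sizes chosen arbitrarily)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # every row attends to all 5 positions in one step
```

Note that the `scores` matrix is computed for all position pairs simultaneously, which is exactly why no recurrence over the sequence is needed to connect distant words.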