A Background on Attention Mechanisms in Deep Learning


Attention, in deep learning, is in many ways the same as how humans perceive attention; when you pay attention to something, you place more importance on the subject at hand. Similarly, attention mechanisms in deep learning — whether used for image processing, natural language processing, speech recognition, or something entirely different — place more importance on some inputs by non-uniformly weighting contributions input features.

Though all neural networks end up placing different weights on input features, whether it be through gradient descent or some other method, the mechanism of attention is marginally differently. This mechanism of attention started with the…

Josh Dey

FatBrain Fellow ’20 — ‘21 | Reed College Physics ‘20

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store