162. Residual Blocks

Why are residual blocks called “residual” blocks?

I was confused because the equation in the research paper's diagram of a residual block is f(x) + x. So I thought, “Where is the residual..?”

The answer shows up when you rephrase the equation. Call the block's whole output H(x) = f(x) + x. Because the skip connection adds the identity x back in, the layers inside the block only have to learn f(x) = H(x) - x. This means the model learns the difference (the residual) between the desired output H(x) and the original input x, instead of H(x) itself.
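To make this concrete for myself, here is a minimal sketch in plain Python (no deep learning library). It assumes a toy block made of two small linear layers with a ReLU in between. The names linear, relu, residual_branch and residual_block are my own, and H(x) is just my label for the block's whole output, where f(x) is the branch from the diagram's f(x) + x. It is only an illustration of the idea, not the paper's actual architecture.

```python
def linear(weights, bias, x):
    """Apply y = W x + b to a vector x stored as a plain list."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    """Element-wise max(0, v)."""
    return [max(0.0, vi) for vi in v]


def residual_branch(params, x):
    """f(x): the part the layers actually learn (hypothetical two-layer branch)."""
    w1, b1, w2, b2 = params
    return linear(w2, b2, relu(linear(w1, b1, x)))


def residual_block(params, x):
    """H(x) = f(x) + x: the skip connection adds the input back."""
    fx = residual_branch(params, x)
    return [f + xi for f, xi in zip(fx, x)]


# With all-zero weights the branch outputs f(x) = 0,
# so the whole block reduces to the identity function.
dim = 3
zeros = [[0.0] * dim for _ in range(dim)]
zero_params = (zeros, [0.0] * dim, zeros, [0.0] * dim)

x = [1.0, -2.0, 0.5]
out = residual_block(zero_params, x)
residual = [o - xi for o, xi in zip(out, x)]  # H(x) - x recovers f(x)

print(out)
print(residual)
```

With zero weights the block just passes x through, and subtracting x from the output gives back exactly what the branch produced. So whatever the layers learn is the residual, the change applied on top of the input.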