AI: Through an Architect's Lens
Activation Functions: A Deep Dive
Exploring ReLU, sigmoid, tanh, and modern activation functions — why they matter and when to use each.
Why Activation Functions Matter
Without activation functions, a neural network — no matter how deep — would reduce to a single linear transformation. Activation functions introduce non-linearity, enabling the network to learn complex patterns.
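To see this concretely, here is a minimal NumPy sketch (the weight matrices W1 and W2 are made up for illustration) showing that two stacked linear layers with no activation collapse into a single equivalent linear layer:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # a small batch of inputs
W1 = rng.normal(size=(3, 5))   # first "layer" weights (illustrative)
W2 = rng.normal(size=(5, 2))   # second "layer" weights (illustrative)

two_layers = x @ W1 @ W2       # two stacked linear layers, no activation
one_layer = x @ (W1 @ W2)      # one linear layer with the combined weight matrix

print(np.allclose(two_layers, one_layer))  # True: extra depth added no expressive power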
The Classic Three
Sigmoid
Maps inputs to the range (0, 1). Historically popular, but suffers from vanishing gradients in deep networks.
import numpy as np

def sigmoid(x):
    # Squash inputs into (0, 1); saturates for large |x|, which is where gradients vanish
    return 1 / (1 + np.exp(-x))

Tanh
Maps inputs to (-1, 1). Zero-centered, which helps with convergence, but it still suffers from vanishing gradients.
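As a sketch mirroring the sigmoid example above (assuming NumPy), tanh is available as a built-in and is just a rescaled, zero-centered sigmoid:

import numpy as np

def tanh(x):
    # Zero-centered squashing into (-1, 1); still saturates at the tails like sigmoid
    return np.tanh(x)

# Sanity check of the identity tanh(x) = 2 * sigmoid(2x) - 1 on a few sample points
x = np.linspace(-3, 3, 7)
print(np.allclose(tanh(x), 2 / (1 + np.exp(-2 * x)) - 1))  # True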
ReLU
The workhorse of modern deep learning. Simple, fast, and effective:
def relu(x):
    # Pass positive inputs through unchanged; clamp negatives to zero
    return np.maximum(0, x)

Coming Soon
This tutorial is a work in progress. Full content coming soon.