AI: Through an Architect's Lens
Activation Functions: A Deep Dive
Exploring ReLU, sigmoid, tanh, and modern activation functions — why they matter and when to use each.
Why Activation Functions Matter
Without activation functions, a neural network — no matter how deep — would reduce to a single linear transformation. Activation functions introduce non-linearity, enabling the network to learn complex patterns.
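To see this concretely, here is a minimal NumPy sketch (the weight matrices W1 and W2 are made up for illustration) showing that two stacked linear layers with no activation collapse into a single equivalent linear layer:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # a small batch of inputs
W1 = rng.normal(size=(3, 5))   # first "layer" weights (illustrative)
W2 = rng.normal(size=(5, 2))   # second "layer" weights (illustrative)

two_layers = x @ W1 @ W2       # two stacked linear layers, no activation
one_layer = x @ (W1 @ W2)      # one linear layer with the combined weight matrix

print(np.allclose(two_layers, one_layer))  # True: extra depth added no expressive power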
The Classic Three
Sigmoid
Maps inputs to the range (0, 1). Historically popular, but suffers from vanishing gradients in deep networks.
import numpy as np

def sigmoid(x):
    # Squash inputs into (0, 1); saturates for large |x|, which is where gradients vanish
    return 1 / (1 + np.exp(-x))

Tanh
Maps inputs to (-1, 1). Zero-centered, which helps with convergence, but it still suffers from vanishing gradients.
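As a sketch mirroring the sigmoid example above (assuming NumPy), tanh is available as a built-in and is just a rescaled, zero-centered sigmoid:

import numpy as np

def tanh(x):
    # Zero-centered squashing into (-1, 1); still saturates at the tails like sigmoid
    return np.tanh(x)

# Sanity check of the identity tanh(x) = 2 * sigmoid(2x) - 1 on a few sample points
x = np.linspace(-3, 3, 7)
print(np.allclose(tanh(x), 2 / (1 + np.exp(-2 * x)) - 1))  # True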
ReLU
The workhorse of modern deep learning. Simple, fast, and effective:
def relu(x):
    # Pass positive inputs through unchanged; clamp negatives to zero
    return np.maximum(0, x)

Coming Soon
This tutorial is a work in progress. Full content coming soon.