Train Demo: Fitting a Quadratic Function Bias
This demo shows how to recover the bias term in a simple quadratic function \(y = x^2 + \text{bias}\) using gradient-based optimization. We provide examples in both TensorFlow and PyTorch.
TensorFlow Implementation
```python
import numpy as np
import tensorflow as tf

# Generate data: 50 points from y = x^2 + true_bias
np.random.seed(0)
x = np.random.uniform(-10, 10, 50).astype(np.float32)
true_bias = 5.0
y = x**2 + true_bias

# Define a trainable parameter, initialized away from the true value
bias = tf.Variable(0.0)

# Optimizer
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# Training loop
for step in range(2000):
    # Record the forward pass so the tape can differentiate the loss
    with tf.GradientTape() as tape:
        y_pred = x**2 + bias
        loss = tf.reduce_mean((y - y_pred)**2)
    # Backward pass, then parameter update
    grad = tape.gradient(loss, [bias])
    optimizer.apply_gradients(zip(grad, [bias]))
    if step % 200 == 0:
        # loss was computed before this step's update; bias is already updated
        print(f"Step {step}, Loss: {loss.numpy():.4f}, Bias: {bias.numpy():.4f}")

print(f"Training complete, fitted bias ≈ {bias.numpy():.4f}, true bias = {true_bias}")
```
Explanation:
- We generate 50 points from \(y = x^2 + 5\).
- The only learnable parameter is `bias`.
- Using SGD, we minimize the mean squared error between predicted and true `y`.
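A short derivation shows why these settings converge. The symbols \(b\) (current bias), \(b^*\) (true bias), \(\eta\) (learning rate), and \(N\) (number of points) are notation introduced here for illustration, and the argument assumes exact arithmetic and noise-free data, as in the script above.

\[
L(b) = \frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - x_i^2 - b\bigr)^2 = (b^* - b)^2,
\qquad
\frac{dL}{db} = 2\,(b - b^*)
\]

\[
b_{t+1} = b_t - \eta \cdot 2\,(b_t - b^*)
\quad\Longrightarrow\quad
b_{t+1} - b^* = (1 - 2\eta)\,(b_t - b^*)
\]

With \(\eta = 0.01\) the error shrinks by a factor of 0.98 per step. Starting from \(b_0 = 0\) with \(b^* = 5\), the remaining error after 2000 steps is \(5 \cdot 0.98^{2000}\), on the order of \(10^{-17}\), which is far below float32 resolution. Because every point contributes the same residual \(b^* - b\), the gradient does not depend on the sampled `x` values.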
PyTorch Implementation
```python
import numpy as np
import torch

# Generate data: 50 points from y = x^2 + true_bias
np.random.seed(0)
x = np.random.uniform(-10, 10, 50).astype(np.float32)
true_bias = 5.0
y = x**2 + true_bias

# Convert to torch tensors
x_tensor = torch.tensor(x)
y_tensor = torch.tensor(y)

# Define a trainable parameter, initialized away from the true value
bias = torch.tensor(0.0, requires_grad=True)

# Optimizer
optimizer = torch.optim.SGD([bias], lr=0.01)

# Training loop
for step in range(2000):
    optimizer.zero_grad()  # clear gradients left over from the previous step
    y_pred = x_tensor**2 + bias
    loss = torch.mean((y_tensor - y_pred)**2)
    loss.backward()        # populate bias.grad
    optimizer.step()       # apply the SGD update to bias
    if step % 200 == 0:
        # loss was computed before this step's update; bias is already updated
        print(f"Step {step}, Loss: {loss.item():.4f}, Bias: {bias.item():.4f}")

print(f"Training complete, fitted bias ≈ {bias.item():.4f}, true bias = {true_bias}")
```
Explanation:
- The workflow mirrors the TensorFlow version but uses PyTorch tensors and `autograd`.
- `bias` is the only parameter with `requires_grad=True`.
- Each step, we compute the loss, backpropagate gradients, and update the bias.
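To make the backpropagation step less opaque, the following framework-free sketch compares the hand-derived gradient of this loss with a finite-difference estimate. The helper names `mse_loss`, `analytic_grad`, and `numeric_grad` are hypothetical and exist only for this illustration; the data is regenerated with Python's `random` module, so the sampled `x` values differ from those in the scripts above.

```python
import random

def mse_loss(bias, xs, ys):
    """Mean squared error of the model y = x^2 + bias."""
    return sum((y - (x**2 + bias)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytic_grad(bias, xs, ys):
    """Hand-derived gradient: dL/dbias = -2 * mean(y - x^2 - bias)."""
    return -2.0 * sum(y - (x**2 + bias) for x, y in zip(xs, ys)) / len(xs)

def numeric_grad(bias, xs, ys, eps=1e-4):
    """Central finite-difference estimate of dL/dbias."""
    return (mse_loss(bias + eps, xs, ys) - mse_loss(bias - eps, xs, ys)) / (2 * eps)

# Same setup as the demo: 50 points from y = x^2 + 5 (illustrative data)
random.seed(0)
xs = [random.uniform(-10, 10) for _ in range(50)]
ys = [x**2 + 5.0 for x in xs]

# Evaluate both gradients at the initial value bias = 0
g_exact = analytic_grad(0.0, xs, ys)
g_approx = numeric_grad(0.0, xs, ys)
print(f"analytic: {g_exact:.6f}, finite difference: {g_approx:.6f}")
```

Both values should agree closely. This is the quantity that `loss.backward()` stores in `bias.grad` for the same loss, up to floating-point precision.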
✅ Key Takeaways
- Both frameworks allow you to optimize parameters with minimal code.
- The gradient descent loop consists of: forward pass → compute loss → backward pass → update parameters.
- The same interface used here for a one-parameter fit also applies to training more complex models.
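The four-stage loop can also be written without either framework. The sketch below is a minimal pure-Python version in which the backward pass is the hand-derived gradient rather than automatic differentiation; `fit_bias` is a hypothetical helper name, and the defaults simply copy the learning rate and step count from the demos.

```python
import random

def fit_bias(xs, ys, lr=0.01, steps=2000):
    """Fit the bias of y = x^2 + bias by plain gradient descent."""
    bias = 0.0
    loss = float("nan")  # stays NaN only if steps == 0
    n = len(xs)
    for _ in range(steps):
        # Forward pass: model predictions with the current bias
        preds = [x**2 + bias for x in xs]
        # Compute loss: mean squared error
        loss = sum((y - p) ** 2 for y, p in zip(ys, preds)) / n
        # Backward pass: dL/dbias derived by hand for this loss
        grad = -2.0 * sum(y - p for y, p in zip(ys, preds)) / n
        # Update parameters: one gradient descent step
        bias -= lr * grad
    # loss comes from the last forward pass, before the final update
    return bias, loss

# Illustrative data: 50 points from y = x^2 + 5
random.seed(0)
xs = [random.uniform(-10, 10) for _ in range(50)]
ys = [x**2 + 5.0 for x in xs]

fitted_bias, last_loss = fit_bias(xs, ys)
print(f"fitted bias: {fitted_bias:.6f}, last loss: {last_loss:.3e}")
```

In TensorFlow and PyTorch the backward-pass line is what `tape.gradient` and `loss.backward()` replace, which is why the rest of the loop looks the same in both frameworks.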
This demo is a minimal illustration of gradient-based parameter learning.