PyTorch-Ignite

How to effectively increase batch size on limited compute

To effectively increase the batch size on limited GPU resources, use gradient accumulation: divide each batch's loss by the number of accumulation steps, call backward() on every batch so the gradients add up, and run the optimizer step (followed by zeroing the gradients) only once every accumulation_steps iterations. The resulting update averages gradients over accumulation_steps batches, emulating a batch that many times larger.

from ignite.engine import Engine

accumulation_steps = 4

def update_fn(engine, batch):
    model.train()

    x, y = prepare_batch(batch, device=device, non_blocking=non_blocking)
    y_pred = model(x)
    # Scale the loss so the accumulated gradients average over the steps
    loss = criterion(y_pred, y) / accumulation_steps
    loss.backward()

    # engine.state.iteration starts at 1, so the optimizer steps on
    # iterations 4, 8, 12, ... and the gradients are cleared right after
    if engine.state.iteration % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()

    return loss.item()

trainer = Engine(update_fn)
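
For context, the snippet above assumes that model, criterion, optimizer, prepare_batch, device and non_blocking are already defined. A minimal sketch of that setup, with purely illustrative model and hyperparameters:

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
non_blocking = True

# Placeholder model, loss and optimizer -- substitute your own
model = nn.Linear(10, 2).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def prepare_batch(batch, device, non_blocking):
    # Move an (x, y) batch onto the target device
    x, y = batch
    return x.to(device, non_blocking=non_blocking), y.to(device, non_blocking=non_blocking)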

If you prefer to use PyTorch-Ignite's built-in helper functions for supervised training, they also support gradient accumulation through the gradient_accumulation_steps parameter of supervised_training_step. For example

from ignite.engine import supervised_training_step

update_fn = supervised_training_step(model, optimizer, criterion, gradient_accumulation_steps=4)
trainer = Engine(update_fn)

would result in the same Engine as above (to match the manual update function exactly, also pass device=device, non_blocking=non_blocking and prepare_batch=prepare_batch to supervised_training_step).
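
In both cases the trainer is run as usual. train_loader below stands in for whatever DataLoader you use; with accumulation_steps = 4 and a per-iteration batch size of 32, the effective batch size is 128:

trainer.run(train_loader, max_epochs=5)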

Resources

  1. Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups
  2. Code