Additional Notes

This section provides supplementary information to support simulation-based inference workflows, including performance tips, reproducibility practices, and common pitfalls.


Performance Tips

Use Vectorized Simulators

Whenever possible, implement the simulator as a vectorized function. This significantly improves data generation speed and makes better use of CPU or GPU parallelism.

# Vectorized simulation: one call handles a whole batch of parameters
x = simulator(theta_batch)  # theta_batch: (batch, dim) -> x: (batch, x_dim)
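As a toy illustration of what "vectorized" means here (the simulator, noise scale, and dimensions below are all hypothetical placeholders for your own model):

```python
import numpy as np

def simulator(theta_batch):
    """Toy vectorized simulator: noisy observations around each parameter set.

    theta_batch has shape (batch, dim); one NumPy call produces the whole
    batch, instead of looping over parameter sets in Python.
    """
    noise = 0.1 * np.random.randn(*theta_batch.shape)
    return theta_batch + noise

theta_batch = np.random.rand(512, 3)  # 512 parameter sets, 3 dims each
x = simulator(theta_batch)            # shape (512, 3), in one vectorized call
```

The same pattern carries over to GPU tensors: one batched call keeps the hardware busy, while a Python loop over single simulations does not.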

Batch Training

Use appropriate batch sizes during training to improve throughput:

inference.append_simulations(theta, x).train(training_batch_size=512)

The training batch size may need tuning based on available GPU memory.

Monitor GPU Usage

Use nvidia-smi to track GPU utilization in real time. If your GPU is underutilized, check for CPU–GPU transfer bottlenecks.

Reproducibility

To ensure consistent results:

  • Set random seeds for both PyTorch and NumPy
  • Document the prior distribution and simulator version
  • Save trained posterior models using torch.save

import numpy as np
import torch

torch.manual_seed(42)
np.random.seed(42)

Logging and Debugging

Use logging or TensorBoard to monitor training:

from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter(log_dir='sbi-logs/')

Log loss curves, training metrics, and key hyperparameters for debugging or later review.
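With the standard library's logging module, the same information can be captured without extra dependencies. A minimal sketch; the logger name, metric names, and loss values below are placeholders:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
logger = logging.getLogger("sbi-training")

# Placeholder loss values; in practice these come from the training loop.
for epoch, loss in enumerate([1.20, 0.85, 0.61]):
    logger.info("epoch=%d train_loss=%.3f", epoch, loss)
```

Logging hyperparameters the same way (one line at startup) makes a run self-describing when you revisit it later.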

Common Pitfalls

Simulation Hangs or Fails

  • Ensure all parameter samples produce valid simulator output
  • Avoid invalid or extreme values near the prior bounds
  • Test the simulator independently before using it in SBI
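The last point can be automated with a quick sanity check before handing the simulator to SBI. The uniform prior range and toy simulator here are placeholders for your own:

```python
import numpy as np

def toy_simulator(theta):
    """Placeholder simulator; substitute your own."""
    return theta ** 2 + 0.1 * np.random.randn(*theta.shape)

# Draw parameters across the full prior range, including values near the bounds,
# since those are where invalid output most often appears.
theta = np.random.uniform(low=-1.0, high=1.0, size=(1000, 2))
x = toy_simulator(theta)

assert x.shape[0] == theta.shape[0], "one output per parameter set"
assert np.isfinite(x).all(), "simulator produced NaN or inf"
```

If these assertions fail, fix the simulator (or constrain the prior) before any training.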

Posterior Samples Are Degenerate

  • Increase the number of training simulations
  • Use more expressive density estimators (e.g., NSF instead of MDN)
  • Inspect diagnostics such as loss curves and pair plots
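A crude numerical companion to those diagnostics is to look at per-dimension spread: near-zero standard deviation in every dimension suggests the samples have collapsed. The tolerance below is an arbitrary illustration, not a calibrated threshold:

```python
import numpy as np

def looks_degenerate(samples, tol=1e-6):
    """Flag posterior samples whose per-dimension std is ~zero everywhere."""
    return bool((samples.std(axis=0) < tol).all())

healthy = np.random.randn(5000, 2)     # spread-out samples
collapsed = np.zeros((5000, 2)) + 0.3  # every sample identical

# looks_degenerate(healthy) -> False, looks_degenerate(collapsed) -> True
```

This only catches total collapse; pair plots remain the better tool for spotting subtler pathologies.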

Mismatch in Tensor Shapes

  • Always ensure simulator output shape matches the expected observation shape
  • Use .unsqueeze() or .reshape() if needed
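For example, with PyTorch tensors, a single observation often needs a leading batch dimension before its shape lines up with batched simulator output (the sizes here are illustrative):

```python
import torch

x_o = torch.randn(10)           # a single observation of 10 summary statistics
x_batch = torch.randn(500, 10)  # simulated data: 500 simulations x 10 stats

x_o = x_o.unsqueeze(0)          # add a batch dimension: shape (1, 10)
assert x_o.shape[1:] == x_batch.shape[1:]  # trailing dims now match
```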

Contact

For issues, contributions, or questions, please refer to the project's repository or contact the maintainers.