Mini-batch Gradient Descent
Mini-batch Gradient Descent is an optimization algorithm used in machine learning to minimize a loss function. It combines the benefits of Stochastic Gradient Descent and Batch Gradient Descent: instead of computing the gradient on the entire dataset or on a single sample, it processes small subsets of the data, called "mini-batches." Each update is therefore much cheaper than a full-batch step, while the gradient estimate is less noisy than in single-sample Stochastic Gradient Descent, striking a balance between convergence speed and update stability.
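As a concrete illustration, here is a minimal sketch of a mini-batch update loop, assuming a linear model trained with mean-squared-error loss on NumPy arrays X (features) and y (targets); the function name and hyperparameter values are illustrative choices, not part of the original text.

```python
import numpy as np

def minibatch_gradient_descent(X, y, lr=0.01, batch_size=32, epochs=100):
    """Fit linear-regression parameters with mini-batch gradient descent (illustrative sketch)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        # Shuffle once per epoch so each mini-batch is a random subset of the data.
        order = np.random.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_batch, y_batch = X[idx], y[idx]
            # Vectorized gradient of the mean-squared-error loss over the mini-batch.
            error = X_batch @ w + b - y_batch
            grad_w = X_batch.T @ error / len(idx)
            grad_b = error.mean()
            # Update step: move the parameters against the gradient.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b
```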
Because each mini-batch is processed as a whole, the algorithm can exploit vectorized operations, making it efficient on large datasets. The random sampling of mini-batches also adds noise to the gradient estimates, which can help the optimizer escape poor local minima and tends to improve generalization in models such as neural networks.
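For example, the sketch above could be exercised on synthetic data like the following; the shapes, coefficients, and hyperparameters are made up purely for illustration.

```python
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 3.0 + rng.normal(scale=0.1, size=1000)

w, b = minibatch_gradient_descent(X, y, lr=0.05, batch_size=32, epochs=200)
print(w, b)  # Should approach true_w and the intercept 3.0
```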