I think the training (or fitting) process is comparable to how a support vector machine is trained. It’s not iterative like SGD in deep learning, it’s closer to the traditional machine learning techniques.
But I agree that this is a pretty academic discussion, it doesn’t matter much in practice.