SVMs are great, and they are often stacked on top of neural networks to do the classification.
But an SVM can't really extract features by itself, especially from things like images and sounds. If you fed images (pixel by pixel) straight to an SVM and told it to classify them, it would do extremely poorly, especially with big images.
A better approach is to have a deep (layered) network that takes the pixel data as input and passes it through layers that extract useful features from it, for example by learning to recognize lines.
Then, after this information has passed through the whole network, you end up with a smaller number of features that are more useful for classification. You can then send these features to the SVM, and it will deal with them a lot more easily than if you tried to feed it raw pixels.
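To make the "extract features first, then classify" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the 6x6 toy images, the hand-written `extract_features` function (a stand-in for what the network layers would learn on their own), and the small linear SVM trained by subgradient descent on the hinge loss. It is not a real deep network or a production SVM solver.

```python
def make_images(size=6):
    """Toy data: horizontal-line images are +1, vertical-line images are -1."""
    images, labels = [], []
    for k in range(size):
        horizontal = [[1.0 if r == k else 0.0 for _ in range(size)] for r in range(size)]
        vertical = [[1.0 if c == k else 0.0 for c in range(size)] for _ in range(size)]
        images += [horizontal, vertical]
        labels += [1, -1]
    return images, labels


def flatten(image):
    """Raw representation: one feature per pixel."""
    return [pixel for row in image for pixel in row]


def extract_features(image):
    """Hand-written stand-in for learned layers: strongest row and column response."""
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]
    return [max(row_sums), max(col_sums)]


def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Linear SVM fitted by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # shrink step from the L2 penalty
            if margin < 1:  # sample is inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score > 0 else -1


def accuracy(w, b, X, y):
    hits = sum(1 for xi, yi in zip(X, y) if predict(w, b, xi) == yi)
    return hits / len(y)


images, labels = make_images()
raw = [flatten(img) for img in images]             # 36 features per image
feats = [extract_features(img) for img in images]  # 2 features per image

w_raw, b_raw = train_linear_svm(raw, labels)
w_feat, b_feat = train_linear_svm(feats, labels)

raw_accuracy = accuracy(w_raw, b_raw, raw, labels)
feature_accuracy = accuracy(w_feat, b_feat, feats, labels)
print("raw pixels:", raw_accuracy, "extracted features:", feature_accuracy)
```

In this toy setup no linear classifier can get every raw-pixel image right, because the row sums and the column sums of the weights add up to the same total, so they cannot all have opposite signs. On the two extracted features the same classifier separates the classes easily. A kernel SVM would behave differently on the raw pixels, so treat this only as an illustration of why the features matter.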
The thing is, we don't really have problems with classification algorithms - SVMs do extremely well. The biggest hurdle is simply extracting features that can then be fed to well-working classification algorithms.
The advantage of SVMs is that after re-mapping the data so it becomes separable, they are also guaranteed to separate it as cleanly as possible, i.e. with the largest possible margin between the classes.
I'm not sure whether they can solve all types of problems equally well, but in some cases an SVM is considered a better, more analytic approach.
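The re-mapping idea can be sketched with a tiny one-dimensional example. The data points and the hand-picked map x -> x*x below are assumptions chosen for illustration. A real SVM would typically use a kernel and solve an optimisation problem instead, but on a single feature the widest-margin separator is simply the midpoint between the closest points of the two classes, so it can be computed directly.

```python
# 1-D points: the -1 class sits between the two halves of the +1 class,
# so no single threshold on x can split them.
xs = [-3.0, -2.5, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 2.5, 3.0]
ys = [1, 1, 1, -1, -1, -1, -1, -1, 1, 1, 1]


def separable_by_threshold(values, labels):
    """True if one threshold puts all +1 points on one side and all -1 on the other."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(neg) < min(pos) or max(pos) < min(neg)


def remap(x):
    """Hand-picked feature map (an assumption for this toy example)."""
    return x * x


zs = [remap(x) for x in xs]

# Closest points of each class in the re-mapped space.
inner_max = max(z for z, y in zip(zs, ys) if y == -1)
outer_min = min(z for z, y in zip(zs, ys) if y == 1)

# Widest-margin separator on one feature: the midpoint between those two points.
threshold = (inner_max + outer_min) / 2
margin = (outer_min - inner_max) / 2


def classify(x):
    return 1 if remap(x) > threshold else -1


print("separable before re-mapping:", separable_by_threshold(xs, ys))
print("separable after re-mapping:", separable_by_threshold(zs, ys))
print("threshold:", threshold, "margin:", margin)
```

Any threshold between the two closest points would separate the re-mapped data, but the midpoint leaves the most room on both sides. That is the "as cleanly as possible" part.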