I imagine more advanced things like face recognition and such are not so simple, but from my experience writing a raw converter, a lot of image processing is far simpler than you'd expect.
A lot of the complexity comes from not just doing the operation, but doing it correctly and quickly. ImageMagick, for instance, does multiple passes, which is suboptimal for both quality and speed.
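
To make the multi-pass point concrete, here's a toy sketch in Python. It is not how ImageMagick works internally; the function names, the two example operations, and the 8-bit intermediate image are all my own assumptions, chosen only to show why rounding between passes can cost quality while a fused pass touches each pixel once.

```python
# Toy illustration only: all names here are hypothetical, and the 8-bit
# intermediate buffer is an assumption, not a description of any real tool.

def quantize(x):
    """Clamp to [0, 255] and round to the nearest integer (inputs are non-negative)."""
    return max(0, min(255, int(x + 0.5)))

def darken(x):
    return x * 0.5

def brighten(x):
    return x * 2.0

def multi_pass(pixels):
    # Pass 1 stores an 8-bit intermediate image, so precision is lost here...
    stage1 = [quantize(darken(p)) for p in pixels]
    # ...and pass 2 cannot recover it. The image is also traversed twice.
    return [quantize(brighten(p)) for p in stage1]

def single_pass(pixels):
    # Both operations run per pixel in floating point, with one final rounding.
    return [quantize(brighten(darken(p))) for p in pixels]

ramp = list(range(256))  # every possible 8-bit grey level

print("distinct levels, multi-pass: ", len(set(multi_pass(ramp))))
print("distinct levels, single pass:", len(set(single_pass(ramp))))
```

In this sketch the single-pass version returns the ramp unchanged, while the multi-pass version collapses neighbouring grey levels together, which would show up as banding in a real image.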