MapReduce and MPI are very different. MapReduce doesn't use message passing: the map phase reads inputs from sharded files on a separate disk system, applies map to them, and writes the mapped outputs to temp sort files on a separate disk system. The shuffler then sorts those and writes the results to the appropriate destination shards, again on a separate disk system, at which point the reducer reads them, applies reduce, and writes the final outputs, sharded by key, to a separate disk system.
The mappers, shufflers, and reducers are all independent of each other, reading from and writing to the filesystem, and managed by a coordinator. There's nothing like MPI in there, other than the use of the Stubby RPC system, which sort of resembles MPI but has completely different distributed communication semantics.
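To make the "everything goes through files" point concrete, here's a toy single-machine sketch of that pipeline in Python. It is not how the real system is implemented; the function names (`run_map`, `run_shuffle`, `run_reduce`, `run_job`), the JSON-lines file format, the directory layout, and the CRC32 shard function are all assumptions made up for illustration, and local directories stand in for the separate disk systems. The thing to notice is that no phase ever sends a message to another phase: each one reads files, writes files, and the coordinator just hands out paths.

```python
import itertools
import json
import tempfile
import zlib
from pathlib import Path


def shard_for(key: str, num_shards: int) -> int:
    # Stable hash so a given key always lands in the same destination shard.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def run_map(input_shard: Path, tmp_dir: Path, map_fn) -> Path:
    # Mapper: read one input shard from disk, write mapped pairs to a temp file.
    out = tmp_dir / f"mapped-{input_shard.stem}.jsonl"
    with input_shard.open() as src, out.open("w") as dst:
        for line in src:
            for key, value in map_fn(line.rstrip("\n")):
                dst.write(json.dumps([key, value]) + "\n")
    return out


def run_shuffle(mapped_files, shuffle_dir: Path, num_shards: int):
    # Shuffler: read the mapped files, sort by key, write one file per shard.
    buckets = [[] for _ in range(num_shards)]
    for path in mapped_files:
        with path.open() as src:
            for line in src:
                key, value = json.loads(line)
                buckets[shard_for(key, num_shards)].append((key, value))
    outputs = []
    for i, pairs in enumerate(buckets):
        out = shuffle_dir / f"shuffled-{i:05d}.jsonl"
        with out.open("w") as dst:
            for key, value in sorted(pairs, key=lambda p: p[0]):
                dst.write(json.dumps([key, value]) + "\n")
        outputs.append(out)
    return outputs


def run_reduce(shuffled: Path, out_dir: Path, reduce_fn) -> Path:
    # Reducer: read one sorted shard, group by key, write the reduced output.
    out = out_dir / shuffled.name.replace("shuffled", "output")
    with shuffled.open() as src, out.open("w") as dst:
        pairs = (json.loads(line) for line in src)
        for key, group in itertools.groupby(pairs, key=lambda p: p[0]):
            values = [value for _, value in group]
            dst.write(json.dumps([key, reduce_fn(key, values)]) + "\n")
    return out


def run_job(input_shards, work_dir: Path, map_fn, reduce_fn, num_shards=2):
    # Coordinator: sequences the phases; they share nothing except file paths.
    tmp_dir = work_dir / "tmp"
    shuffle_dir = work_dir / "shuffle"
    out_dir = work_dir / "out"
    for d in (tmp_dir, shuffle_dir, out_dir):
        d.mkdir(parents=True, exist_ok=True)
    mapped = [run_map(shard, tmp_dir, map_fn) for shard in input_shards]
    shuffled = run_shuffle(mapped, shuffle_dir, num_shards)
    return [run_reduce(shard, out_dir, reduce_fn) for shard in shuffled]


def word_count_map(line: str):
    return [(word, 1) for word in line.split()]


def word_count_reduce(key: str, values):
    return sum(values)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        work = Path(d)
        shard = work / "input-00000.txt"
        shard.write_text("to be or not to be\n")
        for path in run_job([shard], work, word_count_map, word_count_reduce):
            print(path.name, path.read_text().splitlines())
```

In a real deployment each of those calls would be a separate worker process on a separate machine, and a failed worker can simply be rerun against the same input files. That restartability is a big part of why the design goes through disk instead of passing messages directly between workers.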