I had a brief talk with MR folks about the design. Here's the basic thoughts:
- Master make all decisions.
- Mapper writes output to local disk.
- Reducer and shuffler are on the same node.
- Shuffler reads data from all mappers (this could be optimized by mapper local combier) and shuffles. Reducer reads shuffled result key by key.
I had a brief talk with MR folks about the design. Here's the basic thoughts: