Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Well, one key element of the original map-reduce paper is the way the data is spread around. Instead of building a giant NAS with specialized (expensive) systems, and then building a bunch of specialized (expensive) compute systems, and then shipping massive quantities of data around on fast (expensive) network, the map-reduce system is built on a bunch of well balanced systems in terms of CPU/ram vs. disk, and the job is designed in a way that it can be distributed to these systems and data transfer is minimized.

So in a way, everything is happening in the storage nodes and they need to be much more than just a filesystem.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: