Google has a wide array of massively parallel computing infrastructure at its disposal, and it uses this infrastructure to solve problems that involve large datasets. In this talk, Pim will offer some insight into key pieces of that infrastructure: GFS (the Google File System), Chubby (a Paxos-based global lock service), MapReduce (a parallel computing framework), the WorkQueue (a grid-computing implementation) and Sawzall (a parallelized log-processing language). We will explore what it takes to write a program that literally greps through, say, a copy of the web.
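To give a flavour of the "grep through a copy of the web" idea, here is a minimal single-process sketch of a grep expressed in the map/reduce style. It is not Google's MapReduce API and is not taken from the talk: the function names (`map_grep`, `reduce_grep`, `run_grep`) and the shard layout (lists of `(doc_id, text)` pairs) are assumptions made purely for illustration. In a real deployment the shards would be read from a distributed file system and the map and reduce calls would run on many machines.

```python
import re
from collections import defaultdict


def map_grep(doc_id, text, pattern):
    """Map phase: emit (doc_id, line) for every line that matches the pattern."""
    regex = re.compile(pattern)
    for line in text.splitlines():
        if regex.search(line):
            yield doc_id, line


def reduce_grep(doc_id, lines):
    """Reduce phase: for grep this is just the identity, collecting matches per document."""
    return doc_id, list(lines)


def run_grep(shards, pattern):
    """Run the map and reduce phases sequentially over hypothetical input shards.

    Each shard is a list of (doc_id, text) pairs. A distributed framework would
    process shards in parallel and shuffle the intermediate pairs by key; here
    the grouping dictionary plays the role of the shuffle.
    """
    grouped = defaultdict(list)
    for shard in shards:
        for doc_id, text in shard:
            for key, value in map_grep(doc_id, text, pattern):
                grouped[key].append(value)
    return dict(reduce_grep(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    # Two made-up shards standing in for pieces of a much larger corpus.
    shards = [
        [("a.html", "hello world\nfoo bar")],
        [("b.html", "nothing here"), ("c.html", "say hello\nhello again")],
    ]
    for doc_id, lines in run_grep(shards, r"hello").items():
        for line in lines:
            print(f"{doc_id}: {line}")
```

The point of the structure is that the map function looks at one document at a time and shares no state, which is what lets a framework spread the work across many machines.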
Pim van Pelt (1976) is a Site Reliability Engineer at Google, based in Zurich, Switzerland. He works on Geo (Maps and Earth) and Universal Search, and leads a team that maintains the production infrastructure for highly distributed, highly available internal and end-user-facing systems.
Pim can be reached at pim@sixxs.net, pim@google.com and http://www.ipng.nl/resume.html
| Attachment | Size |
|---|---|
| Google_OpenSourceDays_2010.pdf | 355.77 KB |