A project on the GFS

Together with a collaborator (Mitansh K), I did the following:

  1. Localized Implementation: Implemented nearly everything offered by the Google File System, including
    1. Write
    2. Read
    3. Replication
    4. Record Append
    5. Re-Replication
    6. Stale Replica Handling
    7. Garbage Collection
    8. Operation Log
    9. Data Integrity

using a codebase of 7200 LOC in Go. We have not yet implemented snapshotting and directory management; both are left as to-dos.

  2. Enhanced Consistency: Rather than the at-least-once record append semantics offered by the GFS, we implemented exactly-once record appends. This was done using a modified version of the two-phase commit (2PC) protocol together with idempotency handling (do see our report!).

  3. Comprehensive System Analysis: Validated system performance through extensive benchmarking, demonstrating near-linear scaling in throughput for reads, writes, and appends, while maintaining consistency and reliability.
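
To give a feel for the exactly-once idea, here is a minimal in-memory sketch of a prepare/commit append keyed by a request ID. It is an illustration only, not the code from the repository and not the exact protocol from the report: the names `Replica`, `Prepare`, `Commit`, `Abort`, and `Append`, the `failNext` test hook, and the lockstep in-memory layout are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Replica is a hypothetical in-memory stand-in for a chunk replica.
type Replica struct {
	data      []byte
	pending   map[string][]byte // staged (prepared) records, keyed by request ID
	committed map[string]int    // request ID -> offset of the committed record
	failNext  bool              // hook to simulate one failed Prepare
}

func NewReplica() *Replica {
	return &Replica{pending: map[string][]byte{}, committed: map[string]int{}}
}

// Prepare (phase 1) stages the record without making it visible.
func (r *Replica) Prepare(id string, rec []byte) error {
	if r.failNext {
		r.failNext = false
		return errors.New("replica unavailable")
	}
	r.pending[id] = rec
	return nil
}

// Commit (phase 2) appends the staged record and remembers its offset.
// Committing the same ID again returns the remembered offset.
func (r *Replica) Commit(id string) int {
	if off, ok := r.committed[id]; ok {
		return off
	}
	off := len(r.data)
	r.data = append(r.data, r.pending[id]...)
	delete(r.pending, id)
	r.committed[id] = off
	return off
}

// Abort discards a staged record.
func (r *Replica) Abort(id string) { delete(r.pending, id) }

// Append runs both phases across all replicas. A retried request whose ID
// has already committed gets the original offset back instead of a duplicate.
func Append(replicas []*Replica, id string, rec []byte) (int, error) {
	if len(replicas) == 0 {
		return 0, errors.New("no replicas")
	}
	if off, ok := replicas[0].committed[id]; ok {
		return off, nil // idempotent retry
	}
	for _, r := range replicas {
		if err := r.Prepare(id, rec); err != nil {
			for _, p := range replicas {
				p.Abort(id) // roll back everything that was staged
			}
			return 0, err
		}
	}
	off := 0
	for _, r := range replicas {
		off = r.Commit(id)
	}
	return off, nil
}

func main() {
	rs := []*Replica{NewReplica(), NewReplica(), NewReplica()}
	a, _ := Append(rs, "req-1", []byte("hello"))
	b, _ := Append(rs, "req-1", []byte("hello")) // client retry, same ID
	c, _ := Append(rs, "req-2", []byte("world"))
	fmt.Println(a, b, c, len(rs[0].data)) // prints "0 0 5 10"
}
```

The retry of `req-1` returns the original offset and leaves the data untouched, which is the at-least-once to exactly-once difference in miniature. A real system would also have to persist the request IDs and handle failures between the two phases, which this sketch leaves out.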

The implementation can be found here and the report here.

Any thoughts/improvements (especially on the GitHub issues) would be welcome.



