PFS is a core building block of PCloud. It provides distributed blob storage, with replication for redundancy and high availability. Its design is modeled on GFS (Google File System, https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf). PFS consists of two main components: the controller and the chunk servers. Chunk servers store the actual data, while the controller maintains a global view of the cluster and acts on changes in it.
Each chunk server maintains a list of the chunks it stores. The chunk payloads themselves will be stored on local disk using the file system provided by the OS. All metadata that a chunk server needs to reconstruct its state must be periodically persisted to disk so that the server can recover quickly after a failure.
Chunk ids will be represented as RFC 4122 compliant 128-bit UUIDs. Chunk metadata will consist of:
    type ChunkInfo struct {
        // Status of the chunk: NEW, CREATED, ..., READY
        Status ChunkStatus
        // Total size of chunk in bytes
        Size int
        // Number of bytes committed to disk
        Committed int
    }
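The chunk ids described above can be generated without any third-party dependency. The following is a minimal sketch, assuming random (version 4) UUIDs; the text does not say which UUID version PFS uses, and the names ChunkID and NewChunkID are hypothetical.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// ChunkID is a 128-bit identifier. (Hypothetical type name.)
type ChunkID [16]byte

// NewChunkID returns a random RFC 4122 version 4 UUID.
// Version 4 is an assumption; the design only requires RFC 4122 compliance.
func NewChunkID() (ChunkID, error) {
	var id ChunkID
	if _, err := rand.Read(id[:]); err != nil {
		return ChunkID{}, err
	}
	id[6] = (id[6] & 0x0f) | 0x40 // set the version field to 4
	id[8] = (id[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return id, nil
}

// String renders the canonical 8-4-4-4-12 hexadecimal form.
func (id ChunkID) String() string {
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		id[0:4], id[4:6], id[6:8], id[8:10], id[10:16])
}

func main() {
	id, err := NewChunkID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Because ChunkID is a fixed-size array, it is comparable and can be used directly as a Go map key.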
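Assuming each of the three fields is stored as a 32-bit (4-byte) value, a single chunk's metadata needs 16 + 3 * 4 = 28 bytes: 16 bytes for the UUID and 4 bytes for each field. (Go's int is 64 bits wide on 64-bit platforms, so fixed-width types such as int32 would be needed for this figure to hold.) On top of this, a thread-safe, hash-map-backed ChunkInfoStore structure will be built with two methods, Load and Store. The Store method will update the in-memory hash map and also append the update to the transaction log. A background process will periodically compact the transaction log and persist the full contents of the hash map to disk.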
The controller will not persist any data locally. Instead, it will periodically receive the state of each chunk server via heartbeats. This makes it easier to keep the metadata held by the controller consistent with the metadata held by the chunk servers.