Cluster Based File System (A Review)
With the growth in the number of internet users and other big-data-reliant applications, the need for fault-tolerant file systems that can provide concurrent and redundant access to files is growing. The access, location, concurrency and failure transparency provided by cluster based file systems has inspired many big data users like Google and Oracle to adopt them. The heterogeneity and transparency provided by such systems also reduce expansion and management costs; however, the management of such systems is not trivial.
Cluster based file systems often require complex operations to ensure synchronization and concurrency control within the cluster. The extensive distributed management of a cluster also creates a performance overhead, especially when remote requests are involved. Cluster based systems are thus challenged to provide the seamless, redundant and resilient operation rooted in their architectures while not compromising on performance, security and other functionality metrics. This review will focus on the strategies adopted for the optimal deployment of cluster based file systems and the schemes used in some of them to overcome these challenges. The review covers the following clustered file systems: the Hadoop file system, Lustre, the Gluster file system, Ceph, the Moose file system and the Quick file system. These systems were selected based on their recency. Additionally, systems like Hadoop are used by industry leaders like Yahoo and Facebook and are hence more likely to be subject to rigorous standardization. The survey will review them based on their architecture, naming method, replication mechanism, and fault detection and management mechanism.