Performance Analysis of the Hadoop Distributed File System (HDFS) with Small Files
Keywords:
Hadoop, HDFS, Small Files, Parallel Computing, TestDFSIO
Abstract
The Hadoop Distributed File System (HDFS) [1] is a distributed file system that can process large amounts of data effectively across large clusters. The Hadoop framework, which is built on it, has been widely used on various clusters to build large-scale, high-performance systems. HDFS is designed to handle large files. In this paper, we analyze the performance of HDFS when it handles a large number of small files, and we compare processing a few large files with processing a large number of small files.