Hadoop comes with a distributed filesystem called HDFS, which stands for Hadoop Distributed Filesystem.

The Design of HDFS
HDFS is a filesystem designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
Very large files
“Very large” in this context means files that are hundreds of megabytes, gigabytes, or terabytes in size.
There are Hadoop clusters running today that store petabytes of data.
Streaming data access
HDFS is built around the idea that the most efficient data processing pattern is a write-once, read-many-times pattern. A dataset is typically generated or copied from source, and then various analyses are performed on that dataset over time. Each analysis will involve a large proportion, if not all, of the dataset, so the time to read the whole dataset is more important than the latency in reading the first record.
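To make this pattern concrete, the following is a minimal sketch of a sequential, whole-file read from HDFS using the Hadoop FileSystem API. The class name and the HDFS URI are illustrative placeholders, not part of any particular cluster setup.

import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class StreamWholeFile {
    public static void main(String[] args) throws Exception {
        String uri = "hdfs://namenode/data/sample.txt"; // hypothetical path
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
            // open() returns a stream over the file; reading it from start to
            // finish is the sequential, read-many access pattern HDFS favors
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}

Note that the program reads the entire file in one pass rather than seeking to individual records, which is exactly the trade-off described above: high sustained throughput at the expense of first-record latency.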
Commodity hardware
Hadoop doesn’t require expensive, highly reliable hardware.
It’s designed to run on clusters of commodity hardware (commonly available hardware that can be obtained from multiple vendors) for which the chance of node failure across the cluster is high, at least for large clusters.