Data Center Storage Revolution: The Rise of QLC SSDs

As data volumes surge and energy-efficiency requirements tighten, data center storage technology is undergoing a new wave of transformation. In a March 2025 technical blog post, Meta proposed using QLC SSDs as a “middle tier” in the data center storage architecture, situated between traditional HDDs and TLC SSDs. The strategy aims to address both the performance limitations of HDDs and the cost pressure of TLC, giving hyperscale data centers a new option.

Performance Bottlenecks and Tiering Challenges of HDDs

HDDs have long been crucial for cold and archival storage in data centers, thanks to their lower cost per terabyte and relatively stable power consumption. However, Meta points out that as HDD capacities continue to grow (e.g., from 16–20TB to even higher densities), their I/O performance has not improved significantly, so bandwidth per terabyte (BW/TB) declines year over year. This decline limits responsiveness for “hot data” (frequently accessed data).
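The BW/TB decline is simple arithmetic: if a drive's sustained throughput stays roughly flat while its capacity grows, the ratio shrinks. The sketch below illustrates this. The 250 MB/s throughput figure and the 30TB capacity point are illustrative assumptions, not numbers from Meta's post.

```python
def bandwidth_per_tb(sustained_mb_s: float, capacity_tb: float) -> float:
    """Return bandwidth per terabyte (MB/s/TB) for a single drive."""
    return sustained_mb_s / capacity_tb


# ASSUMPTION: a flat 250 MB/s sustained throughput, chosen only for illustration.
ASSUMED_HDD_MB_S = 250.0

# 16 and 20 TB come from the article's example range; 30 TB is a hypothetical
# "higher density" point.
for capacity_tb in (16, 20, 30):
    ratio = bandwidth_per_tb(ASSUMED_HDD_MB_S, capacity_tb)
    print(f"{capacity_tb} TB drive: {ratio:.1f} MB/s/TB")
```

Under these assumptions, each step up in capacity lowers the bandwidth available per stored terabyte, even though the drive itself is no slower.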

To mitigate this, engineers have adopted two strategies. The first is migrating hot data to the TLC SSD tier to improve performance. The second is overprovisioning (OP): reserving extra storage capacity to compensate for the bandwidth shortfall. However, TLC SSDs carry a much higher cost per terabyte, which makes them impractical as a full HDD replacement, and overprovisioning raises procurement and operational costs. Both strategies run counter to efficiency-first principles.

Meta believes QLC SSDs can serve as a middle ground between HDDs and TLC SSDs, bridging the gap in both performance and cost.

Technology Characteristics and Positioning of QLC SSDs

Since its debut in 2009, QLC flash has offered higher storage density by storing 4 bits per cell, compared with TLC’s 3 bits and SLC’s 1 bit. In theory, this enables large-capacity storage at a lower cost per terabyte. Early adoption, however, was constrained by limited drive capacities (typically less than 32TB), weaker endurance, and uncompetitive pricing.

In recent years, advancements in NAND technology have significantly addressed these shortcomings. For example, the introduction of 2Tb QLC NAND dies and 32-die stacking has rapidly increased QLC SSD densities. Meta predicts that QLC SSD density will surpass that of TLC in the near term and keep that lead over the long term. This will improve byte density at both the server and rack levels while reducing both acquisition cost and power cost per terabyte.
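To see why 2Tb dies and 32-die stacks matter for density, the raw capacity can be worked out directly. This is a minimal sketch: the die size and stack height come from the text above, while the number of NAND packages per drive is a hypothetical value chosen for illustration. It also ignores spare area and overprovisioning.

```python
DIE_CAPACITY_TBIT = 2      # 2Tb QLC die (from the article)
DIES_PER_PACKAGE = 32      # 32-die stack (from the article)
# ASSUMPTION: hypothetical package count; real drives vary by form factor.
ASSUMED_PACKAGES_PER_DRIVE = 16


def package_capacity_tb(die_tbit: float, dies_per_package: int) -> float:
    """Raw capacity of one NAND package in terabytes (8 bits per byte)."""
    return die_tbit * dies_per_package / 8


def raw_drive_capacity_tb(die_tbit: float, dies_per_package: int, packages: int) -> float:
    """Raw drive capacity in terabytes, before any spare area is set aside."""
    return package_capacity_tb(die_tbit, dies_per_package) * packages


per_package = package_capacity_tb(DIE_CAPACITY_TBIT, DIES_PER_PACKAGE)
per_drive = raw_drive_capacity_tb(
    DIE_CAPACITY_TBIT, DIES_PER_PACKAGE, ASSUMED_PACKAGES_PER_DRIVE
)
print(f"per package: {per_package} TB, per drive: {per_drive} TB")
```

A single 32-die stack of 2Tb dies holds 8TB of raw flash, so the number of packages that physically fit on the board largely determines how far a drive can scale.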

In terms of performance, QLC SSDs sit between HDDs and TLC SSDs. They suit workloads with bandwidth requirements of 10–20 MB/s/TB. This includes workloads that currently depend on the performance of 16–20TB HDDs, as well as large-scale batch I/O tasks now running on TLC. These tasks need moderate performance, better than HDDs can deliver but without TLC’s cost, which aligns well with QLC’s capabilities. Because most NAND flash power consumption comes from write operations, and Meta’s target workloads are predominantly read-bandwidth-intensive with lower write demands, QLC is energy-efficient for these workloads. Its write endurance remains lower than TLC’s, however, and would need to improve before QLC suits more write-heavy workloads.
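The 10–20 MB/s/TB band can be read as a simple placement rule. The sketch below is one hypothetical way to express it. Only the QLC band comes from the text; sending everything below the band to HDD and everything above it to TLC is a simplifying assumption, and a real placement policy would also weigh write intensity, endurance, and cost.

```python
# QLC band from the article, in MB/s per TB of stored data.
QLC_MIN_MB_S_PER_TB = 10.0
QLC_MAX_MB_S_PER_TB = 20.0


def classify_tier(required_mb_s_per_tb: float) -> str:
    """Pick a storage tier from a workload's bandwidth-per-terabyte need.

    ASSUMPTION: workloads below the QLC band go to HDD and workloads above
    it go to TLC. This rule is illustrative, not Meta's actual policy.
    """
    if required_mb_s_per_tb < QLC_MIN_MB_S_PER_TB:
        return "HDD"
    if required_mb_s_per_tb <= QLC_MAX_MB_S_PER_TB:
        return "QLC"
    return "TLC"


for need in (4.0, 12.5, 35.0):
    print(f"{need} MB/s/TB -> {classify_tier(need)}")
```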

Meta’s QLC Implementation: A Joint Hardware-Software Approach

Meta has been working with industry partners to promote QLC SSD adoption, most notably Pure Storage. Pure Storage’s DirectFlash Modules (DFM) and DirectFlash software offer a reliable QLC solution that leverages existing NAND packaging technologies and scales to capacities of up to 600TB. Meta is also working with multiple NAND vendors to integrate standard NVMe QLC SSDs, ensuring supplier diversity and competitive costs.

Regarding hardware form factors, Meta notes that while E1.S has performed well in TLC deployments, its limited space for NAND packages makes it unsuitable for long-term QLC scalability. The U.2-15mm form factor, by contrast, offers broader compatibility and can scale up to 512TB. The E3 standard, fragmented into four variants, has not delivered enough added value to be widely adopted. Meta has also developed a server slot design compatible with both DFM and U.2, aiming to raise QLC server byte density to six times that of today’s highest-density TLC servers. Supporting this density and throughput requires stronger CPUs, faster memory, and more capable networking subsystems, which makes hardware-software co-optimization crucial.

On the software side, Meta’s QLC system requires significantly higher throughput than traditional single-server systems, because of its high density and its position as a tier above HDDs. The software stack must therefore distribute data and computation efficiently across multi-core CPUs and multi-socket architectures, minimize data touchpoints, and segregate I/O by type. Pure Storage uses the Linux user-space block driver (ublk) together with io_uring, which enables zero-copy operations in coordination with a user-space Flash Translation Layer (FTL). Other vendors’ NVMe QLC SSDs are accessed directly via io_uring.

QLC read and write throughput differ sharply: read throughput can exceed write throughput by 4x or more. Because read operations are latency-sensitive, Meta must fine-tune its rate controllers and I/O schedulers so that writes do not interfere with reads.
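One common way to keep writes from crowding out latency-sensitive reads is a token-bucket rate limiter on the write path. The sketch below shows the idea only. The article does not describe Meta's actual rate controller, so the class name, parameters, and policy here are all hypothetical. Time is passed in explicitly so the behavior is easy to follow and to test.

```python
from dataclasses import dataclass, field


@dataclass
class WriteTokenBucket:
    """Hypothetical write limiter: admits writes only while tokens remain."""

    rate_bytes_per_s: float      # long-run write budget
    burst_bytes: float           # maximum burst admitted at once
    tokens: float = field(init=False)
    last_time: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so an initial burst is allowed.
        self.tokens = self.burst_bytes

    def try_admit(self, nbytes: float, now: float) -> bool:
        """Refill tokens for elapsed time, then admit the write if it fits."""
        elapsed = max(0.0, now - self.last_time)
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last_time = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # caller should queue or delay the write


# Illustrative numbers only: 100 bytes/s budget with a 200-byte burst.
bucket = WriteTokenBucket(rate_bytes_per_s=100.0, burst_bytes=200.0)
first = bucket.try_admit(150, now=0.0)    # fits within the initial burst
second = bucket.try_admit(100, now=0.0)   # only 50 tokens left, so rejected
third = bucket.try_admit(100, now=1.0)    # refilled to 150 after one second
print(first, second, third)
```

In this sketch, reads bypass the bucket entirely, so a burst of writes is deferred rather than allowed to consume device bandwidth that reads need.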

The Potential of QLC in AI and Industry Applications

The rise of AI is expanding storage demand to support inference workloads and large-scale model storage. Meta believes QLC SSDs are well suited to read-intensive use cases in which datasets are accessed often but rarely overwritten. Research from TrendForce also highlights QLC’s suitability for read-heavy AI workloads, content delivery networks (CDNs), and machine learning applications.

Testing of Solidigm’s D5-P5336 QLC SSD shows promising results in AI training checkpoint tasks. Although QLC lags behind TLC in write-intensive scenarios, its high capacity and efficiency meet the needs of parts of the AI storage pipeline.

Industry-wide, QLC is gaining traction. Pure Storage’s DFM architecture demonstrates that QLC can handle mainstream workloads. Solidigm claims that its 122TB QLC SSD delivers a 3x reduction in rack space, 20% power savings, and 31% lower total cost. Dell’s PowerScale and certain NetApp platforms have also integrated QLC. These developments show QLC’s role shifting from cold storage toward primary storage. Opinions differ, however. Some believe dual-actuator HDDs remain a viable short-term option. Others argue that the endurance-capacity tradeoff makes QLC feasible only for workloads with a read-write ratio above 10:1.

Current Status, Challenges, and Future Outlook

Meta acknowledges that while QLC costs less than TLC, it is not yet price-competitive enough to fully replace HDDs. Its current strength lies in energy efficiency and better performance for specific use cases, such as read-bandwidth-intensive workloads. As NAND vendors improve manufacturing processes (e.g., YMTC claims its QLC endurance has reached TLC levels) and scale production, costs are expected to decline, expanding QLC’s applicability. Challenges persist, however, especially write endurance limitations and complex software integration, both of which demand careful consideration in mixed-workload environments.

Experts also point to other potential technologies, such as pseudo-SLC (pSLC) modes, which dynamically adjust storage density to boost performance. Such alternatives may influence QLC’s market positioning, but Meta’s initiative sets a clear reference path for the industry.

Conclusion: The Real-World Significance and Open Questions of QLC

Meta is striving to position QLC SSDs as a new middle tier in data center storage, an effort to balance cost, capacity, and performance. With its high density and energy efficiency, QLC shows strong potential in hyperscale and AI environments. Successful adoption, however, hinges on overcoming challenges in endurance, cost, and system-level optimization. As the technology matures and the ecosystem evolves, QLC is likely to secure its place in the storage hierarchy, making it a trend worth continued attention.
