In this paper we present DRAMSim2, a cycle-accurate memory system simulator. The goal of DRAMSim2 is to be an accurate and publicly available DDR2/3 memory system model that can be used in both full-system and trace-based simulations. We describe the process of validating DRAMSim2 timing against manufacturer Verilog models in an effort to prove the accuracy of simulation results. We outline t…
We present a system architecture that uses high-efficiency processors as opposed to high-performance processors, NAND flash as byte-addressable main memory, and high-speed DRAM as a cache front-end for the flash. The main memory system is interconnected and presents a unified global address space to the client microprocessors. A single cabinet contains 2550 nodes, networked in a highly redundant…
Current ultra-high-performance computers execute instructions at the rate of roughly 10 PFLOPS (10 quadrillion floating-point operations per second) and dissipate power in the range of 10 MW. The next generation will need to execute instructions at EFLOPS rates, 100× as fast as today’s, but without dissipating any more power. To achieve this challenging goal, the emphasis is on power-effici…