Programmer-managed GPU memory is a major challenge in writing GPU applications. To obtain both portability and performance, programmers must rewrite and re-optimize existing code whenever it targets a different GPU memory size. Alternatively, they can achieve portability alone by disabling GPU memory, at the cost of significant performance degradation. In this paper, we propose ScaleGPU, a novel GPU architecture to ena…