HPC systems continue to grow, both in size (measured, for example, by the number of CPUs) and in the amount of cache per CPU, especially L3 cache. As a result, it is now possible to access up to a terabyte of cache distributed across a whole HPC cluster. This allows a new paradigm for HPC in which computations run only in cache, avoiding memory bottlenecks entirely. We would treat CPU memory as a high-speed disk system, with up to petabytes of storage across an HPC cluster, and investigate a range of memory optimizations to enhance performance. The goal of this project is to demonstrate the potential speedups of cache-only programming and to develop the tools and methods that enable it, with the aim of closing the gap between CPU and GPU performance.