You've been doing it for a few years.
Memory bandwidth optimization

Note that what I'm saying is x86-centric, with specific illustrations for the Pentium II/III, although it also applies to AMD processors and even to other platforms (like PowerPC). The more PC- or workstation-like the platform, the more this holds true. DSPs with SRAMs aren't anything like this, though -- programmer beware.

