Saving Energy With Contextual Caching

Energy usage by human society is rightly a major concern. Not because using energy is intrinsically bad, but because our methods of obtaining enough usable energy are damaging the planet.

Energy usage by IT continues to grow [1] and is now a non-trivial proportion of global energy use [2]. Even as our technology becomes more energy efficient, thanks to the hard work of many smart people, our total energy requirements continue to grow. The demands of our information-driven society show no signs of abating. Much of this energy is used in the data centres which power the backend services behind the web sites and apps we use daily.

Incremental efficiency savings are not enough. Consider an analogy with goods distribution logistics: logistics companies buy the latest energy-efficient vehicles and maintain them well. However, they are also smart about when and where those vehicles are driven. They earn their use of the word logistics because they employ sophisticated algorithms to plan routes, combining many deliveries and collections to minimise the distance driven.

Contrast this with the naive approach of making an individual journey from the hub for every collection and delivery. This planning saves far more energy (many thousands of per cent) than the modest savings of running efficient vehicles. Our current IT systems are asked the same questions repeatedly, and are often asked very similar questions too, usually within short periods of time. Could they not combine journeys as well?

It is not immediately obvious that this question can be answered in the affirmative. However, the answer is an unequivocal yes. What is particularly non-apparent is the extent to which this is true. The problem, however, is recognised. We employ an approach called caching, which uses various techniques to return information much faster than would otherwise be possible, i.e. without making the full journey again and again.

One particularly well-known example of a cache is within a microprocessor architecture. This cache stores blocks of memory very close to the processor, indexed by their address in main memory. When the processor asks for the value stored in a particular memory location the cache is checked first. If the value is there then good: the processor gets it very quickly. If it is not, the block of memory is loaded from main memory into the cache, displacing some old block.
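
As a rough illustration, here is a minimal sketch in Java of that lookup logic, modelled on a direct-mapped cache. The slot count, block size and the MainMemory interface are all invented for the example; real processor caches are far more sophisticated:

```java
// Sketch of direct-mapped cache lookup: the address selects a slot, a stored
// tag confirms whether the slot really holds that block, and a miss displaces
// whatever block occupied the slot before.
public class DirectMappedCache {
    private static final int NUM_SLOTS = 256;   // cache capacity in blocks (illustrative)
    private static final int BLOCK_SIZE = 64;   // bytes per block (illustrative)

    private final long[] tags = new long[NUM_SLOTS];
    private final byte[][] blocks = new byte[NUM_SLOTS][];

    public byte[] read(long address, MainMemory memory) {
        long blockNumber = address / BLOCK_SIZE;
        int slot = (int) (blockNumber % NUM_SLOTS); // which slot this address maps to

        if (blocks[slot] != null && tags[slot] == blockNumber) {
            return blocks[slot];                    // hit: answer without the full journey
        }
        // Miss: fetch from main memory, displacing the old block in this slot.
        blocks[slot] = memory.loadBlock(blockNumber * BLOCK_SIZE, BLOCK_SIZE);
        tags[slot] = blockNumber;
        return blocks[slot];
    }

    /** Stand-in for main memory; hypothetical, just to make the sketch self-contained. */
    public interface MainMemory {
        byte[] loadBlock(long startAddress, int length);
    }
}
```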

Another example of a cache is a web cache. A web cache keeps web pages and other resources, such as images, in a close location such as the browser's filesystem. Again, this is done to speed up the user's browsing experience, but it also has the effect of reducing the number of requests sent to the web server. When web servers serve requests they attach metadata to their responses. This metadata often contains guidance on how, if at all, the response may be cached. This is why some requests, such as pesky tracking beacons, always get sent, while the big images used in modern web page designs are cached.
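
The rules of HTTP caching are rich, but the essence of that metadata check can be sketched as follows. This deliberately simplified Java fragment looks only at the Cache-Control header and ignores the many other rules a real browser applies:

```java
// Simplified interpretation of the Cache-Control metadata attached to a response.
import java.util.Locale;

public class ResponseCachePolicy {

    /** Returns how many seconds the response may be reused, or 0 if it must not be cached. */
    public static long freshnessLifetimeSeconds(String cacheControlHeader) {
        if (cacheControlHeader == null) {
            return 0;
        }
        String header = cacheControlHeader.toLowerCase(Locale.ROOT);
        if (header.contains("no-store") || header.contains("no-cache")) {
            return 0; // e.g. tracking beacons: always make the full journey
        }
        for (String directive : header.split(",")) {
            String d = directive.trim();
            if (d.startsWith("max-age=")) {
                return Long.parseLong(d.substring("max-age=".length()));
            }
        }
        return 0; // no explicit lifetime given; be conservative
    }

    public static void main(String[] args) {
        // A big page image might arrive with: Cache-Control: max-age=86400
        System.out.println(freshnessLifetimeSeconds("max-age=86400"));  // 86400
        // A tracking beacon typically arrives with: Cache-Control: no-store
        System.out.println(freshnessLifetimeSeconds("no-store"));       // 0
    }
}
```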

There are a few characteristics of these caches that are worth pointing out:

  • Firstly, these caches require an identifier for the information they want or the questions they ask. This identifier must be unique within the cache, whether it be a 64-bit memory address or a URL.
  • Secondly, caches rely on the same questions being asked again within a certain time period to be useful. Because caches must, physically and practically, be of limited size, they have to decide what is kept when they cannot hold everything. It turns out that the universe is kind to our caches: because of locality of reference, and the concentrating effect of power laws and normal distributions, real-world systems see a much higher probability of repeat questions. (A sketch of such a bounded cache follows this list.)
  • Thirdly, these caches are placed at a specific location within an IT architecture. This might seem obvious, and indeed it is… However, by having a cache at a definite single location we are avoiding a problem. Great, you say! Don't we normally find solutions that avoid other problems? Indeed we do, but sometimes such a solution holds us back and stops us stepping off the plane into somewhere new. In this story the somewhere new is the application of caching to computation itself.
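
To make the first two characteristics concrete, here is a minimal Java sketch of a bounded cache: entries are keyed by a unique identifier and, because the cache cannot hold everything, the least recently used entry is displaced first. Locality of reference is what makes so simple a policy effective:

```java
// A bounded cache: unique keys, limited size, least-recently-used eviction.
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // access order: recently used entries move to the back
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // displace the least recently used entry
    }
}
```

Keyed by URL (`new BoundedCache<String, byte[]>(1000)`) this behaves like a tiny web cache; keyed by block number, it resembles the processor cache sketched earlier.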

What I want to talk about now is, on the one hand, a completely obvious observation about the world and yet, on the other, exactly the concept that is missing from our information processing systems.

When we, as humans, ask a question we use context. Without context a question has no meaning. Take, for example, the question “What colour is the sky?” If I don’t use the context of the English language then that string of 23 characters could just be random noise [3].

But context is not so simple. Most people might answer “the sky is blue”, but if you’re on the moon you’d say “black”, or if in England, “grey”. The general situation is that any question is asked with an associated set of contexts, of which some may be relevant and others not. So, again for example, if I ask “Is it nice?” of a person eating a cake then the quality of the cake will affect their answer. It will, hopefully, not affect their answer to the question “what is 1+1?”

That single platonic world of mathematical truths is somewhat like the single place in an information architecture where a cache is placed. With a single common context, only the identity of the question affects the answer: two plus two is always four.

Can’t context just be made part of the question, e.g. “What colour is the sky in England?” Is it really any different? Short answer: sometimes, but in the general case, no. The reason is twofold:

  • Firstly, as we’ve previously stated, usually not all of the context in which a question is asked affects the answer. Consider the same question, “Is that apple sweet?”, asked of a person eating an apple in a red room and again in a blue room. If we combined the full context with the question then we’d have two different questions, with no way to determine that they’re actually the same question with the same answer. (See the sketch after this list.)

  • Secondly, context actually determines how a question is parsed and understood. So even if we could encode context into a question, we would still need some global context to determine how to parse the question-specific context and the encoding of the question.
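
To illustrate the first point, here is a hypothetical Java sketch of deriving a cache key from a question plus only the slice of context the question actually depends on. The Context type, its relevantTo method, and the dependency list are all invented for the example, not any real API:

```java
// Cache keys built from a question plus only its relevant context: two askings
// in different rooms collapse to the same key when room colour is irrelevant.
import java.util.Map;
import java.util.TreeMap;

public class ContextualKey {

    /** The full situation in which a question is asked. */
    public record Context(Map<String, String> facts) {
        /** Keep only the facts a given question depends on, in a stable order. */
        Map<String, String> relevantTo(Iterable<String> dependencies) {
            Map<String, String> relevant = new TreeMap<>();
            for (String d : dependencies) {
                if (facts.containsKey(d)) relevant.put(d, facts.get(d));
            }
            return relevant;
        }
    }

    public static String keyFor(String question, Context ctx, Iterable<String> dependencies) {
        return question + "|" + ctx.relevantTo(dependencies);
    }

    public static void main(String[] args) {
        Context redRoom  = new Context(Map.of("room", "red",  "apple", "braeburn"));
        Context blueRoom = new Context(Map.of("room", "blue", "apple", "braeburn"));
        Iterable<String> deps = java.util.List.of("apple"); // room colour is irrelevant

        // Same question, same relevant context => identical keys,
        // so one cached answer serves both askings.
        System.out.println(keyFor("Is that apple sweet?", redRoom,  deps));
        System.out.println(keyFor("Is that apple sweet?", blueRoom, deps));
    }
}
```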

So why is context important to computation? What is the point? The answer is, of course, computational efficiency. To step back to the distribution logistics analogy: low-level optimisations of the hardware and software stack are akin to improving the fuel efficiency of the trucks. Adding point caching layers is akin to adding regional distribution hubs. Adding intrinsic contextual caching is akin to using holistic route optimisation to minimise the distance driven whilst still delivering consignments on time. All three are important, to differing degrees.

Intrinsic contextual caching is the ability to uniquely identify the computation that is occurring, such that when the same computation occurs again the result can be pulled from cache. It can recognise that computations made in partially different contexts are the same computation when the differing context doesn’t affect the outcome. Nor is this caching just a veneer: by recursively decomposing a process or algorithm into its functional parts, each small part of the whole can be cached independently. Thereby we gain caching benefits on just the parts that are repeated, even when the whole is not.
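
As a toy demonstration of that last point, the sketch below caches every sub-result of a recursively decomposed computation, with Fibonacci standing in for any decomposable process. A different “whole” question can then be answered entirely from its cached parts:

```java
// Every sub-result of a recursive computation is cached under its own key,
// so evaluating f(40) after f(42) reuses all the shared sub-computations
// even though the two "whole" questions differ.
import java.util.HashMap;
import java.util.Map;

public class DecomposedMemo {
    private final Map<Integer, Long> cache = new HashMap<>();
    private long computations = 0;

    long f(int n) {
        Long hit = cache.get(n);
        if (hit != null) return hit;        // this part was done before: reuse it
        computations++;
        long result = (n < 2) ? n : f(n - 1) + f(n - 2);
        cache.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        DecomposedMemo memo = new DecomposedMemo();
        memo.f(42);
        long before = memo.computations;
        memo.f(40);                          // a different "whole" question...
        // ...but every part of it was already cached, so no new work was needed.
        System.out.println(memo.computations - before); // prints 0
    }
}
```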

Intrinsic contextual caching hasn’t been done before because it requires a number of distinct new concepts:

  • the ability to create an identifier for each and every piece of information, and for derivative computations on that information;
  • an abstraction for context that encompasses the concept of scope in programming languages but steps outside of any single language;
  • an understanding of how recursive computational evaluation progressively affects the context relevant to bounding a computation;
  • a mechanism to deal with the reality that information and context are not static, so that cached state can be invalidated based on changes propagated from the source rather than by simplistic time-period heuristics (sketched after this list);
  • a practical and hardened runtime so that these concepts can be used to solve real-world problems.
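
To give a flavour of that invalidation idea, here is a hedged Java sketch of a cache whose entries record the source resources they were derived from, so that a change to a source expires exactly its dependents rather than relying on a timer. All names are hypothetical; this is the shape of the idea, not NetKernel’s actual implementation:

```java
// Cached results record their sources; a source change invalidates
// precisely the dependent entries, not the whole cache.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class DependencyCache {
    private final Map<String, Object> results = new HashMap<>();
    // source identifier -> cached results derived (directly or not) from it
    private final Map<String, Set<String>> dependents = new HashMap<>();

    public void put(String resultId, Object value, Set<String> sourceIds) {
        results.put(resultId, value);
        for (String source : sourceIds) {
            dependents.computeIfAbsent(source, s -> new HashSet<>()).add(resultId);
        }
    }

    public Object get(String resultId) {
        return results.get(resultId);
    }

    /** Called when a source changes: invalidate only what depended on it. */
    public void sourceChanged(String sourceId) {
        Set<String> stale = dependents.remove(sourceId);
        if (stale != null) {
            stale.forEach(results::remove);
        }
    }
}
```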

We hope that, with the widespread adoption of these ideas, there can be a significant impact on global IT energy consumption.

To paraphrase Albert Einstein:
> Insanity: doing the same thing over and over again when you know the result will be the same.

To get an idea of how these concepts affect energy use, see my previous post:
Reducing Power Consumption with ROC

For (slightly more) in-depth coverage of how caching works in NetKernel, see
Caching: How and Why it Works