Caching: How and why it works

NetKernel’s embodiment of Resource Oriented Computing (ROC) has the unique property of being able to transparently, efficiently and safely (i.e. without returning stale values) cache computation. This is achieved through a process similar to memoisation in functional programming languages, but generalized to apply beyond side-effect-free functions, even to functions whose results can change at non-deterministic times.
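
As a point of contrast, here is what plain memoisation looks like for a pure function. This sketch (all names here are illustrative, not NetKernel API) only works for side-effect-free functions whose results never change; ROC generalizes the idea by keying on whole requests and attaching expiry to each cached entry.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    // Classic memoisation: only safe for pure functions, and entries never expire.
    final class Memoiser<A, R> {
        private final Map<A, R> cache = new ConcurrentHashMap<>();
        private final Function<A, R> fn;

        Memoiser(Function<A, R> fn) { this.fn = fn; }

        R apply(A arg) {
            // Return the previously computed result, or compute and remember it.
            return cache.computeIfAbsent(arg, fn);
        }
    }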

Wikipedia describes caching as:

…a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache… Once the data is stored in the cache, it can be used in the future by accessing the cached copy rather than re-fetching or recomputing the original data.

NetKernel has two distinct caches. Firstly, a Representation Cache keeps response representations keyed against the requests that were issued for them. This is only useful for the usually idempotent request verbs of SOURCE, EXISTS and META. Don’t worry right now about stale cache entries or non-idempotent requests; NetKernel has a rich and powerful mechanism for expiring representations that we will talk about in a minute. Before any request is processed by an endpoint, NetKernel will first check the Representation Cache to see if it can simply pull out a pre-computed representation.

Secondly, a Resolution Cache keeps request resolutions keyed against the requests that are issued. Remember from our previous discussion of Dynamic Resolution how the resolution process can become quite involved when a large modular address space is instantiated within a complex system. This cache acts to shortcut the resolution traversal. The Resolution Cache acts as a backup behind the Representation Cache, so that even when the response to a request cannot be cached NetKernel can start executing the request-processing endpoint immediately.
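
As a rough sketch of where the two caches sit in the request-processing path (every type and method name below is illustrative; the real kernel API differs, and the request scope, discussed later, is also part of the key):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative types only; not the actual NetKernel kernel classes.
    interface IExpiry { boolean isExpired(); }
    interface Endpoint { Response process(Request request); }
    record Request(String identifier, String verb) {}
    record Response(Object representation, IExpiry expiry) {}

    final class KernelSketch {
        private final Map<Request, Response> representationCache = new ConcurrentHashMap<>();
        private final Map<Request, Endpoint> resolutionCache = new ConcurrentHashMap<>();

        Response handle(Request request) {
            // 1. Representation Cache: reuse a finished, still-valid representation.
            Response cached = representationCache.get(request);
            if (cached != null && !cached.expiry().isExpired()) return cached;

            // 2. Resolution Cache: skip the resolution traversal when we already
            //    know which endpoint this request resolves to.
            Endpoint endpoint = resolutionCache.computeIfAbsent(request, this::resolve);

            Response response = endpoint.process(request);
            if (!response.expiry().isExpired()) representationCache.put(request, response);
            return response;
        }

        private Endpoint resolve(Request request) {
            // Stand-in for the full dynamic resolution traversal.
            return req -> new Response("representation of " + req.identifier(), () -> false);
        }
    }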

Representation Validity

HTTP uses a smattering of headers to define response expiration heuristics [Expires, Cache-Control, Last-Modified, If-Modified-Since, ETag]. Part of this proliferation is legacy, and part comes from the many approaches to sharing the best available knowledge of potential expiry between different server implementations, balanced against the latency and cost of the network.

Luckily things are much simpler in ROC. A response has a single IExpiry interface:

boolean isExpired()

The expiry function should be monotonic: it must never return false after it has returned true. I.e. once expired, always expired.

Standard expiry implementations are provided:

  • ALWAYS_EXPIRED - the response should never be reused; the request is not idempotent or lacks referential transparency.
  • NEVER_EXPIRED - the response will always be valid.
  • TIMED_EXPIRY - the response is valid until a given absolute time in the future.
  • DEPENDENT_EXPIRY - the response is valid until any sub-request’s response is invalid. This is usually the default expiration function and ensures that expiration propagates to all dependent resources.
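
A minimal sketch of how two of these could be written against the IExpiry interface above (illustrative; the classes shipped with NetKernel may differ in detail):

    // Written against the IExpiry interface above; illustrative only.
    interface IExpiry { boolean isExpired(); }

    // TIMED_EXPIRY: valid until an absolute time, then expired forever (monotonic).
    final class TimedExpiry implements IExpiry {
        private final long expiresAtMillis;
        TimedExpiry(long expiresAtMillis) { this.expiresAtMillis = expiresAtMillis; }
        public boolean isExpired() { return System.currentTimeMillis() >= expiresAtMillis; }
    }

    // DEPENDENT_EXPIRY: expired as soon as any sub-request's response has expired,
    // so invalidation propagates up through all dependent resources.
    final class DependentExpiry implements IExpiry {
        private final java.util.List<IExpiry> subRequestExpiries;
        DependentExpiry(java.util.List<IExpiry> subRequestExpiries) {
            this.subRequestExpiries = subRequestExpiries;
        }
        public boolean isExpired() {
            for (IExpiry e : subRequestExpiries) if (e.isExpired()) return true;
            return false;
        }
    }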

In addition, custom expiry functions can be programmatically attached to a response, allowing rich and dynamic expiration mechanisms such as watching for modified filesystem files, or the golden thread pattern used to layer expiration semantics over external data sources such as relational databases.
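
For instance, a custom expiry watching a filesystem file might look like this (the class name is hypothetical; it again implements the IExpiry interface above):

    import java.io.File;

    // Hypothetical file-watching expiry: the cached response stays valid until
    // the backing file is modified (or deleted) on disk.
    final class FileModifiedExpiry implements IExpiry {
        private final File file;
        private final long snapshotLastModified;

        FileModifiedExpiry(File file) {
            this.file = file;
            this.snapshotLastModified = file.lastModified();
        }

        public boolean isExpired() {
            // In practice this is monotonic: once the timestamp moves, the
            // entry is treated as expired.
            return file.lastModified() != snapshotLastModified;
        }
    }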

Scope

Earlier in this discussion we said that the Representation and Resolution caches are keyed on the request. The actual fields used from the request are the resource identifier, the request scope, and the request verb. However, to improve hit rates we need to be a bit more subtle. NetKernel 3 employed a user-settable flag, “isContextSensitive”, on each response: if false, the response did not depend upon the request scope, i.e. the context of the request would not affect its response; if true, the response depended upon the request scope, and a request issued with a slightly different scope would be considered distinct and computed independently. NetKernel 4 automatically determines how much of the request’s scope is needed to bound the response. Not only is this transparent, it is optimal.
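
A toy illustration of the idea (the real NetKernel 4 mechanism is internal to the kernel, so everything below is an assumption for illustration): treat the scope as an ordered stack of address spaces and key the cache on just the portion that was actually used to compute the response.

    import java.util.List;

    // Toy sketch of scope-narrowed cache keys; not the kernel's real algorithm.
    // A request's scope is an ordered stack of address spaces.
    record ScopedRequest(String identifier, String verb, List<String> scope) {}

    // If the kernel observes that only the first depthUsed entries of the scope
    // stack influenced the response, it keys the cache on just that prefix.
    // Requests differing only in unused scope then share one cache entry.
    record CacheKey(String identifier, String verb, List<String> usedScope) {
        static CacheKey of(ScopedRequest request, int depthUsed) {
            return new CacheKey(request.identifier(), request.verb(),
                                List.copyOf(request.scope().subList(0, depthUsed)));
        }
    }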

If you found that last paragraph a little dry, you’re certainly not going to be the only one. A possibly easier way to understand the problem we solve here is to consider a real-world request. If I shout out “Hey Peter I want a cup of tea!” what will happen? If I’m sat in the office and Peter is in a good mood, then this request will resolve to Peter Rodgers, CEO of 1060 Research (for want of a better identity), who will jump up and make me a cup of my favorite tea. In this example Peter (the tea-making endpoint) was resolved in the scope of the office. He used his current mood (scope) and my tea preferences (scope) to determine his response. (Of course real-world cups of tea are not cacheable, but let’s pretend they are.) Now if I change my scope by going out into the street and shout “Hey Peter I want a cup of tea!” what will happen? If I’m lucky enough to get a resolution of this request from some passing Peter, then they are not going to have my tea-making preferences in scope and this request will need to be re-evaluated. Let’s say I ring in to the office to speak with Peter and ask for a cup of tea; now my scope is different, however after a puzzled pause Peter says “The cup I made you earlier is still on your desk”. Pulled back from the cache, I get my cup of tea without any effort. Peter didn’t need my location (scope) to know what tea I wanted. OK, I admit it, that was slightly contrived!

Temporal locality

One of the things we came to realize was just how effective caching is. We thought that an inevitable consequence of the high level of abstraction that ROC brings would be a performance overhead. To quote the SOA Manifesto, “Flexibility over optimization”: we expected a compromise in order to gain the malleability and scalability of ROC. But in actuality we got increased performance too. Even small caches can have big effects on performance. It turns out that this is due to temporal locality. Real-world requests tend to form distributions in which many resources are used again and again over a period of time and only a few are one-offs. Even when the requests at the edge of a system are unique, the internals often gain a lot from caching. We find real-world systems typically reduce computation by 50%.
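
As a rough, self-contained demonstration of the effect (the request distribution and all numbers here are entirely illustrative): draw requests from a skewed popularity distribution and put a small LRU cache in front of them, and the cache absorbs a disproportionate share of the traffic.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Random;

    // Toy demonstration of temporal locality: a small LRU cache in front of a
    // skewed request stream still absorbs a large fraction of the work.
    public class LocalityDemo {
        public static void main(String[] args) {
            int resources = 10_000, cacheSize = 500, requests = 100_000;

            // LRU cache via an access-ordered LinkedHashMap.
            Map<Integer, Object> cache = new LinkedHashMap<>(cacheSize, 0.75f, true) {
                protected boolean removeEldestEntry(Map.Entry<Integer, Object> eldest) {
                    return size() > cacheSize;
                }
            };

            Random rnd = new Random(42);
            int hits = 0;
            for (int i = 0; i < requests; i++) {
                // Multiplying two uniform draws skews requests toward low ids,
                // standing in for the few resources that are used again and again.
                int id = (int) (resources * rnd.nextDouble() * rnd.nextDouble());
                if (cache.containsKey(id)) hits++;        // hit: no recomputation
                else cache.put(id, new Object());         // miss: "compute" and cache
            }
            System.out.printf("hit rate: %.1f%%%n", 100.0 * hits / requests);
        }
    }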

I hope this posting has helped explain some of the magic behind caching in NetKernel.