Over the last few months more and more folks have been pushing NetKernel in enterprise environments where multiple servers are required, either for anticipated load scaling or for reliability and fail-over. My standard answer has been to create a NetKernel load balancer instance by pairing a NetKernel Protocol (NKP) server with a set of NetKernel Protocol clients talking to a set of NetKernel instances, then put a simple algorithm in the middle to relay the requests to the appropriate place. It sounds quite simple… indeed we’ve created basic load balancing scenarios within the unit tests for NKP.
But then I thought it’d be better for us to create a general-purpose implementation. I’m glad that I did. I got a basic round robin load balancer scenario working in about an hour - yes, Peter could probably have done it quicker, but he’d have created more of an angry bird. Once I started to round things out and dig into the finer details, though, things took a little longer. Here is my list of requirements:
1) Simple instantiation with separate front and rear configurations. NKP has very rich configuration options that allow tailoring of behaviour. Both the NKP transport on the front of the load balancer and the clients on the rear can have their own configurations:
<rootspace name="LB Balance Server" private-filter="false" public="false">
  <transport>
    <prototype>NKPServer</prototype>
    <config>
      <port>20000</port>
    </config>
  </transport>
  <endpoint>
    <prototype>NKPLoadBalancer:Failover</prototype>
    <config>
      <hosts>
        <host>
          <hostname>localhost</hostname>
          <port>20001</port>
        </host>
      </hosts>
      <keepalive>false</keepalive>
    </config>
  </endpoint>
  <import>
    <uri>urn:com:ten60:netkernel:nkp</uri>
    <private> </private>
  </import>
</rootspace>
2) Place overlays between front and back. Because the load balancer is instantiated in the ROC domain with the front server and rear client as separate endpoints, we can interleave overlays between them. There are many possible uses for this: throttling or bandwidth shaping, logging and quality of service monitoring, to name a few. Priority activation is also a possibility, where new instances are started to support increased load based on load or connection pool availability. Yet another possibility is creating heterogeneous load balancers that take HTTP, JMS, or whatever and bridge them down to NKP.
3) Dynamic configuration changes to take hosts in and out of the pool. The load balancer takes a configuration resource (in the example above it is a literal in the module.xml configuration file). However, this resource can change, and the load balancer will automatically adapt, adding or removing hosts from the pool in real time.
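To make the idea of adapting to a changed configuration concrete, here is a minimal sketch in Python (not NetKernel code). It assumes hosts are represented as (hostname, port) tuples and that the balancer compares the pool it is running with against the pool in the new configuration; the function name `reconcile` is hypothetical and is not part of the module.

```python
def reconcile(current, desired):
    """Return (added, removed) hosts when moving from the current pool to the desired one.

    Hosts are (hostname, port) tuples. This representation is an assumption
    made for the sketch, not the load balancer's real data model.
    """
    current_set = set(current)
    desired_set = set(desired)
    # Preserve configuration order so the result is predictable.
    added = [host for host in desired if host not in current_set]
    removed = [host for host in current if host not in desired_set]
    return added, removed


running = [("localhost", 20001), ("localhost", 20002)]
updated = [("localhost", 20002), ("localhost", 20003)]
added, removed = reconcile(running, updated)
print("open connections to:", added)
print("close connections to:", removed)
```

Hosts that appear in both pools are left untouched, so in this sketch their existing connections would carry on serving requests while the change is applied.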
4) Tolerance to failure with dynamic switching. A host can go down at any time, even whilst requests are dispatched to it, and no requests will be lost. Requests always go to a host that is up, and if that host goes down midway through a request, the request is retried on another host. Of course, in the situation where no hosts are left standing the load balancer will fail gracefully.
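The retry behaviour described here can be sketched roughly as follows. This is plain Python rather than NetKernel code, and it rests on a few assumptions: the `send` callable, the `NoHostsAvailable` exception and the use of `ConnectionError` to signal a dead host are all hypothetical stand-ins, and requests are assumed to be safe to issue a second time.

```python
class NoHostsAvailable(Exception):
    """Raised when every host in the pool has failed (hypothetical name)."""


def dispatch(request, hosts, send):
    """Send the request to the first host that accepts it.

    `send(host, request)` is an assumed callable that returns a response or
    raises ConnectionError if the host is down or drops out mid-request.
    """
    failed = []
    for host in hosts:
        try:
            return send(host, request)
        except ConnectionError:
            # The host went away: remember it and retry on the next one.
            failed.append(host)
    # Nobody is left standing, so fail with a clear error.
    raise NoHostsAvailable(f"all hosts failed: {failed}")


def flaky_send(host, request):
    # Simulated transport for the sketch: host "a" is down, the others respond.
    if host == "a":
        raise ConnectionError("host a is down")
    return f"{host} handled {request}"


print(dispatch("req-1", ["a", "b", "c"], flaky_send))
```

The caller only ever sees a response or a single well-defined error, which is one way to read "fail gracefully" when the pool is empty.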
5) State capture and reporting with a control panel visualisation. The load balancer monitors host uptime to generate availability statistics and captures total requests and failures. These are aggregated by a control panel tool which generates and persists historical charts.
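As an illustration of the kind of per-host figures involved, here is a small Python sketch. The `HostStats` class, its field names and the definition of availability as uptime divided by observed time are assumptions made for the example; the real module may record and aggregate these differently.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Per-host counters (hypothetical structure for this sketch)."""
    up_seconds: float = 0.0
    down_seconds: float = 0.0
    requests: int = 0
    failures: int = 0

    def record_interval(self, seconds, up):
        # Accumulate observed time as either uptime or downtime.
        if up:
            self.up_seconds += seconds
        else:
            self.down_seconds += seconds

    def record_request(self, ok):
        # Count every request, and separately count the ones that failed.
        self.requests += 1
        if not ok:
            self.failures += 1

    @property
    def availability(self):
        # Fraction of observed time the host was up; 0.0 before any observation.
        total = self.up_seconds + self.down_seconds
        return self.up_seconds / total if total else 0.0


stats = HostStats()
stats.record_interval(90, up=True)
stats.record_interval(10, up=False)
stats.record_request(ok=True)
stats.record_request(ok=False)
print(stats.availability, stats.requests, stats.failures)
```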
6) Pluggable balancing algorithms. When I looked, as I do, at the Wikipedia page on load balancing, my first thought was that there were so many different variants driven by different business needs that it’s essential that alternate algorithms can be plugged in easily. I first thought it’d be nice to create a pluggable load balancer where each request’s routing was determined by an ROC request, just like the pluggable overlay is a generic overlay driven by an ROC request. However, each case I looked at needed access to different, disparate information such as the status of the connection pool or incoming request details. So for the moment I’ve just created sub-classed endpoints that contain the minimal differences between each case, currently:
- Round Robin - select the next available host for each request
- Session Affinity - based on an application-specified header, which could be user name, remote hostname, geographic region or whatever, ensure that requests with equal header values are always routed to the same host in the pool.
- Failover - always use the first host in the pool unless it becomes unavailable, then try the second, third, etc
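To show how small the differences between the three cases can be, here is a rough Python sketch of the selection step only. It is not the module's code: the class names, the `select(hosts, request)` signature and the `affinity-key` header name are all hypothetical, and hashing the header value modulo the pool size is just one possible way to get affinity. `hosts` is assumed to be the list of currently available hosts in pool order.

```python
import hashlib
from itertools import count


class RoundRobin:
    """Select the next available host for each request."""

    def __init__(self):
        self._counter = count()

    def select(self, hosts, request=None):
        return hosts[next(self._counter) % len(hosts)]


class SessionAffinity:
    """Route equal header values to the same host while the pool is unchanged."""

    def __init__(self, header="affinity-key"):  # header name is an assumption
        self._header = header

    def select(self, hosts, request):
        key = request[self._header]
        # A stable hash, so the mapping does not change between processes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return hosts[int.from_bytes(digest[:8], "big") % len(hosts)]


class Failover:
    """Always use the first available host in pool order."""

    def select(self, hosts, request=None):
        return hosts[0]


pool = ["host1:20001", "host2:20001", "host3:20001"]
rr = RoundRobin()
print([rr.select(pool) for _ in range(4)])
print(SessionAffinity().select(pool, {"affinity-key": "alice"}))
print(Failover().select(pool[1:]))  # host1 is down, so host2 is chosen
```

Note that with the modulo approach used here, adding or removing a host can move existing keys to a different host; a real implementation might prefer a scheme that disturbs fewer sessions.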
We are planning to release the load balancer as a module for enterprise edition within the next couple of weeks.