Dynamic Grammars

Last week I received a request for advice on an interesting problem from Grégoire Colbert. He’s given me permission to share his request and I hope to share with you the solution.

Let’s imagine that I have a constantly evolving OWL file of living species. I want NetKernel to read this OWL resource, and allow the user to get a representation for each entry in the OWL file:

> /birds/sparrow
> /fishes/tuna 
The difficulty is that when NetKernel starts, I (as a programmer) don’t have any idea of what the OWL file will contain, and so I cannot write a static grammar with /birds/sparrow and /fishes/tuna. Still, I don’t want to use a wildcard. I want the endpoints to be created dynamically but from a solid source.

Most of the endpoints you will create as a developer will have static grammars, but this example is a case where a dynamic grammar is desired. We do have a number of endpoints in NetKernel that are driven by dynamic grammars, but how to achieve this is not well documented. Examples include the mapper and Virtual Host endpoints.

As is usual with NetKernel, there are a number of ways this can be achieved. My first thought was to override the onResolve() method that all endpoints implement. This is usually hidden away in base classes, and rightly so: it can be dangerous and hard to implement because it takes a certain amount of low-level knowledge. I don’t really want to open that Pandora’s box if it’s not necessary!

Grégoire suggested a customised grammar implementation that uses an internal hashmap to make large grammars perform well. That is possible too. However, I’ve always been pleasantly surprised by the performance of grammars and their regular expression foundations, so I don’t want to worry about that unless it proves a real concern. Besides, this still leaves open the mechanics of how to create an endpoint with a dynamic grammar.
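
To make the hashmap idea concrete, here is a minimal plain-Java sketch of what a hash-backed matcher might look like. It is not a NetKernel grammar implementation and uses no NetKernel API. The class name HashedPathMatcher, the base identifier and the paths are all hypothetical values chosen for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical hash-backed matcher; an illustration only, not a NetKernel grammar. */
final class HashedPathMatcher {
    private final String base;
    private final Set<String> paths;

    HashedPathMatcher(String base, List<String> paths) {
        this.base = base;
        this.paths = new HashSet<>(paths);
    }

    /** Returns the path after the base when it is known, otherwise null. */
    String resolve(String identifier) {
        if (!identifier.startsWith(base)) {
            return null;
        }
        String path = identifier.substring(base.length());
        // A single hash lookup, however many paths are registered.
        return paths.contains(path) ? path : null;
    }

    public static void main(String[] args) {
        HashedPathMatcher matcher = new HashedPathMatcher(
                "res:/species1/", List.of("birds/sparrow", "fishes/tuna"));
        System.out.println(matcher.resolve("res:/species1/fishes/tuna"));
        System.out.println(matcher.resolve("res:/species1/fishes/cod"));
    }
}
```

The lookup cost stays roughly constant as the set grows, which is the attraction of the approach. Whether that matters in practice depends on how large the grammar gets.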

Peter Rodgers suggested just using the mapper overlay with a dynamic configuration resource. The idea is to take some configuration resource and dynamically transform it into a mapper configuration that sits in front of our endpoint, which would have a static grammar. This approach will certainly work. If you’re a disciple of the school of no code then that might even be your preferred solution. However, I was looking for a “tighter” solution with less indirection.

The solution I propose involves extending the endpoint base class in the standard module jar:

org.netkernel.module.standard.endpoint.ConfiguredEndpointImpl

This endpoint abstracts the idea that all the behaviour of the endpoint is contained within a configuration that is dynamically generated from a resource. So whenever the resource changes, the configuration will be regenerated based upon the new state. The behaviour that we can control is not just the internal handling of requests but also the logical endpoints (including grammars) that this (physical) endpoint will expose.

Here is the configuration document we’ll use. We’ll expose it using a fileset from a static file in the module (one that we can edit dynamically); however, it could come from anywhere. The base attribute on the document element provides a base identifier, to which the species group and item are appended.

<groups base="res:/species1/">
	<group name="birds">
		<item name="sparrow"/>
		<item name="hawk"/>
		<item name="blackbird"/>
	</group>
	<group name="fishes">
		<item name="tuna"/>
		<item name="salmon"/>
		<item name="swordfish"/>
	</group>
</groups>
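
To see which identifiers this document is meant to describe, the following plain-Java sketch parses it with the JDK’s DOM parser and concatenates base, group name and item name. It deliberately avoids NetKernel’s own HDS API and is not part of the demo module. The class name SpeciesIdentifiers and its method are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that lists the identifiers a groups document describes. */
final class SpeciesIdentifiers {
    static List<String> identifiersFrom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();
        String base = root.getAttribute("base");
        List<String> identifiers = new ArrayList<>();
        NodeList groups = root.getElementsByTagName("group");
        for (int g = 0; g < groups.getLength(); g++) {
            Element group = (Element) groups.item(g);
            NodeList items = group.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // identifier = base + group name + "/" + item name
                identifiers.add(base + group.getAttribute("name") + "/" + item.getAttribute("name"));
            }
        }
        return identifiers;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<groups base=\"res:/species1/\">"
                + "<group name=\"birds\"><item name=\"sparrow\"/><item name=\"hawk\"/></group>"
                + "<group name=\"fishes\"><item name=\"tuna\"/></group>"
                + "</groups>";
        identifiersFrom(xml).forEach(System.out::println);
    }
}
```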

Here is the source code of the accessor:

public class DynamicGrammar1 extends ConfiguredEndpointImpl
{
 /** Create new configuration at startup and every time configuration changes. Configuration
  * is SOURCEd from config parameter.
  */
 @Override
 protected IConfig createConfig(INKFRequestContext aContext) throws Exception
 { IHDSNode configDoc=aContext.source("param:config",IHDSNode.class);
  return new Config(configDoc,this,aContext);
 }
 
 @Override
 protected void destroyConfig(IConfig aConfig) throws Exception
 { // nothing to do here - only needed when config needs specific teardown code
 }
  
 /** Config class implements IConfig interface which includes method to supply necessary metadata and provide an
  * onRequest implementation.
  */
 private static class Config implements IConfig
 { 
  /** the identifier for single logical endpoint we expose */
  private final String mEndpointId;
  /** the metadata for the single logical endpoint we expose */
  private final IEndpointMeta mEndpointMeta;
  /** the space elements structure showing which logical endpoints we expose and how they are resolved */
  private final ISpaceElements mElements;
   
  public Config(IHDSNode aConfigDoc, DynamicGrammar1 aEndpoint, INKFRequestContext aContext) throws Exception
  { 
   // create our space elements structure consisting of a single logical endpoint
   TupleList parts = new TupleList(3,1);
   mEndpointId=Utils.generateUniqueEndpointIdentifier(aEndpoint.getSpace(), aEndpoint, null);   
   parts.put(new SimpleIdentifierImpl(mEndpointId), null,aEndpoint);
   mElements = new SpaceElementsImpl(parts);
 
   // create metadata for logical endpoint
   String base=aConfigDoc.getFirstValue("/groups/@base").toString();
   EndpointMetaBuilder emb=new EndpointMetaBuilder();
   emb.setVerbs(INKFRequestReadOnly.VERB_SOURCE);
   emb.setName("Dynamic Grammar example at "+base);
   IGrammarGroup grammar=BNFGrammarFactory.createGroup(null);
   BNFGrammarFactory.createText(grammar, base);
   IGrammarGroup choice=BNFGrammarFactory.createChoiceGroup(grammar,"path",null);
    
   //iterate over config document to fill out all of grammar
   for (IHDSNode group : aConfigDoc.getNodes("/groups/group"))
   { String groupName=group.getFirstValue("@name").toString();
    for (IHDSNode item : group.getNodes("item"))
    { String itemName=item.getFirstValue("@name").toString();
     BNFGrammarFactory.createText(choice, groupName+"/"+itemName);
    }
   }
   emb.setGrammarRoot(grammar);
   IMessages messages=aContext.getKernelContext().getKernel().getLogger();
   mEndpointMeta=emb.getMeta(messages);
  }
 
  /** Return logical endpoints we expose
   */
  @Override
  public ISpaceElements getElements(INKFRequestContext aContext) throws Exception
  { return mElements;
  }
 
  /** Return dynamic spaces we instantiate (none)
   */
  @Override
  public ISpace[] getSpaces(INKFRequestContext aContext) throws Exception
  { return ConfiguredEndpointImpl.NO_SPACES;
  }
 
  /** Return metadata for the single logical endpoint we expose
   */
  @Override
  public IMetaRepresentation getElementMeta(String aElementId)
  { if (aElementId.equals(mEndpointId))
   { return mEndpointMeta;
   }
   else
   { return null;
   }
  }
 
  /** See if our logical endpoint can resolve the request
   */
  @Override
  public String getResolvedElement(INKFResolutionContext aContext) throws Exception
  { INKFRequestReadOnly toResolve=aContext.getRequestToResolve();
   if (MetadataUtils.match(mEndpointMeta, toResolve, aContext))
   { return mEndpointId;
   }
   else
   { return null;
   }
  }
 
  /** Dummy implementation of onRequest. Here we will receive all the requests we say we
   * can evaluate. This implementation could also use the configuration to drive processing. 
   */
  @Override
  public void onRequest(String aElementId, INKFRequestContext aContext) throws Exception
  { String result="Request for "+aContext.getThisRequest().getArgumentValue("path");
   aContext.createResponseFrom(result).setExpiry(INKFResponse.EXPIRY_ALWAYS);
  }
 }
}
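
As a rough mental model of the grammar built in the Config constructor, the sketch below expresses the same shape as a java.util.regex pattern: a literal base followed by a choice of literal paths captured in a group named "path". This is only an analogy in plain Java, not how NetKernel compiles or evaluates grammars. The class GrammarAsRegex and its methods are hypothetical.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical regex analogue of "base text + named choice group". */
final class GrammarAsRegex {
    /** Builds a pattern of the form base(?<path>alt1|alt2|...), with every literal quoted. */
    static Pattern compile(String base, List<String> paths) {
        String choice = paths.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(Pattern.quote(base) + "(?<path>" + choice + ")");
    }

    /** Returns the captured "path" value, or null when the identifier does not match. */
    static String resolvePath(Pattern grammar, String identifier) {
        Matcher m = grammar.matcher(identifier);
        return m.matches() ? m.group("path") : null;
    }

    public static void main(String[] args) {
        Pattern grammar = compile("res:/species1/",
                List.of("birds/sparrow", "birds/hawk", "fishes/tuna"));
        System.out.println(resolvePath(grammar, "res:/species1/birds/sparrow"));
        System.out.println(resolvePath(grammar, "res:/species1/birds/dodo"));
    }
}
```

The captured group plays the role of the "path" argument that onRequest reads with getArgumentValue("path"). An identifier that is not in the configuration simply fails to match, so no wildcard is involved.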

We declare this endpoint in our module’s rootspace with a config parameter, which is the resource identifier that we’ll SOURCE to get the configuration.

<endpoint>
	<config>res:/demo/dynres/pub/config1.xml</config>
	<class>resources.endpoints.DynamicGrammar1</class>
</endpoint>

I’ve made the complete demo module available here. Also included is an alternate version of the accessor which creates separate logical endpoints for each species rather than lumping them into one grammar. You’ll be able to see the difference by using the space explorer. Also included are some rudimentary unit tests.