P4 Config file to Java Classes - prasadtalasila/BITS-Darshini GitHub Wiki

Tasks for Parsing P4 to Java

Four tasks are identified for parsing a user-supplied P4 header config file and converting it into the required Java classes -

  • Determine the file location, and read the file

The first task is easy enough, since the front-end can upload the file to a predetermined path. That path can be passed to Tomcat through CATALINA_OPTS (for example as a JVM system property) and retrieved easily in Java programs. Reading the file line by line is a standard procedure, and the coding effort for it is small.
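A minimal sketch of this first task, assuming the upload directory is handed to the JVM as a `-D` system property through CATALINA_OPTS. The property name `p4.config.dir` and the class name `P4ConfigReader` are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class P4ConfigReader {
    // Hypothetical property name, e.g. CATALINA_OPTS="-Dp4.config.dir=/some/upload/dir"
    static final String CONFIG_DIR_PROPERTY = "p4.config.dir";

    /** Resolves an uploaded file name against the configured directory. */
    static Path resolve(String fileName) {
        String dir = System.getProperty(CONFIG_DIR_PROPERTY);
        if (dir == null || dir.isEmpty()) {
            throw new IllegalStateException(
                    "System property " + CONFIG_DIR_PROPERTY + " is not set");
        }
        Path base = Paths.get(dir).toAbsolutePath().normalize();
        Path file = base.resolve(fileName).normalize();
        // Reject names such as "../x" that would escape the upload directory.
        if (!file.startsWith(base)) {
            throw new IllegalArgumentException("File is outside the config directory: " + fileName);
        }
        return file;
    }

    /** Reads the whole config file as a list of lines (UTF-8 assumed). */
    static List<String> readLines(String fileName) throws IOException {
        return Files.readAllLines(resolve(fileName), StandardCharsets.UTF_8);
    }
}
```

A caller would then use something like `P4ConfigReader.readLines("ethernet.p4")` and pass the resulting lines on to the parser.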

  • Parse the file and map each header field marked with the extract tag to a field name

Since the P4 config file has a definite structure, parsing it line by line is somewhat easier than parsing unstructured text. The parsing code expects a particular form for each line, and if a line does not match, it fails fast so that a malformed config file is caught early. Thus this part of the work does not involve coding beyond moderate difficulty.
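The fail-fast idea can be sketched as follows. This is only an illustration: it assumes the field declarations follow the P4_14 `fields { name : width; }` form, it does not model the project's extract tag, and the names `P4FieldParser` and `Field` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class P4FieldParser {
    /** One header field: its name and its bit range within the header. */
    static final class Field {
        final String name;
        final int bitOffset;
        final int bitWidth;

        Field(String name, int bitOffset, int bitWidth) {
            this.name = name;
            this.bitOffset = bitOffset;
            this.bitWidth = bitWidth;
        }
    }

    // Assumed line shape inside the fields block: "<name> : <width>;"
    private static final Pattern FIELD = Pattern.compile("^(\\w+)\\s*:\\s*(\\d+)\\s*;$");

    /** Parses the first "fields { ... }" block; throws on the first line that does not fit. */
    static List<Field> parseFields(List<String> lines) {
        List<Field> fields = new ArrayList<>();
        boolean inFields = false;
        int offset = 0;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            if (!inFields) {
                // Everything before the fields block (e.g. "header_type x {") is skipped.
                inFields = line.matches("fields\\s*\\{");
                continue;
            }
            if (line.equals("}")) {
                return fields;
            }
            Matcher m = FIELD.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException(
                        "Line " + (i + 1) + ": expected '<name> : <width>;' but got: " + line);
            }
            int width = Integer.parseInt(m.group(2));
            if (width <= 0) {
                throw new IllegalArgumentException("Line " + (i + 1) + ": width must be positive");
            }
            fields.add(new Field(m.group(1), offset, width));
            offset += width; // fields are assumed to be laid out back to back
        }
        throw new IllegalArgumentException("Missing or unterminated fields block");
    }
}
```

Because each field records a running bit offset, the byte ranges needed later for extraction fall out of the parse directly.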

  • Determine the data type in which each field is to be stored (possibly from the config file itself)

This is one of the major problems to tackle. If different data types (primitives like int and long, and others like String) are used for different header fields, then auto-generation of methods becomes cumbersome. On its own this problem is not very complicated, since the P4 config file format could be altered slightly to include the type of each field as well (much like the extract tag). But since it is coupled with the auto-generation problem, it cannot be handled so lightly.
In both the jNetPcap example and the Pcap4J example, many fields are stored as int but flags are stored as boolean, so there is no uniformity.
One solution, I think, is to store each field as a String. Interpreting a field value from its byte-array range could then follow one of three options -

  1. 1-bit field - Extract the bit (as a byte or an int), convert it to the appropriate boolean value and store its String representation.

  2. < 32 bits and >= 2 bits - Extract and read as an int. Store its String representation.

  3. < 64 bits and >= 32 bits - Extract and read as a long. Store its String representation.

This obviously doesn't take into account specific String representations such as a hex string (for Ethernet addresses) or a '.'-delimited string (for IP addresses), nor does it cover fields of 64 bits or more. Discussion is needed on this point.
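The three options above could be sketched roughly like this. The class name `FieldDecoder` and the bit-numbering convention (bit 0 is the most significant bit of the first header byte, values read as unsigned) are assumptions made for the example, not something the config format is known to prescribe.

```java
class FieldDecoder {
    /** Reads bitWidth bits starting at bitOffset as an unsigned value (1 to 63 bits). */
    static long readBits(byte[] header, int bitOffset, int bitWidth) {
        if (bitWidth < 1 || bitWidth > 63) {
            throw new IllegalArgumentException("Unsupported width: " + bitWidth);
        }
        if (bitOffset < 0 || bitOffset + bitWidth > header.length * 8) {
            throw new IllegalArgumentException("Bit range lies outside the header");
        }
        long value = 0;
        for (int i = bitOffset; i < bitOffset + bitWidth; i++) {
            // Pick bit i, counting from the most significant bit of each byte.
            int bit = (header[i / 8] >> (7 - (i % 8))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    /** Applies the three width-based options and returns the String representation. */
    static String decode(byte[] header, int bitOffset, int bitWidth) {
        long raw = readBits(header, bitOffset, bitWidth);
        if (bitWidth == 1) {
            return String.valueOf(raw == 1);  // option 1: boolean
        }
        if (bitWidth < 32) {
            return String.valueOf((int) raw); // option 2: int (value fits in 31 bits)
        }
        return String.valueOf(raw);           // option 3: long
    }
}
```

For the first bytes of an IPv4 header, `0x45 0x00 0x00 0x54`, `decode(header, 0, 4)` gives "4" (version) and `decode(header, 16, 16)` gives "84" (total length). Hex or dotted representations would need an extra formatting step on top of this.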

  • Auto-generate Java code to extract the header and its fields using the byte ranges provided

Auto-generation of certain code fragments, like the method stubs that the Eclipse IDE generates for a class declared to implement a certain interface, is achievable with moderate coding effort. However, generating complete methods seems a much more difficult problem, especially if parameter names are not uniform and if there are multiple options for interpreting a field, as discussed above.

This problem can have multiple solutions -

  1. The file method - Field extraction and interpretation rules could be hardcoded. The different String representations could be a nightmare, but let's not worry about that for now. Each line of the extractor method could be fixed in a StringBuilder, with only the variable names and byte ranges altered as the P4 config file is parsed. The writing part itself could be easy; however, the generated Java file would also need to be compiled dynamically and the corresponding class added to the WAR file currently being served by the server. I can't think how this could be done without reloading the context on the server, in which case the connection would reset.

  2. Use specialized tools - JavaPoet is an excellent project for generating classes, methods and so on. It essentially does the same thing as the first option, but with much more sophistication. I think this is the option most worth researching further. (Don't forget that the dynamic compilation problem applies here as well.)

  3. Annotation-based generation - Some projects, like Lombok, auto-generate boilerplate methods such as getters, setters, toString and equals through annotation processing. One could study the source code (it's an open source project) and see how it's done. But I think this is the hardest of the options discussed so far.
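To make the file method (option 1) a little more concrete, here is a rough sketch of the StringBuilder approach. Everything in it is illustrative: the class name `ExtractorSourceGenerator` is hypothetical, and the generated code calls a hypothetical helper `FieldDecoder.decode(header, bitOffset, bitWidth)` that is assumed to exist elsewhere. It only produces source text; compiling and loading that source at runtime is the open problem noted in option 1 and is not addressed here.

```java
import java.util.Map;

class ExtractorSourceGenerator {
    /**
     * Builds Java source for a header class.
     * fields maps a field name to {bitOffset, bitWidth}; iteration order is kept as given.
     */
    static String generate(String className, Map<String, int[]> fields) {
        requireIdentifier(className);
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(className).append(" {\n");
        for (String name : fields.keySet()) {
            requireIdentifier(name);
            src.append("    private String ").append(name).append(";\n");
        }
        src.append("\n    public void extract(byte[] header) {\n");
        for (Map.Entry<String, int[]> e : fields.entrySet()) {
            int[] range = e.getValue();
            // Fixed template line; only the name and the bit range vary per field.
            src.append("        ").append(e.getKey())
               .append(" = FieldDecoder.decode(header, ")
               .append(range[0]).append(", ").append(range[1]).append(");\n");
        }
        src.append("    }\n}\n");
        return src.toString();
    }

    // Keeps config-supplied names from injecting arbitrary code (Java keywords are not checked).
    private static void requireIdentifier(String s) {
        if (!s.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a valid identifier: " + s);
        }
    }
}
```

Passing a `LinkedHashMap` keeps the generated fields in header order. A library such as JavaPoet would replace the hand-written string template but would leave the compile-and-load question unchanged.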