This library is intended to provide conveniences for working with flat data structures. There are a few basic components:
The flatfile package is built on a number of interfaces whose instances can be combined to represent any flat data structure (and some bumpy ones). All interfaces extend java.io.Serializable.
Interface | Description | Basic implementation(s) |
---|---|---|
Entity | A container for some content of 0 or more bytes | Field, DynamicField |
FieldOption | Marker interface for "field" options | Various and open-ended |
EntityCollection | An Entity that is a collection of child Entities | implemented via subinterfaces |
IndexedEntityCollection | Indexed EntityCollection | EntityArray |
NamedEntityCollection | EntityCollection whose children are identified by String keys | EntityMap |
EntityFactory | Describes an object that can return an Entity instance given some Object "cue" | CloningEntityFactory, CompositeEntityFactory, ParserEntityFactory |
The package provides an EntityFactory implementation that associates Entity definitions with String identifiers, read from a custom DSL which is loosely based on COBOL's data definition format. Here are some definitions in this format:

    /* Java multi-line comments are supported */
    // Java single-line comments are supported

    // type foo, length 1:
    foo (1), //the comma is optional

    /* type bar, length 3 with default value "bar".
       Note that string literals, including unicode chars, are as in Java: */
    bar (3) "bar"

    /* type optionalField, length 10, default value of all underscores specified
       using the 'c'* "fill-character" syntax. Again note that character literals,
       including unicode representation, are as in Java: */
    optionalField (10) '_'*

    // type baz, default value "baz", length (3) implicit:
    baz "baz"

    // type delimiter immutable field of length 3 filled with asterisks:
    delimiter (3) '*'*!

    // type blah, immutable value "blah", length (4) implicit:
    blah "blah"!

    // type simpleArray, 3 occurrences of 2 bytes each:
    simpleArray (2) [3]

    // complex type dateYYYYMMDD:
    dateYYYYMMDD {
      year (4)
      month (2)
      day (2)
    }

    // complex type dateRange with type references:
    dateRange {
      start $dateYYYYMMDD
      ? '-'! // anonymous (filler) child with immutable value and implicit length (1)
      end $dateYYYYMMDD
    }

    // type complexArray, 3 occurrences of a named entity collection:
    complexArray {
      a (1)
      b (2)
      c (3)
    } [3]

    // previous example, initialized to all spaces:
    complexArray {
      a (1)
      b (2)
      c (3)
    } [3] ' '*

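To make the layout concrete, here is a minimal sketch of how the `dateRange` definition maps onto a 17-byte record: two 8-byte dates separated by a one-byte immutable filler. It slices a string by hand rather than using the flatfile API, and the class and method names (`DateRangeLayout`, `parseDate`, `parseRange`) are hypothetical. It also assumes a single-byte encoding, so that one character equals one byte.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: slices a record by hand, not via the flatfile API.
// Assumes a single-byte encoding, so character offsets equal byte offsets.
class DateRangeLayout {

    // Widths taken from the dateYYYYMMDD definition: year (4) month (2) day (2).
    static Map<String, String> parseDate(String s) {
        if (s.length() != 8) {
            throw new IllegalArgumentException("expected 8 characters, got " + s.length());
        }
        Map<String, String> date = new LinkedHashMap<>();
        date.put("year", s.substring(0, 4));
        date.put("month", s.substring(4, 6));
        date.put("day", s.substring(6, 8));
        return date;
    }

    // dateRange is start (8) + immutable '-' filler (1) + end (8) = 17 characters.
    static Map<String, Map<String, String>> parseRange(String record) {
        if (record.length() != 17 || record.charAt(8) != '-') {
            throw new IllegalArgumentException("not a dateRange record: " + record);
        }
        Map<String, Map<String, String>> range = new LinkedHashMap<>();
        range.put("start", parseDate(record.substring(0, 8)));
        range.put("end", parseDate(record.substring(9, 17)));
        return range;
    }

    public static void main(String[] args) {
        System.out.println(parseRange("20240101-20241231"));
    }
}
```

The nested maps mirror the named children of the definition; the anonymous filler is validated but not stored, since it has no name to store it under.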
That's nice, but it's not always enough. Field options can be used to zero in on the exact behavior you need from a given field definition. Field options supported:
Option name | Function | Type | Values |
---|---|---|---|
pad | Used when a too-small value is specified | byte | default (byte) 0 |
justify | Specify field justification when a too-small value is specified | PadJustifyFieldSupport$Justify enum | LEFT (default), RIGHT, CENTER |
overflow | Specify behavior on too-large value | FieldSupport$Overflow enum | ERROR (default), IGNORE |
underflow | Specify behavior on too-small value | FieldSupport$Underflow enum | ERROR (default), IGNORE |

    //define an integer field:
    intField (9) pad='0' justify=RIGHT

    //define a field for which overflow is permitted:
    truncateMe (20) overflow=IGNORE

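The following sketch illustrates what these options could mean for a fixed-length field. It is not the library's `FieldSupport` code: the class `FieldOptionsSketch` and its `fit` method are hypothetical. It makes three assumptions: an ignored overflow keeps the leading bytes, CENTER puts any odd leftover pad byte on the right, and every short value is padded. The `underflow` option is left out because the text above does not say how it interacts with padding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of the option semantics; not the library's FieldSupport classes.
class FieldOptionsSketch {
    enum Justify { LEFT, RIGHT, CENTER }
    enum Overflow { ERROR, IGNORE }

    // Fits value into exactly `length` bytes using the pad, justify and overflow options.
    static byte[] fit(byte[] value, int length, byte pad, Justify justify, Overflow overflow) {
        if (value.length > length) {
            if (overflow == Overflow.ERROR) {
                throw new IllegalArgumentException(
                        "value of " + value.length + " bytes exceeds field length " + length);
            }
            // Assumption: an ignored overflow truncates, keeping the leading bytes.
            return Arrays.copyOf(value, length);
        }
        byte[] result = new byte[length];
        Arrays.fill(result, pad);
        int gap = length - value.length;
        int offset;
        switch (justify) {
            case RIGHT:
                offset = gap;
                break;
            case CENTER:
                // Assumption: an odd leftover byte of padding goes on the right.
                offset = gap / 2;
                break;
            default:
                offset = 0;
        }
        System.arraycopy(value, 0, result, offset, value.length);
        return result;
    }

    public static void main(String[] args) {
        // Mirrors: intField (9) pad='0' justify=RIGHT
        byte[] padded = fit("42".getBytes(StandardCharsets.US_ASCII), 9, (byte) '0',
                Justify.RIGHT, Overflow.ERROR);
        System.out.println(new String(padded, StandardCharsets.US_ASCII));
    }
}
```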
It is possible to define an IndexedEntityCollection (implemented by EntityArray) whose number of occurrences is not known:

    anySize (2) []
    acceptableRange (2) [1..5]
    minOccurs (2) [1..]
    maxOccurs (2) [..5]
    optional (2) [..1]

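One way to read the bracket notation is as an inclusive minimum and maximum number of occurrences. The sketch below interprets the text between the brackets on that basis. `OccursRange` is a hypothetical helper, not part of the flatfile API, and it assumes that a missing lower bound means 0 and a missing upper bound means unbounded.

```java
// Hypothetical helper, not part of the flatfile API: interprets the text between the brackets.
class OccursRange {
    final int min;
    final int max; // Integer.MAX_VALUE stands for "no upper bound"

    OccursRange(int min, int max) {
        if (min < 0 || min > max) {
            throw new IllegalArgumentException("invalid range " + min + ".." + max);
        }
        this.min = min;
        this.max = max;
    }

    // Accepts "", "3", "1..5", "1..", and "..5" (the content of [], [3], [1..5], and so on).
    static OccursRange parse(String spec) {
        String s = spec.trim();
        if (s.isEmpty()) {
            return new OccursRange(0, Integer.MAX_VALUE); // [] : any number of occurrences
        }
        int dots = s.indexOf("..");
        if (dots < 0) {
            int n = Integer.parseInt(s);
            return new OccursRange(n, n); // [3] : exactly three
        }
        String lo = s.substring(0, dots).trim();
        String hi = s.substring(dots + 2).trim();
        // Assumption: an omitted bound means 0 (lower) or unbounded (upper).
        int min = lo.isEmpty() ? 0 : Integer.parseInt(lo);
        int max = hi.isEmpty() ? Integer.MAX_VALUE : Integer.parseInt(hi);
        return new OccursRange(min, max);
    }

    // Both bounds are treated as inclusive.
    boolean accepts(int count) {
        return count >= min && count <= max;
    }

    public static void main(String[] args) {
        System.out.println(parse("1..5").accepts(3));
    }
}
```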
The EntityCollection implementations returned by the DSL-based EntityFactory support some handy properties:
Property | Type | Description | Default value |
---|---|---|---|
delim | byte[] | Content to be written between each child entity | byte[0] |
delimAfter | boolean | Whether a delimiter should follow the final child | true |
suppressEmptyChildren | boolean | Whether to suppress children of zero length (and, more importantly, their delimiters) | true |
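How the three properties combine can be sketched as follows. This is a standalone illustration, not the library's writer: `DelimitedWriterSketch` and its `write` method are hypothetical, and the sketch assumes that no trailing delimiter is emitted when no child was written at all.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of delim / delimAfter / suppressEmptyChildren; not the flatfile API.
class DelimitedWriterSketch {

    static byte[] write(List<byte[]> children, byte[] delim,
                        boolean delimAfter, boolean suppressEmptyChildren) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean wroteChild = false;
        for (byte[] child : children) {
            if (suppressEmptyChildren && child.length == 0) {
                continue; // skip the empty child and, with it, its delimiter
            }
            if (wroteChild) {
                out.write(delim, 0, delim.length); // delimiter between children
            }
            out.write(child, 0, child.length);
            wroteChild = true;
        }
        // Assumption: no trailing delimiter when nothing was written.
        if (delimAfter && wroteChild) {
            out.write(delim, 0, delim.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<byte[]> children = Arrays.asList(
                "a".getBytes(StandardCharsets.US_ASCII),
                new byte[0],
                "b".getBytes(StandardCharsets.US_ASCII));
        byte[] delim = ",".getBytes(StandardCharsets.US_ASCII);
        // Defaults from the table: delimAfter=true, suppressEmptyChildren=true
        System.out.println(new String(write(children, delim, true, true), StandardCharsets.US_ASCII));
    }
}
```

With suppression turned off, the empty child still contributes a delimiter, which is why the table calls out the delimiters as the more important effect.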
Occasionally there may be a requirement that fields of unknown length be intermingled with fields of predetermined length or value. Here is an example:

    structure {
      "foo="! fooValue (*) // any length
      "bar="! barValue (1..) // at least length 1
      "baz="! bazValue (..10) // at most length 10
      "blah="! blahValue (3..4) // 3 or 4
    } delim="\r\n"

Option name | Function | Type | Values |
---|---|---|---|
pad | Used when a too-small value is specified | byte | default (byte) 0 |
justify | Specify field justification when a too-small value is specified | PadJustifyFieldSupport$Justify enum | LEFT (default), RIGHT, CENTER |
overflow | Specify behavior on too-large value | FieldSupport$Overflow enum | ERROR (default), IGNORE |
underflow | Specify behavior on too-small value | FieldSupport$Underflow enum | ERROR (default), IGNORE |
You can also set default options for certain types. ParserEntityFactory defines constants that name the types for which this is possible:
Field name | Value |
---|---|
OPTION_FIELD | field |
OPTION_DYNAMIC_FIELD | dynamicField |

    @field justify=CENTER pad=' '; // semicolon indicates end
    @dynamicField underflow=IGNORE

A final feature of the DSL-based EntityFactory is that it can run checks against entities as they are read from the definition file. The only check implemented at this time is the length check, which is specified by appending a colon and the expected length to any entity definition, as shown:

    myRecord {
      a (10)
      ? ' '!
      b (50)
      ? ' '!
      c {
        c1 (20)
        c2 (20)
      }
      ? ' '!
      d (5) [2]
      ? ' '!
      e (24)
      ? ' '!
      f (1)
    } : 140

    multilineRecord {
      foo (2)
      bar (2)
      baz (1)
    } delim=' ' delimAfter=false [10] delim="\r\n" : 90

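The expected lengths in these two definitions can be checked by hand, and the sketch below does that arithmetic. It is only an illustration: `LengthCheckSketch` and its methods are hypothetical, and it reads the options in `multilineRecord` as binding `delim=' ' delimAfter=false` to each row and `delim="\r\n"` (with the default `delimAfter=true`) to the array of 10 rows. That reading is an assumption, though it is the one that yields 90.

```java
import java.util.Arrays;

// Hypothetical arithmetic check for the two definitions; not the library's length check.
class LengthCheckSketch {

    // Sum of the child lengths plus one delimiter between children,
    // and one more after the last child when delimAfter is true.
    static int collectionLength(int[] childLengths, int delimLength, boolean delimAfter) {
        int total = 0;
        for (int len : childLengths) {
            total += len;
        }
        int delims = childLengths.length == 0
                ? 0
                : childLengths.length - (delimAfter ? 0 : 1);
        return total + delims * delimLength;
    }

    static int myRecordLength() {
        int c = collectionLength(new int[] {20, 20}, 0, true); // c { c1 (20) c2 (20) }
        int d = collectionLength(new int[] {5, 5}, 0, true);   // d (5) [2]
        // a, filler, b, filler, c, filler, d, filler, e, filler, f
        return collectionLength(new int[] {10, 1, 50, 1, c, 1, d, 1, 24, 1, 1}, 0, true);
    }

    static int multilineRecordLength() {
        // One row: foo (2) bar (2) baz (1) joined by a space, with no trailing space.
        int row = collectionLength(new int[] {2, 2, 1}, 1, false);
        int[] rows = new int[10];
        Arrays.fill(rows, row);
        // Ten rows, each followed by "\r\n" (2 bytes).
        return collectionLength(rows, 2, true);
    }

    public static void main(String[] args) {
        System.out.println(myRecordLength() + " " + multilineRecordLength());
    }
}
```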
We have covered the core APIs, which aim to represent flat structures of virtually unlimited complexity in a simple way. Next we saw how the provided DSL lets us build the included entity representations using a terse syntax that aims to be at least as clear as the equivalent Java code.

Finally, we can go a step further and provide an efficient means for our flat Entity-based structures to interoperate with Java POJOs. By implementing the reflection and conversion APIs defined by the Morph project, we can provide, for relatively little investment, a simple means to copy data between Entity graphs and POJO graphs. Inserting Entity-aware Reflectors and Transformers at opportune points in a Morph configuration achieves a surprising amount of basic functionality, and more complex things can be accomplished by extending the APIs. Examples of this integration still need to be provided.