
This library is intended to provide conveniences for working with flat data structures. There are a few basic components:

Entity API

The flatfile package is built on a small set of interfaces whose instances can be combined to represent any flat data structure (and some bumpy ones). All interfaces extend java.io.Serializable.

| Interface | Description | Basic implementation(s) |
| --- | --- | --- |
| Entity | A container for some content of 0 or more bytes | Field, DynamicField |
| FieldOption | Marker interface for "field" options | Various and open-ended |
| EntityCollection | An Entity that is a collection of child Entities | Implemented via subinterfaces |
| IndexedEntityCollection | Indexed EntityCollection | EntityArray |
| NamedEntityCollection | EntityCollection whose children are identified by String keys | EntityMap |
| EntityFactory | Describes an object that can return an Entity instance given some Object "cue" | CloningEntityFactory, CompositeEntityFactory, ParserEntityFactory |

See also Flatfile API

Data Definition DSL

The Basics

The package provides an EntityFactory implementation that associates Entity definitions with String identifiers. The definitions are written in a custom DSL that is loosely based on COBOL's data definition format. Here are some definitions in this format:

/* Java multi-line comments are supported */
// Java single-line comments are supported
// type foo, length 1:
foo (1), //the comma is optional

/* type bar, length 3 with default value "bar".
   Note that string literals, including unicode chars, are as in Java:
 */
bar (3) "bar"

/* type optionalField, length 10, default value of all underscores
   specified using the 'c'* "fill-character" syntax.
   Again note that character literals, including unicode representation, are as in Java:
 */
optionalField (10) '_'* 

// type baz, default value "baz", length (3) implicit:
baz "baz"

// type delimiter, immutable field of length 3 filled with asterisks:
delimiter (3) '*'*!

// type blah, immutable value "blah", length (4) implicit:
blah "blah"!

// type simpleArray, 3 occurrences of 2 bytes each:
simpleArray (2) [3]

// complex type dateYYYYMMDD:
dateYYYYMMDD {
  year (4)
  month (2)
  day (2)
}

// complex type dateRange with type references:
dateRange {
  start $dateYYYYMMDD
  ? '-'! // anonymous (filler) child with immutable value and implicit length (1)
  end $dateYYYYMMDD
}

// type complexArray, 3 occurrences of a named entity collection:
complexArray {
  a (1)
  b (2)
  c (3)
} [3]

// previous example, initialized to all spaces:
complexArray {
  a (1)
  b (2)
  c (3)
} [3] ' '*
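
To make the layout concrete, here is a minimal sketch that slices a 17-character dateRange value (one byte per character, assuming ASCII) into its children by hand. It does not use the flatfile API. The class and method names are hypothetical, and the offsets are simply read off the definitions above: year (4), month (2), day (2), plus the one-byte '-' filler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: slices a flat dateRange value by hand,
// without using any flatfile classes.
class DateRangeLayoutSketch {

    // Splits an 8-character dateYYYYMMDD value into year (4), month (2), day (2).
    static Map<String, String> parseDate(String flat) {
        if (flat.length() != 8) {
            throw new IllegalArgumentException("expected 8 characters, got " + flat.length());
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("year", flat.substring(0, 4));
        fields.put("month", flat.substring(4, 6));
        fields.put("day", flat.substring(6, 8));
        return fields;
    }

    // Splits a 17-character dateRange value: start (8), filler '-' (1), end (8).
    static Map<String, Map<String, String>> parseRange(String flat) {
        if (flat.length() != 17 || flat.charAt(8) != '-') {
            throw new IllegalArgumentException("not a dateRange value: " + flat);
        }
        Map<String, Map<String, String>> range = new LinkedHashMap<>();
        range.put("start", parseDate(flat.substring(0, 8)));
        range.put("end", parseDate(flat.substring(9, 17)));
        return range;
    }

    public static void main(String[] args) {
        System.out.println(parseRange("20240101-20241231"));
    }
}
```

The DSL definition replaces this kind of hand-maintained offset arithmetic: the positions follow from the declared lengths and the order of the children.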

        

Field Options

That's nice, but it's not always enough. Field options let you zero in on the exact behavior you need from a given field definition. The supported field options are:

| Option name | Function | Type | Values |
| --- | --- | --- | --- |
| pad | Used when a too-small value is specified | byte | default (byte) 0 |
| justify | Specify field justification when a too-small value is specified | PadJustifyFieldSupport$Justify enum | LEFT (default), RIGHT, CENTER |
| overflow | Specify behavior on too-large value | FieldSupport$Overflow enum | ERROR (default), IGNORE |
| underflow | Specify behavior on too-small value | FieldSupport$Underflow enum | ERROR (default), IGNORE |

Example:
//define an integer field:
intField (9) pad='0' justify=RIGHT

//define a field for which overflow is permitted:
truncateMe (20) overflow=IGNORE
Field options are by no means magical. The option-setting syntax shown above applies to any property of an obvious type (String, byte, numeric). Additionally, a String value will be converted to a public static member of a class that implements the FieldOption marker interface (as with justify=RIGHT above). This is more than an implementation detail: it tells you how to implement Flatfile's Entity interface to satisfy requirements more specific than those the basic package covers. You can even specify nested properties in EL-style syntax; just surround the property expression with double quotes.
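
As a rough illustration of what pad and justify mean for a too-small value, such as the intField example, the following sketch fills a fixed-length buffer by hand. It is not the library's PadJustifyFieldSupport. The class, enum, and method names are hypothetical, the placement of the odd spare byte under CENTER is an assumption, and the interaction with the underflow option is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not the library's PadJustifyFieldSupport.
class PadJustifySketch {

    enum Justify { LEFT, RIGHT, CENTER }

    // Copies value into a buffer of the given length, filling the rest with pad.
    static byte[] fit(byte[] value, int length, byte pad, Justify justify) {
        if (value.length > length) {
            throw new IllegalArgumentException("value too large for field");
        }
        byte[] result = new byte[length];
        Arrays.fill(result, pad);
        int free = length - value.length;
        int offset;
        switch (justify) {
            case RIGHT:
                offset = free;
                break;
            case CENTER:
                offset = free / 2; // assumption: an odd spare byte goes on the right
                break;
            default:
                offset = 0; // LEFT
                break;
        }
        System.arraycopy(value, 0, result, offset, value.length);
        return result;
    }

    // Convenience overload for ASCII text.
    static String fit(String value, int length, char pad, Justify justify) {
        byte[] bytes = value.getBytes(StandardCharsets.US_ASCII);
        return new String(fit(bytes, length, (byte) pad, justify), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Mirrors: intField (9) pad='0' justify=RIGHT
        System.out.println(fit("42", 9, '0', Justify.RIGHT)); // prints 000000042
    }
}
```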

Dynamically-Sizable Arrays

It is possible to define an IndexedEntityCollection (implemented by EntityArray) whose number of occurrences is not known:

  anySize (2) []
  acceptableRange (2) [1..5]
  minOccurs (2) [1..]
  maxOccurs (2) [..5]
  optional (2) [..1]
Definitions of this kind produce IndexedEntityCollections for which #isSizable() returns true. In that case, call #setSize() to set the size once the correct size is known.
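
The bracket forms can be read as a minimum and maximum number of occurrences. The sketch below parses the text between the brackets into such a pair. It is a hypothetical helper, not part of the flatfile API, and it assumes that an omitted lower bound means zero (as the "optional" example suggests) and that an omitted upper bound means unbounded.

```java
// Hypothetical illustration; not part of the flatfile API.
class OccursRangeSketch {

    static final int UNBOUNDED = Integer.MAX_VALUE;

    final int min;
    final int max;

    OccursRangeSketch(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("invalid range " + min + ".." + max);
        }
        this.min = min;
        this.max = max;
    }

    // Parses the text between the brackets: "", "3", "1..5", "1..", "..5".
    static OccursRangeSketch parse(String spec) {
        String s = spec.trim();
        if (s.isEmpty()) {
            return new OccursRangeSketch(0, UNBOUNDED); // [] : any size
        }
        int dots = s.indexOf("..");
        if (dots < 0) {
            int n = Integer.parseInt(s);
            return new OccursRangeSketch(n, n); // [3] : fixed size
        }
        String lo = s.substring(0, dots).trim();
        String hi = s.substring(dots + 2).trim();
        int min = lo.isEmpty() ? 0 : Integer.parseInt(lo);         // assumption: default 0
        int max = hi.isEmpty() ? UNBOUNDED : Integer.parseInt(hi); // assumption: unbounded
        return new OccursRangeSketch(min, max);
    }

    // True when only one size is possible, so nothing remains to be decided.
    boolean isFixed() {
        return min == max;
    }

    // True when the given number of occurrences falls within the range.
    boolean accepts(int size) {
        return size >= min && size <= max;
    }

    public static void main(String[] args) {
        OccursRangeSketch range = parse("1..5");
        System.out.println(range.accepts(3) + " " + range.accepts(6));
    }
}
```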

EntityCollection Child Delimiters

The EntityCollection implementations returned by the DSL-based EntityFactory support some handy properties:

| Property | Type | Description | Default value |
| --- | --- | --- | --- |
| delim | byte[] | Content to be written between each child entity | byte[0] |
| delimAfter | boolean | Whether a delimiter should follow the final child | true |
| suppressEmptyChildren | boolean | Whether to suppress children of zero length (and, more importantly, their delimiters) | true |
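
One way to picture how these three properties interact when a collection is written out is the following sketch. It is a hand-rolled illustration under the semantics described above, not the library's implementation, and all class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of delim, delimAfter and suppressEmptyChildren.
class DelimiterSketch {

    static byte[] join(List<byte[]> children, byte[] delim,
                       boolean delimAfter, boolean suppressEmptyChildren) {
        // Drop zero-length children first so that they contribute no delimiter.
        List<byte[]> kept = new ArrayList<>();
        for (byte[] child : children) {
            if (suppressEmptyChildren && child.length == 0) {
                continue;
            }
            kept.add(child);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < kept.size(); i++) {
            byte[] child = kept.get(i);
            out.write(child, 0, child.length);
            boolean last = (i == kept.size() - 1);
            // A delimiter follows every child except, optionally, the final one.
            if (!last || delimAfter) {
                out.write(delim, 0, delim.length);
            }
        }
        return out.toByteArray();
    }

    // Convenience wrapper for ASCII text.
    static String joinText(String delim, boolean delimAfter,
                           boolean suppressEmptyChildren, String... children) {
        List<byte[]> bytes = new ArrayList<>();
        for (String child : children) {
            bytes.add(child.getBytes(StandardCharsets.US_ASCII));
        }
        byte[] joined = join(bytes, delim.getBytes(StandardCharsets.US_ASCII),
                delimAfter, suppressEmptyChildren);
        return new String(joined, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(joinText(",", true, true, "ab", "", "c"));   // prints ab,c,
        System.out.println(joinText(",", false, true, "ab", "", "c"));  // prints ab,c
        System.out.println(joinText(",", true, false, "ab", "", "c"));  // prints ab,,c,
    }
}
```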

Dynamically-Sizable Fields

Occasionally there may be a requirement that fields of unknown length be intermingled with fields of predetermined length or value. Here is an example:

  structure {
    "foo="! fooValue (*) // any length
    "bar="! barValue (1..) // at least length 1
    "baz="! bazValue (..10) // at most length 10
    "blah="! blahValue (3..4) // 3 or 4
  } delim="\r\n"
Dynamically-sizable fields, or DynamicFields, support the following options:

| Option name | Function | Type | Values |
| --- | --- | --- | --- |
| pad | Used when a too-small value is specified | byte | default (byte) 0 |
| justify | Specify field justification when a too-small value is specified | PadJustifyFieldSupport$Justify enum | LEFT (default), RIGHT, CENTER |
| overflow | Specify behavior on too-large value | FieldSupport$Overflow enum | ERROR (default), IGNORE |
| underflow | Specify behavior on too-small value | FieldSupport$Underflow enum | ERROR (default), IGNORE |

Default Options

You can also set default options for certain types. ParserEntityFactory defines constants that show where this is possible:

| Field name | Value |
| --- | --- |
| OPTION_FIELD | field |
| OPTION_DYNAMIC_FIELD | dynamicField |

Use the values of these constants, prefixed with an "at" (@) symbol, at the top of the resource to set default options for any supported type:
@field justify=CENTER pad=' '; // semicolon indicates end
@dynamicField underflow=IGNORE

Entity Checks

A final feature of the DSL-based EntityFactory is that it can run checks against entities as they are read from the definition file. The only check implemented at this time is the length check, which is specified by appending a colon and the expected length after any entity definition, as shown:

  myRecord {
    a (10)
    ? ' '!
    b (50)
    ? ' '!
    c {
      c1 (20)
      c2 (20)
    }
    ? ' '!
    d (5) [2]
    ? ' '!
    e (24)
    ? ' '!
    f (1)
  } : 140

  multilineRecord {
    foo (2)
    bar (2)
    baz (1)
  } delim=' ' delimAfter=false [10] delim="\r\n" : 90
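
The expected lengths in both definitions can be verified by hand. The following sketch redoes the arithmetic in plain Java. It is not how the library performs the check, the helper names are hypothetical, and it assumes that delimAfter keeps its documented default of true for the outer array of multilineRecord.

```java
// Hypothetical worked arithmetic for the two length checks above.
class LengthCheckSketch {

    // Length of a collection: sum of child lengths plus the delimiters written.
    static int collectionLength(int[] childLengths, int delimLength, boolean delimAfter) {
        int total = 0;
        for (int len : childLengths) {
            total += len;
        }
        int n = childLengths.length;
        int delimCount = (n == 0) ? 0 : (delimAfter ? n : n - 1);
        return total + delimCount * delimLength;
    }

    // An array of 'occurs' children that all have the same length.
    static int[] repeat(int length, int occurs) {
        int[] lengths = new int[occurs];
        java.util.Arrays.fill(lengths, length);
        return lengths;
    }

    static int myRecordLength() {
        int c = collectionLength(new int[] {20, 20}, 0, true); // c1 + c2 = 40
        int d = collectionLength(repeat(5, 2), 0, true);       // d (5) [2] = 10
        // a, filler, b, filler, c, filler, d, filler, e, filler, f
        return collectionLength(new int[] {10, 1, 50, 1, c, 1, d, 1, 24, 1, 1}, 0, true);
    }

    static int multilineRecordLength() {
        // foo, bar, baz separated by one space, no trailing delimiter: 2+1+2+1+1 = 7
        int row = collectionLength(new int[] {2, 2, 1}, 1, false);
        // 10 rows, each followed by "\r\n" (2 bytes): 10 * (7 + 2)
        return collectionLength(repeat(row, 10), 2, true);
    }

    public static void main(String[] args) {
        System.out.println(myRecordLength());        // prints 140
        System.out.println(multilineRecordLength()); // prints 90
    }
}
```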

Object Graph Representation

We have covered the core APIs, which attempt to represent flat structures of virtually unlimited complexity in a simple way. We then saw how the provided DSL builds the included entity representations using a terse syntax that aims to be at least as clear as the equivalent Java code. Finally, we can go a step further and provide an efficient means for our flat Entity-based structures to interoperate with Java POJOs. By implementing the reflection and conversion APIs defined by the Morph project, we can provide, for relatively little investment, a simple means to copy data between Entity graphs and POJO graphs. Inserting Entity-aware Reflectors and Transformers at opportune points in a Morph configuration achieves a surprising amount of basic functionality, and more complex things can be accomplished by extending the APIs. Examples still need to be provided.