Output format

Since version 4.1.0.0, the results of a simulation are no longer stored in the database but in a binary file. This greatly reduces memory requirements and improves performance.

The results are written as a stream of data. Once written, they are not modified.

The storage format is based on this principle.

Each record is in the form of a mark identifying the type (and, possibly, the format version) of the result, followed by the result itself.

The records are written one after the other. A specific routine must be written to read and to write each type of result.

Mark format

The mark is a long (64 bits) that holds 8 characters. The characters must be encoded with BitUtil.toMark and decoded with BitUtil.fromMark.
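To illustrate how 8 characters can fit in one long, here is a minimal sketch. It assumes one byte per ASCII character, with the first character in the most significant byte. The class MarkSketch is hypothetical and only stands in for BitUtil, whose actual encoding may differ.

```java
/** Hypothetical stand-in for BitUtil; the real encoding may differ. */
final class MarkSketch {

    /** Packs exactly 8 ASCII characters into a long, first character in the highest byte. */
    static long toMark(String text) {
        if (text.length() != 8) {
            throw new IllegalArgumentException("a mark must be exactly 8 characters: " + text);
        }
        long mark = 0L;
        for (int i = 0; i < 8; i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                throw new IllegalArgumentException("a mark must be ASCII only: " + text);
            }
            // Shift the characters already packed and append the new one.
            mark = (mark << 8) | c;
        }
        return mark;
    }

    /** Unpacks the 8 characters stored in a mark. */
    static String fromMark(long mark) {
        char[] chars = new char[8];
        for (int i = 7; i >= 0; i--) {
            // The lowest byte holds the last character.
            chars[i] = (char) (mark & 0xFF);
            mark >>>= 8;
        }
        return new String(chars);
    }
}
```

With this layout, a mark can be compared with a single long comparison, and it remains readable as text in a hex dump of the file.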

Array record

Structure:

  • (long) mark: ‘resmat01’
  • (int) step: the timestep number
  • (UTF8) name: name of the result
  • (int) dimSize: the number of dimensions in the array
  • (UTF8)+ dimNames: the names of each dimension
  • (int)+ dims: the size of each dimension
  • (UTF8)+ sems: the semantics of each dimension
  • (int) dataSize: the number of cells in the array
  • (double)+ data: the values in the array
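
The structure above can be written and read with the standard java.io data streams, as in the following sketch. Several points are assumptions and not taken from the actual code: the class ArrayRecordSketch and its field names are hypothetical, the MARK constant is a placeholder for the value returned by BitUtil.toMark("resmat01"), the (UTF8) fields are assumed to use the writeUTF/readUTF encoding, and each "+" field is assumed to be repeated dimSize (or dataSize) times.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Illustrative array record; class and field names are hypothetical. */
final class ArrayRecordSketch {
    // Placeholder for BitUtil.toMark("resmat01"); the real value depends on BitUtil.
    static final long MARK = 0x7265736D61743031L;

    final int step;
    final String name;
    final String[] dimNames;
    final int[] dims;
    final String[] sems;
    final double[] data;

    ArrayRecordSketch(int step, String name, String[] dimNames,
                      int[] dims, String[] sems, double[] data) {
        this.step = step;
        this.name = name;
        this.dimNames = dimNames;
        this.dims = dims;
        this.sems = sems;
        this.data = data;
    }

    /** Writes the fields in the order given in the structure. */
    void write(DataOutputStream out) throws IOException {
        out.writeLong(MARK);
        out.writeInt(step);
        out.writeUTF(name);
        out.writeInt(dims.length);                     // dimSize
        for (String dimName : dimNames) out.writeUTF(dimName);
        for (int dim : dims) out.writeInt(dim);
        for (String sem : sems) out.writeUTF(sem);
        out.writeInt(data.length);                     // dataSize
        for (double value : data) out.writeDouble(value);
    }

    /** Reads one record, including its mark, from the stream. */
    static ArrayRecordSketch read(DataInputStream in) throws IOException {
        long mark = in.readLong();
        if (mark != MARK) {
            throw new IOException("unexpected mark: 0x" + Long.toHexString(mark));
        }
        int step = in.readInt();
        String name = in.readUTF();
        int dimSize = in.readInt();
        String[] dimNames = new String[dimSize];
        for (int i = 0; i < dimSize; i++) dimNames[i] = in.readUTF();
        int[] dims = new int[dimSize];
        for (int i = 0; i < dimSize; i++) dims[i] = in.readInt();
        String[] sems = new String[dimSize];
        for (int i = 0; i < dimSize; i++) sems[i] = in.readUTF();
        int dataSize = in.readInt();
        double[] data = new double[dataSize];
        for (int i = 0; i < dataSize; i++) data[i] = in.readDouble();
        return new ArrayRecordSketch(step, name, dimNames, dims, sems, data);
    }
}
```

Because the sizes (dimSize, dataSize) are written before the repeated fields, the reader never has to look ahead: it always knows how many values follow.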

Performance

Simulations using this format run six times faster and, more importantly, use much less memory. This allows larger simulations to be carried out in an acceptable timeframe, as the Java Virtual Machine uses much less processor time.

Extension

With this structure, new types of result can be added easily. A generic reader switches to the appropriate reader function depending on the mark at the start of the record.

Each object has its own code for writing the data it needs to store, and puts its own mark before recording the data.
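
The generic reader described in this section could look like the following sketch. The names RecordDispatchSketch, RecordReader, register and readNext are hypothetical and do not come from the actual code; the sketch only assumes that every record starts with a long mark, as described above.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Illustrative generic reader; all names are hypothetical. */
final class RecordDispatchSketch {

    /** Reads the body of one record type; the mark has already been consumed. */
    interface RecordReader {
        Object read(DataInputStream in) throws IOException;
    }

    private final Map<Long, RecordReader> readers = new HashMap<>();

    /** Adding a new type of result only requires registering its mark and reader. */
    void register(long mark, RecordReader reader) {
        readers.put(mark, reader);
    }

    /** Reads the next record, or returns null when the stream ends before a full mark can be read. */
    Object readNext(DataInputStream in) throws IOException {
        long mark;
        try {
            mark = in.readLong();
        } catch (EOFException endOfStream) {
            return null;
        }
        RecordReader reader = readers.get(mark);
        if (reader == null) {
            throw new IOException("unknown record mark: 0x" + Long.toHexString(mark));
        }
        return reader.read(in);
    }
}
```

In this sketch an unknown mark is treated as an error. The array record described above carries no overall length, so a reader that does not know a mark has no way to skip to the next record.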

Problem

This mechanism relies on memory-mapped files, and Windows is still unable to handle memory-mapped files correctly. Furthermore, with the use of sparse arrays and massive datasets, the arrays are now stored in text files, in a storage format optimized for arrays.