SLANG is short for Simple data description Language; it was introduced by Schürmann for the PLAYOUT project and is the main file format of the MOOSE project (and of NyktOp, which was originally conceived as a replacement to work around some bugs in MOOSE).
SLANG is, as the name tells us, a very simple file format; it is a very explicit textual format, and thus it tends to produce large files.
Data is stored in (attribute-)lists and records (of lists).
The following is an example of a SLANG file:
( SLANGVersion 2.00 )
( MCL-StoreOn
<
{ the first toplevel instance }
( DotClass
<
{ some simple attributes }
( KEY 123 )
( _x 100 )
( _y 100 )
{ a single relation }
( _partOf
<
( KEY 234 PolylineClass )
>
)
>
)
( PolylineClass
<
( KEY 234 )
{ some more complex attributes (themselves again instances) }
( _start
<
( _x 0 )
( _y 100 )
>
)
( _end
<
( _x 100 )
( _y 0 )
>
)
{ a multiple relation }
( _steps 1
<
( KEY 123 DotClass )
>
)
>
)
>
)
The example defines a line of three dots: two dots are attributes of the line and one is connected via a relation.
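To make the nesting of lists "( ... )" and records "< ... >" in the example more concrete, here is a minimal sketch in C that only checks whether the two bracket kinds nest properly. It is not a SLANG parser and not part of any existing tool: the function name slang_balanced and the depth limit of 256 are assumptions made for this illustration, and comments and strings are simply skipped.

```c
#include <stddef.h>

/* Sketch only: checks that "( )" and "< >" nest properly in a SLANG text,
 * skipping { comments } and 'strings'. Returns 1 if balanced, 0 otherwise.
 * It does not check the grammar (e.g. that a list starts with a symbol). */
int slang_balanced(const char *s)
{
    char stack[256];              /* arbitrary maximum nesting depth */
    size_t depth = 0;

    for (; *s != '\0'; s++) {
        switch (*s) {
        case '{':                 /* comment: skip to the closing brace */
            while (*s != '\0' && *s != '}')
                s++;
            if (*s == '\0')
                return 0;         /* unterminated comment */
            break;
        case '\'':                /* string: skip to the next quote; a doubled
                                     quote is skipped as two adjacent strings */
            s++;
            while (*s != '\0' && *s != '\'')
                s++;
            if (*s == '\0')
                return 0;         /* unterminated string */
            break;
        case '(':
        case '<':
            if (depth == sizeof stack)
                return 0;         /* nested deeper than this sketch supports */
            stack[depth++] = *s;
            break;
        case ')':
            if (depth == 0 || stack[--depth] != '(')
                return 0;
            break;
        case '>':
            if (depth == 0 || stack[--depth] != '<')
                return 0;
            break;
        default:
            break;
        }
    }
    return depth == 0;
}
```

Calling slang_balanced() on the text of the example file above should return 1; removing any single ")" or ">" from it should make it return 0.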
Below is an example of a simple lex scanner that tokenizes a SLANG file.
%%
"<" { return RECBRA; }
">" { return RECKET; }
"(" { return LISTBRA; }
")" { return LISTKET; }
[A-Za-z_][A-Za-z_0-9.=-]* { STORE; return SYMBOL; }
[+-]?[0-9]+([.][0-9]+([eE][+-][0-9]+)?)? { STORE; return NUMBER; }
[\\][0-9]+ { STORE; return ENUM; }
[']([^']+|[']['])*['] { STORE; return STRING; }
[{][^}]*[}] { /* NOP: comments */ }
[\t ]+ { /* NOP: whitespace */ }
[\n] { /* NOP: newlines */ }
%%
The token symbols have to be defined elsewhere, e.g. in the parser; the STORE macro would need to strip the irrelevant characters from the data (e.g. the "\" before an ENUM or the quotes around a STRING).
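The stripping that STORE would have to do might look roughly like the following C sketch. The helper names slang_unquote and slang_enum_digits are hypothetical (they do not appear in the original sources), and the sketch assumes that a doubled quote inside a STRING stands for a single quote character, as the scanner pattern for STRING suggests.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the text of a STRING token into dst without the
 * surrounding quotes, turning each doubled quote ('') into a single one.
 * dst must have room for at least strlen(src) bytes.
 * Returns the length of the result, or -1 if src is not a valid token. */
int slang_unquote(const char *src, char *dst)
{
    size_t len = strlen(src);
    size_t i, n = 0;

    if (len < 2 || src[0] != '\'' || src[len - 1] != '\'')
        return -1;
    for (i = 1; i < len - 1; i++) {
        if (src[i] == '\'') {
            /* inside the token a quote is only valid as the pair '' */
            if (i + 1 >= len - 1 || src[i + 1] != '\'')
                return -1;
            i++;                  /* skip the first quote of the pair */
        }
        dst[n++] = src[i];
    }
    dst[n] = '\0';
    return (int)n;
}

/* Hypothetical helper: skip the leading backslash of an ENUM token. */
const char *slang_enum_digits(const char *src)
{
    return (src[0] == '\\') ? src + 1 : src;
}
```

In a real scanner STORE would probably call such helpers on yytext and hand the result to the parser; how the result is passed on (e.g. via yylval) is left open here.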
The following minimalist parser only shows the structure of a SLANG file; it does no semantic checking of types.
%token RECBRA
%token RECKET
%token LISTBRA
%token LISTKET
%token SYMBOL
%token NUMBER
%token ENUM
%token STRING
%%
file   : lists ;
lists  : lists list | /**/ ;
list   : LISTBRA SYMBOL data LISTKET ;
data   : data value | /**/ ;
value  : SYMBOL | NUMBER | ENUM | STRING | record ;
record : RECBRA lists RECKET ;
%%
We also omit the code that handles relations (it has also been omitted from the scanner, which would have to scan for 'KEY' separately).
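As a rough idea of what the omitted relation handling needs, the following C sketch keeps a table of instances indexed by their KEY value. Everything in it (struct instance, struct key_table, key_table_add, key_table_find, the limit MAX_KEYS) is a hypothetical illustration, not code from MOOSE or NyktOp. Note that in the example file the DotClass instance refers to KEY 234 before the PolylineClass instance is defined, so a reader would have to look up such references only after the whole file has been read.

```c
#include <stddef.h>

#define MAX_KEYS 64            /* arbitrary limit for this sketch */

struct instance {              /* hypothetical in-memory object */
    long key;                  /* value of its ( KEY n ) attribute */
    const char *class_name;    /* e.g. "DotClass" */
};

struct key_table {
    struct instance *items[MAX_KEYS];
    size_t count;
};

/* Look up an instance by key; returns NULL if the key is unknown. */
struct instance *key_table_find(const struct key_table *t, long key)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (t->items[i]->key == key)
            return t->items[i];
    return NULL;
}

/* Register an instance; returns 0 on success, -1 if the table is full
 * or the key is already in use. */
int key_table_add(struct key_table *t, struct instance *obj)
{
    if (t->count == MAX_KEYS || key_table_find(t, obj->key) != NULL)
        return -1;
    t->items[t->count++] = obj;
    return 0;
}
```

The linear search keeps the sketch short; for large files a hash table or a sorted array would be the more obvious choice.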
For some reason (OK, bison explicitly instructs you to write such functions yourself) no library seems to provide yywrap() or yyerror(); more precisely, there is no liby.a. So you might have to add something like
#include <stdio.h>
int yyerror (const char *s) { fprintf (stderr, "%s\n", s); return 1; }
int yywrap(void) { return 1; }
This file format is surely not the "best one", but it is very simple, and as long as I can't find another format with similar features (text-based, structured, nestable and representing the data hierarchy), I can't add support for an alternative.
Major disadvantages are for example:
The greatest disadvantage, however, is at the same time its biggest advantage: the file format exactly matches a class hierarchy, so changes in the class hierarchy automatically cause changes in the file format. This is good, since with generic readers/writers there is no need to think about file I/O any more. But it is bad too, since model changes may stop our applications from reading older SLANG files.
We could, e.g., re-use the syntax of FrameMaker MIF files (which effectively uses only lists, though bracketed with " < ... > "). They allow references both via unique ids and via certain key attributes. Or, better, XML, but can anyone give me a hint about data structures in XML (DTD)? (The discussion notes on w3.org do not fit my needs: they seem not to give any possibility to re-use an attribute name with a different purpose or type; furthermore, IMHO, implicit double-linked relations are a bit too sloppy for a file format, and I do not like references via find-by-attribute instead of index keys being the default, which IMHO slows down file I/O unnecessarily and might make multi-pass reading necessary; finally, there is no obvious way to write tabs and newlines, since all whitespace seems to be treated equally in XML.)