Behavior Reference

CCL implementations must make choices about how to handle certain parsing and access scenarios. These behaviors represent options where the CCL specification allows flexibility or where different implementations may reasonably differ.

How Behaviors Work

Behaviors are tags that describe implementation choices. When running the CCL test suite, you declare which behaviors your implementation uses, and tests use a conflicts field to indicate incompatible behaviors.

const capabilities = {
  behaviors: [
    'crlf_normalize_to_lf',
    'boolean_strict',
    'continuation_tab_to_space',
    'array_order_insertion'
  ]
};

Important: Behaviors are not inherently mutually exclusive. A test can require multiple behaviors (e.g., both continuation_tab_to_space and crlf_normalize_to_lf). The conflicts field on individual tests determines what combinations are incompatible. For example, a test expecting lexicographic ordering would have conflicts: { behaviors: ["array_order_insertion"] } to skip implementations that preserve insertion order.

Behavior Groups

Line Endings

Options: crlf_preserve_literal vs crlf_normalize_to_lf

Controls how Windows-style line endings (CRLF, \r\n) are handled during parsing.

`crlf_normalize_to_lf`

Converts all CRLF sequences to LF (\n) before or during parsing. This is the most common choice for cross-platform compatibility.

key = value\r\n
nested = \r\n
  child = data\r\n

Result: The \r characters are stripped; any line breaks that remain in values are plain \n.

{"key": "value", "nested": {"child": "data"}}

`crlf_preserve_literal`

Preserves \r characters exactly as they appear in the input. Values may contain literal carriage return characters.

key = value\r\n

Result: The value includes the \r character.

{"key": "value\r"}

Recommendation: Be consistent with your line-ending handling. For reference compliance, follow the behavior required by the reference_compliant variant. Use crlf_preserve_literal when byte-exact carriage-return preservation is required; use crlf_normalize_to_lf when cross-platform normalization is the priority.

Note: crlf_preserve_literal applies uniformly to flat and nested structures. See CRLF Handling in Nested Structures for the line-splitting requirements implementations must meet.
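
As a rough illustration, the two line-ending behaviors could be applied as a pre-processing step on the raw input. This is a minimal sketch, not the reference implementation: the helper name applyLineEndingBehavior is hypothetical, and it covers only the text transformation, not the line-splitting requirements mentioned in the note above.

```javascript
// Hypothetical helper (not part of any CCL API): apply the declared
// line-ending behavior to raw input before parsing.
function applyLineEndingBehavior(text, behavior) {
  if (behavior === 'crlf_normalize_to_lf') {
    // Replace every CRLF pair with a single LF.
    return text.replace(/\r\n/g, '\n');
  }
  if (behavior === 'crlf_preserve_literal') {
    // Leave the input untouched; \r remains part of the text.
    return text;
  }
  throw new Error(`Unknown line-ending behavior: ${behavior}`);
}

const rawInput = 'key = value\r\n';
console.log(JSON.stringify(applyLineEndingBehavior(rawInput, 'crlf_normalize_to_lf')));
// → "key = value\n"
console.log(JSON.stringify(applyLineEndingBehavior(rawInput, 'crlf_preserve_literal')));
// → "key = value\r\n"
```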

Boolean Parsing

Options: boolean_lenient vs boolean_strict

Controls which string values are accepted as booleans by typed access functions like get_bool.

`boolean_strict`

Accepts only true and false as boolean values. Comparison is case-insensitive, so True, FALSE, and tRuE are all valid.

enabled = true
disabled = False
valid = TRUE
active = yes

getBool(obj, "enabled")   // → true
getBool(obj, "disabled")  // → false
getBool(obj, "valid")     // → true
getBool(obj, "active")    // → ERROR: "yes" is not a valid boolean

`boolean_lenient`

Accepts additional truthy/falsy values beyond true/false. All comparisons are case-insensitive.

Truthy Values	Falsy Values
`true`, `yes`, `on`, `1`	`false`, `no`, `off`, `0`

Any case variation is accepted (e.g., YES, No, TRUE, oFf).

enabled = yes
disabled = no
active = on
inactive = off

getBool(obj, "enabled")   // → true
getBool(obj, "disabled")  // → false
getBool(obj, "active")    // → true
getBool(obj, "inactive")  // → false

Note: Both modes accept true and false. The difference is whether additional values like yes/no are also accepted.

Recommendation: Use boolean_strict for stricter validation, boolean_lenient for more user-friendly configuration files.
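
The two modes differ only in the set of accepted strings, which a lookup table captures directly. The sketch below assumes the raw value has already been trimmed by the parser; parseBool is a hypothetical helper name, not a function defined by CCL.

```javascript
// Accepted spellings for each mode, keyed by lowercase form.
const STRICT_VALUES = new Map([['true', true], ['false', false]]);
const LENIENT_VALUES = new Map([
  ...STRICT_VALUES,
  ['yes', true], ['on', true], ['1', true],
  ['no', false], ['off', false], ['0', false],
]);

// Hypothetical helper: interpret a raw string under a boolean behavior.
function parseBool(raw, behavior) {
  const table = behavior === 'boolean_lenient' ? LENIENT_VALUES : STRICT_VALUES;
  const key = raw.toLowerCase(); // both modes compare case-insensitively
  if (!table.has(key)) {
    throw new Error(`"${raw}" is not a valid boolean`);
  }
  return table.get(key);
}

console.log(parseBool('False', 'boolean_strict'));  // → false
console.log(parseBool('oFf', 'boolean_lenient'));   // → false
```

Under boolean_strict, a call such as parseBool('yes', 'boolean_strict') throws, matching the ERROR case shown earlier.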

Tab Handling

Options: continuation_tab_to_space vs continuation_tab_preserve

Controls how leading tab characters on continuation lines are treated during parse.

Two related rules are not controlled by this behavior — they are universal:

Interior tabs in values are always preserved verbatim (the tab_in_value_preserved feature).
Boundary tabs immediately after = are always trimmed as whitespace.

This behavior only affects leading tabs on a continuation line.

`continuation_tab_to_space`

Each leading \t on a continuation line normalizes 1:1 to a single space during parse. This is the OCaml reference behavior. In the example below, \t stands for a literal tab character.

section =
\t\tfoo

Result: Value is "\n  foo" (two tabs → two spaces).

`continuation_tab_preserve`

Leading tabs on continuation lines are preserved verbatim during parse. In the example below, \t stands for a literal tab character.

section =
\t\tfoo

Result: Value is "\n\t\tfoo".

Recommendation: continuation_tab_to_space matches the OCaml reference and is the default for variant:reference_compliant. Choose continuation_tab_preserve only if you need byte-exact fidelity of tab-indented source.
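
A minimal sketch of the leading-tab rule follows. It assumes the "leading" region is the run of spaces and tabs at the start of the line, and it deliberately leaves interior tabs alone. The helper name normalizeContinuationLine is hypothetical.

```javascript
// Hypothetical helper: process one continuation line under a tab behavior.
function normalizeContinuationLine(line, behavior) {
  if (behavior === 'continuation_tab_preserve') {
    return line; // keep leading tabs verbatim
  }
  // continuation_tab_to_space: rewrite tabs only inside the leading
  // whitespace run, one space per tab; interior tabs are untouched.
  const leading = /^[ \t]*/.exec(line)[0];
  return leading.replace(/\t/g, ' ') + line.slice(leading.length);
}

console.log(JSON.stringify(
  normalizeContinuationLine('\t\tfoo\tbar', 'continuation_tab_to_space')
));
// → "  foo\tbar"
console.log(JSON.stringify(
  normalizeContinuationLine('\t\tfoo\tbar', 'continuation_tab_preserve')
));
// → "\t\tfoo\tbar"
```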

Indentation Style

Options: indent_spaces vs indent_tabs

Controls how indentation is rendered in output functions like canonical_format.

`indent_spaces`

Output uses spaces for indentation (typically 2 spaces per level).

canonicalFormat(parsed)
// → "section =\n  child = value"

`indent_tabs`

Output uses tab characters for indentation.

canonicalFormat(parsed)
// → "section =\n\tchild = value"

Note: This behavior affects output formatting only, not parsing. Most implementations use indent_spaces for consistency.
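
To make the output-only nature of this behavior concrete, here is a simplified renderer in which the indent unit is the only thing that changes. It is a sketch under stated assumptions: the nested-object input shape and the name renderEntries are hypothetical, and a real canonical_format may apply further rules not shown here.

```javascript
// Hypothetical renderer: string values become "key = value" lines,
// nested objects become "key =" followed by indented children.
function renderEntries(obj, behavior, depth = 0) {
  const unit = behavior === 'indent_tabs' ? '\t' : '  ';
  const prefix = unit.repeat(depth);
  const lines = [];
  for (const [key, value] of Object.entries(obj)) {
    if (typeof value === 'string') {
      lines.push(`${prefix}${key} = ${value}`);
    } else {
      lines.push(`${prefix}${key} =`);
      const children = renderEntries(value, behavior, depth + 1);
      if (children !== '') lines.push(children);
    }
  }
  return lines.join('\n');
}

const tree = { section: { child: 'value' } };
console.log(JSON.stringify(renderEntries(tree, 'indent_spaces')));
// → "section =\n  child = value"
console.log(JSON.stringify(renderEntries(tree, 'indent_tabs')));
// → "section =\n\tchild = value"
```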

List Coercion

Options: list_coercion_enabled vs list_coercion_disabled

Controls how get_list behaves when accessing a single value (non-list).

`list_coercion_enabled`

When get_list accesses a single value, it wraps it in a list automatically.

single = value
multiple =
  = item1
  = item2

getList(obj, "single")    // → ["value"] (coerced to list)
getList(obj, "multiple")  // → ["item1", "item2"]

`list_coercion_disabled`

get_list only succeeds on actual list values. Single values cause an error.

getList(obj, "single")    // → ERROR: not a list
getList(obj, "multiple")  // → ["item1", "item2"]

Recommendation: list_coercion_disabled provides stricter type safety; list_coercion_enabled is more convenient for optional list fields.
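
The difference between the two modes comes down to a single branch. The sketch below assumes a simplified data model in which a single value is stored as a string and a list as an array of strings. The name getListWith and its third parameter are hypothetical.

```javascript
// Hypothetical accessor taking the list-coercion behavior explicitly.
function getListWith(obj, key, behavior) {
  const value = obj[key];
  if (Array.isArray(value)) {
    return value; // real lists succeed in both modes
  }
  if (typeof value === 'string' && behavior === 'list_coercion_enabled') {
    return [value]; // wrap the single value in a one-element list
  }
  throw new Error(`"${key}" is not a list`);
}

const config = { single: 'value', multiple: ['item1', 'item2'] };
console.log(getListWith(config, 'single', 'list_coercion_enabled'));
// → [ 'value' ]
console.log(getListWith(config, 'multiple', 'list_coercion_disabled'));
// → [ 'item1', 'item2' ]
```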

Continuation Baseline

Options: toplevel_indent_strip vs toplevel_indent_preserve

Controls how the baseline indentation (N) is determined during top-level parsing. This affects whether lines at the same indentation level are treated as continuations or separate entries.

`toplevel_indent_strip`

Top-level parsing always uses N=0 as the baseline. Any line with indentation > 0 is treated as a continuation of the previous entry. Leading whitespace is effectively “stripped” when determining entry boundaries.

  key = value
  next = another

Result: One entry. The second line has indent 2 > 0, so it is treated as a continuation of the first.

{"key": "value\n  next = another"}

This is the OCaml reference implementation’s behavior.

`toplevel_indent_preserve`

Top-level parsing uses the first key’s indentation as N. Lines at the same indentation level as the first key are separate entries. The original indentation structure is “preserved” for entry boundary detection.

  key = value
  next = another

Result: Two entries. Both lines have indent 2, which equals N (2); since 2 > 2 is false, neither line is a continuation.

{"key": "value", "next": "another"}

Trade-offs:

toplevel_indent_strip matches the OCaml reference and existing test suite expectations
toplevel_indent_preserve may be more intuitive: indenting your whole document doesn’t change parsing semantics

Implementation note: With toplevel_indent_preserve, top-level and nested parsing use the same algorithm—always determine N from the first non-empty line’s indentation. The distinction between parse and parse_indented only matters for toplevel_indent_strip, where top-level parsing must force N=0. See Continuation Lines for details.

Recommendation: Use toplevel_indent_strip for reference compliance. Consider toplevel_indent_preserve if your use case involves documents that may be indented as a whole (e.g., embedded within other files), or if you want a simpler single-algorithm implementation.
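
The baseline rule can be sketched as a line-grouping pass in which only the choice of N differs between the two behaviors. This is a simplified illustration: it assumes LF-only input, counts each leading space or tab as one unit of indentation, and drops blank lines. The name splitEntries is hypothetical, and the result is raw entry text rather than parsed key/value pairs.

```javascript
// Count leading whitespace characters (an assumption: tabs count as 1).
function indentOf(line) {
  return /^[ \t]*/.exec(line)[0].length;
}

// Hypothetical helper: group lines into raw entries using baseline N.
function splitEntries(text, behavior) {
  const lines = text.split('\n').filter((line) => line.trim() !== '');
  if (lines.length === 0) return [];
  // strip: N is forced to 0; preserve: N is the first line's indentation.
  const baseline = behavior === 'toplevel_indent_preserve' ? indentOf(lines[0]) : 0;
  const entries = [];
  for (const line of lines) {
    if (entries.length > 0 && indentOf(line) > baseline) {
      entries[entries.length - 1] += '\n' + line; // continuation line
    } else {
      entries.push(line); // starts a new entry
    }
  }
  return entries;
}

const indentedDoc = '  key = value\n  next = another';
console.log(splitEntries(indentedDoc, 'toplevel_indent_strip').length);    // → 1
console.log(splitEntries(indentedDoc, 'toplevel_indent_preserve').length); // → 2
```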

Array Ordering

Options: array_order_insertion vs array_order_lexicographic

Controls the order of elements returned by get_list and related typed access functions. The canonical CCL data model is order-agnostic — ordering is a concern of the typed access layer only.

`array_order_insertion`

Elements appear in the order they were defined in the source file (insertion order).

items =
  = cherry
  = apple
  = banana

getList(obj, "items")  // → ["cherry", "apple", "banana"]

`array_order_lexicographic`

Elements are sorted lexicographically (alphabetically) regardless of source order.

items =
  = cherry
  = apple
  = banana

getList(obj, "items")  // → ["apple", "banana", "cherry"]

Recommendation: array_order_insertion preserves author intent; array_order_lexicographic provides deterministic output regardless of source formatting.
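
Because ordering belongs to the typed access layer, it can be applied as a final step on the collected elements. In this sketch the name orderList is hypothetical, and "lexicographic" is assumed to mean JavaScript's default string sort (comparison by UTF-16 code units), which may differ from the collation a given implementation uses.

```javascript
// Hypothetical helper: order list elements under an array-ordering behavior.
function orderList(items, behavior) {
  if (behavior === 'array_order_lexicographic') {
    // Copy before sorting so the caller's array is not mutated.
    return [...items].sort();
  }
  return [...items]; // array_order_insertion: keep source order
}

const fruits = ['cherry', 'apple', 'banana'];
console.log(orderList(fruits, 'array_order_insertion'));
// → [ 'cherry', 'apple', 'banana' ]
console.log(orderList(fruits, 'array_order_lexicographic'));
// → [ 'apple', 'banana', 'cherry' ]
```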

Delimiter Mode

Options: delimiter_first_equals vs delimiter_prefer_spaced

Controls how CCL locates the delimiter = when a line contains multiple = characters (e.g., URLs with query strings, shell assignments).

`delimiter_first_equals`

Always split on the first = character. Keys cannot contain =.

a=b = c=d

Result:

{"a": "b = c=d"}

`delimiter_prefer_spaced`

When multiple = characters exist, prefer the one surrounded by spaces ( = ). Falls back to the first bare = if no spaced delimiter is found.

https://api.example.com/search?q=test&page=1 = search_results

Result (spaced delimiter preferred):

{"https://api.example.com/search?q=test&page=1": "search_results"}

Compare with delimiter_first_equals:

{"https://api.example.com/search?q": "test&page=1 = search_results"}

Recommendation: delimiter_prefer_spaced handles URLs and key=value style strings more naturally. delimiter_first_equals is simpler and matches common tokenizer behavior.
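
Both delimiter modes can be expressed as a single split function that differs only in where it looks first. This is a sketch with assumptions: when several spaced delimiters exist it picks the first one, and it trims whitespace around key and value with trim(). The name splitKeyValue is hypothetical.

```javascript
// Hypothetical helper: split one line into key and value.
function splitKeyValue(line, behavior) {
  let start = -1;
  let length = 0;
  if (behavior === 'delimiter_prefer_spaced') {
    start = line.indexOf(' = '); // first "=" surrounded by spaces
    length = 3;
  }
  if (start === -1) {
    start = line.indexOf('='); // first bare "=" (also the fallback)
    length = 1;
  }
  if (start === -1) return null; // no delimiter on this line
  return {
    key: line.slice(0, start).trim(),
    value: line.slice(start + length).trim(),
  };
}

const urlLine = 'https://api.example.com/search?q=test&page=1 = search_results';
console.log(splitKeyValue(urlLine, 'delimiter_prefer_spaced').key);
// → https://api.example.com/search?q=test&page=1
console.log(splitKeyValue(urlLine, 'delimiter_first_equals').key);
// → https://api.example.com/search?q
```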

Multiline Values

Option: multiline_values

Marks tests where the value spans multiple source lines via the multiline_continuation feature. Unlike the paired options above, multiline_values has no alternate form — it is a capability flag that declares an implementation supports multi-line values at all. An implementation that does not declare it will have multi-line value tests filtered out.

Path Traversal

Option: path_traversal

Marks tests that exercise deep path traversal through nested structures (e.g. get_string(ccl, "a.b.c.d.e")). Like multiline_values, this is a capability flag without a paired alternate — implementations either support deep navigation or they don’t.
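
For illustration, deep navigation over a dotted path might look like the sketch below. It assumes a nested plain-object data model and that path segments are separated by "." with no escaping for keys that contain dots; neither assumption is confirmed by this page. The name getStringAtPath is hypothetical.

```javascript
// Hypothetical helper: walk a dotted path and return the string at its end.
function getStringAtPath(obj, path) {
  let current = obj;
  for (const segment of path.split('.')) {
    const isObject = current !== null && typeof current === 'object';
    if (!isObject || !Object.prototype.hasOwnProperty.call(current, segment)) {
      throw new Error(`Path not found: ${path}`);
    }
    current = current[segment];
  }
  if (typeof current !== 'string') {
    throw new Error(`Value at "${path}" is not a string`);
  }
  return current;
}

const nested = { a: { b: { c: { d: { e: 'deep value' } } } } };
console.log(getStringAtPath(nested, 'a.b.c.d.e')); // → deep value
```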

Declaring Behaviors in Test Runners

When building a test runner against the CCL test suite, declare your implementation’s behaviors. Tests in the generated format include a conflicts field that specifies incompatible behaviors:

const capabilities = {
  functions: ['parse', 'build_hierarchy', 'get_string', 'get_bool', 'get_list'],
  behaviors: [
    'crlf_normalize_to_lf',
    'boolean_strict',
    'continuation_tab_to_space',
    'indent_spaces',
    'list_coercion_disabled',
    'array_order_insertion',
    'delimiter_prefer_spaced'
  ],
  variants: ['reference_compliant']
};

// Filter tests based on conflicts field
const compatibleTests = allTests.filter(test => {
  // Skip if test conflicts with any of our declared behaviors
  const hasConflictingBehavior = test.conflicts?.behaviors?.some(b =>
    capabilities.behaviors.includes(b)
  );
  return !hasConflictingBehavior;
});

Note: The conflicts field is per-test, not per-behavior-group. A test that expects array_order_lexicographic results would include conflicts: { behaviors: ["array_order_insertion"] }.
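
When debugging a runner it can help to know why a test was skipped, not just that it was. The variation below records the clashing behaviors for each skipped test. The helper name partitionTests and the sample test objects (including their name field) are hypothetical; only the conflicts.behaviors shape comes from the description above.

```javascript
// Hypothetical helper: split tests into runnable and skipped,
// recording which declared behaviors caused each skip.
function partitionTests(tests, declaredBehaviors) {
  const run = [];
  const skipped = [];
  for (const test of tests) {
    const clashes = (test.conflicts?.behaviors ?? []).filter((b) =>
      declaredBehaviors.includes(b)
    );
    if (clashes.length === 0) {
      run.push(test);
    } else {
      skipped.push({ test, clashes });
    }
  }
  return { run, skipped };
}

// Illustrative test records, not taken from the real suite.
const sampleTests = [
  { name: 'list_in_source_order', conflicts: { behaviors: ['array_order_lexicographic'] } },
  { name: 'list_sorted', conflicts: { behaviors: ['array_order_insertion'] } },
  { name: 'plain_parse' },
];

const outcome = partitionTests(sampleTests, ['array_order_insertion', 'boolean_strict']);
console.log(outcome.run.map((t) => t.name));
// → [ 'list_in_source_order', 'plain_parse' ]
console.log(outcome.skipped.map((s) => `${s.test.name}: ${s.clashes.join(', ')}`));
// → [ 'list_sorted: array_order_insertion' ]
```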

Summary Table

Behavior Group	Option A	Option B	Notes
Line Endings	`crlf_preserve_literal`	`crlf_normalize_to_lf`	Preserve for fidelity; normalize for cross-platform
Boolean Parsing	`boolean_strict`	`boolean_lenient`	Both are case-insensitive
Tab Handling	`continuation_tab_to_space`	`continuation_tab_preserve`	Leading tabs on continuation lines — OCaml reference uses `to_space`
Indentation	`indent_spaces`	`indent_tabs`	Output formatting only
List Access	`list_coercion_enabled`	`list_coercion_disabled`	Disabled for type safety
Continuation Baseline	`toplevel_indent_strip`	`toplevel_indent_preserve`	Strip for reference compliance
Array Ordering	`array_order_insertion`	`array_order_lexicographic`	Insertion preserves intent
Delimiter	`delimiter_first_equals`	`delimiter_prefer_spaced`	Spaced preferred for URLs/query strings
Multiline Values	`multiline_values`	(capability flag)	Declares support for multi-line values
Path Traversal	`path_traversal`	(capability flag)	Declares support for deep path navigation
