YAML, what is it good for?

YAML (YAML Ain’t Markup Language) is a structured data format that has been widely used in Ruby development for some time. It has many useful features that often go unnoticed. Its ability to store serialized objects has recently been in the news because of very serious vulnerabilities in the Rails framework.

Let’s explore ways of using YAML beyond a simple store of arrays and hashes of data, as well as the risks and benefits of doing so.

We will discuss YAML databases, configuration files, conversion into and from Ruby objects, incompatibilities between parsers (Syck and Psych) and the reasons for them, and some gotchas.

Some reference will be made to YAML libraries in Ruby and other languages; see http://www.yaml.org/ for the specification and a list of implementations.

In-progress notes on the topic follow.

YAML Spec and Terminology

%YAML 1.2
---
YAML: YAML Ain't Markup Language

What It Is: YAML is a human friendly data serialization
  standard for all programming languages.

The primary objective of this revision is to bring YAML into compliance with JSON as an official subset. YAML 1.2 is compatible with 1.1 for most practical applications - this is a minor revision. An expected source of incompatibility with prior versions of YAML, especially the syck implementation, is the change in implicit typing rules. We have removed unique implicit typing rules and have updated these rules to align them with JSON’s productions. In this version of YAML, boolean values may be serialized as “true” or “false”; the empty scalar as “null”. Unquoted numeric values are a superset of JSON’s numeric production. Other changes in the specification were the removal of the Unicode line breaks and production bug fixes. We also define 3 built-in implicit typing rule sets: untyped, strict JSON, and a more flexible YAML rule set that extends JSON typing.
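The implicit typing rules mentioned above are easy to observe from Ruby. The following is a minimal sketch, assuming a Ruby whose YAML engine is Psych; Psych resolves plain scalars with the older YAML 1.1 style rules, so unquoted yes and no become booleans rather than strings. The key names are made up for illustration.

```ruby
require "yaml"

# Illustrative document; every key name here is hypothetical.
text = <<~YAML
  json_style: true
  yaml11_style: yes
  country: no
  quoted: "no"
  empty:
  tilde: ~
YAML

doc = YAML.safe_load(text)

# Print each value with its Ruby type made visible.
doc.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Quoting a scalar is the portable way to force a string, since a parser that follows the 1.2 core rules and one that follows the 1.1 rules can disagree about the unquoted forms.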

Terms

  • Collections: Sequence, Mapping, Mapping-in-Sequence Shortcut, Sequence-in-Mapping Shortcut, Merge key
  • Basic Types: Strings, Indicators in Strings, Plain scalars, Null, Boolean, Integers, Integers as Map Keys, Floats, Time, Date
  • Blocks: Single ending newline, The '+' indicator, Three trailing newlines in literals, Extra trailing newlines with spaces, Folded Block in a Sequence
  • Aliases and Anchors
  • Documents: Trailing Document Separator, Leading Document Separator, YAML Header
  • YAML For Ruby: Symbols, Ranges, Regexps, Perl Regexps, Struct class, Nested Structs, Objects, Extending Kernel::Array, Extending Kernel::Hash

YAML Ruby Library Code

Psych and Syck

Guides

Gotchas

Ruby

  • 1.8: uses Syck
  • 1.9: YAML == Psych, but you can switch back with YAML::ENGINE.yamler = 'syck'
  • 2.0: YAML == Psych; Syck removed from the stdlib
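To see which engine a given Ruby is using, something like the following sketch can help. It assumes Ruby 2.0 or later, where YAML is just another name for the Psych module; the YAML::ENGINE check is guarded because that constant only exists on 1.9-era Rubies.

```ruby
require "yaml"

# On Ruby 2.0 and later the YAML constant points at the Psych module itself.
puts "YAML is Psych: #{YAML.equal?(Psych)}"
puts "Psych version: #{Psych::VERSION}"
puts "libyaml version: #{Psych::LIBYAML_VERSION}"

# YAML::ENGINE is a 1.9-era switch; skip it quietly where it is not defined.
if defined?(YAML::ENGINE)
  puts "yamler: #{YAML::ENGINE.yamler}"
end
```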

Tricks

defaults: &defaults
  adapter: mysql2
  encoding: utf8
  reconnect: false
  pool: 5
  username: sqluser
  password: s3cret
  host: localhost

development:
  <<: *defaults
  database: app_development

test: &test
  <<: *defaults
  database: app_test

production:
  <<: *defaults
  username: productionsqluser
  password: productions3cret
  database: app_production
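The anchor (&defaults), alias (*defaults) and merge key (<<) above can be checked from Ruby. This is a sketch under the assumption of a reasonably recent Psych, where safe_load accepts the aliases: keyword; older versions take different arguments. The variable names are illustrative only.

```ruby
require "yaml"

# The same configuration as above, inlined so the example is self-contained.
database_yml = <<~YAML
  defaults: &defaults
    adapter: mysql2
    encoding: utf8
    reconnect: false
    pool: 5
    username: sqluser
    password: s3cret
    host: localhost

  development:
    <<: *defaults
    database: app_development

  test: &test
    <<: *defaults
    database: app_test

  production:
    <<: *defaults
    username: productionsqluser
    password: productions3cret
    database: app_production
YAML

# Aliases must be enabled explicitly when using safe_load.
config = YAML.safe_load(database_yml, aliases: true)

production = config["production"]
puts production["adapter"]   # inherited from defaults
puts production["username"]  # overridden in the production section
```

Keys written after the merge key override the merged values, which is why production keeps the shared adapter and pool settings but gets its own credentials.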

Security
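The risk with serialized objects is that a YAML document can carry a tag naming a Ruby class, and a permissive loader will instantiate it. The sketch below assumes a Psych version that provides safe_load with the permitted_classes: keyword; Account is a hypothetical class invented for this illustration.

```ruby
require "yaml"

# Hypothetical class used only for this illustration.
class Account
  attr_reader :name, :admin

  def initialize(name, admin)
    @name = name
    @admin = admin
  end
end

# Dumping an object emits a document tagged with the class name.
yaml = YAML.dump(Account.new("alice", false))
puts yaml

# safe_load refuses classes that were not explicitly permitted.
rejected =
  begin
    YAML.safe_load(yaml)
    false
  rescue Psych::DisallowedClass
    true
  end
puts "rejected by default: #{rejected}"

# Opting in to a specific class restores the object.
restored = YAML.safe_load(yaml, permitted_classes: [Account])
puts restored.name
```

Listing the permitted classes keeps untrusted input from choosing which objects get built, at the cost of having to name each class up front.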

Interesting

Misc dump for now