New post about YAML

Thomas Lovén 2018-08-27 14:04:17 +02:00
parent 5d59956cf2
commit 8166afd667

@ -0,0 +1,504 @@
---
layout: post
title: YAML for Non-programmers
subtitle: and programmers
tags: [homeassistant]
---
It's been a while...
### Introduction
I've been fiddling with home automation for a few years, but only recently
found my way to [homeassistant](http://www.home-assistant.io) - a project with
great developers and a great community. I've been hanging about the official
[Discord Chat Server](https://discord.gg/c5DvZ4e), and try to give something
back by helping people when I can.
One thing I've noticed is that people often struggle with the fact that the
configuration is made through YAML. A strange choice for the kind of
quasi-programming you may want to do when automating your home appliances.
In this post I have tried to describe how YAML works, and how I think about the
way it represents basic data structures. I hope it can be useful to someone...
### Dictionaries and lists
To understand YAML, you need to understand what it's describing. There are two
main concepts: *Dictionaries* and *Lists*.
**Lists** are quite simple. It's just a list - an ordered collection of things.
There are two things you need to remember though. Let me show them to you in a
list :)
- Lists are ordered. If you make a list with a dog, a cat and an elephant, the
dog will be first, the elephant will be last and the cat will be in between.
- Lists can be lists of anything; strings, integers, booleans, or even
dictionaries or lists.
A list of lists might sound weird, but just think about the lists you have in
your home.
- A shopping list
- A todo list
- A list of things to pack for the vacation next week
- The list of passwords on a post-it under your keyboard
**Dictionaries** are not much more complicated. You've probably seen them in some
way or another, but perhaps under a different name. In different programming
languages they can be called *dictionaries*, *hashes*, *maps*, *hash tables*,
*tables*, *collections* or even just *objects*. The technical name is
*Associative Array*.
Regardless of name, the concept is simple. A dictionary is a collection of
key-value pairs. That is, a Name for something, and that Something. The name -
the *key* - must be unique: the same key cannot be used twice in the same
dictionary. The Something - the *value* - can be anything at all, just like the
items in a list.
Let's look at some sample dictionaries:
```yaml
Monday: Sausage and beans
Tuesday: Fish
Wednesday: French onion soup
Thursday: Pea soup and pancakes
Friday: Pizza
```
Each day is labeled by a key, and has a value - what you are going to eat that day.
Note that while you could add another `Wednesday` to the end of the dictionary, it
wouldn't really make sense. Thus keys have to be unique. The values don't,
however. It would make perfect sense to eat pizza again on Saturday.
Since keys are unique, their order is not important:
```yaml
Wednesday: French onion soup
Friday: Pizza
Monday: Sausage and beans
Tuesday: Fish
Thursday: Pea soup and pancakes
```
This dictionary contains exactly the same data as the one above. A clear
difference from a list, where the order itself is a part of the data.
Another dictionary:
```yaml
Name: Thomas Lovén
Email: thomasloven@example.com
```
That's a dictionary. Looks kind of like a database of a sort, doesn't it?
Like the address book in your email program? Ah! But don't get fooled. The
address book is a **list**, not a **dictionary**. However - the *items* in the
list are *dictionaries*.
Let's add on to that dictionary:
```yaml
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: singing, woodworking, home automation
```
Now we added an entry to the dictionary where the value is a list. I have three
hobbies. This illustrates that the value of a dictionary key-value pair can be
anything. Even a dictionary:
```yaml
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: singing, woodworking, home automation
Phones: Home: +46 (0)XX XXXXXX, Work: +46 (0)XX XXXXXX
```
And remember that lists can contain dictionaries too...
```yaml
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: singing, woodworking, home automation
Phones: Home: +46 (0)XX XXXXXX, Work: +46 (0)XX XXXXXX
Children: Name: N, Age: 3 ; Name: H, Age: 1
```
But at this point things are getting advanced. It's hard to keep track of
what is a dictionary and what is a list and what contains what...
If only there was a language to describe these concepts... a sort of Markup
Language, if you will... but who needs Yet Another one of those?
Let me tell you about
### JSON
JavaScript Object Notation - [JSON](https://www.json.org/). You
thought I was going to say YAML, didn't you? We'll get there too.
JSON is a simple way of writing down the concepts described above which can be
easily understood both by humans and by computers.
Basically, there are *objects* and *arrays*, but let's call them
*dictionaries* and *lists* instead.
*Lists* are surrounded by square brackets and contain items separated by commas.
The items can be strings, numbers, dictionaries, lists or any of the magic
values `true`, `false`, or `null`.
*Dictionaries* are surrounded by curly braces and contain key-value pairs
separated by commas. Each key-value pair has the key, a colon and the value.
Keys must be strings, but values can be anything that can be in a list.
Let's look at our dictionary from above in JSON format:
```json
{
  "Name": "Thomas Lovén",
  "Email": "thomasloven@example.com",
  "Hobbies": [
    "singing",
    "woodworking",
    "home automation"
  ],
  "Phones": {
    "Home": "+46 (0)XX XXXXXX",
    "Work": "+46 (0)XX XXXXXX"
  },
  "Children": [
    { "Name": "N", "Age": 3 },
    { "Name": "H", "Age": 1 }
  ]
}
```
I added some line breaks and indentations to make it more pretty, but this is
much easier to read. Even the last key-value pair about my children.
Two things to note:
- Keys are always strings, so they always need double quotes, and so do values
that are strings. Numbers and the magic values `true`, `false` and `null` are
written without quotes.
- The indentations and newlines I added, and in fact any whitespace not in
quotes, are ignored.
OK. Now you understand one markup language. Let's learn something different.
### YAML
[YAML](http://yaml.org/) Ain't a Markup Language - but it's pretty darn close,
to be honest.
While probably not historically accurate, YAML can be seen as an evolution of
JSON. In fact, any valid JSON is also valid YAML. That might be important to
remember. There are some notable differences, though.
First of all, YAML does away with the brackets and braces. Instead, items in
lists are separated by newlines, where each item starts with a dash:
```yaml
- Item 1
- Item 2
- Item 3 is a long one that stretches over multiple lines.
  The new item won't start until we get to a line that starts
  with a dash, like the one below this one.
- Item 4
-
  - Item 5a
  - Item 5b
  - Item 5c
- Item 6
```
Some things to note:
- There are no quotes. In YAML quotes are pretty much optional. This can be a
blessing and a curse. For example `"true"` is a string, but `true` is a
boolean value.
- Indentation is important. Item 3 in the list stretches over multiple lines.
Each line after the first one is indented (with an equal number of spaces,
*NOT* tabs). The same is true for Item 5, which is a list. Each item in the
sub-list is indented with an equal number of spaces.
- The items of the list are not of the same type. Most are strings, but item 5
is a list.
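To see what "a blessing and a curse" means in practice, here is a small sketch of how quotes change the type of a value. It assumes a YAML 1.1 parser (PyYAML, for example). Newer YAML 1.2 parsers treat fewer words as booleans, so check what your own parser does.

```yaml
- true      # a boolean
- "true"    # a string, because of the quotes
- 42        # an integer
- "42"      # a string
- on        # also a boolean in YAML 1.1 - easy to trip over
- "on"      # a string
- 1.10      # a number (1.1), the trailing zero is lost
- "1.10"    # a string, kept exactly as written
```

When in doubt about how a value will be read, quoting it is the safe choice.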
Dictionaries are also written one entry per line, with the key, a colon and the value:
```yaml
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies:
  - singing
  - woodworking
  - home automation
Phones:
  Home: +46 (0)XX XXXXXX
  Work: +46 (0)XX XXXXXX
Children:
  -
    Name: N
    Age: 3
  - Name: H
    Age: 1
```
Things to note:
- The value corresponding to the key `Hobbies` is a list. Like above, each line
of the value is indented by an equal number of spaces.
- The value corresponding to the key `Phones` is a dictionary. The same
indentation rules apply.
- The value corresponding to the key `Children` is a list where each item is a
dictionary. So each line in each dictionary is indented twice.
- The second entry in the list of children uses a contracted form, where the
first key-value pair of the dictionary is put on the same line that signifies
the list item. More on this later.
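The address book mentioned earlier, a list where each item is a dictionary, could be sketched like this. The second entry is made up for the example.

```yaml
# An address book: a list of dictionaries
- Name: Thomas Lovén
  Email: thomasloven@example.com
- Name: Jane Doe              # made-up contact
  Email: jane.doe@example.com
  Phones:
    Home: +46 (0)XX XXXXXX
```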
And that's all the basics of data representation using YAML.
### On indentation
As I've been trying to help people with their configurations on the
homeassistant Discord server, I have found one problem which occurs more than
any other: indentation errors.
*Indentation is important*
It *must* be correct, or the YAML won't be accepted by the parser, or it will
describe something entirely different from what you intended.
The only advice I can give is to think carefully about the structure of the
data you are trying to represent. What is your object? Is it a dictionary or a
list? Where is it contained? Is it freestanding? Is it the value of a
dictionary key-value pair? Is it an item in a list? What is its parent? What
are its children? What are its siblings?
If it's a complex structure, it might help to make a drawing on actual paper.
In the YAML dictionary sample above, I used a contracted form in my list of
dictionaries. This is common practice, but may be a bit confusing at first
since it makes the indentation unclear.
It might be easier to understand the structure of the document if you use the
expanded form:
```yaml
Children:
  -
    Name: N
    Age: 3
  -
    Name: H
    Age: 1
```
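As a sketch of how the same lines can describe something entirely different depending on indentation, compare the two documents below (separated by `---`, which starts a new YAML document). Both are valid YAML, so the parser won't complain about either of them.

```yaml
# Work is indented, so it is a sibling of Home and a child of Phones
Name: Thomas Lovén
Phones:
  Home: +46 (0)XX XXXXXX
  Work: +46 (0)XX XXXXXX
---
# Work is not indented, so it is a sibling of Name and Phones
Name: Thomas Lovén
Phones:
  Home: +46 (0)XX XXXXXX
Work: +46 (0)XX XXXXXX
```

In the second document `Work` has silently moved out of the `Phones` dictionary. This kind of mistake is hard to spot because nothing fails to parse.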
### Advanced topics
#### Comments
Adding comments to your code makes it easier to understand. Both to other
people, and - more importantly - to you when you return to it in six months
because something stopped working.
In YAML, comments begin with a number sign `#`, last until the end of the line
and are ignored by the parser.
```yaml
# A dictionary about me
Name: Thomas Lovén
Email: thomasloven@example.com # This isn't really my email
Hobbies:
  # Just some of the ways I like to waste time
  - singing # choir, mostly
  - woodworking
```
#### Spaces and colons
As mentioned, YAML doesn't require quotes around strings, but they are allowed.
Quotes can be useful to tweak the parsing. Imagine for example the following list:
```yaml
- Halflife 1
- Halflife 2
- Halflife 2: Episode Two
```
This is a list of strings, right? Wrong. The third entry is a dictionary with
the key "Halflife 2" and the value "Episode Two" (keys can contain spaces, by the way).
To fix this, you can use quotes:
```yaml
- Halflife 1
- Halflife 2
- "Halflife 2: Episode Two"
```
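The same trick applies to values in a dictionary. A minimal sketch with made-up keys: a colon followed by a space inside an unquoted value confuses the parser, so the value needs quotes. Single and double quotes both work here.

```yaml
# Made-up keys, just for illustration
double quoted: "Halflife 2: Episode Two"
single quoted: 'Halflife 2: Episode Two'
no space after the colon: Halflife 2:Episode Two  # fine unquoted, still one string
```

Written without quotes, a line like `game: Halflife 2: Episode Two` is typically rejected by the parser, since it can't tell where the key ends.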
#### Using JSON
There's a reason I went through JSON to explain YAML. As I said, all JSON is
also valid YAML. This allows for compact notation:
```yaml
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: [singing, woodworking, home automation]
Phones:
  Home: +46 (0)XX XXXXXX
  Work: +46 (0)XX XXXXXX
Children:
  - {Name: N, Age: 3}
  - {Name: H, Age: 1}
```
I mention this because you just might run into it sometime. I like to use it in
my configurations to bring down the line count, but it's easy to go overboard
and make the data hard to read instead. In the end it's a matter of taste.
Note that there are still no quotes. That's OK as long as you don't want
commas, `}` or `]` in the value.
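If a value does contain one of those characters, quotes turn it back into a single string. A small sketch, where the extra details are invented for the example:

```yaml
# Without the quotes, the comma would split the second hobby into two items
Hobbies: [singing, "woodworking, mostly chairs", home automation]
Children:
  - {Name: N, Age: 3, Nickname: "N, the elder"}
```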
#### Merging
Dictionaries can be merged using the key `<<`. For example:
```yaml
a key: a value
b key: b value
<<: {c key: c value, d key: d value}
```
will be parsed as
```yaml
a key: a value
b key: b value
c key: c value
d key: d value
```
and so will
```yaml
a key: a value
b key: b value
<<:
  c key: c value
  d key: d value
```
In short, the `<<` key takes a dictionary as its value, and merges it into the
parent dictionary.
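One detail worth knowing is what happens when the same key appears both in the parent dictionary and in the merged one. According to the YAML merge key definition, keys written directly in the parent win. A sketch, assuming a parser that supports `<<` (it is a YAML 1.1 extension, and not every parser implements it):

```yaml
a key: a value
b key: my own b value
<<: {b key: merged b value, c key: c value}
```

should be parsed as

```yaml
a key: a value
b key: my own b value
c key: c value
```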
#### Node anchors
Merging is very convenient when used in combination with node anchors.
Node anchors are a way of saving a dictionary, and reusing it later:
```yaml
my_dict: &my_dict
  a: 1
  b: 2
  c: 3
```
In this case `&my_dict` is NOT the value corresponding to the key `my_dict`,
but a node anchor - as signified by the ampersand `&`.
The anchor saves the value for later reuse and can be recalled any number of
times using an asterisk `*`:
```yaml
a dictionary: &saved
  a: 1
  b: 2
  c: 3
another dictionary: *saved
a list:
  - *saved
  - *saved
```
This will be parsed as:
```yaml
a dictionary:
  a: 1
  b: 2
  c: 3
another dictionary:
  a: 1
  b: 2
  c: 3
a list:
  -
    a: 1
    b: 2
    c: 3
  -
    a: 1
    b: 2
    c: 3
```
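The examples here save dictionaries, but as far as the YAML specification goes an anchor can be attached to any value, including a plain string or a list. A small sketch with made-up keys:

```yaml
favourite: &fav Pizza       # anchor on a string
Friday: *fav
Saturday: *fav
shopping list: &shopping    # anchor on a list
  - milk
  - eggs
this week: *shopping
```

Keep in mind that `<<` only merges dictionaries. There is no equivalent for adding items to a saved list.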
You can also merge an anchor if you want to add more entries to the dict:
```yaml
base: &base
  a: 1
  b: 2
extended version:
  <<: *base
  c: 3
```
At this point, understanding how this will parse shouldn't be a problem to you.
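If you want to check your answer, a parser with merge support should read the snippet above as:

```yaml
base:
  a: 1
  b: 2
extended version:
  a: 1
  b: 2
  c: 3
```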
Now, for my final trick:
#### Merging while defining
The problem with the above examples is that you need to put the definition
somewhere. The YAML snippets above will have the dictionary keys `a dictionary`
and `base` defined and set no matter what. Sometimes that's
impractical, which is why you often see the following in homeassistant
packages:
```yaml
homeassistant:
  customize:
    package.node_anchors:
      common: &common
        key1: val1
        key2: val2
    sensor.my_sensor:
      <<: *common
      icon: mdi:temp
    sensor.another_sensor:
      <<: *common
      icon: mdi:home
```
The `package.node_anchors` key in the `customize` dictionary contains a
dictionary of stuff that is simply ignored. Anything you put there will have no
effect on the package, so it's a great place to define anchors.
Another possibility is to put the definition in the first place it is used, and
merge it immediately:
```yaml
homeassistant:
  customize:
    sensor.my_sensor:
      <<: &common {key1: val1, key2: val2}
      icon: mdi:temp
    sensor.another_sensor:
      <<: *common
      icon: mdi:temp
```
Not all YAML parsers allow this, but it seems to work with homeassistant.
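Spelled out, and assuming the parser accepts the inline anchor, the snippet above should be equivalent to:

```yaml
homeassistant:
  customize:
    sensor.my_sensor:
      key1: val1
      key2: val2
      icon: mdi:temp
    sensor.another_sensor:
      key1: val1
      key2: val2
      icon: mdi:temp
```

No leftover helper key ends up in the configuration, which is the point of defining the anchor where it is first used.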