Introduction

An inventory is a set of variables that defines an environment for Ansible. In this environment Ansible finds the values needed to perform actions on a machine or a group of machines. An inventory contains no tasks, only values.

An inventory is defined in YAML or JSON format, whichever you prefer.
We will discuss the YAML format for readability, but any YAML file can easily be converted to JSON.

In this documentation we will take you through the process of creating an inventory, from its simplest form to an inventory with multiple sites and different parameters
for subsets, and show how to arrange them in your inventory.

This guide is by no means a 'must do', but it will help you to set up an inventory that can handle multiple sites in one inventory.

The inventory described here can be a git repository, but it can also be the output of a script that generates the structure described. I will describe only the resulting inventory, because there are too many possible methods and sources to cover such a script.
Knowing the desired output will help you write the script for a 'dynamic' inventory from the sources that are available to you.

While it is possible to create one inventory that holds all of your environments, it is probably not advisable to start that way. An error in that inventory might disable your ability to run playbooks in your entire organization, stopping all automation until the error is fixed. Separate inventories can be merged at a later stage, if required.

The Ansible documentation site describes the creation of an inventory in detail. While reading it is almost mandatory, you can get lost in the possibilities. In these pages we will provide you with step-by-step information and the reasons behind each step.

Inventory structure

An inventory is basically a directory structure with files that describe the various objects in the inventory.

The (directory) structure

.
├── group_vars
│   └── all
└── host_vars

The basic structure is shown above, as seen from the inventory root directory.
This is the structure we will be filling with files to create a working version of an inventory.

Inventory Basics

The most basic version of an inventory is a single file. This file can be written
in 2 formats:

  • ini format
  • yaml format

We will start with the ini format, as it is a well-known format that anyone can read.

The hosts.ini (or hosts.yaml) file is the heart of any inventory. It defines which hosts are in the inventory (i.e. usable by Ansible) and the relations between the hosts and the rest of the inventory, but that comes later.
As said before, the simplest form of an inventory is a single file: the hosts.ini by itself.

Example hosts.ini:

[ALL]
host1.example.com
host2.example.com

Example of hosts.yaml:

all:
  hosts:
    host1.example.com:
    host2.example.com:

In its most basic form, this is a working inventory. Both hosts in this file can be targeted by Ansible using the hostname, provided the hostname is resolvable through DNS.

A playbook can be run against one of these machines with the following command:

ansible-playbook play.yml -i hosts.ini -e hosts=host1.example.com

or 

ansible-playbook play.yml -i hosts.yaml -e hosts=host1.example.com
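
The two commands above only select a host if the playbook reads the extra variable hosts in its hosts line. The playbook itself is not shown in this guide, so the following is only a minimal sketch of what such a play.yml could look like. The play name and the ping task are illustrative assumptions.

```yaml
---
# play.yml - hypothetical minimal playbook.
# The play targets whatever is passed on the command line with -e hosts=...
- name: Check connectivity of the selected host(s)
  hosts: "{{ hosts }}"
  gather_facts: false
  tasks:
    - name: Verify that Ansible can reach the host
      ansible.builtin.ping:
```

If the hosts line named a fixed host or group instead, the -e hosts=... argument would have no effect on which machines are targeted.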

The playbook has no variables available other than the hostname, so all other variables must be in the playbook itself. This is not very manageable when your environment consists of several hundred machines, where every machine has at least some parameters and configuration that differ. An inventory like this would force you to write a custom playbook for every machine.

This is not how automation is meant to work. You should be able to write a playbook that is generic for all hosts, or at least for a group of hosts.

This is where the group_vars kick in...

Inventory Global vars

In a large environment there are almost always variables/properties that must be
set on a global level, like the name of the organization.
This can be done by adding them to a yaml file in the group_vars/all directory.
All files in this directory are included by default whenever a machine is targeted
by Ansible.

Typical variables found in all/*.yml files:

  • corporate variables, like name
  • central management servers
  • time servers
  • authentication providers
  • ....

It is a best practice to separate these variables into files that have descriptive names.

The file for the corporate variables could be named:

organization.yml

---
organization_name: example
organization_domain: example.com
....

The file for central management servers:

management.yml

---
monitoring_server: 192.168.10.2
logging_server: 192.168.10.11
logon_server: 192.168.10.4

As you add files, you will be adding variables. Be sure to keep their names unique,
or expect them to be overridden at a lower level.
Our inventory tree now looks as follows:

.
├── group_vars
│   ├── all
│   │   ├── management.yml
│   │   └── organization.yml
│   └── all.yml
├── hosts.ini
└── host_vars

An inventory group_vars directory is not a directory structure of groups!
All groups are files (or directories) directly under the group_vars directory. Mapping to lower levels is done in the inventory file itself (hosts.ini here, inventory.yml later in this guide).
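
As a hedged illustration of how these global variables are consumed: assuming the organization.yml and management.yml files shown above are in group_vars/all, any play run with this inventory can reference their variables by name. The playbook below is a sketch and not part of the original inventory; the play and task names are made up.

```yaml
---
# Hypothetical playbook that only prints two of the global variables.
- name: Show global inventory variables
  hosts: all
  gather_facts: false
  tasks:
    - name: Print the organization name and the monitoring server
      ansible.builtin.debug:
        msg: "{{ organization_name }} is monitored by {{ monitoring_server }}"
```

Every host in the inventory sees the same values, because all hosts are members of the all group.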

Inventory Host vars

The host_vars are variables used mainly to facilitate deployment of a virtual machine.
Each host has a separate file to store these variables in.
All variables needed to deploy a host could reside in this file. However, this is not a best
practice.

Typically, you specify only the vars that are unique to a host in this file; the rest comes from
the group_vars or the global variables, depending on the environment.

Example host_vars for host1.example.com:

In the inventory, the file will be named host1.example.com.yml and will be placed in the
host_vars directory. The content of the file might look like the example below:

hostname: host1.example.com
os:
  name: rhel
  version: 1.0.0
primary_interface:
  ip: 192.168.10.100
  network: frontend1
size: s
type: vm

The example above shows a number of variables that define a VM (hence the type) and its properties:

  • hostname: the hostname of the virtual machine
  • os.name: it is a Red Hat Enterprise Linux machine
  • os.version: the git version of the code to be used for this virtual machine
  • ip: the IP address of the primary interface
  • network: the name of the network to be used
  • size: the (T-shirt) sizing of the VM
  • type: it is a VM

Although these variables seem to describe a virtual machine, they are incomplete. No code can
deploy this machine from this definition alone, because data is missing.
Let's zoom in a bit...

hostname

This is probably one of two variables that is directly usable. A hostname must be unique.

os.name

The operating system name is usable as such, but it does not tell us which release to install, so we still
need additional data to be able to build something like a VM.

os.version

The RHEL version could have been given here, but we chose to refer to the version of the code that deploys
the OS. So we still need additional data to be sure what to deploy.

ip

The IP address is the second variable that is directly usable from this file. But we need additional network
variables to make it work as a full configuration.

network

This name is not the name of a physical network, but a reference to a variable that holds the rest of the network
definition needed to configure a working interface.

size

The T-shirt sizing for the virtual machine. The translation of these sizes is defined elsewhere in the inventory.

type

The type of machine. The definition here tells us it will be a VM, but the code could configure a hardware machine as well.

As you can see, not all specs of a host are in this file. If they were, the file would be much larger (about 150 lines),
and for every new host you would copy it to a new file and probably change just 2 lines.
Most of that data you would never touch until the environment changes.
Then you would have a lot of work changing all these lines in all files...

This is where the group_vars come in.

After adding the host_vars to the inventory, the tree would look like:

.
├── group_vars
│   └── all
│       ├── management.yml
│       └── organization.yml
├── hosts.ini
└── host_vars
    ├── host1.example.com.yml
    ├── host2.example.com.yml
    └── host3.example.com.yml

This could be a working inventory if all variables needed to deploy the hosts were in the host_vars file of each host.
But since that is not a best practice, we won't show it.
The next step is adding a level of group_vars, in which we will define the missing variables.

Inventory Group vars

Group_vars can add a lot of dimension to your inventory. There is a right way
and, if there is a right way, there is also a less right way (maybe not wrong,
but labour intensive and error prone).

You can add hosts to a group, and add variables (properties) to that group.
These properties will be known to every member host of that particular group.

So we will extend our previous inventory by adding a group file to the inventory
and see what happens...
We replaced hosts.ini with a file that has a better name, inventory.yml. From now on we will only use the yaml version of the inventory.
The tree view of the inventory after adding the group_vars/group1.yml file:

.
├── group_vars
│   ├── all
│   │   ├── management.yml
│   │   └── organization.yml
│   └── group1.yml
├── inventory.yml
└── host_vars
    ├── host1.example.com.yml
    ├── host2.example.com.yml
    └── host3.example.com.yml

In this group1.yml file we added some properties that are common for all hosts in this group.

---
frontend1:
  network: 192.168.10.0
  netmask: 255.255.255.0
  default: 192.168.10.1
  dns: 8.8.8.8

sys_type: linux
rhel:
  major_version: 8

s:
  memory: 1024
  cpu: 1
m:
  memory: 2048
  cpu: 2
l:
  memory: 4096
  cpu: 4

Just adding the file will not change anything; we still need to add the systems to the group. This
is done by editing inventory.yml, adding the group there and adding the hosts to that group:

all:
  children:
    group1:
      hosts:
        host1.example.com:
        host2.example.com:
  hosts:
    host3.example.com:

As you see in the above example, we added two hosts to group1 and added a third host which is not a member
of that group.

This means that the variables defined in the file group1.yml will be available to host1 and host2, but not to host3.
Those variables can be used in any playbook you run against these two hosts.

This can be verified by using the following command in a terminal on a system that holds a copy of the inventory:

ansible-inventory -i inventory.yml --host host1.example.com

or

ansible-inventory -i inventory.yml --host host3.example.com

This will give you a list of all variables that are available to the mentioned host from the inventory. This is a powerful tool to use in the process of inventory building. When building an inventory, always check a host in each group for missing variables before implementing the inventory in production. This will save you a lot of corrections afterwards.
TIP! Test your inventory in each stage of development.
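
The host_vars shown earlier refer to group variables by name: size: s points at the s sizing block and network: frontend1 points at a network definition. As a sketch of how a playbook could resolve these references, the play below uses the ansible.builtin.vars lookup. It assumes that group1.yml stores the network definition under the key frontend1 (matching the network value in the host_vars); the playbook name and task are hypothetical.

```yaml
---
# show_vm_spec.yml - hypothetical playbook name.
# Resolves the indirect references (size, primary_interface.network)
# from host_vars into the definitions stored in group_vars/group1.yml.
- name: Resolve the indirect references in the host_vars
  hosts: group1
  gather_facts: false
  tasks:
    - name: Show the sizing and gateway that apply to this host
      ansible.builtin.debug:
        msg:
          memory: "{{ lookup('ansible.builtin.vars', size)['memory'] }}"
          cpu: "{{ lookup('ansible.builtin.vars', size)['cpu'] }}"
          gateway: "{{ lookup('ansible.builtin.vars', primary_interface.network)['default'] }}"
```

The lookup takes the value of a variable (for example s) and returns the variable with that name, so the host file only needs to carry the short reference.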

Now we add a second group, group_vars/group2.yml:

---
net_network: 192.168.11.0
net_netmask: 255.255.255.0
net_default: 192.168.11.1
net_dns: 8.8.8.8

sys_type: linux
rhel:
  major_version: 8

s:
  memory: 1024
  cpu: 1
m:
  memory: 2048
  cpu: 2
l:
  memory: 4096
  cpu: 4

and we make the third host part of that group through inventory.yml:

all:
  children:
    group1:
      hosts:
        host1.example.com:
        host2.example.com:
    group2:
      hosts:
        host3.example.com:

We can now use a playbook that uses the network variables to do something, without the need to create
a separate, specific play for each network. Used with many variables, this can be very powerful.

But there is a pitfall here. You can already see duplicate values in the group_vars files, and with many more
variables added to these files, you will find that managing them consumes a lot of time and might lead to
errors, as some values get changed and others are forgotten.
We will address this in the next chapter.

Adding layers to your inventory

As we have seen, there are lots of variables in an inventory. We need to arrange
them into a structure that is maintainable, so we don't get stuck copying them
hundreds of times and turning every change into a time-consuming job.
So how do we do this?

Every organization is different, not least in IT. So there is no guaranteed
cookbook that will work everywhere. Here it comes...
You know best how your organization is structured IT-wise; it is up to you to analyse
and structure the data into an inventory.

On this page, we will try to give you the handles to get started with modelling the
data in an inventory; after that it is entirely up to you. You shouldn't do this on
your own: you'll need multiple departments to figure out all the variables for your organization.

Data you will need:

  • datacenters
  • networks
  • authentication servers
  • vmware hosts/clusters/vcenter(s)
  • dns
  • time servers
  • logging servers
  • mail providers
  • management servers
  • monitoring servers
  • hostnames
  • ip addresses
  • locations
  • sub locations
  • client sites (optional)

These are just examples of what you might need for your organization, and each topic holds
a specific set of data that can contain duplicates if you don't organize it in the right way.
The above can even exist in multiple environments, like dev, acceptance and production.

You will have to find the logic in this and place each variable in the right spot, so that each value
is defined only once.

In the (old school) database world, where storage was expensive, there was a design process called 'data normalisation'.
This was done by data analysts to ensure that a value was stored only once in the entire database, so the least
amount of storage was used to store complex data. Guess what... the same principle applies to an inventory
and can help us manage our inventory.
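
Applied to the group1.yml and group2.yml files from the previous chapter, normalisation could look like the sketch below: the values that were identical in both group files move to a single file under group_vars/all, and the group files keep only what really differs (the network definition). The file name sizing.yml is an assumption made for this illustration.

```yaml
---
# group_vars/all/sizing.yml (hypothetical file name)
# Values that were duplicated in group1.yml and group2.yml,
# now defined once for every host in the inventory.
sys_type: linux
rhel:
  major_version: 8

s:
  memory: 1024
  cpu: 1
m:
  memory: 2048
  cpu: 2
l:
  memory: 4096
  cpu: 4
```

A change to a T-shirt size is then made in one place and picked up by every group.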

Inventory hierarchy

We should look at our inventory as a tree of data that is inherited as we go deeper into the inventory tree.
Let's give a small example:

└── organization
    ├── authentication.yml
    ├── central_date_time.yml
    ├── central_mail_provider.yml
    ├── central_management_servers.yml
    └── datacenter
        ├── authentication.yml
        ├── deployment.yml
        ├── logging_server.yml
        ├── monitoring.yml
        └── network
            └── host1.yml

In the above sample we see a tree of files that represent an inventory. The host in this example inherits all values
in the files above it. This defines much more than the host itself: it also defines its working environment, which
can be automated through playbooks.

Try to organise your environment in the way shown in the above example, and start by doing this for 1 host.
Once you have this for 1 host, try adding more hosts in the same datacenter. If done correctly, you won't need many
variables to define a host; in the ideal situation, you only add the host file with a minimal set of
values.

Once this is accomplished, add a network in the same datacenter. Adding hosts to that network should be just as
easy.

Extend the same procedure to adding datacenters, even customer sites, and think about where to put each branch in your
inventory.

.
└── organization
    ├── authentication.yml
    ├── central_date_time.yml
    ├── central_mail_provider.yml
    ├── central_management_servers.yml
    ├── datacenter
    │   ├── authentication.yml
    │   ├── deployment.yml
    │   ├── logging_server.yml
    │   ├── monitoring.yml
    │   ├── network
    │   │   ├── host1.yml
    │   │   └── host2.yml
    │   └── network2
    │       ├── host3.yml
    │       └── host4.yml
    └── datacenter2
        ├── authentication.yml
        ├── deployment.yml
        ├── logging_server.yml
        ├── monitoring.yml
        └── network
            └── host5.yml

How to write this as an inventory

We now know the hierarchy tree for the data. The next step is to convert this data tree into an inventory.

Everything on the organization level is apparently global for this organization, so these files will land in
the all section of the inventory:

.
├── group_vars
│   ├── all
│   │   ├── authentication.yml
│   │   ├── central_date_time.yml
│   │   ├── central_mail_provider.yml
│   │   └── central_management_servers.yml
│   ├── dc1_datacenter.yml
│   ├── dc1_network1.yml
│   ├── dc1_network2.yml
│   ├── dc2_datacenter.yml
│   └── dc2_network1.yml
├── inventory.yml
└── host_vars
    ├── host1.example.com.yml
    ├── host2.example.com.yml
    ├── host3.example.com.yml
    ├── host4.example.com.yml
    └── host5.example.com.yml

The inventory looks different from the hierarchy structure in the previous section.
The first part, the all section, is not so different, but the second part is.
In the inventory tree view, all the group_vars files sit unstructured in one directory.
This means the structure must be defined somewhere else, and that is indeed the case.

All hosts are in the host_vars directory. As you can see, each host must have a unique name
for this to work as an inventory.

The naming of the files in the inventory is crucial. If there are hundreds of files, naming
them according to the structure is key to keeping them together. The naming here is an example; your
organization will have other requirements or other names.
Filenames in an inventory must be unique!

The one file that ties this whole inventory together is inventory.yml.
In this file you define the structure of your inventory and where the machines are placed.

all:
  children:
    dc1_datacenter:
      children:
        dc1_network1:
          hosts:
            host1.example.com:
            host2.example.com:
        dc1_network2:
          hosts:
            host3.example.com:
    dc2_datacenter:
      children:
        dc2_network1:
          hosts:
            host4.example.com:
            host5.example.com:

Group names defined in inventory.yml are typically found in the group_vars folder as either:

  • a {group_name}.yml file
  • a directory named {group_name}

What determines whether it is a .yml file or a directory? You do! Start with only .yml files for your variables. Eventually you will find that a single file is becoming too large to edit comfortably; that is the moment to split it into multiple files in a directory named after the group. The names of the files within this directory must still be unique, so start with the group_name and append a useful name for the variables the file holds.

In the above example, host1.example.com inherits vars from the following files:

  • dc1_network1.yml
  • dc1_datacenter.yml
  • all files in the group_vars/all directory

host5.example.com is in a different datacenter and inherits from the following files:

  • dc2_network1.yml
  • dc2_datacenter.yml
  • all files in the group_vars/all directory

So host5 will have a different configuration from host1 and host2 on the following items, or
even more, as defined in the datacenter and network files:

  • IP address space / subnet / gateway
  • it will deploy on a different VMware cluster
  • it will send logging to another server
  • it will be in another domain for logins
  • it will have local monitoring servers in dc2

The code to deploy the virtual machine can be the same in all locations, because we have differentiated
this in the inventory. This should always be considered when creating an inventory.
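
To make this concrete, the two datacenter files could look like the sketch below. All variable names and values are hypothetical and only illustrate the idea: both files define the same variable names with different values, so one playbook works in both datacenters. The addresses come from a documentation range.

```yaml
---
# group_vars/dc1_datacenter.yml - hypothetical names and values
logging_server: 192.0.2.11
monitoring_server: 192.0.2.2
vmware_cluster: dc1-cluster01
login_domain: dc1.example.com
---
# group_vars/dc2_datacenter.yml - same variable names, different values
logging_server: 192.0.2.139
monitoring_server: 192.0.2.130
vmware_cluster: dc2-cluster01
login_domain: dc2.example.com
```

A variable defined in a datacenter group takes precedence over a variable with the same name in group_vars/all, so a global default can be overridden per datacenter where needed.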

Test your inventory

Test your inventory by running the ansible-inventory command and verify that the hosts have all the variables your code needs
for deployment and configuration of the machine in its environment.
Even then, you will probably need to adapt the code to your new inventory structure. But maintaining your inventory
will be much easier now.

Testing your inventory can be as simple as:

ansible-inventory -i <path_to_inventory> --host host5.example.com

This outputs all variables for that host that are found in the inventory.

ansible-inventory -i <path_to_inventory> --graph

This produces output like the following:

$ ansible-inventory -i hosts.ini --graph 
@all:
  |--@ALL:
  |--@dc1_datacenter:
  |  |--@dc1_network1:
  |  |  |--host1.example.com
  |  |  |--host2.example.com
  |  |--@dc1_network2:
  |  |  |--host3.example.com
  |--@dc2_datacenter:
  |  |--@dc2_network1:
  |  |  |--host4.example.com
  |  |  |--host5.example.com
  |--@ungrouped:

In the above example output, you see that the structure we defined earlier in our analysis is very
similar to the output of the graph. This tells us we have done something right here :-)
You can add layers by writing more group_vars files and adding the groups to your inventory file in the same way
we added these layers.
They will show up in the graph output too, and with more layers comes more control...
Be sure to have a naming convention for these groups in place, so you will know where you put a variable.
Document your layout and your changes...

Constructed Inventories

An inventory is in most cases a collection of variables that, when brought together, describe an environment where a server with specific specs will be deployed. The servers defined in this inventory are all managed by Ansible Automation Platform (AAP).

As the inventory grows and the number of systems increases, the chances of someone making an error also increase. This is one of the challenges of central management: keeping it organized and secure. Sharing all variables with all teams in AAP is probably not the best or most secure option; this is where constructed inventories can help. Be aware that there are a lot of other options. This is but one, and it might not be the best for your organization.

Below we describe a possible solution that reduces complexity for teams working in AAP while leaving the responsibilities where they belong.

This is possible by splitting the inventory into (from a team view) 2 parts. Each team has its own team-specific part of the inventory, in which it has full rights to change anything.
There is also the base corporate part, controlled by the automation management team, that holds all company-wide (global) variables.

Global (automation part)

This part of the inventory holds all variables needed to deploy machines or devices, except the machines the teams control themselves. All infrastructural variables are already defined in the correct groups, so they can be used easily.

This inventory is stored in a git repository that will be loaded into AAP as a project-sourced inventory.

Below is a possible structure of the base inventory:

├── group_vars
│   ├── all
│   │   ├── env.yml
│   │   └── satellite.yml
│   ├── dev.yml
│   ├── test_env.yml
│   ├── org_1.yml
│   ├── org_2.yml
│   └── vmware.yml
├── inventory.yml
└── host_vars
    ├── controller.localdomain.yml
    ├── gitlab.localdomain.yml
    ├── automationhub.localdomain.yml
    ├── reposerver.localdomain.yml
    ├── nameserver.localdomain.yml
    ├── satellite.localdomain.yml
    ├── virtserver.localdomain.yml
    └── gitrunner.localdomain.yml

As previously stated, this is only the definition of the deployment environment. The hosts defined in here are just the hosts that make up the environment to deploy a host in.

How can this be used for a team?

Team part

Each team will have a separate repository in which their part of the inventory is stored. They are in full control of this part of the inventory.
The structure of this inventory is as follows:

├── inventory.yml
├── group_vars
└── host_vars
    ├── host1.localdomain.yml
    └── host2.localdomain.yml

In this structure, the team can create new group_vars files to contain their own group variables and host_vars files to define new hosts to be deployed. There are a few rules all teams must abide by:

  • Hostnames must be unique
  • Group_vars files must be unique (prefix the filename with your team name)

The inventory.yml will look like this:

org_1:
  hosts:
    host1.localdomain:
    host2.localdomain:

In case of some extra groups:

org_1:
  children:
    org_1_group1:
      hosts:
        host1.localdomain:
        host2.localdomain:

This way extra variables can be defined in the file group_vars/org_1_group1.yml and are loaded automatically from the inventory.

How do we add these

Option 1

Define an inventory for each team that uses 2 inventory sources:

  • the base inventory
  • the team's own inventory, using an inventory file "org_1.yml" (rename inventory.yml to org_1.yml)

This alone will not create a functioning inventory. To make it work, the inventory.yml in the base inventory must be adapted for each new team that is added:

dev:
  children:
    org_1:
    org_2:
    test_env:
      hosts:
        git.example.com:
        vcenter.example.com:
...

The number of layers in this inventory is kept low for clarity.
Adding more organizations is simply a matter of adding a new group entry to the children of the environment in inventory.yml.

Pros and cons

A split inventory has some pros and cons. Below you'll find a few.

Pros:

  • A team only has a view of the part of the environment that they are responsible for.

Cons:

  • After an update to the base inventory, all combined inventories in AAP need to be refreshed; this is added overhead.
  • Updates are not implemented by default in a static inventory.
  • For system patching, a full inventory is needed to be able to patch all systems.

Option 2

In AAP 2.4 and up, a new inventory type has been added: the constructed inventory.
This requires that both inventories are created separately and then combined through a "constructed" inventory.
This way all variables are available to jobs.
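
A constructed inventory is driven by the constructed inventory plugin, which is configured through source variables on the constructed inventory. The fragment below is a minimal sketch of such source variables, reusing the group names org_1 and dev from the examples above. The resulting group name org_1_dev is made up for this illustration, and the exact settings you need depend on your AAP setup.

```yaml
---
# Source variables for a constructed inventory (sketch, not a complete setup).
plugin: constructed
strict: false
groups:
  # Hypothetical group: hosts that are members of both the team group
  # and the dev environment group in the input inventories.
  org_1_dev: "'org_1' in group_names and 'dev' in group_names"
```

The input inventories (the base inventory and the team inventory) are selected on the constructed inventory itself; the plugin only adds groups and variables on top of the hosts they provide.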

Dynamic inventory on proxmox

To create a dynamic inventory on proxmox, we need an inventory plugin.
The collection community.proxmox has the proxmox inventory plugin, and this is what we are going to use.
No programming, no special code, just the plugin in an execution environment that we will also need to deploy machines and containers on proxmox.
This inventory is nothing more than a gitlab repository with some files in it, using a certain structure, as we will explain.

Execution environment

As we said before, we will need an execution environment to run the inventory plugin. This environment can be built with the procedure described in:
Building execution environments

Content of the execution environment

If you use the build instructions on this site, the ee_vars.yml should contain the following vars:

ee_image_name: ee_proxmox
ee_python:
  - dnspython
  - proxmoxer
  - requests
  - netaddr
ee_collections:
  - community.general
  - community.proxmox
  - ansible.utils
ee_system:
  - openssh-clients [platform:redhat]
use_ansible_cfg: true
basic_image: quay.io/rockylinux/rockylinux:9.5-minimal
ee_build_steps:
ee_version: 1.0

This will build the EE you can use for this inventory.

gitlab project

The gitlab project for the inventory should contain the following files/directories:

.
├── group_vars
│   ├── proxmox.yml
│   ├── lxc.yml
│   ├── qemu.yml
│   ├── ansible.yml
│   ├── proxmox_all_lxc.yml
│   └── proxmox_all_qemu.yml
└── inventory
    ├── 00-static-groups.yml
    └── 01-inventory.proxmox.yml

The functional part of this inventory is in the inventory directory. There are 2 files here:

1) The first is 00-static-groups.yml.
In this file we lay out the structure of the inventory with the groups the plugin generates. This file must be read first, otherwise the inventory will not work correctly.
You can use the tags of your hosts on proxmox here to order your groups.

all:
  children:
    proxmox:
      children:
        ansible:
          children:
            lxc:
            qemu:
        proxmox_all_lxc:
        proxmox_all_qemu:
    proxmox_cluster:
      children:
        proxmox_nodes:

This will structure the inventory using the base groups I defined and the groups the plugin reads from the proxmox cluster.
ansible, lxc and qemu are tags I use on virtual machines to order the variables they need for ansible.

2) The second file is 01-inventory.proxmox.yml
In this file the inventory plugin is configured to retrieve the inventory information from the cluster.

The plugin configuration is as follows:

---
plugin: community.proxmox.proxmox
url: https://proxmox01.homelab:8006
validate_certs: false
want_facts: true

# Proxmox has supported API token authentication since release 6.2, as an alternative to logging in with a password.
user: <audit user on cluster>
password: <password>

# Group vms by tags for reference in playbooks.
keyed_groups:
  - key: proxmox_tags_parsed
    separator: ""
    prefix: ""

compose:
  ansible_host: proxmox_name

Make sure the password is supplied through a credential; for simplicity, I left it in the file here.
There are a lot of other possibilities to group the hosts read from proxmox. For those, read the plugin documentation.
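
The comment in the plugin configuration mentions API token authentication. As a hedged sketch, the same configuration with a token instead of a password could look as follows. The option names token_id and token_secret are given as I understand the plugin to document them, so verify them against the documentation of your installed collection version. The placeholders are left unfilled on purpose.

```yaml
---
# 01-inventory.proxmox.yml, token-based variant (sketch).
plugin: community.proxmox.proxmox
url: https://proxmox01.homelab:8006
validate_certs: false
want_facts: true

# API token authentication instead of a password.
# Assumption: option names as listed in the plugin documentation.
user: <audit user on cluster>
token_id: <token id>
token_secret: <token secret>

keyed_groups:
  - key: proxmox_tags_parsed
    separator: ""
    prefix: ""

compose:
  ansible_host: proxmox_name
```

As with the password, the token secret is better supplied through a credential than stored in the repository.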

The files in the group_vars directory map to the groups in the inventory and contain static variables for various playbooks that use the inventory.

The inventory in config as code

Below is the inventory as defined in configuration as code, using the execution environment ee-proxmox we built for this.

controller_inventories:

  - name: MGT_inventory_proxmox
    description: MGT proxmox inventory
    organization: MGT

controller_inventory_sources:

  - name: MGT_inventory_proxmox
    description:
    organization: MGT
    source: scm
    source_project: MGT_proxmox_inventory
    execution_environment: ee-proxmox
    inventory: MGT_inventory_proxmox
    update_on_launch: true
    overwrite_vars: true
    overwrite: true

Do not specify source_path: / in the inventory_source definition; it will break your inventory!