Ansible Variables & Facts, In Depth: the 22-Level Precedence, Facts, register & set_fact

Variables are where Ansible stops being a list of commands and starts being configuration management. The same playbook installs httpd on RHEL and apache2 on Ubuntu, listens on port 80 in staging and 443 in production, and templates a different worker_processes value onto every host, all because a value was looked up at run time instead of hard-coded. Get variables right and a single role serves a hundred different machines; get them wrong and you spend an afternoon asking “why is this host reading the wrong value?”

That “why is it reading the wrong value” question has one answer in Ansible, and it is precedence. A variable named http_port can be set in a dozen places at once — a role default, a group_vars file, the inventory, a set_fact, and -e on the command line — and Ansible has a single, fixed, documented rule for which one wins. This lesson covers that rule exhaustively: the full ~22-level precedence ladder as an ordered table you can actually use to debug. It then covers the three things that generate variables at run time rather than declaring them up front — facts (what Ansible discovers about a host), register (capturing the result of a task), and set_fact (computing a variable mid-play) — plus the magic variables that expose inventory and play state. By the end you will never again be surprised by which value a host actually used.

This is an Intermediate lesson in the Variables module of the Ansible Zero-to-Hero course, written for ansible-core 2.17+ / Ansible 10+ (2026), using fully-qualified collection names (FQCN) such as ansible.builtin.debug throughout.

Learning objectives

By the end of this lesson you will be able to:

Define variables in every supported location — extra-vars, play vars, inventory, group_vars/host_vars, role defaults and vars, registered results, and set_fact — and predict which wins.
Read and apply the full ~22-level variable precedence order to debug “wrong value” problems.
Reference variables correctly, choose the right data type, and avoid the bare-string and unquoted-{{ }} YAML gotchas.
Use gathered facts (ansible_facts, the setup module, gather_subset) and write custom facts with facts.d.
Capture task output with register and consume .stdout, .rc, .changed, and .results (for loops).
Compute variables at run time with set_fact, understand cacheable: true, and know when to reach for it.
Use the magic variables — hostvars, groups, group_names, inventory_hostname, ansible_play_hosts — to write inventory-aware playbooks.

Prerequisites & where this fits

You should be comfortable with playbook anatomy — a play has hosts, tasks, and keywords like become; a task names a module and passes it arguments — and with static inventory (groups, host_vars/, group_vars/). If those are shaky, read Ansible Playbooks, In Depth and Ansible Inventory, In Depth first; this lesson assumes that vocabulary and builds the variable system on top of it. It sits immediately after the core-modules lesson and immediately before Conditionals, Loops, Handlers & Tags — because conditionals and loops are driven by variables, so you must understand where variables come from before you branch on them. Everything here maps directly to the RHCE (EX294) objectives for variables and facts.

Core concepts

A variable in Ansible is a named value — a string, number, boolean, list, or dictionary — that is resolved when a task runs, using Jinja2 templating. You reference a variable by wrapping its name in double curly braces: {{ http_port }}. Internally every variable lives in a single namespace per host: when a play runs against web01, Ansible builds one merged dictionary of variables for that host, and {{ http_port }} is a lookup into it.

The subtlety is that the same key can be supplied from many sources at once, and Ansible must decide which value populates the merged dictionary. That decision is precedence: a strict, fixed ordering from “weakest” (most easily overridden) to “strongest” (overrides everything). The canonical mental model has two anchors you should memorise:

Role defaults are the weakest variable source. roles/<name>/defaults/main.yml exists precisely so a role author can ship a sensible value that any other source can override. Defaults are the floor.
Extra-vars (-e / --extra-vars) are the strongest of all. Nothing overrides a value passed on the command line. Extra-vars are the ceiling.

Everything else slots between those two anchors. A second mental model governs how Ansible decides between sources at the same level or across the inventory hierarchy: “more specific beats more general.” A value set on a host beats the same value set on a group the host belongs to; a value on a child group beats one on its parent; and among sibling groups, the one that sorts last alphabetically wins (unless you change a group’s ansible_group_priority). Hold those two ideas — the precedence ladder and specificity within the inventory — and the rest is detail.

One more concept underpins facts: gathering. Before a play’s tasks run, Ansible can connect to each host and discover properties of it — OS family, IP addresses, memory, mounted disks — by running the ansible.builtin.setup module. The results are facts, exposed under ansible_facts (and, with the legacy inject_facts_as_vars setting on, as top-level ansible_* variables). Facts are how a playbook adapts to the machine in front of it.

Variable types and how to reference them

Ansible variables are typed by YAML. The five types you use daily:

Type	YAML example	Reference	Notes
String	`app_name: webapp`	`{{ app_name }}`	Quote if it contains `:`, `{`, `#`, or leading `!`/`@`.
Number (int/float)	`http_port: 8080`	`{{ http_port }}`	Unquoted YAML numbers stay numeric; quoting makes them strings.
Boolean	`enabled: true`	`{{ enabled }}`	Use `true`/`false`. YAML 1.1 also accepts `yes/no/on/off` but `true/false` is clearest.
List (array)	`pkgs: [git, vim]`	`{{ pkgs }}`, `{{ pkgs[0] }}`	Iterate with `loop: "{{ pkgs }}"`.
Dictionary (map)	`limits: {soft: 1024, hard: 4096}`	`{{ limits.soft }}` or `{{ limits['soft'] }}`	Dot vs bracket — see the gotcha below.

Dot versus bracket notation. {{ limits.soft }} and {{ limits['soft'] }} usually mean the same thing, but bracket notation is safer. Dot notation breaks if the key collides with a Python dictionary method or attribute: {{ mydict.keys }} returns the built-in keys method, not a key named keys. Keys containing hyphens or starting with a digit cannot be written with dot syntax at all ({{ mydict.my-key }} is parsed as mydict.my minus key). Use brackets when a key is dynamic or might collide: {{ ansible_facts['distribution'] }}.
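
As a minimal sketch of the difference, the play below defines a dictionary whose keys are awkward for dot notation and reads them with brackets. The play and the keys `keys` and `max-open` are hypothetical and exist only for illustration.

```yaml
# Hypothetical demo play; the variable and key names are illustrative only.
- name: Read awkward dictionary keys with bracket notation
  hosts: localhost
  gather_facts: false
  vars:
    limits:
      soft: 1024
      hard: 4096
      keys: 2              # same name as the dictionary method keys()
      max-open: 65536      # hyphenated key: not addressable with a dot
  tasks:
    - name: Bracket notation reads the stored value for every key
      ansible.builtin.debug:
        msg: >-
          soft={{ limits['soft'] }}
          keys={{ limits['keys'] }}
          max-open={{ limits['max-open'] }}
```

Swapping `limits['keys']` for `limits.keys` is a quick way to see the collision described above on your own machine.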

The bare-variable and unquoted-`{{ }}` gotchas

Three YAML-versus-Jinja2 traps catch everyone learning Ansible.

1. A value that starts with {{ must be quoted. YAML sees a leading { as the start of an inline dictionary (flow mapping), so this is a syntax error:

# WRONG — YAML thinks { } is a dict, then chokes
- ansible.builtin.debug:
    msg: {{ app_name }}

Quote the whole value so YAML treats it as a string and hands it to Jinja2:

# RIGHT
- ansible.builtin.debug:
    msg: "{{ app_name }}"

The rule is simple: if a value begins with {{, wrap it in quotes. If {{ }} appears in the middle of a string (msg: "Port is {{ http_port }}") you would have quoted it anyway.

2. The bare-variable when exception. The when:, failed_when:, changed_when:, and assert.that: keys are already Jinja2 expressions — Ansible wraps them in {{ }} for you. So you write the bare variable name, no braces:

# RIGHT — when is implicitly a Jinja2 expression
- ansible.builtin.service:
    name: httpd
    state: started
  when: enable_web        # NOT  when: "{{ enable_web }}"

Putting {{ }} inside when has traditionally worked, but Ansible warns that conditional statements should not include Jinja2 templating delimiters such as {{ }}, because you are templating a template. Bare name in conditionals; braces everywhere else.

3. vars: can reference other vars, but order is not guaranteed across files. Within a single vars: block you may build one variable from another (base_url: "https://{{ host }}"), and Ansible resolves the chain lazily at use time. But do not rely on a variable defined in one source being visible to a higher-precedence source that is evaluated earlier — resolve such dependencies explicitly with set_fact if you hit ordering surprises.
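
The three rules above can be seen together in one small play. This is only a sketch: the variable names (`base_url`, `app_host`, `enable_web`) and the host name app.example.com are assumptions made for the example.

```yaml
# Hypothetical demo play; all variable names and values are illustrative.
- name: Quote leading braces, chain vars, keep conditionals bare
  hosts: localhost
  gather_facts: false
  vars:
    # Defined before the vars it uses; this is fine because the chain
    # is resolved lazily, when base_url is actually used.
    base_url: "https://{{ app_host }}:{{ http_port }}"
    app_host: app.example.com
    http_port: 8443
    enable_web: true
  tasks:
    - name: The value starts with a brace, so the whole string is quoted
      ansible.builtin.debug:
        msg: "{{ base_url }}"
      when: enable_web          # bare name: when is already a Jinja2 expression
```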

The full variable precedence: the ~22-level ordered table

This is the heart of the lesson. When the same variable name is set in more than one place, Ansible applies a fixed precedence. The list below is the official ordering from lowest (most easily overridden) to highest (overrides everything). The last entry wins.

#	Source (lowest → highest)	Where it lives	Scope	Typical use
1	Command-line values (non-`-e`)	e.g. `-u`/`--user` on the CLI	Run	Connection defaults like remote user. These are settings rather than variables, so role defaults remain the weakest variable source.
2	Role defaults	`roles/<r>/defaults/main.yml`	Role	The intended-to-be-overridden floor for a role.
3	Inventory file / script group vars	groups defined in `inventory` (INI/YAML) or a dynamic-inventory script	Group	Vars written next to group definitions in the inventory.
4	Inventory `group_vars/all`	`inventory_dir/group_vars/all`	All hosts	Site-wide defaults living beside the inventory.
5	Playbook `group_vars/all`	`group_vars/all` next to the playbook	All hosts	Project-wide defaults beside the playbook.
6	Inventory `group_vars/*`	`inventory_dir/group_vars/<group>`	Group	Per-group vars beside the inventory.
7	Playbook `group_vars/*`	`group_vars/<group>` next to the playbook	Group	Per-group vars beside the playbook.
8	Inventory file / script host vars	host lines / dynamic-inventory `_meta`	Host	Vars written next to host definitions in the inventory.
9	Inventory `host_vars/*`	`inventory_dir/host_vars/<host>`	Host	Per-host vars beside the inventory.
10	Playbook `host_vars/*`	`host_vars/<host>` next to the playbook	Host	Per-host vars beside the playbook.
11	Host facts / cached `set_fact`	gathered facts; `set_fact` with `cacheable: true`	Host	Discovered facts and persisted computed facts.
12	Play `vars`	`vars:` in the play	Play	Values scoped to one play.
13	Play `vars_prompt`	`vars_prompt:` in the play	Play	Interactive prompt values.
14	Play `vars_files`	`vars_files:` in the play	Play	External files loaded into the play.
15	Role `vars`	`roles/<r>/vars/main.yml`	Role	Role-internal vars meant to be hard to override.
16	Block `vars`	`vars:` on a `block:`	Block	Values shared by tasks in a block.
17	Task `vars`	`vars:` on a single task	Task	Values scoped to one task.
18	`include_vars`	`ansible.builtin.include_vars` at run time	Play (from load point)	Dynamically loaded var files.
19	`set_fact` / `register`	`ansible.builtin.set_fact`; task `register:`	Host	Run-time computed and captured values.
20	Role (and `include_role`) params	params passed to a role via `roles:`/`import_role`/`include_role`	Role invocation	Per-call role parameters.
21	`include` params	params passed to an included task file (`include_tasks` / `vars` on `import_tasks`)	Include	Per-include parameters.
22	Extra vars	`-e`/`--extra-vars` on the CLI	Global	Always wins. Nothing overrides this.

A few load-bearing observations about this ladder:

-e / extra-vars is absolute. It overrides facts, set_fact, role params — everything. That is why -e is the right tool for a one-off override (-e "http_port=8443") and the wrong tool for routine config, because once a value is in extra-vars nothing inside the playbook can change it.
Role defaults (level 2) are deliberately near the bottom, while role vars (level 15) are high up. This is the single most important role-authoring rule: put a value in defaults/main.yml if you want users to override it; put it in vars/main.yml if you want it locked. Mixing these up is the #1 cause of “my group_vars won’t take effect”, because inventory, group_vars and host_vars sources (levels 3–10) cannot override role vars (level 15).
Inventory vars lose to play/role vars. Notice that all inventory sources (levels 3–10) sit below play vars (12) and role vars (15). A value in group_vars/all is easily overridden by a vars: block in the play. Inventory is for site facts, not for forcing values.
host_vars beats group_vars. Within inventory, host-level (8–10) always beats group-level (3–7), because a host is more specific than a group.
Connection vars are not magic. ansible_user, ansible_host, etc. follow this same table — they are just ordinary variables that connection plugins read. Set ansible_user in host_vars and it beats one in group_vars like any other variable.
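
To make the defaults-versus-vars rule concrete, here is a sketch of three separate files shown as one multi-document listing. The role name `web` and every variable in it are hypothetical; only the directory layout and the precedence levels come from the table above.

```yaml
# File 1: roles/web/defaults/main.yml  (level 2: meant to be overridden)
http_port: 80
worker_processes: 2
---
# File 2: roles/web/vars/main.yml  (level 15: beats every inventory source)
web_conf_dir: /etc/web
---
# File 3: group_vars/web.yml  (levels 6-7)
http_port: 8080            # takes effect: group_vars beat the role default
web_conf_dir: /srv/web     # has no effect: role vars rank higher than group_vars
```

If `web_conf_dir` is something users should be able to tune, the fix is to move it into the role's defaults file rather than to fight it from the inventory.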

Specificity within the inventory: groups, child groups, and priority

The group-level entries (levels 3–7) collapse to a sub-rule when one host belongs to several groups. Ansible merges group vars in this order:

Child beats parent. If webservers is a child of all (it always is) and you also nest prod_web under webservers, a var on prod_web beats the same var on webservers, which beats one on all. Depth wins.
Among sibling groups at the same depth, the last alphabetically wins. If web01 is in both datacentre_a and zone_blue (siblings, same depth) and both set dns_server, zone_blue wins because z > d. This is surprising and a classic interview trap.
ansible_group_priority overrides the alphabetical tiebreak. Set ansible_group_priority: 10 on a group (default is 1) to make it win regardless of name. Higher number wins; ties fall back to alphabetical. Set it in the inventory source itself rather than in group_vars/, because Ansible needs the value while it is loading group_vars. Priority only breaks ties between groups at the same level: it does not let a group beat a host or a child.

After all groups are merged, host vars are layered on top, so any host-level value beats any group-level value.
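
The sibling-group example above could be written as a YAML inventory like the sketch below. The addresses are made up for illustration; the group names and the dns_server variable follow the example in the text.

```yaml
# Hypothetical inventory.yml: web01 belongs to two sibling groups.
all:
  children:
    datacentre_a:
      hosts:
        web01:
      vars:
        dns_server: 10.0.0.53
        ansible_group_priority: 10   # default is 1; higher wins among same-level groups
    zone_blue:
      hosts:
        web01:
      vars:
        dns_server: 10.9.0.53
```

With the priority line in place, datacentre_a is expected to supply dns_server for web01; remove that line and the alphabetical rule lets zone_blue win instead. Running ansible-inventory -i inventory.yml --host web01 before and after is a simple way to confirm it.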

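Debugging tip: when a value is wrong, run ansible <host> -m ansible.builtin.debug -a "var=http_port" to see the resolved value for that one host, then walk this table from level 22 back toward level 1: the first source you find that sets the variable is the one that wins. An ad-hoc command has no play or role, so values from play vars, role vars, or set_fact only show up when you add a debug task to the playbook itself. ansible-inventory --host <host> dumps all inventory-sourced vars for a host, which usually reveals the culprit at levels 3–10.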

Facts: what Ansible knows about a host

Facts are variables Ansible discovers about a managed node by running the ansible.builtin.setup module at the start of a play. They let one playbook adapt to many machines — install dnf packages on RedHat hosts and apt packages on Debian hosts by reading ansible_facts['os_family'].

Gathering: `gather_facts` and where facts appear

By default every play runs an implicit setup task before its first real task; the play-level keyword controls it:

- hosts: web
  gather_facts: true     # default; set false to skip and speed up
  tasks:
    - ansible.builtin.debug:
        var: ansible_facts['distribution']

Facts appear under the ansible_facts dictionary, the modern, recommended namespace: ansible_facts['distribution'], ansible_facts['default_ipv4']['address'], ansible_facts['memtotal_mb']. Historically Ansible also injected each fact as a top-level ansible_-prefixed variable (ansible_distribution, ansible_default_ipv4). That injection is controlled by the inject_facts_as_vars setting (inject_facts_as_vars in ansible.cfg, or the ANSIBLE_INJECT_FACT_VARS environment variable), which still defaults to true for backward compatibility but is deprecated; prefer the ansible_facts['...'] form, which always works regardless of the setting.

The most-used facts:

Fact (`ansible_facts[...]`)	What it tells you	Example value
`distribution`	OS distribution	`Ubuntu`, `RedHat`, `CentOS`
`distribution_version` / `distribution_major_version`	OS version	`22.04` / `9`
`os_family`	Distro family (great for branching)	`Debian`, `RedHat`
`architecture`	CPU architecture	`x86_64`, `aarch64`
`processor_vcpus` / `processor_cores`	CPU counts	`4` / `2`
`memtotal_mb`	RAM in MB	`7976`
`default_ipv4['address']`	Primary IPv4	`10.0.1.12`
`all_ipv4_addresses`	List of all IPv4s	`["10.0.1.12", "172.17.0.1"]`
`hostname` / `fqdn`	Names	`web01` / `web01.corp.local`
`mounts`	List of mounted filesystems (size, used)	`[{mount: "/", size_total: ...}]`
`service_mgr`	Init system	`systemd`
`pkg_mgr`	Package manager	`dnf`, `apt`
`python['version']['major']`	Target Python major	`3`
`date_time['iso8601']`	Time on the host at gather	`2026-06-15T09:00:00Z`
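
As a sketch of how these facts drive branching, the play below installs Apache under the package name each distro family uses. It assumes a `web` group exists in your inventory and that the hosts are either RedHat-family or Debian-family.

```yaml
# Sketch only: assumes a "web" inventory group and privilege escalation rights.
- name: Install Apache using the package name for each distro family
  hosts: web
  become: true
  gather_facts: true
  tasks:
    - name: Install httpd on RedHat-family hosts
      ansible.builtin.dnf:
        name: httpd
        state: present
      when: ansible_facts['os_family'] == 'RedHat'

    - name: Install apache2 on Debian-family hosts
      ansible.builtin.apt:
        name: apache2
        state: present
        update_cache: true
      when: ansible_facts['os_family'] == 'Debian'
```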

`gather_subset`: gather less (or more)

Fact gathering is the slowest part of many plays. gather_subset controls which facts are collected, trading completeness for speed. Pass it as a play-level keyword or as the setup module’s gather_subset argument. Valid tokens (combine with commas; prefix with ! to exclude):

Token	Collects
`all`	Everything (default behaviour).
`min`	A minimal, fast core set (distribution, hostname, etc.).
`hardware`	CPU, memory, devices, mounts (can be slow — scans disks).
`network`	Interfaces and IP addressing.
`virtual`	Virtualisation role/type.
`ohai` / `facter`	Pull in Chef Ohai / Puppet Facter facts if installed.
`!hardware`, `!all`, `!min`	Exclude a subset (e.g. `!all,!min,network` = network only).

- hosts: web
  gather_facts: true
  gather_subset:
    - "!all"
    - "!min"
    - network          # gather network facts only — much faster

Two related knobs: gather_timeout (seconds before a slow fact subset gives up, default 10) and running setup explicitly with filter to fetch only matching keys: ansible.builtin.setup: filter=ansible_default_ipv4*.
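
The knobs above can also be combined in an explicit setup task instead of the implicit gather. The following is a sketch under the assumption that a `web` group exists; the timeout value of 20 is an arbitrary illustration.

```yaml
# Sketch only: explicit, narrowed fact gathering in place of the implicit step.
- name: Gather a narrow slice of facts on demand
  hosts: web
  gather_facts: false            # skip the implicit setup task
  tasks:
    - name: Collect network facts only and keep just the default IPv4 data
      ansible.builtin.setup:
        gather_subset:
          - "!all"
          - "!min"
          - network
        filter:
          - ansible_default_ipv4
        gather_timeout: 20       # illustrative value; the default is 10 seconds

    - name: Show what was collected
      ansible.builtin.debug:
        var: ansible_facts['default_ipv4']
```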

Fact caching (teaser)

Because gathering is expensive, Ansible can cache facts between runs so a play with gather_facts: false can still read facts gathered earlier. You enable it in ansible.cfg with a fact_caching plugin — jsonfile (a directory of JSON files), redis, or memcached — plus fact_caching_connection (the path or server) and fact_caching_timeout. With a cache, set_fact ... cacheable: true values also persist across runs (see below). Fact caching is covered in depth in a later operations lesson; for now, know that cached facts enter the precedence ladder at level 11, same as freshly gathered facts.

Custom facts: `facts.d`

You can teach a host to report your own facts. Drop an executable or INI/JSON file ending in .fact into the directory /etc/ansible/facts.d/ on the managed node. At gather time the setup module reads them and exposes the results under ansible_facts['ansible_local'].

An INI .fact file produces a section→key→value map.
An executable .fact (script) must print JSON on stdout.

Example — /etc/ansible/facts.d/app.fact:

[deployment]
tier=frontend
version=4.2.1

After gathering, ansible_facts['ansible_local']['app']['deployment']['tier'] is frontend. You can point the setup module at a different directory with fact_path. Custom facts are perfect for surfacing data only the host knows — a build number written by a previous deploy, a hardware asset tag, a feature flag.
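
Rather than creating the file by hand, a play can install the fact file and read it back in the same run. This is a sketch that reuses the app.fact content from the example above and assumes a `web` group and sudo rights on the managed node.

```yaml
# Sketch only: deploy the example fact file, then re-read local facts.
- name: Install a custom fact and read it back
  hosts: web
  become: true
  gather_facts: false
  tasks:
    - name: Ensure the facts.d directory exists
      ansible.builtin.file:
        path: /etc/ansible/facts.d
        state: directory
        mode: "0755"

    - name: Write the INI fact file
      ansible.builtin.copy:
        dest: /etc/ansible/facts.d/app.fact
        mode: "0644"
        content: |
          [deployment]
          tier=frontend
          version=4.2.1

    - name: Gather again so the new file is picked up
      ansible.builtin.setup:
        filter:
          - ansible_local

    - name: Read the custom fact
      ansible.builtin.debug:
        var: ansible_facts['ansible_local']['app']['deployment']['tier']
```

The explicit setup task matters here: facts gathered at the start of a play cannot include a file that was written later in that same play.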

`register`: capturing a task’s result

Modules return structured JSON when they run. register saves that JSON into a variable so later tasks can branch on it. This is how you turn “run a command” into “run a command and react to what happened.”

- name: Check if the service is active
  ansible.builtin.command: systemctl is-active httpd
  register: svc
  changed_when: false           # a read-only check never "changes" anything
  failed_when: false            # don't fail the play just because it's inactive

- name: Restart only if it was not active
  ansible.builtin.service:
    name: httpd
    state: restarted
  when: svc.stdout != "active"

The registered variable is a dictionary. The keys you will actually use:

Key	Meaning	Typical use
`.stdout`	Command’s stdout as one string	`when: result.stdout == "active"`
`.stdout_lines`	stdout split into a list of lines	`loop: "{{ result.stdout_lines }}"`
`.stderr` / `.stderr_lines`	Standard error	Diagnostics, error matching.
`.rc`	Return/exit code (command/shell)	`when: result.rc != 0`
`.changed`	Did this task report a change?	`when: result.changed`
`.failed`	Did the task fail?	`when: not result.failed`
`.skipped`	Was the task skipped (by `when`)?	Guard downstream tasks.
`.msg`	Human-readable message from the module	Logging, asserts.
`.results`	List of per-item results when the task has a `loop`	Iterate over loop outcomes.
`.attempts`	How many tries (with `until`/`retries`)	Retry diagnostics.

Three things people get wrong with register:

A registered variable is per-host and persists for the rest of the playbook run, including later plays in the same run. Each host has its own copy; do not assume one host’s registered result is visible to another (use hostvars for that, as described below).
Registering inside a loop gives you .results, not .stdout. The top-level variable becomes a wrapper; the real per-iteration data is the list result.results, each element having its own .stdout, .rc, .item (the loop value), and .changed. Iterate it: loop: "{{ result.results }}" then reference {{ item.stdout }}.
A skipped task still registers a variable: one with .skipped == true and no .stdout. Referencing result.stdout after a skip throws “dict object has no attribute ‘stdout’”. Guard with when: result is not skipped (the skipped test is also safe when the task did run and the key is absent) or use the default filter: {{ result.stdout | default('') }}.
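
The loop case is the one that most often surprises people, so here is a minimal sketch of registering inside a loop and then walking .results. The service names and the variable name svc_checks are assumptions for the example.

```yaml
# Sketch only: service names and the svc_checks variable are illustrative.
- name: Register inside a loop and read the per-item results
  hosts: web
  gather_facts: false
  tasks:
    - name: Check several services
      ansible.builtin.command: "systemctl is-active {{ item }}"
      loop:
        - httpd
        - sshd
      register: svc_checks
      changed_when: false        # read-only check
      failed_when: false         # an inactive service is not a play failure

    - name: Report each service from the results list
      ansible.builtin.debug:
        msg: "{{ item.item }} is {{ item.stdout | default('unknown') }} (rc={{ item.rc | default('n/a') }})"
      loop: "{{ svc_checks.results }}"
      loop_control:
        label: "{{ item.item }}"   # keep the log readable
```

In the second task, item is one element of svc_checks.results, and item.item is the original loop value (the service name) from the first task.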

A registered variable is just data, so you can post-process it with filters: {{ pkglist.stdout | from_json }}, {{ df.stdout_lines | select('match', '/dev') | list }}.
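
A short sketch of that kind of post-processing, using the second filter chain from the sentence above. It assumes a df command that accepts -P on the target, which is typical on Linux.

```yaml
# Sketch only: filters applied to registered command output.
- name: Post-process registered output with filters
  hosts: localhost
  gather_facts: false
  tasks:
    - name: List filesystems in POSIX format
      ansible.builtin.command: df -P
      register: df
      changed_when: false

    - name: Keep only the lines that start with /dev
      ansible.builtin.debug:
        msg: "{{ df.stdout_lines | select('match', '/dev') | list }}"
```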

`set_fact`: computing variables at run time

Sometimes the value you need does not exist until the play is running — it is computed from a fact, a registered result, or another variable. ansible.builtin.set_fact creates or updates a variable mid-play, on a per-host basis, and it sits high in precedence (level 19), so it overrides almost everything except role/include params and extra-vars.

- name: Derive the package name from the OS family
  ansible.builtin.set_fact:
    web_pkg: "{{ 'httpd' if ansible_facts['os_family'] == 'RedHat' else 'apache2' }}"

- name: Build values from the inventory name and a gathered fact
  ansible.builtin.set_fact:
    short_hostname: "{{ inventory_hostname.split('.')[0] }}"
    worker_count: "{{ ansible_facts['processor_vcpus'] | int * 2 }}"

- ansible.builtin.debug:
    msg: "Will install {{ web_pkg }} with {{ worker_count }} workers"

Key properties of set_fact:

Per-host and persistent within the play. Like register, each host gets its own value, and it lives for the remainder of the play (and into later plays in the same run, for that host).
It is not a fact in the gathered sense — by default it does not survive into a future ansible-playbook run. To make it persist across runs, add cacheable: true, which writes it through the active fact-caching plugin. A cacheable set_fact is stored under ansible_facts and re-enters precedence at level 11 (cached facts) on the next run, while still being usable at level 19 in the current run.
Use it for derived values, not for overriding config. A good set_fact computes something (worker_count from CPU count); a bad one tries to force a config value that should have come from group_vars. Because set_fact is high precedence, over-using it makes playbooks hard to override from the outside.
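Types can degrade to strings. Templated values pass through Jinja2, so a set_fact such as worker_count: "{{ n | int * 2 }}" may be stored as the string "8" rather than the number 8, depending on the templating mode in use. When a when: or a comparison depends on the type, cast at the point of use: worker_count | int >= 4, or ready | bool.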
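
The cacheable and casting points can be sketched together as follows. The threshold of 8 is arbitrary, and persistence across runs assumes a fact_caching plugin is already configured in ansible.cfg.

```yaml
# Sketch only: the threshold and variable name are illustrative.
- name: Compute once, persist, and cast where it matters
  hosts: web
  gather_facts: true
  tasks:
    - name: Derive worker_count and write it through the fact cache
      ansible.builtin.set_fact:
        worker_count: "{{ ansible_facts['processor_vcpus'] | int * 2 }}"
        cacheable: true      # persists across runs only if fact caching is enabled

    - name: Cast at the point of use before comparing
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} has enough workers for the large profile"
      when: worker_count | int >= 8
```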

A common, legitimate pattern is the OS-conditional set_fact to normalise differences once, then write the rest of the play in terms of your own variable:

- name: Normalise per-distro names once
  ansible.builtin.set_fact:
    web_pkg: "{{ 'httpd' if ansible_facts['os_family'] == 'RedHat' else 'apache2' }}"
    web_svc: "{{ 'httpd' if ansible_facts['os_family'] == 'RedHat' else 'apache2' }}"
    conf_dir: "{{ '/etc/httpd' if ansible_facts['os_family'] == 'RedHat' else '/etc/apache2' }}"
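
Once those names are normalised, later tasks in the same play can stay distro-neutral. A sketch of what could follow, assuming the set_fact above has already run and the play uses become:

```yaml
# Sketch only: tasks that would follow the set_fact in the same play.
- name: Install the web server package
  ansible.builtin.package:
    name: "{{ web_pkg }}"
    state: present

- name: Start and enable the web service
  ansible.builtin.service:
    name: "{{ web_svc }}"
    state: started
    enabled: true

- name: Show where the configuration lives on this host
  ansible.builtin.debug:
    msg: "Configuration directory: {{ conf_dir }}"
```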

Magic variables: inventory and play state

Magic variables are special, always-available variables Ansible populates itself. You do not set them; you read them to make a playbook inventory-aware. They are not affected by the precedence table (you cannot meaningfully override them). The essentials:

Magic variable	What it holds	Use
`inventory_hostname`	The name of the current host as written in inventory	Per-host filenames, identity.
`inventory_hostname_short`	The part before the first `.`	`web01` from `web01.corp.local`.
`hostvars`	Dict of every host’s variables and facts, keyed by inventory name	Read another host’s facts: `hostvars['db01']['ansible_facts']['default_ipv4']['address']`.
`groups`	Dict mapping group name → list of member hosts	Loop over all DB servers: `loop: "{{ groups['db'] }}"`.
`group_names`	List of groups the current host belongs to	`when: "'prod' in group_names"`.
`ansible_play_hosts`	Hosts in the current play still active (not failed/unreachable)	Quorum logic, “all the web nodes”.
`ansible_play_hosts_all`	All hosts targeted by the play, including failed ones	Reporting.
`ansible_play_batch`	Hosts in the current `serial` batch	Rolling-update awareness.
`play_hosts`	Deprecated alias of `ansible_play_hosts`	Avoid in new code.
`ansible_host`	The address Ansible actually connects to	May differ from `inventory_hostname`.
`ansible_hostname`	The host’s discovered short hostname (a fact)	Contrast with `inventory_hostname` (inventory’s name).
`inventory_dir`	Directory of the inventory source	Locate companion files.
`playbook_dir`	Directory of the running playbook	Build paths relative to the playbook.
`ansible_check_mode`	`true` when running with `--check`	Skip destructive steps in dry-run.
`ansible_version`	Dict of the controller’s Ansible version	Feature gating.
`omit`	A sentinel meaning “drop this parameter”	`mode: "{{ file_mode | default(omit) }}"`
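
A few of the magic variables from the table, used in everyday tasks. This is a sketch: the path /tmp/magic-demo, the optional file_mode variable, and the prod group are assumptions for illustration.

```yaml
# Sketch only: path, file_mode and the prod group are illustrative.
- name: Magic variables in everyday tasks
  hosts: web
  gather_facts: false
  tasks:
    - name: Pass mode only when file_mode is defined
      ansible.builtin.file:
        path: /tmp/magic-demo
        state: directory
        mode: "{{ file_mode | default(omit) }}"   # parameter is dropped if file_mode is unset

    - name: Run only for hosts in the prod group
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} is a production host"
      when: "'prod' in group_names"

    - name: Announce that this is a real run, not --check
      ansible.builtin.debug:
        msg: "Real run: destructive steps may proceed"
      when: not ansible_check_mode
```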

Two patterns make magic variables click:

Cross-host data with hostvars. A web server often needs the database server’s IP. The DB host gathered its own facts; the web host reads them through hostvars:

- name: Template the app config with the DB address
  ansible.builtin.template:
    src: app.conf.j2
    dest: /etc/app/app.conf
  vars:
    db_ip: "{{ hostvars['db01']['ansible_facts']['default_ipv4']['address'] }}"

This only works if db01 has been gathered (was in a prior play, or facts are cached) — otherwise its facts are not in hostvars yet.
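
One way to satisfy that condition is a short play that gathers facts on the database host before the web play runs. The sketch below uses debug instead of template so it runs without a template file; db01 and the web group are the names from the example above and must exist in your inventory.

```yaml
# Sketch only: db01 and the web group are assumed to exist in the inventory.
- name: Gather facts on the database host so hostvars is populated
  hosts: db01
  gather_facts: false
  tasks:
    - name: Collect facts explicitly
      ansible.builtin.setup:

- name: Use the database address on the web hosts
  hosts: web
  gather_facts: false
  tasks:
    - name: Show the address the app config would receive
      ansible.builtin.debug:
        msg: "db_ip={{ db_ip }}"
      vars:
        db_ip: "{{ hostvars['db01']['ansible_facts']['default_ipv4']['address'] }}"
```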

Iterating a group with groups. Build an /etc/hosts or a load-balancer backend list from inventory:

- name: List every web node's IP
  ansible.builtin.debug:
    msg: "{{ hostvars[item]['ansible_facts']['default_ipv4']['address'] }}"
  loop: "{{ groups['web'] }}"
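
For a load-balancer backend list, the same lookup can be done in one expression with the extract filter instead of a loop. This is a sketch under the assumption that every host in the web group is reachable and has gathered facts in this run; the variable name web_backends is hypothetical.

```yaml
# Sketch only: web_backends is an illustrative name.
- name: Build a backend list from inventory
  hosts: web
  gather_facts: true
  tasks:
    - name: Collect the primary IPv4 address of every host in the web group
      ansible.builtin.set_fact:
        web_backends: >-
          {{ groups['web']
             | map('extract', hostvars, ['ansible_facts', 'default_ipv4', 'address'])
             | list }}

    - name: Show the resulting list
      ansible.builtin.debug:
        var: web_backends
```

If the play is run with --limit, hosts outside the limit will not have facts in hostvars and the expression will fail for them, which is the same caveat as the hostvars example above.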

The classic confusion to settle now: inventory_hostname is what you named the host in inventory; ansible_hostname (and ansible_facts['hostname']) is what the machine calls itself. They are often different (an inventory alias web-prod-01 pointing at a box whose hostname is ip-10-0-1-12). Use inventory_hostname for identity in your automation; use the fact when you need the machine’s actual hostname.

The diagram shows the precedence ladder on the left (extra-vars overriding everything down to role defaults at the floor) and, on the right, the four run-time sources — gathered facts, custom facts.d facts, register, and set_fact — feeding the per-host variable namespace that magic variables like hostvars then expose.

Hands-on lab

This lab is free — it runs entirely on localhost plus one or two local containers, costs ₹0, and needs only Ansible installed.

Goal: prove precedence, gather and read facts, write a custom fact, use register and set_fact, and read magic variables.

Step 0 — Set up a tiny inventory

Create a working directory and an inventory with a group var, a host var, and overlapping values to demonstrate precedence.

mkdir -p ~/ansible-vars-lab/group_vars ~/ansible-vars-lab/host_vars
cd ~/ansible-vars-lab

inventory.ini:

[web]
localhost ansible_connection=local

[web:vars]
http_port=80

group_vars/web.yml:

http_port: 8080      # overrides the inventory [web:vars] value (level 7 > level 3)
greeting: "from group_vars"

host_vars/localhost.yml:

greeting: "from host_vars"   # host beats group → this wins

Step 1 — Watch precedence resolve

play-precedence.yml:

- name: Demonstrate precedence
  hosts: web
  gather_facts: false
  vars:
    http_port: 9090            # play vars (level 12) beat all inventory/group_vars
  tasks:
    - ansible.builtin.debug:
        msg: "http_port={{ http_port }} | greeting={{ greeting }}"

Run it, then override with extra-vars:

ansible-playbook -i inventory.ini play-precedence.yml
ansible-playbook -i inventory.ini play-precedence.yml -e "http_port=443"

Expected: the first run prints http_port=9090 (play vars beat group_vars 8080 and inventory 80) and greeting=from host_vars (host beat group). The second run prints http_port=443 — extra-vars override even the play vars. You have just watched levels 3, 7, 12, and 22 fight, and the higher level win every time.

Step 2 — Gather and read facts

play-facts.yml:

- name: Read gathered facts
  hosts: web
  gather_facts: true
  gather_subset:
    - "!all"
    - network              # the default minimum is kept, so distribution is still gathered
  tasks:
    - ansible.builtin.debug:
        msg: >-
          os={{ ansible_facts['distribution'] }}
          ip={{ ansible_facts['default_ipv4']['address'] | default('n/a') }}
          cpus={{ ansible_facts['processor_vcpus'] | default('n/a') }}

ansible-playbook -i inventory.ini play-facts.yml

Expected: your machine’s distribution and primary IP, gathered quickly because the slow hardware subset was skipped. For the same reason cpus prints n/a: processor_vcpus belongs to the hardware subset.

Step 3 — Write and read a custom fact

sudo mkdir -p /etc/ansible/facts.d
printf '[deployment]\ntier=lab\nversion=1.0\n' | sudo tee /etc/ansible/facts.d/app.fact

play-localfacts.yml:

- hosts: web
  gather_facts: true
  tasks:
    - ansible.builtin.debug:
        var: ansible_facts['ansible_local']['app']['deployment']['tier']

ansible-playbook -i inventory.ini play-localfacts.yml

Expected: lab. You taught the host a fact and read it back through ansible_local.

Step 4 — `register` and `set_fact`

play-register.yml:

- hosts: web
  gather_facts: true
  tasks:
    - name: Capture the kernel version
      ansible.builtin.command: uname -r
      register: kern
      changed_when: false

    - name: Derive values at run time
      ansible.builtin.set_fact:
        kernel: "{{ kern.stdout }}"
        double_cpus: "{{ ansible_facts['processor_vcpus'] | int * 2 }}"

    - ansible.builtin.debug:
        msg: "kernel={{ kernel }} rc={{ kern.rc }} double_cpus={{ double_cpus }}"

ansible-playbook -i inventory.ini play-register.yml

Expected: the kernel string, rc=0, and twice your CPU count — proof that register captured the command result and set_fact computed a new value from a fact.

Step 5 — Magic variables (optional, with a container)

If Docker or Podman is available, add a second target to see hostvars/groups span hosts:

docker run -d --name node2 --rm rockylinux:9 sleep infinity

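Append to inventory.ini, under a new group header so the host line is not parsed as part of the [web:vars] section that currently ends the file (the group name containers is only an illustration):

[containers]
node2 ansible_connection=docker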

play-magic.yml:

- hosts: all
  gather_facts: true
  tasks:
    - ansible.builtin.debug:
        msg: >-
          I am {{ inventory_hostname }};
          groups={{ group_names }};
          web members={{ groups['web'] }}

ansible-playbook -i inventory.ini play-magic.yml

Expected: each host prints its own inventory_hostname, the groups it belongs to, and the shared member list of web — the magic variables exposing inventory state.

Validation

# Resolved value for one host, walking precedence:
ansible web -i inventory.ini -m ansible.builtin.debug -a "var=http_port"
# All inventory-sourced vars for the host (levels 3–10):
ansible-inventory -i inventory.ini --host localhost

Cleanup

docker rm -f node2 2>/dev/null || true
sudo rm -f /etc/ansible/facts.d/app.fact
rm -rf ~/ansible-vars-lab

Cost note

₹0. Everything ran on localhost and an ephemeral local container; nothing was provisioned in any cloud.

Common mistakes & troubleshooting

Symptom	Likely cause	Fix
`group_vars` value “won’t take”	The value is also set in role `vars/main.yml` (level 15) which beats inventory (levels 3–10)	Move the role value to `defaults/main.yml` (level 2), or override with `-e`.
YAML syntax error on a task value (often with a hint about missing quotes)	Value starts with `{{` and is unquoted, so YAML parses `{` as a dict	Quote it: `msg: "{{ x }}"`.
Ansible warns that conditionals should not include `{{ }}` delimiters	`{{ }}` inside `when:`/`changed_when:` etc.	Use the bare variable name; those keys are already Jinja2.
`dict object has no attribute 'stdout'`	Reading `.stdout` from a skipped task or a looped register	Guard with `when: r is not skipped`; for loops, iterate `r.results` and read each element's `.stdout`.
Wrong value when a host is in two groups	Sibling groups — the one sorting last alphabetically won	Set `ansible_group_priority` on the group that should win.
`set_fact` value gone on the next playbook run	`set_fact` is per-run unless cached	Add `cacheable: true` and enable a `fact_caching` plugin.
`hostvars['db01'][...]` is undefined	`db01`’s facts were never gathered this run	Gather it in an earlier play, enable fact caching, or run `setup` against it.
A number is treated as a string in a comparison	Value came from `register`/`set_fact` as text	Cast: `{{ r.stdout | int }}`.
`ansible_distribution` undefined	`gather_facts: false`, or `inject_facts_as_vars=false`	Gather facts, and prefer `ansible_facts['distribution']`.
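Three of the rows above (skipped registers, looped registers, and string-typed numbers) can be reproduced and fixed in one small play. This is a sketch under assumptions: run_check is a hypothetical toggle, and nproc is assumed to exist on the target (true on typical Linux hosts).

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    run_check: true            # hypothetical toggle; set to false to see the skip path
  tasks:
    - name: Optional check that may be skipped
      ansible.builtin.command: nproc
      register: cpu_check
      changed_when: false
      when: run_check | bool

    - name: Read stdout only when the task actually ran, and cast before doing maths
      ansible.builtin.debug:
        msg: "cpus plus one = {{ cpu_check.stdout | int + 1 }}"
      when: cpu_check is not skipped

    - name: A looped task registers a list under .results
      ansible.builtin.command: "echo {{ item }}"
      loop: [alpha, beta]
      register: echoed
      changed_when: false

    - name: Collect stdout from every loop iteration
      ansible.builtin.debug:
        msg: "{{ echoed.results | map(attribute='stdout') | list }}"
```

The `is not skipped` test works whether or not the task ran, which makes it a safer guard than reading a `.skipped` attribute that only exists on skipped results.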

Best practices

Name with the source in mind. Reserve defaults/main.yml for everything a user might tune; put only locked, role-internal constants in vars/main.yml. This single discipline prevents most precedence pain.
Use host_vars/group_vars for configuration; use -e only for genuine one-offs. Because extra-vars cannot be overridden, routine config in -e is a trap.
Prefer ansible_facts['x'] over ansible_x. It is the supported form and survives inject_facts_as_vars: false, which is the future default.
Gather only what you need. gather_subset (and gather_facts: false where you can read cached facts) shaves real time off large fleets.
Make read-only commands honest. Pair register with changed_when: false on checks so they do not pollute the changed count.
Normalise per-OS differences once with a single set_fact, then write the rest of the play against your own variables — never sprinkle os_family conditionals through twenty tasks.
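Cast types explicitly (| int, | bool) whenever a value originates from register or a templated set_fact: command output in .stdout is always text, and templated values can come back as strings depending on the ansible-core version and settings.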
Keep variable scope as narrow as it can be. A task-scoped vars: is clearer than a play var that only one task uses.
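Two of these practices, normalising per-OS differences once and keeping read-only commands honest, might look like the sketch below. The variable name web_pkg is illustrative, and the else branch assumes every non-RedHat host uses the Debian-style package name, which will not hold for every fleet.

```yaml
- hosts: all
  gather_facts: true
  tasks:
    - name: Decide the OS-specific package name once
      ansible.builtin.set_fact:
        # Assumption: anything that is not RedHat-family uses apache2.
        web_pkg: "{{ 'httpd' if ansible_facts['os_family'] == 'RedHat' else 'apache2' }}"

    - name: Read-only check that is never reported as changed
      ansible.builtin.command: uname -r
      register: kern
      changed_when: false

    - name: Later tasks refer only to the normalised variable
      ansible.builtin.debug:
        msg: "would install {{ web_pkg }} on kernel {{ kern.stdout }}"
```

Every later task reads web_pkg, so a new OS family means changing one expression rather than hunting for conditionals across the play.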

Security notes

Never store secrets in plain group_vars/host_vars. Encrypt them with Ansible Vault (ansible-vault encrypt_string) — covered in Ansible Vault, In Depth. A password in group_vars/all is a password in your Git history.
Treat extra-vars as visible. Values passed with -e appear in shell history, process listings, and CI logs. Pull secrets from a vault file (-e @vault.yml) or a secret store, not inline on the command line.
Use no_log: true on tasks that handle secrets, including those that register sensitive output — otherwise the secret is printed at -v and in callbacks.
Custom facts run code on the target. An executable .fact in /etc/ansible/facts.d/ runs as part of gathering; restrict who can write to that directory (root-owned, 0755) so a compromised user cannot inject facts — or worse, code.
Cached facts can leak. A jsonfile fact cache is plaintext on the controller; if facts include anything sensitive (a cacheable: true token, an internal IP map), protect the cache directory and consider its retention.
Magic variables expose your inventory. hostvars and groups reveal every host and its facts to any play; be deliberate about printing them in logs that others can read.
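As a rough illustration of the no_log advice, the sketch below reads a secret from the target and keeps it out of task output. The path /etc/myapp/token and the variable names are hypothetical; substitute whatever your application actually uses.

```yaml
- hosts: all
  gather_facts: false
  tasks:
    - name: Read a token from the target (hypothetical path)
      ansible.builtin.slurp:
        src: /etc/myapp/token
      register: token_raw
      no_log: true             # the registered result holds the secret, so hide it

    - name: Decode it into a variable without echoing the value
      ansible.builtin.set_fact:
        api_token: "{{ token_raw.content | b64decode | trim }}"
      no_log: true

    - name: Confirm the token was read, printing only its length
      ansible.builtin.debug:
        msg: "token length = {{ api_token | length }}"
```

slurp returns file content base64-encoded, hence the b64decode filter. Note that no_log hides task output but does not encrypt anything, so it complements Vault rather than replacing it.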

Interview & exam questions

What overrides everything in Ansible variable precedence, and what is at the very bottom? Extra-vars (-e/--extra-vars) override everything; role defaults (defaults/main.yml) are the weakest, overridden by every other source.
A value in group_vars/all is being ignored in favour of one in a role. Why? The role almost certainly sets it in vars/main.yml (precedence level 15), which beats all inventory sources (levels 3–10). Move it to the role’s defaults/main.yml (level 2) so group_vars can override it.
host_vars vs group_vars — which wins, and why? host_vars wins. A host is more specific than a group, and the precedence table places host-level inventory vars above group-level ones.
A host belongs to two groups that both set the same variable. Which value applies? Among sibling groups at the same depth, the last alphabetically wins, unless you set ansible_group_priority to break the tie. Child groups always beat parent groups regardless of name.
What is the difference between register and set_fact? register captures the JSON result of a task into a variable (.stdout, .rc, .changed, .results); set_fact creates or computes a variable explicitly from any expression. Both are per-host. set_fact can persist across runs with cacheable: true; register cannot.
What does cacheable: true on set_fact do? It writes the value through the configured fact-caching plugin so it survives into future ansible-playbook runs, entering precedence at level 11 (cached facts) next time, while still usable at level 19 this run.
You looped a task and registered the result; result.stdout errors. Why and what’s the fix? With a loop, the registered variable wraps a list in result.results; each element has its own .stdout/.rc/.item. Iterate result.results and read item.stdout.
Difference between inventory_hostname and ansible_facts['hostname']? inventory_hostname is the name you gave the host in inventory (could be an alias); ansible_facts['hostname'] (a.k.a. ansible_hostname) is the short hostname the machine reports. They are frequently different.
How do you read another host’s IP from the current play? Through the hostvars magic variable: hostvars['db01']['ansible_facts']['default_ipv4']['address'] — provided db01’s facts were gathered (this run or from cache).
How do you speed up a play that doesn’t need every fact? Set gather_subset to only the needed subsets (e.g. ["!all","!min","network"]), cap slow collectors with gather_timeout, or skip gathering (gather_facts: false) and rely on cached facts.
Why must a value beginning with {{ be quoted, but a when: condition must not be? A leading {{ makes YAML try to parse a flow mapping, so it must be quoted to be a string; when: (and changed_when:, failed_when:) are already Jinja2 expressions, so you supply the bare variable and adding {{ }} is redundant (and warned against).
Where do custom facts live and where do they appear? Executable or INI/JSON files ending in .fact go in /etc/ansible/facts.d/ on the managed node (or a path set via fact_path); they surface under ansible_facts['ansible_local'].
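The sibling-group question above is easier to remember with a concrete inventory. This is a minimal sketch in YAML inventory format; the group names alpha and zulu and the host web01 are made up for illustration.

```yaml
# inventory.yml (hypothetical): web01 sits in two sibling groups at the same depth
all:
  children:
    alpha:
      hosts:
        web01:
      vars:
        http_port: 8080
        ansible_group_priority: 10   # higher number wins; the default is 1
    zulu:
      hosts:
        web01:
      vars:
        http_port: 9090              # would win alphabetically without the priority above
```

With the priority line removed, zulu sorts last and web01 resolves http_port to 9090; with it, alpha wins and the value is 8080. ansible_group_priority takes effect only when set in the inventory source itself, not in group_vars/ files.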

Quick check

Which is higher precedence: play vars or host_vars?
True/false: set_fact values automatically persist to the next playbook run.
What key holds per-item results when you register a task that has a loop?
Which gather_subset value collects the least?
Name the magic variable that maps each group name to its list of member hosts.

Answers

Play vars (level 12) beat host_vars (levels 8–10).
False — only if you add cacheable: true and a fact-caching plugin is configured; otherwise they last only for the current run.
.results — a list, each element with its own .stdout, .rc, .item, .changed.
min, which collects only a small, fast core set of facts. To gather even less, exclude it as well: !all,!min plus one named subset collects just that subset.
groups (e.g. groups['web']).

Exercise

Build a two-host setup (e.g. localhost plus one container) and a playbook that:

Sets app_port: 8080 in group_vars/all, app_port: 9090 in host_vars for one host only, and proves with a debug task that the two hosts resolve different values.
Gathers facts with only the network subset and prints each host’s primary IPv4.
Adds a custom fact facts.d/build.fact reporting a version, and reads it back via ansible_local.
Runs df -h / with register + changed_when: false, then uses set_fact to compute root_fs_line from .stdout_lines.
Uses hostvars and groups['all'] to print every host’s IP from a single play.
Finally, override app_port with -e "app_port=1234" and confirm both hosts now report 1234, demonstrating extra-vars at the top of precedence.

Bonus: add cacheable: true to a set_fact, enable the jsonfile fact cache in ansible.cfg, run twice, and confirm the value is available on the second run even with gather_facts: false.

Certification mapping

This lesson maps to the RHCE (EX294) exam objectives:

“Use variables” and “Manage variable precedence” — the full precedence table and the role defaults-vs-vars rule.
“Use Ansible facts” — gather_facts, ansible_facts, gather_subset, and custom facts in /etc/ansible/facts.d.
“Create and use roles” (variable portion) — defaults vs vars precedence for roles.
“Work with the registered variables” and run-time variable creation — register, set_fact, and consuming results.
Magic variables (hostvars, groups, inventory_hostname) recur throughout the exam’s inventory and templating tasks.

Glossary

Precedence — the fixed ordering Ansible uses to choose a value when a variable is set in multiple places; extra-vars highest, role defaults lowest.
Extra-vars — variables passed with -e/--extra-vars; the highest precedence; cannot be overridden.
Role defaults — variables in roles/<r>/defaults/main.yml; the lowest precedence, meant to be overridden.
Fact — a property of a managed node discovered by the setup module and exposed under ansible_facts.
gather_subset — controls which categories of facts are collected (all, min, hardware, network, virtual, with ! to exclude).
Custom fact (facts.d) — a user-supplied .fact file on the host, surfaced under ansible_facts['ansible_local'].
register — keyword that stores a task’s JSON result in a variable for later use.
set_fact — module that creates or computes a variable mid-play; cacheable: true persists it via the fact cache.
Magic variable — an Ansible-populated, always-available variable such as hostvars, groups, group_names, inventory_hostname, ansible_play_hosts.
hostvars — magic dictionary of every host’s variables and facts, keyed by inventory name; used to read another host’s data.
inventory_hostname — the host’s name as written in inventory (may be an alias), distinct from the discovered ansible_hostname fact.
ansible_group_priority — per-group setting that breaks the alphabetical tie between sibling groups (higher wins).

Next steps

Ansible Conditionals, Loops, Handlers & Tags, In Depth — branch and iterate using the variables, facts, and registered results you just learned to create.
Ansible Playbooks, In Depth — the play and task keywords (vars, vars_files, register) that this lesson’s precedence table references.
Ansible Jinja2 Templating, In Depth — the filters and tests (default, int, bool, from_json) you use to shape variables and facts.
Ansible Vault, In Depth — encrypt the sensitive variables that must never live in plaintext group_vars.

Written by Vinod
