How to Design an End-to-End Ansible Automation Lab with Playbooks, Inventories, Roles, Vault, Dynamic Inventory, and Custom Modules

Source: MarkTechPost

In this tutorial, we build a complete Ansible lab that runs end-to-end in Google Colab or any Linux environment. We start by installing ansible-core, setting up a local workspace, creating an Ansible configuration file, and defining both static and dynamic inventories. We then explore key Ansible concepts, including group variables, host variables, variable precedence, ad hoc commands, playbooks, loops, conditionals, registered outputs, facts, templates, custom filters, custom modules, roles, handlers, tags, dry runs, idempotency, and Ansible Vault. Since every host runs locally, we practice these concepts safely without needing SSH keys, remote servers, or cloud infrastructure.

import os, sys, subprocess, textwrap, stat

BASE = "/content/ansible_lab" if os.path.isdir("/content") else os.path.expanduser("~/ansible_lab")
os.makedirs(BASE, exist_ok=True)

ENV = os.environ.copy()
ENV["ANSIBLE_CONFIG"]      = os.path.join(BASE, "ansible.cfg")
ENV["ANSIBLE_FORCE_COLOR"] = "1"
ENV["PY_COLORS"]           = "0"

def banner(title):
    print("\n" + "=" * 78 + f"\n  {title}\n" + "=" * 78)

def write(relpath, content):
    """Write a dedented file under BASE, creating parent dirs."""
    path = os.path.join(BASE, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(textwrap.dedent(content).lstrip("\n"))
    return path

def sh(cmd, title=None):
    """Run a shell command from BASE, stream stdout, never raise."""
    if title:
        banner(title)
    print(f"$ {cmd}\n")
    p = subprocess.run(cmd, shell=True, cwd=BASE, env=ENV,
                       stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    print(p.stdout)
    return p.returncode

banner("STEP 1 — Installing ansible-core")
subprocess.run([sys.executable, "-m", "pip", "install", "-q", "ansible-core"], check=True)
sh("ansible --version")

write("ansible.cfg", """
    [defaults]
    inventory              = ./inventory.ini
    roles_path             = ./roles
    library                = ./library
    filter_plugins         = ./filter_plugins
    vault_password_file    = ./vault_pass.txt
    host_key_checking      = False
    retry_files_enabled    = False
    interpreter_python     = auto_silent
    callback_result_format = yaml
    deprecation_warnings   = False
    localhost_warning      = False
    nocows                 = 1

    [privilege_escalation]
    become = False
""")

write("inventory.ini", """
    [webservers]
    web1 ansible_connection=local
    web2 ansible_connection=local

    [dbservers]
    db1 ansible_connection=local

    [datacenter:children]
    webservers
    dbservers
""")

We start by preparing the Ansible workspace, setting environment variables, and defining helper functions that make the tutorial easier to run. We install ansible-core, verify the installation, and create the main Ansible configuration file. We also define a static inventory with local web and database host groups so that we can practice Ansible concepts without using remote servers.
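The write() helper depends on textwrap.dedent so that each file can be declared as an indented triple-quoted string. The following is a minimal sketch of that normalization step; `normalize` is a hypothetical name that mirrors the helper's one-line transformation without touching the filesystem.

```python
import textwrap

def normalize(content: str) -> str:
    """Mirror write(): strip the common indentation, then drop leading newlines."""
    return textwrap.dedent(content).lstrip("\n")

# An inventory fragment declared the same way the tutorial declares its files.
sample = """
    [webservers]
    web1 ansible_connection=local

    [dbservers]
    db1 ansible_connection=local
"""

result = normalize(sample)
print(result)
```

The printed text starts at column zero, which matters for YAML and INI files, where stray leading indentation could change the meaning or break parsing.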

write("group_vars/all.yml", """
    ---
    app_name: "Colab Demo App"
    app_version: "2.0.1"
    admin_email: "[email protected]"
    packages:
      - nginx
      - git
      - htop
    feature_flags:
      enable_cache: true
      enable_metrics: false
""")

write("host_vars/web1.yml", """
    ---
    server_id: 101
    max_connections: 512
""")

write("filter_plugins/custom_filters.py", '''
    import re

    def to_slug(value):
        return re.sub(r"[^a-z0-9]+", "-", str(value).lower()).strip("-")

    def human_bytes(value):
        n = float(value)
        for unit in ["B", "KB", "MB", "GB", "TB"]:
            if n < 1024:
                return f"{n:.1f}{unit}"
            n /= 1024
        return f"{n:.1f}PB"

    class FilterModule(object):
        def filters(self):
            return {"to_slug": to_slug, "human_bytes": human_bytes}
''')

write("library/system_report.py", '''
    #!/usr/bin/python
    from ansible.module_utils.basic import AnsibleModule
    import platform, os

    def main():
        module = AnsibleModule(
            argument_spec=dict(
                label=dict(type="str", required=True),
                threshold=dict(type="int", required=False, default=80),
            ),
            supports_check_mode=True,
        )
        report = {
            "label": module.params["label"],
            "system": platform.system(),
            "release": platform.release(),
            "python": platform.python_version(),
            "cpu_count": os.cpu_count(),
            "threshold": module.params["threshold"],
        }
        module.exit_json(changed=False,
                         report=report,
                         message="Report generated for %s" % module.params["label"])

    if __name__ == "__main__":
        main()
''')

We define shared group variables and host-specific variables to show how Ansible manages configuration data and applies variable precedence, with host_vars overriding group_vars for the same key. We then create a custom Jinja2 filter plugin that converts text into slugs and formats byte values into readable units. We also build a custom Python-based Ansible module that generates a simple system report for each host.
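Because the two filters are plain Python functions, we can sanity-check them outside Ansible before wiring them into a playbook. The sketch below copies the function bodies from the filter plugin and calls them with the same inputs the playbook uses later; the variable names `slug` and `size` are only for illustration.

```python
import re

def to_slug(value):
    # Lowercase, collapse every run of non-alphanumerics into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", str(value).lower()).strip("-")

def human_bytes(value):
    # Divide by 1024 until the number fits the current unit.
    n = float(value)
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if n < 1024:
            return f"{n:.1f}{unit}"
        n /= 1024
    return f"{n:.1f}PB"

slug = to_slug("Colab Demo App")
size = human_bytes(1536000)
print(slug)  # colab-demo-app
print(size)  # 1.5MB
```

Inside a playbook the same functions are reached through the pipe syntax, for example `{{ app_name | to_slug }}`, once the FilterModule class exposes them.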

write("roles/webserver/defaults/main.yml", """
    ---
    listen_port: 8080
""")

write("roles/webserver/vars/main.yml", """
    ---
    doc_root: "/tmp/www"
""")

write("roles/webserver/tasks/main.yml", """
    ---
    - name: Ensure docroot exists
      ansible.builtin.file:
        path: "{{ doc_root }}"
        state: directory
        mode: "0755"

    - name: Deploy index.html from a Jinja2 template
      ansible.builtin.template:
        src: index.html.j2
        dest: "{{ doc_root }}/index.html"
      notify: Restart web service

    - name: Run handlers immediately (instead of end of play)
      ansible.builtin.meta: flush_handlers
""")

write("roles/webserver/handlers/main.yml", """
    ---
    - name: Restart web service
      ansible.builtin.debug:
        msg: "(simulated) restarting web service on port {{ listen_port }}"
""")

write("roles/webserver/templates/index.html.j2", """
    <html>
      <head><title>{{ app_name }}</title></head>
      <body>
        <h1>{{ app_name }} v{{ app_version }}</h1>
        <p>Served on port {{ listen_port }} from {{ doc_root }}</p>
        <p>Host: {{ inventory_hostname }}</p>
      </body>
    </html>
""")

write("templates/report.txt.j2", """
    Deployment Report
    =================
    App:        {{ app_name }} ({{ app_version }})
    Host:       {{ inventory_hostname }}
    Generated:  {{ ansible_date_time.iso8601 | default('n/a') }}
    Slug:       {{ app_name | to_slug }}
    Packages:
    {% for p in packages %}
      - {{ p }}
    {% endfor %}
    Cache enabled:   {{ feature_flags.enable_cache }}
    Metrics enabled: {{ feature_flags.enable_metrics }}
""")

dyn = write("dynamic_inventory.py", '''
    #!/usr/bin/env python3
    import json, sys

    INV = {
        "webservers": {"hosts": ["web1", "web2"], "vars": {"role": "frontend"}},
        "dbservers":  {"hosts": ["db1"],          "vars": {"role": "backend"}},
        "_meta": {
            "hostvars": {
                "web1": {"ansible_connection": "local", "tier": "gold"},
                "web2": {"ansible_connection": "local", "tier": "silver"},
                "db1":  {"ansible_connection": "local", "tier": "gold"},
            }
        },
    }

    if "--host" in sys.argv:
        print(json.dumps({}))
    else:
        print(json.dumps(INV, indent=2))
''')
os.chmod(dyn, os.stat(dyn).st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)

We create a complete web server role with defaults, variables, tasks, handlers, and templates to demonstrate how to build reusable Ansible automation. We use Jinja2 templates to generate an HTML page and a deployment report from Ansible variables. We also add a dynamic inventory script that returns host and group information in JSON format.
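A dynamic inventory script is called with --list and is expected to print JSON that maps group names to hosts, optionally with a `_meta.hostvars` section holding per-host variables. As a rough sketch, we can check that payload shape in plain Python before handing the script to Ansible. The helper `hosts_missing_hostvars` is a hypothetical name introduced here, not part of Ansible.

```python
import json

# Same structure the tutorial's dynamic_inventory.py prints for --list.
INV = {
    "webservers": {"hosts": ["web1", "web2"], "vars": {"role": "frontend"}},
    "dbservers":  {"hosts": ["db1"],          "vars": {"role": "backend"}},
    "_meta": {
        "hostvars": {
            "web1": {"ansible_connection": "local", "tier": "gold"},
            "web2": {"ansible_connection": "local", "tier": "silver"},
            "db1":  {"ansible_connection": "local", "tier": "gold"},
        }
    },
}

def hosts_missing_hostvars(inventory):
    """Return hosts that appear in a group but have no entry in _meta.hostvars."""
    hostvars = inventory.get("_meta", {}).get("hostvars", {})
    listed = set()
    for group, data in inventory.items():
        if group == "_meta":
            continue
        listed.update(data.get("hosts", []))
    return sorted(listed - set(hostvars))

# Round-trip through JSON, as Ansible would see the script's output.
payload = json.loads(json.dumps(INV))
missing = hosts_missing_hostvars(payload)
gold = sorted(h for h, v in payload["_meta"]["hostvars"].items() if v["tier"] == "gold")
print("missing hostvars:", missing)
print("gold tier hosts:", gold)
```

Supplying `_meta` up front lets Ansible read all host variables from the single --list call rather than invoking the script once per host with --host.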

write("playbook.yml", """
    ---
    - name: Advanced concepts demo
      hosts: webservers
      gather_facts: true
      vars:
        deploy_user: colab
      tasks:
        - name: Merged variables (group_vars + host_vars precedence)
          ansible.builtin.debug:
            msg: "App={{ app_name }} v{{ app_version }} | server_id={{ server_id | default('n/a') }}"

        - name: CUSTOM filter -> to_slug
          ansible.builtin.debug:
            msg: "slug => {{ app_name | to_slug }}"

        - name: CUSTOM filter -> human_bytes
          ansible.builtin.debug:
            msg: "size => {{ 1536000 | human_bytes }}"

        - name: LOOP with an index variable
          ansible.builtin.debug:
            msg: "package #{{ idx + 1 }} = {{ item }}"
          loop: "{{ packages }}"
          loop_control:
            index_var: idx

        - name: CONDITIONAL (when) — only if caching is enabled
          ansible.builtin.debug:
            msg: "cache is ON"
          when: feature_flags.enable_cache | bool

        - name: Run a command and REGISTER its output
          ansible.builtin.command: date +%Y-%m-%d
          register: date_out
          changed_when: false

        - name: SET a derived fact from the registered value
          ansible.builtin.set_fact:
            deploy_stamp: "{{ app_name | to_slug }}-{{ date_out.stdout }}"

        - name: Show the derived fact
          ansible.builtin.debug:
            var: deploy_stamp

        - name: Run our CUSTOM MODULE (system_report)
          system_report:
            label: "{{ inventory_hostname }}"
            threshold: 90
          register: sysrep

        - name: Show custom module output
          ansible.builtin.debug:
            var: sysrep.report

        - name: BLOCK with rescue/always (error handling)
          block:
            - name: This fails on purpose
              ansible.builtin.command: /bin/false
            - name: Never reached
              ansible.builtin.debug:
                msg: "unreachable"
          rescue:
            - name: Recover gracefully
              ansible.builtin.debug:
                msg: "caught the failure — recovering"
          always:
            - name: Always run cleanup
              ansible.builtin.debug:
                msg: "cleanup runs no matter what"

        - name: Use a VAULT-encrypted secret (decrypted at runtime)
          ansible.builtin.debug:
            msg: "token prefix={{ api_secret_token[:3] }}*** len={{ api_secret_token | length }}"

        - name: TEMPLATE a report file (tagged 'report')
          ansible.builtin.template:
            src: report.txt.j2
            dest: "/tmp/{{ inventory_hostname }}_report.txt"
          tags: [report]

    - name: Role demo
      hosts: web1
      gather_facts: false
      roles:
        - role: webserver
""")

We write the main playbook that brings together variables, custom filters, loops, conditionals, registered outputs, derived facts, and a custom module. We intentionally include a failing command to demonstrate error handling through block, rescue, and always. We also use a Vault-encrypted secret and apply the web server role to demonstrate how role-based automation works in a real workflow.
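For readers coming from Python, block, rescue, and always behave much like try, except, and finally. The sketch below is only an analogy under that assumption, not how Ansible implements the feature; it runs a child process that exits non-zero (standing in for /bin/false) and records which branches execute. The names `events` and `run_block` are illustrative.

```python
import subprocess
import sys

events = []

def run_block():
    try:
        # "block": a command that exits with status 1, like /bin/false.
        subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"], check=True)
        events.append("never reached")
    except subprocess.CalledProcessError:
        # "rescue": runs only because a task in the block failed.
        events.append("rescue")
    finally:
        # "always": runs whether or not the block failed.
        events.append("always")

run_block()
print(events)  # ['rescue', 'always']
```

One difference worth keeping in mind is that Ansible evaluates the block per host, so one host can take the rescue path while another completes the block normally.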

banner("STEP 2 — Ansible Vault: encrypting an inline secret")
write("vault_pass.txt", "colab-demo-vault-pass\n")

enc = subprocess.run(
    "ansible-vault encrypt_string 'S3cr3t-Token-42' --name 'api_secret_token'",
    shell=True, cwd=BASE, env=ENV, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
).stdout

with open(os.path.join(BASE, "group_vars/webservers.yml"), "w") as f:
    f.write("---\n")
    f.write(enc)

print("group_vars/webservers.yml now contains:\n")
print(open(os.path.join(BASE, "group_vars/webservers.yml")).read())

sh("ansible-inventory -i inventory.ini --graph",      "STEP 3 — Static inventory graph")
sh("ansible-inventory -i dynamic_inventory.py --list", "STEP 4 — Dynamic inventory (JSON)")
sh("ansible all -m ping",                 "STEP 5 — Ad-hoc: ping all hosts")
sh("ansible web1 -m setup -a 'filter=ansible_python_version'",   "STEP 6 — Ad-hoc: gather a single fact")

We create a Vault password file and encrypt an inline secret that Ansible decrypts automatically when the playbook runs. We inspect both the static and dynamic inventories to understand how Ansible reads hosts, groups, and metadata. We then run ad-hoc commands to ping all hosts and gather a specific Python version fact from web1.
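Vault-encrypted content is recognizable by its first line, a header of the form `$ANSIBLE_VAULT;1.1;AES256`, followed by hex-encoded ciphertext. As a small sketch, the function below detects that header so a script can tell whether a value or file is already encrypted. `parse_vault_header` is a hypothetical helper, and the payload in `dummy` is a placeholder rather than real ciphertext.

```python
def parse_vault_header(text):
    """Return (format_version, cipher) if text starts with a vault header, else None."""
    stripped = text.strip()
    if not stripped:
        return None
    parts = stripped.splitlines()[0].strip().split(";")
    if len(parts) >= 3 and parts[0] == "$ANSIBLE_VAULT":
        return parts[1], parts[2]
    return None

# Placeholder payload: the hex line is dummy data, not output from ansible-vault.
dummy = "$ANSIBLE_VAULT;1.1;AES256\n3031323334353637\n"
plain = "db_password: not-encrypted\n"

print(parse_vault_header(dummy))  # ('1.1', 'AES256')
print(parse_vault_header(plain))  # None
```

A check like this only inspects the envelope; decrypting still requires ansible-vault and the password file configured in ansible.cfg.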

sh("ansible-playbook playbook.yml --check --diff", "STEP 7 — Dry run (--check)")
sh("ansible-playbook playbook.yml",                "STEP 8 — Real run")
sh("ansible-playbook playbook.yml",                "STEP 9 — Re-run (idempotency check)")
sh("ansible-playbook playbook.yml --tags report",  "STEP 10 — Run only tasks tagged 'report'")
sh("echo '--- /tmp/www/index.html ---'; cat /tmp/www/index.html; "
   "echo; echo '--- /tmp/web1_report.txt ---'; cat /tmp/web1_report.txt",
   "STEP 11 — Generated files")
sh('ansible webservers --limit web1 -m debug -a "var=api_secret_token"',
   "STEP 12a — Inline vault secret decrypted at runtime")

write("secrets.yml", """
    ---
    db_password: full-file-secret-99
    api_key: abc123
""")
sh("ansible-vault encrypt secrets.yml",        "STEP 12b — Encrypt a WHOLE file")
sh("head -c 60 secrets.yml; echo ' ...'")
sh("ansible-vault view secrets.yml",           "STEP 12c — View the fully-encrypted file")

banner("DONE — you now have a working advanced Ansible lab in Colab")
print(f"Workspace: {BASE}\nEdit any file there and re-run a step with the sh() helper.")

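We run the playbook in check mode, execute it for real, and rerun it to check idempotency: tasks whose desired state is already in place report no change on the second run. The report template is the exception, because it embeds a gathered timestamp and therefore renders differently each time. We use tags to run only the report-related task and then inspect the generated HTML and text report files. We also demonstrate full-file Vault encryption, read the encrypted file back with ansible-vault view, and complete the advanced Ansible lab.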
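The idea behind an idempotent file task can be sketched in a few lines: compare the desired content with what is on disk and write only when they differ, reporting whether anything changed. This is a simplified illustration, not the template module's actual implementation, and `ensure_content` is a hypothetical name.

```python
import os
import tempfile

def ensure_content(path, content):
    """Write content only when it differs from the file; return True if changed."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False  # already in the desired state
    with open(path, "w") as f:
        f.write(content)
    return True

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "index.html")
    first = ensure_content(target, "hello\n")              # file created
    second = ensure_content(target, "hello\n")             # nothing to do
    third = ensure_content(target, "hello at 12:00:01\n")  # content differs

print(first, second, third)  # True False True
```

The third call shows why rendered content that embeds a timestamp reports a change on every run: the desired state itself is different each time.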

In conclusion, we have a working Ansible lab that demonstrates how automation workflows are structured and executed in real projects. We created a reusable role, generated files from Jinja2 templates, ran a custom Python-based Ansible module, handled errors with block, rescue, and always, encrypted secrets with Ansible Vault, and validated our setup through dry runs and repeated executions. We also learned how static and dynamic inventories work, how tags help us run selected tasks, and how Ansible organizes infrastructure automation in a clean, repeatable way.


Sana Hassan

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.