
Below is a concrete playbook for building a network automation pipeline with Ansible, NetBox, and GitHub Actions. It is written as a hands-on blueprint you can follow step by step, with example files (playbooks, inventory plugin config, GitHub Actions workflows), best practices (testing, safety, rollbacks), and operational guidance (monitoring, RBAC, audits).
I’ll assume you manage modern network devices (Cisco/Arista/Juniper, Linux routers) and want a GitOps-style workflow where NetBox is the source-of-truth (SoT) and Ansible enforces state.
Network Automation Pipeline Playbook
Stack: NetBox (SoT) → Git repo (configs & playbooks) → GitHub Actions CI → Ansible (NAPALM/Netmiko) → Network devices
Goals: safe, auditable, testable, promotable changes across environments (dev → staging → prod)
Contents
- Architecture & components
- Repo layout (recommended)
- NetBox: modelling & integration (SoT)
- Ansible: config patterns & examples
- Inventory: NetBox dynamic inventory config
- GitHub Actions: CI/CD workflows (lint → plan/dry-run → apply)
- Testing & validation (ansible-lint, molecule, dry-run, check_mode)
- Safety & rollback strategies
- Secrets management & security
- Observability, auditing & compliance
- Deployment checklist & runbook
1 — Architecture (high level)
- NetBox: authoritative inventory, IPAM, device roles, sites, structured variables (custom fields / device config snippets).
- Git repo: `infrastructure/network` holds playbooks, roles, templates, and device config templates; PRs are the change mechanism.
- GitHub Actions: CI runs lint/tests and produces plan/dry-run output; merging to `main` triggers a staged deploy (with approvals).
- Ansible Controller: runs in CI (or a separate runner) using credentials stored in Vault/GitHub Secrets; uses NAPALM/Netmiko modules for device config.
- Monitoring: Prometheus + Grafana + SNMP exporter; NetBox webhooks for change notifications.
- Audit/tracking: all changes pushed to Git & NetBox change logs; device config backups saved to repo or network config vault.
2 — Repo layout (recommended)

    network-automation/
    ├── ansible.cfg
    ├── requirements.txt          # Python dependencies (ansible, napalm, pynetbox)
    ├── inventories/
    │   └── netbox.yml            # NetBox inventory plugin config (dynamic)
    ├── roles/
    │   ├── base_config/
    │   │   ├── tasks/
    │   │   ├── templates/
    │   │   └── vars/
    │   └── vlan_provision/
    │       ├── tasks/
    │       ├── templates/
    │       └── defaults/
    ├── playbooks/
    │   ├── site.yml              # main orchestration playbook
    │   ├── vlan.yml              # example change playbook
    │   └── backup-configs.yml
    ├── templates/
    │   └── device_config.j2
    ├── tests/
    │   ├── molecule/             # optional molecule tests
    │   └── lint/                 # sample test cases
    ├── .github/
    │   └── workflows/
    │       ├── ci.yml            # lint & dry-run
    │       └── deploy.yml        # gated deploy
    └── docs/
        └── runbook.md

3 — NetBox: modelling & integration (SoT)
NetBox is the single source of truth for:
- Devices (name, device role, platform, site)
- Interfaces
- IPAM (prefixes, IP addresses)
- VLANs, VRFs, prefix assignments
- Custom fields for device-level config variables (e.g., `device_config_template`, `os_version`, `management_ip`)
Recommended NetBox setup
- Device roles: `leaf`, `spine`, `router`, `firewall`, `switch`
- Platforms: `cisco_ios`, `eos`, `junos`, etc. (used to pick the NAPALM driver)
- Custom Fields (device-level, JSON): `config_vars`, a JSON blob for role-specific vars
- Tags for environment: `env:dev`, `env:staging`, `env:prod`
- Secrets / credentials: do not store credentials in NetBox. Use Vault or GitHub Secrets.
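The platform slugs above are not always identical to NAPALM driver names (`cisco_ios` in NetBox versus `ios` in NAPALM, for example), so some translation step is needed wherever the platform is "used to pick the NAPALM driver". A minimal sketch of that lookup follows; `NAPALM_DRIVERS` and `napalm_driver_for` are hypothetical names, and the table only covers the example slugs listed here.

```python
# Hypothetical helper: translate NetBox platform slugs into NAPALM driver names.
# Only the example slugs from this playbook are mapped; extend for your platforms.
NAPALM_DRIVERS = {
    "cisco_ios": "ios",
    "eos": "eos",
    "junos": "junos",
}


def napalm_driver_for(platform_slug: str) -> str:
    """Return the NAPALM driver name for a NetBox platform slug."""
    try:
        return NAPALM_DRIVERS[platform_slug]
    except KeyError:
        # Fail loudly rather than guessing a driver for an unknown platform.
        raise ValueError(f"no NAPALM driver mapped for platform {platform_slug!r}") from None


print(napalm_driver_for("cisco_ios"))
```

Failing on unknown slugs is a deliberate choice in this sketch: a device with an unmapped platform is skipped with a clear error instead of being configured with the wrong driver.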
Webhooks & automation
- Configure NetBox webhook to notify CI or run discovery jobs on device creation/changes.
- Use NetBox change logs for auditing.
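If a NetBox webhook triggers CI, the receiver should confirm the request really came from NetBox. When a webhook secret is configured, NetBox sends an `X-Hook-Signature` header containing an HMAC-SHA512 hex digest of the request body. The check can be sketched as below; `signature_is_valid` is a hypothetical helper name, and the HTTP handling around it is left out.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, received_signature: str) -> bool:
    """Check a webhook body against the value of its X-Hook-Signature header."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha512).hexdigest()
    # compare_digest runs in constant time, so mismatches leak no timing hints.
    return hmac.compare_digest(expected, received_signature)


# Example with made-up values: sign a body, then verify it.
example_body = b'{"event": "created", "model": "device"}'
example_signature = hmac.new(b"example-secret", example_body, hashlib.sha512).hexdigest()
print(signature_is_valid("example-secret", example_body, example_signature))
```

Verify against the raw request bytes, not a re-serialized JSON object, since any change in whitespace or key order alters the digest.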
4 — Ansible: patterns & example playbook
Key practices
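- Use Ansible collections as needed: `ansible.netcommon` (which provides the `network_cli` connection), vendor collections such as `arista.eos` and `cisco.ios`, `netbox.netbox`, and `community.general`, plus the NAPALM modules.
- Prefer NAPALM or `network_cli`-based modules over `raw` CLI when possible.
- Use idempotent templates (Jinja2) and named blocks.
- Use `check_mode: true` for dry-run planning, and NAPALM `get_config` + `compare_config` patterns where supported.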
ansible.cfg (important defaults)

    [defaults]
    inventory = ./inventories/netbox.yml
    host_key_checking = False
    forks = 20
    timeout = 30
    retry_files_enabled = False
    command_warnings = False
    roles_path = ./roles
    nocows = 1
    stdout_callback = yaml
    deprecation_warnings = False

    [ssh_connection]
    pipelining = True

Example `playbooks/vlan.yml` (adds a VLAN across a set of switches)

    ---
    - name: Provision VLAN across edge switches
      # Union of the two platform groups, intersected with the prod tag group.
      hosts: platform_eos:platform_cisco_ios:&tag_env_prod
      gather_facts: false
      connection: network_cli

      vars:
        vlan_id: 100
        vlan_name: "users"

      tasks:
        - name: Ensure VLAN exists (IOS/NXOS)
          napalm_install_config:
            hostname: "{{ ansible_host }}"
            # NAPALM driver name, derived from the NetBox platform slug.
            dev_os: "{{ 'ios' if ansible_network_os == 'cisco_ios' else 'nxos' }}"
            config: |
              vlan {{ vlan_id }}
               name {{ vlan_name }}
            commit_changes: true
            replace_config: false
          # ansible_network_os is composed from the NetBox platform slug.
          when: ansible_network_os in ['cisco_ios', 'nxos']

        - name: Configure VLAN on Arista EOS (example)
          eos_config:
            lines:
              - "name {{ vlan_name }}"
            parents: "vlan {{ vlan_id }}"
          when: ansible_network_os == 'eos'

Note: Use the NAPALM module or the platform-specific `*_config` module, whichever your device supports. Many devices support candidate/commit patterns; use them where available.
Playbook for config backup (backup-configs.yml)

    - name: Backup device configs
      hosts: all
      gather_facts: false
      connection: network_cli
      tasks:
        # Connection details are assumed to come from inventory variables.
        - name: Get running-config
          napalm_get_facts:
            filter:
              - config
          register: config

        - name: Save config to repo-like structure
          copy:
            content: "{{ config.ansible_facts.napalm_config.running }}"
            # now() is used because ansible_date_time needs gathered facts.
            dest: "backups/{{ inventory_hostname }}-{{ now(utc=true, fmt='%Y%m%dT%H%M%SZ') }}.cfg"
          # Write the file on the controller, not on the network device.
          delegate_to: localhost

5 — Inventory: NetBox dynamic inventory plugin
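Use the NetBox inventory plugin from the `netbox.netbox` collection. Example `inventories/netbox.yml`:

    plugin: netbox.netbox.nb_inventory
    api_endpoint: https://netbox.example.com
    token: "{{ lookup('env', 'NETBOX_API_TOKEN') }}"
    validate_certs: true
    # Singular group names (platform_eos, tag_<slug>) assume plurals is off.
    plurals: false
    group_by:
      - site
      - device_role
      - platform
      - tag
    compose:
      # NetBox returns addresses in CIDR form; strip the prefix length.
      ansible_host: primary_ip.address.split('/')[0]
      ansible_network_os: platform.slug
      management_ip: primary_ip.address.split('/')[0]
    query_filters:
      # only include managed devices
      - status: active

Group names and variable shapes depend on the plugin version, so check the result with `ansible-inventory --list` before relying on it.

Secrets: store `NETBOX_API_TOKEN` in GitHub Secrets or the runner environment, never in the repo.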
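Because the playbooks target groups such as `tag_env_prod` and `platform_eos`, it can help to confirm that those groups resolve to the hosts you expect before any run. The sketch below works on JSON shaped like `ansible-inventory --list` output. `hosts_in_group` is a hypothetical helper, not part of Ansible, and the host names in the sample are made up.

```python
import json


def hosts_in_group(inventory: dict, group: str) -> set[str]:
    """Collect the hosts of a group, following child groups recursively."""
    seen_groups: set[str] = set()
    hosts: set[str] = set()

    def walk(name: str) -> None:
        # Skip groups already visited and children that have no entry of their own.
        if name in seen_groups or name not in inventory:
            return
        seen_groups.add(name)
        entry = inventory[name]
        hosts.update(entry.get("hosts", []))
        for child in entry.get("children", []):
            walk(child)

    walk(group)
    return hosts


# Sample shaped like `ansible-inventory --list` output (illustrative data only).
sample = json.loads("""
{
  "_meta": {"hostvars": {}},
  "all": {"children": ["platform_eos", "tag_env_prod", "ungrouped"]},
  "platform_eos": {"hosts": ["leaf1", "leaf2"]},
  "tag_env_prod": {"hosts": ["leaf1", "rtr1"]}
}
""")

# Equivalent of the host pattern "platform_eos:&tag_env_prod".
prod_eos = hosts_in_group(sample, "platform_eos") & hosts_in_group(sample, "tag_env_prod")
print(sorted(prod_eos))
```

In practice the input would come from `ansible-inventory -i inventories/netbox.yml --list`, and an empty result for a targeted group is a signal to stop before the playbook runs against nothing (or against the wrong hosts).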
6 — GitHub Actions: CI & CD workflows
CI workflow: lint + dry-run (.github/workflows/ci.yml)
- Run on PR to `main`.
- Steps: checkout, set up Python, install deps, `ansible-lint`, `yamllint`, run `ansible-playbook --syntax-check`, then run `ansible-playbook` in `--check` (dry-run) mode and capture changed/failed hosts.

    name: Network CI

    on:
      pull_request:
        branches: [ main ]

    jobs:
      lint_and_dryrun:
        runs-on: ubuntu-latest

        steps:
          - uses: actions/checkout@v4

          - name: Set up Python
            uses: actions/setup-python@v4
            with:
              python-version: '3.11'

          - name: Install tools
            run: |
              python -m pip install --upgrade pip
              # napalm-ansible provides the napalm_* modules used by the playbooks
              pip install ansible ansible-lint yamllint pynetbox napalm napalm-ansible

          - name: ansible-lint
            run: ansible-lint -v

          - name: yamllint
            run: yamllint .

          - name: Syntax check
            run: ansible-playbook playbooks/vlan.yml --syntax-check

          - name: Dry-run (check mode)
            env:
              NETBOX_API_TOKEN: ${{ secrets.NETBOX_API_TOKEN }}
            run: |
              ansible-playbook playbooks/vlan.yml --check -i inventories/netbox.yml

Deploy workflow: gated deploy to staging/prod (`.github/workflows/deploy.yml`)
- Trigger: merge to `main` or manual `workflow_dispatch`.
- Environments: `staging` auto deploy, `production` requires approval (GitHub Environments).
- Steps: checkout, build artifacts, set up Ansible, fetch digests, run backup, run playbook (no `--check`), commit config backup artifacts.

    name: Deploy Network Changes

    on:
      push:
        branches: [ main ]
      workflow_dispatch:
        inputs:
          environment:
            description: 'target env'
            required: true
            default: 'staging'
            type: choice
            options:
              - staging
              - production

    jobs:
      deploy:
        runs-on: ubuntu-latest
        # Pushes to main carry no inputs, so they fall back to staging.
        environment: ${{ github.event.inputs.environment || 'staging' }}

        steps:
          - uses: actions/checkout@v4

          - name: Install packages
            run: |
              pip install ansible napalm napalm-ansible pynetbox

          - name: Backup configs (pre-change)
            env:
              NETBOX_API_TOKEN: ${{ secrets.NETBOX_API_TOKEN }}
              ANSIBLE_SSH_USER: ${{ secrets.NE_ANSIBLE_USER }}
              ANSIBLE_SSH_PASS: ${{ secrets.NE_ANSIBLE_PASS }}
            run: |
              ansible-playbook playbooks/backup-configs.yml -i inventories/netbox.yml

          - name: Apply changes (live run)
            env:
              TARGET_ENV: ${{ github.event.inputs.environment || 'staging' }}
              NETBOX_API_TOKEN: ${{ secrets.NETBOX_API_TOKEN }}
              # Assumption: group_vars map these to ansible_user / ansible_password
              # via lookup('env', ...); Ansible does not read these names on its own.
              ANSIBLE_SSH_USER: ${{ secrets.NE_ANSIBLE_USER }}
              ANSIBLE_SSH_PASS: ${{ secrets.NE_ANSIBLE_PASS }}
            run: |
              ansible-playbook playbooks/vlan.yml -i inventories/netbox.yml -e "target_env=${TARGET_ENV}"

Production approval: configure the GitHub Environment `production` with required reviewers.
7 — Testing & validation
- Static tests: `ansible-lint`, `yamllint`, `molecule` (for roles, with a container driver such as docker or podman).
- Dry-run: Always run `ansible-playbook --check` and capture `changed`/`failed`.
- Plan output: For NAPALM-capable devices, load a candidate and use NAPALM's `compare_config` to produce diffs (store diffs as an artifact or attach them to the PR).
- Unit tests: For Jinja templates, test with `jinja2-cli` + sample data.
- Staging gate: Apply to `staging` first, monitor for a fixed soak period (the checklist below uses 15–30 minutes), then promote to prod (manual approval).
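Capturing `changed`/`failed` from a dry run can be done by parsing the `PLAY RECAP` block that `ansible-playbook` prints at the end. The sketch below assumes the default recap layout and uncolored output; `parse_recap` and `hosts_needing_attention` are hypothetical helper names, and the sample text is illustrative.

```python
import re

# Matches recap lines such as "leaf1 : ok=2 changed=1 unreachable=0 failed=0".
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output: str) -> dict[str, dict[str, int]]:
    """Return {host: {counter: value}} from the PLAY RECAP section of a run."""
    results: dict[str, dict[str, int]] = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (pair.split("=") for pair in match["counters"].split())
            results[match["host"]] = {key: int(value) for key, value in pairs}
    return results


def hosts_needing_attention(recap: dict[str, dict[str, int]]) -> tuple[list[str], list[str]]:
    """Split hosts into those that would change and those that failed or were unreachable."""
    changed = sorted(h for h, c in recap.items() if c.get("changed", 0) > 0)
    failed = sorted(
        h for h, c in recap.items()
        if c.get("failed", 0) > 0 or c.get("unreachable", 0) > 0
    )
    return changed, failed


sample_output = """\
PLAY RECAP *********************************************************************
leaf1                      : ok=2    changed=1    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
leaf2                      : ok=1    changed=0    unreachable=1    failed=0    skipped=0    rescued=0    ignored=0
"""

changed_hosts, failed_hosts = hosts_needing_attention(parse_recap(sample_output))
print("would change:", changed_hosts)
print("failed or unreachable:", failed_hosts)
```

A CI step could post the two lists as a PR comment and fail the job when the second list is non-empty.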
8 — Safety & rollback strategies
- Pre-change backup: Always capture running-config and store a snapshot in `backups/` (or S3/Vault).
- Transaction support: Use device features (candidate/commit/rollback) when supported (Junos and IOS-XR candidate configs, NX-OS checkpoint/rollback). Example with NAPALM:
  - fetch `get_config` before the change
  - `load_replace_candidate` -> `compare_config` -> `commit_config` or `discard_config`
- Rollback plan:
  - If failed, run the restore playbook to push the previous config from backup.
  - Keep rollback playbooks and test them in staging.
- Change windows & maintenance mode: enforce a maintenance window for production changes and notify stakeholders.
- Rate limiting & ramp-up: limit parallelism via the Ansible `serial` keyword (e.g., `serial: 5`) for gradual rollout.
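The ramp-up idea can be illustrated outside Ansible. The sketch below splits hosts into fixed-size batches, as a constant `serial` value does, and stops after the first batch that contains a failure so that the untouched hosts stay untouched. Note the assumption: this stop rule is stricter than Ansible's default, which aborts only when every host in a batch fails. `batches`, `rollout`, and the `apply` callback are hypothetical names.

```python
from typing import Callable, Iterator


def batches(hosts: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def rollout(hosts: list[str], size: int, apply: Callable[[str], bool]):
    """Apply a change batch by batch; stop after the first batch with a failure.

    Returns (succeeded, failed, not_attempted).
    """
    hosts = list(hosts)
    succeeded: list[str] = []
    failed: list[str] = []
    attempted = 0
    for batch in batches(hosts, size):
        for host in batch:
            # apply() stands in for the real change and reports success as a bool.
            (succeeded if apply(host) else failed).append(host)
            attempted += 1
        if failed:
            break
    return succeeded, failed, hosts[attempted:]


switches = [f"sw{i}" for i in range(1, 8)]
ok, bad, skipped = rollout(switches, 3, lambda host: host != "sw4")
print("succeeded:", ok)
print("failed:", bad)
print("not attempted:", skipped)
```

The `failed` list is what a restore playbook would target, and `not_attempted` shows how much of the fleet the batch size protected.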
9 — Secrets & security
- Do NOT store credentials in the repo.
- Options:
  - HashiCorp Vault, read through the `community.hashi_vault` lookup plugin.
  - GitHub Secrets for GitHub Actions runners.
  - Secrets manager (AWS Secrets Manager / Azure Key Vault) with role-based access.
- Use per-device credentials when possible (service accounts with least privileges).
- Use SSH certificates or centralized TACACS+/RADIUS for device access.
- Secure APIs: store the NetBox token as a secret and rotate it periodically.
- Audit access: restrict who can merge to `main` and require reviews.
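One small safeguard that fits the options above is to fail fast when a secret is missing, before any device is contacted, and to report only the secret's name. The sketch below assumes the secret names used in the deploy workflow (`NE_ANSIBLE_USER`, `NE_ANSIBLE_PASS`) are exported as environment variables; `load_device_credentials` and `MissingSecretError` are hypothetical names.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


# Ansible connection variable -> environment variable holding its value (assumed names).
REQUIRED = {"ansible_user": "NE_ANSIBLE_USER", "ansible_password": "NE_ANSIBLE_PASS"}


def load_device_credentials(env=None) -> dict[str, str]:
    """Map environment secrets to Ansible connection variables, failing fast if any is unset."""
    env = os.environ if env is None else env
    missing = sorted(name for name in REQUIRED.values() if not env.get(name))
    if missing:
        # Report only the names of the missing secrets, never any values.
        raise MissingSecretError("missing secrets: " + ", ".join(missing))
    return {var: env[name] for var, name in REQUIRED.items()}
```

Passing a plain dictionary as `env` keeps the function easy to exercise in tests without touching the real environment.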
10 — Observability, auditing & compliance
- NetBox: Source-of-truth + change-log history.
- Device config backups: store snapshots (S3 with lifecycle rules), link to PRs.
- Prometheus & Grafana:
  - SNMP exporter for device metrics (CPU, memory, interface counters).
  - Dashboards for interface errors, CPU, temperature, and expected changes.
- Alerting: Alertmanager for thresholds (interface errors > x, down devices).
- Runbooks & Incident docs: store in `docs/runbook.md`; if the tooling runs on Kubernetes, the runbook can also be mounted as a ConfigMap for quick access.
- Audit logs: Git history + NetBox audit + CI logs form the audit chain.
11 — Deployment checklist & runbook (operational)
Before PR
- Update NetBox (create/modify object)
- Update playbook or vars in feature branch
- Add tests (ansible-lint fixes + unit tests)
Pull Request
- CI: ansible-lint, yamllint, syntax-check
- CI: `ansible-playbook --check` dry-run produced; review plan diffs
- Peer review: network engineer signs off
- Merge to `main` triggers `staging` deploy
Staging
- Backup current configs (`backup-configs.yml`)
- Apply change to staging
- Monitor for 15–30 minutes (interface flaps, CPU)
- If OK, request production approval
Production
- Schedule maintenance window (if needed)
- Pre-backup production configs
- Deploy in batches (e.g., `--limit` or `serial: 3`)
- Monitor for issues
- If failure: run restore playbook from backups & open incident
Post-change
- Tag release (e.g., `network-change-2025-10-03`)
- Document in change log & NetBox notes
- Remove temporary credentials and move rollout logs to secure storage
Concrete snippets & helpers
Example: Ansible role task roles/base_config/tasks/main.yml

    - name: Render configuration template
      template:
        src: device_config.j2
        dest: "/tmp/{{ inventory_hostname }}.cfg"
      # Render on the controller; the file is then pushed through NAPALM.
      delegate_to: localhost

    - name: Check connectivity (napalm)
      napalm_get_facts:
      register: facts

    - name: Push configuration (candidate/commit)
      napalm_install_config:
        config_file: "/tmp/{{ inventory_hostname }}.cfg"
        commit_changes: true
        replace_config: false

Example: Using `serial` in a playbook to ramp changes

    - hosts: tag_env_prod
      serial: 3
      tasks:
        - name: Apply VLAN
          import_role:
            name: vlan_provision

Example: Generate plan/diff using NAPALM
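Where the device supports it, run `napalm_install_config` with `commit_changes: false` and `get_diffs: true` so the candidate is loaded, compared against the running config, and then discarded.
Ansible modules and plugins vary by device, so create a task that registers the compare output and upload it as a GitHub artifact.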
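For platforms without a native compare, or for reviewing a rendered template against the latest backup, a plain textual diff is a workable fallback. The sketch below uses Python's `difflib`. It is not NAPALM's `compare_config`, so the output format will differ from what a device reports; `config_diff` and the sample configs are illustrative.

```python
import difflib


def config_diff(before: str, after: str, host: str) -> str:
    """Return a unified diff between two config snapshots, or '' if they match."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{host}/running",
        tofile=f"{host}/candidate",
        lineterm="",
    )
    return "\n".join(lines)


running = "hostname leaf1\nvlan 10\n name mgmt\n"
candidate = running + "vlan 100\n name users\n"
print(config_diff(running, candidate, "leaf1"))
```

Writing the returned string to a file per host gives the CI job something concrete to upload as an artifact or paste into the PR.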
Appendix: Useful commands for operators
- Dry run: `ansible-playbook playbooks/vlan.yml -i inventories/netbox.yml --check`
- Run live (staging): `ansible-playbook playbooks/vlan.yml -i inventories/netbox.yml -l tag_env_staging`
- Backup configs: `ansible-playbook playbooks/backup-configs.yml -i inventories/netbox.yml`
- Restore from backup (example): `ansible-playbook playbooks/restore-config.yml -e backup_file=backups/host-2025-10-03.cfg`
Final notes — best practices & tips
- Start small: automate low-risk tasks first (backups, audits, VLANs) then move to routing changes.
- Enforce reviews: require two reviewers for production changes.
- Document rollback: every change PR must include rollback steps.
- Visibility: post CI dry-run diffs to PR for reviewers to inspect.
- Test fixtures: keep simulated devices (containers, virtual lab) for testing changes (Cisco VIRL, EVE-NG, or vendor simulators).
- Compliance: tag changes with ticket IDs, maintain CSRs in NetBox custom fields.