Parallel production jobs changes

Ian Wienand iwienand at redhat.com
Wed Nov 17 05:04:33 UTC 2021


Hi,

To recap: currently, production deployment jobs run sequentially.  Zuul
starts the job on an executor, which is set up to log in to the bastion
host.  The job sets up the system-config playbooks on the bastion host,
and Ansible is run from there against the production server.

To run in parallel, each job needs to not assume it owns the
system-config playbooks on the bastion host.

Each Zuul *buildset* can use the same system-config playbook checkout,
though.  To achieve this we need to rework the dependencies: each
production job needs to depend on a common source-setup job.  Once the
source is set up on the bastion host, the actual production jobs can
run in parallel.

To the changes...

Firstly, I believe we're doing the setup steps for the executor to log
into bridge twice:

 https://review.opendev.org/c/opendev/system-config/+/818190

removes this duplication, and should be safe to merge.

As pointed out in prior reviews, when running in the periodic or hourly
pipelines, each job overrides the bastion host checkout to master.

 https://review.opendev.org/c/opendev/base-jobs/+/818189

moves this step into base-jobs, in preparation for only being done
once by the separate source-setup job.  I believe this will be safe to
merge; system-config will just do it again in an idempotent way,
until:

 https://review.opendev.org/c/opendev/system-config/+/818191

merges, which drops this step from system-config.

We can then merge the system-config job dependency updates in

 https://review.opendev.org/c/opendev/system-config/+/807672

This should mean that all jobs not only rely on the correct base jobs,
but also that jobs which need certificates and the like depend on the
letsencrypt job, and so on.  This should be safe to merge, as nothing
should actually change; we just have stricter dependencies.

After this, I think we are ready to refactor the base jobs into the
two separate steps -- first set up the keys on the executor to log
in to the bastion host, then set up the source to use on the bastion
host:

  https://review.opendev.org/c/opendev/base-jobs/+/807807

This initial refactor should be safe to merge as it creates two new
jobs, but the existing base job keeps running both steps as-is.

Then we are ready for the penultimate change:

  https://review.opendev.org/c/opendev/system-config/+/807808

This updates the system-config jobs to all depend on
"infra-prod-setup-src" which will be the canonical job that sets up
the source repository on bridge.o.o.  All other jobs in the buildset
will depend on this job, ensuring consistency for a run.

This should also be safe, as it again doesn't actually change
ordering.
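
As a rough illustration of the dependency shape described above (not
the actual Zuul configuration), the sketch below models a buildset as
a graph and groups jobs into "waves" that could run in parallel once
their dependencies are done.  Only "infra-prod-setup-src" is a job
name from this mail; the other job names and the parallel_waves()
helper are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical buildset: job name -> set of jobs it depends on.
# Only "infra-prod-setup-src" is named in the mail; the rest are
# made-up names for illustration.
deps = {
    "infra-prod-setup-src": set(),
    "infra-prod-letsencrypt": {"infra-prod-setup-src"},
    "infra-prod-service-a": {"infra-prod-setup-src"},
    "infra-prod-service-b": {"infra-prod-setup-src",
                             "infra-prod-letsencrypt"},
}


def parallel_waves(deps):
    """Group jobs into waves whose dependencies are all satisfied."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything returned here has no unfinished dependencies,
        # so in principle these jobs could run at the same time.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = parallel_waves(deps)
for number, wave in enumerate(waves):
    print(number, wave)
```

The source-setup job is alone in the first wave, and everything that
depends only on it becomes runnable together in the next one.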

Once all this is in, we need the final change to enable parallel
running (and think about correct semaphores between periodic/hourly
and regular runs).  That is yet to be written, but we have enough to
get to that point!
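
To make the semaphore concern concrete, here is a toy sketch, assuming
we want at most one buildset using the bastion host checkout at a
time while jobs inside a buildset run in parallel.  It uses plain
Python threads, not Zuul's semaphore mechanism, and the pipeline and
job names are hypothetical.

```python
import threading

# Assumption for illustration: a single slot guards the bastion host
# checkout, so buildsets from different pipelines never overlap.
bastion = threading.Semaphore(1)

events = []
events_lock = threading.Lock()


def record(event):
    """Append an event to the shared log in a thread-safe way."""
    with events_lock:
        events.append(event)


def run_buildset(pipeline, jobs):
    """Hold the semaphore for the whole buildset; run its jobs in parallel."""
    with bastion:
        record((pipeline, "start"))
        workers = [
            threading.Thread(target=record, args=((pipeline, job),))
            for job in jobs
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        record((pipeline, "end"))


# Two hypothetical buildsets arriving at about the same time.
buildsets = [
    threading.Thread(target=run_buildset, args=(name, ["job-a", "job-b"]))
    for name in ("hourly", "deploy")
]
for buildset in buildsets:
    buildset.start()
for buildset in buildsets:
    buildset.join()

for event in events:
    print(event)
```

Whichever buildset gets the semaphore first finishes all of its jobs
before the other starts, so the two never interleave on the shared
checkout.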

-i




More information about the service-discuss mailing list