Infra Team Meeting Agenda for November 16, 2021

We will meet with this agenda on November 16, 2021 at 19:00 UTC in #opendev-meeting:

== Agenda for next meeting ==

* Announcements
** Gerrit User Summit is happening virtually on December 2 and 3.
** clarkb out next week. Should we skip the meeting on November 23?

* Actions from last meeting

* Specs Review

* Topics
** Improving OpenDev's CD throughput (clarkb 20211116)
*** We can run many of our jobs in parallel in all of our CD pipelines, but this requires that we properly document and address job dependencies.
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human-readable format, then encode this into Zuul, and then switch to parallel runs.
***** should list dependencies for all jobs
***** Zuul doesn't trigger on this? Not sure of the best approach to make it mergeable.
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change creates separate jobs to do this. However, production remains the same, as both jobs are still called.
***** this is a follow-on that adds a base job to clone system-config and stops the other production jobs from re-cloning it.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and use the same "view" of system-config for that particular run.
** Gerrit Account cleanups (clarkb 20211116)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Zuul multi scheduler setup (clarkb 20211116)
*** Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
*** Did first rolling restart of schedulers over the weekend.
*** Zuul-web should return consistent results now, as it talks to ZooKeeper directly.
** User management on our systems (clarkb 20211116)
*** Give gerritbot and matrix-gerritbot a shared user:
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
** Caching openstack/openstack on our DIB images (clarkb 20211116)
*** There are semi-frequent errors when updating the DIB cache for openstack/openstack.
*** Seems related to verifying or updating submodule content.
*** Should we simply stop caching this repo entirely? It isn't really used for much.
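
For the CD throughput topic above, one possible way to check a human-readable dependency list before encoding it into Zuul is to group the jobs into "waves" that could run in parallel. The following is only a rough sketch: the job names come from the agenda, but "bootstrap-bridge" is a hypothetical name for the base job that clones system-config, and every dependency edge shown is an illustrative assumption rather than the result of an actual audit.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Map each job to the set of jobs it depends on.
# "bootstrap-bridge" is a hypothetical name for the base job that clones
# system-config once per buildset. All edges below are assumptions made
# for illustration, not the real dependency list.
JOB_DEPENDENCIES = {
    "bootstrap-bridge": set(),
    "letsencrypt": {"bootstrap-bridge"},
    "nameserver": {"bootstrap-bridge"},
    "service-gitea-lb": {"bootstrap-bridge"},
    # Assumed here to carry webserver config, so it waits for certificates.
    "service-gitea": {"letsencrypt"},
}


def parallel_waves(dependencies):
    """Group jobs into waves; every job in a wave has all dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs that can start now
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


waves = parallel_waves(JOB_DEPENDENCIES)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

With these assumed edges the base job forms the first wave on its own, which matches the note that it must run first. A cycle in the documented dependencies would surface as an error here, before anything is written into the Zuul config.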

* Open discussion

