From cboylan at sapwetik.org Mon Nov 1 22:43:17 2021
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 01 Nov 2021 15:43:17 -0700
Subject: Infra Team Meeting Agenda for November 2, 2021
Message-ID: <5f9aacb6-ba5b-4871-8280-e319975ed792@www.fastmail.com>

We will meet on November 2, 2021 at 19:00 UTC in #opendev-meeting with this agenda:

== Agenda for next meeting ==

* Announcements
** Gerrit User Summit details arriving soon. I've been told they would be interested to hear from us on how we do automated testing and management of Gerrit.
* Actions from last meeting
* Specs Review
** Mailman 3 spec https://review.opendev.org/810990
* Topics
** Improving OpenDev's CD throughput (clarkb 20211102)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
**** https://review.opendev.org/c/opendev/system-config/+/807672
***** should list dependencies for all jobs
***** zuul doesn't trigger on this? not sure on best approach to make it mergeable
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change creates split jobs to do this; however, production remains the same as both are called.
**** https://review.opendev.org/c/opendev/system-config/+/807808
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
** Gerrit Account cleanups (clarkb 20211102)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Fedora 34 test node booting problems (clarkb 20211102)
*** Changes to Fedora's kernel packaging broke Xen
*** Not yet sure if that may have also somehow broken OVH and iweb.
** Zuul multi scheduler setup (clarkb 20211102)
*** Zuul ran with two schedulers for the first time over the last weekend.
*** First jobs started by one scheduler and completed by another ran.
*** Had to revert due to bugs in caching.
*** Expect Zuul restarts as the scale out scheduler work in Zuul progresses.
** FIPS testing in our CI system (clarkb 20211102)
*** There is interest in testing various pieces of software against FIPS enabled systems in our Zuul.
*** We are not building special FIPS images; instead, a Zuul role exists to update supported platforms and boot them into FIPS mode.
*** Some tests lose ephemeral state if the FIPS role runs too late, and then fail for unexpected reasons. Worth checking on reboot ordering relative to other test setup if there are problems.
* Open discussion

From cboylan at sapwetik.org Tue Nov 9 00:13:53 2021
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 08 Nov 2021 16:13:53 -0800
Subject: Team Meeting Agenda for November 9, 2021
Message-ID: <3f64fa8d-a915-46be-9aa4-9bcd30c0dd0c@www.fastmail.com>

Hello, we will meet on November 9, 2021 at 19:00 UTC in #opendev-meeting with this agenda (note the DST change in many parts of the world):

== Agenda for next meeting ==

* Announcements
* Actions from last meeting
* Specs Review
* Topics
** Improving OpenDev's CD throughput (clarkb 20211109)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
**** https://review.opendev.org/c/opendev/system-config/+/807672
***** should list dependencies for all jobs
***** zuul doesn't trigger on this? not sure on best approach to make it mergeable
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change creates split jobs to do this; however, production remains the same as both are called.
**** https://review.opendev.org/c/opendev/system-config/+/807808
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
** Gerrit Account cleanups (clarkb 20211109)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Zuul multi scheduler setup (clarkb 20211109)
*** Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
*** We tracked down a number of bugs on Monday, with corvus fixing many of them.
*** Overall seems stable enough.
*** Note the "flapping" status page can be weird.
** User management on our systems (clarkb 20211109)
*** Be explicit about uid/gid ranges: https://review.opendev.org/c/opendev/system-config/+/816869/
**** 0-999 system, 1000-1999 unallocated, 2000-2999 for infra-root users, 3000-9999 host level users, 10k - 64k container users that need uids on the host as well for bind mounts.
*** Clean up unused bootstrapping users: https://review.opendev.org/c/opendev/system-config/+/816771
*** Give gerritbot and matrix-gerritbot a shared user: https://review.opendev.org/c/opendev/system-config/+/816769/
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
* Open discussion

From cboylan at sapwetik.org Tue Nov 16 00:13:53 2021
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 15 Nov 2021 16:13:53 -0800
Subject: Infra Team Meeting Agenda for November 16, 2021
Message-ID:

We will meet with this agenda on November 16, 2021 at 19:00 UTC in #opendev-meeting:

== Agenda for next meeting ==

* Announcements
** Gerrit User Summit happening December 2&3 virtually.
** clarkb out next week. Should we skip the meeting on November 23?
* Actions from last meeting
* Specs Review
* Topics
** Improving OpenDev's CD throughput (clarkb 20211116)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
**** https://review.opendev.org/c/opendev/system-config/+/807672
***** should list dependencies for all jobs
***** zuul doesn't trigger on this? not sure on best approach to make it mergeable
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change creates split jobs to do this; however, production remains the same as both are called.
**** https://review.opendev.org/c/opendev/system-config/+/807808
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
** Gerrit Account cleanups (clarkb 20211116)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Zuul multi scheduler setup (clarkb 20211116)
*** Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
*** Did first rolling restart of schedulers over the weekend.
*** Zuul-web should return consistent results now as it talks to ZooKeeper directly.
** User management on our systems (clarkb 20211116)
*** Give gerritbot and matrix-gerritbot a shared user: https://review.opendev.org/c/opendev/system-config/+/816769/
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
** Caching openstack/openstack on our DIB images (clarkb 20211116)
*** There are semi frequent errors when updating the DIB cache for openstack/openstack
*** Seems related to verifying or updating submodule content.
*** Should we simply stop caching this repo entirely? It isn't really used for much.
* Open discussion

From iwienand at redhat.com Wed Nov 17 05:04:33 2021
From: iwienand at redhat.com (Ian Wienand)
Date: Wed, 17 Nov 2021 16:04:33 +1100
Subject: Parallel production jobs changes
Message-ID:

Hi,

To recap: currently production deployment jobs run sequentially. Zuul starts the job on an executor, which is set up to log into the bastion host. The job sets up the system-config playbooks on the bastion host and Ansible is run from there against the production server.

To run in parallel, each job needs to not assume it owns the system-config playbooks on the bastion host. Each Zuul *buildset* can use the same system-config playbook checkout though. To achieve this we need to rework the dependencies; each production job needs to depend on a common source-setup job. Once the source is set up on the bastion host, the actual production jobs can run in parallel.

To the changes...

Firstly, I believe we're doing the setup steps for the executor to log into bridge twice: https://review.opendev.org/c/opendev/system-config/+/818190 removes this duplication, and should be safe to merge.

As pointed out in prior reviews, when running in the periodic or hourly pipelines each job overrides that bastion host checkout to master.
https://review.opendev.org/c/opendev/base-jobs/+/818189 moves this step into base-jobs, in preparation for only being done once by the separate source-setup job. I believe this will be safe to merge; system-config will just do it again in an idempotent way, until https://review.opendev.org/c/opendev/system-config/+/818191 merges, which drops this step from system-config.

We can then merge the system-config job dependency updates in https://review.opendev.org/c/opendev/system-config/+/807672

This should mean that all jobs not only rely on the correct base jobs, but jobs that need certificates, etc. will be relying on the letsencrypt job, etc. This should be safe to merge as nothing should actually change; we just have stricter dependencies.

After this, I think we are ready to refactor the base jobs into the two separate steps -- first set up the keys on the executor to log into the bastion host, then set up the source to use on the bastion host: https://review.opendev.org/c/opendev/base-jobs/+/807807

This initial refactor should be safe to merge as it creates two new jobs, but the existing base job keeps running both steps as-is.

Then we are ready for the penultimate change: https://review.opendev.org/c/opendev/system-config/+/807808

This updates the system-config jobs to all depend on "infra-prod-setup-src" which will be the canonical job that sets up the source repository on bridge.o.o. All other jobs in the buildset will depend on this job, ensuring consistency for a run. This should also be safe, as it again doesn't actually change ordering.

Once all this is in, we need the final change to enable parallel running (and think about correct semaphores between periodic/hourly and regular runs). That is yet to be written, but we have enough to get to that point!
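
As a rough illustration of the dependency rework described above, the following Python sketch groups jobs into "waves" that could run concurrently once their dependencies have finished. It is not how Zuul schedules jobs internally. Only "infra-prod-setup-src" is named in this message; the other job names and the dependency map are hypothetical stand-ins loosely based on the agenda examples (jobs needing certificates depend on a letsencrypt job).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: job -> set of jobs it must wait for.
# Only "infra-prod-setup-src" comes from the email; the rest are illustrative.
JOB_DEPENDENCIES = {
    "infra-prod-setup-src": set(),
    "infra-prod-letsencrypt": {"infra-prod-setup-src"},
    "infra-prod-service-gitea": {"infra-prod-setup-src", "infra-prod-letsencrypt"},
    "infra-prod-service-zuul": {"infra-prod-setup-src", "infra-prod-letsencrypt"},
    "infra-prod-service-bridge": {"infra-prod-setup-src"},
}


def parallel_waves(dependencies):
    """Group jobs into waves; every job in a wave can run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        # All jobs whose dependencies have completed are ready together.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(JOB_DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

With this made-up map the source-setup job forms the first wave on its own, and everything else fans out behind it, which is the shape the changes above are aiming for.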
-i

From fungi at yuggoth.org Mon Nov 22 20:15:05 2021
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Mon, 22 Nov 2021 20:15:05 +0000
Subject: UbuntuOne/Launchpad two-factor authentication
Message-ID: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org>

For a little over a year Ian, Clark and I have been using the multi-factor authentication feature of UbuntuOne SSO (i.e. Launchpad) in order to more strongly secure the accounts we rely on for OpenID logins to the Web interfaces of our services like Gerrit and StoryBoard. It's gone smoothly, I think, and so we're probably overdue on our plan to offer this capability to other OpenDev users.

Support for this is not enabled by default, your SSO account needs to be a member of a group which is granted the feature. We have one such group authorized for this purpose, which can be found here:

https://launchpad.net/~opendev-2fa

Please see the information and important caveats documented in the group description. I expect the process would be something like using the LP group members page to request membership for your account, and then one of the group administrators would approve the request, after which you would be able to proceed with configuration of your token or other HOTP/TOTP authenticator.

I'm bringing it up here first for discussion, in order to see if anyone has any concerns or related suggestions, but barring none I'd like to move forward with a "soft" (quiet) call for wider testing the first week of December.
--
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL:

From mnaser at vexxhost.com Mon Nov 22 20:26:10 2021
From: mnaser at vexxhost.com (Mohammed Naser)
Date: Mon, 22 Nov 2021 15:26:10 -0500
Subject: UbuntuOne/Launchpad two-factor authentication
In-Reply-To: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org>
References: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org>
Message-ID:

On Mon, Nov 22, 2021 at 3:15 PM Jeremy Stanley wrote:
>
> For a little over a year Ian, Clark and I have been using the
> multi-factor authentication feature of UbuntuOne SSO (i.e.
> Launchpad) in order to more strongly secure the accounts we rely on
> for OpenID logins to the Web interfaces of our services like Gerrit
> and StoryBoard. It's gone smoothly, I think, and so we're probably
> overdue on our plan to offer this capability to other OpenDev users.
>
> Support for this is not enabled by default, your SSO account needs
> to be a member of a group which is granted the feature. We have one
> such group authorized for this purpose, which can be found here:
>
> https://launchpad.net/~opendev-2fa
>
> Please see the information and important caveats documented in the
> group description. I expect the process would be something like
> using the LP group members page to request membership for your
> account, and then one of the group administrators would approve the
> request, after which you would be able to proceed with configuration
> of your token or other HOTP/TOTP authenticator.

For context, I've been doing this for many years now and it's been working very well for me.

> I'm bringing it up here first for discussion, in order to see if
> anyone has any concerns or related suggestions, but barring none
> I'd like to move forward with a "soft" (quiet) call for wider
> testing the first week of December.
> --
> Jeremy Stanley

--
Mohammed Naser
VEXXHOST, Inc.

From fungi at yuggoth.org Mon Nov 22 21:02:05 2021
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Mon, 22 Nov 2021 21:02:05 +0000
Subject: UbuntuOne/Launchpad two-factor authentication
In-Reply-To:
References: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org>
Message-ID: <20211122210205.hkbyk7iqwfpoxdsx@yuggoth.org>

On 2021-11-22 15:26:10 -0500 (-0500), Mohammed Naser wrote:
[...]
> For context, I've been doing this for many years now and it's been
> working very well for me.
[...]

Thanks! I should have mentioned, it's probable at least some other users are doing this already, since it's been possible to request access directly from the LP admins. Having a group we can use to help our users opt into the feature offloads some of the burden from the LP volunteer maintainers and may also mean faster turn-around time on such requests.

Another thing that would be good to know is what authenticators people are having luck using. I'm personally doing TOTP with a Librem Key (Purism branded NitroKey derivative with some custom features) and accessing it with the nitrocli utility, though I had to compile my own from its Rust sources since the version in Debian is too old to recognize my device.
--
Jeremy Stanley

From marcin.juszkiewicz at linaro.org Tue Nov 23 09:40:48 2021
From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz)
Date: Tue, 23 Nov 2021 10:40:48 +0100
Subject: UbuntuOne/Launchpad two-factor authentication
In-Reply-To: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org>
References: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org>
Message-ID: <97e0aa43-ab05-7325-2bd8-745910213226@linaro.org>

On 22.11.2021 at 21:15, Jeremy Stanley wrote:
> For a little over a year Ian, Clark and I have been using the
> multi-factor authentication feature of UbuntuOne SSO (i.e.
> Launchpad) in order to more strongly secure the accounts we rely on
> for OpenID logins to the Web interfaces of our services like Gerrit
> and StoryBoard.

That brings back memories...

In 2011 Canonical worked on adding multi-factor auth to Launchpad. I was a member of the beta testers group then and got my first Yubikey.

In May 2012 there was an Ubuntu Developer Summit in Oakland, USA. Each attendee got a Yubikey to be able to use that functionality.

In 2013 I left Canonical and most of the groups on LP, which resulted in multi-factor being disabled.

Will apply for new membership.

From fungi at yuggoth.org Tue Nov 23 17:57:57 2021
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Tue, 23 Nov 2021 17:57:57 +0000
Subject: UbuntuOne/Launchpad two-factor authentication
In-Reply-To: <97e0aa43-ab05-7325-2bd8-745910213226@linaro.org>
References: <20211122201504.xqwvpt6t575wsjq2@yuggoth.org> <97e0aa43-ab05-7325-2bd8-745910213226@linaro.org>
Message-ID: <20211123175757.gehcqgaw6rfgsf3v@yuggoth.org>

On 2021-11-23 10:40:48 +0100 (+0100), Marcin Juszkiewicz wrote:
[...]
> Will apply for new membership.

Thanks! To be clear though, per my original message, I'm not planning to approve any new member requests for that group until at least next week (let's say December 1), in order to give others an opportunity to object to my proposal here on the ML or in next week's IRC meeting, as this week's meeting was officially cancelled.
--
Jeremy Stanley

From cboylan at sapwetik.org Mon Nov 29 21:52:16 2021
From: cboylan at sapwetik.org (Clark Boylan)
Date: Mon, 29 Nov 2021 13:52:16 -0800
Subject: Team Meeting Agenda for November 30, 2021
Message-ID:

We will meet with this agenda on November 30, 2021 at 19:00 UTC in #opendev-meeting:

== Agenda for next meeting ==

* Announcements
** Gerrit User Summit happening December 2&3 virtually.
* Actions from last meeting
* Specs Review
* Topics
** Improving OpenDev's CD throughput (clarkb 20211130)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
*** https://review.opendev.org/c/opendev/system-config/+/807808 Update system-config once per buildset.
*** https://review.opendev.org/c/opendev/base-jobs/+/818297/ Reduce actions needed to be taken in base-jobs.
*** Next up is thinking about semaphores and actually trying to run jobs in parallel. Dependencies should be correct.
** Zuul multi scheduler setup (clarkb 20211130)
*** Possible for zuul updates to break rolling system updates. Not sure if we've got signal on when that happens yet.
** User management on our systems (clarkb 20211130)
*** Ran into problems with matrix-gerritbot. There is an updated image we can pull and use to try again.
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
** UbuntuOne/Launchpad two-factor OpenID authentication availability (fungi 20211130)
*** http://lists.opendev.org/pipermail/service-discuss/2021-November/000298.html
** Adding a lists.openinfra.dev mailman site (fungi 20211130)
*** https://review.opendev.org/818826
** Proxying and caching Ansible Galaxy in our providers (fungi 20211130)
*** https://review.opendev.org/818787
** Gerrit Account cleanups (clarkb 20211130)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
* Open discussion
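
For the "User management on our systems" topic, the uid/gid ranges proposed in the November 9 agenda can be written down as a small lookup table. This is only a sketch for discussion, not the contents of the referenced system-config change. The upper bound of the container range is an assumption: the agenda says "10k - 64k", which is read here as 10000-64000. The function name is hypothetical.

```python
# Ranges as listed in the November 9 agenda (inclusive bounds).
# Assumption: "10k - 64k" is interpreted as 10000-64000.
UID_RANGES = [
    (0, 999, "system"),
    (1000, 1999, "unallocated"),
    (2000, 2999, "infra-root users"),
    (3000, 9999, "host level users"),
    (10_000, 64_000, "container users"),
]


def classify_uid(uid):
    """Return the label of the range containing uid, or None if outside all ranges."""
    for low, high, label in UID_RANGES:
        if low <= uid <= high:
            return label
    return None


if __name__ == "__main__":
    for uid in (999, 1500, 2001, 3000, 10_001):
        print(uid, classify_uid(uid))
```

Under this reading, uid 999 (currently used by the mariadb containers) falls in the system range rather than the container range, which matches the agenda's note that it should eventually move.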