Hello,
About a year ago Fedora 33 was released and gave us a preview of the fallout from OpenSSH's sha1 + RSA deprecation. Fedora 33 users noticed they could no longer use SSH RSA keys to connect to our Gerrit at review.opendev.org. This happens because Fedora 33's OpenSSH packaging disables sha1 hashes for RSA, and despite both the client and server supporting the rsa-sha2-* variants, they fail to negotiate their use with each other. OpenSSH 8.8, released recently, made the same change in the upstream software, which means users with up to date OpenSSH installations (Arch Linux, for example) are noticing similar problems.
There are a couple of workarounds you can use. Probably the best option is to use an ed25519 or ecdsa key with our Gerrit; modern clients and our Gerrit SSHD negotiate these key types without issue. Less optimal is to manually re-enable use of the ssh-rsa (sha1) signature algorithm, but we recommend against this as your software providers have decided it is no longer secure enough.
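For illustration, the preferred workaround looks something like this (a sketch only; adjust the username and key path for your setup, and remember to add the new public key to your Gerrit account at https://review.opendev.org/settings/ before switching over):

  # Generate a new ed25519 key for use with Gerrit
  ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_gerrit

  # ~/.ssh/config entry pointing Gerrit at the new key
  Host review.opendev.org
      User <your-gerrit-username>
      Port 29418
      IdentityFile ~/.ssh/id_ed25519_gerrit

  # Discouraged alternative: re-enable sha1 RSA signatures for this
  # host only (older OpenSSH releases spell this option
  # PubkeyAcceptedKeyTypes):
  #     PubkeyAcceptedAlgorithms +ssh-rsa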
On our end we've brought this up with the MINA SSHD devs [0] in the hope that the SSH implementation Gerrit uses can be updated to negotiate the sha2 hashes properly. Also, the rsa-sha2 RFC indicates [1] that clients may fall back to a sha2 variant instead of the sha1 variant, which would work around MINA's lack of support for negotiating this in the protocol. If you are an OpenSSH>=8.8 or Fedora>=33 user you might consider filing bugs against your ssh clients to change the default fallback to a sha2 variant on your platform.
[0] https://issues.apache.org/jira/browse/SSHD-1141
[1] https://datatracker.ietf.org/doc/html/rfc8332#section-3.3
Hopefully I've put enough keywords in this email that the various search engines will index it, and the next time someone runs into these problems they'll find this explanation.
Clark
Hello Fellow OpenStack and OpenDev Folks!
TL;DR click on [3] and enjoy.
I am starting this thread to not hijack the discussion happening on [1].
First of all, I would like to thank gibi (Balazs Gibizer) for hacking
a way to get the place to render the table in the first place (pun
intended).
I have been a long-time user of [2].
I have improved and customised it for myself but never really got to
share back the changes I made.
The new Gerrit obviously broke the whole script, so there was no
point sharing it in that state.
However, inspired by gibi's work, I decided to finally sit down and
fix it to work with Gerrit 3 and here it comes: [3].
It works well on Chrome with Tampermonkey; I have not tested
other browsers.
I hope you will enjoy this little helper (I do).
I know the script looks super fugly, but that generally boils down
to a mix of the styles of three people and Gerrit's funky UI
rendering.
Finally, I'd also like to thank hrw (Marcin Juszkiewicz) for linking
me to Michel's original script back in 2019.
[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-November/019051…
[2] https://opendev.org/x/coats/src/commit/444c95738677593dcfed0cfd9667d4c4f0d5…
[3] https://gist.github.com/yoctozepto/7ea1271c299d143388b7c1b1802ee75e
Kind regards,
-yoctozepto
Hi,
To recap: currently production deployment jobs run sequentially.
Zuul starts the job on an executor, which is set up to log into the
bastion host. The job sets up the system-config playbooks on the
bastion host, and Ansible is run from there against the production
server.
To run in parallel, each job must not assume that it owns the
system-config playbooks on the bastion host.
Each Zuul *buildset* can, however, share the same system-config
checkout. To achieve this we need to rework the dependencies: each
production job needs to depend on a common source-setup job. Once
the source is set up on the bastion host, the actual production jobs
can run in parallel, as in the sketch below.
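Roughly, the resulting Zuul configuration would look something like
this (a sketch only; the pipeline and example service job names are
illustrative, the real definitions live in system-config and
base-jobs):

  - project:
      deploy:
        jobs:
          - infra-prod-setup-src
          - infra-prod-service-example:      # hypothetical service job
              dependencies:
                - infra-prod-setup-src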
To the changes...
Firstly, I believe we're doing the setup steps for the executor to log
into bridge twice:
https://review.opendev.org/c/opendev/system-config/+/818190
removes this duplication, and should be safe to merge.
As pointed out in prior reviews, when running in the periodic or
hourly pipelines each job overrides that bastion-host checkout to
master.
https://review.opendev.org/c/opendev/base-jobs/+/818189
moves this step into base-jobs, in preparation for only being done
once by the separate source-setup job. I believe this will be safe to
merge; system-config will just do it again in an idempotent way,
until:
https://review.opendev.org/c/opendev/system-config/+/818191
merges, which drops this step from system-config.
We can then merge the system-config job dependency updates in
https://review.opendev.org/c/opendev/system-config/+/807672
This should mean that all jobs not only rely on the correct base
jobs, but that jobs which need certificates, etc. also depend on the
letsencrypt job, and so on. This should be safe to merge as nothing
actually changes; we just have stricter dependencies.
After this, I think we are ready to refactor the base jobs into two
separate steps -- first set up the keys on the executor to log into
the bastion host, then set up the source to use on the bastion host:
https://review.opendev.org/c/opendev/base-jobs/+/807807
This initial refactor should be safe to merge as it creates two new
jobs, but the existing base job keeps running both steps as-is.
Then we are ready for the penultimate change:
https://review.opendev.org/c/opendev/system-config/+/807808
This updates the system-config jobs to all depend on
"infra-prod-setup-src", which will be the canonical job that sets up
the source repository on bridge.o.o. All other jobs in the buildset
will depend on this job, ensuring a consistent view of the source
for the run.
This should also be safe, as it again doesn't actually change
ordering.
Once all this is in, we need the final change to enable parallel
running (and to think about correct semaphores between the
periodic/hourly and regular runs). That change is yet to be written,
but we have enough to get to that point!
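For reference, a Zuul semaphore is declared and attached to a job
roughly like this (a generic syntax sketch only, with made-up names;
how exactly we scope it between the periodic/hourly and regular runs
is the open question above):

  - semaphore:
      name: infra-prod-playbook              # hypothetical name
      max: 1

  - job:
      name: infra-prod-service-example       # hypothetical job
      semaphore: infra-prod-playbook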
-i
For a little over a year Ian, Clark and I have been using the
multi-factor authentication feature of UbuntuOne SSO (i.e.
Launchpad) in order to more strongly secure the accounts we rely on
for OpenID logins to the Web interfaces of our services like Gerrit
and StoryBoard. It's gone smoothly, I think, and so we're probably
overdue on our plan to offer this capability to other OpenDev users.
Support for this is not enabled by default; your SSO account needs
to be a member of a group which has been granted the feature. We
have one such group authorized for this purpose, which can be found
here:
https://launchpad.net/~opendev-2fa
Please see the information and important caveats documented in the
group description. I expect the process would be something like
using the LP group members page to request membership for your
account, and then one of the group administrators would approve the
request, after which you would be able to proceed with configuration
of your token or other HOTP/TOTP authenticator.
I'm bringing it up here first for discussion, in order to see if
anyone has any concerns or related suggestions, but barring any
I'd like to move forward with a "soft" (quiet) call for wider
testing the first week of December.
--
Jeremy Stanley
We will meet with this agenda on November 30, 2021 at 19:00 UTC in #opendev-meeting:
== Agenda for next meeting ==
* Announcements
** Gerrit User Summit happening December 2&3 virtually.
* Actions from last meeting
* Specs Review
* Topics
** Improving OpenDev's CD throughput (clarkb 20211130)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
*** https://review.opendev.org/c/opendev/system-config/+/807808 Update system-config once per buildset.
*** https://review.opendev.org/c/opendev/base-jobs/+/818297/ Reduce actions needed to be taken in base-jobs.
*** Next up is thinking about semaphores and actually trying to run jobs in parallel. Dependencies should be correct.
** Zuul multi scheduler setup (clarkb 20211130)
*** Possible for zuul updates to break rolling system updates. Not sure if we've got signal on when that happens yet.
** User management on our systems (clarkb 20211130)
*** Ran into problems with matrix-gerritbot. There is an updated image we can pull and use to try again.
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
** UbuntuOne/Launchpad two-factor OpenID authentication availability (fungi 20211130)
*** http://lists.opendev.org/pipermail/service-discuss/2021-November/000298.html
** Adding a lists.openinfra.dev mailman site (fungi 20211130)
*** https://review.opendev.org/818826
** Proxying and caching Ansible Galaxy in our providers (fungi 20211130)
*** https://review.opendev.org/818787
** Gerrit Account cleanups (clarkb 20211130)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
* Open discussion
We will meet with this agenda on November 16, 2021 at 19:00 UTC in #opendev-meeting:
== Agenda for next meeting ==
* Announcements
** Gerrit User Summit happening December 2&3 virtually.
** clarkb out next week. Should we skip the meeting on November 23?
* Actions from last meeting
* Specs Review
* Topics
** Improving OpenDev's CD throughput (clarkb 20211116)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
**** https://review.opendev.org/c/opendev/system-config/+/807672
***** should list dependencies for all jobs
***** zuul doesn't trigger on this? not sure on best approach to make it mergeable
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change splits this work into separate jobs; however, production remains the same as both are still called.
**** https://review.opendev.org/c/opendev/system-config/+/807808
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
** Gerrit Account cleanups (clarkb 20211116)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Zuul multi scheduler setup (clarkb 20211116)
*** Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
*** Did first rolling restart of schedulers over the weekend.
*** Zuul-web should return consistent results now as it talks to ZooKeeper directly.
** User management on our systems (clarkb 20211116)
*** Give gerritbot and matrix-gerritbot a shared user: https://review.opendev.org/c/opendev/system-config/+/816769/
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
** Caching openstack/openstack on our DIB images (clarkb 20211116)
*** There are semi frequent errors when updating the DIB cache for openstack/openstack
*** Seems related to verifying or updating submodule content.
*** Should we simply stop caching this repo entirely? It isn't really used for much.
* Open discussion
Hello, we will meet on November 9, 2021 at 19:00 UTC in #opendev-meeting with this agenda (note the DST change in many parts of the world):
== Agenda for next meeting ==
* Announcements
* Actions from last meeting
* Specs Review
* Topics
** Improving OpenDev's CD throughput (clarkb 20211109)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
**** https://review.opendev.org/c/opendev/system-config/+/807672
***** should list dependencies for all jobs
***** zuul doesn't trigger on this? not sure on best approach to make it mergeable
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change splits this work into separate jobs; however, production remains the same as both are still called.
**** https://review.opendev.org/c/opendev/system-config/+/807808
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
** Gerrit Account cleanups (clarkb 20211109)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Zuul multi scheduler setup (clarkb 20211109)
*** Zuul is currently running with two schedulers (zuul01.o.o and zuul02.o.o with zuul02.o.o being "primary")
*** On Monday we tracked down a number of bugs, with corvus fixing many of them.
*** Overall seems stable enough.
*** Note the "flapping" status page can be weird.
** User management on our systems (clarkb 20211109)
*** Be explicit about uid/gid ranges: https://review.opendev.org/c/opendev/system-config/+/816869/
**** 0-999: system, 1000-1999: unallocated, 2000-2999: infra-root users, 3000-9999: host-level users, 10k-64k: container users that also need uids on the host for bind mounts.
*** Clean up unused bootstrapping users: https://review.opendev.org/c/opendev/system-config/+/816771
*** Give gerritbot and matrix-gerritbot a shared user: https://review.opendev.org/c/opendev/system-config/+/816769/
*** Eventually convert mariadb containers from uid 999 to something that makes more sense on the system.
* Open discussion
We will meet on November 2, 2021 at 19:00 UTC in #opendev-meeting with this agenda:
== Agenda for next meeting ==
* Announcements
** Gerrit User Summit details arriving soon. I've been told they would be interested to hear from us on how we do automated testing and management of Gerrit.
* Actions from last meeting
* Specs Review
** Mailman 3 spec https://review.opendev.org/810990
* Topics
** Improving OpenDev's CD throughput (clarkb 20211102)
*** We can run many of our jobs in parallel in all of our CD pipelines. But this requires we properly document/address dependencies
**** Need to understand our job dependencies and properly note them in Zuul config or address them by combining jobs.
***** Example 1: Combine service-gitea-lb and service-gitea jobs.
***** Example 2: Combine letsencrypt and nameserver jobs
***** Example 3: Have all jobs with webserver config express a dependency on the letsencrypt job
**** Suggest we document the known job dependencies in a human readable format, then encode this into zuul, then we can switch to parallel runs.
**** https://review.opendev.org/c/opendev/system-config/+/807672
***** should list dependencies for all jobs
***** zuul doesn't trigger on this? not sure on best approach to make it mergeable
**** https://review.opendev.org/c/opendev/base-jobs/+/807807
***** currently every executor adds keys for bridge, then logs in and clones system-config before running playbooks
***** this change splits this work into separate jobs; however, production remains the same as both are still called.
**** https://review.opendev.org/c/opendev/system-config/+/807808
***** this is a follow-on that adds a base job to clone system-config, and stops the other production jobs re-cloning.
***** this job must run first, but then all other jobs can run in parallel, as they are all in the same buildset and using the same "view" of system-config for that particular run
** Gerrit Account cleanups (clarkb 20211102)
*** 33 conflicts remain. Clarkb has written notes on proposed plans for each user in the comments of review02:~clarkb/gerrit_user_cleanups/audit-results-annotated.yaml
** Fedora 34 test node booting problems (clarkb 20211102)
*** Changes to Fedora's kernel packaging broke Xen
*** Not yet sure if that may have also somehow broken OVH and iweb.
** Zuul multi scheduler setup (clarkb 20211102)
*** Zuul ran with two schedulers for the first time over the last weekend.
*** The first jobs started by one scheduler and completed by another have now run.
*** Had to revert due to bugs in caching.
*** Expect Zuul restarts as the scale out scheduler work in Zuul progresses.
** FIPS testing in our CI system (clarkb 20211102)
*** There is interest in testing various pieces of software against FIPS enabled systems in our Zuul.
*** We are not building special FIPS images; instead, a Zuul role exists to update supported platforms and boot them into FIPS mode.
*** Some tests have problems with ephemeral state being lost if the FIPS role (which reboots the node) runs too late, causing tests to fail for unexpected reasons. Worth checking the reboot ordering relative to other test setup if there are problems.
* Open discussion