June 2020 PTG Recap
cboylan at sapwetik.org
Mon Jun 8 21:13:14 UTC 2020
Last week we both helped host the PTG (with the meetpad service) and attended the event as participants. Now that the event is over I'll try to recap things for us.
On the meetpad side of things we scaled up the Jitsi Video Bridge service from a single instance to five. Based on Cacti data I think this scale up worked well. Unfortunately, it seems that Jitsi Meet is limited by client-side capabilities in addition to the server itself. In particular, having many people in a room with video enabled degrades the performance of that conference in clients. Using Chrome/Chromium, setting your local settings to low bandwidth mode, and simply reducing the number of people using video all seem to help. Overall, I think the service did well considering how new it is, and the teams that used it were happy. In general those seemed to be groups with 20 or fewer participants.
Since then we've scaled the Jitsi Video Bridge service down to two instances. This should give us reasonable capacity for non-PTG periods of time, while consuming a reasonable set of resources. We may also want to look into running the Jibri service which would enable us to stream a conference. This way only active participants would need to fit into that ~20 person limit and lurkers could watch via a live stream.
Our time at the PTG as participants also went well. We had three 2 hour blocks across a range of time zones which enabled us to talk about a variety of subjects. I'll try to call out notable items in this email, but you can find all of the notes we kept at https://etherpad.opendev.org/p/opendev-virtual-ptg-june-2020
We are starting to dig into what is required to upgrade Gerrit now that it is running in containers. We expect we'll upgrade from 2.13 to 2.16 and then sit there for a while. One reason for this is that 2.16 is the last version with the existing UI; 3.0 and beyond drop it in favor of the PolyGerrit UI. Since 2.16 has both UIs this gives us the chance to have people transition over to the new UI. Once we are ready to go to 3.0 that should be a trivial upgrade as the only change was removal of the existing UI.
We do still need to figure out if one large single outage is our best option or if we should upgrade to 2.14, perform online migrations, then upgrade to 2.15, perform online migrations, and finally repeat with an online upgrade to 2.16. To determine which option is best for us we plan to spin up a copy of the production server and try these different options to see how reliable they are and how much time is required to perform each one.
To help move more of our configuration management from Puppet to Ansible, Monty plans to set up Ansible driven testing of our existing puppetry with testinfra. Using that, we can then replace the Puppet with Ansible and be confident things continue to work as long as the same testinfra tests still pass. Hopefully, this makes it easier for more people to get involved in the configuration management updates.
Another place where we have "legacy" code that needs updating is porting any existing python2 utilities to python3. I'll try to get a list of those put up so that people can volunteer to do the conversions. The conversions themselves should be straightforward but not everyone knows what needs porting. We hope the list addresses this.
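As an illustration of the kind of changes these conversions usually involve, here is a small sketch of a made-up utility function written for python3, with comments noting the python2 idiom each line replaces. The function and its inputs are hypothetical and do not correspond to any specific tool of ours.

```python
# Python 3 version of a hypothetical helper. Comments show the
# Python 2 idiom that each line replaces.
def summarize(counts, raw):
    # py2: raw would have been a str; in py3 data read in binary mode
    # is bytes and has to be decoded explicitly.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    lines = []
    # py2: for name, n in counts.iteritems():
    for name, n in sorted(counts.items()):
        # py2: n / 2 was integer division for ints; // keeps that behavior.
        lines.append("%s: %d (half: %d)" % (name, n, n // 2))
    # py2: print text  (print was a statement, now it is a function)
    print(text)
    return lines
```

Most of the work tends to be in the bytes versus text handling; the rest is largely mechanical.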
Using the data collected in the previous two items we may identify services and tools that should simply be retired rather than ported. One example could be pbx.openstack.org in favor of meetpad.opendev.org. Once we've got lists of work to be done we can evaluate if any of them need to be deleted and can make individual arguments for each one.
Over time we've become more and more aware that having a diverse set of identity options for login would make OpenDev more appealing to a broader set of users. One of the big speed bumps to making that change has been enacting a transition that doesn't break existing users. In particular the existing identity service in use, Ubuntu One, requires OpenID v1 and none of the modern identity brokers support this protocol. Jeremy has started to put a spec together on how we might do this, https://review.opendev.org/#/c/731838/, and during the PTG Kristi Nikolla made an excellent suggestion for using SimpleSAMLphp to front OpenID v1 for identity brokers that don't speak OpenID v1. With that we now have a plan to move forward on making this a reality.
The topic of statistics, metrics, and logging came up again. We asked a few difficult questions that likely need further discussion in separate threads:
* How do we make it easier for people to build grafana dashboards?
* Should we replace Cacti with something like collectd + graphite/grafana?
* Should we consider shutting down the ELK stack since the main user (OpenStack) of this service is not really using it much anymore?
The future of Bup for backups was discussed briefly. Bup is still very python2 specific though I think they've recently started to work on a python3 port. I mentioned that I use borg backup for home backups happily, and like Bup, borg claims to support append only backups. Whether or not we want to switch to another tool and what that tool should be if so will require more investigation. Input is welcome.
Finally, we discussed generating a bit more involvement in OpenDev from our users. On the Advisory Board side of things I'll be sending out some gentle reminders in the near future to see if we can get anyone else to sign up. Then in a couple of weeks we can move forward with those who have volunteered already. The thought is that starting small and leading by example will drive involvement. We also discussed advertising tools, like the docker image build tooling that OpenDev uses, towards audiences that might be inclined to use them, like Airship.
Involvement isn't limited to the Advisory Board or Zuul Jobs. If any of the topics you've just read sound interesting to you and you'd like to get involved please reach out. We're more than happy to incorporate others in what we are doing.
Thank you for reading this very long email,