On Mon, Apr 03, 2023 at 10:29:39AM -0700, Clark Boylan wrote:
Rebuilding on some platforms is important, as .so files can move due to library updates and incompatibilities. I have this problem with libre2 on Tumbleweed. That said, for the CI system I don't think any of the platforms have this problem, except maybe CentOS Stream (theoretically, and I suspect they won't do this to Stream either)?
I think the re2 problem might come from pip's cache: pip builds a wheel against an old .so, doesn't realise the packaged .so has since bumped its version, and keeps grabbing the stale wheel from its cache instead of rebuilding. But I certainly take the general point that libraries don't always update seamlessly and we need to handle that.
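If that theory is right, the workaround on an affected node would be to drop the stale cache entry or bypass the cache entirely. A minimal sketch (the "re2" package name is just illustrative; only the offline "pip cache dir" step is actually run here):

```python
# Inspect pip's wheel cache, where a wheel built against an old .so
# would be lingering. Works offline; needs pip >= 20.1.
import subprocess
import sys

cache_dir = subprocess.run(
    [sys.executable, "-m", "pip", "cache", "dir"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print("pip wheel cache lives in:", cache_dir)

# A stale entry could then be dropped so the next install rebuilds:
#   python -m pip cache remove re2
# or the cache bypassed for a one-off install:
#   python -m pip install --no-cache-dir re2
```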
I'd like to solicit opinions on what we want this cache to do?
Thinking out loud here: what if we switched to maintaining an explicit list of packages we care about having wheels for: libvirt-python, cryptography, cffi, lxml, etc. Some of these will already have wheels on pypi, some will only have wheels for x86 and not arm, and some won't have wheels at all. We could build and publish the ones that aren't available for the current platform.
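As a sketch of what that per-platform check might look like (function name and filenames are hypothetical, and this only inspects wheel filename tags rather than doing full compatibility-tag resolution):

```python
# Given the wheel filenames an index lists for a package, decide
# whether we'd still need to build a wheel locally for a platform.

def needs_local_build(wheel_filenames, platform_substring):
    """Return True if no published wheel covers this platform.

    A universal wheel (…-none-any.whl) works everywhere; otherwise we
    look for the platform string in the filename's last tag, e.g.
    'manylinux_2_17_aarch64'.
    """
    for name in wheel_filenames:
        if not name.endswith(".whl"):
            continue
        # wheel filename: dist-version[-build]-python-abi-platform.whl
        platform_tag = name[: -len(".whl")].split("-")[-1]
        if platform_tag == "any" or platform_substring in platform_tag:
            return False
    return True

# Illustrative upstream listing: cryptography ships manylinux wheels
# for both x86_64 and aarch64, so neither platform needs a local build.
wheels = [
    "cryptography-40.0.1-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "cryptography-40.0.1-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
]
print(needs_local_build(wheels, "aarch64"))  # False: upstream covers arm
```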
If we use the audit tool to clear out everything we've unnecessarily cached from upstream, we could effectively do this by basically grepping for -py3-none-any.whl (or some variation thereof). These are wheels we're making that do little more than zip up the source.
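That grep could be sketched as a small filter over the cached filenames (the filenames and the exact pattern are illustrative; real wheels also use py2.py3 tags, which the pattern tries to cover):

```python
# Pick out pure-python wheels from a cached file listing: anything
# ending in -py3-none-any.whl or -py2.py3-none-any.whl just zips up
# source and could be served from PyPI directly.
import re

PURE_PYTHON = re.compile(r"-py[23](\.py[23])?-none-any\.whl$")

cached = [
    "lxml-4.9.2-cp310-cp310-linux_aarch64.whl",  # real binary build, keep
    "pbr-5.11.1-py2.py3-none-any.whl",           # pure python, removable
    "stevedore-5.0.0-py3-none-any.whl",          # pure python, removable
]
removable = [w for w in cached if PURE_PYTHON.search(w)]
print(removable)
```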
The downside is that this will probably require occasional maintenance on the OpenDev side to approve new packages to the list.
I think this is probably too big a downside; if we have to introspect the job to decide what needs to be built where, we'll never maintain it.
I think a perfect solution here might involve making the entire publishing pipeline driven by changes to openstack/requirements.
My only concern with having openstack/requirements drive this is that we have talked about decoupling openstack from these builds in the past so that other projects can more easily take advantage of them. Driving this from requirements would probably kill those dreams.
This is a good point, and probably part of why it is like this... but OTOH it is also quite tied to openstack/requirements now. Good to keep in mind.
I'm not sure I understand how Zuul semaphores help with the handling of build failures? Instead, maybe we should just always publish, since any wheel we do build should be valid. If we don't build a wheel for X and Y depends on X, the PyPI indexes upstream of us will already cause X to be fetched from there. We aren't really gaining much by keeping the wheels we do build out of the downstream published wheel cache index.
So all the jobs run and copy their wheels to their R/W partitions, but the release-wheel-cache job has a dependency on all of them [1] and so doesn't "vos release" the whole lot if any fail. I am guessing it was done like this to serialise the volume release and avoid having multiple "vos release" operations in flight at once. It seems like we could rework this now to use a semaphore, and have each platform run its own release job and have Zuul make sure they don't run together?

-i

[1] https://opendev.org/openstack/project-config/src/commit/d35367ededf6671ab982...
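PS: for what it's worth, a sketch of what that might look like in Zuul configuration (the job name is illustrative; max: 1 is Zuul's default for a semaphore, spelled out here for clarity):

```yaml
- semaphore:
    name: wheel-cache-vos-release
    max: 1

- job:
    name: release-wheel-cache-debian-bullseye
    semaphores: wheel-cache-vos-release
```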