Recent nodepool label changes

James E. Blair jim at acmegating.com
Wed Apr 7 01:55:27 UTC 2021


Hi,

I recently spent some time trying to figure out why a job worked as
expected during one run and then failed due to limited memory on the
following run.  It turns out that back in February this change was
merged on an emergency basis, which caused us to start occasionally
providing nodes with 32G of RAM instead of the typical 8G:

  https://review.opendev.org/773710

Nodepool labels are designed to represent the combination of an image
and a set of resources.  To the best of our ability, the images and
resources they provide should be consistent across different cloud
providers.  That's why we use DIB to create consistent images and that's
why we use "-expanded" labels to request nodes with additional memory.
It's also the case that when we add new clouds, we generally try to
benchmark performance and adjust flavors as needed.

Unfortunately, providing such disparate resources under the same
Nodepool labels makes it impossible for job authors to reliably design
jobs.

To be clear, it's fine to provide resources of varying sizes; we just
need to use different Nodepool labels for them so that job authors get
what they're asking for.
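As an aside, this sort of mismatch is easy to spot from inside a job by
reading /proc/meminfo.  A minimal sketch (the function name and the 8G
threshold here are illustrative, not part of any existing job):

```python
def total_ram_gib(meminfo_path="/proc/meminfo"):
    """Return total system RAM in GiB, parsed from /proc/meminfo."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                kib = int(line.split()[1])  # MemTotal is reported in kiB
                return kib / (1024 * 1024)
    raise RuntimeError("MemTotal not found in %s" % meminfo_path)

# Illustrative use: log the RAM the node actually provided, so a run on
# an oversized node is obvious in the job output.
print("Node RAM: %.1f GiB" % total_ram_gib())
```

Logging this early in a job makes it clear, after the fact, whether a
passing run happened to land on an oversized node.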

The last time we were in this position, we updated our Nodepool images
to add the mem= Linux kernel command line parameter in order to limit
the total available RAM.  I suspect that is still possible, but due to
the explosion of images and flavors, doing so will be considerably more
difficult this time.
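For reference, the limit comes from a standard kernel command line
option, mem=nn[KMG], which caps the physical memory the kernel will
use.  An illustrative GRUB fragment (the 8G value and the GRUB
mechanism are examples here, not the exact change we made before):

  # /etc/default/grub on the image; cap visible RAM at 8G
  GRUB_CMDLINE_LINUX="mem=8G"

followed by regenerating the GRUB config at image-build time so the
cap takes effect on first boot.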

We now also have the ability to reboot nodes in jobs after they come
online, but doing that would add run time to every job.

I believe we need to address this.  Despite the additional work, the
"mem=" approach seems like our best bet, unless anyone has other
ideas?

-Jim


