Recent nodepool label changes

Wed Apr 7 17:33:22 UTC 2021

Sean Mooney <smooney at redhat.com> writes:

> I'm not sure what the issue is with allowing vms to have 32GB of ram.
> as job authors we should basically tailor our jobs to fit the minimum
> available, and if we get more ram than that, that's a bonus.
> we should not be writing tempest jobs in particular in such a way that
> more ram would break things, outside of very specific jobs.
> for example the whitebox tempest plugin, which literally sshes into the
> host vms to validate things in the libvirt xml, makes some assumptions
> about the env, but I would consider it a bug in our plugin if it could
> not work with more ram.

I tried really hard to make it clear I have no problem with the idea
that we could have flavors with more ram.  I absolutely don't object to
that.

What I am saying is that there is definitely a problem with using a
label that has different amounts of ram in different providers.  It
causes jobs to behave differently.  Jobs that pass in one provider will
fail in another because of the ram difference.  I agree with you that as
job authors we should tailor our jobs to fit the minimum available ram.
The problem is that this is nearly impossible if Nodepool randomly gives us
nodes with more ram.  We won't realize we have exceeded the minimum ram
until we hit a job on a provider with less ram after having exceeded it
on a provider with more ram.  This is not a theoretical issue -- you are
reading this message because I hit this problem after two test runs on a
recently started project.
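
To make the failure mode concrete, here is a minimal sketch. The
provider names, the RAM sizes, and the idea that a job simply fails
once its peak memory use exceeds the node's RAM are all assumptions
for illustration, not a description of any real provider or job.

```python
# Hypothetical provider names and node RAM sizes (in GB); not real configuration.
PROVIDER_RAM_GB = {"provider-a": 8, "provider-b": 32}


def job_passes(peak_ram_gb, node_ram_gb):
    """Simplified model: a job passes only if its peak memory use fits on the node."""
    return peak_ram_gb <= node_ram_gb


def outcomes(peak_ram_gb, provider_ram_gb=PROVIDER_RAM_GB):
    """Map each provider name to whether the job would pass on a node there."""
    return {
        name: job_passes(peak_ram_gb, ram)
        for name, ram in provider_ram_gb.items()
    }


# A job within the minimum passes everywhere.
print(outcomes(6))   # {'provider-a': True, 'provider-b': True}
# A job that has grown past the minimum still passes on the larger node,
# so nothing flags the problem until it lands on the smaller one.
print(outcomes(12))  # {'provider-a': False, 'provider-b': True}
```

In this model the job author gets no signal that the job has outgrown
the minimum as long as it happens to be scheduled on the larger node.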

> with less ram we may have issues, but more should not break any of our
> tests, or we should fix them.

There is an inherent contradiction in saying that more ram is okay but
less ram is not.  They are two sides of the same coin.  A job will not
break because it had more ram the first time, it will break because it
had less ram the second time.

The fundamental issue is that a Nodepool label describes an image plus a
flavor.  That flavor must be as consistent as possible across providers
if we expect job authors to be able to write predictable jobs.
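
One way to check for that kind of consistency is sketched below. It
assumes the RAM each provider supplies for a label has already been
gathered into a plain dictionary; the label names, provider names, and
sizes are hypothetical, and this does not read real Nodepool
configuration.

```python
def inconsistent_labels(label_ram_mb):
    """Return labels whose RAM differs between providers.

    label_ram_mb maps label name -> {provider name -> RAM in MB}.
    The result maps each inconsistent label to the sorted distinct sizes seen.
    """
    result = {}
    for label, per_provider in label_ram_mb.items():
        sizes = sorted(set(per_provider.values()))
        if len(sizes) > 1:
            result[label] = sizes
    return result


# Hypothetical data: one label backed by two different RAM sizes, and a
# separate, explicitly named larger label offered by a single provider.
labels = {
    "example-label": {"provider-a": 8192, "provider-b": 32768},
    "example-label-32GB": {"provider-b": 32768},
}

print(inconsistent_labels(labels))  # {'example-label': [8192, 32768]}
```

Under these assumptions, a larger flavor exposed under its own label
passes the check, while a single label spanning two sizes is flagged.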

> it seems very wasteful to me to boot a 32G vm and only use 8G of it.

It may seem that way, but the infrastructure provider has told us that
they have tuned their hardware purchases to that ratio of CPU/RAM, and
so we're helping out by doing this.

The more wasteful thing is people issuing rechecks because their jobs
pass in some providers and not others.

-Jim