On Thu, Mar 07, 2019 at 11:32:39AM +0900, Tristan Cacqueray wrote:
On Wed, Mar 06, 2019 at 19:06 Monty Taylor wrote:
On 3/6/19 6:12 PM, Paul Belanger wrote:
On Wed, Mar 06, 2019 at 05:31:11PM +0000, Monty Taylor wrote:
Hey everybody,
It seems that kubernetes operators are all the rage these days, and in the last 2 weeks I've had multiple conversations that have all come back around to someone wanting a zuul operator to exist.
Hello,
Thank you for starting this thread. It seems like a great opportunity to converge the different deployment and operation tools.
Ideally, we would grow a hierarchy of operators to manage the gating system, and then you could have an opendev-operator to compose:
* log-processor-operator
* elastic-search-operator
* paste-operator, etherpad-operator, ...
* gerrit-operator
I think these are fine to have, but I am on the fence about whether they are within the scope of what zuul should do. As an example, I can totally see opendev.org wanting to do this. As long as everything is disconnected from zuul-operator, I would be okay. Personally, I don't want zuul to manage a gerrit.
We have also started to investigate operators for Software Factory components; perhaps we could join forces?
SO ...
I would like to propose that we create a zuul-operator project as part of Zuul.
Given our close ties with Ansible, I think we should use ansible-operator [0] to write it.
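For reference, an ansible-operator project is driven by a watches.yaml that maps a CRD group/version/kind to an Ansible role. A zuul-operator scaffold might look roughly like this (the group, kind and role names here are just placeholders, nothing is decided):

# watches.yaml - hypothetical zuul-operator scaffold for ansible-operator;
# the group/version/kind and role name below are placeholders only.
- group: zuul-ci.org
  version: v1alpha1
  kind: Zuul
  role: zuul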
Since we have zuul in production on both vanilla k8s and on OpenShift, I think we should set up gate jobs to make sure it works on both. Once we're happy with it - we should publish it to operatorhub. [1]
Does anyone object? If not, I'll get a repo started.
I think it is a fine idea; I've also seen some talk about it recently.
I do think we should have some consensus on the bits we want to support for the deployment. Maybe going as far as saying it is an opinionated deployment, for better or worse. As an example, is graphite / statsd in play or not?
I think that's a great question.
I do believe it should be an opinionated deployment - but there are probably places where people will want to make some choices from a specific set of choices. Like - by default I think it should spin up a percona-xtradb cluster as provided by the percona operator. BUT - I could see a setting in the main Zuul CRD like "use-existing-db" that takes a sqlalchemy URL. That way if there is someone who has an existing db infrastructure, they could take advantage of it - but the Zuul operator gets you a working system out of the box.
Obviously we'll need to spin up a zookeeper. Not sure if there is an operator for that yet - we might want to make one or work with someone to make one.
There are actually 2 operators for ZooKeeper already listed in this page: https://github.com/operator-framework/awesome-operators
I could see a graphite/statsd option with the ability to point to an existing graphite. Maybe we just start with a CRD field that indicates the stats_host, like we have now?
But - yes - opinionated, mostly batteries included - maybe some amount of optional things.
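To make that a bit more concrete, a hypothetical Zuul custom resource could look something like the following - none of these field names are decided, it's only a sketch of the escape hatches mentioned above:

# Hypothetical Zuul custom resource; field names are illustrative only.
apiVersion: zuul-ci.org/v1alpha1
kind: Zuul
metadata:
  name: zuul
spec:
  # If unset, the operator spins up its own percona-xtradb cluster.
  use-existing-db: mysql+pymysql://zuul:secret@db.example.com/zuul
  # If unset, statsd reporting is simply disabled.
  stats_host: graphite.example.com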
Should we discuss this with a spec? For example:
How are we going to manage the connections and the tenants' configuration?
Software Factory's batteries include a zuul-reconfigure job to manage the main.yaml file. Should the zuul-operator set up such a job, or do we perhaps also need a zuul-quick-start-operator?
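One option (purely a sketch, assuming the operator owns main.yaml) would be to carry the tenant configuration in the custom resource and have the operator render main.yaml and trigger a reconfigure whenever it changes:

# Hypothetical tenants section in the Zuul custom resource; the operator
# would render this into main.yaml and run a reconfigure on change.
spec:
  tenants:
    - tenant:
        name: example-tenant
        source:
          gerrit:
            config-projects:
              - example/zuul-config
            untrusted-projects:
              - example/demo-project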
Also, could we maybe settle on a single distro for all container things? I guess by default that is what the current containers from the zuul project are running today. I don't really have the energy to get into the weeds of pip / dpkg / rpm for how things might get installed (if needed).
Yes. All of our container images as we are publishing them now are awesome, and are what I expect us to use.
I am categorically opposed to us making and publishing different images. One of the nice things about publishing container images is that you can do them once and do them well. We're using the python base image. It happens to be based on debian-stretch - but that's irrelevant. It's running python.
That sounds great, but how are we going to cope with the next heartbleed or shellshock? IIUC, we would need to wait for a fix to land in the debian-stretch image, then wait for a new opendevorg/python-builder image and finally wait for a new zuul image to be published.
Or what happens when the zuul image gets published with an incompatible openstacksdk requirement?
Thus I think the base image and how the zuul bits get bundled are relevant. For example, OpenShift users may want to use a BuildConfig instead, so that a parent layer update automatically triggers a rebuild of the child layers.
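For example, a BuildConfig with an ImageChange trigger rebuilds the child image whenever its parent image stream is updated; roughly like this (the image, stream and repository names are only examples):

# Sketch of an OpenShift BuildConfig that rebuilds a local zuul image
# whenever its parent builder image changes; names are examples only.
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: zuul-local
spec:
  source:
    git:
      uri: https://example.com/zuul/zuul
  strategy:
    dockerStrategy:
      from:
        kind: ImageStreamTag
        name: python-builder:latest
  output:
    to:
      kind: ImageStreamTag
      name: zuul:latest
  triggers:
    - type: ImageChange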
Yah, this is what I was hoping we'd discuss more when we dove into the weeds. Can a BuildConfig work with k8s? Or do some other k8s items work with openshift? If not, I think it is fine if we only support one and not try to do everything. Which one we support, I don't really know; hopefully others can say which one is better :) We could also support both, but I feel that is a lot of work for minimal gain.
Regards, -Tristan
I DO think we'll likely need to support specifying an image location in the CRD. (rook does this for ceph too) That's because I'd like for Tobias to be able to use the operator if he chooses, but he uses local builds of the images. And I think that's fine.
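Something along these lines, extending the hypothetical CRD above (field names and registry paths are purely illustrative; by default the operator would use the published images):

# Hypothetical per-component image overrides in the Zuul custom resource.
spec:
  images:
    scheduler: registry.example.com/local/zuul-scheduler:latest
    executor: registry.example.com/local/zuul-executor:latest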
I am likely showing how much I don't know about k8s / openshift here, but I want to limit the number of different ways to do a deployment based on underlying distro bits.
+10000000000
I believe there is no value in producing divergent images.