Hi,
Sorry for the delay, I was a bit distracted earlier this week.
Tobias has fixed an issue with implicit branch matchers[1].
I've mostly completed the effort to store logs in swift. We still need
to add the icons if we want them to show up in the generated HTML
indexes, but otherwise, it should be ready. For those interested, I
expect OpenStack to start using this at scale shortly.
The changes to add line comments to Gerrit (and add the framework for
GitHub to use later) are in. I still need to see if we can map line
numbers across speculative changes in the mergers.
We released Zuul 3.2.0.
Folks also may be interested to know that in OpenStack, we've seen a
significant performance improvement on the executors by upgrading to
more recent kernels (4.15.0).
Please help write next week's email by updating this etherpad:
https://etherpad.openstack.org/p/zuul-update-email
-Jim
[1] https://review.openstack.org/588201
Hey all,
Podcast.__init__, which is a Python-related podcast, recently
interviewed me about Zuul:
https://www.podcastinit.com/zuul-with-monty-taylor-episode-172/
If you think you can handle listening to me talk for an hour, check it out.
For those of you who think you've heard all of my old grandpa stories
... this podcast contains the story of when Jim and I first met.
Monty
Hi,
This past week has seen:
Paul doing some work to fix and better test the ansible synchronize
task. Also fixing an issue related to the child-jobs changes from last
week.
I made progress on both the swift log uploads and file comments efforts.
I should be able to test swift uploads this week.
Tobias has been working on the ability to pause a job while its children
run (this will also be helpful for container work).
Tristan has staged several changes implementing the containers spec.
I plan on releasing Zuul 3.2.0 shortly.
Please help write next week's email by updating this etherpad:
https://etherpad.openstack.org/p/zuul-update-email
-Jim
Hi,
I've published what I hope is the final revision of the container spec.
Please take a look and leave feedback and a vote.
I spent part of last week working on a new zuul-jobs role to upload logs
to swift. I will continue that this week. I'm hoping that with two
viable log upload roles, we'll end up with less ambiguity around log
uploading. Paul is also performing work to improve roles around log
handling.
Once that's done, I'll return to my work on line comments.
I believe Tristan has fixed the problem with zuul-web which was blocking
our release last week. However, we did notice a memory leak in
OpenStack-Infra. It's possible that it's already fixed by patches that
Tobias happened to already be working on, or it may not. I'd like to
observe our instance for a few more days to see how it behaves. But
hopefully we can release Zuul this week.
We're also planning on burning in the current master version of Nodepool
with the idea of making a release there before Shrews goes on vacation.
Please help write next week's email by updating this etherpad:
https://etherpad.openstack.org/p/zuul-update-email
-Jim
Greetings,
I'd like to open a discussion around the idea of changing how base jobs work.
Today, base jobs are stored in a trusted config project. Mostly, this is done
because certain aspects of a base job need a secret, for example uploading logs
to logs.example.org. We also have other tasks in a base job, let's say
configure-mirrors, which pre-populates the mirror info on a node from nodepool.
As a recap, the main thing to remember about a config project, or trusted job,
is that we cannot leverage the speculative testing around jobs. We first need to
merge a patch, then hope the job continues to work.
In openstack-infra, we've created an idea of base / base-test jobs and some
instructions[1] on how to interact with them.
Now back to configure-mirrors role above, I believe there isn't really a reason
for the role to live in a trusted job, nothing it does will affect zuul security
(think secrets) or interact with zuul-executor (delegate_to: localhost). And given
that changes to the role need to do the base / base-test dance outlined above,
I'd like to make the following suggestion.
We keep the idea of base job, but we split it into base (trusted) and base
(untrusted). Given that today zuul cannot support 2 jobs with the same name, I
think it would look more like:
  foo-zuul-jobs (untrusted project)
  - name: base
    parent: base-minimal

  config-project (trusted)
  - name: base-minimal
    parent: null
This means all jobs now parent to a job in an untrusted project, with the main
benefit that speculative changes to base are more easily tested. And things that
really need to be locked down (e.g. secrets) still live in the trusted
base-minimal job (NOTE: open to renames).
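To make the proposed layout concrete, here is a minimal Python sketch that
models the two jobs as plain dictionaries and walks a job's parent chain. It
is not Zuul code: the JOBS table, the "my-unit-tests" job, and both helper
functions are hypothetical names used only for illustration.

```python
# Toy model of the proposed layout; not Zuul's actual data structures.
# Each entry records the job's parent and whether its project is trusted.
JOBS = {
    "base-minimal": {"parent": None, "trusted": True},      # config-project
    "base": {"parent": "base-minimal", "trusted": False},   # foo-zuul-jobs
    "my-unit-tests": {"parent": "base", "trusted": False},  # hypothetical job
}


def parent_chain(name, jobs=JOBS):
    """Return the job names from `name` up to the root of the hierarchy."""
    chain = []
    while name is not None:
        chain.append(name)
        name = jobs[name]["parent"]
    return chain


def speculatively_testable(name, jobs=JOBS):
    """Return the jobs in the chain that live in untrusted projects."""
    return [job for job in parent_chain(name, jobs) if not jobs[job]["trusted"]]


if __name__ == "__main__":
    print(parent_chain("my-unit-tests"))
    print(speculatively_testable("my-unit-tests"))
```

In this model only base-minimal sits outside speculative testing, which is
the point of the split.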
In the case of openstack-infra[2], I actually think a lot of the roles in
pre.yaml can be moved into untrusted jobs, if we followed this idea. The
post-run tasks are almost all trusted context.
Mind you, we don't really need to make this change in zuul, but I'm more
thinking we should use it as the default recommendation for new users. I'm
looking to start implementing the change in rdoproject.org but wanted to get the
idea out before I proposed changes.
[1] http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul.d/jo…
[2] http://git.openstack.org/cgit/openstack-infra/project-config/tree/playbooks…
Hi,
Some recent (and not-so-recent) issues with log hosting in OpenStack
Infra have caused a renewed interest in reworking our log storing
infrastructure. Some of the motivation and much of the initial work is
OpenStack-specific, so I have written a post to the openstack-infra list
describing that. Technically, none of what is in there affects the
wider Zuul community, but folks reading this may want to read that
message for background (or, if you follow our patterns closely, it may
affect you as well).
That message is here:
http://lists.openstack.org/pipermail/openstack-infra/2018-July/006020.html
Also, like that message, this one is the result of several
conversations, especially with Monty; many of these ideas are his.
Essentially, that message describes a process for us to begin hosting
our logs in swift instead of on a file server. I'll note here that Zuul
is (and will continue to be, at least for the foreseeable future)
agnostic about where logs are hosted. Log storage is implemented as
roles in base jobs, so any site can store logs where and how they wish.
We can help each other by sharing the roles which perform these tasks.
OpenStack moving its log storage to swift doesn't mean anyone else needs
to -- storing logs on a server (or even the executor) will still be
supported. As would storing in S3, or any other system for which
someone is motivated to write a role.
That message also describes decommissioning our log-hosting server which
happens to provide a number of useful services today. Among these
are dynamic generation of directory indexes, and HTMLification of text
logs. Our plan is to pre-generate that data; however, that's not ideal,
as it expands and duplicates the amount of data stored, and it also
creates more work for the executors.
We can expand the role of the Zuul dashboard to take on some of those
duties, thereby reducing the storage requirements, and offloading the
computation to the browser of the user requesting it. In the process,
we can make a better experience for users navigating log files, all
without compromising Zuul's independence from backend storage
mechanisms.
All of what follows should apply equally to file or swift-based storage.
Let's start by creating a per-build page in the dashboard. Currently,
we show a list of builds, and link to the log locations for those
builds. In the future, we should link to a new page with details about
the build. Not only so that we can show more info (which we already
collect) than can be shown on one line, but also so that we have a place
for the additional functionality which follows.
If we have the uploader also create and upload a file manifest (say,
zuul-manifest.json), then the build page can fetch[1] this file from the
log storage and display an index of artifacts for the build. This can
allow for access to artifacts directly within the Zuul web user
interface, without the discontinuity of hitting an artifact server for
the index. Since swift (and other object storage systems) may not be
capable of dynamically creating indexes, this allows the functionality
to work in all cases and removes the need to pre-generate index pages
when using swift.
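As a rough illustration of the uploader side, here is a minimal Python sketch
that walks a local log directory and writes a manifest next to the logs. The
manifest layout (a "tree" of entries with "name", "type", "size" and
"children" keys) and the function names are assumptions made for this
example; no schema for zuul-manifest.json has been agreed on.

```python
import json
import os


def build_manifest(log_root):
    """Walk log_root and return a nested description of its contents.

    The schema used here is an assumption for illustration only.
    """
    def describe(path):
        entries = []
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            if os.path.isdir(full):
                entries.append(
                    {"name": name, "type": "dir", "children": describe(full)}
                )
            else:
                entries.append(
                    {"name": name, "type": "file", "size": os.path.getsize(full)}
                )
        return entries

    return {"tree": describe(log_root)}


def write_manifest(log_root, filename="zuul-manifest.json"):
    """Write the manifest into log_root so it is uploaded with the logs."""
    # Build the manifest before creating the file so it does not list itself.
    manifest = build_manifest(log_root)
    with open(os.path.join(log_root, filename), "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest
```

The build page would then only need to fetch this one file to render the
index, whatever the storage backend is.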
We can also extend Zuul to store additional artifact information about a
build. For example, in our docs build jobs, we override the success-url
to link to the generated doc content. If we return the build location
as additional information about the build, we can highlight that in a
prominent place on the build page. In the future, we may want to leave
the success-url alone so that users visit the build page and then follow
a link to the preview. It would be an extra click compared to what we
do today, but it may be a better experience in the long run. Either way
would still be an option, of course, according to operator preference.
Another important feature we currently have in OpenStack is a piece of
middleware we call "os-loganalyze" (or OSLA) which HTMLifies text logs.
It dynamically parses log files and creates anchors for each line (for
easy hyperlink references) and also highlights and allows filtering by
severity.
We could implement this in javascript, so that when a user browses from
the build page to a text file, the javascript component could fetch[1]
the file and perform the HTMLification as OSLA does today. If it can't
parse the file, it will just pass it through unchanged. And in the
virtual file listing on the build page, we can add a "raw" link for
direct download of the file without going through the HTMLification.
This would eliminate the need for us to pre-render content when we
upload to swift, and by implementing this in a generic manner in
javascript in Zuul, more non-OpenStack Zuul users would benefit from
this functionality.
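The transformation itself is small. The sketch below is written in Python
purely to illustrate the logic; the proposal above is to do this in
javascript in the browser. The severity keywords, the "L<n>" anchor naming,
and the htmlify function are assumptions for this example and are not taken
from os-loganalyze.

```python
import html
import re

# Assumed severity keywords; real logs may follow other conventions.
SEVERITY_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def htmlify(text):
    """Wrap each log line in an anchor tagged with its severity.

    Lines without a recognised severity get the class "NONE", so a file
    that does not parse is still shown in full.
    """
    out = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = SEVERITY_RE.search(line)
        severity = match.group(1) if match else "NONE"
        out.append(
            '<a id="L{n}" href="#L{n}" class="{sev}">{body}</a>'.format(
                n=number, sev=severity, body=html.escape(line)
            )
        )
    return "\n".join(out)
```

Filtering by severity then becomes a matter of showing or hiding elements by
class, and each line can be linked to through its anchor.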
Any unknown files, or existing .html files, should just be direct links
which leave the Zuul dashboard interface (we don't want to attempt to
proxy or embed everything).
Finally, even today Zuul creates a job-output.json, with the idea that
we would eventually use it as a substitute for job-output.txt when
reviewing logs. The build page would be a natural launching point for a
page which read this json file and created a more interactive and
detailed version of the streaming job output.
In summary, by moving some of the OpenStack-specific log processing into
Zuul-dashboard javascript, we can make it more generic and widely
applicable, and at the same time provide better support for multiple log
storage systems.
-Jim
[1] Swift does support specifying CORS headers. Users would need to do
so to support this, and users of static file servers would need to do
something similar.
Hi
Last week Tristan fixed issues with direct enqueueing of changes with
ambiguous project names, improved the timer trigger to skip unneeded
projects, and improved GitHub event filter debugging.
Tobias fixed a potentially serious bug with contamination of
configuration. He's working on improving job freeze performance, and
parent jobs which pause and continue to run while their children run
(for resources and artifact exchange during a buildset).
This week I plan on finishing the container spec. I've also written up
some ideas for log storage which I may do some work on.
We're waiting on a resolution to an issue with multi-tenancy in the new
Javascript code. Once that's resolved, or we revert the new JS code, we
should be in position to make a new release.
Please help write next week's email by updating this etherpad:
https://etherpad.openstack.org/p/zuul-update-email
-Jim
Hi,
I think the container spec is just about ready, but one part of it
recently prompted a discussion that I think could use some wider input.
Specifically, this is regarding the "container-as-machine" use-case. So
think "run pep8 in a container" when you're thinking about this.
There's a whole bunch of stuff in the spec about container-native
workloads. This isn't it.
The issue is whether or not Nodepool should be able to build container
images, much as we do today for machines.
Note that regardless, Nodepool will be able to use "upstream" container
images from the local Kubernetes registry or Docker Hub. So this would
be an optional feature (just as Nodepool can use extant cloud images).
There are a few reasons to build local images (of any kind):
* To provide a git cache.
* To have more up to date packages than those provided externally.
* To have a backup of a working image if something upstream changes.
* To have a standardized image.
* To add local requirements.
To be fair, the last two might be the same thing said different ways.
In an offline discussion prompted by the spec, Monty has suggested that
running a git mirror in Kubernetes may obviate the need for using custom
images for a git cache [related: applying this to VM images as well
could reduce image sizes there; he's planning on writing up thoughts
on this more fully later]. So if we take that out of the equation, the
question becomes:
Are the remaining reasons for building an image locally sufficient to
build this functionality into Nodepool?
A quick clarification: we're obviously not going to write a new image
builder either way. We're talking about having Nodepool run some
existing container image process on a periodic schedule and upload the
result to the local registry. The main benefits of Nodepool doing this
are using the same images across different registries, having a daemon
perform the image builds automatically, and the ability to roll back
to previous versions.
If you have thoughts on whether this is useful, or if it should be
omitted in favor of relying more heavily on base images or external
systems, please let me know. I'll try to collect feedback and update
the spec early next week. Then we can merge it. :)
-Jim