<div dir="ltr"><div class="gmail_default" style="font-size:small">Wow, that's some good open data. </div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Is there a single entry page from which I could have discovered all of that myself?</div><div class="gmail_default" style="font-size:small">And does all of that relate to a single OpenStack cluster run by the community, or is it build-related infrastructure?</div><div class="gmail_default" style="font-size:small"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 30, 2020 at 6:02 PM Jeremy Stanley &lt;<a href="mailto:fungi@yuggoth.org">fungi@yuggoth.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 2020-04-16 12:52:28 +0200 (+0200), Marcel Hild wrote:<br>
> I would start with monitoring data, i.e. all prometheus metrics<br>
> being collected.<br>
<br>
I agree that's a good route to take initially. Using OpenDev as an<br>
example, while obviously not the same systems and not necessarily<br>
constrained by the same risks, we try to keep all non-sensitive<br>
monitoring and trending data public, like:<br>
<br>
<a href="http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=25&rra_id=all" rel="noreferrer" target="_blank">http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=25&rra_id=all</a><br>
<br>
<a href="http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1" rel="noreferrer" target="_blank">http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1</a><br>
<br>
> But those in themselves are not really useful without a picture of the<br>
> infrastructure and access to the infrastructure.<br>
<br>
One thing we've done in that vein is to try to keep all our<br>
deployment and systems management notes (runbooks or whatever you<br>
like to term them) on a public site:<br>
<br>
<a href="https://docs.opendev.org/opendev/system-config/" rel="noreferrer" target="_blank">https://docs.opendev.org/opendev/system-config/</a><br>
<br>
We also try to publish logs for some services of interest if we can<br>
be certain they won't leak things like PII or credentials:<br>
<br>
<a href="https://nb01.openstack.org/" rel="noreferrer" target="_blank">https://nb01.openstack.org/</a> (sorry, self-signed cert on that one)<br>
<br>
> And you might need access to ticket systems, since you want to<br>
> correlate the metrics with incidents.<br>
<br>
Absolutely, though relying on a public incident/defect tracker does<br>
also mean you need to train your users to leverage privacy features<br>
for anything which may contain sensitive data, or to keep such<br>
information out of tickets and forward it via more confidential<br>
channels instead. It also means getting better about redacting<br>
certain classes of data either automatically or on request (and then<br>
dealing with fallout for the latter, treating that material as<br>
though it has leaked even if you've masked or deleted it after the<br>
fact to limit the damage).<br>
<br>
> My point is, we need to open up _all_ aspects of operations to<br>
> make it actually useful, otherwise it'll be just a pile of data.<br>
[...]<br>
<br>
For more general day-to-day operations, as well as scheduled<br>
maintenance or similar change activity, we've found it's useful to<br>
keep discussion on publicly archived mailing lists and in publicly<br>
logged IRC channels so that it's easy to refer back to later.<br>
<br>
<a href="http://lists.opendev.org/pipermail/service-discuss/2020-April/thread.html" rel="noreferrer" target="_blank">http://lists.opendev.org/pipermail/service-discuss/2020-April/thread.html</a><br>
<br>
<a href="http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-30.log.html" rel="noreferrer" target="_blank">http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-30.log.html</a><br>
<br>
<a href="http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.log.html" rel="noreferrer" target="_blank">http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.log.html</a><br>
<br>
We're also using an IRC bot which our sysadmins can command to log<br>
important status information to a Web page:<br>
<br>
<a href="https://wiki.openstack.org/wiki/Infrastructure_Status" rel="noreferrer" target="_blank">https://wiki.openstack.org/wiki/Infrastructure_Status</a><br>
<br>
That allows us to easily publish notes about what we're doing which<br>
might have public impact, and we also tie it into a notification<br>
system which can echo messages in subscribed IRC channels or<br>
temporarily update their channel topics to reflect important service<br>
status information.<br>
<br>
Another major choice we've made is to perform installation and<br>
life-cycle management of a lot of our services through continuous<br>
deployment jobs in a public-facing CI/CD system:<br>
<br>
<a href="https://zuul.opendev.org/t/openstack/builds?pipeline=deploy" rel="noreferrer" target="_blank">https://zuul.opendev.org/t/openstack/builds?pipeline=deploy</a><br>
<br>
It does require careful thought, however, to make sure you're not<br>
exposing anything which could compromise the integrity or security<br>
of those systems.<br>
-- <br>
Jeremy Stanley<br>
_______________________________________________<br>
Openinfralabs mailing list<br>
<a href="mailto:Openinfralabs@lists.opendev.org" target="_blank">Openinfralabs@lists.opendev.org</a><br>
<a href="http://lists.opendev.org/cgi-bin/mailman/listinfo/openinfralabs" rel="noreferrer" target="_blank">http://lists.opendev.org/cgi-bin/mailman/listinfo/openinfralabs</a><br>
</blockquote></div>