New subject: [service-announce] October 20 Gerrit Outage Update

20 Oct 2020

Hi everyone,

I'm happy to see that things are back in order. I *hate* to be that person,
but I think there are still some hard questions that we need to answer
together, transparently.

I am especially concerned about how this affects not only the workflow of
developers overall but also the security measures we have in place, should
something like Zuul be targeted instead.

I've added some questions in-line below.

Thanks,
Mohammed

On Tue, Oct 20, 2020 at 8:33 PM Ian Wienand <iwienand@redhat.com> wrote:
...
As of this mail, Gerrit access has been restored.  Please read on for
important information, especially around change verification.

Background
-----------
On 2020-10-20 at 01:30 a user unexpectedly added a workflow approval
to a change that they were not expected to have access to.  At 02:06
UTC an alert was raised via IRC.  Administrators found the account had
added themselves to a core group and made the +W vote.  The account
was disabled, and removed from the groups it had added itself to by
02:55 UTC.  Administrators began to analyse the situation and Gerrit
was taken offline at 04:02 UTC to preserve state and allow for
analysis.

From this time, administrators were working on log collection and
analysis, along with restoring backups for comparison purposes.

By around 08:45 UTC it was clear that the privilege escalation had
been achieved by gaining control of a Launchpad SSO account with
Gerrit administrator privileges.  By this time, we had ruled out
software vulnerabilities.  Logs showed the first unauthorized access
of the administrator account in Gerrit on 2020-10-06.  Communication
with Launchpad admins agrees with this analysis.  We saw one session
opened as the administrator user to StoryBoard on this same day, but
logs show no data was modified or hidden stories viewed.

So, just to be clear: someone who had root access to our Gerrit installation
had their account compromised, which resulted in this (rather than this being
a by-product of some other service, say StoryBoard, leaking some sort of
information)?

I see two issues in this at the moment:

- There is no need for anyone to hold admin powers on Gerrit at all times.
  We've done enough automation to sustain us, and a manual 'circuit breaker'
  of adding a user *IF* necessary should be put in place.
- If the above is not possible, anyone who is part of this group should have
  2FA enabled inside Launchpad's SSO.

I would very much prefer the first option rather than the second one.

If it was an individual's account that was accessed and not a system account,
have we audited whether anything else might have been accessed, such as
resources relating to Zuul or other systems, and are we potentially
rotating/auditing all of our infrastructure?
...
Analysis has been performed on the Gerrit database and git trees from
October 1st, pre-dating any known unauthorized access.

Access was restored at around 2020-10-21 00:00 UTC.

Outcomes
-----------
The following has been verified:

- The administrator account used has been disabled and credentials
  updated.
- We have verified that all group and user addition/removals since
  Oct 1 are valid.  The only invalid additions were made by the
  compromised administrator account to add a single user account to
  the Administrators group; and then that account added itself to
  another known group.
- The account given administrator privilege has been removed from
  the groups it added itself to and is disabled.
- There is no evidence of any unauthorized access via methods other
  than Gerrit HTTP and Gerrit SSH access.
- No commits have been pushed to git trees bypassing code review.
  Every git tree has been compared to the Oct 1 version and all
  commits have been correctly inserted via Gerrit changes.

I saw this artifact. I have no idea whether it was taken into consideration,
but it's food for thought:

https://review.opendev.org/#/c/758881/
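
For anyone who wants to picture the tree comparison described in the quoted
outcome above, here is a minimal sketch of the idea, not the procedure the
admins actually ran. It assumes three sets of commit hashes are already
available: the commits in the Oct 1 copy of a tree, the commits in the
current tree, and the commits that merged Gerrit changes account for. The
function name and the sample hashes are hypothetical.

```python
def unreviewed_commits(baseline, current, merged_via_gerrit):
    """Return commits added since the baseline that no merged change explains.

    All three arguments are iterables of commit hashes (strings).
    An empty result means every new commit went through code review.
    """
    added_since_baseline = set(current) - set(baseline)
    return sorted(added_since_baseline - set(merged_via_gerrit))


if __name__ == "__main__":
    # Hypothetical hashes, for illustration only.
    baseline = {"a1", "b2"}
    current = {"a1", "b2", "c3", "d4"}
    merged = {"c3"}
    for commit in unreviewed_commits(baseline, current, merged):
        print("needs a closer look:", commit)
```

Any hash this returns would be a commit that appeared after the baseline
without a corresponding reviewed change, which is the case the audit is
trying to rule out.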
...
- The version of Gerrit we use stores HTTP API passwords in
  plain-text.  We know that a limited number of passwords were
  gathered via the HTTP API and it is possible passwords were
  gathered via the database.  We thus have assumed that all HTTP API
  passwords have been disclosed.  This password needs to be
  explicitly enabled by users, and many users do not have it
  enabled.

Remediation
-----------
This leaves us with the following remediation actions:

- Users should double-check their Launchpad recent activity at
  https://login.launchpad.net/activity for any suspicious logins.  If
  found, please notify the OpenDev admins in Freenode #opendev and
  Launchpad admins in #launchpad immediately.
- All HTTP API passwords have been cleared.  If you push changes via
  HTTPS (instead of typical SSH), are a gertty user, or run a CI
  system or something else that communicates with the Gerrit HTTP
  API, you will need to regenerate a password.
- Any SSH keys added to accounts since 2020-10-01 have been removed.
  This affects only a limited number of accounts.  This is done in
  an abundance of caution, and we do not believe any accounts had
  unauthorized SSH keys added.
- We should audit all changes for projects since 2020-10-01.

We have no evidence that any account had its SSH keys compromised,
thus we can rule out any unauthorized changes being uploaded via SSH.
However, we cannot conclusively rule out that compromised HTTP API
passwords were used to push a change through Gerrit. For example, a
change could be uploaded that looks like it came from a user, or the
API key of a core team member may have been used to approve a change
without authorization.

Given our extensive analysis we consider it exceedingly unlikely that
this vector was used.  We have had no notifications of users seeing
unexpected changes either uploaded by them, or approved by them in
projects they work on.  This said, we believe it is important to
inform the community of this very unlikely, but still possible,
vulnerability of the source code.

To this end, we have prepared a list of all changes from the known
affected period which should be audited for correctness.  These are
available at
https://static.opendev.org/project/opendev.org/gerrit-diffs/

Team members should browse these changes and make sure they were
correctly approved in Gerrit.  If any change looks suspicious you
should notify OpenDev administrators in Freenode #opendev immediately.

Further actions
----------------
We are planning the following for the short-term future:

- The OpenDev administrators will be looking at alternative models
  for Gerrit admin account management.
- We are already well into planning and testing a coming upgrade to
  a version of Gerrit which does not store plain-text API keys.
- Longer term, we've written a spec for replacing Launchpad SSO as
  our authentication provider.

We thank you for your patience during this trying time, and we look
forward to returning to supporting the community doing what it does
best -- working together to create great things.

Thank you for this.  I'd also like to raise the question of how, moving
forward, we can track these things.  We had a user with full root access to
our Gerrit installation for ~2 weeks entirely without our knowledge, only
uncovered when they did something (that, in the grand scheme of things, was
relatively trivial compared to what could have happened).

What can we do to set up the necessary infrastructure to ensure that these
things are monitored?  OpenDev is considered to be critical infrastructure
for this entire community, and there's not much that an outsider, as opposed
to the 'keyholders', can do for these resources.

We've historically refused to have any monitoring, and now things like this
have slipped through.  I'm just worried that we have a big looming thing
coming up ahead of us that will catch us off guard, and we'll be completely
unprepared for it...
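
To make the monitoring question a bit more concrete, here is a minimal
sketch of one possible check. It assumes group membership additions can be
exported as simple records (who made the change, who was added, to which
group); the record layout, the function name and the account names are all
hypothetical, and only the "Administrators" group name comes from the report
above. It flags the two patterns seen in this incident: an addition to a
privileged group, and an account adding itself to a group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupChange:
    actor: str   # account that performed the addition
    member: str  # account that was added
    group: str   # group the member was added to


# Assumption: which groups count as privileged is a local policy decision.
PRIVILEGED_GROUPS = frozenset({"Administrators"})


def suspicious_changes(changes, privileged_groups=PRIVILEGED_GROUPS):
    """Return (change, reasons) pairs for additions that deserve review."""
    flagged = []
    for change in changes:
        reasons = []
        if change.actor == change.member:
            reasons.append("self-addition")
        if change.group in privileged_groups:
            reasons.append("added to privileged group")
        if reasons:
            flagged.append((change, reasons))
    return flagged


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    log = [
        GroupChange("admin-account", "example-user", "Administrators"),
        GroupChange("example-user", "example-user", "example-core"),
        GroupChange("team-lead", "new-reviewer", "example-docs"),
    ]
    for change, reasons in suspicious_changes(log):
        print(change, "->", ", ".join(reasons))
```

Something along these lines, run regularly and reporting to a channel people
actually watch, would not prevent a compromise, but it could shorten the
window between an unexpected membership change and someone noticing it.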
...
_______________________________________________
service-announce mailing list
service-announce@lists.opendev.org
http://lists.opendev.org/cgi-bin/mailman/listinfo/service-announce
-- 
Mohammed Naser
VEXXHOST, Inc.
