[Rust-VMM] [PTG] Meeting Notes

Boeuf, Sebastien sebastien.boeuf at intel.com
Tue May 7 13:43:50 UTC 2019


Hi everyone!

Here are some notes from the PTG meeting we had in Denver:


Licensing
---------

The purpose of dual licensing is to make sure that Apache2 will not
conflict with GPLv2-licensed projects such as QEMU, which could
eventually use rust-vmm. The decision is to move from the dual
MIT+Apache2 proposal to a dual 3-clause BSD+Apache2. 3-clause BSD is
compatible with GPLv2, just as MIT is, but the benefit of 3-clause
BSD over MIT is that it does not conflict with the existing Crosvm
code, which already uses 3-clause BSD.
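
In practice, each crate's Cargo.toml would carry the dual license as
an SPDX expression, along these lines (a sketch with an illustrative
crate name, not merged metadata):

    [package]
    name = "some-rust-vmm-crate"   # illustrative name
    version = "0.1.0"
    license = "Apache-2.0 OR BSD-3-Clause"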

CI
--

We currently have Buildkite running on the kvm-ioctls crate. Buildkite
runs on x86-64 and aarch64. We need some Windows testing to validate
the abstraction patches proposed by the CrowdStrike folks. Cloudbase
will provide the Windows server to run the Windows CI.

There is a proposal to have a dedicated “test” repo in the rust-vmm
organization. This would allow every crate to rely on this common
“test” repo and centralize the tests.
We also talked about creating a “dummy” VMM that would be a superset
VMM, since it would pull in every rust-vmm crate. This VMM would
allow full integration testing, in addition to the unit tests already
running on each crate.

On every pull request, the CI should build against the top of tree of
every crate, since we want to test the latest master version. Because
the CI ensures that a pull request is merged only if it passes, we
can be sure that master will never be broken. This is why we can
safely run integration tests based on the master branch of each crate.

Testing will be slightly different on a “release” pull request,
because such a pull request modifies the Cargo.toml files to make
sure we are testing the right version of every crate before giving
the green light for publishing. See the sketch below.
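
As a sketch of what this could mean in practice (the crate versions
and the exact layout are illustrative), a regular pull request would
test against top-of-tree dependencies, while a release pull request
would pin them:

    [dependencies]
    # Normal pull requests: integration tests build against master.
    kvm-ioctls = { git = "https://github.com/rust-vmm/kvm-ioctls", branch = "master" }
    vm-memory = { git = "https://github.com/rust-vmm/vm-memory", branch = "master" }

    # A "release" pull request would switch these to the versions
    # about to be published, e.g.:
    # kvm-ioctls = "0.1.0"
    # vm-memory = "0.1.0"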

Release/Crate publishing
------------------------

How will the crates.io publishing be done?

“All at once” vs. “each crate can be updated at any time”: because we
want to keep the CI simple, which means it will always test on top of
tree, we chose the “all at once” solution. Releases are cheap, hence
we will bump the version of every crate every time we want to update
one or more of them.

We decided not to have any stable branches on our repositories. The
project is not mature enough to justify the complexity of maintaining
one or more stable branches. With the same idea in mind, we decided
not to have any stable releases on crates.io.

How to publish on crates.io with some human gatekeeping?

We didn’t reach a decision on this question, but here are the two
approaches we discussed:
We would create a bot holding a crates.io key stored on GitHub, with
a set of maintainers who can push the button to let the bot do the
work.
OR
We would have manual publishing from any maintainer, using the
maintainers’ own keys.

The concern regarding the bot is the key that needs to be stored on
GitHub (a security concern, since the key could be stolen).

Crosvm/Firecracker consuming rust-vmm crates
--------------------------------------------

Both projects expect to consume crates directly from crates.io,
because a crate being published there means it is mature enough to be
consumed.

The criteria for a crate to be published on crates.io are to have
proper documentation, tests, … as documented here:
https://github.com/rust-vmm/community/issues/14#issue-408351841

QEMU’s interest in rust-vmm
---------------------------

QEMU could benefit from low-level parts such as the vm-memory crate
to make QEMU core parts more secure.

The other aspect is vhost-user backends: they should be able to
consume rust-vmm crates so they can be implemented in Rust and reused
by any VMM.

vm-sys-utils
------------

The first PR is ready and waiting for more review from the
Firecracker folks. From the Crosvm perspective, the PR is alright,
but internal projects (not only VMM-related ones) are using sys-utils
too. That’s why it’s not straightforward to replace their sys-utils
crate with the vm-sys-utils one: they will have to write a sys-utils
wrapper on top of vm-sys-utils so that the other internal projects
can still consume the right set of utility functions (see the sketch
below).
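
A minimal sketch of what such a wrapper crate could look like (the
helper shown is hypothetical):

    // Internal sys-utils wrapper crate: re-export the shared
    // rust-vmm functionality, and keep the extra helpers that
    // non-VMM internal projects still depend on.
    pub use vm_sys_utils::*;

    /// Hypothetical internal-only helper that doesn't belong in
    /// vm-sys-utils but must stay available to other projects.
    pub fn internal_only_helper() {
        // ...
    }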

Vhost and vhost-user
--------------------

The vhost pull request has been submitted by Gerry and needs to be
reviewed. We didn’t spend time reviewing it during the PTG.

The vhost-user crate needs to implement the protocol for both the
slave and the master. This way, the master can be consumed from the
VMM side, and the slave can be consumed from any vhost-user daemon
(interesting for any VMM that could directly reuse the vhost-user
backend). In particular, there is some ongoing work from Red Hat on
writing a virtiofsd vhost-user daemon in Rust. This should be part of
the rust-vmm project, pulling in the vhost-user protocol.

The remaining question is whether the vhost backends, including
vhost-user (and hence the protocol itself), should live under a
single crate.
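
As a rough illustration of the master/slave split (these traits are
hypothetical, not the actual vhost crate API; the method names mirror
vhost-user protocol requests):

    pub struct Error;

    /// Master side: driven by the VMM, sends protocol requests.
    pub trait VhostUserMaster {
        fn set_features(&mut self, features: u64) -> Result<(), Error>;
        fn set_vring_base(&mut self, queue: u32, base: u32) -> Result<(), Error>;
    }

    /// Slave side: implemented by a backend daemon (e.g. a virtiofsd
    /// written in Rust), handling requests coming from the master.
    pub trait VhostUserSlave {
        fn handle_set_features(&mut self, features: u64) -> Result<(), Error>;
        fn handle_set_vring_base(&mut self, queue: u32, base: u32) -> Result<(), Error>;
    }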

vm-allocator
------------

We discussed using a vm-allocator crate as a helpful component for
deciding on memory ranges (MMIO or PIO) for any memory region related
to a device.

Based on the feedback from Paolo, Alex and Stefan, we need to design
this allocator carefully if we want to be able to support PCI BAR
programming from the firmware or the guest OS. This means we should
be able to handle any sort of PCI reprogramming that updates the
ranges chosen by the vm-allocator, since this is how the PCI spec is
defined (see the sketch below).
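
A minimal sketch of the kind of API this implies, assuming a
hypothetical AddressAllocator type (reprogramming a BAR becomes a
free followed by a guest-chosen allocation):

    use std::collections::BTreeMap;

    /// Hypothetical allocator managing one window (MMIO or PIO).
    pub struct AddressAllocator {
        base: u64,
        end: u64,
        ranges: BTreeMap<u64, u64>, // start -> size, non-overlapping
    }

    impl AddressAllocator {
        pub fn new(base: u64, size: u64) -> Self {
            AddressAllocator { base, end: base + size, ranges: BTreeMap::new() }
        }

        /// Pick a free range; used when the VMM programs a BAR itself.
        pub fn allocate(&mut self, size: u64) -> Option<u64> {
            let mut candidate = self.base;
            for (&start, &len) in self.ranges.iter() {
                if candidate + size <= start {
                    break; // found a gap before this range
                }
                candidate = candidate.max(start + len);
            }
            if candidate + size <= self.end {
                self.ranges.insert(candidate, size);
                Some(candidate)
            } else {
                None
            }
        }

        /// Release a range, e.g. when the firmware or guest is about
        /// to move a BAR somewhere else.
        pub fn free(&mut self, start: u64) {
            self.ranges.remove(&start);
        }

        /// Claim a specific range, e.g. when the guest wrote a new
        /// address into a BAR that the allocator did not choose.
        pub fn allocate_at(&mut self, start: u64, size: u64) -> bool {
            let fits = start >= self.base && start + size <= self.end;
            let overlaps = self
                .ranges
                .iter()
                .any(|(&s, &l)| start < s + l && s < start + size);
            if fits && !overlaps {
                self.ranges.insert(start, size);
                true
            } else {
                false
            }
        }
    }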

Summary of the priorities
-------------------------

We could maintain a list of priorities in an etherpad (or any sort of
public/shared document) and, at the end of each week, send out the
list of hot topics and priorities to make sure everyone in the
community is on the same page.

Code coverage
-------------

Which tool should we use to control code coverage?

Kcov is one alternative, but it seems it does not run on aarch64,
which is a blocker since the project wants to support multiple CPU
architectures. Kcov might be the one picked at first, but we need to
investigate other solutions.

Do we need to gate pull requests based on the coverage result?

The discussion went both ways on this topic, but I think the solution
we agreed upon was to gate based on a code coverage value. An
important point about this value: it is not immutable, and based on
manual review the maintainers can lower it if that makes sense.
For instance, if some new piece of code is added, that does not mean
we have to implement tests just for the sake of keeping the coverage
at the same level. If the maintainers, smart people that they are,
realize it makes no sense to test this new piece of code, then the
decision will be made to reduce the threshold.

Linters
-------

As part of the regular CI running on every pull request, we want to
run multiple linters to maintain good code quality.

Fuzzing
-------

We want some fuzzing on the rust-vmm crates. The question now is to
identify which crates are the most unsafe. For instance, being able
to fuzz the virtqueues (part of the virtio crate) would be very
interesting to validate their proper behavior (see the sketch below).

Also, fuzzing the vhost-user backends, once they are part of the
rust-vmm project, will be a very important task if we want to provide
secure backends for any VMM that could reuse them.
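
A minimal sketch of what a virtqueue fuzz target could look like with
cargo-fuzz and libfuzzer-sys; Virtqueue::parse is a hypothetical
entry point standing in for whatever the virtio crate ends up
exposing:

    // fuzz/fuzz_targets/virtqueue.rs
    #![no_main]
    use libfuzzer_sys::fuzz_target;

    fuzz_target!(|data: &[u8]| {
        // Treat the fuzzer input as guest-controlled virtqueue
        // memory and make sure parsing never panics or overflows.
        let _ = virtio::Virtqueue::parse(data);
    });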

Security process
----------------

At some point, the project will run into nasty bugs that are real
security threats. In order to anticipate the day this happens, we
should define a clear process describing how to limit the impact on
rust-vmm users and how to handle such an issue (quick fix, long-term
plan, etc.).

vmm-vcpu
--------

After a lot of discussion about the feasibility of having a trait for
Vcpu, we came to the conclusion that, without further proof and
justification that the trait would provide any benefit, we should
simply split Hyper-V and KVM into separate packages. The reason is
that we don’t think those two hypervisors have a lot in common, and
it might take more effort to find similarities than to split them
into distinct pieces.

One interesting data point we would like to look at in the context of
this discussion is the work Alessandro has been doing to port
Firecracker to Hyper-V. Being able to look at his code might help us
understand the fundamental differences between Hyper-V and KVM.
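
For reference, a sketch of the kind of common trait that was
discussed and set aside (hypothetical, not a rust-vmm API): each
hypervisor brings its own exit reasons and error types, so a shared
signature quickly reduces to associated types that hide all the
interesting differences:

    pub trait Vcpu {
        type ExitReason; // KVM and Hyper-V exits have little overlap
        type Error;

        /// Run the vCPU until the next VM exit.
        fn run(&mut self) -> Result<Self::ExitReason, Self::Error>;
    }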

vm-memory
---------

Pull request #10 splits out the mmap functionality coming from Linux
and adds support for the Windows mmap equivalent. The code has been
acknowledged by everybody as ready to be merged once the comments
about squashing and reworking the commit message are addressed.
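
A sketch of how such a platform split typically looks in Rust (the
module names are illustrative, not the actual layout of the pull
request):

    // lib.rs
    #[cfg(unix)]
    mod mmap_unix; // mmap(2)-based implementation
    #[cfg(windows)]
    mod mmap_windows; // Windows file-mapping based implementation

    #[cfg(unix)]
    pub use mmap_unix::MmapRegion;
    #[cfg(windows)]
    pub use mmap_windows::MmapRegion;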

vm-device
---------

We discussed the vm-device issue that has been open for some time
now. Some mentioned that it is important to keep the Bus trait
generic so that any VMM can still reuse it, adapting wrappers for
devices if necessary (see the sketch below).
Based on the comments on the issue, it was pretty unclear where
things would go with this crate, which is why we agreed to wait for
the pull request to be submitted before going further into
hypothetical reviews and comments.
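
As an illustration of what “generic” could mean here, a minimal
hypothetical device trait that a bus could dispatch to (not the
actual vm-device proposal):

    pub trait BusDevice {
        /// Handle a read at `offset` within the device's range.
        fn read(&mut self, offset: u64, data: &mut [u8]);
        /// Handle a write at `offset` within the device's range.
        fn write(&mut self, offset: u64, data: &[u8]);
    }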

Samuel will take care of submitting the pull request for this.

Community README about rust-vmm goals
-------------------------------------

We listed the main points we want to mention in the README of the
community repository. Andreea took the action item to write the
documentation describing the goals and motivation behind the project,
based on the defined skeleton.

We also mentioned that having a github.io webpage for the project
would be a better way to promote it. We will need to create a
dedicated repo for that as part of the rust-vmm GitHub organization.
At some point we will need to put some effort into putting this
webpage together, the first step being to duplicate more or less the
content of the README.


Thanks,
Sebastien

