From sstabellini at kernel.org Fri Oct 1 23:58:16 2021
From: sstabellini at kernel.org (Stefano Stabellini)
Date: Fri, 1 Oct 2021 16:58:16 -0700 (PDT)
Subject: [Rust-VMM] [Stratos-dev] Xen Rust VirtIO demos work breakdown for Project Stratos
In-Reply-To:
References: <87pmsylywy.fsf@linaro.org> <874ka68h96.fsf@linaro.org>
Message-ID:

On Tue, 28 Sep 2021, Oleksandr Tyshchenko wrote:
> On Tue, Sep 28, 2021 at 9:26 AM Stefano Stabellini wrote:
>
> Hi Stefano, all
>
> [Sorry for the possible format issues]
>
> > On Mon, 27 Sep 2021, Christopher Clark wrote:
> > > On Mon, Sep 27, 2021 at 3:06 AM Alex Bennée via Stratos-dev wrote:
> > > > Marek Marczykowski-Górecki writes:
> > > > > On Fri, Sep 24, 2021 at 05:02:46PM +0100, Alex Bennée wrote:
> > > > > > Hi,
> > > > >
> > > > > Hi,
> > > > >
> > > > > > 2.1 Stable ABI for foreignmemory mapping to non-dom0 ([STR-57])
> > > > > > ---------------------------------------------------------------
> > > > > >
> > > > > >   Currently the foreign memory mapping support only works for dom0 due
> > > > > >   to reference counting issues. If we are to support backends running in
> > > > > >   their own domains this will need to get fixed.
> > > > > >
> > > > > >   Estimate: 8w
> > > > > >
> > > > > > [STR-57]
> > > > >
> > > > > I'm pretty sure it was discussed before, but I can't find the relevant
> > > > > (part of the) thread right now: does your model assume the backend (running
> > > > > outside of dom0) will gain the ability to map (or access in another way)
> > > > > an _arbitrary_ memory page of a frontend domain? Or worse: any domain?
> > > >
> > > > The aim is for some DomU's to host backends for other DomU's instead of
> > > > all backends being in Dom0. Those backend DomU's would have to be
> > > > considered trusted because as you say the default memory model of VirtIO
> > > > is to have full access to the frontend domain's memory map.
> > >
> > > I share Marek's concern. I believe that there are Xen-based systems that
> > > will want to run guests using VirtIO devices without extending this level
> > > of trust to the backend domains.
> >
> > From a safety perspective, it would be challenging to deploy a system
> > with privileged backends. From a safety perspective, it would be a lot
> > easier if the backend were unprivileged.
> >
> > This is one of those times where safety and security requirements are
> > actually aligned.
>
> Well, the foreign memory mapping has one advantage in the context of the
> Virtio use-case, which is that the Virtio infrastructure in the Guest doesn't
> require any modifications to run on top of Xen. The only issue with foreign
> memory here is that Guest memory is actually mapped without its agreement,
> which doesn't perfectly fit into the security model (although there is one
> more issue with XSA-300, but I think it will go away sooner or later, at
> least there are some attempts to eliminate it). While the ability to map any
> part of Guest memory is not an issue for the backend running in Dom0 (which
> we usually trust), this will certainly violate the Xen security model if we
> want to run it in another domain, so I completely agree with the existing
> concern.

Yep, that's what I was referring to.

> It was discussed before [1], but I couldn't find any decisions regarding
> that. As I understand, one of the possible ideas is to have some entity in
> Xen (PV IOMMU/virtio-iommu/whatever) that works in protection mode, so it
> denies all foreign mapping requests from the backend running in DomU by
> default and only allows requests with mappings which were *implicitly*
> granted by the Guest before. For example, Xen could be informed which MMIOs
> hold the queue PFN and notify registers (as it traps the accesses to these
> registers anyway) and could theoretically parse the frontend request and
> retrieve descriptors to decide which GFNs are actually *allowed*.
>
> I can't say for sure (sorry, not familiar enough with the topic), but by
> implementing the virtio-iommu device in Xen we could probably avoid Guest
> modifications altogether. Of course, for this to work the Virtio
> infrastructure in the Guest should use the DMA API as mentioned in [1].
>
> Would the "restricted foreign mapping" solution retain the Xen security
> model and be accepted by the Xen community? I wonder, has someone already
> looked in this direction, are there any pitfalls here or is this even
> feasible?
>
> [1] https://lore.kernel.org/xen-devel/464e91ec-2b53-2338-43c7-a018087fc7f6@arm.com/

The discussion that went further is actually one based on the idea that
there is a pre-shared memory area and the frontend always passes
addresses from it. For ease of implementation, the pre-shared area is
the virtqueue itself so this approach has been called "fat virtqueue".
But it requires guest modifications and it probably results in
additional memory copies.

I am not sure if the approach you mentioned could be implemented
completely without frontend changes. It looks like Xen would have to
learn how to inspect virtqueues in order to verify implicit grants
without frontend changes. With or without guest modifications, I am not
aware of anyone doing research and development on this approach.
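
A minimal sketch of the check a backend could apply under the pre-shared area ("fat virtqueue") idea above, assuming the area is a single contiguous range of guest physical addresses. The type PreSharedRegion and its methods are hypothetical names used for illustration only; they are not part of Xen, virtio or rust-vmm. The sketch only shows that every address taken from a descriptor would be validated against the area before the backend touches it.

```rust
/// Hypothetical description of a pre-shared memory area in guest physical
/// address space. Illustrative only; not a Xen, virtio or rust-vmm type.
#[derive(Debug, Clone, Copy)]
struct PreSharedRegion {
    base: u64,
    len: u64,
}

impl PreSharedRegion {
    /// True only if the non-empty range [addr, addr + len) lies entirely
    /// inside the region. Checked arithmetic keeps a huge `len` from
    /// wrapping around and slipping past the bounds test.
    fn contains(&self, addr: u64, len: u64) -> bool {
        match (addr.checked_add(len), self.base.checked_add(self.len)) {
            (Some(end), Some(region_end)) => {
                len > 0 && addr >= self.base && end <= region_end
            }
            _ => false,
        }
    }

    /// Offset of `addr` inside the backend's own mapping of the region,
    /// or None if the buffer is not fully covered by the region.
    fn offset_of(&self, addr: u64, len: u64) -> Option<usize> {
        if self.contains(addr, len) {
            Some((addr - self.base) as usize)
        } else {
            None
        }
    }
}
```

With a 1 MiB region at 0x4000_0000, offset_of(0x4000_1000, 4096) gives Some(0x1000), while a buffer that starts outside the area or runs past its end gives None, so the backend would never need to map arbitrary guest pages.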

From olekstysh at gmail.com Sat Oct 2 17:55:28 2021
From: olekstysh at gmail.com (Oleksandr Tyshchenko)
Date: Sat, 2 Oct 2021 20:55:28 +0300
Subject: [Rust-VMM] [Stratos-dev] Xen Rust VirtIO demos work breakdown for Project Stratos
In-Reply-To:
References: <87pmsylywy.fsf@linaro.org> <874ka68h96.fsf@linaro.org>
Message-ID:

On Sat, Oct 2, 2021 at 2:58 AM Stefano Stabellini wrote:

Hi Stefano, all

[Sorry for the possible format issues]
[I have CCed Julien]

> The discussion that went further is actually one based on the idea that
> there is a pre-shared memory area and the frontend always passes
> addresses from it. For ease of implementation, the pre-shared area is
> the virtqueue itself so this approach has been called "fat virtqueue".
> But it requires guest modifications and it probably results in
> additional memory copies.

I got it. Although we would need to map that pre-shared area anyway (I
presume it could be done once during initialization), I think it is much
better than mapping arbitrary pages at runtime. If there is a way for Xen
to know the pre-shared area location in advance, it will be able to allow
mapping of this region only and deny other attempts.

> I am not sure if the approach you mentioned could be implemented
> completely without frontend changes. It looks like Xen would have to
> learn how to inspect virtqueues in order to verify implicit grants
> without frontend changes.

I looked through the virtio-iommu specification and the corresponding Linux
driver, but I am sure I don't see all the challenges and pitfalls. Having
limited knowledge of the IOMMU infrastructure in Linux, below is just my
guess, which might be wrong.

1. I think that if we want to avoid frontend changes, the backend in Xen
   would need to fully conform to the specification. I am afraid that besides
   just inspecting virtqueues, the backend needs to properly and completely
   emulate the virtio device, handle shadow page tables, etc. Otherwise we
   might break the guest. I expect a huge amount of work to implement this
   properly.

2. Also, if I understood things correctly, it looks like when virtio-iommu is
   enabled, all addresses passed in requests to the virtio devices behind the
   virtio-iommu will be in an I/O virtual address space (IOVA). So we would
   need to find a way for userspace (if the backend is an IOREQ server) to
   translate them to guest physical addresses (IPA) via these shadow page
   tables in the backend before mapping them via foreign memory map calls.
   So I expect Xen, toolstack and Linux privcmd driver changes and additional
   complexity, taking into account how the data structures could be accessed
   (data structures that are contiguous in IOVA could be discontiguous in
   IPA, indirect table descriptors, etc.).
   I am wondering, would it be possible to have an identity IOMMU mapping
   (IOVA == GPA) on the guest side but without bypassing the IOMMU, as we
   need the virtio-iommu frontend to send map/unmap requests? Can we control
   this behaviour somehow? I think this would simplify things.

3. Also, we would probably want to have a single virtio-iommu device instance
   per guest, so that all virtio devices which belong to this guest share the
   IOMMU mapping for optimization purposes. For this to work, all virtio
   devices inside a guest should be attached to the same IOMMU domain.
   Probably we could control that, but I am not 100% sure.

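The translation concern in point 2 can be illustrated with a small Rust sketch, assuming the backend keeps its own table of the mappings the guest has set up. IovaSpace, Mapping and translate are hypothetical names and not an existing Xen, Linux or rust-vmm interface, and the table does not check for overlapping mappings. The sketch shows how one buffer that is contiguous in IOVA can come back as several discontiguous IPA chunks.

```rust
use std::collections::BTreeMap;

/// One recorded mapping: `len` bytes starting at some IOVA map to `ipa`.
#[derive(Debug, Clone, Copy)]
struct Mapping {
    ipa: u64,
    len: u64,
}

/// Hypothetical backend-side shadow of the guest's IOMMU mappings,
/// keyed by the IOVA at which each mapping starts.
#[derive(Default)]
struct IovaSpace {
    maps: BTreeMap<u64, Mapping>,
}

impl IovaSpace {
    /// Record a mapping (overlap checking is omitted in this sketch).
    fn map(&mut self, iova: u64, ipa: u64, len: u64) {
        self.maps.insert(iova, Mapping { ipa, len });
    }

    /// Translate [iova, iova + len) into a list of (ipa, len) chunks.
    /// Returns None as soon as any byte of the range is unmapped.
    fn translate(&self, mut iova: u64, mut len: u64) -> Option<Vec<(u64, u64)>> {
        let mut chunks = Vec::new();
        while len > 0 {
            // Mapping with the greatest start address that is <= iova.
            let (&start, m) = self.maps.range(..=iova).next_back()?;
            let off = iova - start;
            if off >= m.len {
                return None; // iova falls into a hole after this mapping
            }
            let n = (m.len - off).min(len);
            chunks.push((m.ipa.checked_add(off)?, n));
            len -= n;
            if len > 0 {
                iova = iova.checked_add(n)?;
            }
        }
        Some(chunks)
    }
}
```

Each returned chunk would then need its own foreign mapping call, which is part of the extra complexity mentioned above. With an identity mapping (IOVA == GPA) every lookup would return the input address unchanged.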
> With or without guest modifications, I am not
> aware of anyone doing research and development on this approach.

--
Regards,

Oleksandr Tyshchenko

From fandree at amazon.com Mon Oct 4 10:33:51 2021
From: fandree at amazon.com (Florescu, Andreea)
Date: Mon, 4 Oct 2021 10:33:51 +0000
Subject: [Rust-VMM] rust-vmm sync meeting
Message-ID: <133626ae2f1540739cd13c116491cb63@EX13D10EUB003.ant.amazon.com>

Update: refreshing the series.

Meeting Agenda: https://etherpad.opendev.org/p/rust-vmm-sync-2021

==============Conference Bridge Information==============
You have been invited to an online meeting, powered by Amazon Chime.

Chime meeting ID: 6592165432

Join via Chime clients (manually): Select 'Meetings > Join a Meeting', and enter 6592165432
Join via Chime clients (auto-call): If you invite auto-call as attendee, Chime will call you when the meeting starts, select 'Answer'
Join via browser screen share: https://chime.aws/6592165432
Join via phone (US): +1-929-432-4463,,,6592165432#
Join via phone (US toll-free): +1-855-552-4463,,,6592165432#
International dial-in: https://chime.aws/dialinnumbers/
In-room video system: Ext: 62000, Meeting PIN: 6592165432#
=================================================

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/calendar
Size: 9671 bytes
Desc: not available
URL:

From sstabellini at kernel.org Mon Oct 4 21:53:26 2021
From: sstabellini at kernel.org (Stefano Stabellini)
Date: Mon, 4 Oct 2021 14:53:26 -0700 (PDT)
Subject: [Rust-VMM] [Stratos-dev] Xen Rust VirtIO demos work breakdown for Project Stratos
In-Reply-To:
References: <87pmsylywy.fsf@linaro.org> <874ka68h96.fsf@linaro.org>
Message-ID:

On Sat, 2 Oct 2021, Oleksandr Tyshchenko wrote:
> > The discussion that went further is actually one based on the idea that
> > there is a pre-shared memory area and the frontend always passes
> > addresses from it. For ease of implementation, the pre-shared area is
> > the virtqueue itself so this approach has been called "fat virtqueue".
> > But it requires guest modifications and it probably results in
> > additional memory copies.
>
> I got it. Although we would need to map that pre-shared area anyway (I
> presume it could be done once during initialization), I think it is much
> better than mapping arbitrary pages at runtime.

Yeah that's the idea

> If there is a way for Xen to know the pre-shared area location in advance,
> it will be able to allow mapping of this region only and deny other
> attempts.

No, but there are patches (not yet upstream) to introduce a way to
pre-share memory regions between VMs using xl:
https://github.com/Xilinx/xen/commits/xilinx/release-2021.1?after=4bd2da58b5b008f77429007a307b658db9c0f636+104&branch=xilinx%2Frelease-2021.1

So I think it would probably be the other way around: xen/libxl
advertises on device tree (or ACPI) the presence of the pre-shared
regions to both domains. Then frontend and backend would start using it.

> > I am not sure if the approach you mentioned could be implemented
> > completely without frontend changes. It looks like Xen would have to
> > learn how to inspect virtqueues in order to verify implicit grants
> > without frontend changes.
>
> I looked through the virtio-iommu specification and the corresponding Linux
> driver, but I am sure I don't see all the challenges and pitfalls. Having
> limited knowledge of the IOMMU infrastructure in Linux, below is just my
> guess, which might be wrong.
>
> 1. I think that if we want to avoid frontend changes, the backend in Xen
>    would need to fully conform to the specification. I am afraid that
>    besides just inspecting virtqueues, the backend needs to properly and
>    completely emulate the virtio device, handle shadow page tables, etc.
>    Otherwise we might break the guest. I expect a huge amount of work to
>    implement this properly.

Yeah, I think we would want to stay away from shadow pagetables unless
we are really forced to go there.

> 2. Also, if I understood things correctly, it looks like when virtio-iommu
>    is enabled, all addresses passed in requests to the virtio devices behind
>    the virtio-iommu will be in an I/O virtual address space (IOVA). So we
>    would need to find a way for userspace (if the backend is an IOREQ
>    server) to translate them to guest physical addresses (IPA) via these
>    shadow page tables in the backend before mapping them via foreign memory
>    map calls. So I expect Xen, toolstack and Linux privcmd driver changes
>    and additional complexity, taking into account how the data structures
>    could be accessed (data structures that are contiguous in IOVA could be
>    discontiguous in IPA, indirect table descriptors, etc.).
>    I am wondering, would it be possible to have an identity IOMMU mapping
>    (IOVA == GPA) on the guest side but without bypassing the IOMMU, as we
>    need the virtio-iommu frontend to send map/unmap requests? Can we control
>    this behaviour somehow? I think this would simplify things.

None of the above looks easy. I think you are right that we would need
IOVA == GPA to make the implementation feasible and with decent
performance. But if we need a spec change, then I think Juergen's
proposal of introducing a new transport that uses grant table references
instead of GPAs is worth considering.
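
For illustration, the deny-by-default "restricted foreign mapping" idea discussed in this thread, combined with the IOVA == GPA assumption, could be sketched in Rust as below. This is only a sketch under stated assumptions: GrantedFrames, its methods, the frames helper and the 4 KiB page size are all hypothetical and do not correspond to any existing Xen or virtio-iommu code. The guest's map/unmap requests fill an allow-list of frame numbers, and a foreign mapping request from the backend would succeed only for frames on that list.

```rust
use std::collections::HashSet;
use std::ops::Range;

/// Assumed page size of 4 KiB for this sketch.
const PAGE_SHIFT: u64 = 12;

/// Frame numbers touched by the byte range [gpa, gpa + len).
fn frames(gpa: u64, len: u64) -> Range<u64> {
    if len == 0 {
        return 0..0;
    }
    let first = gpa >> PAGE_SHIFT;
    let last = gpa.saturating_add(len - 1) >> PAGE_SHIFT;
    first..last + 1
}

/// Hypothetical per-frontend record of the frames the guest has exposed
/// through map/unmap requests, with IOVA == GPA assumed.
#[derive(Default)]
struct GrantedFrames {
    gfns: HashSet<u64>,
}

impl GrantedFrames {
    /// Guest asked to map [gpa, gpa + len): add its frames to the allow-list.
    fn map(&mut self, gpa: u64, len: u64) {
        self.gfns.extend(frames(gpa, len));
    }

    /// Guest asked to unmap [gpa, gpa + len): revoke those frames again.
    fn unmap(&mut self, gpa: u64, len: u64) {
        for gfn in frames(gpa, len) {
            self.gfns.remove(&gfn);
        }
    }

    /// Deny by default: a backend request for `gfn` is allowed only if the
    /// guest has exposed that frame and not revoked it since.
    fn may_map(&self, gfn: u64) -> bool {
        self.gfns.contains(&gfn)
    }
}
```

The sketch only works if the guest's map/unmap requests actually reach whoever maintains the list, which is why the thread asks for IOVA == GPA without bypassing the IOMMU.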

From olekstysh at gmail.com Wed Oct 6 16:43:03 2021
From: olekstysh at gmail.com (Oleksandr)
Date: Wed, 6 Oct 2021 19:43:03 +0300
Subject: [Rust-VMM] [Stratos-dev] Xen Rust VirtIO demos work breakdown for Project Stratos
In-Reply-To:
References: <87pmsylywy.fsf@linaro.org> <874ka68h96.fsf@linaro.org>
Message-ID: <1d6382b6-ddf8-494c-4f7b-afc50a4269a4@gmail.com>

On 05.10.21 00:53, Stefano Stabellini wrote:

Hi Stefano, all

>> If there is a way for Xen to know the pre-shared area location in advance,
>> it will be able to allow mapping of this region only and deny other
>> attempts.
>
> No, but there are patches (not yet upstream) to introduce a way to
> pre-share memory regions between VMs using xl:
> https://github.com/Xilinx/xen/commits/xilinx/release-2021.1?after=4bd2da58b5b008f77429007a307b658db9c0f636+104&branch=xilinx%2Frelease-2021.1
>
> So I think it would probably be the other way around: xen/libxl
> advertises on device tree (or ACPI) the presence of the pre-shared
> regions to both domains. Then frontend and backend would start using it.

Thank you for the explanation. I remember this series has already appeared
on the ML. If I understood the idea correctly, this way we won't need to map
the foreign memory from the backend at all (I assume this eliminates the
security concern?). It looks like every pre-shared region (described in the
config file) is mapped by the toolstack at domain creation time and the
details of this region are also written to Xenstore. All the backend needs
to do is to map the region into its address space (via mmap). For this to
work the guest should allocate the virtqueue from Xen-specific reserved
memory [1].

[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/reserved-memory/xen%2Cshared-memory.txt

>> I am wondering, would it be possible to have an identity IOMMU mapping
>> (IOVA == GPA) on the guest side but without bypassing the IOMMU, as we
>> need the virtio-iommu frontend to send map/unmap requests? Can we control
>> this behaviour somehow? I think this would simplify things.
>
> None of the above looks easy. I think you are right that we would need
> IOVA == GPA to make the implementation feasible and with decent
> performance.

Yes. Otherwise, I am afraid, the implementation is going to be quite
difficult with questionable performance in the end.

I found out that an IOMMU domain in Linux can be identity mapped
(IOMMU_DOMAIN_IDENTITY - DMA addresses are system physical addresses) and
this can be controlled via the kernel command line. I admit I didn't test
it, but from the IOMMU framework code it looks like the driver's map/unmap
callbacks won't be called in this mode and as a result the IOMMU mappings
never reach the backend. Unfortunately, this is not what we want, as we
won't have any understanding of what the GFNs are...

> But if we need a spec change, then I think Juergen's
> proposal of introducing a new transport that uses grant table references
> instead of GPAs is worth considering.

Agree, if the spec changes cannot be avoided then yes.

--
Regards,

Oleksandr Tyshchenko