[Openinfralabs] Project Caerus Code Release

Mon May 10 18:33:05 UTC 2021

Dear all,

As you may know, the Caerus project at Open Infra Labs investigates
techniques such as near-data processing and semantic caching to optimize
the performance of disaggregated data lakes. I am pleased to announce that
an initial version of the project code is now available in the Open Infra
Labs repo (https://github.com/open-infrastructure-labs/caerus). The initial
code base enables Spark SQL operations to be pushed down to the HDFS data
nodes. Attached is a preliminary evaluation of the work using the TPC-H
benchmark. You are all welcome to check out the code and, better yet,
contribute to
the work. We will be using the project wiki
(https://github.com/open-infrastructure-labs/caerus/wiki) for discussions
and questions.

- Hui Lei
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Caerus NDP evaluation with Spark and HDFS - May 2021.pdf
Type: application/pdf
Size: 147719 bytes
Desc: not available
URL: <http://lists.opendev.org/pipermail/openinfralabs/attachments/20210510/4568c82f/attachment-0001.pdf>