<div dir="ltr">Marcel,<div><br></div><div>This item is already on our backlog. We just haven't been able to get to it. We certainly welcome other community members to join us and help accelerate this item.<br></div><div><br></div><div>Hui</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 5, 2021 at 9:04 AM Marcel Hild <<a href="mailto:mhild@redhat.com">mhild@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_default" style="font-size:small">Hey Hui,</div><div class="gmail_default" style="font-size:small">thanks for the update.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Have you considered deploying a running version of your POC to the Operate First Community Cloud at <a href="https://www.operate-first.cloud/" target="_blank">https://www.operate-first.cloud/</a> ?</div><div class="gmail_default" style="font-size:small">There's also a Spark cluster available, along with Jupyter notebooks. </div><div class="gmail_default" style="font-size:small">If the community could try out your work, that might increase awareness.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 4, 2021 at 5:02 AM Hui Lei <<a href="mailto:dr.huilei@gmail.com" target="_blank">dr.huilei@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Dear all,<div><br></div><div>I would like to take this opportunity to give you another update on Project Caerus. As you may remember, the project develops techniques such as near-data processing and semantic caching to optimize the performance of disaggregated data lakes. 
On the near-data processing front, we have implemented the pushdown of a wide range of SQL operators from a Spark cluster to a storage cluster that deploys either HDFS (CSV format) or S3. Our evaluation using TPC-H has shown significant improvements in application latency, network I/O and compute-side CPU time. You can check out our <a href="https://github.com/open-infrastructure-labs/caerus-dike/blob/master/doc/ndp_design.pdf" target="_blank">design document</a> and latest <a href="https://github.com/open-infrastructure-labs/caerus-dike/blob/master/doc/s3_hdfs_results_6_1_2021.pdf" target="_blank">evaluation results</a> on GitHub.</div><div><br></div><div>On the semantic caching front, where we explore opportunistic caching of a variety of data and metadata, we have the core functionality working, with a 4x-5x improvement in execution time and CPU time. Again, the <a href="https://github.com/open-infrastructure-labs/caerus-semantic-cache/blob/master/Design.docx" target="_blank">design document</a> and the <a href="https://github.com/open-infrastructure-labs/caerus-semantic-cache/blob/master/Evaluation.docx" target="_blank">initial evaluation results</a> are available on GitHub.</div><div><br></div><div>As always, your comments and contributions are welcome.</div><div><br></div><div>- Hui</div></div>

_______________________________________________<br>

Openinfralabs mailing list<br>

<a href="mailto:Openinfralabs@lists.opendev.org" target="_blank">Openinfralabs@lists.opendev.org</a><br>

<a href="http://lists.opendev.org/cgi-bin/mailman/listinfo/openinfralabs" rel="noreferrer" target="_blank">http://lists.opendev.org/cgi-bin/mailman/listinfo/openinfralabs</a><br>

</blockquote></div>

</blockquote></div>