Gradient Flow #36: Model Monitoring, Hydrofoils, Data Portability

This edition has 372 words which will take you about 2 minutes to read.

“Preferences are optional and subject to constraints, whereas constraints are neither optional nor subject to preferences.” - Marko Papic

Data Exchange podcast


Upcoming Free Virtual Event

As the external co-chair for the Ray Summit, I’m excited about the outstanding program we’ve put together for developers, machine learning practitioners, data scientists, DevOps professionals, and architects. See you online in a few weeks!


Data & Machine Learning Tools and Infrastructure

  • Model Monitoring Enables Robust Machine Learning Applications: Paco Nathan and I detail key challenges in monitoring ML models and outline the key components of a model monitoring platform. This is a very active area, with many startups rolling out new offerings. We believe that companies will gravitate towards holistic MLOps platforms that include model monitoring, as opposed to stitching together disparate components.

  • Introducing Delta Live Tables: Through a combination of declarative pipeline development, improved data reliability, and cloud-scale production operations, Delta Live Tables (DLT) makes the ETL lifecycle easier. Data engineers will be able to leverage existing data pipelines by building production ETL pipelines while writing only SQL queries.

  • Greykite, LinkedIn’s new open source library for time series forecasting: I’ve been experimenting with Greykite (paper, code) and I love its speed and flexibility. This is a relatively new release and the documentation can be somewhat overwhelming, but if you invest time in learning it, I believe you’ll end up using this library in production. At the very least, you should add it to your toolbox alongside more mature options like Prophet.

  • A gentle introduction to knowledge graphs, with sample use cases from search, data integration, and AI.

  • immudb → Blockchain Concepts ∪ SQL: A new TimeTravel feature allows you to run queries across your data’s change history.

  • Ray Clusters provide users with a serverless experience: Ray Clusters can automatically scale up and down based on an application’s resource demands while maximizing utilization and minimizing costs.


Funding Updates

[Image: Berlin from pxhere.]

Recommendations


Closing Short:


If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.


Ben Lorica edits the Gradient Flow newsletter. He is co-chair of the Ray Summit, external chair of the NLP Summit, and host of the Data Exchange podcast. You can follow him on Twitter @BigData. This newsletter is produced by Gradient Flow.
