Designing Machine Learning Systems

Review of Chip Huyen's book on ML systems design

Chip Huyen is an all-around badass who got into machine learning after writing books about her adventures travelling solo around the world after high school. She's now writing technical books, and I recently read her newest one, Designing Machine Learning Systems, with great joy. If I could summarise it in one sentence, it would be: most ML work is engineering, not research, so you should learn how to productionise your models.

It's hard to disagree with that when it's reported that around 80% of data science projects never get productionised. I believe this number is high because most data scientists have a more academic background than the typical techie. They come from a variety of fields, and what they have in common is experience in quantitative research coupled with little coding experience outside of Jupyter notebooks. I wish I had read this book before starting my first job. It describes so many engineering challenges I learnt the hard way, and knowing about them upfront would have saved me a lot of time and frustration.

The chapters on preparing training data will likely be familiar to most data scientists, while the ones on online monitoring and deployment perhaps not so much. Some of the more advanced topics are likely not applicable to a randomly sampled data scientist, but are interesting to think about. Model compression techniques and doing ML on edge devices are two examples: both have pretty uncommon use cases, yet they introduce additional complexities that make system design more interesting.

It does feel, though, that generative language models have turned everything upside down, as they introduced many concepts (prompt engineering, different evaluation techniques, limited control over the model itself, etc.) that are not applicable to classical machine learning methods. It's hard to write a book on something that changes every month, but once the technology matures I hope Chip will revisit the topic.

What this book did for me was make me excited to think about machine learning systems design instead of seeing productionisation as the boring part. That's a nice mindset shift, and I see myself coming back to the book in the future.

Thanks Chip, please continue being awesome.