Published in Towards Data Science
Machine Learning Orchestration vs MLOps
There is a saying that I’ve heard from the ML engineers that I’ve worked with that “most of machine learning operations (MLOps) is just…
Jun 5, 2023

Published in Apache Airflow
How to Best Use DuckDB with Apache Airflow
DuckDB has been making waves in the data market as a robust and fast OLAP database. It’s a replacement for SQLite for analytics heavy…
Apr 14, 2023

Published in Apache Airflow
8 Things I Wish I Knew About Airflow Before I Started Orchestrating Machine Learning Workflows
Apache Airflow is a ubiquitous tool used for orchestrating data pipelines and workflows in many organizations. Its rich ecosystem of…
Apr 11, 2023

Published in Apache Airflow
Passing Data Between Tasks with the KubernetesPodOperator in Apache Airflow
TL;DR: Use the @task.kubernetes decorator!
Mar 21, 2023

Published in Dev Genius
Sentiment Analysis of the Simpson with Apache Spark
Or, who is the happiest Simpsons Character?
Mar 17, 2023

Adding Tables to Medium Stories
I’ve been looking for a good way to add tables to a Medium story for a while. There are a couple of options I found while searching for…
Oct 13, 2020

Making Better Business Decisions with Machine Learning
A version of this originally appeared on cloudera.com
Mar 19, 2020

Putting Machine Learning Models into Production
A version of this originally appeared on cloudera.com
Jun 17, 2019

D3 v4 in Jupyter Notebooks
I have recently started doing some data visualization work with Jupyter Notebooks. The specific requirement has been getting data from…
Feb 9, 2018

The Hard Problem of Data Analytics in Africa
One of the attributes of really good technology is that it hides the complexity of what goes on in the background, while still being…
Oct 19, 2016