

ODSC West virtual conference graphic

I recently had the opportunity to attend the Open Data Science Conference (ODSC) West, one of the premier conferences on data science and the largest applied international data science conference since 2015. Focus areas include open source programming languages (R, Python, Julia, Scala, etc.), the latest Machine Learning (ML) techniques, Predictive Analytics, Deep Learning and Neural Networks, DataOps (machine learning pipelines for production applications), Natural Language Processing (NLP), Data Visualization, Artificial Intelligence business use cases, computer vision, and voice recognition. ODSC touches upon virtually all things data.

With so many topics and presenters, you really have to prioritize the sessions you most need to attend live and trust that any sessions you miss will have been recorded. Because of the shift to virtual conferences, most material is readily available after the conference. A great added benefit of this data science conference is that most presenters post their code and material on GitHub, so it is very easy to go back and catch anything you missed. Slack was a great tool for tracking all the presentations and really allowed me to make the most of the conference. Slack with GitHub is a powerful combination!

An example graphic of Slack and GitHub used together
Slack and GitHub together

My intent for this conference was to focus on topics relevant to several professional projects involving Natural Language Processing (NLP) and serverless computing (AWS Lambda), as well as my current school research in Personalized Medicine. My starting point was to look for anything to advance my R/Python skills, new natural language processing techniques, an overview of Bayesian statistics, and healthcare Artificial Intelligence (AI) use cases for personalized medicine. I was very happy to find a robust selection at ODSC West, including:

Health AI: What's Possible Now and What's Hard (Suchi Saria - Johns Hopkins University)

Hands-on Reinforcement Learning with Ray RLlib (Paco Nathan - Derwen, Inc)

Modern Machine Learning in R (Jared Lander - Lander Analytics/Columbia Business School)

Deep Learning for NLP with PyTorch (Ravi Ilango - Stealth Startup)

The State of Serverless and Applications to AI (Joe Hellerstein - Trifacta)

The Bayesians are Coming to Time Series (Aric LaBarr - NC State)

Bayesian Statistics Made Simple (Allen Downey - Olin College)

I took away several highlights from the conference, to name a few:

- Patient data and digital health apps. Connecting patient health apps to electronic health records and physician notes is a current technological challenge. Some researchers are using AI with screenshots from digital health apps to train and develop healthcare-related models.

- Reinforcement Learning. I have participated in a lot of discussions about whether to choose simulation or optimization for various business challenges. Optimization is not very flexible, so I have typically leaned more towards simulation. Reinforcement learning combines these two worlds, and I am very excited to see where this technique can go and how it can be applied.

- Serverless Computing. I was not able to watch the original presentation by Joe Hellerstein (Trifacta) on the State of Serverless and Applications to AI, but after downloading the material, I saw it was an excellent overview of AWS Lambda, functional computing, its strengths, limitations, and challengers, and of where this technology is going.

I highly recommend this conference to both aspiring and seasoned data science professionals. It is good for sharpening existing skills, learning new ones, and tracking the latest and greatest tools in this space. It was very well done in the virtual format, which enabled me to see more material than I could have in person. That being said, I definitely look forward to when we can all connect live at the next conference.

The list of all the presentations and keynotes is here:

In most cases, you can check out their GitHub repositories for presentations. Enjoy!


Jerome Dixon is a Senior Operations Research Analyst at CANA Advisors and can be reached through his LinkedIn profile and via his email at

