PyData Miami 2022

09:30
30min
Registration / Coffee
Main Room
10:00
15min
Opening Remarks
Main Room
10:15
30min
How to Enhance Your Machine Learning Models with Genetic Algorithms
Eyal Wiransky

Main Room
10:50
30min
State-of-the-art Text Mining with Spark NLP
David Talby

Main Room
11:25
30min
Finding new drugs with AI and Reinforcement Learning
Aleksandra Kalisz

Main Room
12:00
60min
Lunch
Main Room
13:00
30min
Panel Discussion - Lessons in Diversity and Leadership: Driving Positive Change
Brittany Fox, Noelle Silver Russell, Mark Moyou, Colleen Farrelly

Panel Discussion - Diversity in Data Science

Main Room
13:35
35min
NLP: Challenges and Opportunities in the Developing World
Colleen Farrelly

This talk gives an overview of how NLP is being used in research and industry to preserve at-risk languages, power technologies that solve pressing problems (like employment matching), and create culturally attuned NLP tools (like sentiment analysis). Current challenges include data ownership and local populations' rights to their data. Examples come from partnerships in Sub-Saharan Africa, but they apply to other regions of the world as well.

Main Room
14:10
30min
Responsible AI at Scale
Noelle Silver Russell

Main Room
14:45
15min
Break
Main Room
15:00
55min
Speed up your Machine Learning Applications with Lightning AI
William Falcon

Main Room
15:55
30min
Sailing through AIS data using MovingPandas
Ray Bell

MovingPandas is an open-source Python library for working with trajectory data. It is an extremely useful tool for working with AIS data, which represents the location of vessels. In this talk I'll present the methods and algorithms implemented in MovingPandas and discuss the insights they can provide for shipping companies, port operators and government agencies.

Main Room
16:30
10min
Break
Main Room
16:40
30min
ML without the Ops: Running Experiments at Scale with Ploomber on AWS Batch
Ido Michael, Eduardo Blancas

In this talk we'll see how to easily run your code at scale with Docker and AWS Batch. We'll cover how to start scaling your workloads once the laptop isn't enough and how open source can help you achieve that. ML involves training at scale to get the best performance out of the model, and at times it requires heavy-duty GPUs, which in turn require infrastructure work, security, permissions and operations. We'll cover the steps to deploy it without the Ops, letting you, as a data scientist, focus on the important task: getting the most out of the models!

Main Room
17:15
30min
Leveraging open source inference servers for standardizing model deployments on CPUs and GPUs
Mark Moyou

In your current production environment/data science practice, do you have applications with low-latency constraints? Do you have multiple team members working in different frameworks and deploying across multiple different inference servers? Are you still using Flask to deploy your models? If your answer is yes, then you can leverage the capabilities of open source inference servers to standardize your model deployment on both CPUs and GPUs across different frameworks. When deploying more complex ensemble models while still maintaining low latency, post-training model optimization is a key factor. We will also look at how to do model ensembling across multiple framework backends (PyTorch, TensorFlow, Python), along with running multiple model copies using a single inference server instance.

Main Room
17:50
30min
Enterprise Semantic Search with Python Large Language Models
Nelson Correa

Enterprise Search is a key use case in big data and business computing. In this talk we introduce Enterprise Semantic Search with Large Language Models (LLMs), and present a working demonstration in the financial domain. Semantic search is search based on meaning representations, instead of literal document and query keywords. We use the recent HuggingFace transformers library, together with related Python libraries (TensorFlow, sklearn and UMAP) for NLP and deep learning. Approaches, data visualization, metrics and datasets for search system evaluation are introduced. The talk will be of interest to developers working on text search and new unstructured data applications. Slides and a demo notebook will be available at the time of PyData Miami 2022.

Main Room
18:30
10min
Concluding Remarks, Logistics
Main Room