Cash App Gains Flexibility in Machine Learning Workflows with Prefect

July 16, 2024
Wendy Tang
Machine Learning Engineer, Cash App
Nick Acosta
“Fraud is a high-stakes game that takes constant work to stay ahead of bad actors. Working on it at Cash App means constantly striving to be at the cutting edge of the field.” - Isaac Tamblyn, Cash App

Cash App employs teams of financial experts who are constantly on the lookout for new transaction patterns from bad actors, but identifying this behavior is only the first step in preventing fraud. These experts collaborate with Machine Learning Engineers on state-of-the-art infrastructure to deploy models that combat new fraud patterns as quickly as they emerge.

Wendy Tang is part of the ML Tools and Training Team at Cash App, which is responsible for building and maintaining this infrastructure. Her team started their orchestration journey a few years ago, using Airflow to construct ETL pipelines. They found Airflow to be a capable option for running SQL queries that moved data from one place to another or performed transformations, but they soon ran into several limitations that kept them from achieving the flexibility and scalability needed to get new fraud prevention models up and running quickly. As Wendy mentioned in her Prefect Summit session,

“Airflow was no longer a viable option for Machine Learning workflows.”

Moving beyond ETL to orchestrate Machine Learning workflows

Whether or not your data team is responsible for mitigating fraud, the demands placed on it are likely increasing. Here at Prefect, we have a unique point of view into the changing landscape of data infrastructure. Just a few years ago, the data warehouse was the center of a data team’s universe. Now many see that field becoming commoditized, and as ETL pipelines become less of a differentiator and source of value, teams need more from an orchestration tool to address new areas of value. The Cash App platform team wanted more than Airflow could offer, and found its limitations appearing in the following areas:

Cash App’s ML workflow platform needs

☁️ Heterogeneous compute

Compute needs can vary significantly between Machine Learning models and at various stages of a model’s life. A one-worker-fits-all approach to orchestration can result in very large and very small tasks using the same compute. Prefect’s deployments can run across configurable work pools in a wide variety of container, AI, and even Prefect-managed infrastructure options.
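As a rough illustration of avoiding a one-worker-fits-all setup, the sketch below routes each stage of a model's life to a differently sized work pool. The pool names (`gpu-pool`, `highmem-pool`, `default-pool`), the `StageProfile` type, the 16 GB threshold, and the example stages are all hypothetical and not taken from Cash App's platform; the chosen name would then be supplied as the work pool when the corresponding flow is deployed.

```python
from dataclasses import dataclass


@dataclass
class StageProfile:
    """Hypothetical description of one stage's compute needs."""
    name: str
    memory_gb: int
    needs_gpu: bool = False


def choose_work_pool(stage: StageProfile) -> str:
    """Map a stage to a work pool name (all names are illustrative)."""
    if stage.needs_gpu:
        return "gpu-pool"
    if stage.memory_gb > 16:
        return "highmem-pool"
    return "default-pool"


# Example stages with made-up resource requirements.
stages = [
    StageProfile("feature_extraction", memory_gb=64),
    StageProfile("training", memory_gb=32, needs_gpu=True),
    StageProfile("batch_scoring", memory_gb=4),
]

assignments = {stage.name: choose_work_pool(stage) for stage in stages}
```

Keeping the routing rule in one small function means a large training job and a small scoring job never have to share the same compute by default.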

🎨 Custom Python

Private and custom Python packages can be critical to ML model development. They provide a clean and easy way to deploy and reuse modular, organized, and production-ready code. Task-specific package management is difficult with some orchestrators, because a task’s packages can conflict with the orchestrator’s own default set of packages. Prefect’s dependencies and environment are entirely separate from those of the flows it manages, and each flow can be deployed to its own environment. Wendy’s team at Cash App implements flow-specific environments via Access Control Lists that enable custom packages for users across clouds.

🤝 Exchanging data between tasks

Machine Learning workflows often reuse assets. For instance, a developer at Cash App may want to create a variety of candidate models, each as its own workflow, and pass performance data on each of those models to another workflow that promotes the best one into production. Cash App found that Airflow made it difficult to exchange data between DAGs. Prefect offers a simple and Pythonic way to share data between flows.
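The candidate-model scenario can be sketched in a few lines of Python. This is a minimal illustration rather than Cash App's actual code: the flow names, model names, and the `auc` metric are hypothetical, and `flow` here is a stand-in decorator so the sketch runs without any dependencies. With Prefect installed, it would be replaced by `from prefect import flow`, and the rest of the code would stay the same because flows pass data as ordinary return values.

```python
from typing import Callable, Dict, List


def flow(fn: Callable) -> Callable:
    """Stand-in for prefect.flow so this sketch runs without Prefect."""
    return fn


@flow
def train_candidate(name: str, auc: float) -> Dict[str, object]:
    # A real flow would train and evaluate a model. The score is passed in
    # here (an assumption) so the example stays deterministic.
    return {"model": name, "auc": auc}


@flow
def promote_best(results: List[Dict[str, object]]) -> str:
    # Pick the candidate with the highest reported metric.
    best = max(results, key=lambda result: result["auc"])
    return best["model"]


@flow
def model_selection() -> str:
    # Each candidate runs as its own flow; results are plain return values.
    results = [
        train_candidate("gbt_small", 0.91),
        train_candidate("gbt_large", 0.94),
    ]
    return promote_best(results)


winner = model_selection()
```

Because the performance data travels as a normal Python return value, the promoting workflow needs no separate side channel to learn which candidate did best.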

Three requirements led Cash App to adopt Prefect. First, they needed a platform that could support the flexible needs of Machine Learning workflows. Second, the team wanted something that Machine Learning practitioners could adopt easily. Prefect’s wide array of deployment options and lightweight Pythonic developer experience met both needs, and the team has been using Prefect ever since. The third and most important requirement, however, was an orchestration platform with a high level of data security.

Cash App’s secure data assembly lines

At Cash App, the ML Tools and Platform Team supports teams working with sensitive data across different development and production environments. Wendy likes to think of their ML workflows as an assembly line, with Prefect offering a fast way to construct and iterate on components piece by piece before bringing them into production with Deployments.

As an Admin, she gives users deployment options aligned to specific data privacy and compute needs of their workflows across local, Google Cloud, AWS, and Databricks environments. She easily isolates each user’s view of Cash App’s ML platform by managing Access Control Lists for each Deployment, resulting in increased data security and compute efficiency.

Cash App’s future with Prefect

Workflows are continuing to evolve at Cash App. As the team looks to move beyond tree-based models into new and larger models, they are experimenting with ControlFlow, an open-source framework built on top of Prefect to build resilient agentic AI workflows. The team wants to upgrade their compute instances to address the horizontal and vertical scaling needs of complex model types as well. We think that our integrations with distributed computing frameworks such as Ray and Dask will be critical in keeping the Pythonic experience their users are familiar with in Prefect.

Since moving to Prefect, Cash App reports that its new orchestration platform “has generated higher interest among internal customers as the platform offers more flexibility in workflows.” We are excited to partner with Cash App as they continue to adopt the platform, especially given the announcements made at Prefect Summit!

Discover the future of data at Prefect Summit

Organizations like Cash App are looking to build their next generation of data tools to take advantage of innovation without compromising the trust of their users. You can find more information on how Prefect is enabling this trust with transactional, flexible, and portable orchestration, including Wendy’s talk on Cash App’s journey to Prefect, on our Prefect Summit recap page.