Staying competitive in Major League Soccer (MLS) demands a strong squad, built and maintained through strategic roster planning and effective navigation of the transfer market. Every such decision has to work within the league's Roster Composition Rules and Regulations. These rules are extensive and filled with legalistic detail, which can slow decision-making. Recognizing this challenge, the Philadelphia Union, 2020 MLS Supporters’ Shield winners, turned to the Databricks Data Intelligence Platform. Using its data and AI capabilities, they implemented a GenAI chatbot that helps the front office with queries on roster composition, salary budget guidelines, and other complex regulations, improving efficiency and operational clarity.
By leveraging Databricks, we are transforming our approach to roster management, turning a complex, time-consuming process into a streamlined, data-driven operation.
— Addison Hunsicker, Senior Manager, Soccer Analytics, Philadelphia Union
The chatbot is accessed through a no-code, ChatGPT-like interface deployed via Databricks Apps, a solution for quickly building secure data and AI applications. The front office benefits from the chatbot’s conversational style, which not only provides easy access but also enables zero-shot interpretation of roster regulations in seconds. This accelerates decision-making and saves valuable time, allowing the front office to focus on more strategic, value-adding tasks.
The Solution Architecture: RAG for Rapid Rule Interpretation
The solution is built on a Retrieval-Augmented Generation (RAG) architecture, with all components running on the Databricks Data Intelligence Platform. RAG works in three steps: it retrieves relevant context from an external storage mechanism, appends that context to the user's query prompt, and has a large language model generate a response grounded in the retrieved text. Grounding the answer in the source documents makes it more accurate and contextually relevant than a response from the model alone.
In this case, the storage mechanism is Vector Search, a vector database provided by Databricks. To ensure new PDFs are automatically available, a continuous ingestion mechanism was set up to load roster rule PDFs into Databricks Volumes, a fully governed store for semi-structured and unstructured data on Databricks. Text is then extracted, and numerical representations (or embeddings) are generated using Embedding Models from the Databricks Foundation Model API. These embeddings are indexed and served by Vector Search for fast and efficient search and retrieval, enabling rapid access to relevant information.
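The ingestion steps above (extract text, split it, embed it, index it, search it) can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: `chunk_text`, `embed`, and `InMemoryIndex` are hypothetical names, the hashed bag-of-words vector is a toy stand-in for a real embedding model, and the in-memory list is a stand-in for Vector Search. It does not call any Databricks API.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 256  # arbitrary vector size for this toy example


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of `size` words (size > overlap)."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text: str) -> List[float]:
    """Toy hashed bag-of-words vector, L2-normalised. NOT a real embedding model."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryIndex:
    """Hypothetical stand-in for a vector index: stores (chunk, vector) pairs."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, List[float]]] = []

    def add(self, chunks: List[str]) -> None:
        for chunk in chunks:
            self._rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 3) -> List[Tuple[float, str]]:
        """Return the k chunks with the highest cosine similarity to the query."""
        q = embed(query)
        scored = [(sum(a * b for a, b in zip(q, v)), c) for c, v in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

In use, each extracted document would be passed through `chunk_text`, the chunks added with `index.add(...)`, and a question answered by first calling `index.search(question)`. Overlapping windows are one common way to avoid cutting a rule in half at a chunk boundary; the window sizes here are arbitrary.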
Philadelphia Union also used DBRX Instruct, Databricks’ own open source LLM built on a Mixture of Experts (MoE) architecture. DBRX Instruct performs well on standard benchmarks such as MMLU. The model is also available through the Databricks Foundation Model API, so the team did not need to host or manage its own model infrastructure.
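To make the retrieve, augment, generate flow concrete, here is a minimal sketch of how retrieved passages might be folded into the prompt before it reaches the model. The function names, the prompt wording, and the two stub functions are assumptions for illustration; in a real deployment `retrieve` would query the vector index and `generate` would call the hosted LLM, neither of which is shown here.

```python
from typing import Callable, List


def build_prompt(question: str, passages: List[str]) -> str:
    """Augment the user's question with numbered retrieved passages."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question: str,
           retrieve: Callable[[str], List[str]],
           generate: Callable[[str], str]) -> str:
    """Run the three RAG steps in order: retrieve, augment, generate."""
    passages = retrieve(question)
    prompt = build_prompt(question, passages)
    return generate(prompt)


# Hypothetical stubs so the sketch runs without any external service.
def fake_retrieve(question: str) -> List[str]:
    return ["Placeholder rule text A.", "Placeholder rule text B."]


def echo_generate(prompt: str) -> str:
    return f"(model output for a prompt of {len(prompt)} characters)"


if __name__ == "__main__":
    print(answer("Placeholder question?", fake_retrieve, echo_generate))
```

Passing the retriever and the model in as plain callables keeps the orchestration logic independent of any particular vector store or LLM endpoint, which makes it easy to swap either one during experimentation.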
Their RAG chatbot is then deployed using the Mosaic AI Agent Framework, which enables seamless orchestration of the RAG application components into a chain that can be hosted on a Databricks Model Serving endpoint as an API. The framework also includes a review app and built-in Evaluations, which were invaluable for collecting human feedback and validating the effectiveness of the RAG solution prior to deployment. This ensured the chatbot was both reliable and optimized before being made available to the front office.
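The idea of validating a chain before release can be shown with a deliberately simple check. The sketch below is not the Agent Framework's evaluation feature; it is a hypothetical stand-in that runs a chain over a handful of reviewer-written cases and reports how many answers mention the phrases a reviewer expects. `EvalCase`, `evaluate`, and `stub_chain` are names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    question: str
    must_mention: List[str]  # phrases a reviewer expects in a good answer


def evaluate(chain: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose answer mentions every expected phrase."""
    if not cases:
        return 0.0
    passed = 0
    for case in cases:
        reply = chain(case.question).lower()
        if all(phrase.lower() in reply for phrase in case.must_mention):
            passed += 1
    return passed / len(cases)


# Hypothetical chain used only to exercise the harness.
def stub_chain(question: str) -> str:
    return "See placeholder section A of the rules document."


if __name__ == "__main__":
    cases = [
        EvalCase("Placeholder question 1?", ["section A"]),
        EvalCase("Placeholder question 2?", ["section B"]),
    ]
    print(f"pass rate: {evaluate(stub_chain, cases):.2f}")
```

A phrase-match score like this is crude compared with human review, but it gives a repeatable number to compare before and after a change to the prompt, chunking, or retrieval settings.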
From this point, it’s easy to connect a standard Databricks Apps chat UI template to a Mosaic AI Agent Framework agent and deploy the chatbot within minutes.
Key Benefits of the Databricks RAG Solution
Next, we’ll explore the key benefits delivered by the Databricks RAG solution and highlight the relevant components that make it possible.
- Rapid Time-to-Model: The Union’s data team developed and deployed their RAG model in just days. Leveraging the Mosaic AI Agent Framework, the end-to-end LLMOps workflow enabled fast iteration, seamless testing, and deployment, significantly reducing the time typically required for such complex systems.
- Immediate Value Realization: With the RAG system in place, the team began realizing immediate value by automating the extraction and analysis of roster rules, tasks that were previously time-consuming and manual.
- Enhanced Data Management and Governance: Databricks Unity Catalog ensured robust data management and governance, providing the Union with secure, compliant handling of sensitive player and roster information while maintaining enterprise governance standards.
- Scalability and Performance: The Databricks Platform’s ability to efficiently process large volumes of data allowed the Union to analyze not only current roster rules but also historical trends and future scenarios at scale.
- Flexible and High-Quality AI Development: The team streamlined their RAG model’s lifecycle by leveraging the Mosaic AI Agent Framework. Features like trace logging, feedback capture, and performance evaluation supported continuous quality improvement and tuning of the application. Additionally, MLflow integration simplified experimentation with various RAG configurations, making it easier to compare them and keep the best-performing setup.
- Governed, Secure, and Efficient Deployment: The Mosaic AI Agent Framework’s integration with the Databricks Data Intelligence Platform ensured all deployments adhered to governance and security standards, enabling a reliable and compliant environment for AI solutions.
Conclusion
Databricks has become Philadelphia Union’s 12th man, helping them transform into a forward-looking, data-driven organization. As the sports industry continues to evolve, the Philadelphia Union’s adoption of advanced analytics and AI demonstrates how data intelligence can be a game-changer both on and off the pitch.
The Union’s innovative use of technology not only ensures compliance with MLS Roster Rules but also provides the team with a competitive edge in player acquisition and development. With Databricks, the Union is well-positioned to navigate the complexities of MLS regulations while focusing on what matters most – building a winning team. GG!
This blog post was jointly authored by Addison Hunsicker (Philadelphia Union), Christopher Niesel (Databricks) and Samwel Emmanuel (Databricks).