Manoj Bohra, Bank of America

Overview

The Data Science, Analytics Platform Services (DSAPS) organization is focused on data science, data lake, and cognitive search.

DSAPS is leading the successful implementation of a unique data science platform named “Phoenix”, which leverages BofA’s data assets for data science and artificial intelligence. This platform enables machine learning, natural language processing, deep learning, neural networks and data science implementations at the Bank.

The team’s emphasis has been on transforming data technologies, including the democratization of data with data security at the core, a broad data ecosystem, and the implementation of end-to-end data science offerings.

Supernova Award Category

Data to Decisions

The Problem

Bank of America’s Global Technology and Operations (GTO) organization is always looking for ways to maximize the value of our data assets to further our responsible growth strategy while protecting customer privacy and ensuring appropriate AI practices. Our ability to generate actionable insights from this data is critical for our continued success in the marketplace. We do this while:

1. Ensuring protocols are in place to enforce data privacy and overall data security.
2. Ensuring a model review management process and tools are available to support model oversight and governance, particularly through the development and deployment stages.

In the past, our data scientists spent significant effort downloading open source tools, exploring and installing vendor products, trying to gain access to data sources, and manually managing the data science lifecycle, all of which was time-consuming and costly. These individual efforts also led to tool proliferation and complexity. We wanted data scientists and engineers to focus on innovating, rather than spending valuable time installing and managing tools.

The Solution

The DSAPS team developed BofA Phoenix, a data science and analytics platform providing an integrated set of open-source and vendor tools that adhere to strict enterprise governance standards and procedures.

BofA’s Phoenix helps our data scientists create innovative applications that use data for business insights, while avoiding the long lead times associated with assembling the required tools, libraries and data connectors. Data scientists and data engineers will use BofA Phoenix to test and deploy artificial intelligence (AI) and machine learning models.

Using BofA’s Phoenix, data scientists can now self-serve with data ingestion, preparation, extraction, model development, model deployment and model governance. The BofA Phoenix ecosystem connects seamlessly to Hadoop and provides real-time data ingestion tools, a collection of software utilities, and Spark grids that help perform computations quickly and easily on massive amounts of data.

The Results

As part of a phased initiative, BofA Phoenix is enabling end-to-end data analytics to build, train and run an ecosystem with the following key capabilities:

- Data analytics platform with storage and compute options (such as Hadoop, containers and graphics processing units) that are fit-for-purpose and cost-effective
- Data science workbench with seamless interface for data access, data preparation, modeling and visualization
- Streaming data ingestion
- Integration with cognitive data extraction tools with capabilities such as optical character recognition
- Integration with data preparation and exploration tools
- Integration with data governance tools
- Connectors to authorized data sources
- Clear and governed path to model production

The platform is managed by the DSAPS team, partnering with stakeholders across the enterprise, to provide platform capabilities that allow innovation with agility and scale.

This project initially started with a simple use case and now enables analytics, data science, and artificial intelligence for petabytes of data.

Metrics

Bank of America’s Phoenix data science platform enables a large number of technologies, configured and integrated with each other, and most importantly leverages a vast amount of data assets within the bank’s data stores.

Prior to BofA Phoenix, a typical data science project would require duplication of data and implementation of new technology for each new project, which took anywhere from six to nine months. Now a data scientist can write a model and go live within a matter of weeks.

Key metrics at a glance:

- Project implementation time: reduced from 6-9 months to a few weeks
- OCR (optical character recognition) accuracy: increased from ≈70% to ≈90%
- Time taken to approve access to an open-source library: from 2 months to instant access
- Number of open-source libraries accessible inside the Bank: from 50-80 to more than 550
- 50% reduction in duplicated data because data science use does not require extraction of data
- More than 1,500 data scientists now using a shared technology

The Technology

- A shared platform leveraging vendor products and open source tools
- User interface for data access, extraction, preparation, modeling and visualization
- Popular model development languages and notepads with 500+ model frameworks
- Containerization, data lake, data platform, micro-services and multi-tenancy to deploy models
- Large-scale data processing engine and specialized hardware (GPU)
- Automated model lifecycle management, governance, interpretability and bias detection tools

Disruptive Factor

BofA Phoenix presented transformative capabilities that had not been previously available within a large enterprise. Typically each project is implemented with dedicated infrastructure, causing duplication and lengthy implementation. BofA’s Phoenix platform integrates the tools necessary to support an end-to-end data science and analytics lifecycle into a seamless end-user experience, thus driving easy adoption of new technology.

The multi-tenant form factor of the platform allows for improved utilization of the infrastructure, thereby increasing the capacity available to quickly deliver new capabilities for our customers. The platform enables independent storage and processing of data. This includes a centralized event control process to facilitate compute-heavy workloads that have different compute scaling requirements.

BofA’s Phoenix was developed with citizen data scientists in mind, with a focus on democratizing access to data and tools. BofA Phoenix brought together best-of-breed capabilities from model writers to engineers, allowing seamless model deployment. This work is an important demonstration of the focus Bank of America puts on the use of responsible AI.

Shining Moment

This platform had a profound impact on some of the bank’s most important work:

1. The recent pandemic required document classification and data extraction at scale. More than 3 million pages were processed through BofA’s Phoenix NLP/ML-based document extraction capability within weeks.
2. Automation of Annual Client Review replaced an immense and limited manual review process with BofA Phoenix’s cognitive data science. Implementation on BofA Phoenix took only a few weeks.
