AWS unveils next-gen Amazon SageMaker in bid to unify data, analytics, AI

Amazon Web Services outlined its next-generation Amazon SageMaker platform that will combine data, analytics and AI.

The move has multiple components, but in a nutshell AWS is tightly integrating data prep, integration, big data, SQL analytics, machine learning and generative AI. The headliner was SageMaker Lakehouse, which unifies data lakes, data warehouses, databases and enterprise applications and makes them available for queries.

Constellation Research analyst Doug Henschen said the SageMaker effort is notable.

"I was very impressed by AWS's SageMaker announcements at AWS re:invent2024. The new, unified SageMaker consolidates all data workloads and puts AI at center, where it belongs today. It builds on DataBricks' original, single-platform vision and goes further to consolidate and unify data work and workloads than Microsoft's moves with Fabric and Google Cloud's moves with BigQuery."

Here's a look at what was announced in addition to SageMaker Lakehouse:

  • SageMaker Unified Studio gives enterprises the ability to find and access data and combine it with AWS analytics, machine learning and AI tools. Amazon Q Developer is also integrated.
  • SageMaker Catalog has built-in governance.
  • SageMaker Lakehouse will enable data to be queried in SageMaker Unified Studio or query engines compatible with Apache Iceberg.
  • Zero-ETL integrations with various SaaS applications so data is available in SageMaker Lakehouse and Amazon Redshift without complex data pipelines.
  • SageMaker Unified Studio offers one interface to combine a bevy of AWS services currently in SageMaker.

AWS CEO Matt Garman said:

"Over the next year, we're going to be adding a ton new capabilities to the new SageMaker--capabilities like AutoML, new low code experiences, specialized AI service integration, stream processing and search and access to more services and data in a single unified UI."

Constellation Research analyst Holger Mueller said:

"It is not long ago when former AWS CEO, now Amazon CEO, would say that large product suites and offerings would slow down innovation and this hurt customers. The upside though is that it is a reduction of complexity for enterprises for data and AI. AWS decided to merge the data and AI services into a single platform, rightfully picking the higher level offering with SageMaker as the new brand. The big news apart from the bundling is the new Lakehouse underpinning the new Amazon SageMaker Studio."


Amazon Q Business gets a story at AWS re:Invent 2024

Amazon Q Developer has had a straightforward story in that it makes software development easier, generates code and now is aimed at legacy infrastructure--.NET migrations, VMware workloads and mainframe transformations. In comparison, Amazon Q Business typically generated blank stares. At AWS re:Invent 2024, that reality may be changing a bit.

Here's how AWS filled out the Amazon Q Business narrative at re:Invent.

  • Amazon Q Business can be directly embedded into applications. Customers can also create a cross-application index that enhances experiences across applications, and embedded Q can take actions across multiple applications.
  • Q Business can create complex automation workflows from natural language as well as standard operating procedure documents and videos.
  • Amazon Q Business is being combined with QuickSight in a move that'll provide step-by-step instructions to drive decision-making.

AWS CEO Matt Garman said:

"What Q Business does is it connects all your different business systems, your sources of enterprise data, whether those come from AWS, third party apps, and internal sources."

What Q Business really becomes is an index that can serve as an automation base. "The power of Q Business is that it creates this index of all of your enterprise data. It indexes data from Adobe, from Atlassian, from Microsoft Office, from SharePoint, from Gmail, from Salesforce, from ServiceNow and more," said Garman.
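Garman's index framing can be made concrete with a toy sketch. The snippet below is purely illustrative (the sources, documents and index structure are all invented); the real service uses managed connectors and retrieval, not a hand-rolled inverted index.

```python
from collections import defaultdict

# Toy sketch of the cross-source index idea behind Q Business: documents from
# different enterprise systems are normalized into one searchable index.
class EnterpriseIndex:
    def __init__(self):
        self.inverted = defaultdict(set)   # token -> set of doc ids
        self.docs = {}                     # doc id -> (source, text)

    def ingest(self, source, doc_id, text):
        """Normalize a document from any connector into the shared index."""
        self.docs[doc_id] = (source, text)
        for token in text.lower().split():
            self.inverted[token].add(doc_id)

    def search(self, query):
        """Return (source, doc_id) pairs matching every query token."""
        token_sets = [self.inverted.get(t.lower(), set()) for t in query.split()]
        hits = set.intersection(*token_sets) if token_sets else set()
        return sorted((self.docs[d][0], d) for d in hits)

index = EnterpriseIndex()
index.ingest("SharePoint", "sp-1", "Q3 incident postmortem for checkout outage")
index.ingest("Salesforce", "sf-9", "Customer escalation about checkout errors")
index.ingest("ServiceNow", "sn-4", "Change request for payment service")

print(index.search("checkout"))   # [('Salesforce', 'sf-9'), ('SharePoint', 'sp-1')]
```

The point of the sketch is the payoff Garman describes: one query surfaces related items from two different systems, which is what makes the index a base for automation.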

And by combining Q Business with QuickSight, AWS provided a solid analytics hook for enterprises and a customer base. Garman said:

"We're bringing together QuickSight Q and the Q Business data together. We'll use all of that data to show you one view inside of QuickSight making it much more powerful as a BI tool."

Should AWS' Q Business plan work out, the service will be a horizontal enabler of AI agents and workflow automation. Simply put, Q Business has a cleaner story today as an enterprise data index, analytics enabler and automation engine.

Constellation Research analyst Holger Mueller said:

"Amazon's outsized ambition for Q got a little more tangible today, with Amazon explaining how data layers and integration will work for third party applications. AWS has already shown it can capture the data foundation for all data of an enterprise, now it will have to show the merits that Q can unleash with GenAI."


PagerDuty integrates with Amazon Bedrock, Q Business: Will it boost large enterprise traction?

PagerDuty's increased integration with Amazon Web Services, Amazon Bedrock and Q Business is likely to give the company's strategy to target larger enterprises a lift.

At AWS re:Invent 2024, PagerDuty CEO Jennifer Tejada joined AWS CEO Matt Garman on stage to tout the company's new collaboration. PagerDuty Advance will be integrated into Amazon Q Business, Amazon Bedrock and Amazon Bedrock Guardrails.

PagerDuty provides observability and incident management tools. PagerDuty and AWS already have nearly 6,000 joint customers. PagerDuty's Operations Cloud detects and diagnoses disruptive events, coordinates response and streamlines workflows.

The company has turned up during multiple customer presentations at re:Invent. For instance, Goldman Sachs outlined a mainframe migration and had PagerDuty in an architecture slide.

According to AWS and PagerDuty, new integrations include:

  • PagerDuty Advance will be integrated into Amazon Bedrock to provide situational awareness through chat interactions. PagerDuty Advance is the company's genAI offering that features an assistant that leverages the company's data model.
  • PagerDuty Advance also will be embedded into Amazon Bedrock Guardrails to ensure accuracy of query responses from models.
  • PagerDuty is the first incident management platform to integrate with Amazon Q Business. PagerDuty Advance customers will use one interface via Amazon Q Business plugins. PagerDuty said early adopters saved an average of 30 minutes per incident with the integration.

The AWS integration comes a week after PagerDuty reported solid third quarter results and traction targeting larger enterprises.

PagerDuty reported a third quarter net loss of 7 cents a share on revenue of $118.9 million, up 9% from a year ago. Non-GAAP third quarter earnings were 25 cents a share, topping estimates.

The company projected fourth quarter revenue of $118.5 million to $120.5 million, up 7% to 8% from a year ago. For fiscal 2025, PagerDuty is projecting revenue of $464.5 million to $466.5 million, up 8%.


Speaking on an earnings conference call, PagerDuty CEO Tejada said:

"We were pleased to see stabilization across all segments in the quarter, with retention improving across the board. That said, we remain focused on growth reacceleration and there is room for improvement, particularly on large deal conversions. We had an unusual number of large Q3 opportunities defer, and while they are not lost, these will delay ARR acceleration to FY '26. Nonetheless, we are encouraged by improvements in several key indicators, including dollar-based net retention, multi-product adoption, enterprise contract duration, and total pipeline growth."

Tejada noted that PagerDuty is seeing strength among technology, financial services and telecom customers. The plan for PagerDuty is to land and expand with large enterprises.


AWS launches Amazon Nova foundation models in commoditization play

Amazon Web Services launched Amazon Nova, a series of foundation models available in Bedrock, in a move that aims to provide large language model choice and commoditize the market.

Think of Amazon Nova as the Trainium and Inferentia strategy applied to genAI models. AWS is betting that enterprises will follow the money and opt for Amazon Nova on Trainium with the Bedrock stack. 

The models include Amazon Nova Micro, Nova Lite, Nova Pro and Nova Premier with additional models on deck. While that's the news, it's worth thinking through the big picture of how AWS is approaching models.

AWS' bet is that LLMs will be a commodity that will be mixed and matched depending on the task at hand. Speaking during the AWS re:Invent 2024 keynote, Amazon CEO Andy Jassy said that Nova will be tightly integrated with AWS services to deliver lower latency and better price performance.

During the keynote, Jassy said the company is focused on choice and noted that applications, including the Alexa rebuild, will use multiple models. Jassy said:

"We are learning the same lesson over and over and over again, which is that there is never going to be one tool to rule the world. It's not the case in databases. It's not the case in analytics. We were talking about how everybody thought the TensorFlow was going to be the one AI framework. There were a lot of them and Pytorch ended up being the most popular one. The same is going to be true for models. Our internal builders have been asking for all sorts of things from our teams that are building models. They want better latency. They want lower cost. They want the ability to fine tuning. They want the ability to better orchestrate across their different knowledge bases, to be able to ground their data. They want to take lots of automated, orchestrated actions, or what people call agentic behavior. They want a bunch."

Key points:

  • Speech-to-speech and any-to-any Nova models are coming soon.
  • Amazon Nova Canvas will focus on image generation and Nova Reel will generate video.
  • Nova models aim to be at least 75% more cost effective than comparable models in Bedrock.
  • Nova models are integrated into Bedrock, support fine-tuning and are optimized for agentic AI.

Jassy concluded:

"We always provide you selection everything we do, which is that we are going to give you the broadest and best functionality you can find anywhere. It's going to mean choice. You are going to use different models for different reasons at different times, which is the way the real world works. Human beings don't go to one human being for expertise in every single area. You have different human beings who are great at different things."

Constellation Research analyst Holger Mueller said:

"AWS reverts its position on LLMs and gets in the market with its Nova models. It's a sign that Amazon / AWS realize they need LLM offerings for both in-house and customer use cases. This will temporarily affect its 'Switzerland' of AI position. Its strong appeal was for LLM vendors was to partner with Bedrock whilst there was no in-house LLM competition. But AWS knows how to partner."


AWS aims to make Amazon Bedrock your agentic AI point guard

AWS is expanding Amazon Bedrock to enable multi-agent collaboration to address higher complexity tasks.

With multi-agent collaboration, Bedrock will use models as a team with planning, structure, specialization and parallel work.

Collaboration and orchestration of agentic AI will be a big theme in 2025 and vendors are trying to get ahead of agent coordination before enterprises implement at scale. First, the industry may have to agree on standards so AI agents can communicate.

Speaking during his AWS re:Invent 2024 keynote, CEO Matt Garman said:

"If you think about hundreds and hundreds of agents all having to interact, come back, share data, go back, that suddenly the complexity of managing the system has balloons to be completely unmanageable.

Now Bedrock agents can support complex workflows. You create these series of individual agents that are really designed for your special and individualized tasks. Then you create this supervisor agent, and it kind of acts like the thing about it as acting as the brain for your complex workflow. It ensures all this collaboration against all your specialized agents."
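The supervisor pattern Garman describes can be sketched in a few lines. Everything below is a hypothetical stand-in for the Bedrock Agents machinery: two specialist "agents" handle their individual tasks, and a supervisor decomposes the work, dispatches it and assembles the result.

```python
# Toy sketch of the supervisor-agent pattern (not the Bedrock Agents API):
# specialists each do one narrow task; the supervisor coordinates them.
def research_agent(task):
    """Specialist 1: gather facts for a task."""
    return f"research notes on {task}"

def writer_agent(task, notes):
    """Specialist 2: produce a draft from the gathered notes."""
    return f"draft about {task} using [{notes}]"

class Supervisor:
    """Acts as the 'brain': routes sub-tasks to specialists, merges output."""
    def run(self, task):
        notes = research_agent(task)       # step 1: delegate research
        draft = writer_agent(task, notes)  # step 2: delegate writing
        return draft

print(Supervisor().run("Q3 churn"))
# draft about Q3 churn using [research notes on Q3 churn]
```

In a real deployment the supervisor would also handle parallelism, retries and shared state, which is exactly the complexity Garman says balloons without an orchestrating layer.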

Amazon CEO Andy Jassy said Alexa's overhaul will be powered by some of Bedrock's orchestration tools. He said:

"We are in the process right now rearchitecting the brains of Alexa with multiple foundation models. And it's going to not only help Alexa answer questions even better, but it's going to do what very few generative AI applications do today, which is to understand and anticipate your needs, actually take action for you. So you can expect to see this in the coming months."

Jassy said the Bedrock capabilities are all about model choices. He said Amazon uses a lot of Anthropic's Claude family of models but also leverages Meta's Llama. "Choice matters with model selection," said Jassy. "It's one of the reasons why we work on our own frontier models." 

AWS launched a series of models called Nova. Think of Nova as the LLM equivalent of what Amazon is doing with Trainium. 

Amazon Bedrock will also get the following:

  • Intelligent prompt routing, which will automatically route requests among foundation models in the same family. The aim is to provide high-quality responses with low cost and latency. The routing will be based on the predicted performance of each request. Customers can also provide ground truth data to improve predictions.
  • Model distillation so customers can create compressed, smaller models with high accuracy and lower latency and costs. Customers distill models by choosing a base model and providing training data.
  • Automated reasoning checks, which will validate or invalidate genAI responses using automated reasoning and proofs. The feature will explain why a generative AI response is accurate or inaccurate using provably sound mathematical techniques. These proofs are based on domain models built from regulations, tax law and other documents.
  • New models from Luma AI, a specialist in creating video clips from text and images, and Poolside, which specializes in models for software engineering. Amazon Bedrock has also expanded models from its current providers such as AI21 Labs, Anthropic, Cohere, Meta, Mistral, Stability AI and Amazon.
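The intelligent prompt routing idea above can be illustrated with a small sketch. This is not the Bedrock API; the complexity predictor, model names and costs are all hypothetical. The real router predicts response quality per request, but the shape of the decision is the same: send easy requests to the cheap model in a family, hard ones to the strong model.

```python
# Illustrative sketch of prompt routing between models in one family.
# Names, costs and the heuristic predictor are made up for this example.
MODELS = {
    "small": {"cost_per_1k_tokens": 0.0002},
    "large": {"cost_per_1k_tokens": 0.0030},
}

def predict_complexity(prompt: str) -> float:
    """Stand-in for the router's quality predictor: long, multi-step
    prompts are assumed to need the stronger model."""
    score = len(prompt.split()) / 50.0
    if any(w in prompt.lower() for w in ("analyze", "compare", "multi-step")):
        score += 0.5
    return score

def route(prompt: str, threshold: float = 0.5) -> str:
    """Pick the cheapest model predicted to be good enough."""
    return "large" if predict_complexity(prompt) >= threshold else "small"

print(route("What time is it in Tokyo?"))                       # small
choice = route("Analyze these contracts and compare risk terms")
print(choice, MODELS[choice]["cost_per_1k_tokens"])             # large 0.003
```

The economic argument is in the cost table: if most traffic routes to the small model, the blended cost per request drops sharply while quality-sensitive requests still reach the large one.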

AWS revamps S3, databases with eye on AI, analytics workloads

AWS outlined a series of improvements to its S3 service to manage metadata automatically, leverage Apache Iceberg tables and optimize for analytics workloads with Amazon S3 Tables. Also on the data front, AWS moved to reduce latency for its databases.

At AWS re:Invent 2024, CEO Matt Garman said services like S3 and Amazon Aurora DSQL are designed to help enterprises make data lakes, analytics and AI more seamless. "We'll continually optimize that query performance for you and the cost as your data lake scales," said Garman.

Garman's storage and data talk featured JPMorgan Chase CIO Lori Beer, who discussed how the bank is using AWS for its data infrastructure. The upshot is that AWS is aiming to enable its enterprise customers to set up data services for AI. "Our goal is to leverage genAI at scale," said Beer.

Here's the rundown of the storage and database enhancements at AWS.

  • S3 will automatically generate metadata as objects are stored. The service is in preview, and the metadata will be kept in managed Apache Iceberg tables. This move sets S3 up to improve inference workloads and data sharing with services like Bedrock.
  • Amazon S3 Tables will provide storage that's optimized for tabular data including daily purchase transactions, sensor data and other information.
  • AWS retooled its database engine. Aurora DSQL is designed to be the fastest distributed SQL database, handling management automatically while delivering low-latency reads and writes. Aurora DSQL is also PostgreSQL compatible.
  • DynamoDB global tables will also get the same low-latency, multi-region setup.
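The automatic-metadata idea can be sketched simply: on every object put, derive a metadata row into a queryable table so analytics can run without scanning objects. In the real service those rows land in managed Apache Iceberg tables; in this illustrative sketch a plain list of dicts stands in, and all names are hypothetical.

```python
import hashlib

# Stand-in for the managed Iceberg metadata table.
metadata_table = []

def put_object(bucket, key, body: bytes, content_type="application/octet-stream"):
    """Store an object and auto-capture a metadata row describing it."""
    metadata_table.append({
        "bucket": bucket,
        "key": key,
        "size": len(body),
        "content_type": content_type,
        "etag": hashlib.md5(body).hexdigest(),
    })

put_object("logs", "2024/12/03/app.log", b"error: timeout", "text/plain")
put_object("images", "cat.png", b"\x89PNG...", "image/png")

# Analytics-style query over metadata only; no object scan needed:
text_objects = [r["key"] for r in metadata_table if r["content_type"].startswith("text/")]
print(text_objects)   # ['2024/12/03/app.log']
```

The query at the end is the payoff: questions like "which objects are text files over 1 GB" become table scans over metadata rather than LIST-and-HEAD calls over millions of objects.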

AWS scales up Trainium2 with UltraServer, touts Apple, Anthropic as customers

AWS launched new instances based on its Trainium2 processor, which offers four times the performance of the first-generation Trainium with twice the energy efficiency. AWS also prepped for larger training workloads with Trainium2 UltraServers that will be pooled into clusters.

The cloud giant also set the table for AWS Trainium3.

Trainium2 provides more than 20.8 petaflops of FP8 compute, up from 6 petaflops with the first Trainium. EFA networking in Trainium2 is 3.2 Tbps, up from 1.6 Tbps and HBM is 1.5 TB, up from 512 GB. Memory bandwidth for Trainium2 is 45 TB/s, up from 9.8 TB/s.

AWS CEO Matt Garman said at the re:Invent 2024 keynote that Adobe, Poolside, Databricks, Qualcomm and Anthropic were among the companies working on Trainium2 instances.

Garman was also joined on stage by Apple, which is working with Trainium2 for training workloads. He said:

"Effectively, an UltraServer connects four Trainium2 instances, so 64 Trainium2 chips, all interconnected by the high-speed, low-latency neural link connectivity. This gives you a single ultra node with over 83 petaflops of compute from this single compute node. Now you can load one of these really large models all into a single node and deliver much better latency, much better performance for customers without having to break it up over multiple nodes."

AWS was early to using custom silicon for training and inference and is looking to provide less expensive options than Nvidia. Apple is already using Trainium and Inferentia instances to power its own models, writing tools, Siri improvements and other additions.

The cloud giant's product cadence is designed in part to enable customers to easily shift training and inference workloads to optimize costs.

Garman said that AWS is stringing together Trainium instances in a cluster that will be able to provide compute for the largest models. Apple said it is currently evaluating Trainium2.

AWS' custom silicon strategy also revolves around Graviton instances as well as Inferentia. Customers on a panel at re:Invent highlighted how they were using AWS processors.

Although AWS has custom silicon, it is also rolling out instances based on other vendors' chips. Garman noted that AWS would launch new P6 instances based on Nvidia's Blackwell GPUs with availability in early 2025. AWS is also launching new AMD instances.

The bet is that AWS’ custom silicon will ultimately yield better price and performance for enterprise AI.

Constellation Research analyst Holger Mueller said:

"AWS is making good progress building more powerful combinations of its Trainum chips. It is showing that they have solved potential heating, electromagnetic interference and cooling issues. You can expect that Trainium will be scaled to 64 and potentially 128 chips per instance. But it all needs to be put in perspective as Google Cloud is on version 6 of its custom silicon. The announcement puts Amazon ahead of Microsoft." 


Zscaler plays platformization game amid strong Q1, mixed outlook for Q2

Jay Chaudhry, CEO of Zscaler, said the company's integration with competing cybersecurity platforms such as CrowdStrike, generative AI upsells and new executive additions will fuel growth in future quarters.

"In my scores of customer conversations, CXOs are prioritizing zero trust security and AI for their IT spending. We are fighting AI with AI. We recently delivered several AI innovations and are continuing to expand our AI portfolio," said Chaudhry.

The outlook for the second quarter, however, fell short of expectations.

Zscaler's first quarter results were better than expected. The company reported first quarter non-GAAP earnings of 77 cents a share on revenue of $628 million, up 26% from a year ago. Wall Street was expecting non-GAAP earnings of 63 cents a share on revenue of $605.55 million. On a GAAP basis, Zscaler reported a net loss of $12.1 million, or 8 cents a share.

As for the outlook, Zscaler said its second quarter revenue will be between $633 million and $635 million with non-GAAP earnings of 68 cents to 69 cents a share. Wall Street was expecting second quarter earnings of 70 cents a share on revenue of $633.8 million.

For fiscal 2025, Zscaler projected revenue of $2.62 billion to $2.64 billion, up from its $2.6 billion to $2.62 billion range.

Chaudhry said Zscaler is focused on using AI to secure applications, enable enterprise usage of generative AI copilots for security and provide visibility across cloud and on-prem environments. He added that Zscaler was seeing larger deals due to genAI upsells with ZDX Copilot and automation.

Zscaler is also playing the platformization game, much like Palo Alto Networks and CrowdStrike. Chaudhry said:

"To make up for the flawed architecture, legacy security vendors are offering disjointed point products under the pretext of a platform. This increases cost and complexity for customers. A Fortune 50 retail customer recently told me that a legacy firewall vendor sold them a so-called platform. And when they tried to implement it, they found that it was nothing more than consolidated billings. Complexity is the enemy of security and resilience. No wonder so many enterprises are getting breached despite spending billions of dollars on so-called SASE security, which is nothing more than virtual firewalls and VPNs in the cloud.

The sooner organizations move away from these disjointed security solutions to Zero Trust, the sooner they will become secure and resilient."

Chaudhry said Chief Revenue Officer Mike Rich, a ServiceNow alum, has moved Zscaler to account-based selling and has improved the pipeline since joining a year ago. Zscaler has also ramped sales hiring and cut attrition.

The company also recently hired Adam Geller to be Chief Product Officer. Geller was previously at Exabeam and Palo Alto Networks.

Zscaler was also upbeat about securing Office 365 and Microsoft Copilot implementations.

So, what's the problem with the outlook? Zscaler said CIOs are still scrutinizing large deals. However, the guidance is likely to be a bit conservative. Chaudhry said:

"We are seeing interest in cyber that can really reduce the chance of ransomware attacks and the like. So that's where Zero Trust comes in. And then the CIO often will say: 'I like your cyber method. But if you can reduce my cost and complexity I'm doubly motivated.' We have combined the need for Zero Trust and now AI becomes a further catalyst with cost and complexity reduction, which is helping us because most companies can't do cost reduction."

 

 


AWS outlines new data center, server, cooling designs for AI workloads

Amazon Web Services said it will deploy simplified electrical and mechanical designs, liquid cooling, new rack designs and updated control systems to handle AI workloads sustainably.

The news, outlined at re:Invent 2024 in Las Vegas, landed ahead of CEO Matt Garman's keynote on Tuesday. AWS said the new flexible data center components will enable it to provide 12% more compute power while boosting availability and efficiency.

AWS, like other hyperscale data center operators, is revamping designs and deploying custom silicon to handle AI workloads more efficiently and hit sustainability goals. AWS said the components will be modular and able to retrofit existing infrastructure. The additions will also support GPU-based servers, which require liquid cooling.

Here's a look at the changes:

  • Simplified electrical distribution systems that minimize downtime and cut the number of racks impacted by electrical issues by 89%. AWS said it has reduced the number of failure points by 20%. AWS also brought backup power closer to the rack and reduced the number of fans.
  • AWS added configurable liquid-to-chip cooling in new and existing data centers. Updated systems will integrate air and liquid cooling for AI chips including AWS Trainium2 and Nvidia GB200.
  • The company changed how it positions racks in a data center and optimized for high-density AI workloads. Software additions will predict the most efficient ways to place servers.
  • AWS is building out its control systems to standardize monitoring, alarms and operating tools.

As for sustainability, AWS said that it has been able to cut mechanical energy consumption by 46% with a 35% reduction in carbon used in concrete.


AWS re:Invent 2024: Four AWS customer vignettes with Merck, Capital One, Sprinklr, Goldman Sachs

AWS customers are increasingly focused on using cloud management approaches on-premises, optimizing GPU costs and modernizing mainframe infrastructure.

Those were some of the customer takeaways from AWS re:Invent 2024's first day. AWS' news flow starts in earnest on Tuesday so it's worth highlighting a few tales from the buy side today.

Merck on using cloud approaches on-prem

Merck's Jeff Feist, Executive Director, Hosting Solutions, is in charge of the pharma giant's cloud and on-premises environments. Feist said the company wants to simplify its hybrid infrastructure and lower total cost of ownership.

Feist added that the company is focused on transformation with an effort called BlueSky.

"My role has been focusing on the landing zones, developing automated governance controls, making sure that we have a safe, secure and agile environment to leverage the benefits that cloud offers," said Feist. BlueSky includes the following:

  • Roll out infrastructure as code, automated deployments and APIs with software defined configurations.
  • Establish a culture that's agile. "It's probably more important than the technology itself," said Feist. "We need the culture of the company to embrace the modern cloud way of working."
  • Training.
  • Focus on delivering business value. The company has modernized more than 2,000 applications with cloud native services. Merck retired more than 1,000 applications.

Going forward, Feist said the company is using AWS Outposts to bring cloud operating models to on-premises environments. Merck is adopting a simpler management interface where AWS is responsible for maintenance.

In a nutshell, Feist is looking to make Merck's on-prem infrastructure run like its public cloud setup with AWS.

Capital One on optimization and tracking cloud costs

Ed Peters, Vice President, Distinguished Engineer at Capital One, leads an ongoing transformation to create a bank that can use data and insight to disrupt the industry. Capital One has been an AWS customer since 2016.

Capital One has adopted AWS and has a focus on optimizing the infrastructure for costs. "We have a robust FinOps practice," said Peters. "We take the billing data and marry it up with the telemetry tracking and resource tagging information."

Peters said Capital One has saved millions of dollars with optimization. He said:

"We tag everything in our AWS cloud, down to billing units, individual teams, applications. I have a dashboard I can have access to that. I can tell you the monthly spend on any given application. We can drive very, very useful insight into the usage of the cloud, and we can focus our optimization on where it needs to be."

The company is also using Graviton to save money.

Going forward, Capital One is focused on generative AI workloads and building out an infrastructure that can be optimized and automated. Peters said Capital One is in a working group with AMD and Nvidia to optimize GPU workloads.

"We will continue to push forward in generative AI and its application in financial services," said Peters.

Capital One is also focused on transferring more of its operations including financial ledgers and business operations to the cloud.

Sprinklr on benchmarking GPU costs, smaller models

Jamal Mazhar, Sprinklr's Vice President of Engineering, said the company invested in AI early and has been focused on scaling its data ingestion and processing in a cost-efficient way.

"We have thousands of servers and petabytes of data," he said.

As a result, Mazhar said Sprinklr has been focused on experimenting with instances that have a good cost ratio for compute and storage. Mazhar said his company has optimized on Graviton and scaled its Elasticsearch workloads.

Mazhar said he has also been focused on smaller large language models and cutting GPU overhead. He said:

"A lot of times people use GPUs for AI workloads. But what we found out is that several of our inference models, which are very small in size, there's an overhead of using GPUs. For a smaller models, you can do quite well with compute intensive instances."

Mazhar said Sprinklr has been benchmarking its inference models and other AI workloads. He added that the company has seen a 20% to 30% cost reduction. He said:

"When you try use a more expensive chip, you feel like you're going to get better performance. Just benchmarking the workload makes you realize that the GPU is not necessarily overkill. You're not using it properly."

Goldman Sachs: Modernizing mainframes

Victor Balta, Managing Director at Goldman Sachs, said the investment firm was focused on moving off its mainframe software, which was licensed from FIS decades ago and heavily customized. FIS has said it will no longer support the mainframe software, which underpins Goldman Sachs' InvestOne platform.

InvestOne is Goldman Sachs' investment book of record and sits in Goldman Sachs Asset Management, which oversees $2 trillion in assets. The mainframe architecture was costing more than $6 million a year in support and had limited scaling ability and integration options.

Balta said Goldman Sachs created an emulator that would allow its COBOL-based system to run on AWS. Goldman Sachs also decoupled components such as data streaming, real-time integrations and batch processing to reduce costs.

"Currently we have a team of more than 20 global engineers supporting the platform," said Balta. "It's very expensive to run on mainframe with the complexity and integration. You don't have the same number of APIs or data connects to integrate with the mainframe. We're very limited on what we do. And sourcing high skilled COBOL engineers with that financial background is difficult."

Simply put, Goldman Sachs had 30 years of custom COBOL code. Rewriting it quickly wasn't possible, so the firm decided to lift and shift with an emulator and go from there.

Going forward, Goldman Sachs Technology Fellow Yitz Loriner said the company will begin to reinvent its system so it can scale and create a new software development lifecycle.

"The emulator is just the first step because we wanted to reduce the blast radius of changing the infrastructure without changing the existing interface," said Balta. "It's a pragmatic approach."
