Anthropic CEO Amodei on where LLMs are headed, enterprise use cases, scaling

Anthropic CEO Dario Amodei said large language model personality is starting to matter, argued that the cost of training a given model will come down, and said agents that act autonomously will need more scale and reliability.

Those were some of the takeaways from Amodei, who spoke at Google Cloud Next.

Model personalities will start to matter. Amodei covered the launch of Claude 3 and said a lot of effort was put into making the large language model personable. He said:

"One thing that we worked particularly hard on was the personality of the model. We've had this kind of chat paradigm of models for a while, but how engaging the model was hasn't had as much attention as reasoning capabilities. How much does it sound like a human? How warm and natural is it to talk to? We had an entire team devoted to making sure that Claude 3 is engaging."

Models need families. Amodei said the strategy for Claude 3 was to create a family of models. "Opus is the largest one. Sonnet is the smaller one, but faster and cheaper. Haiku is very fast and very cheap," he said. "Enterprises have different needs. Opus is very good at performing difficult tasks where you have to do exact calculations and those calculations have to be accurate. Sonnet is the workhorse model in the middle. I'm excited about Haiku because it outperforms almost all of its intelligence class while being fast and cheap."

More Anthropic: AWS ups its investment in Anthropic as giants form spheres of LLM influence | Constellation ShortList™ Cloud AI Developer Services

Costs of training and inference. Amodei said costs for training and inference are coming down and will continue to fall, but more will be spent on training models. He said:

"I think the cost of training a particular model is going to go down very drastically but the models are so economically valuable that the amount of money that's spent on training is going to continue growing exponentially. We'll eat up all the efficiency gains at least at the higher end of models. Within Anthropic we measure things in units we call effective compute. I think that is going to go up 10x per year. That can't last forever, and no one knows for sure how long it'll last, but that's where we are right now."

How LLMs will develop over the next few years. Amodei said model intelligence will come from pure scale. Future reliability and the ability to handle specialized tasks will come from more scale and from multi-modality, with image, video and audio inputs. There will also be interactions with the physical world, maybe even robotics.

Hallucinations will also be a key challenge. "We have substantial teams to reduce the amount of hallucination at the present in models," said Amodei.

"The final thing I expect to see in the next year or two is agents, models acting in the world," he added. "We've seen lots of instantiations of agents so far, but we haven't seen anything yet."

Enterprise use cases. Amodei said that as models get smarter and are trained for longer, they become much better at coding tasks. Healthcare and biomedicine will also be key use cases, as will finance and legal. "These use cases often involve reading long documents which Claude 3 has gotten better at relative to previous models," said Amodei.

Corporate use cases appear to be split evenly between internal tools that make employees more productive and customer-facing uses. Consumer-facing companies will enable users to do more sophisticated tasks by coupling APIs.

Amodei said the cost of models for these use cases will become less of an issue since models will be right-sized for the task at hand.

The importance of prompt engineering. Amodei said enterprises should spend time with prompt engineers to test models and make sure they work as expected.

He said:

"We are still trying to figure out how our own models work. A large language model is a very complicated object. When we deploy it, there's no way for us to figure out everything that it's capable of ahead of time. One of the most important things we do is just providing good prompt engineering support. It sounds simple, but 30 minutes with the prompt engineer can often make an application work when it wasn't before, or get better at handling errors.

I always recommend to an enterprise customer just meet with one of our prompt engineers for half an hour. It might completely transform your use case. There's a big difference between demos and actual deployment."

Safety and reliability. Anthropic recently published a paper on jailbreaking models. Amodei also said partnerships with Google Cloud revolve around security and reliability. Enterprises need both reliability and security to scale deployments of generative AI.

Amodei said short-term concerns for models revolve around bias and misleading answers when important decisions need to be made in industries like finance, insurance, credit and legal. His broader concern is that models will become increasingly powerful. He said:

"I think it's going to be possible for folks to misuse models. I worry about misuse of biology. I worry about cyberattacks. We have something called a responsible scaling plan, that's designed to detect those threats which honestly are not really very present today. We're only starting to see the beginning of them. So, every time we release a new model, we run them through this. We run tests to see if we are getting any closer to the world where we would be worried about these risks being present in models. And so far, the answer has always been no, but they're a little bit better at these tasks than they were before. Someday, the answer will be yes. And then we have a prescribed set of safety procedures that we'll take on the model when that is the case. The other side of the risks is as models become more autonomous."

When models become agents and more autonomous, they can take actions without humans overseeing them. "I think there will be very substantial risks in this area, and we'll have to have policies. We'll have to mitigate them," said Amodei, who noted that enterprises will ask about those concerns as much as they ask about data privacy and hallucinations today.

Large custom models. Amodei said enterprises won't have to choose between a small custom model and a large general one in the future. The correct fit will be a large custom model. LLMs will be customized for biology, finance and other industries.

What needs to happen beyond LLMs to create agents that take actions on your behalf? Amodei said "it's kind of an unexplored frontier." He said:

"One of my guesses is that if you want an agent to act in the world it requires the model to engage in a series of actions. You talk to a chat bot, it only answers and maybe there's a little follow-up. With agents you might need to take a bunch of actions, see what happens in the world or with a human and then take more actions. You need to do a long sequence of things and the error rate on each of the individual things has to be pretty low. There are probably thousands of actions that go into that. Models need to get more reliable because the individual steps need to have very low error rates. Part of that will come from scale. We need another generation or two of scale before the agents will really work."
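
Amodei's point about per-step error rates can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every step fails independently and with the same probability. Neither that assumption nor the specific error rates come from the talk, and the function name is hypothetical.

```python
def task_success_probability(per_step_error: float, num_steps: int) -> float:
    """Chance that every step succeeds, assuming independent, identical steps.

    This is a simplification for illustration: real agent steps are unlikely
    to be independent or to share a single error rate.
    """
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return (1.0 - per_step_error) ** num_steps


if __name__ == "__main__":
    # Illustrative error rates only; these are not figures from the talk.
    for error in (0.01, 0.001, 0.0001):
        p = task_success_probability(error, 1000)
        print(f"per-step error {error:.2%}: 1,000-step success ~ {p:.1%}")
```

Under these assumptions, a 1% per-step error rate makes a 1,000-step task almost certain to fail, a 0.1% rate gives roughly a 37% chance of finishing cleanly, and a 0.01% rate gives roughly 90%. That is the sense in which long action sequences demand very low individual error rates.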

Microsoft raises Dynamics 365 prices starting Oct. 1

Microsoft said it is raising the prices for its Dynamics 365 enterprise resource planning and customer relationship management applications. 

The company said that Dynamics 365 hasn't seen a price increase in more than 5 years. The price changes go into effect Oct. 1 and add $10 to $15 a month per user for most apps, and $30 for select apps.

Microsoft's Dynamics 365 price increases apply to cloud and on-premises versions. US government list prices will increase 10% on Oct. 1, 2024 and then see a smaller increase on Oct. 1, 2025 to bring them on par with commercial pricing. These price increases don't appear to affect small business customers.

Copilot capabilities delivered within Dynamics 365 are included in the core SKUs at no extra charge. Copilot for Service and Copilot for Sales (both generally available) and Copilot for Finance (in preview) are separate product SKUs that require their own per-seat, per-month licenses. Those products are compatible with multiple CRM systems, including Salesforce.

Here's a look at the changes.

Product                                             Price before October 1, 2024  Price as of October 1, 2024
Microsoft Dynamics 365 Sales Enterprise             $95                           $105
Microsoft Dynamics 365 Sales Device                 $145                          $160
Microsoft Dynamics 365 Sales Premium                $135                          $150
Microsoft Relationship Sales                        $162                          $177
Microsoft Dynamics 365 Customer Service Enterprise  $95                           $105
Microsoft Dynamics 365 Customer Service Device      $145                          $160
Microsoft Dynamics 365 Field Service                $95                           $105
Microsoft Dynamics 365 Field Service Device         $145                          $160
Microsoft Dynamics 365 Finance                      $180                          $210
Microsoft Dynamics 365 Supply Chain Management      $180                          $210
Microsoft Dynamics 365 Commerce                     $180                          $210
Microsoft Dynamics 365 Human Resources              $120                          $135
Microsoft Dynamics 365 Project Operations           $120                          $135
Microsoft Dynamics 365 Operations – Device          $75                           $85
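
To put the list-price changes above in relative terms, here is a minimal sketch that computes the dollar and percentage increase for each product. The prices are copied from the table and treated as per user per month, as the article describes. The dictionary and function names are illustrative, not anything Microsoft publishes.

```python
# (price before, price as of Oct. 1, 2024) in USD, copied from the table above.
PRICES = {
    "Sales Enterprise": (95, 105),
    "Sales Device": (145, 160),
    "Sales Premium": (135, 150),
    "Relationship Sales": (162, 177),
    "Customer Service Enterprise": (95, 105),
    "Customer Service Device": (145, 160),
    "Field Service": (95, 105),
    "Field Service Device": (145, 160),
    "Finance": (180, 210),
    "Supply Chain Management": (180, 210),
    "Commerce": (180, 210),
    "Human Resources": (120, 135),
    "Project Operations": (120, 135),
    "Operations – Device": (75, 85),
}


def percent_increase(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


if __name__ == "__main__":
    for product, (before, after) in PRICES.items():
        delta = after - before
        pct = percent_increase(before, after)
        print(f"{product:<30} +${delta:<3} ({pct:.1f}%)")
```

On these list prices the increases work out to roughly 9% to 17%, with the $30 jumps landing on Finance, Supply Chain Management and Commerce.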

HOT TAKE: Cisco completes Splunk Acquisition - Constellation’s Take.

Last week, Cisco and Splunk executives, including Liz Centoni, Jeetu Patel, and Tom Casey, held a 45-minute roundtable to outline the combined entity's plans for Cisco’s observability future. The discussion covered general opportunities and high-level customer pain points in observability and security. Yet customers are still looking for concrete action plans and execution details, and the market wanted more information about how these major observability platforms will come together. The full video of the roundtable is available at https://www.youtube.com/watch?v=PsZ2z66i6JI

Tom Casey from Splunk has taken over product ownership of Cisco's observability solution strategy. The move aims to reduce leadership friction by removing competing priorities across observability (o11y) divisions, so the company can drive a unified platform. The collaboration between Cisco and Splunk has the potential to provide visibility from the network to the application level, which other observability vendors lack. However, the details of how this will be accomplished are not yet clear.

Splunk has been working hard to integrate its recent acquisitions, including SignalFx, Omnition, Rigor, Flowmill, Plumbr, and VictorOps, into its Observability platform. Splunk’s initial direction was to keep SignalFx as Splunk Observability Cloud while maintaining cloud logs on the Splunk Platform, because it was difficult to change the architecture completely and merge everything. With Cisco acquiring the company, that direction might change. We still don’t know which platform the incoming observability products, such as AppD, ThousandEyes, and FSO (Full Stack Observability), will move into or merge with. Splunk also diverted investment from or decommissioned some acquisitions, including VictorOps and Incident Intelligence, to simplify things. Although engineering and support teams still maintain those solutions, the product and strategy teams were eliminated, which suggests these products may be short-lived.

Cisco has been here before: integrating AppD and ThousandEyes with its own organic observability platform, FSO, took considerable time as it streamlined operations, field teams, and pricing and created a combined solution. Given that history, Constellation expects this new collaboration to take even longer to come to fruition. Many existing Splunk and AppD customers have expressed concerns about how it will unfold. For example, they worry about getting the right recommendations from field and solution teams given the many overlapping solutions. Customers are also very nervous about the pricing structure of the combined Cisco observability solution, which has not been fully disclosed, and about whether they will end up paying Cisco twice. The combination of multiple platforms, add-ons, suites, packaging, overlapping features, and licensing models may confuse customers and field teams until a unified pricing structure and full-stack unified platform take shape. The overlaps include DEM (synthetic monitoring and RUM), APM (distributed tracing), metric stores, tracing stores, session replay, infrastructure monitoring, and log capabilities. Splunk also has its own powerful query language (SPL), which Cisco’s observability solutions lack. Cisco should proactively explain the intended outcomes to customers and execute against specific, defined milestones.

Furthermore, both companies claim that the acquisition is meant to catch up with AI demands. Yet neither of them is a leader in infusing AI into its observability or AIOps solutions. Competing vendors are ahead of Cisco and Splunk with generally available AI use cases, and both need to catch up. For instance, Splunk AI Assistant (formerly SPL Co-Pilot), introduced at .conf23, is still in preview and covers a very basic use case: a natural language interface that produces SPL (Splunk's query language) for observability data searches. Cisco's AI does not perform any observability-related tasks yet. It will be interesting to see how many AI use cases they can support quickly to catch up with the market.

Since a significant portion of Splunk's revenue is annual recurring revenue (ARR), the acquisition could help Cisco push further into the recurring-revenue model, which it has been trying to expand for the last few years.

Constellation POV

Based on our conversations with existing Splunk and Cisco customers and with former Splunk employees, Constellation believes that the integration faces many challenges. Constellation expects that the combined entity will take at least two years to complete post-merger integration in a way that lets users see the benefits.

Although the Cisco/Splunk team has said all the right things so far, execution will be critical, and it could be painful and slow, which may cost some large accounts that are already experimenting with competing solutions. Constellation believes that the overall merger will bring benefits to customers and partners, but be prepared for a much longer than expected post-merger integration, given the different architectures, consumption models, data collected, culture, and technical debt accrued over the years.

At first glance, the idea of combining Security with Observability seems to be a good one, and it aligns well with Splunk's ongoing mission before the acquisition. Bottom line – while this high-level strategy sounds promising, it needs more details to be fully understood and value realized.

HOT TAKE: Adobe’s Frame.io Serves Up a Reimagined Version and I’m Gloating

When Adobe acquired Frame.io, it was chalked up as just another Creative Cloud solution that was so niche and specialized only people with expensive cameras and the agencies that hire them would reap the rewards. But in the wake of the announcement in 2021, I blogged a hot take:

“Imagine what happens when Adobe pulls the best of the best from BOTH Workfront AND Frame.io to reimagine what collaboration for creativity and experience really works like. Only time will tell how far collaboration will connect the two sides of the Adobe coin…If anything can bridge that gap in a meaningful way, it just might be collaboration and workflows.”

I WAS RIGHT. IT IS HAPPENING!

That gloat felt good. Now back to the news at hand.

Adobe’s Frame.io V4 takes collaboration to the next level, focused on the work creative professionals must sync, share, comment on and coordinate to create new experiences. From will.i.am creating a new music video to a brand marketer creating a new story-driven transmedia campaign, V4 has both the asset and the process covered. Much like the other updates and modernizations across Creative Cloud, the reimagination of Frame.io has me feeling the rage only true jealousy can bring on.

Let me explain: Many moons ago, I worked on a rebrand for a cosmetic product that required an extensive shoot involving multiple models with unique-yet-natural looks to satisfy a year-long campaign involving photo and video assets. The shoot was booked with an agency, a videographer, a casting agent and a photographer in Cape Town, South Africa…I was in Campbell, California. Let the creative chaos games begin. Briefs were shared, mood boards, story boards and concept briefs passed around for what felt like lifetimes.

As these types of creative jobs go…the shoot happened when I was sound asleep thanks to time zones so when I got the test shots 48 HOURS later, I had to send that “delicate” email of “The brief clearly outlined casting and I approved the first round of models. All the test shots you sent back are of totally different models in completely different scenes and nowhere near what was outlined on the boards?”

Days would be lost in the name of collaboration. Chaos was the norm in the name of asset and file sharing. Budget was lost to misinterpretation.

This new version of Frame.io enables that entire chaotic scenario to become a streamlined workflow centered around an easy to view and review interface, common centralized asset storage and intentionally uncomplicated processes to consolidate the work of creation. I’m secure enough to admit that how elegantly Frame.io reframes the chaos makes me more than a little jealous. It takes hold of the process from casting through to file transfer and sharing, delivers a single pane for commenting and collaborating and intentionally works to accelerate the process with alerts and aggregated comment drawers for smooth signoffs and approvals.

Version 4 also comes with a new single metadata framework that underpins everything, allowing all assets, data and collaborators to come together in a single, unified platform. Now every piece of the process can exist as metadata on an asset or file. Loved working with an actor you met in casting…that lives on that video. Want to only view dailies by scene or actor…yup…that’s metadata that can live with an asset and be easily searched. Frame.io extends the power of the metadata framework with Collections, which aggregate and segment assets by that metadata.

Let’s follow the bouncing ball of my gloating once more: close your eyes and imagine just how powerful search becomes as this metadata framework extends beyond Frame.io into, for argument’s sake, a Digital Asset Management (DAM) solution like Adobe Assets or a workflow and work management solution like Workfront.

Don’t worry…you won’t have to imagine for too long, as Frame.io’s integration with Workfront is expected to be released later this year, enabling a new unified review and approval workflow between cross-functional teams. For marketers, agencies and brand leaders, we are talking about visibility and work that connects CAMERA to CAMPAIGN! That’s where this is heading!

Frame.io V4 beta is rolling out in stages for Free and Pro customers across web, iPhone and iPad throughout 2024, with Team and Enterprise customers expected to get the V4 update later in the year. In a video blog announcing V4, Frame.io’s founder, Emery Wells, also shared a simplification of the pricing model for the new version.

This is the fourth iteration of Frame.io since the product launched in 2015 and the biggest update the company has ever introduced, reimagining the platform from the ground up while remaining grounded in its customers' asks and innovations. Clearly, this whole “expand workflows so the processes of casting, scouting, and dailies review” makes me mutter like an old lady under my breath with that “BACK IN MY DAY” lament. But it really can’t be overstated just how much this work needs this overhaul. We need to reimagine the work and workflows of creatives and creators with tools that don’t just start and stop with outputs and assets but truly connect the totality of this work we call creation.

 

Image generated by Adobe Firefly (and my sick prompt skillz)

Google Cloud Next Takeaways from the Constellation Analyst Team

"You can't go one minute without hearing about AI."

We got the Constellation crew together to hear the overarching themes of Google Cloud Next across every coverage area: cybersecurity, cloud applications, data to decisions, observability, and generative AI.

Here are a few observations from Google's announcements and market positioning:

  • Google's AI technology is making cybersecurity more accessible (e.g., copilots and agents).
  • Google Cloud has a 2-3 year lead on its competition thanks to its custom silicon (TPUs).
  • Google offers one AI-ready data platform (including AI, ML, and GenAI) that combines structured and unstructured data.
  • Google offers a super infrastructure to train LLMs of all sizes, and customers can fine-tune and customize existing LLMs for a few hundred dollars.
  • Google offers one of the only open AI stacks from a single vendor.

A few takeaways for our executive audience:

Customers should already be considering how AI technology and cloud platforms can drive business outcomes in their enterprise. CXOs must think beyond traditional data silos and invest in platforms supporting a continuum of structured and unstructured data. And finally, re: Google Cloud Next - Google offers an easier way to build models at a cheaper price.

Watch the full interview below with Holger Mueller, Doug Henschen, Andy ThurAI, Chirag Mehta, and R "Ray" Wang.

On ConstellationTV: https://www.youtube.com/embed/VIFDclyPF8E?si=PE8Pz5XdGAWp9xzq

Amazon CEO Jassy's shareholder letter talks AWS' approach to generative AI

Amazon CEO Andy Jassy said AWS is building "primitive services," or discrete building blocks, for generative AI and that the approach will ensure customers bring more workloads to the cloud service.

Jassy’s shareholder letter landed as Amazon appointed Andrew Ng to its board of directors. Ng is managing general partner of AI Fund. He was also the founder of DeepLearning.AI, co-founder of Coursera and an adjunct professor at Stanford. Ng also has worked with Baidu and Google Brain.

In his 2023 shareholder letter, Jassy spent a good amount of space talking about generative AI and AWS services. Jassy walks through how primitive services appeared in Amazon's 2003 Vision document and how AWS' approach emerged from a partnership with Target in the early 2000s, in which Amazon was the back end to Target's web site.

"Pursuing primitives is not a guarantee of success. There are many you could build, and even more ways to combine them. But a good compass is to pick real customer problems you’re trying to solve," said Jassy, who noted that this approach to primitives guides everything from logistics to supply chain to stores to Prime delivery to AWS.

Jassy said AWS is designing a set of primitives focused on the layers of generative AI. The bottom layer is compute with Nvidia and Amazon's in-house processors. SageMaker, which is for customers building their own foundational models, is another service that's driving AI workloads. He noted Workday has cut inference latency by 80% with SageMaker.

The middle layer is where Bedrock will come in. Jassy said:

"What customers have learned at this early stage of GenAI is that there’s meaningful iteration required to build a production GenAI application with the requisite enterprise quality at the cost and latency needed. Customers don’t want only one model. They want access to various models and model sizes for different types of applications. Customers want a service that makes this experimenting and iterating simple, and this is what Bedrock does, which is why customers are so excited about it."

Jassy also outlined AWS' approach to the application layer. He cited services such as Amazon Q, Rufus, Alexa and other applications, but noted most applications will be built by third parties. AWS' spin on the application layer is worth noting. Jassy said:

"While we’re building a substantial number of GenAI applications ourselves, the vast majority will ultimately be built by other companies. However, what we’re building in AWS is not just a compelling app or foundation model. These AWS services, at all three layers of the stack, comprise a set of primitives that democratize this next seminal phase of AI, and will empower internal and external builders to transform virtually every customer experience that we know (and invent altogether new ones as well). We’re optimistic that much of this world-changing AI will be built on top of AWS."

Jassy also noted that AWS' move to help customers save money will pay off in the long run and deals are accelerating along with renewals and migrations.

Other takeaways from the Amazon shareholder letter:

Processes matter as Amazon has discovered in its robotics efforts in its fulfillment network. Jassy said:

"There are dozens of processes we seek to automate to improve safety, productivity, and cost. Some of the biggest opportunities require invention in domains such as storage automation, manipulation, sortation, mobility of large cages across long distances, and automatic identification of items. Many teams would skip right to the complex solution, baking in “just enough” of these disciplines to make a concerted solution work, but which doesn’t solve much more, can’t easily be evolved as new requirements emerge, and that can’t be reused for other initiatives needing many of the same components. However, when you think in primitives, like our Robotics team does, you prioritize the building blocks, picking important initiatives that can benefit from each of these primitives, but which build the tool chest to compose more freely (and quickly) for future and complex needs."

Amazon has built primitive services for everything from storage, trailer loading and pallet mobility to sortation, along with AI models to optimize those parts.

Lowering the cost to serve. Jassy said Amazon has plenty of room to keep lowering costs, to the benefit of both consumers and its margins. "We’ve challenged every closely held belief in our fulfillment network, and reevaluated every part of it, and found several areas where we believe we can lower costs even further while also delivering faster for customers," said Jassy. "Our inbound fulfillment architecture and resulting inventory placement are areas of focus in 2024, and we have optimism there’s more upside for us."

Meta launches latest chip for AI workloads

Meta launched its next-generation training and inferencing processor as it optimizes models for its recommendation and ranking workloads.

The second version of the Meta Training and Inference Accelerator (MTIA) highlights how cloud hyperscale players are creating their own processors for large language model (LLM) training and inferencing.

Intel launched its Gaudi 3 accelerator on Tuesday to better compete with AMD and Nvidia. Google Cloud outlined new tensor processing units and Axion, an Arm-based custom chip. AWS has Trainium and Inferentia processors, and Microsoft is building out its own AI chips. The upshot is that Nvidia's rivals, as well as huge customers such as Meta, are looking to bring costs down.

MTIA.v2 more than doubles compute and memory bandwidth compared to its predecessor released last year. MTIA is only one part of Meta's plan to build its own infrastructure. Meta also updated its PyTorch software stack to account for the updated MTIA processors.

In a blog post, Meta noted:

"MTIA has been deployed in our data centers and is now serving models in production. We are already seeing the positive results of this program as it’s allowing us to dedicate and invest in more compute power for our more intensive AI workloads.

The results so far show that this MTIA chip can handle both low complexity and high complexity ranking and recommendation models which are key components of Meta’s products.  Because we control the whole stack, we can achieve greater efficiency compared to commercially available GPUs (graphics processing units)."

Like other cloud providers such as Google Cloud and AWS, Meta will still purchase Nvidia GPUs and accelerators in bulk, but custom silicon efforts highlight how AI model training and inference workloads will aim to balance cost, speed and efficiency. Not every model needs to be trained by the best processors available.

Here's a look at the MTIA processor comparisons followed by the software stack Meta has deployed.

 

Google Cloud Next 2024: Customer Interviews

The following eight interviews are between Constellation Research founder and analyst R "Ray" Wang and customers attending the 2024 #GoogleCloudNext conference in Las Vegas, Nevada. They discuss the Google keynotes, main takeaways, future business implications, and more. 

The interviewees include:

  • Ron Miller, TechCrunch
  • Ted Abebe, UPS
  • Josh Horton, Cox 2M
  • Niraj Nagrani, Wayfair
  • Rajesh Abhyankar, Persistent Systems
  • Ed Green, McLaren Racing
  • Jason James, Aptos Retail
  • Betsy Atkins, Google Cloud Board Member

On ConstellationTV: https://vimeo.com/showcase/11093469/embed

Google Cloud Next: The role of genAI agents, enterprise use cases

Google Cloud pitched an agent-oriented vision for generative AI at Google Cloud Next and highlighted a bevy of emerging use cases going from pilot to production.

"We are now building generative AI agents," said Google Cloud CEO Thomas Kurian. "Agents are intelligent entities that take action to help you achieve specific goals."

These actions can range from helping a shopper find a dress, picking health benefits and handling nursing shift handoffs to bolstering security defenses or building applications. Google Cloud's agents during the keynote were built with its Gemini large language model, but presumably other LLMs are possible via the company's Model Garden.

Google Cloud continues to "offer widely used first party, third party and open-source models," said Kurian. "Vertex AI can be used to tune, augment, manage and monitor these models."

In many ways, Kurian's riff about agents is Google Cloud's answer to Microsoft's Copilot stack and AWS' Q. What Google Cloud did was tie agents to business outcomes and processes that could be automated. "These agents would connect with other agents as well as humans," said Kurian.

Kurian added that genAI agents powered by Gemini models will be the connective tissue between all of Google Cloud's services.

Constellation Research analyst Holger Mueller summed up Google Cloud's approach with agents:

"In the AI race Google provides the right mix of assistants/agents (not the inflationary number of co-pilots like Microsoft) while providing the Über AI with Gemini Cloud Assist (which has the same ambitions like Amazon's AWS' Q). And all of that on the best hardware infrastructure from chips to intra-data center networking and public networking. Google Cloud is powered by Gemini, the most advanced LLM out there, and offers grounding services with Google Search. All in all Google keeps its lead of 3-4 years when it comes to custom algorithms on custom silicon."

Here's a tour of use cases by the type of agents being deployed on Google Cloud.

Customer agents. For enterprises, customer agents are viewed as extra sales and service people. These agents are able to listen carefully, understand your needs and recommend products and services.

Mercedes Benz highlighted multiple customer agent experiences, both in the car and for customizing models to buy. "The sales assistant helps customers to seamlessly interact with Mercedes when booking a test drive or navigating through offerings," said Mercedes Benz CEO Ola Källenius.

Enterprises cited by Google Cloud appeared to be gravitating toward genAI as a service engine. Discover Financial uses genAI to search and summarize procedures during calls and IHG Hotels & Resorts is building a travel planning tool for guests.

In addition, Target is optimizing offers and curbside pickup on its app and site. Best Buy is building an agent to troubleshoot product issues and manage order deliveries. Paramount+ is using genAI to personalize viewing recommendations.

Google Cloud customer agents can be tailored by conversation flow, languages and subject matter and then know when to hand off to a human agent.

Employee agents. The returns on employee agents are relatively straightforward: Remove repetitive tasks so employees can be more productive. Employee agents can also streamline chores such as health benefits enrollment.

Most of the employee agent examples were tethered to Gemini models running through Google Cloud Workspace, but via Vertex AI extensions, models can connect to any external or internal API. Uber CEO Dara Khosrowshahi said employee agents were being built to aid support teams, summarize user communications and reduce marketing agency spending.

How Uber's tech stack, datasets drive AI, experience, growth

Other use cases included Dasa, a Brazil-based medical diagnostic company, using agents to surface relevant findings in test results; Etsy optimizing ad models; and Pepperdine University, which is using Gemini to provide captions and notes across multiple languages.

Gemini-powered agents in Workspace are also being used to analyze RFPs, contracts and other corporate documents. This analysis of large documents and paperwork automation was a key use case across companies such as HCA Healthcare and Bristol Myers Squibb.

Home Depot is leveraging Gemini for its Sidekick application that manages inventory. See: How Home Depot blends art and science of customer experience

Creative agents. Like employee agents, creative agents have been tied to Workspace in the Google Cloud ecosystem. However, I saw an AWS demo where a marketer or ad agency team can create mood boards, pick models and accelerate content concepts to minutes from days or weeks.

For Google Cloud, creative agents are all about using Gemini to create slides, images and text. Carrefour is using Vertex AI to create dynamic campaigns across social networks quickly.

Procter & Gamble is using Google Cloud's Imagen model to develop images and creative assets. Canva is using Vertex AI to power its Magic Design for Video editing tools.

WPP is using Gemini 1.5 Pro to power its media activation tools.

The returns of creative agents can be powerful in that enterprises can avoid media waste and its associated costs across a campaign. In addition, storyboards can be created and tweaked quickly.

Related: Middle managers and genAI | Why you'll need a chief AI officer | Enterprise generative AI use cases, applications about to surge | CEOs aim genAI at efficiency, automation, says Fortune/Deloitte survey

Data agents. A common use case is using generative AI to search, analyze and summarize document, video and audio repositories to surface insights. A good data agent is one that can answer questions and then tell us what questions we should be asking.

Suresh Kumar, CTO of Walmart, said the retailer is using data agents to comb BigQuery, surface insights for personalization and supply chain signals, and improve product listings.

Data agents are being deployed for drug discovery and medical treatments. Mayo Clinic is using data agents to search more than 50 petabytes of clinical data.

In addition, delivery carriers and airlines are using data agents to optimize shipments and routes.

Data agents can be deployed for data preparation, discovery, analysis, governance and to create data pipelines. These agents can also provide notifications when KPIs are being met or in jeopardy.

Constellation Research analyst Doug Henschen said the data agent argument is strong.

"The vision for Data Agents is pretty compelling, with a key point made by Google Cloud being that multi-modal opportunities lie ahead. Multi-modal GenAI-powered data agents will unlock combinations of structured and unstructured data including video, audio, images and code and correlations with structured data. One scenario that Alphabet CEO Sundar Pichai shared was that of an insurance company adjuster that might combine video, images and text to automate a claims process. With BigQuery at the center, Google Cloud foresees data agents applying multiple engines to data, whether SQL, Spark, search or whatever to solve business problems."

BT150 CXO zeitgeist: Data lakehouses, large models vs. small, genAI hype vs. reality

Code agents. Goldman Sachs CEO David Solomon said that genAI's ability to boost developer productivity was promising. "There's evidence that generative AI tools for assisted coding can boost developer efficiency and we're excited about that," said Solomon, who said genAI is being used to analyze content and market signals and boost client engagement.

Goldman Sachs rival JPMorgan Chase also sees a boom in developer productivity with genAI code assistance. JPMorgan Chase CEO Dimon: AI projects pay for themselves, private cloud buildout critical

Wayfair CTO Fiona Tan said the retailer is standardizing on Google Code Assist and improvements via Gemini 1.5 Pro. Google Cloud is also leveraging Gemini Code Assist and has increased productivity by 30%.

Security agents. Anyone following the ongoing battle among Palo Alto Networks, CrowdStrike and Zscaler knows generative AI has a big role in security. Google Cloud said that Palo Alto Networks will build on top of Google Cloud AI.

Google Cloud said security agents are designed to incorporate data and intelligence to serve up insights and incident response faster. The win is that generative AI can create a multiplier effect for cybersecurity analysts by analyzing large samples of malicious code.

Charles Schwab and Pfizer were cited as Google Cloud security customers. The goal of a security agent is to identify and address threats, summarize and explain findings and recommend next steps and remediation playbooks quickly. Ultimately, security agents will automate responses.

Constellation Research analyst Chirag Mehta analyzed Google Cloud's security strategy in a research note. He said:

"As a Google Cloud prospect or customer, take a comprehensive inventory of your current security tools landscape, encompassing Google Cloud and its partner ecosystem. Engage with Google Cloud and security tool vendors to discuss their roadmaps for Google Cloud, with a specific focus on how they plan to leverage AI to address your unique requirements. Additionally, consider exploring tools that offer multi-cloud support, regardless of your primary cloud provider, to future proof your security infrastructure."

 

Intel launches Gaudi 3 accelerator with availability in Q2

Intel said its Gaudi 3 AI accelerator will be available in the second quarter with systems from Dell Technologies, HPE, Lenovo and Supermicro on tap. Intel, along with AMD, is hoping to give Nvidia some competition. 

The chipmaker's Gaudi 3 launch, announced at the Intel Vision conference, is the linchpin of Intel's plans to garner AI training and inference workloads and take share from Nvidia.

According to Intel, Gaudi 3 delivers 50% better inference and 40% better power efficiency on average than Nvidia's H100, at lower cost. It's worth noting that Nvidia has outlined its Blackwell GPUs and accelerators that leapfrog H100 performance.

Nevertheless, model training will be a balancing act between speed and compute costs. Enterprises will use a bevy of options for AI workloads including Nvidia, AMD and Intel as well as in-house offerings from AWS with Trainium and Inferentia and Google Cloud TPUs.

Key points about Gaudi 3:

  • Intel Gaudi 3 is manufactured on a 5 nm process and uses its engines in parallel for deep learning compute and scale.
  • Gaudi 3 has a compute engine of 64 AI-custom and programmable tensor processor cores and eight matrix multiplication engines.
  • Memory boost for generative AI processing.
  • Twenty-four 200 Gb Ethernet ports integrated into Gaudi 3 for networking speed.
  • PyTorch framework integration and optimized Hugging Face models.
  • Gaudi 3 PCIe add-in cards.

To go along with the Gaudi 3 launch, Intel said it will create an open platform for enterprise AI along with SAP, Red Hat, VMware and other companies. It is also working with the Ultra Ethernet Consortium and will launch a series of network interface cards and AI connectivity chiplets.

 
