Larry Dignan

Editor in Chief of Constellation Insights
Constellation Research

In a LinkedIn version of this story on AI inference costs and what to do to fix them, David Giambruno, a regular at Constellation Research conferences and an expert on IT efficiency, noted:

“From probabilistic to deterministic… move from token to code (easy). From high cost tokens to low cost , server-less agents. Only work when need. This combination has a mean reduction of 60%. have had as much as 98% token reduction.”

Giambruno then followed up with a hard example of how he has mixed and matched models to keep token costs in check. Bottom line: Inference costs can be managed with skill. Here are a few links that have included Giambruno.

Anthropic posted its Project Glasswing update covering how 50 partners are leveraging Mythos Preview to find more than 10,000 high- or critical-severity vulnerabilities. The limitation is how quickly the vulnerabilities can be verified, disclosed and patched.

The update is worth a read, but at a high level Anthropic is arguing for new cybersecurity workflows to patch systems faster. Mythos Preview has also found a bevy of issues with open source code.

In a graphic, here's the update.

Project Glasswing findings

Anthropic said it will expand its partners with the US and allied governments.

Spotify held its Investor Day this week and the news revolved around access to concert tickets and a deal with Universal Music so users can spin up AI covers. The company also said it expects a compound annual growth rate in the mid-teens percentage range and gross margins between 35% and 40%. Spotify's goal is to reach 1 billion subscribers and $100 billion in revenue.

The recap is here, but there are a few nuances worth noting. Consider:

  • The company's customer experience metric revolves around time well spent for users. Spotify wants to spin up features and categories so customers say their time on the service is worth it. It is an interesting twist on CX and plays into Spotify's penchant for raising prices over time.
  • Spotify knows how AI benefits the company. Niklas Gustavsson, VP of Engineering, said Spotify's AI advantage isn't about LLMs, but "applying general intelligence to our proprietary “Large Taste Model,” trained on trillions of behavioral signals and years of user interaction data across music, podcasts, and audiobooks."

For Spotify, metadata, user behavior, creator tools and cultural context are the secret sauce. Gustav said AI isn't a cost layer but a monetization opportunity.

Bottom line: The Spotify game is lifetime value of the customer.

Spotify
Spotify Co-CEO Gustav Söderström

After perusing this IPO filing and thinking for a bit, I can't help but wonder about the following:

  • Index funds and money managers are going to cash out winners (or losers) to pay for SpaceX shares. Look for the MAG 7 led by Nvidia to become the source of cash for SpaceX.
  • This source of cash issue is going to be a bigger deal when Anthropic and OpenAI go public. There's only so much capital.
  • I thought SpaceX was bigger. That reaction is common when you read the IPO filing of a private market darling.
  • That xAI purchase changes the profile for SpaceX and it's a financial sinkhole. Most space sector investors would have preferred SpaceX and Starlink.
  • Who the hell approved this ridiculous total addressable market graphic? Oh we all know.
SpaceX stats 2

At the end of the day, I'll let the market figure this one out.

Quote of the week via Ed Zitron:

"How many tokens does it take to do one thing? Is it consistent across every model? Is it consistent across every employee? Are you even measuring how many tokens a task costs? Because if you’re not, that token budget is basically throwing a dart blindfolded.

Okay, now you’ve measured a task, did you make sure to measure it multiple times? Because LLMs can randomly do things differently even with the same prompt and same Claude.MD file and same strictures and same data sources. You’re gonna need at least 10 samples of each task, and you’re gonna need to make sure somebody who actually knows what they’re doing can measure them, because if you get a dimwit, they’re going to say it can do something it can’t.

Unless, of course, you can’t actually measure how many tokens a particular task can take with much accuracy, in which case every single AI token budget is bullshit. And each model does things differently depending on many different variables, some of them a result of the user, some of them a result of the AI labs themselves."
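To make the measurement point concrete, here is a minimal sketch of how repeated token counts for a single task might be summarized before anyone sets a budget. Everything in it is an assumption for illustration: the function name `summarize_token_samples`, the sample counts and the price per million tokens are all made up, and the 10-run minimum simply mirrors the number in the quote above.

```python
from statistics import mean, stdev


def summarize_token_samples(samples, price_per_million_tokens):
    """Summarize repeated token measurements for one task.

    samples: token counts from separate runs of the same task (hypothetical data).
    price_per_million_tokens: assumed blended price in USD, not a real vendor rate.
    """
    if len(samples) < 10:
        # The 10-run floor follows the quote; it is not an industry standard.
        raise ValueError("need at least 10 samples per task")
    avg = mean(samples)
    spread = stdev(samples)  # sample standard deviation across runs
    return {
        "runs": len(samples),
        "mean_tokens": avg,
        "stdev_tokens": spread,
        # Coefficient of variation: how noisy the task is relative to its mean.
        "cv": spread / avg,
        "mean_cost_usd": avg * price_per_million_tokens / 1_000_000,
    }


# Made-up token counts from 10 runs of the same prompt.
samples = [1200, 1350, 1100, 1900, 1250, 1300, 1150, 2400, 1280, 1220]
summary = summarize_token_samples(samples, price_per_million_tokens=10.0)
print(summary)
```

A high coefficient of variation in a summary like this would suggest that a single-number token budget for the task is unreliable, which is the crux of the argument in the quote.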

Cohere said it is releasing its Command A+ model under the Apache 2.0 license. Command A+ is a mixture of experts model now available on Hugging Face. The model is designed for efficiency.

The company recently said it will merge with Aleph Alpha. See: Cohere acquires Aleph Alpha to form US AI counterweight

According to Cohere:

“Progress in sovereign AI today depends on advancing three fronts simultaneously: performance, security, and cost. At Cohere, we are investing across all three — both in our models and in the domain-specific capabilities that power North.

That means improving reasoning, multimodal understanding, and coding performance, while ensuring models remain fit to run entirely within customer environments. The goal is not just stronger benchmarks, but systems that can support enterprise-wide transformation under real operational constraints.”

Overall, I like the Cohere move since the game for the company really revolves around North and sovereign AI. You’d have to be crazy to lock yourself into a proprietary model given how quickly things are changing with AI.

Intuit reported a better-than-expected third quarter with revenue of $8.6 billion, up 10%, and earnings of $11.09 a share. Non-GAAP earnings were $12.80 a share. Sasan Goodarzi, CEO of Intuit, said the company's "AI-driven expert platform strategy" and proprietary data are a winning combination.

As for the outlook, Intuit projected fourth quarter revenue growth of 11% to 12% with non-GAAP earnings of $3.56 a share to $3.62 a share. That outlook includes $300 million in restructuring charges. Intuit is cutting 17% of its workforce to become "a faster, leaner, more focused company."

I plan to follow Intuit's call and look for more on the AI strategy.

Alibaba CEO Eddie Wu penned his shareholder letter and had a bunch to say about AI. Here's a look:

"We expect the addressable market for companies like Alibaba that provide full-stack AI capabilities is poised to grow exponentially. Against this backdrop, Alibaba's AI has moved beyond the initial investment phase and entered full-scale commercialization."

Alibaba's cloud external revenue growth was 40% in its latest quarter.

"At the infrastructure layer, our proprietary T-Head AI chips have achieved production at scale, delivering high-performance compute capacity to our cloud infrastructure and Model-as-a-Service (MaaS) inference platform."

And then there's Qwen.

"In foundation models, we continued to accelerate our research and development pace, releasing three updates to the Qwen family within the past three months. Our latest generation large language model, Qwen3.7-Max, is specifically engineered for agents and expands the frontier of model capabilities, including in core competencies such as agentic coding and complex reasoning. Complementing the Qwen family, we achieved advancements in specialized models such as HappyOyster, a real-time interactive generative world model, and HappyHorse, a multimodal model for cross-modal understanding and generation."

Here's a look at Qwen 3.7 Max.

Alibaba AI efforts

Salesforce announced a bevy of Informatica updates and integrated the data platform more tightly with its core properties. Here’s what’s available, planned and launching in 2026.

  • Headless Data Management and Headless CLAIRE — Generally Available, Spring 2026
  • Agentic Integration — Q4 2026
  • Data Quality Agent — Generally Available, Spring 2026
  • Metadata Enrichment Agent — Q4 2026
  • Agentic Multidomain MDM and Data Steward Agent — Q4 2026
  • Data 360 Connector and Scanner — Generally Available, Spring 2026
  • MDM Integration with Data 360 — Q3 2026
  • CLAIRE in Slack — In Preview Now — Q4 2026
  • Agent Fabric Context Catalog — Generally Available, Spring 2026