Happy Thursday and welcome to Patent Drop!
Talk is cheap. Today, Salesforce’s patent for a multi-talented AI assistant signals that language models need to be more than just chatbots to be useful. Plus: Ford wants to save you money at the plug, and Disney wants to time ad breaks better.
Let’s get into it.
Salesforce’s Triple Threat
Salesforce doesn’t want its AI models to be all talk.
The company is seeking to patent a system for “multi-modal language models.” For reference, a multi-modal AI model is one that can handle multiple kinds of input data, such as text, audio, video and images, rather than just one.
“Existing systems may be designed for a single modality in addition to a text prompt (e.g., images), therefore there is a need for systems and methods for multi-modal language models for responding to instructions,” Salesforce said in the filing.
To achieve this, each type of data would get its own “specialized multimodal encoder” that turns it into something that the language model can comprehend. For example, video input would go through a video encoder, while audio input would go through an audio encoder.
These processed inputs, as well as whatever text prompt or request that the user sent alongside the initial data, are fed to a neural network-based language model. The model then interprets that data and generates an output, which could include any kind of data, not just text.
For example, a user could submit a video of a speech and request that this system summarize the main points. Or, a user could send in a script and ask that it be turned into an audio file.
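For the technically curious, here is a minimal Python sketch of the routing idea described above: each attachment goes to an encoder chosen by its modality, and the results are bundled with the text prompt for the language model. Everything named here (`encode_video`, `encode_audio`, `ENCODERS`, `ModelInput`, `build_model_input`) is a hypothetical stand-in, and the placeholder encoders simply count bytes. The filing does not publish code, and real encoders would be trained neural networks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An "embedding" here is just a list of numbers the language model could read.
Embedding = List[float]


def encode_video(data: bytes) -> Embedding:
    # Placeholder (assumption): a real video encoder would be a trained network.
    return [float(len(data)), 1.0]


def encode_audio(data: bytes) -> Embedding:
    # Placeholder (assumption): a real audio encoder would be a trained network.
    return [float(len(data)), 2.0]


# One specialized encoder per modality, looked up by name.
ENCODERS: Dict[str, Callable[[bytes], Embedding]] = {
    "video": encode_video,
    "audio": encode_audio,
}


@dataclass
class ModelInput:
    prompt: str
    embeddings: List[Embedding]


def build_model_input(prompt: str, attachments: Dict[str, bytes]) -> ModelInput:
    """Route each attachment to the encoder for its modality."""
    embeddings = []
    for modality, data in attachments.items():
        if modality not in ENCODERS:
            raise ValueError(f"no encoder for modality: {modality}")
        embeddings.append(ENCODERS[modality](data))
    # The prompt and the encoded attachments travel to the model together.
    return ModelInput(prompt=prompt, embeddings=embeddings)
```

Supporting a new kind of data in this sketch means registering one more encoder in `ENCODERS`, which loosely mirrors the "one encoder per data type" design the filing describes.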

Making an AI model that can understand multiple kinds of data is a formidable task. These models tend to be data- and compute-intensive, necessitate tedious labeling, and generally require “more dimensions to understand context,” said Bob Rogers, Ph.D., the co-founder of BeeKeeperAI and CEO of Oii.ai. “I don’t think it’s easy.”
But the potential payoff is huge, said Rogers. For enterprises, a model with a deeper understanding of context can potentially automate the most tedious tasks and boost productivity for any industry, from finance to logistics to creative.
Multimodal AI is a focus of many of the major AI firms. OpenAI debuted multimodal capabilities in GPT-4o in May, with a model that can “reason across audio, vision, and text in real time,” said Rogers. Google’s Gemini language model also has multimodal capabilities, and Microsoft recently unveiled multimodal skills for Copilot.
It stands to reason that Salesforce is following suit, said Rogers, especially with its recent focus on AI agents. The company’s Dreamforce event largely focused on the promise of customizable agents that can operate across enterprises, with CEO Marc Benioff calling these models the “third wave of AI.”
But if Salesforce wants to keep up with the rest of tech and make something actually useful, these models may need to be more than just chatbots, said Rogers. “Otherwise, it would just be a bunch of iffy text chat bot interactions that are automating my work by giving me advice, when what I really want it to do is create,” he said.
“I think that’s where they’re headed, really letting [agents] automate all the steps,” he added. “But for the agents to do everything, that agent has to be multimodal.”
Ford’s Price Check
Ford thinks you should know how much you’re really paying at the plug.
The automaker filed a patent application for “true electric vehicle charging price evaluation and optimization.” Ford’s tech essentially monitors a vehicle’s battery state to optimize the time and cost of EV charging for the driver.
Public charging stations often “do not have consistent pricing and charging services,” Ford said, and costs can vary significantly due to factors like demand and load on the grid. Additionally, a crowd at one station can lead to longer charging times.
To pick a charging station, Ford’s system monitors an EV battery’s state of charge and temperature, which affect how quickly a battery can charge without risk of damage. Based on these conditions, the system calculates a “charge acceptance rate,” which indicates how quickly the battery can be charged safely.
Ford’s system then searches for nearby charging stations within a predefined range to find options that can meet the battery’s current needs, as well as the driver’s preferences. These options, plus information such as “price per mile” of charging, are displayed in a vehicle’s user interface. The system may also suggest stations that offer quicker charging at higher prices.
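To make the “price per mile” idea concrete, here is a rough Python sketch. It assumes a station charges a flat price per kilowatt-hour and that charging power is capped by whichever is lower: the station’s output or the battery’s charge acceptance rate. The `Station` fields, the function names and the sample numbers are all illustrative assumptions, not details from Ford’s filing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Station:
    name: str
    price_per_kwh: float  # dollars per kWh (assumed flat pricing)
    max_power_kw: float   # fastest rate the station can deliver


def price_per_mile(station: Station, miles_per_kwh: float) -> float:
    """Dollars paid per mile of range added at this station."""
    return station.price_per_kwh / miles_per_kwh


def charge_minutes(station: Station, energy_kwh: float, acceptance_kw: float) -> float:
    """Minutes to add energy_kwh, limited by the battery's acceptance rate."""
    power_kw = min(station.max_power_kw, acceptance_kw)
    return 60.0 * energy_kwh / power_kw


def rank_by_price(stations: List[Station], miles_per_kwh: float) -> List[Station]:
    """Cheapest price per mile first."""
    return sorted(stations, key=lambda s: price_per_mile(s, miles_per_kwh))


if __name__ == "__main__":
    stations = [Station("Fast", 0.48, 150.0), Station("Slow", 0.30, 50.0)]
    for s in rank_by_price(stations, miles_per_kwh=3.0):
        cost = price_per_mile(s, 3.0)
        minutes = charge_minutes(s, energy_kwh=30.0, acceptance_kw=100.0)
        print(f"{s.name}: ${cost:.2f}/mile, {minutes:.0f} min")
```

With these made-up numbers, the slower station is cheaper per mile while the faster one halves the wait, which is the kind of trade-off the filing says the driver would see on the dashboard.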

Charging infrastructure remains a barrier in EV adoption, said Matt McCaffree, general manager of EV charging at FLASH Parking. Put simply, charging isn’t growing as fast as adoption, he said. “If the industry doesn’t start matching that growth rate, then there’s a risk of frustrations and a lack of faith across the consumer base,” he said.
And Ford has taken on this problem before: Some of the company’s previous IP includes “smart charge scheduling,” grid demand prediction, and modular in-home charging tech. This filing, however, presents a more “sophisticated” method of charging tracking than past patents have, said McCaffree. And as the market develops, it’s likely that patent applications will become more advanced, too.
Charging aside, the EV market may be facing an abundance of growing pains. Growth in demand has slowed from its initial rapid climb, and automakers have been forced to adjust their expectations – Ford included.
In late October, Ford announced that it will pause production of the F-150 Lightning pickup truck, its signature EV, through the end of the year. The automaker’s CEO Jim Farley warned of a “slow uptake of EVs” during its recent earnings call.
Though these seem like bad omens, McCaffree said, they may be a sign that EV adoption is simply “beyond its inflection point.” The slowdown may look “dramatic,” but as adoption continues to move forward, sales slumps are “naturally going to happen as you hit higher saturation in any market,” he said.
Disney’s Ad Break
Disney believes that AI can keep you watching during commercial breaks.
The entertainment company filed a patent application for “artificially intelligent ad-break prediction.” As the title suggests, Disney’s tech uses AI to help place ads based on the content itself.
“Ads can be a double-edged sword for media content distributors and consumers alike,” Disney said in the filing. “Too many, or poorly placed ads can be significantly off-putting to the content consumption experience, thereby potentially driving existing subscribers away.”
Disney’s tech uses several machine learning models to analyze a piece of content and determine whether it’s “ad-slugged,” meaning it contains pre-existing segments where ads can be placed, or “seamless,” meaning it lacks those pre-defined slots.
The system locates black video frames, typically used for breaks and transitions within content. For ad-slugged content, the system picks out both black video frames and silent frames, or those without audio. For seamless content, it looks for “blackness transitions” between frames, as well as audio continuity, to place an ad with as little interruption as possible.
In both scenarios, a “probability score” is calculated that predicts whether or not a certain spot is a good fit for an ad break. This tech allows Disney to automate the ad placement process in streaming without turning off users.
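As a rough illustration of the black-frame-plus-silence idea, the Python sketch below scores each frame from its average brightness and audio level and flags the high scorers as possible ad breaks. Disney’s filing describes machine learning models producing the probability score. The hand-written formula here is a toy stand-in for those models, and `break_score`, `candidate_breaks`, the scale values and the threshold are all assumptions made for the example.

```python
import math
from typing import List


def break_score(luma: float, audio_level: float,
                luma_scale: float = 16.0, audio_scale: float = 0.05) -> float:
    """Toy score in (0, 1]: close to 1 for dark, quiet frames.

    luma is average brightness (0 = black, 255 = white);
    audio_level is loudness from 0.0 (silent) to 1.0.
    """
    darkness = math.exp(-luma / luma_scale)
    quietness = math.exp(-audio_level / audio_scale)
    return darkness * quietness


def candidate_breaks(lumas: List[float], audio_levels: List[float],
                     threshold: float = 0.5) -> List[int]:
    """Indices of frames whose score meets the threshold."""
    return [
        i
        for i, (luma, audio) in enumerate(zip(lumas, audio_levels))
        if break_score(luma, audio) >= threshold
    ]


if __name__ == "__main__":
    lumas = [120.0, 2.0, 0.0, 90.0]
    audio = [0.4, 0.0, 0.01, 0.0]
    print(candidate_breaks(lumas, audio))
```

In this sample, only the two dark, near-silent frames in the middle clear the threshold. A bright frame is rejected even when it is silent, so a quiet scene alone would not trigger an ad.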

If this invention feels familiar, it’s because it is: Roku filed a similar patent application that uses reinforcement learning to place ads based on a “user state,” such as the content they’re currently watching and their tenure on the platform, as well as tech that places ads based on “scene emotion.”
These filings suggest that AI may be a growing part of entertainment firms’ ad strategies as the tech continues its rapid growth. This tech could provide better return on investment for advertisers as many streamers continue to navigate an increasingly crowded ad-supported streaming market.
Tech like this makes sense as Disney seeks to entrench AI throughout its core businesses. Earlier this month, the company created a strategic group to responsibly manage its approach to emerging tech, such as AI and mixed reality. The unit, called the Office of Technology Enablement, will be run by former Walt Disney Studios CTO Jamie Voris.
While this new unit will focus on responsible AI and tech adoption, figuring out where and when to use these models will require some delicacy. Though implementations like the one this patent lays out likely won’t ruffle many feathers, Hollywood’s excitement over AI has continued to cause friction between studios and unions over the past year.
Extra Drops
- Airbnb wants to put its best foot forward. The company filed a patent application for “machine learning generated ranking of user reviews.”
- Apple wants to make your routine a little easier. The company is seeking to patent an “intelligent assistant for home automation.”
- EA wants to know when you’re cheating. The game developer wants to patent a system for “detecting collusion in online games.”
What Else is New?
- AMD is laying off 1,000 employees, or roughly 4% of its workforce.
- Cisco reported its fourth straight quarter of declining revenue, though results surpassed analysts’ expectations.
Patent Drop is written by Nat Rubio-Licht. You can find them on Twitter @natrubio__.
Patent Drop is a publication of The Daily Upside. For any questions or comments, feel free to contact us at patentdrop@thedailyupside.com.