Researchers find a way to make photos and muted videos ‘speak’ – here’s what it could mean for your privacy

Capturing audio from a still image may feel like something out of a sci-fi novel, but one scientist has actually devised a way to do it, with the helping hand of AI.

A team led by Kevin Fu, professor of electrical and computer engineering and computer science at Northeastern University, has created a machine learning tool called Side Eye that can read into images to an extraordinary degree.

By applying Side Eye to a still image, they can determine the gender of a speaker in the room where the photo was taken, as well as the words they spoke, according to TechXplore. They can also apply the tool to muted videos.

An AI-powered privacy nightmare?

"Imagine someone is doing a TikTok video and they mute it and dub music," Fu told the publicaton. "Have you ever been curious about what they're really saying? Was it 'Watermelon watermelon' or 'Here's my password?' Was somebody speaking behind them? You can actually pick up what is being spoken off camera."

The machine learning-powered Side Eye exploits the image stabilization technology used in almost all smartphone cameras.

Cameras built into smartphones have springs that suspend the lens in liquid, so photos don’t come out blurry or out of focus because of somebody’s shaky grip. Sensors and an electromagnet combine to push the lens in the opposite direction to whatever shakiness is being applied, stabilizing the image.

When somebody speaks near the camera lens while the photo is being taken, the sound creates tiny vibrations in the springs and bends the light in a subtle way. Extracting the sound frequencies from these vibrations would ordinarily be near-impossible, but it is made far simpler by the rolling shutter method of photography most cameras use.

"The way cameras work today to reduce cost basically is they don't scan all pixels of an image simultaneously – they do it one row at a time," Fu added. "[That happens] hundreds of thousands of times in a single photo. What this basically means is you're able to amplify by over a thousand times how much frequency information you can get, basically the granularity of the audio."

Side Eye itself is still in a very basic form and requires far more training data to refine and perfect. Should a more advanced form of the system fall into the wrong hands, however, it could pose a cybersecurity nightmare for many.

But there are positive implications for the technology too, especially if a far more advanced form of Side Eye were used as a kind of digital evidence by those working to investigate crime.
