Film-Tech Forum ARCHIVE: Alexa scientists claim audio watermarking technique with nearly 100% detection accuracy

Topic: Alexa scientists claim audio watermarking technique with nearly 100% detection accuracy

Frank Cox
Film God

Posts: 2234
From: Melville Saskatchewan Canada
Registered: Apr 2011

posted 03-31-2019 01:09 PM

Alexa scientists claim audio watermarking technique with nearly 100% detection accuracy

quote:
Ever hear (no pun intended) of audio watermarking? It’s the process of adding distinctive sound patterns identifiable to PCs, and it’s a major way web video hosts, set-top boxes, and media players spot copyrighted tracks. But watermarking schemes aren’t particularly reliable in noisy environments, like when the audio in question is broadcast over a loudspeaker. The resulting noise and interference (referred to in academic literature as the “second-screen” problem) severely distort watermarks and introduce delays that detectors often struggle to reconcile.

Researchers at Amazon, though, believe they’ve pioneered a novel workaround, which they describe in a paper newly published on the preprint server arXiv (“Audio Watermarking over the Air with Modulated Self-Correlation”) and an accompanying blog post. The team claims their method, which they’ll detail at the International Conference on Acoustics, Speech, and Signal Processing in May, can detect watermarks added to about two seconds of audio with “almost perfect accuracy,” even when the distance between the speaker and detector is greater than 20 feet.

Better still? Unlike traditional acoustic fingerprinting methods, which require storing a separate fingerprint for each instance and have a computational complexity that’s proportional to the fingerprint database, the researchers’ approach has a constant complexity, which they say makes it ideally suited for low-power devices like Bluetooth headsets.

“Our algorithm could complement the acoustic-fingerprinting technology that currently prevents Alexa from erroneously waking when she hears media mentions of her name,” wrote Yuan-yen Tai, a research scientist in Amazon’s Alexa Speech group and coauthor of the paper. “We also envision that audio watermarking could improve the performance of Alexa’s automatic-speech-recognition system. Audio content that Alexa plays — music, audiobooks, podcasts, radio broadcasts, movies — could be watermarked on the fly, so that Alexa-enabled devices can better gauge room reverberation and filter out echoes.”

Amazon watermarking

So how’s it work? As Tai explains, the model employs a “spread-spectrum” technique in which watermark energy is spread across time and frequency, rendering it inaudible to human ears while robustifying it against postprocessing (like compression). And it generates watermarks from noise blocks of a fixed duration, each of which introduces its own distinct pattern to selected frequency components in the host audio signal.

Conventional detectors would compare the resulting sequence of noise blocks — the decoding key — with a reference copy. But Tai and colleagues take a different approach: Their algorithm embeds the noise pattern in the audio signal multiple times and compares it to itself. Because said signal passes through the same acoustic environment, Tai explains, instances of the pattern are distorted in similar ways, enabling them to be compared directly.

“The detector takes advantage of the distortion due to the acoustic channel, rather than combatting it,” he added.
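As a rough illustration of the "compare the pattern to itself" idea described above, here is a minimal Python sketch. It is not the Amazon team's code. It works in the time domain rather than on selected frequency components, and every name and number in it (`room_channel`, the echo taps, the 137-sample delay, the 0.5 watermark strength, the block length) is an assumption chosen for the demo. The strength in particular is far too high to be inaudible. The point is only that the detector never needs the original noise block: because every repetition passes through the same simulated room, neighbouring blocks of the received signal still match each other.

```python
import math
import random


def make_noise_block(length, seed):
    """Pseudo-random noise block that will be repeated as the watermark."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def embed_repeated(host, block, strength):
    """Add the same noise block end to end across the whole host signal."""
    n = len(block)
    return [h + strength * block[i % n] for i, h in enumerate(host)]


def room_channel(signal, taps, delay):
    """Toy acoustic path (assumption): a fixed delay plus a few echoes."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            j = n - delay - k
            if j >= 0:
                acc += tap * signal[j]
        out[n] = acc
    return out


def normalized_corr(a, b):
    """Correlation coefficient of two equal-length blocks (no mean removal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def self_correlation_score(received, block_len):
    """Average correlation between each block and the block that follows it."""
    blocks = [received[i:i + block_len]
              for i in range(0, len(received) - block_len + 1, block_len)]
    pairs = list(zip(blocks, blocks[1:]))
    return sum(normalized_corr(a, b) for a, b in pairs) / len(pairs)


BLOCK_LEN = 2000
NUM_BLOCKS = 8
host_rng = random.Random(1)
host = [host_rng.gauss(0.0, 1.0) for _ in range(BLOCK_LEN * NUM_BLOCKS)]
block = make_noise_block(BLOCK_LEN, seed=2)
taps = [1.0, 0.0, 0.5, 0.0, 0.0, 0.3]  # direct path plus two echoes

marked = room_channel(embed_repeated(host, block, 0.5), taps, delay=137)
clean = room_channel(host, taps, delay=137)

score_marked = self_correlation_score(marked, BLOCK_LEN)
score_clean = self_correlation_score(clean, BLOCK_LEN)
print("watermarked:", round(score_marked, 3))
print("unmarked:   ", round(score_clean, 3))
```

The detector here only slices the received audio into fixed-length blocks and correlates neighbours, so its cost does not depend on how many different watermarked tracks exist. The unknown delay does not matter either, since a repeating pattern still repeats after being shifted.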

It’s not a perfect solution: it necessitates shorter noise patterns, which correlate with lower detection accuracy, and when the target audio includes music, the rhythms sometimes too closely mimic the repeating noise pattern. But the team says both of these can be largely mitigated by repeating the noise block pattern and randomly inverting some of the blocks, decreasing the amplitude of the block where it would normally increase and vice versa.

The decoding key, then, becomes a sequence of binary values instead of noise blocks (a sequence of floating-point values), indicating whether a given noise block is inverted or not. (They’re flipped at the detector stage, at which point they’re compared with the noise block patterns.) In experiments, the team says their algorithm’s performance yielded almost 100 percent detection accuracy with watermarks 1.6 seconds in length.
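The sign-flipping trick can be sketched the same way. The Python below is a hedged illustration, not the paper's implementation: it stays in the time domain, skips the acoustic channel, and uses a fixed nine-value key instead of a random one so the result is reproducible. `KEY`, `modulated_score`, the sine-wave "music loop" and all amplitudes are assumptions made up for the demo. The key is a list of +1/-1 values saying whether each noise block was inverted, and the detector multiplies each block-to-block correlation by the matching key signs before averaging.

```python
import math
import random


def normalized_corr(a, b):
    """Correlation coefficient of two equal-length blocks (no mean removal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def embed_with_key(host, block, key, strength):
    """Add the noise block once per key entry, inverted where the key is -1."""
    n = len(block)
    out = list(host)
    for i, sign in enumerate(key):
        for j in range(n):
            out[i * n + j] += strength * sign * block[j]
    return out


def modulated_score(received, block_len, key):
    """Average neighbour correlation after undoing the key's sign flips."""
    blocks = [received[i * block_len:(i + 1) * block_len]
              for i in range(len(key))]
    total = 0.0
    for i in range(len(key) - 1):
        total += key[i] * key[i + 1] * normalized_corr(blocks[i], blocks[i + 1])
    return total / (len(key) - 1)


BLOCK_LEN = 2000
# Hypothetical binary key. Neighbouring products alternate +1, -1, ...
# so anything that simply repeats every block cancels out in the average.
KEY = [1, 1, -1, -1, 1, 1, -1, -1, 1]

# Stand-in for rhythmic music: a loop that repeats every block, plus noise.
music_rng = random.Random(7)
loop = [math.sqrt(2.0) * math.sin(2.0 * math.pi * 5 * j / BLOCK_LEN)
        for j in range(BLOCK_LEN)]
music = [loop[i % BLOCK_LEN] + music_rng.gauss(0.0, 0.5)
         for i in range(BLOCK_LEN * len(KEY))]

wm_rng = random.Random(11)
block = [wm_rng.gauss(0.0, 1.0) for _ in range(BLOCK_LEN)]
marked = embed_with_key(music, block, KEY, strength=0.5)

# A key of all +1 is the plain, unmodulated self-correlation detector.
plain_on_music = modulated_score(music, BLOCK_LEN, [1] * len(KEY))
mod_on_music = modulated_score(music, BLOCK_LEN, KEY)
mod_on_marked = modulated_score(marked, BLOCK_LEN, KEY)
print("plain detector, unmarked music:    ", round(plain_on_music, 3))
print("modulated detector, unmarked music:", round(mod_on_music, 3))
print("modulated detector, marked music:  ", round(mod_on_marked, 3))
```

In this toy setup the plain detector is fooled by the repeating loop, while the keyed detector scores the unmarked music near zero and still picks up the watermark. The key is just a short list of signs, which matches the article's point that it is a sequence of binary values and not a stored copy of the noise itself.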

Mitchell Dvoskin
Phenomenal Film Handler

Posts: 1869
From: West Milford, NJ, USA
Registered: Jan 2001

posted 03-31-2019 04:03 PM

> the model employs a “spread-spectrum” technique in which watermark energy is spread across time and frequency, rendering it inaudible to human ears while robustifying it against postprocessing (like compression).

That remains to be heard in the real world.

Back in the 1980s (I think), the recording industry (RIAA) came up with a plan to copy-protect audio recordings (analog and digital) by putting a tone notch in the recording, up above the range of human hearing. The idea was that they would push through Congress a law requiring all recording equipment (analog and digital) to listen for this notch and, if it was present, treat the input as copyrighted. Unfortunately, they forgot that this produces harmonics up and down the spectrum, including within the range of human hearing. They claimed that the harmonic distortion was so minor that nobody would be able to hear it. Stereophile Magazine challenged that notion and set up a test with the RIAA. In every test, the magazine's testers were able to pick out, from otherwise identical recordings, which version had the notch and which ones didn't.

Bill Brandenstein
Master Film Handler

Posts: 413
From: Santa Clarita, CA
Registered: Jul 2013

posted 04-02-2019 06:16 PM

Being of the persuasion that no audio watermark is an acceptable "distortion" of the content, but being a musician who believes artists (particularly smaller acts) often don't get compensated fairly for their work, this presents a bit of a dilemma, doesn't it?

But what of it? So who is going to pursue watermark violations? Will violators end up going to jail or paying fines or something? I just don't see how this is going to benefit either distributors or artists merely because the technology exists. And watermarks might indicate a source or being in possession of stolen property, but what if someone is found to possess watermarked audio they're not entitled to, but got it third- or fourth-hand?

Martin Brooks
Jedi Master Film Handler

Posts: 900
From: Forest Hills, NY, USA
Registered: May 2002

posted 04-02-2019 10:18 PM

It will enable rights organizations to further automate searches for rights violations and then either force abusers to pay royalties or issue takedown notices. This isn't about end users as much as it is about websites and streamers offering content they have no rights to.

It will have flaws (like when servers are in Russia or China), but it's a step. Hopefully, it really is undetectable.

Jim Cassedy
Phenomenal Film Handler

Posts: 1661
From: San Francisco, CA
Registered: Dec 2006

posted 04-02-2019 11:30 PM

I suppose this is legit... but I'm always a bit skeptical when I read stories like this around April 1st, and that use terms like "robustifying". [uhoh]

Bill Brandenstein
Master Film Handler

Posts: 413
From: Santa Clarita, CA
Registered: Jul 2013

posted 04-03-2019 07:19 PM

I read it too fast and missed "robustifying!"

Martin, that's really similar to what YouTube does now, which requires no watermarking to be effective, just an audio recognition database. That's also similar to the Shazam app, which is really remarkable in what it "knows" and in the poor environments in which it's effective.

So I guess the difference is that watermarking would help identify which source got ripped off.

All times are Central (GMT -6:00)