Digital sound-alikes

When it cannot be determined by human testing whether a recording is a synthetic simulation of some person's voice or an actual recording of that person's real voice, it is a '''digital sound-alike'''.
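The "cannot be determined by human testing" criterion can be made concrete with a blind listening test. Below is a minimal sketch of one ABX-style trial in Python; the file names and the <code>play</code> helper are placeholders for illustration only, not any standard tool.

<syntaxhighlight lang="python">
import random

# Placeholder file names: one genuine recording, one synthetic imitation.
REAL = "real_sample.wav"
FAKE = "synthetic_sample.wav"

def play(path):
    """Placeholder: a real test would play the audio file to the listener."""
    print(f"(playing {path})")

def abx_trial():
    """One ABX trial: A is real, B is fake, X is randomly one of the two.
    Returns True if the listener correctly identifies what X is."""
    x_is_real = random.choice([True, False])
    play(REAL)                          # A: known real voice
    play(FAKE)                          # B: known synthetic voice
    play(REAL if x_is_real else FAKE)   # X: the unknown sample
    answer = input("Is X the real voice? [y/n] ").strip().lower() == "y"
    return answer == x_is_real

def run_test(trials=20):
    correct = sum(abx_trial() for _ in range(trials))
    # Listeners who are merely guessing are right about half the time;
    # accuracy near 50 % means the synthesis qualifies as a digital sound-alike.
    print(f"{correct}/{trials} correct ({100 * correct / trials:.0f} %)")

if __name__ == "__main__":
    run_test()
</syntaxhighlight>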


As of '''2019''' Symantec research knows of 3 cases where digital sound-alike technology '''has been used for crimes'''.<ref name="WaPo2019">
{{cite web
|url= https://www.washingtonpost.com/technology/2019/09/04/an-artificial-intelligence-first-voice-mimicking-software-reportedly-used-major-theft/
|title= An artificial-intelligence first: Voice-mimicking software reportedly used in a major theft
|last= Harwell
|first= Drew
|date= 2019-09-04
|website= [[w:The Washington Post|The Washington Post]]
|access-date= 2019-09-08
}}
</ref>


Living people can defend¹ themselves against a digital sound-alike by denying the things the digital sound-alike says, if those statements are presented to the target, but dead people cannot. Digital sound-alikes offer criminals new disinformation attack vectors and wreak havoc on provability.


[[File:Spectrogram-19thC.png|thumb|right|640px|A [[w:spectrogram|spectrogram]] of a male voice saying 'nineteenth century']]
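A spectrogram like the one above can be computed from any speech recording with standard signal-processing tools. The following is a minimal sketch using SciPy and Matplotlib; the input file name is only a placeholder.

<syntaxhighlight lang="python">
import matplotlib.pyplot as plt
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

# Placeholder input: any mono 16-bit WAV recording of speech.
sample_rate, samples = wavfile.read("nineteenth_century.wav")
if samples.ndim > 1:
    samples = samples[:, 0]            # keep one channel if the file is stereo

# Short-time Fourier analysis: frequency bins over time frames.
frequencies, times, power = spectrogram(samples, fs=sample_rate, nperseg=1024)

# Plot on a decibel scale, similar to the image above.
plt.pcolormesh(times, frequencies, 10 * np.log10(power + 1e-12), shading="gouraud")
plt.xlabel("Time [s]")
plt.ylabel("Frequency [Hz]")
plt.colorbar(label="Power [dB]")
plt.title("Spectrogram")
plt.show()
</syntaxhighlight>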
----
== Timeline of digital sound-alikes ==
* In '''2016''' [[w:Adobe Inc.]]'s [[w:Adobe Voco|Voco]], an unreleased prototype, was publicly demonstrated. ([https://www.youtube.com/watch?v=I3l4XLZ59iw&t=5s View and listen to the Adobe MAX 2016 presentation of Voco])


* In '''2016''' [[w:DeepMind]]'s [[w:WaveNet]], owned by [[w:Google]], also demonstrated the ability to steal people's voices.


Neither of these programs is available to the masses at large according to the "official truth", but, as is known, software has a high tendency to get pirated very quickly.
* In '''2018''' at the [[w:Conference on Neural Information Processing Systems|Conference on Neural Information Processing Systems]] the work [http://papers.nips.cc/paper/7700-transfer-learning-from-speaker-verification-to-multispeaker-text-to-speech-synthesis 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis'] ([https://arxiv.org/abs/1806.04558 at arXiv.org]) was presented. The pre-trained model is able to steal a voice from a sample of only '''5 seconds''' with almost convincing results (a sketch of its three-stage pipeline follows this list).
** Listen to the [https://google.github.io/tacotron/publications/speaker_adaptation/ 'Audio samples from "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis"']
** View the [https://www.youtube.com/watch?v=0sR1rU3gLzQ video summary of the work at YouTube: 'This AI Clones Your Voice After Listening for 5 Seconds']
 
* As of '''2019''' Symantec research knows of 3 cases where digital sound-alike technology '''has been used for crimes'''.<ref name="WaPo2019" />
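The 2018 transfer-learning system listed above works in three stages: a speaker encoder condenses a short reference clip into a fixed-size speaker embedding, a synthesizer generates a mel spectrogram for arbitrary text conditioned on that embedding, and a vocoder turns the spectrogram into a waveform. The following is a minimal sketch of that pipeline; the <code>SpeakerEncoder</code>, <code>Synthesizer</code> and <code>Vocoder</code> classes are hypothetical stand-ins, not the authors' published code.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical stand-ins for the three pre-trained networks described in the
# paper; a real implementation would load learned weights instead of zeros.
class SpeakerEncoder:
    def embed(self, reference_audio: np.ndarray) -> np.ndarray:
        """Map a ~5-second reference clip to a fixed-size speaker embedding."""
        return np.zeros(256)

class Synthesizer:
    def text_to_mel(self, text: str, speaker_embedding: np.ndarray) -> np.ndarray:
        """Generate a mel spectrogram of the text in the target speaker's voice."""
        return np.zeros((80, 10 * len(text)))

class Vocoder:
    def mel_to_waveform(self, mel: np.ndarray) -> np.ndarray:
        """Convert the mel spectrogram into an audio waveform."""
        return np.zeros(mel.shape[1] * 256)

def clone_voice(reference_audio: np.ndarray, text: str) -> np.ndarray:
    """Three-stage pipeline: encode the speaker, synthesize, then vocode."""
    embedding = SpeakerEncoder().embed(reference_audio)
    mel = Synthesizer().text_to_mel(text, embedding)
    return Vocoder().mel_to_waveform(mel)

if __name__ == "__main__":
    five_second_clip = np.zeros(5 * 16000)   # stand-in for a 5-second sample
    waveform = clone_voice(five_second_clip, "nineteenth century")
    print(f"synthesized {waveform.size} samples")
</syntaxhighlight>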


----
== Examples of speech synthesis software not quite able to fool a human yet ==
There are other contenders aiming to create digital sound-alikes, though, as of 2019, their speech synthesis in most use scenarios does not yet fool a human, because the results contain tell-tale signs that give them away as a speech synthesizer.
* '''[https://cstr-edinburgh.github.io/merlin/ Merlin]''', a [[w:neural network]] based speech synthesis system by the Centre for Speech Technology Research at the [[w:University of Edinburgh]]


== Documented digital sound-alike attacks ==
* [https://www.washingtonpost.com/technology/2019/09/04/an-artificial-intelligence-first-voice-mimicking-software-reportedly-used-major-theft/?noredirect=on 'An artificial-intelligence first: Voice-mimicking software reportedly used in a major theft'], a 2019 Washington Post article
 
----
 
== Example of a hypothetical digital sound-alike attack ==
A very simple example of a digital sound-alike attack is as follows:  




Thus it is high time to act and to '''[[Law proposals to ban covert modeling|criminalize the covert modeling of human appearance and voice!]]'''