Team weByte: Detection of deception by speech signal

Tuesday, April 28, 2020

Detection of deception by speech signal

We expect to detect stress in speech by analyzing the change in microtremor frequency of the speaker’s voice.
To do this, we found a program that uses Empirical Mode Decomposition (EMD), a method that has been shown to be effective at detecting stress in a person’s voice.

EMD Process

Empirical Mode Decomposition (EMD) decomposes the original signal into a finite number of intrinsic mode functions (IMFs).
IMFs are time-varying mono-component (single frequency) functions. The signal is decomposed into IMFs in such a manner that the highest frequency component of each event in the signal is captured by the first IMF.

An IMF should satisfy two conditions:

The upper and lower envelopes must be symmetric about zero, so that their mean is zero at every point;
The number of zero-crossings and the number of extrema must be equal or differ by at most one.
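
As a rough illustration, the counting condition can be checked directly on a sampled signal. This is a minimal sketch, not the program we are using: the function names are our own, and the envelope condition is left out because it needs interpolated envelopes.

```python
import math


def count_zero_crossings(x):
    """Count sign changes between consecutive non-zero samples."""
    signs = [1 if v > 0 else -1 for v in x if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def count_extrema(x):
    """Count strict local maxima and minima among the interior samples."""
    count = 0
    for prev, cur, nxt in zip(x, x[1:], x[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count


def satisfies_count_condition(x):
    """True if zero-crossings and extrema are equal or differ by one."""
    return abs(count_zero_crossings(x) - count_extrema(x)) <= 1


# A 5 Hz sine sampled at 1000 Hz for one second oscillates around zero.
sine = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]

# The same sine shifted upward never crosses zero, so it fails the check.
shifted = [v + 2.0 for v in sine]

print(satisfies_count_condition(sine))     # True
print(satisfies_count_condition(shifted))  # False
```

The shifted sine shows why a signal with a large offset or trend cannot be an IMF on its own: it has many extrema but no zero-crossings.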

Once the decomposition is complete, a real-world signal x[k] can be written as:

x[k] = \sum_{i=1}^{N} c_i[k] + r[k]

Where:

x[k] = original signal

N = number of IMFs

c_i[k] = set of IMFs

r[k] = trend within the data (also referred to as the last IMF or residual)
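
EMD is additive: summing the IMFs c_i[k] and the trend r[k] sample by sample gives back the original signal. The sketch below shows this with made-up toy values. The `reconstruct` helper and the numbers are ours, for illustration only.

```python
def reconstruct(imfs, residual):
    """Add every IMF to the residual, sample by sample."""
    signal = list(residual)
    for imf in imfs:
        if len(imf) != len(signal):
            raise ValueError("all components must have the same length")
        for k, value in enumerate(imf):
            signal[k] += value
    return signal


# Toy components: a fast oscillation, a slower one, and a constant trend.
toy_imfs = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, 0.5, -0.5, -0.5],
]
toy_residual = [2.0, 2.0, 2.0, 2.0]

rebuilt = reconstruct(toy_imfs, toy_residual)
print(rebuilt)  # [3.5, 1.5, 2.5, 0.5]
```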

Detecting stress induced signals

The second-to-last IMF is taken as the microtremor frequency, unless there are fewer than 3 IMFs in total, in which case the last IMF is used.
If the tremor frequency lies in the range of 8-12 Hz, it is considered a stress response, while a frequency outside this range is considered a non-stress response.
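
The selection and threshold rules can be sketched as follows. Several things here are assumptions of ours rather than part of the program we found: the function names, the use of the zero-crossing rate to estimate the frequency of an IMF, and the reading that the 8-12 Hz band is the one that marks a stress response.

```python
import math


def select_tremor_imf(imfs):
    """Pick the second-to-last IMF, or the last one if fewer than 3 exist."""
    if not imfs:
        raise ValueError("at least one IMF is required")
    return imfs[-2] if len(imfs) >= 3 else imfs[-1]


def estimate_frequency_hz(x, sample_rate_hz):
    """Estimate frequency from the zero-crossing rate (an assumed method).

    A pure oscillation crosses zero twice per cycle, so the frequency
    is roughly crossings / (2 * duration).
    """
    signs = [1 if v > 0 else -1 for v in x if v != 0]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    duration_s = len(x) / sample_rate_hz
    return crossings / (2.0 * duration_s)


def is_stress_response(freq_hz, low_hz=8.0, high_hz=12.0):
    """Assumed rule: a tremor frequency inside the band counts as stress."""
    return low_hz <= freq_hz <= high_hz


def tone(freq_hz, sample_rate_hz=1000, seconds=2):
    """Synthetic test component standing in for a real IMF."""
    n = sample_rate_hz * seconds
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz + 0.5)
            for t in range(n)]


# Hypothetical decomposition, ordered from highest to lowest frequency.
example_imfs = [tone(200), tone(10), tone(3)]

tremor = select_tremor_imf(example_imfs)
tremor_hz = estimate_frequency_hz(tremor, sample_rate_hz=1000)
print(round(tremor_hz, 1), is_stress_response(tremor_hz))
```

On real speech the IMFs are not pure tones, so a more robust frequency estimate (for example, the peak of the spectrum) may be needed in place of the zero-crossing rate.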
