Initial commit
jroitgrund committed Oct 26, 2014
0 parents commit d96c09d
Showing 95 changed files with 548,186 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
*.pyc
*.pyo
31 changes: 31 additions & 0 deletions VAD/AudioInput.conf
@@ -0,0 +1,31 @@
/*
This configuration file adds live audio input via PortAudio.
Framing, windowing and the FFT are configured separately in IS13_ComParE_specbase.conf.
*/



;;;;; this list will be appended to the list in the main config file
[componentInstances:cComponentManager]

// instance[wave].type=cWaveSource
instance[wave].type=cPortaudioSource


////////////////////////////////////////////////////////////////////////////////////////////////
// ~~~~~~~~~ Begin of configuration ~~~~~~~~~~~~~~~~~ //////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////////////////////


//---------------- Audio Recording --------------------------//
[wave:cPortaudioSource]
writer.dmLevel=wave
; the audio buffer size must be greater than the buffer size of the framer to avoid 100% CPU usage
audioBuffersize_sec = 0.1
buffersize_sec=1.0
channels=\cm[channels{1}:number of recording channels for live audio]
sampleRate=\cm[samplerate{16000}: sample-rate for live audio recording]
listDevices=\cm[listdevices{0}: value 1 = list available portaudio audio devices]
device=\cm[device{-1}: set portaudio audio device, -1 = default device]
monoMixdown=1


56 changes: 56 additions & 0 deletions VAD/IS13_ComParE_specbase.conf
@@ -0,0 +1,56 @@
///////////////////////////////////////////////////////////////////////////////////////
///////// > openSMILE configuration file for ComParE < //////////////////
///////// //////////////////
///////// * written 2012 by Florian Eyben * //////////////////
///////// //////////////////
///////// (c) 2012 by Florian Eyben //////////////////
///////// All rights reserved //////////////////
///////////////////////////////////////////////////////////////////////////////////////



;;;;;;; component list ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

[componentInstances:cComponentManager]
instance[dataMemory].type=cDataMemory
printLevelStats=0


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

[componentInstances:cComponentManager]
instance[frame25].type=cFramer
;instance[pe25].type=cVectorPreemphasis
instance[win25].type=cWindower
instance[fft25].type=cTransformFFT
instance[fftmp25].type=cFFTmagphase

[frame25:cFramer]
reader.dmLevel=wave
writer.dmLevel=frame25
; note: frames are 20 ms long with a 10 ms step, despite the "25" in the instance and level names
frameSize = 0.020
frameStep = 0.010
frameCenterSpecial = left

[pe25:cVectorPreemphasis]
reader.dmLevel=frame25
writer.dmLevel=frame25pe
k=0.97
de=0

[win25:cWindower]
reader.dmLevel=frame25
writer.dmLevel=winH25
winFunc=hamming

[fft25:cTransformFFT]
reader.dmLevel=winH25
writer.dmLevel=fftcH25

[fftmp25:cFFTmagphase]
reader.dmLevel=fftcH25
writer.dmLevel=fftmagH25
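
The framing and windowing stages configured above can be illustrated with a minimal Python sketch. It assumes 16 kHz input (the default sample rate in AudioInput.conf) and the textbook Hamming window; the function names are hypothetical, and openSMILE's own handling of frame alignment (frameCenterSpecial) and FFT zero-padding is not reproduced here.

```python
import math


def frame_signal(samples, sample_rate=16000, frame_size=0.020, frame_step=0.010):
    """Split samples into overlapping frames; sizes are in seconds, as in [frame25:cFramer]."""
    size = int(round(frame_size * sample_rate))  # 320 samples at 16 kHz
    step = int(round(frame_step * sample_rate))  # 160 samples at 16 kHz
    frames = []
    for start in range(0, len(samples) - size + 1, step):
        frames.append(samples[start:start + size])
    return frames


def hamming(n):
    """Textbook Hamming window of length n (assumed, not taken from openSMILE)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


# One second of a synthetic 440 Hz tone stands in for real audio.
signal = [math.sin(2.0 * math.pi * 440.0 * t / 16000) for t in range(16000)]
frames = frame_signal(signal)
window = hamming(len(frames[0]))
windowed = [apply_window(f, window) for f in frames]
```

With these settings, consecutive frames overlap by half their length, so each sample (away from the edges) contributes to two frames.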




9 changes: 9 additions & 0 deletions VAD/README.txt
@@ -0,0 +1,9 @@
To run the VAD on a wave file:

sox my.wav -c 1 -s -2 input.wav
./SMILExtract -C Standalone.conf -I input.wav -csv vad.csv

The sox step converts my.wav to mono, signed 16-bit PCM. (Newer SoX releases spell the -s -2 options as -e signed-integer -b 16.)

The detected voice segments are written to output_segments/*.wav.
The raw VAD activations (-1 to +1) are written to vad.csv. The first column is a timestamp in seconds; the second column (after the comma) is the VAD activation.
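
As a rough illustration of how vad.csv could be consumed, the following Python sketch reads the two columns described above and reports the fraction of frames whose activation exceeds a cut-off. The function names, the cut-off of 0.0 and the sample rows are assumptions made for this example, not values taken from the repository.

```python
import csv
import io


def read_vad_csv(handle):
    """Return (timestamp_seconds, activation) pairs from a header-less CSV stream."""
    rows = []
    for record in csv.reader(handle):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        rows.append((float(record[0]), float(record[1])))
    return rows


def voiced_fraction(rows, threshold=0.0):
    """Fraction of frames whose activation is above the (assumed) threshold."""
    if not rows:
        return 0.0
    voiced = sum(1 for _, activation in rows if activation > threshold)
    return voiced / len(rows)


# Made-up sample data in the documented layout: timestamp,activation
sample = io.StringIO("0.00,-0.91\n0.01,-0.20\n0.02,0.35\n0.03,0.80\n")
rows = read_vad_csv(sample)
print(voiced_fraction(rows))
```

To read a real output file, pass open("vad.csv", newline="") instead of the in-memory sample.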


Binary file added VAD/SMILExtract
Binary file not shown.
49 changes: 49 additions & 0 deletions VAD/Standalone.conf
@@ -0,0 +1,49 @@

;;;;; all the components we require are listed here:
[componentInstances:cComponentManager]
instance[dataMemory].type=cDataMemory
instance[csvout].type=cCsvSink
////// Enable this to dump the detected turns as wave files (written under output_segments/, see fileBase in [turnOutp:cWaveSinkCut] below)
instance[turnOutp].type=cWaveSinkCut

// set printLevelStats=5 to see the feature names in each level for debugging problems with the feature config
printLevelStats=3
nThreads=1
execDebug=0



/*
*************************************************************
include configuration files
*************************************************************
*/

#### Audio input from a wave file (for live audio, see AudioInput.conf, which sets the sample rate and sound device)
\{WaveFileInput.conf}

####################### VAD configuration (choose one ...) ########################

;; The spectra for the VAD
\{IS13_ComParE_specbase.conf}

//////// the LSTM and GM based VAD from the SEMAINE system
\{lstmVAD.conf}

##################### some config sections in the main file ##########################

;; vad csv output
[csvout:cCsvSink]
reader.dmLevel = vad_VAD_voice
filename = \cm[csv{vad.csv}:vad csv output file]
printHeader = 0
timestamp = 1
number = 0
delimChar = ,

;;; debug outputs
[turnOutp:cWaveSinkCut]
preSil=0.5
reader.dmLevel = frame25
fileBase = output_segments/seg

26 changes: 26 additions & 0 deletions VAD/WaveFileInput.conf
@@ -0,0 +1,26 @@
/*
This configuration file adds audio input from a wave file.
Framing, windowing and the FFT are configured separately in IS13_ComParE_specbase.conf.
*/



;;;;; this list will be appended to the list in the main config file
[componentInstances:cComponentManager]

instance[wave].type=cWaveSource


////////////////////////////////////////////////////////////////////////////////////////////////
// ~~~~~~~~~ Begin of configuration ~~~~~~~~~~~~~~~~~ //////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////////////////////


//---------------- Audio File Input -------------------------//
[wave:cWaveSource]
writer.dmLevel=wave
; the audio buffer size must be greater than the buffer size of the framer to avoid 100% CPU usage
buffersize_sec=1.0
filename = \cm[inputfile(I){input.wav}:name of wave input file to analyze]
monoMixdown=1


103 changes: 103 additions & 0 deletions VAD/lstmVAD.conf
@@ -0,0 +1,103 @@
;
; VOICE ACTIVITY DETECTION CONFIGURATION
;

[componentInstances:cComponentManager]
instance[melspec_VAD].type=cMelspec
instance[plp_VAD].type=cPlp
instance[delta_VAD].type=cDeltaRegression
instance[mvn_VAD].type = cVectorMVN
instance[lstm_vad].type=cRnnProcessor
instance[dataSelector].type = cDataSelector
instance[turn].type=cTurnDetector

[melspec_VAD:cMelspec]
reader.dmLevel=fftmagH25
writer.dmLevel=melspec_power
htkcompatible = 0
nBands = 26
usePower = 1
lofreq = 0
hifreq = 8000
specScale = mel

[plp_VAD:cPlp]
reader.dmLevel=melspec_power
writer.dmLevel=plp_VAD
buffersize=100
firstCC = 1
lpOrder = 18
cepLifter = 22
compression = 0.33
htkcompatible = 0
newRASTA = 1
RASTA = 0
rastaUpperCutoff = 30.0
rastaLowerCutoff = 6.0
;rastaUpperCutoff = 29.0
;rastaLowerCutoff = 0.9
doIDFT = 1
doLpToCeps = 1
doLP = 1
doInvLog = 1
doAud = 1
doLog = 1

; note: energy_VAD is not listed in the component list above, so this section is currently unused
[energy_VAD:cEnergy]
reader.dmLevel=frame25
writer.dmLevel=energy_VAD
htkcompatible=1
rms = 0
log = 1

[delta_VAD:cDeltaRegression]
reader.dmLevel=plp_VAD
writer.dmLevel=plpde_VAD
deltawin=2
blocksize=1

; note: accel_VAD is not listed in the component list above, so this section is currently unused
[accel_VAD:cDeltaRegression]
reader.dmLevel=plpde_VAD
writer.dmLevel=plpdede_VAD
deltawin=2
blocksize=1
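
For readers unfamiliar with delta regression, here is a minimal Python sketch of the standard regression formula with a window of 2, matching deltawin=2 above. The function name is hypothetical, and padding the edges by repeating the first and last frame is an assumption; openSMILE may treat the edges differently.

```python
def delta(series, window=2):
    """Delta coefficients: sum_n n*(x[t+n] - x[t-n]) / (2 * sum_n n^2), n = 1..window."""
    denom = 2.0 * sum(n * n for n in range(1, window + 1))  # 10.0 for window=2
    last = len(series) - 1
    out = []
    for t in range(len(series)):
        acc = 0.0
        for n in range(1, window + 1):
            ahead = series[min(t + n, last)]  # repeat the last value past the end (assumed)
            behind = series[max(t - n, 0)]    # repeat the first value before the start (assumed)
            acc += n * (ahead - behind)
        out.append(acc / denom)
    return out


# A linear ramp has a constant slope of 1 away from the edges.
ramp_deltas = delta([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

Applied to each PLP coefficient over time, this yields the first-order dynamics that are written to plpde_VAD.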


[mvn_VAD:cVectorMVN]
reader.dmLevel = plp_VAD;plpde_VAD
writer.dmLevel = plpmvn_VAD
copyInputName = 1
processArrayFields = 0
mode = transform
initFile = norm.dat
htkcompatible = 0
meanEnable = 1
stdEnable = 1
normEnable = 0
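
The effect of meanEnable=1 and stdEnable=1 in transform mode can be sketched in Python as a per-feature mean and variance normalisation. The function name and the example statistics are made up for illustration; the sketch does not read norm.dat, whose binary format is not described in this repository.

```python
def mvn_transform(frame, means, stds, mean_enable=True, std_enable=True):
    """Normalise one feature frame using precomputed per-feature means and standard deviations."""
    out = []
    for x, m, s in zip(frame, means, stds):
        if mean_enable:
            x = x - m
        if std_enable and s > 0.0:  # leave the value unscaled if the deviation is zero
            x = x / s
        out.append(x)
    return out


# Hypothetical statistics for a two-dimensional feature vector.
normalised = mvn_transform([3.0, 10.0], means=[1.0, 10.0], stds=[2.0, 5.0])
```

In the configuration above, the statistics are fixed in advance and loaded from initFile rather than estimated from the input.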

[lstm_vad:cRnnProcessor]
reader.dmLevel = plpmvn_VAD
writer.dmLevel = vad_VAD
netfile=net.dat

[dataSelector:cDataSelector]
reader.dmLevel = vad_VAD
writer.dmLevel = vad_VAD_voice
nameAppend = vadBin
copyInputName = 1
selectedRange = 0
elementMode = 1

[turn:cTurnDetector]
reader.dmLevel=vad_VAD_voice
writer.dmLevel=isTurn
readVad=1
threshold = 0.45
threshold2 = 0.25
writer.levelconf.noHang=1
eventRecp = turnOutp
maxTurnLength=10
maxTurnLengthGrace=2
nPre = 11
nPost = 35
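
The two thresholds and the nPre/nPost counts above suggest a hysteresis scheme. The Python sketch below shows one plausible reading: a turn starts after nPre consecutive frames above threshold and ends after nPost consecutive frames below threshold2. This interpretation, the function name and the synthetic activations are assumptions; maxTurnLength and the exact cTurnDetector logic are not modelled.

```python
def detect_turns(activations, threshold=0.45, threshold2=0.25, n_pre=11, n_post=35):
    """Return (start_frame, end_frame) pairs, end exclusive, using hysteresis thresholds."""
    turns = []
    in_turn = False
    run = 0    # length of the current qualifying run of frames
    start = 0
    for i, a in enumerate(activations):
        if not in_turn:
            run = run + 1 if a > threshold else 0
            if run >= n_pre:
                in_turn = True
                start = i - n_pre + 1  # first frame of the run that triggered the start
                run = 0
        else:
            run = run + 1 if a < threshold2 else 0
            if run >= n_post:
                turns.append((start, i - n_post + 1))  # the turn ends where the silent run began
                in_turn = False
                run = 0
    if in_turn:
        turns.append((start, len(activations)))  # input ended while still inside a turn
    return turns


# Synthetic activations: 20 silent frames, 50 voiced frames, 60 silent frames.
activations = [-1.0] * 20 + [0.9] * 50 + [-1.0] * 60
turns = detect_turns(activations)
```

With the 10 ms frame step configured in IS13_ComParE_specbase.conf, 11 and 35 frames would correspond to roughly 0.11 s and 0.35 s, which filters out short bursts and brief pauses.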

Binary file added VAD/net.dat
Binary file not shown.
Binary file added VAD/norm.dat
Binary file not shown.
62 changes: 62 additions & 0 deletions VAD/smile.log
@@ -0,0 +1,62 @@
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in SMILExtract : openSMILE starting!
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in SMILExtract : config file is: Standalone.conf
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in cComponentManager : successfully registered 113 component types.
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in configManager : reading config file 'Standalone.conf'
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in configManager : reading config file 'WaveFileInput.conf'
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in configManager : reading config file 'IS13_ComParE_specbase.conf'
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in configManager : reading config file 'lstmVAD.conf'
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in instance 'mvn_VAD' : Loading init file in old MVN binary format
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in smileRnn : Net file format: 2
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in smileRnn : net-task: 1
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in cComponentManager : successfully finished createInstances
(14 component instances were finalised, 1 data memories were finalised)
[ 22.06.2014 - 23:13:17 ]
(MSG) [2] in cComponentManager : starting single thread processing loop
[ 22.06.2014 - 23:13:17 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #1
[ 22.06.2014 - 23:13:17 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #2
[ 22.06.2014 - 23:13:18 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #4
[ 22.06.2014 - 23:13:18 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #5
[ 22.06.2014 - 23:13:18 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #9
[ 22.06.2014 - 23:13:19 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #23
[ 22.06.2014 - 23:13:20 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #25
[ 22.06.2014 - 23:13:20 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #27
[ 22.06.2014 - 23:13:20 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #29
[ 22.06.2014 - 23:13:20 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #30
[ 22.06.2014 - 23:13:20 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #33
[ 22.06.2014 - 23:13:20 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #35
[ 22.06.2014 - 23:13:21 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #36
[ 22.06.2014 - 23:13:21 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #43
[ 22.06.2014 - 23:13:22 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #45
[ 22.06.2014 - 23:13:22 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #49
[ 22.06.2014 - 23:13:22 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #52
[ 22.06.2014 - 23:13:22 ]
(MSG) [2] in cComponentManager : Processing finished! System ran for 25572 ticks.
here in destr.
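
A log in this format can be summarised with a short Python sketch that collects the turn numbers for which an instance reported that no frames were written. The function name is hypothetical, the regular expression is tailored to the message wording seen above, and the excerpt simply repeats a few lines from this log.

```python
import re

# Matches lines such as:
# (ERROR) [1] in instance 'turnOutp' : no frames were written for turn #1
TURN_ERROR = re.compile(
    r"\(ERROR\) \[\d+\] in instance '([^']+)' : no frames were written for turn #(\d+)"
)


def failed_turns(log_text):
    """Map each instance name to the list of turn numbers it failed to write."""
    failures = {}
    for match in TURN_ERROR.finditer(log_text):
        failures.setdefault(match.group(1), []).append(int(match.group(2)))
    return failures


excerpt = """[ 22.06.2014 - 23:13:17 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #1
[ 22.06.2014 - 23:13:17 ]
(ERROR) [1] in instance 'turnOutp' : no frames were written for turn #2
[ 22.06.2014 - 23:13:22 ]
(MSG) [2] in cComponentManager : Processing finished! System ran for 25572 ticks.
"""
print(failed_turns(excerpt))
```

Running it over the full smile.log would list every turn for which turnOutp wrote no audio.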