I am trying to run Pocketsphinx in French on a Buildroot embedded device, so I downloaded the default French language model: https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/French/
I tried several times to get pocketsphinx working from the command line on the embedded device, but it failed each time. I learned that the failure could be caused by the huge size of my dictionary (~100,000 words, called fr.dict), so I created a much smaller dictionary of 100 words (called fr-test.dict). That doesn't seem to have changed anything.
I use the pocketsphinx_continuous command-line tool. I pass it an acoustic model (hmm), a dictionary and a language model, all taken from the folder linked above:
Three different acoustic models (hmm):
cmusphinx-fr-5.2
cmusphinx-fr-ptm-5.2
cmusphinx-fr-ptm-8khz-5.2
Two dictionaries:
fr.dict
fr-test.dict
And three language models:
fr.lm.dmp
fr-small.lm.bin
fr-phone.lm.dmp
The first two parameters don't seem to change anything. However, the third one does change the error messages I receive.
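Before comparing logs, it can help to confirm that each argument points at the kind of path the tool expects. The following is only a sketch: the helper names `check_model_paths` and `build_command` are made up for illustration, and the assumption that `-hmm` takes a directory while `-lm` and `-dict` take regular files is based on how the paths are used in the commands in this post.

```python
import os

def check_model_paths(hmm_dir, lm_file, dict_file):
    """Return a list of problems; an empty list means the paths look plausible."""
    problems = []
    # Assumption: the acoustic model (-hmm) is a directory of model files.
    if not os.path.isdir(hmm_dir):
        problems.append(f"-hmm is not a directory: {hmm_dir}")
    # Assumption: the language model and dictionary are single regular files.
    for flag, path in (("-lm", lm_file), ("-dict", dict_file)):
        if not os.path.isfile(path):
            problems.append(f"{flag} is not a regular file: {path}")
    return problems

def build_command(hmm_dir, lm_file, dict_file, adcdev="plug:pcm.mic"):
    """Assemble the same argument list used in the commands in this post."""
    return ["pocketsphinx_continuous", "-adcdev", adcdev, "-inmic", "yes",
            "-hmm", hmm_dir, "-lm", lm_file, "-dict", dict_file]
```

A check like this would flag a `-dict` argument that names a directory rather than a dictionary file, or a path containing a stray character.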
With the first lm (fr.lm.dmp), the complete log is:
# pocketsphinx_continuous -adcdev plug:pcm.mic -inmic yes -hmm /mnt/usb/sphinx-french/cmusphinx-fr-5.2 -lm /mnt/usb/sphinx-french/fr.lm.dmp -dict /mnt/usb/sphinx-french/
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /mnt/usb/sphinx-french/cmusphinx-fr-5.2/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /mnt/usb/sphinx-french/
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /mnt/usb/sphinx-french/cmusphinx-fr-5.2
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm /mnt/usb/sphinx-french/fr.lm.dmp
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(152): Reading linear feature transformation from /mnt/usb/sphinx-french/cmusphinx-fr-5.2/feature_transform
INFO: mdef.c(518): Reading model definition: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/mdef
INFO: bin_mdef.c(181): Allocating 101051 * 8 bytes (789 KiB) for CD tree
INFO: tmat.c(149): Reading HMM transition probability matrices: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ptm_mgau.c(804): Number of codebooks exceeds 256: 2108
INFO: acmod.c(115): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: acmod.c(117): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ms_senone.c(149): Reading senone mixture weights: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/mixture_weights
INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(207): Not transposing mixture weights in memory
INFO: ms_senone.c(268): Read mixture weights for 2108 senones: 1 features x 8 codewords
INFO: ms_senone.c(320): Mapping senones to individual codebooks
INFO: ms_mgau.c(144): The value of topn: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4099 * 20 bytes (80 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /mnt/usb/sphinx-french/
INFO: dict.c(213): Dictionary size 0, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 0 words read
INFO: dict.c(358): Reading filler dictionary: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/noisedict
INFO: dict.c(213): Dictionary size 3, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 36^3 * 2 bytes (91 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 15696 bytes (15 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 15696 bytes (15 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(347): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(358): Header doesn't match
INFO: ngram_model_trie.c(176): Trying to read LM in arpa format
INFO: ngram_model_trie.c(69): No \data\ mark in LM file
INFO: ngram_model_trie.c(438): Trying to read LM in DMP format
INFO: ngram_model_trie.c(520): ngrams 1=62304, 2=18541132, 3=23627127
calloc(23627127,16) failed from ngrams_raw.c(278)
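For scale, the size of the allocation that failed can be worked out directly from the last log line. This is a small sketch of the arithmetic only; the function name `alloc_bytes` is hypothetical, and the figure covers just this one array, not the total memory Pocketsphinx needs.

```python
def alloc_bytes(count, elem_size):
    """Bytes requested by calloc(count, elem_size)."""
    return count * elem_size

# Values taken from the log line: calloc(23627127,16) failed
trigram_count = 23627127
elem_size = 16

requested_bytes = alloc_bytes(trigram_count, elem_size)
requested_mib = requested_bytes / 2**20

print(f"{requested_bytes} bytes, about {requested_mib:.1f} MiB")
```

So the trigram table alone asks for roughly 360 MiB in a single allocation.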
With the two others, I receive these logs:
# pocketsphinx_continuous -adcdev plug:pcm.mic -inmic yes -hmm /mnt/usb/sphinx-french/cmusphinx-fr-5.2 -lm /mnt/usb/sphinx-french/fr-phone.lm.dmp -dict /mnt/usb/sphinx-french/fr.dict
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /mnt/usb/sphinx-french/cmusphinx-fr-5.2/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /mnt/usb/sphinx-french/fr.dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /mnt/usb/sphinx-french/cmusphinx-fr-5.2
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm /mnt/usb/sphinx-french/fr-phone.lm.dmp
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(152): Reading linear feature transformation from /mnt/usb/sphinx-french/cmusphinx-fr-5.2/feature_transform
INFO: mdef.c(518): Reading model definition: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/mdef
INFO: bin_mdef.c(181): Allocating 101051 * 8 bytes (789 KiB) for CD tree
INFO: tmat.c(149): Reading HMM transition probability matrices: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ptm_mgau.c(804): Number of codebooks exceeds 256: 2108
INFO: acmod.c(115): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: acmod.c(117): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ms_senone.c(149): Reading senone mixture weights: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/mixture_weights
INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(207): Not transposing mixture weights in memory
INFO: ms_senone.c(268): Read mixture weights for 2108 senones: 1 features x 8 codewords
INFO: ms_senone.c(320): Mapping senones to individual codebooks
INFO: ms_mgau.c(144): The value of topn: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 109102 * 20 bytes (2130 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /mnt/usb/sphinx-french/fr.dict
INFO: dict.c(213): Dictionary size 105003, allocated 1018 KiB for strings, 1375 KiB for phones
INFO: dict.c(336): 105003 words read
INFO: dict.c(358): Reading filler dictionary: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/noisedict
INFO: dict.c(213): Dictionary size 105006, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 36^3 * 2 bytes (91 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 15696 bytes (15 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 15696 bytes (15 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(347): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(358): Header doesn't match
INFO: ngram_model_trie.c(176): Trying to read LM in arpa format
INFO: ngram_model_trie.c(69): No \data\ mark in LM file
INFO: ngram_model_trie.c(438): Trying to read LM in DMP format
INFO: ngram_model_trie.c(520): ngrams 1=38, 2=1240, 3=23231
INFO: lm_trie.c(473): Training quantizer
INFO: lm_trie.c(481): Building LM trie
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 742 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 136
INFO: ngram_search_fwdtree.c(333): Created 12 root, 8 non-root channels, 14 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(307): pocketsphinx_continuous COMPILED ON: Apr 18 2019, AT: 18:26:09
INFO: continuous.c(252): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
A new "Input overrun" line appears every second. Remember that these results do not change with another hmm or a smaller dictionary.
Any idea what is happening and how I can solve the problem?
Here is some more information:
1) When I use the "top" command to check Pocketsphinx's resource usage while it is running, it returns the following:
Mem: 292420K used, 717876K free, 552K shrd, 46032K buff, 112032K cached
CPU: 0% usr 3% sys 0% nic 95% idle 0% io 0% irq 0% sirq
Load average: 0.20 0.64 0.69 1/166 2377
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
2061 1 root S 16520 2% 3% sndnrj -d plug:aec
2130 2099 root R 76496 8% 0% /usr/bin/ws-proxyd -i lo -p 7894 -D /u
2371 2100 root S 33428 3% 0% pocketsphinx_continuous -adcdev plug:p
2271 2099 root S 19068 2% 0% /usr/bin/ws-proxyd -i lo -p 7895 -D /u
2276 2099 root S 99m 10% 0% /usr/bin/node /opt/app/index.js --no-u
1868 1 root S 1992 0% 0% /sbin/klogd -n
1912 2 root SW 0 0% 0% [ksdioirqd/sdio]
2104 2099 root S 22140 2% 0% /usr/bin/storaged -v 0
2073 1 root S 11796 1% 0% sc-am /etc/sc-am.cfg
2269 2099 root S 9952 1% 0% /usr/bin/sc-directive -c /flash/etc/di
2103 2099 root S 8028 1% 0% /usr/bin/sc_net
2099 1 root S 7448 1% 0% nsm
2114 2099 root S 7128 1% 0% /usr/bin/sc-led-matrix -f /dev/i2c-2
2102 2099 root S 5144 1% 0% /usr/bin/sysinfod -c /etc/sysinfod/sys
2119 1 root S 5096 1% 0% /usr/sbin/wpa_supplicant -u
2092 2091 www-data S 5064 0% 0% nginx: worker process
2065 1 root S 4972 0% 0% pupd -d /dev/spidev32766.0 -f /usr/sha
2091 1 root S 4932 0% 0% nginx: master process /usr/sbin/nginx
2253 1 root S 3860 0% 0% /usr/libexec/bluetooth/bluetoothd
So I don't think my problem is the CPU.
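To watch this over a longer run instead of a single snapshot, the CPU summary line can be parsed into numbers. This is a sketch that assumes the line keeps the "N% name" layout shown above (it looks like BusyBox top output); the function name `parse_top_cpu` is made up for illustration.

```python
import re

def parse_top_cpu(line):
    """Turn a 'CPU: 0% usr 3% sys ...' summary line into {'usr': 0, 'sys': 3, ...}."""
    return {name: int(pct) for pct, name in re.findall(r"(\d+)%\s+(\w+)", line)}

# The summary line from the top output above.
cpu = parse_top_cpu("CPU: 0% usr 3% sys 0% nic 95% idle 0% io 0% irq 0% sirq")
print(cpu["idle"], cpu["sys"])
```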
2) When I use the lm fr-small.lm.bin with the small dictionary of 100 words, it returns the following:
pocketsphinx_continuous -adcdev plug:pcm.mic -inmic yes -hmm /mnt/usb/sphinx-french/cmusphinx-fr-5.2 -lm /mnt/usb/sphinx-french./fr-small.lm.bin -dict /mnt/usb/sphinx-french/fr-test.dict
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /mnt/usb/sphinx-french/cmusphinx-fr-5.2/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /mnt/usb/sphinx-french/fr-test.dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /mnt/usb/sphinx-french/cmusphinx-fr-5.2
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm /mnt/usb/sphinx-french./fr-small.lm.bin
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(152): Reading linear feature transformation from /mnt/usb/sphinx-french/cmusphinx-fr-5.2/feature_transform
INFO: mdef.c(518): Reading model definition: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/mdef
INFO: bin_mdef.c(181): Allocating 101051 * 8 bytes (789 KiB) for CD tree
INFO: tmat.c(149): Reading HMM transition probability matrices: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ptm_mgau.c(804): Number of codebooks exceeds 256: 2108
INFO: acmod.c(115): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: acmod.c(117): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/means
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/variances
INFO: ms_gauden.c(242): 2108 codebook, 1 feature, size:
INFO: ms_gauden.c(244): 8x32
INFO: ms_gauden.c(304): 0 variance values floored
INFO: ms_senone.c(149): Reading senone mixture weights: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/mixture_weights
INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(207): Not transposing mixture weights in memory
INFO: ms_senone.c(268): Read mixture weights for 2108 senones: 1 features x 8 codewords
INFO: ms_senone.c(320): Mapping senones to individual codebooks
INFO: ms_mgau.c(144): The value of topn: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4199 * 20 bytes (82 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /mnt/usb/sphinx-french/fr-test.dict
INFO: dict.c(213): Dictionary size 100, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 100 words read
INFO: dict.c(358): Reading filler dictionary: /mnt/usb/sphinx-french/cmusphinx-fr-5.2/noisedict
INFO: dict.c(213): Dictionary size 103, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 36^3 * 2 bytes (91 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 15696 bytes (15 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 15696 bytes (15 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(347): Trying to read LM in trie binary format
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 35 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 157
INFO: ngram_search_fwdtree.c(333): Created 2 root, 29 non-root channels, 4 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(307): pocketsphinx_continuous COMPILED ON: Apr 18 2019, AT: 18:26:09
INFO: continuous.c(252): Ready....
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)
Related
I created an empty database called nominatim on my DB cluster, pointed the DSN at it with the dbname, host, port, user and password, and tried to import the map of Portugal with the following parameters:
nohup <path/to/file>/build/utils/setup.php \
--threads 4 \
--setup-db \
--import-data \
--create-functions \
--enable-diff-updates \
--enable-debug-statements \
--create-tables \
--create-partition-tables \
--create-partition-functions \
--import-wikipedia-articles \
--load-data \
--import-tiger-data \
--calculate-postcodes \
--index \
--index-noanalyse \
--create-search-indices \
--create-country-names \
--osm-file <path/to/file>/portugal-latest.osm.pbf >> <path/to/file>/logImportPortugal.txt 2>&1
But it is giving me the following error:
WARNING: Starting rank 25
Traceback (most recent call last):
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 370, in <module>
Indexer(options).run()
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 215, in run
self.index(RankRunner(self.maxrank))
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 259, in index
thread = next(next_thread)
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 300, in find_free_thread
if thread.is_done():
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 178, in is_done
if self.conn.poll() == psycopg2.extensions.POLL_OK:
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "parent"
LINE 1: parent.place_id
^
QUERY: parent.place_id
CONTEXT: PL/pgSQL function find_parent_for_address(text,text,smallint,geometry) line 25 at RAISE
PL/pgSQL function find_parent_for_poi(character,bigint,smallint,geometry,text,text,boolean) line 31 at assignment
PL/pgSQL function placex_update() line 178 at assignment
ERROR: error status 1 running nominatim!
string(33) "error status 1 running nominatim!"
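The CONTEXT lines read from the innermost PL/pgSQL frame outwards, so the first one names the statement that actually failed. A minimal sketch for pulling that chain out of the error text; the regular expression and the name `plpgsql_call_chain` are my own and assume the "PL/pgSQL function name(args) line N at KIND" layout seen above.

```python
import re

# Matches: PL/pgSQL function <name>(<args>) line <N> at <statement kind>
CONTEXT_RE = re.compile(r"PL/pgSQL function (\w+)\((.*?)\) line (\d+) at (\w+)")

def plpgsql_call_chain(error_text):
    """Return (function, line, statement kind) tuples, innermost frame first."""
    return [(m.group(1), int(m.group(3)), m.group(4))
            for m in CONTEXT_RE.finditer(error_text)]

error_text = """\
CONTEXT: PL/pgSQL function find_parent_for_address(text,text,smallint,geometry) line 25 at RAISE
PL/pgSQL function find_parent_for_poi(character,bigint,smallint,geometry,text,text,boolean) line 31 at assignment
PL/pgSQL function placex_update() line 178 at assignment
"""

chain = plpgsql_call_chain(error_text)
for function, line, kind in chain:
    print(f"{function}: line {line} ({kind})")
```

Here the innermost frame is a RAISE statement at line 25 of find_parent_for_address, reached from placex_update via find_parent_for_poi.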
Shouldn't the --setup-db flag create all the relations? What am I missing?
Below is the full log:
nohup: ignoring input
2023-01-10 13:00:49 == module path: /srv/nominatim/build/module
2023-01-10 13:00:49 == Setup DB
Postgres version found: 14
Postgis version found: 3.2
set_config
------------
(1 row)
2023-01-10 13:00:54 == Import data
osm2pgsql version 1.2.0 (64 bit id space)
Mid: loading persistent node cache from /srv/nominatim/flatnode.file
Mid: pgsql, cache=0
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels
Parsing gazetteer style file '/srv/nominatim/Nominatim-3.5.1/settings/import-full.style'.
Using projection SRS 4326 (Latlong)
NOTICE: table "place" does not exist, skipping
Reading in file: /home/izadmin/planet_full/portugal-latest.osm.pbf
Using PBF parser.
Processing: Node(50k 50.0k/s) Way(0k 0.00k/s) Relation(0 0.00/s)
Processing: Node(150k 75.0k/s) Way(0k 0.00k/s) Relation(0 0.00/s)
Processing: Node(310k 103.3k/s) Way(0k 0.00k/s) Relation>
Node stats: total(36476942), max(10423206981) in 223s
Way stats: total(3678221), max(1129408209) in 88s
Relation stats: total(75907), max(15101706) in 7s
Stopping table: planet_osm_nodes
Stopped table: planet_osm_nodes in 0s
Stopping table: planet_osm_ways
Stopped table: planet_osm_ways in 0s
Stopping table: planet_osm_rels
Building index on table: planet_osm_rels
Stopped table: planet_osm_rels in 0s
Osm2pgsql took 319s overall
2023-01-10 13:06:14 == Create Functions
2023-01-10 13:06:14 == Create Tables
2023-01-10 13:06:14 == Create Functions
2023-01-10 13:06:14 == Create Tables
2023-01-10 13:06:15 == Create Partition Tables
2023-01-10 13:06:16 == Create Partition Functions
2023-01-10 13:06:16 == Importing wikipedia articles and redirects
2023-01-10 13:08:08 == Drop old Data
...............................................................................................................................................................................................>
2023-01-10 13:08:13 == Loading word list
count
--------
318141
(1 row)
count
-------
14556
(1 row)
2023-01-10 13:09:01 == Load Data
2023-01-10 13:43:34 == Reanalysing database
Latest data imported from 2023-01-09T21:15:53Z.
2023-01-10 13:43:37 == Import Tiger data
2023-01-10 13:43:37 == Found 0 SQL files in path /srv/nominatim/Nominatim-3.5.1/data/tiger
2023-01-10 13:43:37 == WARNING: Tiger data import selected but no files found in path /srv/nominatim/Nominatim-3.5.1/data/tiger
2023-01-10 13:43:37 == Calculate Postcodes
2023-01-10 13:44:47 == Index ranks 0 - 4
'/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py' --database nominatim --port 5432 --threads 4 -v --host '127.0.0.1' --user postgres --maxrank 4
WARNING: Starting indexing rank (0 to 4) >
WARNING: Starting rank 0
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 0
WARNING: Starting rank 1
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 1
WARNING: Starting rank 2
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 2
WARNING: Starting rank 3
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 3
WARNING: Starting rank 4
WARNING: Done 2/2 in 0 # 3.477 per second - FINISHED rank 4
2023-01-10 13:44:47 == Index ranks 5 - 25
WARNING: Starting indexing rank (5 to 25) using 4 threads
WARNING: Starting rank 5
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 5
WARNING: Starting rank 6
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 6
WARNING: Starting rank 7
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 7
WARNING: Starting rank 8
WARNING: Done 2/2 in 0 # 7.341 per second - FINISHED rank 8
WARNING: Starting rank 9
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 9
WARNING: Starting rank 10
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 10
WARNING: Starting rank 11
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 11
WARNING: Starting rank 12
WARNING: Done 37/37 in 0 # 139.460 per second - FINISHED rank 12
WARNING: Starting rank 13
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 13
WARNING: Starting rank 14
INFO: Done 100 in 0 # 389.785 per second - rank 14 ETA (seconds): 0.53
WARNING: Done 308/308 in 0 # 321.383 per second - FINISHED rank 14
WARNING: Starting rank 15
WARNING: Done 1/1 in 0 # 661.813 per second - FINISHED rank 15
WARNING: Starting rank 16
INFO: Done 100 in 0 # 587.337 per second - rank 16 ETA (seconds): 5.38
INFO: Done 687 in 1 # 516.197 per second - rank 16 ETA (seconds): 4.99
INFO: Done 1203 in 2 # 507.508 per second - rank 16 ETA (seconds): 4.06
INFO: Done 1710 in 3 # 488.050 per second - rank 16 ETA (seconds): 3.18
INFO: Done 2198 in 4 # 455.463 per second - rank 16 ETA (seconds): 2.34
INFO: Done 2653 in 5 # 450.394 per second - rank 16 ETA (seconds): 1.35
INFO: Done 3103 in 7 # 425.337 per second - rank 16 ETA (seconds): 0.37
WARNING: Done 3262/3262 in 8 # 388.872 per second - FINISHED rank 16
WARNING: Starting rank 17
INFO: Done 100 in 0 # 1959.248 per second - rank 17 ETA (seconds): 0.01
WARNING: Done 127/127 in 0 # 1647.831 per second - FINISHED rank 17
WARNING: Starting rank 18
INFO: Done 100 in 0 # 1458.789 per second - rank 18 ETA (seconds): 2.29
INFO: Done 1558 in 0 # 1776.158 per second - rank 18 ETA (seconds): 1.06
INFO: Done 3334 in 1 # 1774.513 per second - rank 18 ETA (seconds): 0.06
WARNING: Done 3447/3447 in 1 # 1777.266 per second - FINISHED rank 18
WARNING: Starting rank 19
INFO: Done 100 in 0 # 3007.609 per second - rank 19 ETA (seconds): 1.98
INFO: Done 3107 in 1 # 2444.868 per second - rank 19 ETA (seconds): 1.20
INFO: Done 5551 in 2 # 2570.094 per second - rank 19 ETA (seconds): 0.19
WARNING: Done 6053/6052 in 2 # 2644.112 per second - FINISHED rank 19
WARNING: Starting rank 20
INFO: Done 100 in 0 # 2110.150 per second - rank 20 ETA (seconds): 13.73
INFO: Done 2210 in 0 # 2969.273 per second - rank 20 ETA (seconds): 9.04
INFO: Done 5179 in 2 # 2321.562 per second - rank 20 ETA (seconds): 10.29
INFO: Done 7500 in 3 # 2142.735 per second - rank 20 ETA (seconds): 10.07
INFO: Done 9642 in 4 # 2055.257 per second - rank 20 ETA (seconds): 9.45
INFO: Done 11697 in 5 # 2019.092 per second - rank 20 ETA (seconds): 8.60
INFO: Done 13716 in 6 # 1998.935 per second - rank 20 ETA (seconds): 7.68
INFO: Done 15714 in 7 # 1993.985 per second - rank 20 ETA (seconds): 6.70
INFO: Done 17707 in 8 # 1990.804 per second - rank 20 ETA (seconds): 5.71
INFO: Done 19697 in 9 # 1978.861 per second - rank 20 ETA (seconds): 4.74
INFO: Done 21675 in 11 # 1960.704 per second - rank 20 ETA (seconds): 3.77
INFO: Done 23635 in 12 # 1936.453 per second - rank 20 ETA (seconds): 2.81
INFO: Done 25571 in 13 # 1929.992 per second - rank 20 ETA (seconds): 1.81
INFO: Done 27500 in 14 # 1939.698 per second - rank 20 ETA (seconds): 0.81
WARNING: Done 29070/29067 in 14 # 1942.903 per second - FINISHED rank 20
WARNING: Starting rank 21
INFO: Done 100 in 0 # 2234.986 per second - rank 21 ETA (seconds): 0.48
WARNING: Done 1167/1167 in 0 # 1538.119 per second - FINISHED rank 21
WARNING: Starting rank 22
INFO: Done 100 in 0 # 1638.646 per second - rank 22 ETA (seconds): 18.97
INFO: Done 1738 in 0 # 1982.608 per second - rank 22 ETA (seconds): 14.85
INFO: Done 3720 in 2 # 1732.995 per second - rank 22 ETA (seconds): 15.85
INFO: Done 5452 in 3 # 1742.091 per second - rank 22 ETA (seconds): 14.77
INFO: Done 7194 in 4 # 1723.857 per second - rank 22 ETA (seconds): 13.92
INFO: Done 8917 in 5 # 1717.254 per second - rank 22 ETA (seconds): 12.97
INFO: Done 10634 in 6 # 1712.904 per second - rank 22 ETA (seconds): 12.00
INFO: Done 12346 in 7 # 1716.312 per second - rank 22 ETA (seconds): 10.98
INFO: Done 14062 in 8 # 1718.356 per second - rank 22 ETA (seconds): 9.96
INFO: Done 15780 in 9 # 1720.232 per second - rank 22 ETA (seconds): 8.95
INFO: Done 17500 in 10 # 1727.692 per second - rank 22 ETA (seconds): 7.92
INFO: Done 19227 in 11 # 1727.651 per second - rank 22 ETA (seconds): 6.92
INFO: Done 20954 in 12 # 1723.779 per second - rank 22 ETA (seconds): 5.93
INFO: Done 22677 in 13 # 1715.023 per second - rank 22 ETA (seconds): 4.96
INFO: Done 24392 in 14 # 1717.552 per second - rank 22 ETA (seconds): 3.95
INFO: Done 26109 in 15 # 1697.694 per second - rank 22 ETA (seconds): 2.99
INFO: Done 27806 in 16 # 1659.334 per second - rank 22 ETA (seconds): 2.04
INFO: Done 29465 in 17 # 1662.434 per second - rank 22 ETA (seconds): 1.03
INFO: Done 31127 in 18 # 1667.951 per second - rank 22 ETA (seconds): 0.03
WARNING: Done 31184/31183 in 18 # 1668.224 per second - FINISHED rank 22
WARNING: Starting rank 23
WARNING: Done 0/0 in 0 # 0.000 per second - FINISHED rank 23
WARNING: Starting rank 24
INFO: Done 100 in 0 # 2041.400 per second - rank 24 ETA (seconds): 2.52
INFO: Done 2141 in 1 # 1956.016 per second - rank 24 ETA (seconds): 1.59
INFO: Done 4097 in 2 # 1911.586 per second - rank 24 ETA (seconds): 0.60
WARNING: Done 5248/5245 in 2 # 1908.391 per second - FINISHED rank 24
WARNING: Starting rank 25
Traceback (most recent call last):
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 370, in <module>
Indexer(options).run()
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 215, in run
self.index(RankRunner(self.maxrank))
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 259, in index
thread = next(next_thread)
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 300, in find_free_thread
if thread.is_done():
File "/srv/nominatim/Nominatim-3.5.1/nominatim/nominatim.py", line 178, in is_done
if self.conn.poll() == psycopg2.extensions.POLL_OK:
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "parent"
LINE 1: parent.place_id
^
QUERY: parent.place_id
CONTEXT: PL/pgSQL function find_parent_for_address(text,text,smallint,geometry) line 25 at RAISE
PL/pgSQL function find_parent_for_poi(character,bigint,smallint,geometry,text,text,boolean) line 31 at assignment
PL/pgSQL function placex_update() line 178 at assignment
ERROR: error status 1 running nominatim!
string(33) "error status 1 running nominatim!"
I am using Nominatim 3.5.1
Problem: About 95% of the time I get no results at all from Pocketsphinx, and the rest of the time it returns only inaccurate single words.
Could this be due to low recording volume?
So far:
Pocketsphinx is initialized with the default hmm, lm and dict included with it. The setup code is below (it runs with no crashes and no problems whatsoever):
g_NPCController.Debug("Initializing internal decoder ... ");
string directoryPrefix = Directory.GetCurrentDirectory() + Path.DirectorySeparatorChar +
    "Pocketsphinx" + Path.DirectorySeparatorChar + "model";
string hmmDir = directoryPrefix + Path.DirectorySeparatorChar + "en-us" + Path.DirectorySeparatorChar + "en-us";
string dictDir = directoryPrefix + Path.DirectorySeparatorChar + "en-us" + Path.DirectorySeparatorChar + "cmudict-en-us.dict";
string lmDir = directoryPrefix + Path.DirectorySeparatorChar + "en-us" + Path.DirectorySeparatorChar + "en-us.lm.bin";
Config c = Pocketsphinx.Decoder.DefaultConfig();
if (Application.platform == RuntimePlatform.Android)
{
    c.SetString("-hmm", "/sdcard/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/en-us-ptm");
    c.SetString("-dict", "/sdcard/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/cmudict-en-us.dict");
    c.SetString("-lm", "/sdcard/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/en-us.lm.bin");
} else if (Application.platform == RuntimePlatform.IPhonePlayer) {
    // TODO - outta luck for now
}
else
{
    if (g_NPCController.DebugMode)
        c.SetString("-logfn", Directory.GetCurrentDirectory() + Path.DirectorySeparatorChar + "Pocketsphinx" + Path.DirectorySeparatorChar + "current.log");
    c.SetString("-hmm", hmmDir);
    c.SetString("-dict", dictDir);
    c.SetString("-lm", lmDir);
}
//c.SetString("-keyphrase", "hello world");
c.SetFloat("-kws_threshold", 1e-30);
c.SetFloat("-samprate", (int) g_NPCAudioListener.SampleFrequency);
c.SetInt("-nfft", 2048);
g_Decoder = new Pocketsphinx.Decoder(c);
g_Decoder.StartUtt();
g_NPCController.Debug("... local decoder initialized.");
Then, within a coroutine on the main thread, each buffer is processed in one shot. Buffers are between 2600 and 12800 bytes long each. The following code is called every frame:
while (!buffer.Closed)
    yield return null;
g_NPCAudioListener.AudioBufferQueue.Dequeue();
byte[] audio = new byte[buffer.CurrentBuffer16.Count * sizeof(short)];
Buffer.BlockCopy(buffer.CurrentBuffer16.ToArray(), 0, audio, 0, audio.Length);
g_Decoder.ProcessRaw(audio, audio.Length, false, consumed == buffer.CurrentBuffer.Count);
if (g_Decoder.Hyp() != null) {
    g_DictationResults.Enqueue(g_Decoder.Hyp().Hypstr);
    g_Decoder.EndUtt();
    g_Decoder.StartUtt();
}
Any help will be greatly appreciated. I am very close to making it work, and this would be a huge help for my project. Am I missing some configuration parameters?
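One way to test the low-volume hypothesis raised above is to measure the peak and RMS level of a captured buffer before it reaches the decoder. The following is a minimal, language-neutral sketch in Python rather than the project's C#. It assumes the buffer holds little-endian signed 16-bit mono PCM, and the helper name `pcm16_levels` is hypothetical.

```python
import math
import struct


def pcm16_levels(raw):
    """Return (peak, rms, rms_dbfs) for little-endian signed 16-bit PCM bytes.

    Assumption: mono audio, two bytes per sample. A trailing odd byte is ignored.
    """
    count = len(raw) // 2
    if count == 0:
        return 0, 0.0, float("-inf")
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    # dBFS relative to the 16-bit full-scale magnitude of 32768.
    dbfs = 20 * math.log10(rms / 32768.0) if rms > 0 else float("-inf")
    return peak, rms, dbfs


# Illustrative buffer only: four very quiet samples.
quiet = struct.pack("<4h", 100, -100, 100, -100)
peak, rms, dbfs = pcm16_levels(quiet)
print("peak=%d rms=%.1f dbfs=%.1f" % (peak, rms, dbfs))
```

A peak that stays in the low hundreds out of a possible 32767 would suggest the recording level is worth investigating. A peak near full scale would point away from volume as the cause.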
With init log:
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci yes yes
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn live current
-cmninit 40,3,-1 40,3,-1
-compallsen no no
-dict C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\cmudict-en-us.dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1e-30 1.000000e-30
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us.lm.bin
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\current.log
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 2048
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec 0-12/13-25/26-38
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 3.0 3.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(162): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(149): Reading HMM transition probability matrices: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/means
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/variances
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(304): 222 variance values floored
INFO: ptm_mgau.c(475): Loading senones from dump file C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/sendump
INFO: ptm_mgau.c(499): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(562): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(594): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(837): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138623 * 32 bytes (4331 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\cmudict-en-us.dict
INFO: dict.c(213): Dictionary size 134522, allocated 1014 KiB for strings, 1677 KiB for phones
INFO: dict.c(336): 134522 words read
INFO: dict.c(358): Reading filler dictionary: C:\Users\fgera\Development\Git\Computer-Graphics\Motional.AI\Pocketsphinx\model\en-us\en-us/noisedict
INFO: dict.c(213): Dictionary size 134527, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 790 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 152144
INFO: ngram_search_fwdtree.c(333): Created 722 root, 152016 non-root channels, 53 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: cmn_live.c(120): Update from < 40.00 3.00 -1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_live.c(138): Update to < 95.43 -6.79 -5.30 -6.10 -10.80 2.20 -1.31 -0.54 -2.47 -3.11 -0.83 2.18 5.92 >
INFO: ngram_search_fwdtree.c(1550): 1601 words recognized (18/fr)
INFO: ngram_search_fwdtree.c(1552): 237595 senones evaluated (2700/fr)
INFO: ngram_search_fwdtree.c(1556): 578857 channels searched (6577/fr), 49670 1st, 54325 last
INFO: ngram_search_fwdtree.c(1559): 3042 words for which last channels evaluated (34/fr)
INFO: ngram_search_fwdtree.c(1561): 18162 candidate words for entering last phone (206/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 18.25 CPU 20.739 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 14.14 wall 16.065 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 57 words
INFO: ngram_search_fwdflat.c(948): 692 words recognized (8/fr)
INFO: ngram_search_fwdflat.c(950): 83914 senones evaluated (954/fr)
INFO: ngram_search_fwdflat.c(952): 102603 channels searched (1165/fr)
INFO: ngram_search_fwdflat.c(954): 4800 words searched (54/fr)
INFO: ngram_search_fwdflat.c(957): 2171 word transitions (24/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.03 CPU 0.036 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.03 wall 0.033 xRT
INFO: cmn_live.c(120): Update from < 95.43 -6.79 -5.30 -6.10 -10.80 2.20 -1.31 -0.54 -2.47 -3.11 -0.83 2.18 5.92 >
INFO: cmn_live.c(138): Update to < 92.77 -8.91 -9.40 -6.80 -11.46 -0.71 -2.89 -0.45 1.43 -3.17 -1.35 0.17 3.64 >
INFO: ngram_search_fwdtree.c(1550): 4191 words recognized (37/fr)
INFO: ngram_search_fwdtree.c(1552): 451300 senones evaluated (4029/fr)
INFO: ngram_search_fwdtree.c(1556): 3011851 channels searched (26891/fr), 76495 1st, 179109 last
INFO: ngram_search_fwdtree.c(1559): 8886 words for which last channels evaluated (79/fr)
INFO: ngram_search_fwdtree.c(1561): 234585 candidate words for entering last phone (2094/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 28.91 CPU 25.809 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 18.27 wall 16.315 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 167 words
INFO: ngram_search_fwdflat.c(948): 2425 words recognized (22/fr)
INFO: ngram_search_fwdflat.c(950): 166879 senones evaluated (1490/fr)
INFO: ngram_search_fwdflat.c(952): 279618 channels searched (2496/fr)
INFO: ngram_search_fwdflat.c(954): 13084 words searched (116/fr)
INFO: ngram_search_fwdflat.c(957): 9930 word transitions (88/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.08 CPU 0.070 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.08 wall 0.074 xRT
INFO: ngram_search.c(467): Resized score stack to 200000 entries
INFO: ngram_search.c(459): Resized backpointer table to 10000 entries
INFO: cmn_live.c(120): Update from < 92.77 -8.91 -9.40 -6.80 -11.46 -0.71 -2.89 -0.45 1.43 -3.17 -1.35 0.17 3.64 >
INFO: cmn_live.c(138): Update to < 93.21 -8.67 -8.88 -5.56 -10.87 -0.19 -2.78 -0.73 2.84 -3.07 -1.93 0.79 2.44 >
INFO: ngram_search_fwdtree.c(1550): 6212 words recognized (97/fr)
INFO: ngram_search_fwdtree.c(1552): 251244 senones evaluated (3926/fr)
INFO: ngram_search_fwdtree.c(1556): 2124350 channels searched (33192/fr), 43302 1st, 190124 last
INFO: ngram_search_fwdtree.c(1559): 9693 words for which last channels evaluated (151/fr)
INFO: ngram_search_fwdtree.c(1561): 212658 candidate words for entering last phone (3322/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 13.48 CPU 21.069 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 9.23 wall 14.425 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 311 words
INFO: ngram_search_fwdflat.c(948): 3854 words recognized (60/fr)
INFO: ngram_search_fwdflat.c(950): 138888 senones evaluated (2170/fr)
INFO: ngram_search_fwdflat.c(952): 414216 channels searched (6472/fr)
INFO: ngram_search_fwdflat.c(954): 18404 words searched (287/fr)
INFO: ngram_search_fwdflat.c(957): 11245 word transitions (175/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.14 CPU 0.220 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.12 wall 0.187 xRT
INFO: cmn_live.c(120): Update from < 93.21 -8.67 -8.88 -5.56 -10.87 -0.19 -2.78 -0.73 2.84 -3.07 -1.93 0.79 2.44 >
INFO: cmn_live.c(138): Update to < 93.41 -9.57 -8.47 -5.47 -10.06 0.42 -2.95 -1.82 2.57 -2.62 -0.98 1.79 2.24 >
INFO: ngram_search_fwdtree.c(1550): 3858 words recognized (54/fr)
INFO: ngram_search_fwdtree.c(1552): 236795 senones evaluated (3289/fr)
INFO: ngram_search_fwdtree.c(1556): 1169796 channels searched (16247/fr), 40281 1st, 142488 last
INFO: ngram_search_fwdtree.c(1559): 7030 words for which last channels evaluated (97/fr)
INFO: ngram_search_fwdtree.c(1561): 72648 candidate words for entering last phone (1009/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 17.91 CPU 24.870 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 12.32 wall 17.115 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 155 words
INFO: ngram_search_fwdflat.c(948): 2938 words recognized (41/fr)
INFO: ngram_search_fwdflat.c(950): 111642 senones evaluated (1551/fr)
INFO: ngram_search_fwdflat.c(952): 229547 channels searched (3188/fr)
INFO: ngram_search_fwdflat.c(954): 10456 words searched (145/fr)
INFO: ngram_search_fwdflat.c(957): 6314 word transitions (87/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.08 CPU 0.109 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.08 wall 0.104 xRT
INFO: cmn_live.c(120): Update from < 93.41 -9.57 -8.47 -5.47 -10.06 0.42 -2.95 -1.82 2.57 -2.62 -0.98 1.79 2.24 >
INFO: cmn_live.c(138): Update to < 93.74 -10.48 -8.87 -5.63 -9.42 0.32 -3.21 -2.07 2.21 -1.54 -0.68 1.97 1.85 >
INFO: ngram_search_fwdtree.c(1550): 7019 words recognized (80/fr)
INFO: ngram_search_fwdtree.c(1552): 325095 senones evaluated (3694/fr)
INFO: ngram_search_fwdtree.c(1556): 1955385 channels searched (22220/fr), 58503 1st, 232252 last
INFO: ngram_search_fwdtree.c(1559): 11334 words for which last channels evaluated (128/fr)
INFO: ngram_search_fwdtree.c(1561): 115217 candidate words for entering last phone (1309/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 18.08 CPU 20.543 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 11.98 wall 13.615 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 365 words
INFO: ngram_search_fwdflat.c(948): 2632 words recognized (30/fr)
INFO: ngram_search_fwdflat.c(950): 176368 senones evaluated (2004/fr)
INFO: ngram_search_fwdflat.c(952): 522457 channels searched (5937/fr)
INFO: ngram_search_fwdflat.c(954): 23759 words searched (269/fr)
INFO: ngram_search_fwdflat.c(957): 13859 word transitions (157/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.11 CPU 0.124 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.11 wall 0.130 xRT
INFO: cmn_live.c(120): Update from < 93.74 -10.48 -8.87 -5.63 -9.42 0.32 -3.21 -2.07 2.21 -1.54 -0.68 1.97 1.85 >
INFO: cmn_live.c(138): Update to < 93.53 -10.03 -8.85 -4.80 -8.58 0.32 -3.52 -2.14 2.91 -1.47 -0.63 2.22 1.90 >
INFO: ngram_search_fwdtree.c(1550): 883 words recognized (21/fr)
INFO: ngram_search_fwdtree.c(1552): 146888 senones evaluated (3416/fr)
INFO: ngram_search_fwdtree.c(1556): 750300 channels searched (17448/fr), 25850 1st, 46629 last
INFO: ngram_search_fwdtree.c(1559): 2323 words for which last channels evaluated (54/fr)
INFO: ngram_search_fwdtree.c(1561): 53709 candidate words for entering last phone (1249/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 17.42 CPU 40.516 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 12.03 wall 27.979 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 53 words
INFO: ngram_search_fwdflat.c(948): 650 words recognized (15/fr)
INFO: ngram_search_fwdflat.c(950): 37862 senones evaluated (881/fr)
INFO: ngram_search_fwdflat.c(952): 45609 channels searched (1060/fr)
INFO: ngram_search_fwdflat.c(954): 2226 words searched (51/fr)
INFO: ngram_search_fwdflat.c(957): 1814 word transitions (42/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.02 CPU 0.036 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.02 wall 0.042 xRT
INFO: cmn_live.c(120): Update from < 93.53 -10.03 -8.85 -4.80 -8.58 0.32 -3.52 -2.14 2.91 -1.47 -0.63 2.22 1.90 >
INFO: cmn_live.c(138): Update to < 92.82 -9.69 -8.69 -5.19 -8.84 0.28 -2.89 -2.53 2.95 -0.77 0.05 2.61 1.68 >
INFO: ngram_search_fwdtree.c(1550): 868 words recognized (19/fr)
INFO: ngram_search_fwdtree.c(1552): 165478 senones evaluated (3597/fr)
INFO: ngram_search_fwdtree.c(1556): 1148658 channels searched (24970/fr), 30324 1st, 29845 last
INFO: ngram_search_fwdtree.c(1559): 1763 words for which last channels evaluated (38/fr)
INFO: ngram_search_fwdtree.c(1561): 103611 candidate words for entering last phone (2252/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 12.61 CPU 27.412 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 8.01 wall 17.409 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 46 words
INFO: ngram_search_fwdflat.c(948): 585 words recognized (13/fr)
INFO: ngram_search_fwdflat.c(950): 35815 senones evaluated (779/fr)
INFO: ngram_search_fwdflat.c(952): 37653 channels searched (818/fr)
INFO: ngram_search_fwdflat.c(954): 2052 words searched (44/fr)
INFO: ngram_search_fwdflat.c(957): 1642 word transitions (35/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.02 CPU 0.034 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.02 wall 0.043 xRT
INFO: cmn_live.c(120): Update from < 92.82 -9.69 -8.69 -5.19 -8.84 0.28 -2.89 -2.53 2.95 -0.77 0.05 2.61 1.68 >
INFO: cmn_live.c(138): Update to < 93.42 -9.82 -8.43 -4.88 -8.43 0.13 -2.56 -2.49 3.26 -0.28 0.04 2.72 1.43 >
INFO: ngram_search_fwdtree.c(1550): 6952 words recognized (67/fr)
INFO: ngram_search_fwdtree.c(1552): 414969 senones evaluated (3990/fr)
INFO: ngram_search_fwdtree.c(1556): 2748306 channels searched (26426/fr), 71316 1st, 227747 last
INFO: ngram_search_fwdtree.c(1559): 11669 words for which last channels evaluated (112/fr)
INFO: ngram_search_fwdtree.c(1561): 197819 candidate words for entering last phone (1902/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 14.70 CPU 14.138 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 10.16 wall 9.771 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 267 words
INFO: ngram_search_fwdflat.c(948): 4301 words recognized (41/fr)
INFO: ngram_search_fwdflat.c(950): 215433 senones evaluated (2071/fr)
INFO: ngram_search_fwdflat.c(952): 503454 channels searched (4840/fr)
INFO: ngram_search_fwdflat.c(954): 22257 words searched (214/fr)
INFO: ngram_search_fwdflat.c(957): 14100 word transitions (135/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.16 CPU 0.150 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.15 wall 0.146 xRT
INFO: cmn_live.c(120): Update from < 93.42 -9.82 -8.43 -4.88 -8.43 0.13 -2.56 -2.49 3.26 -0.28 0.04 2.72 1.43 >
INFO: cmn_live.c(138): Update to < 93.27 -9.95 -7.95 -3.79 -7.90 0.07 -2.63 -2.50 3.61 -0.30 -0.63 2.60 1.69 >
INFO: ngram_search_fwdtree.c(1550): 1560 words recognized (32/fr)
INFO: ngram_search_fwdtree.c(1552): 174992 senones evaluated (3646/fr)
INFO: ngram_search_fwdtree.c(1556): 1340415 channels searched (27925/fr), 30752 1st, 70997 last
INFO: ngram_search_fwdtree.c(1559): 3334 words for which last channels evaluated (69/fr)
INFO: ngram_search_fwdtree.c(1561): 135428 candidate words for entering last phone (2821/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 5.31 CPU 11.068 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 3.83 wall 7.975 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 99 words
INFO: ngram_search_fwdflat.c(948): 1259 words recognized (26/fr)
INFO: ngram_search_fwdflat.c(950): 61872 senones evaluated (1289/fr)
INFO: ngram_search_fwdflat.c(952): 107099 channels searched (2231/fr)
INFO: ngram_search_fwdflat.c(954): 4479 words searched (93/fr)
INFO: ngram_search_fwdflat.c(957): 3862 word transitions (80/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.05 CPU 0.098 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.04 wall 0.083 xRT
INFO: cmn_live.c(120): Update from < 93.27 -9.95 -7.95 -3.79 -7.90 0.07 -2.63 -2.50 3.61 -0.30 -0.63 2.60 1.69 >
INFO: cmn_live.c(138): Update to < 93.46 -10.27 -7.86 -3.87 -7.63 -0.25 -2.72 -2.36 3.36 -0.15 -0.46 2.59 1.64 >
INFO: ngram_search_fwdtree.c(1550): 3794 words recognized (53/fr)
INFO: ngram_search_fwdtree.c(1552): 239815 senones evaluated (3331/fr)
INFO: ngram_search_fwdtree.c(1556): 1394923 channels searched (19373/fr), 44549 1st, 127514 last
INFO: ngram_search_fwdtree.c(1559): 6614 words for which last channels evaluated (91/fr)
INFO: ngram_search_fwdtree.c(1561): 107867 candidate words for entering last phone (1498/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 23.55 CPU 32.704 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 15.71 wall 21.824 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 174 words
INFO: ngram_search_fwdflat.c(948): 2625 words recognized (36/fr)
INFO: ngram_search_fwdflat.c(950): 112676 senones evaluated (1565/fr)
INFO: ngram_search_fwdflat.c(952): 238159 channels searched (3307/fr)
INFO: ngram_search_fwdflat.c(954): 11272 words searched (156/fr)
INFO: ngram_search_fwdflat.c(957): 6236 word transitions (86/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.08 CPU 0.109 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.08 wall 0.110 xRT
INFO: cmn_live.c(120): Update from < 93.46 -10.27 -7.86 -3.87 -7.63 -0.25 -2.72 -2.36 3.36 -0.15 -0.46 2.59 1.64 >
INFO: cmn_live.c(138): Update to < 93.46 -10.27 -7.86 -3.87 -7.63 -0.25 -2.72 -2.36 3.36 -0.15 -0.46 2.59 1.64 >
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 179.72 CPU 24.721 xRT
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 121.51 wall 16.713 xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.75 CPU 0.103 xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.73 wall 0.100 xRT
INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall 0.000 xRT
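The `xRT` figures in the log above are real-time factors: processing time divided by the duration of the audio that was decoded. The sketch below reproduces the first `fwdtree` CPU figure. It assumes the utterance length can be inferred from the per-frame statistics (237595 senones at 2700 per frame, about 88 frames) and uses the `-frate 100` value from the configuration dump. The function name `real_time_factor` is hypothetical, and the log's per-frame averages are rounded, so the frame count is an estimate.

```python
def real_time_factor(cpu_seconds, n_frames, frame_rate=100):
    """Processing time divided by audio duration (values above 1 are slower than real time)."""
    audio_seconds = n_frames / float(frame_rate)
    return cpu_seconds / audio_seconds


# Frame count inferred from "237595 senones evaluated (2700/fr)" in the first utterance.
n_frames = round(237595 / 2700.0)

# "fwdtree 18.25 CPU" is taken from the same utterance.
rtf = real_time_factor(18.25, n_frames)
print("%d frames, %.3f xRT" % (n_frames, rtf))
```

Read this way, the first pass in this log takes roughly 20 seconds of CPU time per second of audio, while the `fwdflat` pass stays well under real time.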
You have an endianness issue when you convert the 16-bit buffer to bytes here:
Buffer.BlockCopy(buffer.CurrentBuffer16.ToArray(), 0, audio, 0, audio.Length);
You need to swap the two bytes of each 16-bit sample.
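As an illustration of what swapping the bytes of 16-bit samples means, here is a small Python sketch (not the C# from the question). The helper names `samples_to_bytes` and `swap_bytes_16` are hypothetical. Whether a swap is actually needed depends on the byte order the capture code produces compared with the one the decoder expects. The configuration dump in the question lists `-input_endian little`.

```python
import struct


def samples_to_bytes(samples, byteorder="little"):
    """Pack signed 16-bit samples into raw PCM bytes with an explicit byte order."""
    prefix = "<" if byteorder == "little" else ">"
    return struct.pack("%s%dh" % (prefix, len(samples)), *samples)


def swap_bytes_16(raw):
    """Swap the two bytes of every 16-bit sample in a raw PCM buffer."""
    if len(raw) % 2:
        raise ValueError("buffer length must be a multiple of 2")
    swapped = bytearray(len(raw))
    swapped[0::2] = raw[1::2]  # second byte of each sample moves to the front
    swapped[1::2] = raw[0::2]  # first byte of each sample moves to the back
    return bytes(swapped)


samples = [1, -2, 256, 32767]
little = samples_to_bytes(samples, "little")
big = samples_to_bytes(samples, "big")
print(swap_bytes_16(little) == big)
```

Packing with an explicit byte order, as `samples_to_bytes` does, avoids depending on the host's native order in the first place.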
I am trying to get a feel for what I should expect in terms of performance from cloud storage.
I just ran the gsutil perfdiag from a compute engine instance in the same location (US) and the same project as my cloud storage bucket.
For Nearline storage I get 25 Mibit/s read and 353 Mibit/s write. Is that low, high, or average, and why is there such a discrepancy between read and write?
==============================================================================
DIAGNOSTIC RESULTS
==============================================================================
------------------------------------------------------------------------------
Latency
------------------------------------------------------------------------------
Operation Size Trials Mean (ms) Std Dev (ms) Median (ms) 90th % (ms)
========= ========= ====== ========= ============ =========== ===========
Delete 0 B 5 112.0 52.9 78.2 173.6
Delete 1 KiB 5 94.1 17.5 90.8 115.0
Delete 100 KiB 5 80.4 2.5 79.9 83.4
Delete 1 MiB 5 86.7 3.7 88.2 90.4
Download 0 B 5 58.1 3.8 57.8 62.2
Download 1 KiB 5 2892.4 1071.5 2589.1 4111.9
Download 100 KiB 5 1955.0 711.3 1764.9 2814.3
Download 1 MiB 5 2679.4 976.2 2216.2 3869.9
Metadata 0 B 5 69.1 57.0 42.8 129.3
Metadata 1 KiB 5 37.4 1.5 37.1 39.0
Metadata 100 KiB 5 64.2 47.7 40.9 113.0
Metadata 1 MiB 5 45.7 9.1 49.4 55.1
Upload 0 B 5 138.3 21.0 122.5 164.8
Upload 1 KiB 5 170.6 61.5 139.4 242.0
Upload 100 KiB 5 387.2 294.5 245.8 706.1
Upload 1 MiB 5 257.4 51.3 228.4 319.7
------------------------------------------------------------------------------
Write Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Write throughput: 353.13 Mibit/s.
------------------------------------------------------------------------------
Read Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Read throughput: 25.16 Mibit/s.
------------------------------------------------------------------------------
System Information
------------------------------------------------------------------------------
IP Address:
##.###.###.##
Temporary Directory:
/tmp
Bucket URI:
gs://pl_twitter/
gsutil Version:
4.12
boto Version:
2.30.0
Measurement time:
2015-05-11 07:03:26 PM
Google Server:
Google Server IP Addresses:
##.###.###.###
Google Server Hostnames:
Google DNS thinks your IP is:
CPU Count:
4
CPU Load Average:
[0.16, 0.05, 0.06]
Total Memory:
14.38 GiB
Free Memory:
11.34 GiB
TCP segments sent during test:
5592296
TCP segments received during test:
2417850
TCP segments retransmit during test:
3794
Disk Counter Deltas:
disk reads writes rbytes wbytes rtime wtime
sda1 31 5775 126976 1091674112 856 1603544
TCP /proc values:
wmem_default = 212992
wmem_max = 212992
rmem_default = 212992
tcp_timestamps = 1
tcp_window_scaling = 1
tcp_sack = 1
rmem_max = 212992
Boto HTTPS Enabled:
True
Requests routed through proxy:
False
Latency of the DNS lookup for Google Storage server (ms):
2.5
Latencies connecting to Google Storage server IPs (ms):
##.###.###.### = 1.1
------------------------------------------------------------------------------
In-Process HTTP Statistics
------------------------------------------------------------------------------
Total HTTP requests made: 94
HTTP 5xx errors: 0
HTTP connections broken: 0
Availability: 100%
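To put the Nearline figures above into more familiar units, the following sketch converts the reported Mibit/s rates to MiB/s and estimates how long one 1 GiB copy takes at each rate. The rates are taken from the report above and the function names are hypothetical. It is plain arithmetic and says nothing about what causes the gap.

```python
def mibit_to_mib_per_s(rate_mibit):
    """Convert a rate in Mibit/s to MiB/s (8 bits per byte)."""
    return rate_mibit / 8.0


def seconds_per_gib(rate_mibit):
    """Time to move 1 GiB (1024 MiB = 8192 Mibit) at the given Mibit/s rate."""
    return (1024 * 8) / rate_mibit


read_rate = 25.16    # Mibit/s, Nearline read throughput from the report
write_rate = 353.13  # Mibit/s, Nearline write throughput from the report

for label, rate in (("read", read_rate), ("write", write_rate)):
    print("%s: %.2f MiB/s, %.1f s per GiB"
          % (label, mibit_to_mib_per_s(rate), seconds_per_gib(rate)))

ratio = write_rate / read_rate
print("write is %.1fx faster than read" % ratio)
```

At these rates a 1 GiB read takes a little over five minutes, while the same write takes under half a minute.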
For standard storage I get:
==============================================================================
DIAGNOSTIC RESULTS
==============================================================================
------------------------------------------------------------------------------
Latency
------------------------------------------------------------------------------
Operation Size Trials Mean (ms) Std Dev (ms) Median (ms) 90th % (ms)
========= ========= ====== ========= ============ =========== ===========
Delete 0 B 5 121.9 34.8 105.1 158.9
Delete 1 KiB 5 159.3 58.2 126.0 232.3
Delete 100 KiB 5 106.8 17.0 103.3 125.7
Delete 1 MiB 5 167.0 77.3 145.1 251.0
Download 0 B 5 87.2 10.3 81.1 100.0
Download 1 KiB 5 95.5 18.0 92.4 115.6
Download 100 KiB 5 156.7 20.5 155.8 179.6
Download 1 MiB 5 219.6 11.7 213.4 232.6
Metadata 0 B 5 59.7 4.5 57.8 64.4
Metadata 1 KiB 5 61.0 21.8 49.6 85.4
Metadata 100 KiB 5 55.3 10.4 50.7 67.7
Metadata 1 MiB 5 75.6 27.8 67.4 109.0
Upload 0 B 5 162.7 37.0 139.0 207.7
Upload 1 KiB 5 165.2 23.6 152.3 194.1
Upload 100 KiB 5 392.1 235.0 268.7 643.0
Upload 1 MiB 5 387.0 79.5 340.9 486.1
------------------------------------------------------------------------------
Write Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Write throughput: 515.63 Mibit/s.
------------------------------------------------------------------------------
Read Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Read throughput: 123.14 Mibit/s.
------------------------------------------------------------------------------
System Information
------------------------------------------------------------------------------
IP Address:
10.240.133.190
Temporary Directory:
/tmp
Bucket URI:
gs://test_throughput_standard/
gsutil Version:
4.12
boto Version:
2.30.0
Measurement time:
2015-05-21 11:08:50 AM
Google Server:
Google Server IP Addresses:
##.###.##.###
Google Server Hostnames:
Google DNS thinks your IP is:
CPU Count:
8
CPU Load Average:
[0.28, 0.18, 0.08]
Total Memory:
49.91 GiB
Free Memory:
47.9 GiB
TCP segments sent during test:
5165461
TCP segments received during test:
1881727
TCP segments retransmit during test:
3423
Disk Counter Deltas:
disk reads writes rbytes wbytes rtime wtime
dm-0 0 0 0 0 0 0
loop0 0 0 0 0 0 0
loop1 0 0 0 0 0 0
sda1 0 4229 0 1080618496 0 1605286
TCP /proc values:
wmem_default = 212992
wmem_max = 212992
rmem_default = 212992
tcp_timestamps = 1
tcp_window_scaling = 1
tcp_sack = 1
rmem_max = 212992
Boto HTTPS Enabled:
True
Requests routed through proxy:
False
Latency of the DNS lookup for Google Storage server (ms):
1.2
Latencies connecting to Google Storage server IPs (ms):
##.###.##.### = 1.3
------------------------------------------------------------------------------
In-Process HTTP Statistics
------------------------------------------------------------------------------
Total HTTP requests made: 94
HTTP 5xx errors: 0
HTTP connections broken: 0
Availability: 100%
==============================================================================
DIAGNOSTIC RESULTS
==============================================================================
------------------------------------------------------------------------------
Latency
------------------------------------------------------------------------------
Operation Size Trials Mean (ms) Std Dev (ms) Median (ms) 90th % (ms)
========= ========= ====== ========= ============ =========== ===========
Delete 0 B 5 145.1 59.4 117.8 215.2
Delete 1 KiB 5 178.0 51.4 190.6 224.3
Delete 100 KiB 5 98.3 5.0 96.6 104.3
Delete 1 MiB 5 117.7 19.2 112.0 140.2
Download 0 B 5 109.4 38.9 91.9 156.5
Download 1 KiB 5 149.5 41.0 141.9 192.5
Download 100 KiB 5 106.9 20.3 108.6 127.8
Download 1 MiB 5 121.1 16.0 112.2 140.9
Metadata 0 B 5 70.0 10.8 76.8 79.9
Metadata 1 KiB 5 113.8 36.6 124.0 148.7
Metadata 100 KiB 5 63.1 20.2 55.7 86.5
Metadata 1 MiB 5 59.2 4.9 61.3 62.9
Upload 0 B 5 127.5 22.6 117.4 153.6
Upload 1 KiB 5 215.2 54.8 221.4 270.4
Upload 100 KiB 5 229.8 79.2 171.6 329.8
Upload 1 MiB 5 489.8 412.3 295.3 915.4
------------------------------------------------------------------------------
Write Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Write throughput: 503 Mibit/s.
------------------------------------------------------------------------------
Read Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Read throughput: 1.05 Gibit/s.
------------------------------------------------------------------------------
System Information
------------------------------------------------------------------------------
IP Address:
################
Temporary Directory:
/tmp
Bucket URI:
gs://test_throughput_standard/
gsutil Version:
4.12
boto Version:
2.30.0
Measurement time:
2015-05-21 06:20:49 PM
Google Server:
Google Server IP Addresses:
#############
Google Server Hostnames:
Google DNS thinks your IP is:
CPU Count:
8
CPU Load Average:
[0.08, 0.03, 0.05]
Total Memory:
49.91 GiB
Free Memory:
47.95 GiB
TCP segments sent during test:
4958020
TCP segments received during test:
2326124
TCP segments retransmit during test:
2163
Disk Counter Deltas:
disk reads writes rbytes wbytes rtime wtime
dm-0 0 0 0 0 0 0
loop0 0 0 0 0 0 0
loop1 0 0 0 0 0 0
sda1 0 4202 0 1080475136 0 1610000
TCP /proc values:
wmem_default = 212992
wmem_max = 212992
rmem_default = 212992
tcp_timestamps = 1
tcp_window_scaling = 1
tcp_sack = 1
rmem_max = 212992
Boto HTTPS Enabled:
True
Requests routed through proxy:
False
Latency of the DNS lookup for Google Storage server (ms):
1.6
Latencies connecting to Google Storage server IPs (ms):
############ = 1.3
2nd Run:
==============================================================================
DIAGNOSTIC RESULTS
==============================================================================
------------------------------------------------------------------------------
Latency
------------------------------------------------------------------------------
Operation Size Trials Mean (ms) Std Dev (ms) Median (ms) 90th % (ms)
========= ========= ====== ========= ============ =========== ===========
Delete 0 B 5 91.5 14.0 85.1 106.0
Delete 1 KiB 5 125.4 76.2 91.7 203.3
Delete 100 KiB 5 104.4 15.9 99.0 123.2
Delete 1 MiB 5 128.2 36.0 116.4 170.7
Download 0 B 5 60.2 8.3 63.0 68.7
Download 1 KiB 5 62.6 11.3 61.6 74.8
Download 100 KiB 5 103.2 21.3 110.7 123.8
Download 1 MiB 5 137.1 18.5 130.3 159.8
Metadata 0 B 5 73.4 35.9 62.3 114.2
Metadata 1 KiB 5 55.9 18.1 55.3 75.6
Metadata 100 KiB 5 45.7 11.0 42.5 59.1
Metadata 1 MiB 5 49.9 7.9 49.2 58.8
Upload 0 B 5 128.2 24.6 115.5 158.8
Upload 1 KiB 5 153.5 44.1 132.4 206.4
Upload 100 KiB 5 176.8 26.8 165.1 209.7
Upload 1 MiB 5 277.9 80.2 214.7 378.5
------------------------------------------------------------------------------
Write Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Write throughput: 463.76 Mibit/s.
------------------------------------------------------------------------------
Read Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Read throughput: 184.96 Mibit/s.
------------------------------------------------------------------------------
System Information
------------------------------------------------------------------------------
IP Address:
#################
Temporary Directory:
/tmp
Bucket URI:
gs://test_throughput_standard/
gsutil Version:
4.12
boto Version:
2.30.0
Measurement time:
2015-05-21 06:24:31 PM
Google Server:
Google Server IP Addresses:
####################
Google Server Hostnames:
Google DNS thinks your IP is:
CPU Count:
8
CPU Load Average:
[0.19, 0.17, 0.11]
Total Memory:
49.91 GiB
Free Memory:
47.9 GiB
TCP segments sent during test:
5180256
TCP segments received during test:
2034323
TCP segments retransmit during test:
2883
Disk Counter Deltas:
disk reads writes rbytes wbytes rtime wtime
dm-0 0 0 0 0 0 0
loop0 0 0 0 0 0 0
loop1 0 0 0 0 0 0
sda1 0 4209 0 1080480768 0 1604066
TCP /proc values:
wmem_default = 212992
wmem_max = 212992
rmem_default = 212992
tcp_timestamps = 1
tcp_window_scaling = 1
tcp_sack = 1
rmem_max = 212992
Boto HTTPS Enabled:
True
Requests routed through proxy:
False
Latency of the DNS lookup for Google Storage server (ms):
3.5
Latencies connecting to Google Storage server IPs (ms):
################ = 1.1
------------------------------------------------------------------------------
In-Process HTTP Statistics
------------------------------------------------------------------------------
Total HTTP requests made: 94
HTTP 5xx errors: 0
HTTP connections broken: 0
Availability: 100%
3rd run
==============================================================================
DIAGNOSTIC RESULTS
==============================================================================
------------------------------------------------------------------------------
Latency
------------------------------------------------------------------------------
Operation Size Trials Mean (ms) Std Dev (ms) Median (ms) 90th % (ms)
========= ========= ====== ========= ============ =========== ===========
Delete 0 B 5 157.0 78.3 101.5 254.9
Delete 1 KiB 5 153.5 49.1 178.3 202.5
Delete 100 KiB 5 152.9 47.5 168.0 202.6
Delete 1 MiB 5 110.6 20.4 105.7 134.5
Download 0 B 5 104.4 50.5 66.8 167.6
Download 1 KiB 5 68.1 11.1 68.7 79.2
Download 100 KiB 5 85.5 5.8 86.0 90.8
Download 1 MiB 5 126.6 40.1 100.5 175.0
Metadata 0 B 5 67.9 16.2 61.0 86.6
Metadata 1 KiB 5 49.3 8.6 44.9 59.5
Metadata 100 KiB 5 66.6 35.4 44.2 107.8
Metadata 1 MiB 5 53.9 13.2 52.1 69.4
Upload 0 B 5 136.7 37.1 114.4 183.5
Upload 1 KiB 5 145.5 58.3 116.8 208.2
Upload 100 KiB 5 227.3 37.6 233.3 259.3
Upload 1 MiB 5 274.8 45.2 261.8 328.5
------------------------------------------------------------------------------
Write Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Write throughput: 407.03 Mibit/s.
------------------------------------------------------------------------------
Read Throughput
------------------------------------------------------------------------------
Copied a 1 GiB file 5 times for a total transfer size of 5 GiB.
Read throughput: 629.07 Mibit/s.
------------------------------------------------------------------------------
System Information
------------------------------------------------------------------------------
IP Address:
###############
Temporary Directory:
/tmp
Bucket URI:
gs://test_throughput_standard/
gsutil Version:
4.12
boto Version:
2.30.0
Measurement time:
2015-05-21 06:32:48 PM
Google Server:
Google Server IP Addresses:
################
Google Server Hostnames:
Google DNS thinks your IP is:
CPU Count:
8
CPU Load Average:
[0.11, 0.13, 0.13]
Total Memory:
49.91 GiB
Free Memory:
47.94 GiB
TCP segments sent during test:
5603925
TCP segments received during test:
2438425
TCP segments retransmit during test:
4586
Disk Counter Deltas:
disk reads writes rbytes wbytes rtime wtime
dm-0 0 0 0 0 0 0
loop0 0 0 0 0 0 0
loop1 0 0 0 0 0 0
sda1 0 4185 0 1080353792 0 1603851
TCP /proc values:
wmem_default = 212992
wmem_max = 212992
rmem_default = 212992
tcp_timestamps = 1
tcp_window_scaling = 1
tcp_sack = 1
rmem_max = 212992
Boto HTTPS Enabled:
True
Requests routed through proxy:
False
Latency of the DNS lookup for Google Storage server (ms):
2.2
Latencies connecting to Google Storage server IPs (ms):
############## = 1.6
All things being equal, write performance is generally higher on modern storage systems because of the caching layer between the application and the disks. That said, what you are seeing is within the expected range for "nearline" storage.
I have observed far better throughput when using "standard" storage buckets, though latency did not improve much. Consider using a "Standard" bucket if your application requires high throughput. If your application is sensitive to latency, then using local storage as a cache (or scratch space) may be the only option.
Here is a snippet from one of my experiments on "Standard" buckets:
------------------------------------------------------------------------------
Latency
------------------------------------------------------------------------------
Operation Size Trials Mean (ms) Std Dev (ms) Median (ms) 90th % (ms)
========= ========= ====== ========= ============ =========== ===========
Delete 0 B 10 91.5 12.4 89.0 98.5
Delete 1 KiB 10 96.4 9.1 95.6 105.6
Delete 100 KiB 10 92.9 22.8 85.3 102.4
Delete 1 MiB 10 86.4 9.1 84.1 93.2
Download 0 B 10 54.2 5.1 55.4 58.8
Download 1 KiB 10 83.3 18.7 78.4 94.9
Download 100 KiB 10 75.2 14.5 68.6 92.6
Download 1 MiB 10 95.0 19.7 86.3 126.7
Metadata 0 B 10 33.5 7.9 31.1 44.8
Metadata 1 KiB 10 36.3 7.2 35.8 46.8
Metadata 100 KiB 10 37.7 9.2 36.6 44.1
Metadata 1 MiB 10 116.1 231.3 36.6 136.1
Upload 0 B 10 151.4 67.5 122.9 195.9
Upload 1 KiB 10 134.2 22.4 127.9 149.3
Upload 100 KiB 10 168.8 20.5 168.6 188.6
Upload 1 MiB 10 213.3 37.6 200.2 262.5
------------------------------------------------------------------------------
Write Throughput
------------------------------------------------------------------------------
Copied 5 1 GiB file(s) for a total transfer size of 10 GiB.
Write throughput: 3.46 Gibit/s.
Parallelism strategy: both
------------------------------------------------------------------------------
Write Throughput With File I/O
------------------------------------------------------------------------------
Copied 5 1 GiB file(s) for a total transfer size of 10 GiB.
Write throughput: 3.9 Gibit/s.
Parallelism strategy: both
------------------------------------------------------------------------------
Read Throughput
------------------------------------------------------------------------------
Copied 5 1 GiB file(s) for a total transfer size of 10 GiB.
Read throughput: 7.04 Gibit/s.
Parallelism strategy: both
------------------------------------------------------------------------------
Read Throughput With File I/O
------------------------------------------------------------------------------
Copied 5 1 GiB file(s) for a total transfer size of 10 GiB.
Read throughput: 1.64 Gibit/s.
Parallelism strategy: both
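The throughput figures in this thread are reported in mixed units (Mibit/s and Gibit/s), which makes them awkward to compare by eye. A minimal Python sketch that normalizes them is shown here; the helper names and the labels in `runs` are illustrative, and the numbers are copied from the perfdiag outputs quoted above.

```python
# Convert gsutil perfdiag throughput strings to common units for comparison.
# Helper names and run labels are illustrative; figures come from this thread.

UNIT_TO_MIBIT = {"Mibit/s": 1.0, "Gibit/s": 1024.0}

def to_mibit_per_s(text):
    """Parse a string such as '3.46 Gibit/s' into Mibit/s."""
    value, unit = text.rstrip(".").split()
    return float(value) * UNIT_TO_MIBIT[unit]

def to_mib_per_s(text):
    """Same figure expressed in MiB/s (8 bits per byte)."""
    return to_mibit_per_s(text) / 8.0

runs = {
    "'For standard storage' run, write": "515.63 Mibit/s",
    "'For standard storage' run, read": "123.14 Mibit/s",
    "experiment above, write": "3.46 Gibit/s",
    "experiment above, read": "7.04 Gibit/s",
}

for label, figure in runs.items():
    print(f"{label}: {to_mibit_per_s(figure):.2f} Mibit/s "
          f"= {to_mib_per_s(figure):.2f} MiB/s")
```

Note that the experiment above reports "Parallelism strategy: both" and copies 5 files at once, while the earlier runs copied a single 1 GiB file 5 times, so the two sets of numbers are not a like-for-like comparison.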
Hope that is helpful.
I can't understand where my Ceph raw space has gone.
cluster 90dc9682-8f2c-4c8e-a589-13898965b974
health HEALTH_WARN 72 pgs backfill; 26 pgs backfill_toofull; 51 pgs backfilling; 141 pgs stuck unclean; 5 requests are blocked > 32 sec; recovery 450170/8427917 objects degraded (5.341%); 5 near full osd(s)
monmap e17: 3 mons at {enc18=192.168.100.40:6789/0,enc24=192.168.100.43:6789/0,enc26=192.168.100.44:6789/0}, election epoch 734, quorum 0,1,2 enc18,enc24,enc26
osdmap e3326: 14 osds: 14 up, 14 in
pgmap v5461448: 1152 pgs, 3 pools, 15252 GB data, 3831 kobjects
31109 GB used, 7974 GB / 39084 GB avail
450170/8427917 objects degraded (5.341%)
18 active+remapped+backfill_toofull
1011 active+clean
64 active+remapped+wait_backfill
8 active+remapped+wait_backfill+backfill_toofull
51 active+remapped+backfilling
recovery io 58806 kB/s, 14 objects/s
OSD tree (each host has 2 OSDs):
# id weight type name up/down reweight
-1 36.45 root default
-2 5.44 host enc26
0 2.72 osd.0 up 1
1 2.72 osd.1 up 0.8227
-3 3.71 host enc24
2 0.99 osd.2 up 1
3 2.72 osd.3 up 1
-4 5.46 host enc22
4 2.73 osd.4 up 0.8
5 2.73 osd.5 up 1
-5 5.46 host enc18
6 2.73 osd.6 up 1
7 2.73 osd.7 up 1
-6 5.46 host enc20
9 2.73 osd.9 up 0.8
8 2.73 osd.8 up 1
-7 0 host enc28
-8 5.46 host archives
12 2.73 osd.12 up 1
13 2.73 osd.13 up 1
-9 5.46 host enc27
10 2.73 osd.10 up 1
11 2.73 osd.11 up 1
Real usage:
/dev/rbd0 14T 7.9T 5.5T 59% /mnt/ceph
Pool size:
osd pool default size = 2
Pools:
ceph osd lspools
0 data,1 metadata,2 rbd,
rados df
pool name category KB objects clones degraded unfound rd rd KB wr wr KB
data - 0 0 0 0 0 0 0 0 0
metadata - 0 0 0 0 0 0 0 0 0
rbd - 15993591918 3923880 0 444545 0 82936 1373339 2711424 849398218
total used 32631712348 3923880
total avail 8351008324
total space 40982720672
Raw usage is 4x the real usage. As I understand it, it should be 2x?
Yes, it should be 2x. I'm not really sure that the real usage is 7.9T. Why do you check this value on the mapped disk?
These are my pools:
pool name KB objects clones degraded unfound rd rd KB wr wr KB
admin-pack 7689982 1955 0 0 0 693841 3231750 40068930 353462603
public-cloud 105432663 26561 0 0 0 13001298 638035025 222540884 3740413431
rbdkvm_sata 32624026697 7968550 31783 0 0 4950258575 232374308589 12772302818 278106113879
total used 98289353680 7997066
total avail 34474223648
total space 132763577328
You can see that the total used space is roughly 3 times the used space in the rbdkvm_sata pool.
ceph -s shows the same result too:
pgmap v11303091: 5376 pgs, 3 pools, 31220 GB data, 7809 kobjects
93736 GB used, 32876 GB / 123 TB avail
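The comparison above can be checked by dividing the "total used" figure from `rados df` by the sum of the pool sizes. The following is a small Python sketch of that arithmetic; the function name is illustrative, and the numbers are copied from the two `rados df` outputs in this thread (values in KB).

```python
# Ratio of raw space used to logical pool data, from `rados df` output (KB).
# The function name is illustrative, not part of any Ceph tooling.

def raw_to_data_ratio(total_used_kb, pool_kb):
    """Return raw used space divided by the sum of the pools' logical sizes."""
    return total_used_kb / sum(pool_kb.values())

# Figures from the `rados df` output in the question.
question = raw_to_data_ratio(
    32631712348,
    {"data": 0, "metadata": 0, "rbd": 15993591918},
)

# Figures from the `rados df` output in this answer.
answer = raw_to_data_ratio(
    98289353680,
    {"admin-pack": 7689982, "public-cloud": 105432663,
     "rbdkvm_sata": 32624026697},
)

print(f"question cluster: {question:.2f}x")  # 2.04x
print(f"answer cluster:   {answer:.2f}x")    # 3.00x
```

Measured this way, the cluster in the question sits at about 2x, which matches a pool size of 2. The apparent 4x comes from comparing raw usage with the 7.9T that the filesystem on /dev/rbd0 reports, rather than with the 15252 GB that the pools actually hold.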
I don't think you have just one RBD image. The output of "ceph osd lspools" shows that you have 3 pools, one of which is named "metadata" (maybe you were using CephFS). /dev/rbd0 appears because you mapped that image, but you could have other images as well. To list the images, you can use "rbd list -p ". You can see an image's info with "rbd info -p "
I am using Ubuntu 12, nginx, uWSGI 1.9 with a socket, and Django 1.5.
Config:
[uwsgi]
base_path = /home/someuser/web/
module = server.manage_uwsgi
uid = www-data
gid = www-data
virtualenv = /home/someuser
master = true
vacuum = true
harakiri = 20
harakiri-verbose = true
log-x-forwarded-for = true
profiler = true
no-orphans = true
max-requests = 10000
cpu-affinity = 1
workers = 4
reload-on-as = 512
listen = 3000
Client tests from Windows 7:
C:\Users\user>C:\AppServ\Apache2.2\bin\ab.exe -c 255 -n 5000 http://www.someweb.com/about/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.someweb.com (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Finished 5000 requests
Server Software: nginx
Server Hostname: www.someweb.com
Server Port: 80
Document Path: /about/
Document Length: 1881 bytes
Concurrency Level: 255
Time taken for tests: 66.669814 seconds
Complete requests: 5000
Failed requests: 1
(Connect: 1, Length: 0, Exceptions: 0)
Write errors: 0
Total transferred: 10285000 bytes
HTML transferred: 9405000 bytes
Requests per second: 75.00 [#/sec] (mean)
Time per request: 3400.161 [ms] (mean)
Time per request: 13.334 [ms] (mean, across all concurrent requests)
Transfer rate: 150.64 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 8 207.8 1 9007
Processing: 10 3380 11480.5 440 54421
Waiting: 6 1060 3396.5 271 48424
Total: 11 3389 11498.5 441 54423
Percentage of the requests served within a certain time (ms)
50% 441
66% 466
75% 499
80% 519
90% 3415
95% 36440
98% 54407
99% 54413
100% 54423 (longest request)
I have also set the following options:
echo 3000 > /proc/sys/net/core/netdev_max_backlog
echo 3000 > /proc/sys/net/core/somaxconn
So:
1) The first 3000 requests complete very fast. I see progress in ab and in the uwsgi request logs:
[pid: 5056|app: 0|req: 518/4997] 80.114.157.139 () {30 vars in 378 bytes} [Thu Mar 21 12:37:31 2013] GET /about/ => generated 1881 bytes in 4 msecs (HTTP/1.0 200) 3 headers in 105 bytes (1 switches on core 0)
[pid: 5052|app: 0|req: 512/4998] 80.114.157.139 () {30 vars in 378 bytes} [Thu Mar 21 12:37:31 2013] GET /about/ => generated 1881 bytes in 4 msecs (HTTP/1.0 200) 3 headers in 105 bytes (1 switches on core 0)
[pid: 5054|app: 0|req: 353/4999] 80.114.157.139 () {30 vars in 378 bytes} [Thu Mar 21 12:37:31 2013] GET /about/ => generated 1881 bytes in 4 msecs (HTTP/1.0 200) 3 headers in 105 bytes (1 switches on core 0)
I don't have any broken pipes or worker respawns.
2) The next requests run very slowly or hit some timeout. It looks like some buffer fills up and I have to wait until it drains.
3) The buffer drains.
4) ~500 requests are processed very fast.
5) Another timeout.
6) See no. 4.
7) See no. 5.
8) See no. 4.
9) See no. 5.
....
....
I need your help.
Check with netstat and dmesg. You have probably exhausted the ephemeral ports or filled the conntrack table.
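As a rough complement to netstat and dmesg, the same counters can be read from /proc on the Linux host. The Python sketch below is only an illustration: the function names are made up for this example, the conntrack files exist only when the nf_conntrack module is loaded, and comparing TIME_WAIT sockets with the port range is a coarse indicator rather than a diagnosis.

```python
# Sketch: parse Linux /proc text to look for ephemeral-port or conntrack
# pressure. Function names are illustrative.
from collections import Counter

# Subset of the hex TCP state codes used in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}

def count_tcp_states(proc_net_tcp):
    """Count sockets per state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts

def ephemeral_port_count(port_range):
    """Number of ports in the text of ip_local_port_range."""
    low, high = (int(p) for p in port_range.split())
    return high - low + 1

def read(path):
    """Return file contents, or None if the file is missing or unreadable."""
    try:
        with open(path) as fh:
            return fh.read()
    except OSError:
        return None

if __name__ == "__main__":
    tcp = read("/proc/net/tcp")
    ports = read("/proc/sys/net/ipv4/ip_local_port_range")
    ct_count = read("/proc/sys/net/netfilter/nf_conntrack_count")
    ct_max = read("/proc/sys/net/netfilter/nf_conntrack_max")
    if tcp and ports:
        print("TIME_WAIT sockets:", count_tcp_states(tcp)["TIME_WAIT"],
              "of", ephemeral_port_count(ports), "ephemeral ports")
    if ct_count and ct_max:
        print("conntrack entries:", ct_count.strip(), "of", ct_max.strip())
```

If the TIME_WAIT count approaches the size of the port range, or the conntrack count approaches its maximum while the benchmark stalls, that would be consistent with the explanation above.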