Contact details

Contact WCL

Tel: +30 2610 996496
+30 2610 996480
Fax: +30 2610 997336
+30 2610 991855
Email: raniadou [AT] upatras.gr
 
 
Electrical and Computer Engineering Department
 
University of Patras
 

Demo: Speaker Verification System Based on Probabilistic Neural Networks

     A simplified block diagram of WCL-1, the speaker verification system based on Probabilistic Neural Networks (PNNs), is presented in Figure 1. The upper part of the figure summarizes the training process: building the reference model, referred to as the Universal Background Codebook (UBgCB), and constructing the individual codebooks for the target speakers. A personal PNN is created for each target user from the reference codebook and the codebook built for that user. The lower part of the figure illustrates the operational mode of the WCL-1 system and shows the processing steps that WCL-1 performs for each test trial in order to reach a final decision. Further details are available in [1, 2].
Figure 1.  A simplified block diagram of the WCL-1 system
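The two-codebook arrangement described above can be illustrated with a small sketch. This is not the WCL-1 implementation (see [1, 2] for that); it is a generic two-class PNN with Gaussian kernels and equal class priors, and the function names, the smoothing parameter sigma, the decision threshold, and the toy two-dimensional codebooks are all assumptions made for illustration.

```python
import math

def parzen_density(x, codebook, sigma):
    """Mean Gaussian kernel response of frame x over all codebook vectors."""
    total = 0.0
    for c in codebook:
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / len(codebook)

def pnn_posterior(x, user_codebook, background_codebook, sigma=1.0):
    """Two-class PNN output for one frame: probability of the target user.

    Equal priors are assumed; sigma is a hypothetical smoothing parameter.
    """
    f_user = parzen_density(x, user_codebook, sigma)
    f_bg = parzen_density(x, background_codebook, sigma)
    denom = f_user + f_bg
    # If both densities underflow to zero, fall back to an undecided 0.5.
    return 0.5 if denom == 0.0 else f_user / denom

def verify(frames, user_codebook, background_codebook, sigma=1.0, threshold=0.5):
    """Average the per-frame posteriors of a trial and compare to a threshold."""
    score = sum(
        pnn_posterior(x, user_codebook, background_codebook, sigma) for x in frames
    ) / len(frames)
    return score, score > threshold

# Toy data (hypothetical): 2-D "feature vectors" instead of real speech features.
user_cb = [(0.0, 0.0), (0.2, 0.1)]          # codebook of the target user
background_cb = [(3.0, 3.0), (2.8, 3.2)]    # stands in for the UBgCB

genuine_score, genuine_ok = verify([(0.1, 0.0), (0.0, 0.2)], user_cb, background_cb)
impostor_score, impostor_ok = verify([(3.0, 3.1), (2.9, 3.0)], user_cb, background_cb)
print(genuine_ok, impostor_ok)
```

In this sketch the background codebook plays the role of the competing class, so a trial is accepted only when its frames lie closer to the user's codebook than to the reference model.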
     The performance of WCL-1 was compared with that of other state-of-the-art speaker verification systems during several international speaker recognition evaluation campaigns. Specifically, the WCL-1 system successfully participated in the 2002, 2003, and 2004 NIST Speaker Recognition Evaluations and in the 2003 NFI/TNO Forensic Speaker Recognition Evaluation.
     The 2003 NFI/TNO Forensic Speaker Recognition Evaluation campaign [3, 4, 5] provided a common setup for evaluating a number of systems from fifteen participating organizations: universities, commercial companies, and research institutes. Since some participants submitted results for multiple systems, the total was about thirty systems. One participating organization failed to provide results within the framework of the evaluation, and the results from another did not meet the requirements. In addition, one of the systems represented the internal methodology of the National Forensic Institute of The Netherlands, and for this reason its results were kept by the organizers for internal use only. Each of the remaining twelve organizations had one primary system.
     In brief, the 2003 NFI/TNO Forensic Speaker Recognition Evaluation aimed to assess the practical value of current state-of-the-art technology in a forensic context. Real-world wiretapped recordings of real criminal suspects, collected during real police investigations, were used in order to provide conditions as close as possible to real forensic investigations. A comprehensive description of the evaluation results is presented in [4], and a comparison between the NFI-TNO Forensic Speaker Recognition Evaluation and the NIST Speaker Recognition Evaluations is available in [5].
     Figure 2 presents the official evaluation results and the ranking of the systems. (The normalized decision cost for some systems exceeds unity and is limited to one in the graph.) The table below the figure presents the values of the normalized decision cost for each system. As seen in the figure, the leftmost system is the best, since it has the lowest actual decision cost (the right bar in each pair of bars), and the systems to its right perform worse, since they have higher decision costs. For some systems the actual decision cost and the optimal decision cost are quite different. The large red ellipse in Figure 2 indicates the results for the WCL-1 system. As the plot shows, the WCL-1 system took second position in the final ranking.
Figure 2. Actual and optimal (minimal) decision costs for the twelve primary systems [4, 5]
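The difference between the actual and the optimal (minimal) decision cost can be made concrete with a short sketch. It uses the detection cost function in the form known from the NIST evaluations, normalized by the cost of the better trivial system (always accept or always reject). The cost parameters below (C_miss = 10, C_fa = 1, P_target = 0.01) are the customary NIST-style values and are an assumption here; the exact settings of the 2003 NFI/TNO evaluation are given in [3, 4]. The scores, the threshold, and all function names are hypothetical.

```python
def detection_cost(p_miss, p_fa, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Detection cost for given miss and false-alarm rates (assumed parameters)."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)

def normalized_cost(p_miss, p_fa, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Cost divided by the cost of the better trivial always-accept/reject system."""
    default = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return detection_cost(p_miss, p_fa, c_miss, c_fa, p_target) / default

def error_rates(target_scores, impostor_scores, threshold):
    """Miss rate and false-alarm rate when accepting scores >= threshold."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return p_miss, p_fa

def actual_cost(target_scores, impostor_scores, threshold):
    """Normalized cost at the threshold the system actually used."""
    return normalized_cost(*error_rates(target_scores, impostor_scores, threshold))

def minimal_cost(target_scores, impostor_scores):
    """Normalized cost at the best threshold chosen after the fact."""
    candidates = list(target_scores) + list(impostor_scores) + [float("inf")]
    return min(actual_cost(target_scores, impostor_scores, t) for t in candidates)

# Hypothetical trial scores and a hypothetical, poorly calibrated threshold.
targets = [0.9, 0.8, 0.4]
impostors = [0.7, 0.3, 0.2, 0.1]

c_actual = actual_cost(targets, impostors, threshold=0.5)
c_min = minimal_cost(targets, impostors)
c_plotted = min(c_actual, 1.0)  # values above unity are limited to one in the graph
print(c_actual, c_min, c_plotted)
```

A large gap between the two values, as in this toy example, indicates that the scores separate the classes reasonably well but the decision threshold was set poorly; a system whose actual cost is close to its minimal cost is well calibrated.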

IF YOU ARE A NEW USER, PLEASE NAVIGATE HERE TO TRAIN THE VERIFICATION SYSTEM WITH YOUR VOICE!
(Java applet in action)

For more information please contact: Nikos Fakotakis or Todor Ganchev
Web based demo created by: Charalampos Tsimpouris

References
  1. Ganchev, T., Fakotakis, N., Kokkinakis, G. (2002). “Speaker Verification System Based on Probabilistic Neural Networks”, 2002 NIST Speaker Recognition Evaluation Workshop, May 19-22, 2002, Vienna, Virginia, USA. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.551
  2. Ganchev, T., Potamitis, I., Fakotakis, N., Kokkinakis, G. (2004a). “Text-Independent Speaker Verification for Real Fast-Varying Noisy Environments”, International Journal of Speech Technology, Kluwer Academic Publishers, Vol. 7, No. 4, October 2004, pp. 281–292. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.63.4479
  3. van Leeuwen, D., Bouten, J. (2003). “The NFI/TNO Forensic Speaker Recognition Evaluation Plan”, Revision 2.0. Available: http://speech.tm.tno.nl/aso/evalplan.pdf
  4. van Leeuwen, D., Bouten, J. (2004). “Results of the 2003 NFI-TNO Forensic Speaker Recognition Evaluation”, Proc. Odyssey 2004 Speaker and Language Recognition Workshop, ISCA, pp. 75–82. Available: http://lands.let.ru.nl/literature/leeuwen.2004.1.pdf
  5. van Leeuwen, D., Martin, A.F., Przybocki, M.A., Bouten, J.S. (2006). “NIST and NFI-TNO Evaluations of Automatic Speaker Recognition”, Computer Speech and Language, Vol. 20, Issues 2-3, April-July 2006, pp. 128–158. Available: http://dx.doi.org/10.1016/j.csl.2005.07.001