Artificial Intelligence Group

Contact Info:

AI-Group
Wire Communications Lab.
Electrical and Computer Engineering Department
University of Patras
26 500 Rio, GREECE
Head of the AI group: Prof. Nikos Fakotakis

Tel.: +30.2610.996.496
Fax: +30.2610.997.336
E-mail: Prof. Nikos Fakotakis

Tech. Assistant: Mrs. Rania Doufexi
 
 
 

The Artificial Intelligence Group (AI group) is a part of the Wire Communications Laboratory of the Electrical and Computer Engineering Department, of the University of Patras, Greece.

The WCL / AI group is an international team of more than 15 individuals: Greek, Bulgarian, Romanian and English professionals in teaching and research positions. The research and technology development staff, who form the core of our team, hold academic degrees in electrical engineering, computer science, physics and mathematics. The research activities carried out by members of the AI group have resulted in more than 20 PhD dissertations and more than 300 scientific publications in both basic and applied research.

The WCL / AI group has more than 30 years of continuous activity in research and technology development. During this period, the WCL / AI group has participated in more than 30 national and European RTD projects. Its major research contributions are in the areas of Speech & Language Technology and Artificial Intelligence.

 

Speech Processing

Speech Enhancement

Speaker Localization and Tracking
Robust Automatic Speech Recognition
Speaker Recognition
Spoken Language and Dialect Recognition
Emotion/Affect Recognition
Text to Speech Synthesis
Sound Recognition
 
Natural Language Processing
 
Natural Language Understanding and Generation
Dialog Management and Processing
Spoken Interaction Strategies
Lexicography
Text Engineering
Information Extraction
 
Artificial Intelligence
 
Search Methods
Problem Solving
Rule Based Systems
Knowledge Representation
Logic Programming
Machine Learning
Intelligent Human-Machine Interaction
User Modeling
Automata Theory
Game Theory
Quantum AI
 

Audio and Speech Processing

The AI group has a long tradition in the fields of Automatic Speech Recognition, Text-To-Speech synthesis (TTS), and speaker verification and identification, as well as in language modeling as a means of improving speech recognition performance.
 

Members of the AI group have developed numerous speech processing components and applications, among which are the following:
  • Adaptive Framework for Real-Time Acoustic Surveillance of Potential Hazards, based on probabilistic structures (developed within the Prometheus project)
  • Speech/Music Discriminator, HMM-based, frequency-domain and wavelet-based features
  • Automatic Recognizer of Urban Environmental Sound Events, based on hierarchical structures
  • A Speech Annotation Toolbox (developed within the SpeechDat(II) and SpeechDat(Car) projects)
  • Recording Tools for Speech Database creation
  • A Modern Greek TTS system, based on the MBROLA concatenative algorithm
  • A Modern Greek TTS system, based on the Klatt formant synthesizer
  • A Modern Greek TTS system, based on unit selection, corpus-based
  • Speaker Verification and Speaker Identification systems, based on Neural Networks.
  • Automatic Speech Recognition for Greek, British English and German Languages 
  • Spoken Language Recognition System, PPRLM-based
  • Greek-Cypriot Dialect recognizer, PRLM-based
  • Automatic Speech Segmentation Tools, HMM-based
  • Speech-based Emotion Detection System, GMM-based
  • Real-life Speech-based Affect Recognition System, based on acoustic and linguistic information
  • An Environment for Building Interactive Natural Interfaces (in the framework of the GEMINI project)
  • A Dialogue System for the Automation of Call Centre Services, automating the collection of data for car insurance companies (in the framework of the European Project ACCeSS).
  • A Dialogue System for Telephone-based Services (in the framework of the European Project IDAS)
  • A Spoken Dialogue Interaction System for smart-home environment (in the framework of the INSPIRE project)
  • The Phone-Call Router for the Department of Electrical and Computer Engineering at the University of Patras
  • The Voice Portal for the University of Patras

Natural Language Processing

The AI group has developed natural language tools for Modern Greek covering a wide variety of applications.
In particular, the following tools/components are available:
  • A grapheme-to-phoneme (and vice versa) converter for Modern Greek, based on the two-level morphology model.
  • A morphological processor for Modern Greek based on the PC-KIMMO formalization, performing morphological analysis and synthesis over a lexicon of 30,000 lemmas.
  • A unification-based syntactic analyzer for Modern Greek based on the PC-PATR formalization.
  • A sentence and chunk boundaries detector for unrestricted Modern Greek text.
  • A stylistic analyzer for unrestricted Modern Greek text that categorizes texts in terms of genre and author.
  • A business letter generator for Modern Greek that takes into account stylistic aspects (in the framework of the national project DIALOGOS).
  • A semantic parser for the identification of temporal expressions in Modern Greek texts.
  • Algorithms for incremental construction of lexicons in Directed Acyclic Word Graphs (DAWG) and algorithms for fast access of these lexicons.
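
As an illustration of the last item, the following is a minimal sketch of one well-known way to build a DAWG incrementally from a sorted word list, merging equivalent suffix states as each word is added. It is not the group's own implementation, and the names Node, Dawg, insert, finish and node_count are hypothetical, chosen for this example only.

```python
import itertools


class Node:
    """One DAWG state; all names here are illustrative, not from the group's tools."""
    _ids = itertools.count()

    def __init__(self):
        self.id = next(Node._ids)
        self.final = False   # True if a word ends at this state
        self.edges = {}      # character -> child Node

    def signature(self):
        # Two states are equivalent if they agree on finality and on
        # their outgoing (character, target state) pairs.
        return (self.final,
                tuple(sorted((ch, child.id) for ch, child in self.edges.items())))


class Dawg:
    def __init__(self):
        self.root = Node()
        self.register = {}    # signature -> canonical state
        self.unchecked = []   # (parent, char, child) along the last inserted word
        self.previous = ""

    def insert(self, word):
        # Assumption of this sketch: words arrive in sorted order.
        if word < self.previous:
            raise ValueError("words must be inserted in sorted order")
        common = 0
        for a, b in zip(word, self.previous):
            if a != b:
                break
            common += 1
        # States below the shared prefix can no longer change: minimize them.
        self._minimize(common)
        node = self.unchecked[-1][2] if self.unchecked else self.root
        for ch in word[common:]:
            child = Node()
            node.edges[ch] = child
            self.unchecked.append((node, ch, child))
            node = child
        node.final = True
        self.previous = word

    def finish(self):
        # Minimize the path of the last word once all words are inserted.
        self._minimize(0)

    def _minimize(self, down_to):
        while len(self.unchecked) > down_to:
            parent, ch, child = self.unchecked.pop()
            key = child.signature()
            if key in self.register:
                parent.edges[ch] = self.register[key]   # reuse equivalent state
            else:
                self.register[key] = child

    def __contains__(self, word):
        # Lookup follows one edge per character.
        node = self.root
        for ch in word:
            node = node.edges.get(ch)
            if node is None:
                return False
        return node.final

    def node_count(self):
        seen, stack = set(), [self.root]
        while stack:
            node = stack.pop()
            if node.id not in seen:
                seen.add(node.id)
                stack.extend(node.edges.values())
        return len(seen)


lexicon = Dawg()
for w in ["tap", "taps", "top", "tops"]:
    lexicon.insert(w)
lexicon.finish()
```

For this four-word example the resulting graph has 5 states, whereas a plain trie over the same words would need 8, because the shared endings "p" and "ps" are stored only once.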

 

Speech and Language Resources
The AI group created (either on its own or in cooperation with other partners) a number of speech and language resources, among which are the following:

  • SpeechDat(II)-FDB-5000-Greek – a speech recognition database with 5000 speakers (within the SpeechDat(II) project)
  • SpeechDat(Car)-Greek – a speech recognition database (within the SpeechDat(Car) project)
  • PolyCost Speaker Recognition database (within the COST 250 project)
  • Orientel Cypriot Greek Speech database (within the Orientel project)
  • MoveOn Motorcycle speech and noise database for police information support systems (within the MoveOn project)
  • Prosodic database for text-to-speech synthesis for the Greek language
  • Acted emotional speech database for the Greek language
  • Greek speech database for corpus-based text-to-speech synthesis
  • Real-world Affective Speech corpus (smart-home domain)
  • PlayMancer Multimodal Affective corpus – video, speech, bio-signals, (serious game domain), (within the PlayMancer project)
  • Prometheus database – A Multimodal Database of Heterogeneous Sensors for Human Behavior Analysis and Interpretation – microphone arrays, video cameras, infrared cameras, 3D cameras, IR movement detection sensors, (within the Prometheus project)
  • Various text corpora (with overall size over 50 Mwords)
  • ESPRIT 860: Greek newspaper corpus with grammatical analysis of words
  • ORTHO: Greek monolingual lexicon, compiled from several printed dictionaries
  • COLLINS: Corpus and dictionary
  • ONOMASTICA: Lexicon of Greek proper names
  • IDAS: Surnames in phonetic transcription
  • POLYGLOT: Speech samples, annotated
  • LIP READING: 157 AVI files with lip moves during word pronunciation
  • Korais lexicon, with over 80,000 lemmas

Artificial Intelligence

  • Morphological analysers
  • Syntactic parsers
  • Lemmatizers (also language independent ones)
  • Grapheme-to-phoneme and phoneme-to-grapheme converters
  • A generic platform for semi-automatic generation of multilingual and multimodal interfaces

Past Research Activities


Optical Character Recognition

The AI group has developed tools for the preprocessing of document images and words as well as systems for character recognition. In more detail, the following tools are available:

  • A skew estimation system for printed and handwritten documents.
  • A shift correction system for printed and handwritten words.
  • A handwritten character recognition system for Modern Greek.

 
Authorship Recognition from text documents

The AI group has developed tools for authorship identification from text documents.

 
 
 
 

 

Courses

The following courses are given by the members of the AI group:


 

Artificial Intelligence

Part 1: Search.

Problem Solving: Problem and Problem Spaces, Defining the Problem's Formal Description.

Problem Characteristics: Production Systems. Production Systems Characteristics. Search Methods: Kinds of Search Methods. Searching for a Path. Blind Search. Heuristics. Searching for the optimal path. Game playing.

Part 2: Knowledge Representation.

Logic: Propositional Calculus. Predicate Calculus. Resolution. Structured Representations: Declarative Representations. Semantic Nets. Conceptual Dependency. Frames. Scripts.
Procedural Representations. Statistical Reasoning: Probabilistic reasoning. Fuzzy logic.

Part 3: Introduction to PROLOG.

Introduction. Prolog rules. Matching. Recursion. Cutting. List processing. Built-in functions.
Assert-Retract. Defining operators. Reserved streams. DCG rules.


Lecturer: Prof. N.Fakotakis


 

Natural Language Technology

Part 1: Introduction to linguistics.

Phonetics, Phonology, Morphology, Syntax, Transformational grammar, Semantics, Pragmatics.

Lecturer: Assoc.Prof. K. Sgarbas

Part 2: Syntactic Analysis of Natural and Artificial Languages.

Statistical study of natural languages, Formal languages and grammars, Syntactic analysis, Methods for constructing grammars, Statistical grammars.

Lecturer: Assoc.Prof. K. Sgarbas

Part 3: Semantic-Pragmatic Processing.

Semantic processing and logical form, Semantic interpretation, Reference resolution, Use of world-knowledge for acts.

Lecturer: Assoc.Prof. K. Sgarbas


 

Speech Technology

Introduction, Speech Production, Hearing, Speech Perception. Speech Sounds and Features. Speech Signal Analysis, Preprocessing.
Feature Extraction: Filter-Bank Analysis, Linear Predictive Coding. Vector Quantization.
Pattern Recognition: Distortion Measures, Dynamic Time-Warping. Hidden Markov models (HMM), Artificial Neural Networks (ANN). Speech Recognition Systems. Speaker Recognition Systems. Speech coding: Time Domain Speech Coding, Coding Techniques using Speech Spectrum (Frequency Domain). Coding Techniques using Analysis-Synthesis (Frequency Domain), Coding Techniques using Linear Prediction.

 

Laboratory Exercises:

Lecturer: Prof. N.Fakotakis, Prof. V. Dermatas


 

Digital Logic Design

Introduction. Single-bit memory elements: T flip-flops, SR flip-flops, JK flip-flops, D flip-flops, latching action of a flip-flop. Counters: series and parallel connection of counters, synchronous up/down counters, decade binary up/down counters, asynchronous binary counters, asynchronous resettable counters, integrated-circuit counters. Shift register counters and generators: shift register with parallel loading, shift registers as counters, the universal state diagram for shift registers, the design of a decade counter, shift register sequence generators, the ring counter.
Clock-driven sequential circuits: analysis of a clocked sequential circuit, the design procedure for clocked sequential circuits, the design of a sequence generator, Moore and Mealy state machines, pulsed synchronous circuits, state reduction, state assignment. Event-driven circuits: races and cycles, race-free assignment for a three-state machine, race-free assignment for a four-state machine, a sequence detector.
Hazards: gate delays, the generation of spikes, the production of static hazards in combinational networks, the elimination of static hazards, design of hazard-free combinational networks, detection of hazards in an existing network, dynamic hazards.

Lecturer: Prof. N.Fakotakis

RTD Projects

The AIG has participated (or is currently participating) as a partner or coordinator in more than 30 RTD projects. A short list of selected projects is as follows:

 

• ESPRIT-860: "Linguistic Analysis of the European Languages",

• POLYGLOT (ESPRIT II-2104): "A Multilanguage Speech-to-Text and Text-to-Speech System",

• TRANSLEARN (LRE, 61-016): "Interactive Corpus-based Translation Drafting Tool",

• GRAMCHECK (MLAP 11): "A Grammatical and Style Checker",

• TRANSLIB (LIB-3038): "Advanced Tools for Accessing Multilingual Library Catalogues",

• ACCeSS (LE-1 1802): "Automated Call Center through Speech Understanding System",

• VASME (TRANSPORT PL-00010): "Value Added Services for Maritime Environment",

• SPEECHDAT (LE-2 4001): "Speech Databases for the Creation of Voice Driven Teleservices",

• IDAS (LE-38315): "Interactive, Telephone-based, Directory Assistance Services",

• SPEECHDAT-CAR (LE-8334): "Speech Data Basis for Voice Driven Teleservices and Control in Automotive Environments",

• E2M (IST-2000-30167): "From e-services to mobile services",

• COST 278: "Spoken Language Interaction in Telecommunication",

• OrienTel (IST-2000-28373): "Multilingual Access to Interactive Communication Services for the Mediterranean and Middle East",

• GEMINI (IST-2001-32343): "Generic Environment for Multilingual Interactive Natural Interfaces",

• INSPIRE (IST-2001-32746): "Infotainment Management with speech Interaction via Remote-Microphones and Telephone Interfaces",

• MoveOn (IST-2005-034753): "Multi-modal and multi-sensor zero-distraction interaction interface for two wheel vehicles ON the move",

• LOGOS (EHΓ-102): "A general architecture for speech recognition and (user friendly) dialogue interaction for advanced commercial applications (LOGOS)",

• PlayMancer (FP7-ICT-215839-2007): "PlayMancer: A European Serious Gaming 3D Environment",

• Prometheus (FP7-ICT-214901-2007): "Prediction and interpretation of human behaviour based on probabilistic structures and heterogeneous sensors (Prometheus)".

 

 

 

 

Tools and Resources

The activities of the Artificial Intelligence Group include the development of real-world tools, databases and applications. This page contains links to these resources.

 

Tools

Tool: deGREEKLISH
Description: A Greeklish to Greek Converter
Link: http://tools.wcl.ece.upatras.gr/degreeklish
Contact: Ilias Kotinas
     

Speech resources

The following speech databases are available:

SpeechDat(II)

A Database for the creation of voice driven teleservices.

SpeechDat-Car

A speech database recorded in vehicles

Socrates Emotional Speech

An emotional speech database recorded in a smart home environment.

Language Resources

You can find an online presentation of the language resources here: Language resources of WCL


AIG People

AI members' online presentations

 

Faculty

 

 

Post-Doc

 

 

PhD Students

Links

Software Tools for NLP