The paper by Prof. Prasanna, 'Processing of Reverberant Speech for Time-Delay Estimation', provides a suitable basis for a low-cost video conferencing solution. Time-delay estimation from microphone outputs is the first step in many sound localization algorithms. The paper proposes a robust method for estimating the time delay between the signals of two identical microphones. The delay is computed from the Hilbert envelope of the linear prediction (LP) residual of the speech signal from each microphone, and it can be estimated even from short segments of speech (50–100 ms). The method is reported to perform significantly better than the commonly used generalized cross-correlation (GCC) approach.
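The processing chain described above (LP residual, then Hilbert envelope, then a search for the lag that best aligns the two envelopes) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the paper's exact procedure or the Matlab implementation: the function names, the LP order of 10, the mean removal before correlation, and the simple O(N^2) DFT are all assumptions made to keep the example self-contained.

```python
import cmath
import math


def lp_residual(x, order=10):
    """Prediction error of an autocorrelation-method linear predictor."""
    n = len(x)
    r = [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order  # error-filter coefficients, a[0] = 1
    err = r[0]
    for m in range(1, order + 1):  # Levinson-Durbin recursion
        if err <= 0.0:
            break
        k = -sum(a[j] * r[m - j] for j in range(m)) / err
        prev = a[:]
        for j in range(1, m + 1):
            a[j] = prev[j] + k * prev[m - j]
        err *= 1.0 - k * k
    return [sum(a[j] * x[i - j] for j in range(min(i, order) + 1))
            for i in range(n)]


def hilbert_envelope(x):
    """Magnitude of the analytic signal, using a plain O(N^2) DFT."""
    n = len(x)
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    spec = [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]
    half = n // 2
    gain = [0.0] * n  # keep DC/Nyquist, double positive bins, drop negative bins
    gain[0] = 1.0
    if n % 2 == 0:
        gain[half] = 1.0
    for k in range(1, half if n % 2 == 0 else half + 1):
        gain[k] = 2.0
    # Inverse DFT of the one-sided spectrum gives the analytic signal.
    return [abs(sum(spec[k] * gain[k] * w[(-k * t) % n] for k in range(n))) / n
            for t in range(n)]


def estimate_delay(x1, x2, max_lag, order=10):
    """Lag in samples by which x2 trails x1 (negative if x2 leads)."""
    env1 = hilbert_envelope(lp_residual(x1, order))
    env2 = hilbert_envelope(lp_residual(x2, order))
    m1 = sum(env1) / len(env1)
    m2 = sum(env2) / len(env2)
    env1 = [v - m1 for v in env1]  # mean removal is an assumption of this sketch
    env2 = [v - m2 for v in env2]
    n = min(len(env1), len(env2))

    def xcorr(lag):
        lo, hi = max(0, -lag), min(n, n - lag)
        return sum(env1[i] * env2[i + lag] for i in range(lo, hi))

    # The lag with the largest envelope cross-correlation is the estimate.
    return max(range(-max_lag, max_lag + 1), key=xcorr)
```

A call such as `estimate_delay(mic1, mic2, max_lag)` takes two equal-length sample lists; at a hypothetical 8 kHz sampling rate, a 50 ms segment is 400 samples. Dividing the returned lag by the sampling rate gives the delay in seconds. In practice `max_lag` would be limited to the largest delay the microphone spacing physically allows.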
A microphone array is built from two identical microphones fixed a known distance apart. The microphones are connected to a computer running our Matlab implementation of the algorithm. The camera is mounted on a servo motor connected to the same computer. The microphones record the audio and feed it to the computer. The algorithm estimates the time delay and, from it, computes the angle at which the speaker is located. This angle is sent to the servo motor, which turns the camera toward the speaker.
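The step from time delay to camera angle can be illustrated with the standard far-field relation for a two-microphone array, sin(theta) = c * tau / d, where tau is the delay, d the microphone spacing, and c the speed of sound. The sketch below is in Python for illustration only. It assumes c = 343 m/s, an angle measured from the broadside direction (perpendicular to the line joining the microphones), and a hypothetical servo whose 0 to 180 degree range is centered on broadside; none of these details come from the original implementation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed value)


def delay_to_angle(delay_s, mic_spacing_m, c=SPEED_OF_SOUND):
    """Bearing in degrees from broadside for a far-field source."""
    ratio = c * delay_s / mic_spacing_m
    # Estimation noise can push |ratio| slightly past 1, so clamp it.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


def servo_command(angle_deg):
    """Map a bearing in [-90, 90] to a hypothetical 0-180 degree servo range."""
    return 90.0 + angle_deg
```

For example, with microphones 0.2 m apart, a delay of 0.1 / 343 s corresponds to a bearing of about 30 degrees, which this mapping turns into a servo position of about 120 degrees. The sign of the angle depends on which microphone is taken as the reference.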