AudioSense is a Wireless Acoustic Sensor Network (WASN) designed for real-time spatial audio recording. The main goal of this approach is an object-based audio representation, i.e., signals that represent individual sound sources. Individual sources are extracted from the sound mixture using dedicated sound source separation algorithms, such as Independent Component Analysis (ICA). The sound-object representation gives great flexibility in audio scene reconstruction across loudspeaker and headphone setups. One of the finest features of this representation is the possibility of interactive manipulation of sound objects at the receiver/user side.
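To illustrate the separation step, the following is a minimal FastICA sketch on a synthetic two-channel instantaneous mixture. The sources, mixing matrix, and all parameter values are hypothetical stand-ins chosen for the demo, not the actual AudioSense pipeline, which operates on real microphone-array signals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic, statistically independent sources (stand-ins for the
# individual sound objects the system tries to recover).
t = np.linspace(0, 1, 4000)
s1 = np.sign(np.sin(2 * np.pi * 7 * t))   # square wave
s2 = rng.uniform(-1, 1, t.size)           # uniform noise
S = np.vstack([s1, s2])

# Instantaneous mixing, a simplified model of the sensed sound field.
A = np.array([[0.6, 0.4], [0.45, 0.55]])
X = A @ S

# Whitening: decorrelate and normalize the observed mixtures.
Xc = X - X.mean(axis=1, keepdims=True)
cov = Xc @ Xc.T / Xc.shape[1]
d, E = np.linalg.eigh(cov)
Z = (E @ np.diag(d ** -0.5) @ E.T) @ Xc

# FastICA with a tanh nonlinearity, deflation scheme.
W = np.zeros((2, 2))
for i in range(2):
    w = rng.normal(size=2)
    w /= np.linalg.norm(w)
    for _ in range(200):
        g = np.tanh(Z.T @ w)              # nonlinearity g(w^T z)
        g_prime = 1.0 - g ** 2            # its derivative
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        # Deflation: remove projections on already-found components.
        w_new -= W[:i].T @ (W[:i] @ w_new)
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < 1e-8
        w = w_new
        if converged:
            break
    W[i] = w

# Estimated sources (recovered up to permutation, sign, and scale).
S_est = W @ Z
```

FastICA recovers the sources only up to permutation, sign, and scale, which is why a rendering stage downstream still has to assign the separated objects to positions in the reconstructed scene.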
The system consists of the WASN part and the sound processing part.
The WASN is a heterogeneous network with two classes of devices:
Aggregated streams are forwarded to the nearest Gateway, which serves as an interface between the WASN and the 3D Audio Processing Unit.
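The node-to-gateway hand-off described above could be sketched as follows. The `AudioFrame` and `Gateway` names, the frame fields, and the callback interface are illustrative assumptions, not the actual AudioSense API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AudioFrame:
    """One captured audio chunk from a sensor node (illustrative format)."""
    node_id: int
    seq: int          # per-node sequence number
    samples: bytes    # raw PCM payload


class Gateway:
    """Hypothetical gateway sketch: buffers per-node streams arriving from
    the WASN and forwards them, reordered by sequence number, to a sink
    standing in for the 3D Audio Processing Unit."""

    def __init__(self, sink: Callable[[AudioFrame], None]) -> None:
        self.sink = sink
        self.buffers: Dict[int, List[AudioFrame]] = {}

    def receive(self, frame: AudioFrame) -> None:
        # Frames may arrive out of order over the wireless link.
        self.buffers.setdefault(frame.node_id, []).append(frame)

    def flush(self) -> None:
        # Hand each node's stream over in sequence order.
        for node_id in sorted(self.buffers):
            for frame in sorted(self.buffers[node_id], key=lambda f: f.seq):
                self.sink(frame)
        self.buffers.clear()
```

A real gateway would run this over the network stack and stream continuously; the flush-on-demand model here only keeps the sketch small and testable.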
The 3D Audio Processing Unit performs:
The total bit rate of MPEG-H 3D Audio ranges from 256 up to 1200 kbps for 22.2-channel material. An interesting feature of MPEG-H 3D Audio is the ability to decode and render spatial audio for different loudspeaker setups and for headphones, as shown below.
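To put those figures in perspective, a 22.2-channel configuration carries 24 channels (22 full-range plus 2 LFE), so the average per-channel bit-rate budget implied by the range above can be estimated as:

```python
channels = 22 + 2           # 22 full-range + 2 LFE channels
low, high = 256, 1200       # total bit rate in kbps (from the text)

per_channel_low = low / channels    # roughly 10.7 kbps per channel
per_channel_high = high / channels  # 50.0 kbps per channel
```

This is only an average; an actual MPEG-H encoder allocates bits unevenly across channels and objects depending on signal content.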
BeagleBone Black, AudioCape and Microphone array (BBB extension).
3D Audio Processing Unit
This energy-efficient single-board computer is based on a Xilinx Zynq 7000 SoC (system on chip) supported by an Epiphany III multicore accelerator. The Epiphany chip contains 16 high-performance RISC CPU cores, each of which can operate at 1 GHz and deliver 2 GFLOPS. Despite this computing power, the board consumes at most 5 W, which makes it very convenient for a WASN system such as AudioSense.
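A quick back-of-the-envelope check of the figures quoted above:

```python
cores = 16
gflops_per_core = 2.0     # peak per-core throughput (from the text)
max_power_w = 5.0         # maximum board power draw (from the text)

peak_gflops = cores * gflops_per_core   # 32 GFLOPS aggregate peak
efficiency = peak_gflops / max_power_w  # 6.4 GFLOPS per watt
```

These are theoretical peaks; sustained throughput on real source-separation workloads will be lower, but the GFLOPS-per-watt figure is what matters for battery-powered WASN nodes.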
3D AudioSense is a research project which focuses on capturing a spatial audio scene using a distributed wireless sensor network.