Abstract
This paper presents a novel approach to multimodal data fusion based on the Vector-Quantized Variational Autoencoder (VQVAE) architecture. The proposed method is simple yet effective, achieving excellent reconstruction performance on paired MNIST-SVHN data and on WiFi spectrogram data. The multimodal VQVAE model is further extended to a 5G communication scenario, where an end-to-end Channel State Information (CSI) feedback system compresses the data transmitted between the base station (eNodeB) and the User Equipment (UE) without significant loss of performance. The proposed model learns a discriminative, compressed feature space for various types of input data (CSI, spectrograms, natural images, etc.), making it a suitable solution for applications with limited computational resources.
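The compressed feature space described above comes from the vector-quantization bottleneck of a VQVAE, which snaps continuous encoder outputs to the nearest entry of a learned codebook. The sketch below is a minimal, illustrative PyTorch implementation of such a quantizer; the class name, codebook size, and hyperparameters are assumptions chosen for demonstration and do not reflect the paper's actual code.

```python
# Minimal sketch of a VQ-VAE vector-quantization bottleneck (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Maps continuous encoder outputs to the nearest entry of a learned codebook."""
    def __init__(self, num_codes: int = 512, code_dim: int = 64, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # weight of the commitment term

    def forward(self, z_e: torch.Tensor):
        # z_e: (batch, ..., code_dim) continuous latents from the encoder
        flat = z_e.reshape(-1, z_e.shape[-1])
        # Squared Euclidean distance to every codebook vector
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2 * flat @ self.codebook.weight.t()
                + self.codebook.weight.pow(2).sum(1))
        indices = dist.argmin(dim=1)                      # discrete code indices
        z_q = self.codebook(indices).reshape(z_e.shape)   # quantized latents
        # Codebook and commitment losses; straight-through estimator for the decoder path
        vq_loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()
        return z_q, indices.reshape(z_e.shape[:-1]), vq_loss

# Example: quantizing a batch of hypothetical encoder features
# (e.g. features of a CSI matrix or a spectrogram patch).
if __name__ == "__main__":
    vq = VectorQuantizer()
    z_e = torch.randn(8, 16, 64)
    z_q, codes, vq_loss = vq(z_e)
    print(z_q.shape, codes.shape, vq_loss.item())
```

The straight-through trick in the last line of `forward` copies gradients from the quantized latents back to the encoder, which is what lets an end-to-end feedback pipeline be trained despite the non-differentiable nearest-neighbour lookup.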
| Original language | English |
| --- | --- |
| Number of pages | 11 |
| Journal | IEEE Transactions on Cognitive Communications and Networking |
| Early online date | 10 Oct 2023 |
| DOIs | |
| Publication status | E-pub ahead of print - 10 Oct 2023 |
Bibliographical note
Publisher Copyright: IEEE
Funding Information: This work was performed as part of the OPERA Project, funded by the UK Engineering and Physical Sciences Research Council (EPSRC), Grant EP/R018677/1. It was also funded in part by the Next-Generation Converged Digital Infrastructure (NG-CDI) Project, supported by BT and the Engineering and Physical Sciences Research Council (EPSRC), Grant ref. EP/R004935/1.
Keywords
- cs.LG
- cs.AI
- eess.SP