Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS Instance Segmentation

Research output: Chapter in Book/Report/Conference proceeding > Conference Contribution (Conference Proceeding)


This paper presents a deep learning framework for medical video segmentation. Convolutional neural network (CNN) and transformer-based methods have achieved great milestones in medical image segmentation tasks due to their incredible semantic feature encoding and global information comprehension abilities. However, most existing approaches ignore a salient aspect of medical video data: the temporal dimension. Our proposed framework explicitly extracts features from neighbouring frames across the temporal dimension and incorporates them with a temporal feature blender, which then tokenises the high-level spatio-temporal feature to form a strong global feature encoded via a Swin Transformer. The final segmentation results are produced via a UNet-like encoder-decoder architecture. Our model outperforms other approaches by a significant margin and improves the segmentation benchmarks on the VFSS2022 dataset, achieving Dice coefficients of 0.8986 and 0.8186 for the two datasets tested. Our studies also show the efficacy of the temporal feature blending scheme and cross-dataset transferability of learned capabilities. Code and models are fully available at
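
For readers unfamiliar with the metric quoted in the abstract, the Dice coefficient measures the overlap between a predicted mask P and a ground-truth mask T as 2|P ∩ T| / (|P| + |T|), ranging from 0 (no overlap) to 1 (identical masks). The following is a minimal illustrative sketch, not the authors' evaluation code: the function name `dice_coefficient`, the flat 0/1 list representation, and the handling of two empty masks are assumptions made for the example.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for binary masks.

    Both masks are flat sequences of 0/1 pixel labels (an assumed
    representation chosen to keep the example dependency-free).
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")

    # A pixel is in the intersection only when both masks mark it as 1.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(f"Dice: {dice_coefficient(predicted, reference):.4f}")
```

In practice the same computation would typically be applied per frame or per volume on array data and averaged over the test set; how the reported figures were aggregated is not stated in this record.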
Original language: English
Title of host publication: 2023 IEEE International Conference on Image Processing (ICIP)
Number of pages: 5
ISBN (Electronic): 9781728198354
Publication status: Published - 11 Sept 2023
Event: 2023 IEEE International Conference on Image Processing - Kuala Lumpur Convention Centre, Kuala Lumpur, Malaysia
Duration: 8 Oct 2023 to 11 Oct 2023


Conference: 2023 IEEE International Conference on Image Processing
Abbreviated title: ICIP 2023
City: Kuala Lumpur


  • cs.CV
  • cs.AI

