Failure Complementarity of Temporal and Single-Frame YOLO Detectors under Severe Motion Blur

Zhe Feng

doi:10.26689/jera.v10i2.14374

Keywords

Motion blur
Object detection
Temporal modeling
Detection complementarity
Ensemble
YOLO

Submitted: 2026-03-04

Accepted: 2026-03-19

Published: 2026-04-03

Abstract

Object detection under high-speed motion remains challenging because severe motion blur degrades spatial appearance and limits the effectiveness of single-frame detectors. Although temporal modeling is widely explored as a way to improve performance, its behavior under extreme motion blur is not yet fully characterized. In this work, we conduct an experimental study comparing a single-frame YOLOv8n detector with a temporal-enhanced variant that incorporates multi-frame inputs and frame-difference cues within the backbone. Results on a highly blurred table tennis dataset show that, although the temporal-enhanced model and the single-frame baseline achieve similar average precision (AP50 ≈ 0.52), they exhibit markedly different failure modes. Quantitative analysis yields a Jaccard index of only 0.43 between the detection outcomes of the two models, indicating pronounced complementarity. Exploiting this behavioral divergence through a simple ensemble strategy raises AP50 from 0.52 to 0.73. These findings suggest that, under extreme blur, temporal modeling can induce complementary detection behavior in addition to any gain in individual detector accuracy, offering an alternative perspective for designing robust detection systems in highly degraded visual environments.
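The abstract does not state how the Jaccard index or the ensemble is computed, so the following Python sketch is only one plausible reading. It assumes the index is taken over the sets of ground-truth objects each detector finds, and that the "simple ensemble" pools the boxes of both models and removes duplicates with greedy non-maximum suppression. The names `jaccard`, `iou`, and `union_ensemble`, the 0.5 IoU threshold, and all toy values are illustrative assumptions, not the author's implementation or data.

```python
from typing import List, Set, Tuple

# A detection is (x1, y1, x2, y2, score). This layout is an assumption.
Box = Tuple[float, float, float, float, float]


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; returns 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def iou(p: Box, q: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(p[0], q[0]), max(p[1], q[1])
    ix2, iy2 = min(p[2], q[2]), min(p[3], q[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (p[2] - p[0]) * (p[3] - p[1])
    area_q = (q[2] - q[0]) * (q[3] - q[1])
    union = area_p + area_q - inter
    return inter / union if union > 0 else 0.0


def union_ensemble(dets_a: List[Box], dets_b: List[Box],
                   iou_thr: float = 0.5) -> List[Box]:
    """Pool both detectors' boxes, then keep the highest-scoring box
    among any group that overlaps by at least iou_thr (greedy NMS)."""
    pooled = sorted(dets_a + dets_b, key=lambda box: box[4], reverse=True)
    kept: List[Box] = []
    for box in pooled:
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


# Toy data (hypothetical): IDs of ground-truth objects each model detected.
hits_single = {0, 1, 2, 3, 4, 5}
hits_temporal = {3, 4, 5, 6, 7, 8, 9}
overlap = jaccard(hits_single, hits_temporal)  # 3 shared of 10 total

# Toy detections on one frame (hypothetical).
single_frame_dets = [(0.0, 0.0, 10.0, 10.0, 0.9)]
temporal_dets = [(1.0, 1.0, 11.0, 11.0, 0.8), (50.0, 50.0, 60.0, 60.0, 0.7)]
merged = union_ensemble(single_frame_dets, temporal_dets)
```

Under this reading, a low Jaccard index means the two models succeed on largely different objects, so the pooled detections cover more ground truth than either model alone. That is consistent with the reported rise in AP50, although the actual fusion rule used in the paper may differ.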
