International Journal of Advanced Research in Computer and Communication Engineering
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
IJARCCE adheres to the parameters suggested by the University Grants Commission (UGC) for peer-reviewed journals, upholding high standards of research quality, ethical publishing, and academic excellence.
Volume 15, Issue 4, April 2026

A Deterministic Multi-Metric Framework for Automated Image Dataset Validation in Computer Vision

M. Balavignesh, Dr. C. Karpagavalli, Dr. M. Kaliappan

Abstract: The performance of deep learning models in computer vision is fundamentally limited by the quality of their training data, yet augmented datasets frequently contain corrupted samples affected by extreme blur, noise, or lighting anomalies. This paper presents the AI-Based Image Dataset Quality Validator, a high-precision, data-centric framework designed for automated dataset sanitization. The system employs a deterministic multi-metric validation pipeline that combines Laplacian variance for sharpness auditing with ITU-R BT.601 luma weighting for exposure control, enabling fine-grained defect identification that traditional global-threshold filters miss. A core innovation of the architecture is the Parallel Structural Label Synchronization module, which guarantees a strict 1:1 correspondence between images and their annotations, stored in either TXT or CSV format, and automatically eliminates orphan labels during export. To handle large-scale batches on standard hardware, the system implements Active Memory Recovery through controlled garbage collection. Experimental evaluation on a 500-image benchmark demonstrates 96.8% rejection accuracy with an average processing time of 42.5 ms per image. The proposed framework reduces manual data-cleaning effort by an estimated 98%, delivering a scalable, Green AI solution for high-integrity computer vision pipelines.
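The checks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 4-neighbour Laplacian kernel, the use of mean luma as the exposure measure, and all threshold values are assumptions chosen for illustration. Only the BT.601 luma weights (0.299, 0.587, 0.114) are standard. Images are plain nested lists so the example needs nothing beyond the Python standard library.

```python
from pathlib import PurePath
from statistics import pvariance

# ITU-R BT.601 luma weights for the R, G, B channels.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def luma(rgb_rows):
    """Convert rows of (R, G, B) tuples into rows of BT.601 luma values."""
    wr, wg, wb = LUMA_WEIGHTS
    return [[wr * r + wg * g + wb * b for (r, g, b) in row] for row in rgb_rows]


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Requires an image of at least 3x3 pixels. Low values suggest blur.
    """
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def mean_brightness(gray):
    """Average luma over all pixels."""
    pixels = [value for row in gray for value in row]
    return sum(pixels) / len(pixels)


def validate(rgb_rows, min_sharpness=100.0, min_luma=40.0, max_luma=220.0):
    """Return a list of defect names; an empty list means the image passes.

    The three thresholds are hypothetical, not values from the paper.
    """
    gray = luma(rgb_rows)
    defects = []
    if laplacian_variance(gray) < min_sharpness:
        defects.append("blur")
    brightness = mean_brightness(gray)
    if brightness < min_luma:
        defects.append("underexposed")
    elif brightness > max_luma:
        defects.append("overexposed")
    return defects


def sync_labels(image_names, label_names):
    """Pair images and labels by file stem and report orphan labels."""
    images = {PurePath(name).stem: name for name in image_names}
    labels = {PurePath(name).stem: name for name in label_names}
    paired = sorted((images[s], labels[s]) for s in images.keys() & labels.keys())
    orphans = sorted(labels[s] for s in labels.keys() - images.keys())
    return paired, orphans
```

Because every check is a fixed arithmetic rule with a fixed threshold, the same image always yields the same verdict, which is the sense in which such a pipeline is deterministic. Matching by file stem is one simple way to keep a 1:1 image-to-label correspondence; CSV annotations that hold many rows per file would need a different key.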

Keywords: Image Dataset Validation, Computer Vision, Laplacian Variance, Label Synchronization, Data-Centric AI, Image Quality Assessment (IQA), YOLO Framework, Green AI, Automated Data Sanitization, TXT/CSV Annotation Management.

How to Cite:

[1] M. Balavignesh, Dr. C. Karpagavalli, Dr. M. Kaliappan, “A Deterministic Multi-Metric Framework for Automated Image Dataset Validation in Computer Vision,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2026.154161

This work is licensed under a Creative Commons Attribution 4.0 International License.