International Journal of Advanced Research in Computer and Communication Engineering
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
Volume 15, Issue 4, April 2026

Spatially-Gated CNN-Transformer Hybrids for Pneumonia Classification: A Unified Framework for Metric-Optimized Local-Global Explainability

Ch Mydhili*, B Madhav Rao

Abstract: Deep learning is transforming image classification, particularly in medical settings, where accurate classification helps determine a patient's treatment protocol. Convolutional neural networks (CNNs) have proven very successful at learning local spatial features such as textures and edges, whereas Transformers such as the Swin Transformer excel at capturing global contextual dependencies through self-attention. Each approach has shortcomings when used alone: CNNs fail to adequately capture long-range dependencies, while Transformer models require large amounts of data and substantial computation. Hybrid CNN-Transformer architectures address these limitations by combining the feature-learning abilities of both, but they remain black-box models whose predictions are hard to explain and therefore hard to trust, especially in medicine, where accurate classification is required. This work addresses the problem by introducing a Spatially-Gated CNN-Transformer Hybrid with Dual-Level Explainability. The model consists of a ResNet-50 CNN backbone that learns local features and a Swin Transformer that models global context through a spatial attention gating mechanism. In addition, the work introduces a dual-level explainability method that combines Grad-CAM and Attention Rollout.
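
For readers unfamiliar with Attention Rollout, one of the two explanation methods named in the abstract, the following is a minimal sketch of the generic technique in plain Python. It is not the authors' implementation. It assumes each layer's attention has already been averaged over heads into a square, row-normalised matrix over a fixed set of tokens. That assumption does not hold directly for the Swin Transformer's windowed attention, and the abstract does not describe how the authors adapt the method. The function names `matmul` and `attention_rollout` are illustrative.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def attention_rollout(layers):
    """Combine per-layer attention maps into one token-to-token relevance matrix.

    `layers` is a list of square matrices ordered from the first layer to the
    last. Each matrix is assumed (not taken from the paper) to be head-averaged
    attention whose rows sum to 1.
    """
    n = len(layers[0])
    # Start from the identity: before any layer, each token attends to itself.
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Mix in the identity to model the residual (skip) connection.
        mixed = [
            [0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        # Renormalise each row so it remains a probability distribution.
        mixed = [[value / sum(row) for value in row] for row in mixed]
        # Propagate relevance from this layer back through the earlier ones.
        rollout = matmul(mixed, rollout)
    return rollout
```

Row `i` of the result estimates how much each input token contributes to token `i` at the final layer. For image models, such a row is typically reshaped to the patch grid and upsampled to form a heat map, which is the kind of global view that could be set beside a Grad-CAM map from the CNN branch.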

Keywords: CNN–Transformer Hybrid Network, Dual-Level Explainability, Explanation Fusion, Quality Evaluation Metrics.

How to Cite:

[1] Ch Mydhili*, B Madhav Rao, “Spatially-Gated CNN-Transformer Hybrids for Pneumonia Classification: A Unified Framework for Metric-Optimized Local-Global Explainability,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2026.154282

This work is licensed under a Creative Commons Attribution 4.0 International License.