
Corresponding Author

Suryanti Awang

Authors ORCID

Mohammed Ahmed Talab: https://orcid.org/0000-0002-7486-1669

Sumia Abdulhussien Razooqi AL-Obaidi: https://orcid.org/0000-0002-2863-6479

Othman I. Hammadi: https://orcid.org/0000-0002-3254-9490

Suryanti Awang: https://orcid.org/0000-0002-5468-1150

Nur Syafiqah Mohd Nafis: https://orcid.org/0000-0001-5833-9371

Hasan Kahtan: https://orcid.org/0000-0001-6521-7081

Document Type

Article

Keywords

Deep learning, Vision transformers, Chest X-ray (CXR), Convolutional neural network (CNN)

Abstract

Chest X-rays (CXRs) are widely used for diagnosing thoracic diseases because of their cost-effectiveness and accessibility. However, subtle disease signs, anatomical overlap, and a shortage of expert radiologists in certain regions make CXRs difficult to read, creating a need for accurate automated diagnostic systems. Existing deep learning methods fall short because they require pixel-level annotations to train models well. To address this problem, this study proposes Vision Transformers (ViT) with transfer learning for multiclass classification and weakly supervised localization of thoracic diseases in chest X-ray images. The technique combines patching, positional embedding, and transformer encoding; a Multi-Layer Perceptron (MLP) head and score-weighted class activation mapping (Score-CAM) are then applied to make the prediction process efficient. This approach improves diagnostic accuracy and makes small lesions easier to localize, as demonstrated by experiments on the ChestX-ray14 dataset. The results show reliable accuracies: 0.83 for cardiomegaly, 0.85 for edema, 0.80 for consolidation, 0.87 for pleural effusion, 0.75 for atelectasis, and 0.89 for pneumonia. This study shows how Vision Transformers could help physicians detect thoracic diseases earlier and make better decisions.
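The abstract's ViT front end (patching, linear projection, and positional embedding, feeding a transformer encoder and MLP head) can be sketched as follows. This is an illustrative NumPy sketch only; the patch size, embedding dimension, and input resolution are assumptions, not the authors' reported configuration.

```python
import numpy as np

def patchify(image, patch_size):
    """Split a square (H, W) image into flattened non-overlapping patches."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    patches = []
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patches.append(image[i:i + patch_size, j:j + patch_size].ravel())
    return np.stack(patches)  # (num_patches, patch_size**2)

rng = np.random.default_rng(0)
image = rng.random((224, 224))          # grayscale CXR-sized input (assumed size)
patch_size, embed_dim = 16, 64          # assumed hyperparameters

patches = patchify(image, patch_size)   # (196, 256): 14x14 patches of 16x16 pixels
proj = rng.standard_normal((patches.shape[1], embed_dim)) * 0.02
tokens = patches @ proj                 # linear patch embedding -> (196, 64)

# Prepend a [CLS] token and add positional embeddings; after transformer
# encoding, the MLP head would classify from the [CLS] representation.
cls_token = np.zeros((1, embed_dim))
tokens = np.concatenate([cls_token, tokens], axis=0)  # (197, 64)
pos_embed = rng.standard_normal((tokens.shape[0], embed_dim)) * 0.02
tokens = tokens + pos_embed
print(tokens.shape)  # (197, 64)
```

In a full pipeline, the resulting token sequence would pass through stacked self-attention blocks before classification, and Score-CAM would weight activation maps by class scores to localize lesions without pixel-level labels.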
