Breast Cancer Diagnosis in Mammography Images Using Deep Convolutional Neural Network-Based Transfer and Scratch Learning Approach

Objectives: The study aims to utilize a Deep Convolutional Neural Network (Deep-CNN) model called MobileNetV2, which has low computational requirements, to accomplish binary classification of mammography images. To achieve this objective, the study investigates two methods: transfer learning and scratch learning. Methods: The proposed technique aims to classify mammography images from the Digital Database for Screening Mammography (DDSM) dataset into either malignant or benign categories using transfer learning and scratch learning methods based on Deep-CNN. Before being fed into the Deep-CNN, the images' contrast level is enhanced using the min-max contrast enhancement technique. MobileNetV2, a lightweight CNN architecture, is used


Introduction
Breast cancer (BC) is the most common cancer among the female population. Early detection of breast cancer is crucial in saving lives, since malignant tumors develop rapidly and spread throughout the body. Screening mammography is the imaging technique most recommended by radiologists to identify breast abnormalities (1). However, radiologists constantly face a significant problem in accurately diagnosing breast abnormalities due to the low occupancy of breast tissues on full mammograms and the inadequate contrast of mammography images. Following the success of Deep Learning (DL) algorithms in tackling computer vision challenges, many researchers have developed Computer-Aided Detection (CAD) systems to identify and forecast serious health issues from medical images. In recent years, many Deep Convolutional Neural Network (Deep-CNN) algorithms have been successfully applied in breast cancer detection (BCD) (2,3). Some of the recent BCD models developed with Deep-CNNs are described here.
Samriddha et al. applied the ensemble transfer learning (TL) approach to BCD using histopathology images. With an ensemble of GoogleNet, VGG11, and MobileNetV3, the authors achieved a maximum accuracy of 96.15% on 400x magnified images obtained from the BreakHis dataset (4).
Wakili et al. and Voon et al. utilized a transfer learning (TL) approach with Deep-CNN architectures for identifying breast cancer using histopathology images (5,6). Heenaye-Mamode Khan et al. employed the ResNet-50 architecture for multiclass classification of mammography images using TL with a fine-tuning approach. Their customized ResNet-50 achieved an 88% accuracy for classifying mammography images into four different classes (7). Mohamed E et al. developed a novel CNN pooling block and integrated it with U-Net, AlexNet, GoogleNet, and ResNet18 architectures for the segmentation and classification of a thermogram dataset, achieving a global accuracy of 98.3% (8).
Houssein et al. utilized ResNet-50 in combination with the Marine Predators Algorithm (MPA) to detect breast cancer from mammography images. By tuning the network hyperparameters with the MPA, the model achieved an accuracy of 98.32% on the DDSM dataset and 98.88% on the MIAS dataset (9).
Hyo-Eun Kim et al. (10) utilized a TL approach with a pretrained ResNet-34 architecture on a large-scale mammography dataset. From 2004 to 2016, the authors collected 170,230 full-field digital mammography images (4 views) from five different institutions: three in Seoul, South Korea, one in the United States, and one in the United Kingdom. The algorithm was trained in two stages, namely patch-level training (Stage 1) and image fine-tuning (Stage 2). Patch-level training was carried out to extract low- to high-level features, followed by fine-tuning. Discriminative learning methods were utilized in Stage 1, whereas semi-supervised learning methods were used in Stage 2. The performance of the AI-based algorithm was compared with radiologists using recall-based sensitivity, AUROC, and specificity. With three different validation sets, the overall AUROC was 0.959. The study showed that the performance of AI-based algorithms on large-scale data was superior to that of radiologists in cancer detection.
Karin et al. (11) used the InceptionResNet-v2 architecture to calculate a risk score for predicting the future occurrence of breast cancer. They provided attributes related to mammographic images and radiation parameters to the network in the training phase to calculate the mammographic risk score and the density score. The DNN model was able to predict which women were susceptible to breast cancer in the future, with a focus on more aggressive tumors.
Qiyuan Hu et al. (12) developed a Computer-Aided Diagnosis (CADx) system for diagnosing breast cancer using multiparametric magnetic resonance imaging (MRI). They collected 927 MR scans from 616 women and used a VGG-19 CNN model to extract features, followed by a Support Vector Machine (SVM) classifier to categorize the scans into benign or malignant lesions. The performance of the DL model was evaluated using the DeLong test, which showed that the DL model had increased diagnostic accuracy, with an AUC of 95% in breast image interpretation.
Abhishek Das et al. (13) designed a stacked ensemble model using gene expression datasets and breast histopathology images to identify breast cancer. The 1-D gene expression data was converted into 2-D deconstructed images using the empirical wavelet transform to perform two-stage classification. Three CNNs served as the base classifiers in the first stage, followed by a Multilayer Perceptron in the second stage. One of the three CNNs was trained with original images, whereas the other two were trained with deconstructed images. The proposed strategy achieved a 98.08% classification accuracy.
The transfer learning strategy proposed by Wessam M. Salama et al. (14) aimed to segment and classify mammography images into benign and malignant categories. The authors employed a modified U-Net architecture for breast segmentation in the first stage and utilized multiple CNNs, including InceptionV3, DenseNet121, ResNet50, VGG16, and MobileNetV2, for image classification in the second stage. The model was trained and validated on three datasets: CBIS-DDSM, MIAS, and DDSM. On the DDSM dataset, the combination of the U-Net model and InceptionV3 outperformed the other architectures, achieving an accuracy of 98.87%, a sensitivity of 98.98%, an AUC of 98.88%, a precision of 98.79%, and an F1 score of 97.99%, with a computational time of 1.2134 seconds.
Mohapatra et al. proposed a breast cancer detection model using the VGG16, AlexNet, and ResNet-50 architectures. They applied transfer learning with weights pretrained on the ImageNet dataset. Among these architectures, the AlexNet model outperformed the others with an accuracy of 65% (15). Bibhu et al. created a computer-aided diagnosis (CAD) system for detecting breast cancer using the Wisconsin Breast Cancer Dataset (WBCD). The system employs a combination of Principal Component Analysis (PCA) and Neural Network (NN) algorithms. According to their simulation results, the system achieved an accuracy rate of 97%, surpassing the accuracy of other existing state-of-the-art methods. This suggests that the integration of PCA and NN algorithms can be an effective strategy for developing CAD systems for breast cancer detection (16).
The BCD model proposed by Sahu et al. combined both statistical and machine learning methods, resulting in more precise and advanced models for the prediction and diagnosis of breast cancer (17). The authors also developed a BCD model and evaluated various machine learning classifiers under different operational conditions and datasets. Their findings indicate that SVM is a suitable choice for accurately identifying performance metrics, including sensitivity, accuracy, error, and specificity (18).
In addition to implementing Deep CNN models on CPUs or GPUs, researchers have also investigated the deployment of Deep CNN models on hardware platforms or in cloud environments. For instance, Shahirah Zahir et al. implemented a Deep CNN-based breast cancer model on an IoT platform (19). Similarly, Chowdhury et al. developed an Android-based cloud environment for breast cancer identification (20).
According to previous studies, the majority of existing Breast Cancer Diagnosis (BCD) models utilize a Deep-CNN based transfer learning (TL) technique, because developing a BCD model from scratch requires domain expertise in network layer design and proper hyperparameter selection. Our paper aims to develop a Deep-CNN based BCD model using both TL and scratch learning (SL) approaches. Additionally, we compare the performance of BCD models developed using the TL and SL approaches in terms of machine learning performance measures, training time, and total trainable parameters. We also compare our obtained model with some existing BCD models to demonstrate its efficiency. In Section 2, we describe the methodology used to develop the BCD model; in Section 3, we present and discuss the simulation results; and in Section 4, we provide concluding remarks, the future scope, and the implications of our study.

Methodology
The proposed model for detecting breast cancer utilizes a Deep-CNN architecture and employs both TL and SL approaches. The TL approach uses a pre-trained architecture, MobileNetV2, with three different variants of TL, namely TL without fine-tuning, TL with fine-tuning, and the fixed feature extraction approach. The suggested methodology also describes the development of a BCD model with a 7-layered CNN from scratch. Figure 1 shows the proposed algorithmic methodology for the BCD model, along with a brief explanation of the procedures involved in developing it.

Dataset
The BCD model was developed using mammography images from the publicly available Digital Database for Screening Mammography (DDSM) dataset (21). A total of 5000 mammography images, consisting of 2500 images from the benign class and 2500 images from the malignant class, were used. Out of these, 20% of the images were reserved for evaluating the trained model, while the remaining 80% were used to train the CNN architecture.
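The balanced 80/20 split described above can be sketched with scikit-learn. This is a minimal illustration, not the paper's actual code: the image list is a placeholder for the real DDSM file paths, and the random seed is an arbitrary choice.

```python
from sklearn.model_selection import train_test_split

# Placeholder labels for the balanced DDSM subset described above:
# 2500 benign (0) and 2500 malignant (1) images.
labels = [0] * 2500 + [1] * 2500
image_ids = list(range(5000))  # stand-ins for the actual image file paths

# 80% for training, 20% held out for evaluation; stratify keeps the
# benign/malignant ratio identical in both partitions.
train_ids, test_ids, y_train, y_test = train_test_split(
    image_ids, labels, test_size=0.20, stratify=labels, random_state=42)
```

Stratifying on the labels guarantees that the held-out 1000 images contain exactly 500 benign and 500 malignant cases, so evaluation metrics are not skewed by class imbalance.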

Image Preprocessing
Mammography images often suffer from poor contrast, which can limit the generalizability of a trained CNN model to new, unseen images. Therefore, it is essential to select an appropriate image contrast enhancement technique to improve image quality. In this study, we utilized the min-max contrast enhancement technique to enhance the contrast of all mammography images after resizing them to 227x227x3. The image contrast enhancement process is demonstrated in Figure 2.
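Min-max contrast enhancement linearly stretches each image's intensity range onto the full 8-bit scale. The paper does not give its exact implementation, so the following NumPy sketch assumes the standard formulation I' = (I - min) / (max - min) * 255:

```python
import numpy as np

def min_max_enhance(img: np.ndarray) -> np.ndarray:
    """Linearly stretch pixel intensities to the full 8-bit range."""
    lo, hi = img.min(), img.max()
    if hi == lo:                      # flat image: nothing to stretch
        return np.zeros_like(img, dtype=np.uint8)
    stretched = (img.astype(np.float64) - lo) / (hi - lo) * 255.0
    return stretched.astype(np.uint8)

# Low-contrast toy "image": intensities confined to [100, 150]
img = np.array([[100, 120], [135, 150]], dtype=np.uint8)
enhanced = min_max_enhance(img)       # now spans the full [0, 255] range
```

Because the stretch is applied per image, each mammogram uses the full dynamic range regardless of how narrow its original intensity band was.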

MobileNetV2
MobileNetV2, a modified version of MobileNetV1, serves as the convolutional base in all three TL variants for performing the mammography image classification task. The low-power and lightweight structure of MobileNetV2 makes the trained model suitable for deployment on smart embedded platforms with low memory and computation capabilities.
As shown in Figure 3, MobileNetV2 consists of three layers: the expansion layer, the depth-wise convolution layer, and the projection layer. The residual bottleneck connection in the MobileNet architecture reduces the number of channels available at the projection layer. This reduction in the number of channels at the projection layer output is referred to as a "bottleneck". In the standard MobileNet architecture, a 3x3 depth-wise convolutional layer extracts features from the input channels, while a 1x1 point-wise convolutional layer combines the feature maps generated by the depth-wise convolutional layer to efficiently reduce the dimensionality of the input channels. This makes depth-wise separable convolution filters faster than standard convolutional filters and reduces the network training time.
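The parameter savings of depth-wise separable convolution can be verified with simple arithmetic: a standard k x k convolution needs k*k*Cin*Cout weights, while the depth-wise plus point-wise pair needs only k*k*Cin + Cin*Cout. A short sketch (bias terms omitted; the example layer sizes are illustrative):

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """One k x k depth-wise filter per input channel, plus a 1x1
    point-wise convolution that combines the channels."""
    return k * k * c_in + c_in * c_out

# Example layer: 3x3 kernels, 32 input channels, 64 output channels
std = conv_params(3, 32, 64)                  # 9*32*64  = 18432 weights
sep = depthwise_separable_params(3, 32, 64)   # 288+2048 = 2336 weights
ratio = std / sep                             # roughly 8x fewer parameters
```

This roughly k*k-fold reduction in parameters (and multiply-accumulate operations) is what makes MobileNet-family networks fast to train and small enough for embedded deployment.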

Transfer Learning Approach
Transfer learning is an effective approach that leverages the features learned from one task for another related task, resulting in improved model performance and reduced training time. This paper investigates the BCD model for mammography image classification using three variations of the TL approach, which are graphically represented in Figure 4. The three variants are as follows:

Transfer Learning (Without Fine-tuning)
This variant utilizes the pre-trained MobileNetV2 architecture with a frozen convolutional base, meaning that no changes are made to the layer parameters in the convolutional base.The last classification layer, which is a fully connected layer, is customized to match the number of classification categories required for the task.Once the necessary changes are made to the classification layer, the network is trained using DDSM training images.

Transfer Learning (With Fine-tuning)
This approach involves fine-tuning specific layers in the convolutional base while replacing the last layers with a new fully connected layer.The convolutional base is fine-tuned, and the fully connected (FC) layers are retrained based on the new learning task.In the present study, the last 23 layers of the convolutional base were trained with a learning rate of 0.1, and the simulation results were obtained accordingly.
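The two TL variants above (fully frozen base vs. last 23 layers unfrozen) can be sketched with tf.keras. This is a hedged illustration, not the paper's training script: weights=None is used here to avoid downloading ImageNet weights (in practice weights="imagenet" would be passed), the 224x224 input size is an assumption, and the head is a single sigmoid unit for the benign/malignant decision.

```python
import tensorflow as tf

# Convolutional base; weights=None keeps this sketch self-contained,
# whereas real transfer learning would load weights="imagenet".
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)

# Variant 1 - TL without fine-tuning: freeze the whole convolutional base.
base.trainable = False

# New task-specific head for binary (benign vs. malignant) classification.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Variant 2 - TL with fine-tuning: unfreeze only the last 23 base layers,
# mirroring the configuration described above.
base.trainable = True
for layer in base.layers[:-23]:
    layer.trainable = False
```

In variant 1 only the new head is updated during training; in variant 2 the gradients also flow into the last 23 layers of the convolutional base while the earlier layers stay frozen.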

Fixed Feature Extraction (FFE) Approach
In this variant, the convolutional base of the Deep CNN model is kept frozen, while only the last layer, the FC layer, is replaced with a new classifier. In the proposed BCD model, the FC layer of the pre-trained CNN architecture (MobileNetV2) is substituted with a Random Forest (RF) machine learning classifier. This approach has proven effective in training the model without the need to customize the hyperparameters of MobileNetV2. The RF classifier is initialized with a suitable number of Decision Trees (DTs), and the final prediction is obtained by averaging the predictions generated by all DTs.
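The FFE idea of a frozen convolutional base feeding a Random Forest head can be illustrated with scikit-learn. In this sketch, the features that the frozen MobileNetV2 base would emit are replaced by synthetic vectors; the feature dimensionality, cluster layout, and tree count are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-ins for the pooled feature vectors a frozen MobileNetV2 base
# would produce per image (128-d here for speed): two synthetic,
# linearly shifted clusters for benign (0) and malignant (1).
X_benign = rng.normal(loc=0.0, scale=1.0, size=(200, 128))
X_malig = rng.normal(loc=1.5, scale=1.0, size=(200, 128))
X = np.vstack([X_benign, X_malig])
y = np.array([0] * 200 + [1] * 200)

# Random Forest head: the final prediction aggregates the votes of all
# decision trees, replacing the CNN's fully connected classifier.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
acc = rf.score(X, y)   # training accuracy on the synthetic features
```

Because no gradients are ever computed, only the forest is fit; this is why the FFE variant trains far faster than the back-propagation-based TL variants.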

Scratch Learning (SL) Approach
The SL approach provides developers with the flexibility to design network layers from scratch. In our study, we created a 7-layered CNN architecture using the SL approach. The structure of the 7-layered CNN model is illustrated in Figure 5. The architecture consists of repeated convolutional layers, activation functions, and pooling layers. The final layer of the network is a dense layer, combined with a dropout layer, which generates the classification score.
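A 7-layered CNN of the kind described above might look as follows in tf.keras. The filter counts, dropout rate, and layer-counting convention are illustrative assumptions, not the paper's exact configuration.

```python
import tensorflow as tf

# Illustrative 7-layered scratch CNN: three conv+pool stages followed by
# a dropout-regularized dense head producing the classification score.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(227, 227, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),   # layer 1
    tf.keras.layers.MaxPooling2D(),                     # layer 2
    tf.keras.layers.Conv2D(32, 3, activation="relu"),   # layer 3
    tf.keras.layers.MaxPooling2D(),                     # layer 4
    tf.keras.layers.Conv2D(64, 3, activation="relu"),   # layer 5
    tf.keras.layers.MaxPooling2D(),                     # layer 6
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),     # layer 7
])
```

Unlike the TL variants, every weight here is trainable from random initialization, which is why the SL approach demands careful choices of layer count, filters, stride, and padding.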

Results and Discussion
In this section, we present the results of the simulations conducted using the three different TL approaches and the SL approach. The model parameters used for each approach are summarized in Table 1. After training, the performance of each model is evaluated based on machine learning performance measures such as accuracy, precision, F1-score, and AUC. The number of trainable parameters and the network training time are also computed for each approach. Table 2 presents the results of the four different approaches (Fixed Feature Extraction, Transfer Learning without fine-tuning, Transfer Learning with fine-tuning, and Scratch Learning) used for mammography image classification. The total model parameters, trainable and non-trainable parameters, training time, and performance metrics (accuracy, precision, F1 score, and AUC) are reported for each approach. The corresponding confusion matrices and AUC curves for each approach are shown in Figures 6 and 7, respectively.
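The reported metrics can all be computed from a model's held-out predictions with scikit-learn. A toy sketch with made-up labels and scores (not the paper's results; 0 = benign, 1 = malignant):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, f1_score,
                             roc_auc_score, confusion_matrix)

# Hypothetical ground-truth labels and classifier scores standing in for
# a trained model's output on the held-out test images.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9])
y_pred = (y_score >= 0.5).astype(int)   # threshold the sigmoid scores

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)    # AUC uses scores, not labels
cm = confusion_matrix(y_true, y_pred)   # rows: true class, cols: predicted
```

Note that AUC is computed from the continuous scores while accuracy, precision, and F1 depend on the chosen decision threshold.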
As demonstrated in Table 2, the FFE approach, which requires no trainable parameters, achieved impressive results when using the RF classifier with MobileNetV2 as the convolutional base. The model achieved an accuracy of 0.994 and had a much shorter network training time of 63.86 seconds compared to the TL approaches. The FFE approach utilized transfer learning by using pre-trained convolutional layers and adding a new classifier on top. This allowed the model to leverage the knowledge gained from training on a large dataset, which is transferable to the task of mammography image classification. The addition of the Random Forest classifier helped to improve the model's performance by adding a decision-making layer that considers multiple decision trees. This approach can help to reduce overfitting and improve model generalization. Although TL with fine-tuning had an accuracy of 1.0, the appropriate selection of hyperparameters is critical for achieving such results. On the other hand, developing a scratch model with appropriate layers, filters, stride, and padding can lead to optimal performance, but this requires domain expertise. Trang et al. (22) used transfer learning with Xception as the pre-trained model and a gradient boosting machine as the classifier for binary classification on a private dataset. Their proposed approach achieves an accuracy of 84.5% and an AUC of 0.88. Huang et al. (23) used transfer learning with MobileNetV3 and a bilinear structure for multiclass classification of whole slide images. They reported an accuracy of 0.88. Li et al. (24) used transfer learning with ResNet-v2 and a CNN for multiclass classification on the INbreast dataset. Their proposed approach achieves an accuracy of 70% and an AUC of 0.84. Finally, our paper proposed a transfer learning approach using MobileNetV2 as the pre-trained model and an RF classifier for binary classification on the DDSM dataset. Our approach achieved an accuracy of 99.4% and an AUC of 0.99, which indicates that it performs exceptionally well on this particular task.
Overall, Table 3 highlights the effectiveness of transfer learning techniques in achieving high accuracy and AUC for different classification tasks on different datasets.However, it also suggests that the performance of these techniques can vary based on the choice of pre-trained model, classifier, and dataset.

Fig 5. Structure of a 7-layered CNN Model

Table 1. Model Parameters of different Deep CNN Strategies

Table 2. Performance Metrics obtained with different Deep CNN Strategies

Table 3 provides a summary of the results obtained by different research studies that have employed deep learning-based strategies for breast cancer detection tasks on different datasets.

Table 3. Comparison of the proposed work with existing Breast Cancer Models

From the literature review and simulation results, it is clear that Deep CNNs have potential for CAD development in early-stage breast cancer diagnosis and prediction without requiring extensive preprocessing and feature engineering. Transfer learning makes it easier for developers to create application-specific models using pre-trained CNNs, and various transfer learning approaches can improve the performance of DL architectures without relying heavily on large datasets. However, amending existing model architectures and selecting appropriate hyperparameters remains challenging when creating suitable models for clinical applications. The lack of standard annotated medical datasets is an open area of research that poses a challenge to supervised DL model optimization. Additionally, selecting suitable image pre-processing techniques is vital to enhancing model performance by reducing the False Positive Rate (FPR) caused by contrast variation and noise in mammography images. Network hyperparameters, such as the number of epochs, learning rate, and dropout rate, significantly influence reliable model performance. The systematic combination of supervised and unsupervised DL architectures can prevent models from overfitting and underfitting. Training the network with a hybrid image dataset and clinical information may enhance network generalization ability. The use of lightweight CNNs such as SqueezeNet, MobileNetV2, and Xception can enable developers to implement DL models on smart platforms with low computational power, and ensembling DL architectures can enhance network capability to perform accurate medical image analysis. Future research on breast cancer diagnosis can focus on customizing existing model architectures and exploring cloud-based implementation for improved results and efficiency. The availability of open-source mammography datasets and the implementation of ensemble models with simple architectures can further enhance the potential of the BCD model.