Abstract
Developing an automated lung disease diagnosis framework remains one of the most challenging and demanding tasks. Most medical experts prefer Computed Tomography (CT) lung images for accurate disease detection. For this purpose, various segmentation, optimization, and classification techniques have been developed in conventional works for pulmonary disease detection. However, the existing techniques suffer from over-segmentation, inaccurate ROI extraction, reduced accuracy, computational complexity, and high false positives. Thus, this research work proposes a simple and efficient segmentation-based classification framework for accurate lung nodule detection and pulmonary disease classification. Here, the tanh normalization technique is applied to preprocess the input lung CT image, reducing noise and improving quality. After that, the perceptual U-Net segmentation algorithm is employed to accurately segment the lung nodules from the preprocessed CT images with simple computational operations. Moreover, the Decked Dragonfly Optimization (DDO) technique is used to choose the relevant features based on the best optimal solution, which helps to increase detection accuracy and reduce the classification error rate. Finally, the Speculative Deceptive Network (SDN) based classification algorithm is deployed to detect pulmonary lung cancer according to the optimal features. During evaluation, the performance of the proposed segmentation-based DDO-SDN mechanism is validated and compared using various evaluation parameters.
Introduction
Diagnosing lung disease [1–3] is one of the most demanding and crucial tasks in providing early treatment to patients. Medical experts prefer automated disease diagnosis tools for identifying and categorizing the types of diseases [4–6]. According to recent reviews, 1.61 million deaths occur each year due to lung cancer. Pulmonary lung cancer is considered a dreadful and life-threatening disease affecting people around the world [7–9]. Hence, it should be accurately diagnosed to save lives. In the medical sector, imaging modalities such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and X-ray are extensively used for disease diagnosis [10, 11]. Among these, CT is the most suitable and accurate medical imaging tool, capturing the organs of the human body with minimal noise. Due to the inhomogeneity of lung nodules [12–14], developing an automated lung segmentation system is a complicated task. Most conventional works [15–18] have therefore developed different types of segmentation and classification mechanisms for lung cancer identification and nodule segmentation. Yet, these approaches face various problems [13, 19–22], including complexity, models that are difficult to understand, over-segmentation, reduced accuracy, a high mis-prediction rate, and a high error rate. The proposed work is therefore motivated to develop an efficient and automated lung nodule segmentation and classification system for detecting pulmonary diseases from CT images. The major research objectives of this paper are as follows:
To preprocess the input lung CT image, the tanh normalization mechanism is applied, which eliminates noise and smooths the image for further processing.
To exactly segment the lung nodules from the preprocessed image, the perceptual U-Net segmentation mechanism is utilized.
To obtain the most relevant features from the segmented region, the Decked Dragonfly Optimization (DDO) technique is implemented, which helps to optimize the feature set.
To predict the pulmonary lung cancer by using the optimal features, the Speculative Deceptive Network (SDN) based classification algorithm is deployed.
To test the results and performance of the proposed lung nodule segmentation and disease classification system, various evaluation metrics are used during analysis.
The remainder of this paper is organized as follows. Section 2 reviews the conventional segmentation, optimization, and prediction methodologies used for detecting cancer from lung CT images, and analyzes the benefits and limitations of each model according to its key features and computational operations. Section 3 describes the proposed lung nodule segmentation and classification methodology used for early disease diagnosis. Section 4 validates the results of the conventional and proposed lung cancer detection approaches using various evaluation measures. Finally, Section 5 summarizes the paper with its findings, challenges, and future scope.
Related works
This section reviews the conventional image processing techniques used for accurate segmentation and detection of nodules in lung CT images. It also examines the benefits and drawbacks of each technique according to its operating characteristics and principles.
Primakov et al. [23] implemented an automated segmentation system for detecting lung cancer from CT images. The purpose of this work was to accurately identify the tumor slice, location, and size by using an advanced U-Net based Convolutional Neural Network (CNN) classification mechanism. However, the suggested system was difficult to understand, which affected the efficiency of the entire system. Wang et al. [24] deployed a Deep Self-paced Active Learning (DSAL) mechanism for detecting pulmonary nodules from 3D thoracic lung images. The purpose of this work was to accurately segment the tumor-affected region with a reduced error rate and increased segmentation accuracy. Yet, it suffered from increased feature dimensionality, reduced convergence speed, and high time consumption. Shaziya et al. [25] developed an automated system for segmenting lungs in CT images using the U-Net convolutional network, with the aim of increasing detection accuracy by accurately segmenting the lung regions. Haq et al. [26] utilized a multi-label Deep Learning Segmentation (DLS) technique for segmenting the cardio-pulmonary substructures from CT images. Here, image preprocessing was performed at the initial stage to normalize the lung image with increased contrast and quality.
Kiser et al. [27] deployed a pleural effusion segmentation algorithm for detecting pulmonary disease from lung CT images. Here, semi-automated segmentation techniques were used to detect the tumor with better accuracy. To validate the suggested approach, similarity coefficients and Hausdorff distance measures were computed. Alzubi et al. [28] utilized a boosted NN ensemble classification mechanism for developing an accurate lung cancer diagnosis framework. Here, the Weight Optimized Neural Network (WONN) technique was utilized to improve the performance of the detection system with minimized error and time. In addition, it aimed to reduce the false positive rate and classification time by accurately categorizing the cancer. Yet, it suffered from high processing time, complex computational operations, and a lack of scalability. Kriegsmann et al. [29] used a CNN-based deep learning mechanism for accurately predicting small-cell and non-small-cell lung cancer. The contribution of this work was to categorize different types of lung cancer, such as Pulmonary Adenocarcinoma (ADC), small cell lung cancer, and Pulmonary Squamous Cell Carcinoma (SqCC), with increased accuracy. Moreover, hyper-parameter tuning was performed to improve the detection performance. The major drawbacks of this work were its high computational complexity and cost, which degraded the efficacy of the entire cancer prediction system. Abe et al. [30] implemented a simple method for designing a real-time detection system for lung tumor identification and classification. Lin et al. [31] employed a Taguchi parametric optimization technique integrated with a 2D CNN mechanism for detecting lung cancer from CT images. It is a statistical mechanism based on an orthogonal array, in which the control factor is calculated to improve stability and minimize quality loss. The key merits of this work were fast convergence speed and optimized time consumption.
This review shows that conventional works focus heavily on developing accurate segmentation and classification techniques for automated lung disease diagnosis systems. However, they face complications associated with the following factors:
Over segmentation
Complex mathematical operations
Reduced detection accuracy
High false positives
Difficult to understand the system model
Therefore, the proposed work aims to implement effective medical image processing techniques for the design and development of a lung nodule segmentation and classification system.
Proposed methodology
This section presents a detailed description of the proposed automated lung nodule segmentation and tumor classification system. The main contribution of this work is a simple and efficient automated segmentation and classification system for accurately identifying pulmonary lung cancer from CT images. It also aims to reduce the error rate and mis-predictions by properly detecting the tumor region in the given input CT image. The main aim is to accurately classify nodule and non-nodule lung images by using advanced medical image processing techniques. For this purpose, a combination of segmentation, optimization, and classification mechanisms is implemented in this framework, and the overall workflow of the proposed system is shown in Fig. 1. It includes the following stages:
Tanh normalization and preprocessing
Lung nodule segmentation using Perceptual U-Net algorithm
Decked Dragonfly Optimization (DDO)
Speculative Deceptive Network (SDN) classification algorithm
Performance analysis
Initially, the input CT lung image is obtained from the dataset and preprocessed using the tanh normalization mechanism, which helps to improve the overall quality of the image by suppressing noise and artifacts. Then, an advanced Perceptual U-Net segmentation mechanism is implemented to segment the lung region from the filtered image, which helps to make the disease diagnosis more accurate. After that, the DDO algorithm is employed to select the most relevant features based on the best optimal solution, which helps to reduce the complexity of the classifier. The SDN classification mechanism is then trained on the selected features to detect pulmonary lung cancer. At the end, the classifier predicts the nodule and non-nodule classes according to the optimized features. The primary advantages of the proposed DDO-SDN based lung disease diagnosis system are as follows: an increased convergence rate, reaching the optimal solution in a minimum number of iterations, high accuracy, reduced false outputs, and reduced time consumption.
Preprocessing
At first, the input lung CT image obtained from the dataset is preprocessed using the tanh normalization mechanism. The main purpose of this mechanism is to improve the quality of the CT image and reduce noise. The original medical images in the datasets are often blurred and noisy, which may degrade the performance of the classifier through false positives and erroneous outputs. Hence, improving the quality of the medical images is an essential step before parameter optimization and classification. Different types of filtering techniques have been developed for this purpose in conventional works, but they are limited by inefficient smoothing, residual noise, and reduced quality. Therefore, the proposed work intends to develop an efficient filtering technique for normalizing the lung CT images. Other reasons for using the tanh normalization mechanism are that it is simple to implement and easy to understand, and that it reduces the error rate and time consumption.
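The normalization formula itself is not given in this section. A minimal sketch, assuming the commonly used tanh-estimator form x' = 0.5 * (tanh(k * (x - mu) / sigma) + 1) with k = 0.01, is shown below in Python with NumPy. The function name `tanh_normalize`, the value of `k`, and the use of the plain mean and standard deviation in place of robust estimators are assumptions made for illustration, not details taken from the proposed system.

```python
import numpy as np


def tanh_normalize(image, k=0.01):
    """Map pixel intensities into the open interval (0, 1) with a tanh-estimator.

    Assumption: the image mean and standard deviation are used as the
    location and scale estimates; k controls how strongly extreme
    intensities are compressed.
    """
    pixels = np.asarray(image, dtype=np.float64)
    mu = pixels.mean()
    sigma = pixels.std()
    if sigma == 0.0:
        # A constant image has no contrast; map every pixel to the midpoint.
        return np.full_like(pixels, 0.5)
    return 0.5 * (np.tanh(k * (pixels - mu) / sigma) + 1.0)


if __name__ == "__main__":
    # Hypothetical 2x2 patch of CT intensities, used only as a demonstration.
    patch = np.array([[0, 100], [200, 300]])
    print(tanh_normalize(patch))
```

Because tanh is monotonic, the ordering of intensities is preserved while extreme values are squashed towards 0 or 1, which limits the influence of outlier pixels on later stages.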
Lung nodule segmentation
Decked Dragonfly Optimization (DDO)
Separation
Alignment
Cohesion
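The three swarm behaviours named in the headings above are the same ones used in the standard dragonfly algorithm. A minimal sketch of how that standard algorithm computes them for one individual is given below, assuming NumPy; the decked variant used in this work may define or weight the terms differently, and the function name `swarm_terms` and its arguments are hypothetical.

```python
import numpy as np


def swarm_terms(position, neighbor_positions, neighbor_velocities):
    """Return (separation, alignment, cohesion) for one individual.

    Follows the standard dragonfly algorithm definitions (an assumption
    about what the DDO variant builds on):
      separation = -sum_j (X - X_j)
      alignment  = mean of the neighbours' velocities
      cohesion   = mean of the neighbours' positions - X
    """
    x = np.asarray(position, dtype=np.float64)
    nx = np.asarray(neighbor_positions, dtype=np.float64)
    nv = np.asarray(neighbor_velocities, dtype=np.float64)
    separation = -np.sum(x - nx, axis=0)
    alignment = nv.mean(axis=0)
    cohesion = nx.mean(axis=0) - x
    return separation, alignment, cohesion


if __name__ == "__main__":
    # Hypothetical two-dimensional example with two neighbours.
    s, a, c = swarm_terms([0.0, 0.0],
                          [[1.0, 0.0], [0.0, 2.0]],
                          [[1.0, 1.0], [3.0, 1.0]])
    print(s, a, c)
```

In the standard algorithm these terms are combined, together with food-attraction and enemy-avoidance terms, into a weighted step that updates each candidate feature subset.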
Speculative Deceptive Network (SDN)
Results and discussion
This section presents the performance analysis of the proposed perceptual U-Net segmentation based DDO-SDN classification technique. The original contribution of this work is an automated lung nodule segmentation and pulmonary disease classification system built with advanced image processing mechanisms. Here, the results of the proposed DDO-SDN technique are validated in terms of accuracy, sensitivity, specificity, precision, recall, F-measure, and time. These parameters are widely used in medical image processing applications for assessing the performance and effectiveness of such mechanisms.
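The measures listed above follow from the four entries of a binary confusion matrix. A minimal sketch of the standard definitions is given below in Python; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the experiments reported here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures as fractions in [0, 1].

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives. Assumes every denominator is
    non-zero (both classes present, at least one positive prediction).
    """
    sensitivity = tp / (tp + fn)      # also called recall or TPR
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    fpr = fp / (fp + tn)              # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "fpr": fpr,
    }


if __name__ == "__main__":
    # Hypothetical counts for 100 nodule and 100 non-nodule images.
    for name, value in classification_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Note that sensitivity, recall, and TPR are three names for the same quantity, and that FPR is the complement of specificity.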
Dataset description
Here, the ELCAP lung image dataset and the Lung Cancer data obtained from data.world (https://data.world/cancerdatahp/lung-cancer-data) are used for system implementation and validation; both are publicly available benchmark datasets. The ELCAP public lung imaging database, a collection of 50 low-dose CT lung images, is used as the input dataset for nodule detection. The proposed lung nodule identification work is implemented in MATLAB 2016a on a system with an i5 processor and 4 GB of RAM. The outcome analysis compares several optimization strategies and discusses the isolated features for each lung image. Fig. 4 (a) to (d) depict the sample input lung images obtained from the ELCAP dataset, the ROI-extracted images, the contrast-enhanced portions, and the segmented output images.
Precision analysis
No. of training samples | Precision: KSCC with linear kernel | Precision: SVM-NN | Precision: Proposed |
100 | 95.48 | 92.4 | 96.5 |
200 | 95.86 | 93.54 | 96.98 |
300 | 97.15 | 94.91 | 98.68 |
400 | 98.15 | 95.48 | 99.12 |
500 | 98.62 | 95.98 | 99.47 |
Recall analysis
No. of training samples | Recall: KSCC with linear kernel | Recall: SVM-NN | Recall: Proposed |
100 | 92.84 | 91.51 | 93.51 |
200 | 92.99 | 91.85 | 94.43 |
300 | 94.19 | 93.45 | 95.68 |
400 | 95.97 | 94.51 | 96.98 |
500 | 97.8 | 96.89 | 98.45 |
F-Measure analysis
No. of training samples | F-Measure: KSCC with linear kernel | F-Measure: SVM-NN | F-Measure: Proposed |
100 | 95.6 | 94.2 | 96.2 |
200 | 96.1 | 94.5 | 96.7 |
300 | 97 | 95.7 | 97.8 |
400 | 98.6 | 97.1 | 98.7 |
500 | 99.6 | 98.3 | 99.2 |
Accuracy analysis
No. of training samples | Accuracy: KSCC with linear kernel | Accuracy: SVM-NN | Accuracy: Proposed |
100 | 95.45 | 94.15 | 95.6 |
200 | 96.98 | 94.98 | 96.8 |
300 | 97.46 | 95.68 | 97.9 |
400 | 97.89 | 96.87 | 98.5 |
500 | 98.85 | 98.54 | 99.4 |
Table 5 and Fig. 9 compare the accuracy and processing time of the conventional [33] and proposed classification methodologies. Typically, the time consumption of a disease detection system is estimated from the amount of time the classifier requires to train on and test the samples, and a lower time consumption indicates a more efficient classifier. The estimated results show that the proposed technique matches or exceeds the accuracy of the other classifiers while requiring less processing time (in seconds). This is because the DDO technique efficiently reduces the dimensionality of the features by selecting the optimal parameters based on the best optimal solution. Hence, the training and testing operations of the classifier are carried out with reduced processing time.
Accuracy and time analysis
Techniques | Accuracy (%) | Processing Time (s) |
Decision tree | 95.40 | 0.0159 |
KNN – Euclidean distance (k = 6) | 99.8 | 0.0313 |
KNN – Euclidean distance (k = 10) | 96.4 | 0.0312 |
GA-KNN | 100 | 0.0156 |
Proposed | 100 | 0.0098 |
Table 6 and Fig. 10 show the overall performance analysis of the proposed DDO-SDN mechanism; the results indicate that the proposed technique achieves high performance values by accurately detecting the disease.
Performance analysis of DDO-SDN
Parameters | Performance value |
Sensitivity | 98.7 |
Specificity | 98.9 |
Accuracy | 99.25 |
Precision | 99.2 |
Recall | 98.9 |
F-Measure | 98.7 |
FPR | 0.006 |
TPR | 99.1 |
Table 7 presents a comparative analysis of the existing TensorFlow normalization model and the proposed tanh normalization model in terms of error rate (%) and time (s). According to the results, the tanh normalization mechanism outperforms the existing TensorFlow model, reducing the time from 21 s to 15 s and the error rate from 8.25% to 6.54%. This improvement is attributed to its efficient filtering and noise removal operations. Table 8 and Fig. 11 assess the performance of the perceptual U-Net lung nodule segmentation mechanism used in the proposed work. Overall, the obtained results indicate that the proposed U-Net segmentation algorithm provides improved results by accurately segmenting the given lung image.
Comparative analysis between TensorFlow and tanh normalization models
Techniques | Error (%) | Time (s) |
TensorFlow normalization | 8.25 | 21 |
Tanh normalization | 6.54 | 15 |
Performance analysis of segmentation
Performance measures | Values (%) |
Sensitivity | 99 |
Specificity | 98.8 |
Accuracy | 99 |
Dice | 98.2 |
Jaccard index | 98 |
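The Dice and Jaccard measures reported for segmentation compare a predicted mask with a ground-truth mask. A minimal sketch of the standard definitions is given below in Python with NumPy; the function name `dice_and_jaccard`, the convention for two empty masks, and the toy masks are assumptions for illustration and do not reproduce the evaluation code of this work.

```python
import numpy as np


def dice_and_jaccard(pred_mask, true_mask):
    """Return (Dice coefficient, Jaccard index) for two binary masks.

    Dice    = 2 * |A and B| / (|A| + |B|)
    Jaccard = |A and B| / |A or B|
    """
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks are empty; treated here as perfect agreement (a convention).
        return 1.0, 1.0
    return float(2.0 * intersection / total), float(intersection / union)


if __name__ == "__main__":
    # Hypothetical 2x3 masks, used only as a demonstration.
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(dice_and_jaccard(predicted, reference))
```

For a single pair of masks the two measures are linked by Jaccard = Dice / (2 - Dice), so the Jaccard index is never larger than the Dice coefficient.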
Discussion
This paper primarily addressed lung image segmentation and nodule classification in the context of medical image processing. Finding the best subset of features from the large number of input images in the ELCAP public dataset is the primary problem of image classification. We extracted the lung and nodule portions of the dataset and identified the nodule-affected and normal lung images. For this purpose, advanced image processing techniques such as tanh normalization, perceptual U-Net segmentation, DDO-based feature selection, and SDN classification are used. Here, the lung nodules and non-nodules are segmented and classified with increased accuracy and reduced processing time.
Conclusion
This paper presents a new Perceptual U-Net segmentation based DDO-SDN classification system for detecting pulmonary lung cancer from CT images. The original contribution of this work is a simple and effective detection framework for accurate diagnosis of pulmonary lung cancer. In this framework, advanced preprocessing, segmentation, optimization, and classification methodologies are used to design an automated lung disease detection system. First, the tanh normalization mechanism is applied to preprocess the input CT images by eliminating noise and artifacts and improving image quality. A noisy image can disrupt the overall detection performance of the classifier, so it should be properly filtered before further operations. Next, the perceptual U-Net segmentation algorithm is used to accurately segment the lung nodules from the preprocessed image, which helps to avoid over-segmentation by accurately cropping the regions. Then, the most relevant features are extracted from the segmented portion based on the optimal solution provided by the DDO algorithm. This helps to increase detection accuracy and reduce the error rate by reducing the dimensionality of the features. Finally, the SDN-based classification mechanism is used to detect pulmonary lung cancer from the given CT image according to its features. During performance analysis, the results of the proposed segmentation-based DDO-SDN technique are validated and compared using various measures. According to the comparative results, the proposed DDO-SDN technique outperforms other approaches with increased accuracy and reduced false positives.
In future, this work can be enhanced by implementing new optimization and deep learning based classification methods for lung disease diagnosis.
Authors Contribution
All authors contributed to the study's inception and design. T. Chitra prepared the material, collected the data, and carried out the analysis. T. Chitra wrote the original draft, and all authors provided feedback on earlier drafts of the paper. P. Jaganathan contributed to the background research and assisted with the mathematical derivations. C. Sundar participated technically, contributed a factual evaluation, and assisted with manuscript editing. All authors read and approved the final manuscript.
Conflict of interest
The authors declare that they have no conflict of interest.
Funding sources
The authors have no relevant financial or non-financial interests to disclose.
Ethical statement
The study submitted to IMAGING has been conducted in accordance with the Declaration of Helsinki and the requirements of all applicable local and international standards.
Acknowledgements
I am grateful to all of those with whom I have had the pleasure to work during this and other related Research Work. Each of the members of my Dissertation Committee has provided me extensive personal and professional guidance and taught me a great deal about both scientific research and life in general.
References
- [1] Bharati S, Podder P, Mondal R, Mahmood A, Raihan-Al-Masud M: Comparative performance analysis of different classification algorithm for the purpose of prediction of lung cancer. In International conference on intelligent systems design and applications: Springer; 2018: 447–457.
- [2] Asuntha A, Srinivasan A: Deep learning for lung cancer detection and classification. Multimedia Tools and Applications 2020; 79(11): 7731–7762.
- [3] Bajo-Morales J, Galvez JM, Prieto-Prieto JC, Herrera LJ, Rojas I, Castillo-Secilla D: Heterogeneous gene expression cross-evaluation of robust biomarkers using machine learning techniques applied to lung cancer. Current Bioinformatics 2022; 17(2): 150–163.
- [4] Singh GAP, Gupta P: Performance analysis of various machine learning-based approaches for detection and classification of lung cancer in humans. Neural Computing and Applications 2019; 31(10): 6863–6877.
- [5] Toğaçar M, Ergen B, Cömert Z: Detection of lung cancer on chest CT images using minimum redundancy maximum relevance feature selection method with convolutional neural networks. Biocybernetics and Biomedical Engineering 2020; 40(1): 23–39.
- [6] Nadira T, Rustam Z: Classification of cancer data using support vector machines with features selection method based on global artificial bee colony. In AIP conference proceedings 2018; 2023(1): AIP Publishing LLC, 020205.
- [7] Sujitha R, Seenivasagam V: Classification of lung cancer stages with machine learning over big data healthcare framework. Journal of Ambient Intelligence and Humanized Computing 2021; 12(5): 5639–5649.
- [8] Zhang G, Lin L, Wang J: Lung nodule classification in CT images using 3D densenet. Journal of Physics: Conference Series 2021; 1827(1): IOP Publishing, 012155.
- [9] Jenipher VN, Radhika S: SVM kernel methods with data normalization for lung cancer survivability prediction application. In 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV) 2021: IEEE, 1294–1299.
- [10] Chaunzwa TL, et al.: Deep learning classification of lung cancer histology using CT images. Scientific Reports 2021; 11(1): 1–12.
- [11] Manju B, Athira V, Rajendran A: Efficient multi-level lung cancer prediction model using support vector machine classifier. In IOP Conference Series: Materials Science and Engineering 2021; 1012(1): IOP Publishing, 012034.
- [12] Lakshmanaprabu S, Mohanty SN, Shankar K, Arunkumar N, Ramirez G: Optimal deep learning model for classification of lung cancer on CT images. Future Generation Computer Systems 2019; 92: 374–382.
- [13] Alsinglawi B, et al.: An explainable machine learning framework for lung cancer hospital length of stay prediction. Scientific Reports 2022; 12(1): 1–10.
- [14] Hindocha S, et al.: A comparison of machine learning methods for predicting recurrence and death after curative-intent radiotherapy for non-small cell lung cancer: development and validation of multivariable clinical prediction models. EBioMedicine 2022; 77: 103911.
- [15] Shaffie A, et al.: A novel technology to integrate imaging and clinical markers for non-invasive diagnosis of lung cancer. Scientific Reports 2021; 11(1): 1–10.
- [16] Huang G, Wei X, Tang H, Bai F, Lin X, Xue D: A systematic review and meta-analysis of diagnostic performance and physicians’ perceptions of artificial intelligence (AI)-assisted CT diagnostic technology for the classification of pulmonary nodules. Journal of Thoracic Disease 2021; 13(8): 4797.
- [17] Mishra S, Thakkar HK, Mallick PK, Tiwari P, Alamri A: A sustainable IoHT based computationally intelligent healthcare monitoring system for lung cancer risk detection. Sustainable Cities and Society 2021; 72: 103079.
- [18] Morozov SP, et al.: A simplified cluster model and a tool adapted for collaborative labeling of lung cancer CT scans. Computer Methods and Programs in Biomedicine 2021; 206: 106111.
- [19] Va B, Subramoniam M, Mathew L: Noninvasive detection of COPD and lung cancer through breath analysis using MOS Sensor array based e-nose. Expert Review of Molecular Diagnostics 2021; 21(11): 1223–1233.
- [20] Zhao L, et al.: A weighted discriminative extreme learning machine design for lung cancer detection by an electronic nose system. IEEE Transactions on Instrumentation and Measurement 2021; 70: 1–9.
- [21] Osarogiagbon RU, et al.: The international association for the study of lung cancer molecular database project: Objectives, challenges, and opportunities. Journal of Thoracic Oncology 2021; 16(6): 897–901.
- [22] Alifano M, et al.: The reality of lung cancer paradox: the impact of body mass index on long-term survival of resected lung cancer. A French nationwide analysis from the epithor database. Cancers 2021; 13(18): 4574.
- [23] Primakov SP, et al.: Automated detection and segmentation of non-small cell lung cancer computed tomography images. Nature Communications 2022; 13(1): 1–12.
- [24] Wang W, et al.: Nodule-plus R-CNN and deep self-paced active learning for 3D instance segmentation of pulmonary nodules. IEEE Access 2019; 7: 128796–128805.
- [25] Shaziya H, Shyamala K, Zaheer R: Automatic lung segmentation on thoracic CT scans using U-net convolutional network. In 2018 International conference on communication and signal processing (ICCSP) 2018: IEEE, 0643–0647.
- [26] Haq R, Hotca A, Apte A, Rimner A, Deasy JO, Thor M: Cardio-pulmonary substructure segmentation of radiotherapy computed tomography images using convolutional neural networks for clinical outcomes analysis. Physics and Imaging in Radiation Oncology 2020; 14: 61–66.
- [27] Kiser KJ, et al.: PleThora: Pleural effusion and thoracic cavity segmentations in diseased lungs for benchmarking chest CT processing pipelines. Medical Physics 2020; 47(11): 5941–5952.
- [28] Alzubi JA, Bharathikannan B, Tanwar S, Manikandan R, Khanna A, Thaventhiran C: Boosted neural network ensemble classification for lung cancer disease diagnosis. Applied Soft Computing 2019; 80: 579–591.
- [29] Kriegsmann M, et al.: Deep learning for the classification of small-cell and non-small-cell lung cancer. Cancers 2020; 12(6): 1604.
- [30] Abe T, et al.: Simple method for evaluating achievement degree of lung dose optimization in individual patients with locally advanced non-small cell lung cancer treated with intensity modulated radiotherapy. Thoracic Cancer; 2022.
- [31] Lin C-J, Jeng S-Y, Chen M-K: Using 2D CNN with Taguchi parametric optimization for lung cancer recognition from CT images. Applied Sciences 2020; 10(7): 2591.
- [32] Nanglia P, Kumar S, Mahajan AN, Singh P, Rathee D: A hybrid algorithm for lung cancer classification using SVM and Neural Networks. ICT Express 2021; 7(3): 335–341.
- [33] Maleki N, Zeinali Y, Niaki STA: A k-NN method for lung cancer prognosis with the use of a genetic algorithm for feature selection. Expert Systems with Applications 2021; 164: 113981.