
Image Processing Projects
CSE Projects, ECE Projects - Final year Projects
Image Processing Projects: Image processing applies mathematical algorithms to digital images. ElysiumPro provides a comprehensive set of reference-standard algorithms and workflows that let students implement image enhancement, geometric transformation, and 3D image processing for research.
Quality Factor
- 100% Assured Results
- Best Project Explanation
- Tons of References
- Cost optimized
- Control Panel Access
1. Adaptive Hierarchical Multi-Headed Convolutional Neural Network With Modified Convolutional Block Attention for Aerial Forest Fire Detection
This project presents an advanced deep learning-based system for aerial forest fire detection using an Adaptive Hierarchical Multi-Headed Convolutional Neural Network integrated with a Modified Convolutional Block Attention Module (mCBAM). Leveraging the FlameVision Classification Dataset, the model is trained to distinguish between "Fire" and "No Fire" images using hierarchical convolution layers and attention mechanisms to enhance feature focus and discrimination. Data preprocessing includes normalization and extensive augmentation to improve generalization, while the model's performance is evaluated through accuracy, precision, recall, F1-score, and Grad-CAM visualizations. A user-interactive Streamlit web application is developed for real-time prediction and interpretability, enabling users to upload images, receive fire detection results, and visualize attention maps highlighting critical regions. This system aims to support rapid, intelligent, and interpretable fire monitoring for environmental and safety applications.
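To illustrate the attention idea behind mCBAM, the sketch below shows a standard CBAM block in PyTorch; the paper's specific modifications are not described here, so this is the unmodified baseline form with illustrative dimensions.

```python
# Minimal CBAM-style attention block (sketch; the paper's "modified" variant is not specified here).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        # Pool spatially, score each channel, and gate the feature map.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        scale = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        return x * scale

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Concatenate channel-wise mean and max maps, then gate spatially.
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca, self.sa = ChannelAttention(channels), SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))

# Example: refine a 64-channel feature map.
feats = torch.randn(2, 64, 32, 32)
print(CBAM(64)(feats).shape)  # torch.Size([2, 64, 32, 32])
```

In the full model, a block like this would sit between the hierarchical convolution stages to reweight fire-relevant channels and regions.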
This project develops Agribot, an AI assistant that delivers agricultural information through natural language interactions. The chatbot uses machine learning and natural language processing to understand farmers’ queries related to crop management, pest control, weather forecasts, and government schemes. It offers personalized advice based on regional conditions and crop types, helping farmers make informed decisions. Agribot operates on multiple platforms, ensuring accessibility even in remote areas with limited connectivity. By automating agricultural support, the system reduces dependency on experts and improves productivity. The project highlights the role of AI in empowering rural communities and advancing sustainable farming practices.
2. Analysis of the Influence of CT Imaging Conditions on the CT Image Features of Pulmonary Nodule Phantoms
This study investigates the impact of CT imaging conditions on the image features of pulmonary nodule phantoms, aiming to optimize the diversity of imaging parameters for building robust AI-enabled medical device software. Fifteen pulmonary nodule phantoms, representing three nodule types, were embedded within a whole lung phantom and scanned under various CT acquisition parameters—including tube voltage, tube current-time product, collimator width, and reconstruction kernel (14 imaging parameter sets total). Using PyRadiomics, texture features were extracted from the CT images and key features were selected via LASSO regression. Subsequent correlation analysis and non-parametric testing identified reconstruction kernels as the most influential factor affecting texture feature stability, while tube voltage, tube current, and collimator width showed no significant effects. Leveraging these data, multiple classification algorithms including a Convolutional Neural Network (CNN) based on ResNet50 and classical machine learning models (SVM, Random Forest, Logistic Regression) were trained and evaluated to classify pulmonary nodule types. The results demonstrated effective feature representation and high classification accuracy, confirming that incorporating diverse reconstruction kernels is critical in assembling representative test datasets for AI systems in pulmonary nodule analysis. This approach highlights the importance of CT imaging parameter selection to ensure reliable AI model performance in clinical workflows.
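To make the feature-selection step concrete, here is a minimal LASSO sketch with scikit-learn; the feature matrix is synthetic, standing in for the PyRadiomics texture features extracted in the study.

```python
# Sketch: LASSO-based selection of radiomic texture features (scikit-learn).
# The feature table below is synthetic; in the study, features come from PyRadiomics.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 40))  # 150 scans x 40 texture features (synthetic)
# Synthetic label depending on features 0 and 3, so LASSO has something to find.
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=150) > 0).astype(float)

X_std = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=5, random_state=0).fit(X_std, y)

selected = np.flatnonzero(lasso.coef_)  # features with non-zero coefficients survive
print(f"kept {selected.size} of {X.shape[1]} features:", selected)
```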
3. EmotionNet-X: An Optimized CNN Architecture for Robust Facial Emotion Analysis
Facial Emotion Recognition (FER) enables computers to interpret human emotions by analyzing facial expressions, playing a crucial role in applications such as IoT-based access control, real-time health monitoring, security, and live assistance. Despite advancements, FER remains challenging due to variations in facial expressions, demographics, and environmental conditions. Existing deep learning models, particularly Convolutional Neural Networks (CNNs), have demonstrated strong performance but often incur high computational costs, limiting their deployment in real-time IoT systems. This paper proposes EmotionNet-X, a lightweight CNN architecture designed to balance accuracy and efficiency for practical FER implementation. EmotionNet-X features a streamlined structure with four convolutional layers, seven dropout layers, and batch normalization, comprising 19.9 million parameters and achieving an inference time of 18 ms per image. The model’s performance was evaluated against established architectures such as VGG19, ResNet50V2, MobileNetV2, and EfficientNetB7, using public datasets CK+ and FER2013. EmotionNet-X achieved an outstanding accuracy of 99.86% on the CK+ dataset, demonstrating its potential for cost-effective and real-time emotion recognition in diverse IoT applications.
4. Automated Number Plate Recognition Using OCR and Deep Learning
Automatic Number Plate Detection (ANPD) plays a vital role in monitoring vehicles amid the rapid growth of automobile numbers and rising illegal activities. This paper presents a method utilizing the open-source TensorFlow platform for machine learning to enhance the accuracy of number plate recognition. The process begins with inputting images of vehicles, which are often low-resolution and lack clear edge details, requiring advanced image processing techniques. The system extracts the vehicle’s number plate from the image, crops it, and converts it to grayscale to reduce noise and standardize detection across various plate colors without needing separate algorithms. Optical Character Recognition (OCR) is then applied to extract the alphanumeric characters from the processed plate image. Extracted data is stored in an Excel file for easy access and future reference. Compared to state-of-the-art recognition methods, the proposed approach shows an average improvement of 3.6%. Finally, the paper introduces a hybrid classification model combining Support Vector Machine (SVM) and Bayesian rule-based techniques, further improving the accuracy and reliability of number plate recognition systems.
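A minimal sketch of the grayscale-conversion, OCR, and Excel-export steps, assuming OpenCV, pytesseract (with a local Tesseract install), and pandas; plate localization and the SVM/Bayesian classifier are omitted, and the file names are hypothetical.

```python
# Sketch of the grayscale + OCR + Excel-export steps of an ANPD pipeline.
import cv2
import pandas as pd
import pytesseract

plate = cv2.imread("plate_crop.jpg")              # hypothetical pre-cropped plate image
gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)    # color-independent representation
gray = cv2.bilateralFilter(gray, 11, 17, 17)      # denoise while keeping edges
_, binar = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

text = pytesseract.image_to_string(binar, config="--psm 7").strip()  # single text line
pd.DataFrame([{"plate": text}]).to_excel("plates.xlsx", index=False)
print("recognized:", text)
```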
5. A Novel Local Binary Patterns-Based Approach and Proposed CNN Model to Diagnose Breast Cancer by Analyzing Histopathology Images
Breast cancer is a leading cause of mortality among women, highlighting the need for accurate early diagnosis. This study proposes two high-performance methods for preliminary diagnosis of breast cancer using histopathology images: a CNN-based model and a Local Binary Pattern (LBP)-based approach. The CNN model features a 20-layer architecture, while the LBP method is enhanced with a novel Quad Star LBP (QS-LBP) pattern, inspired by a star-like structure. Both methods were evaluated on two comprehensive datasets widely used in breast cancer research: the BreaKHis dataset with approximately 7,924 images at various magnifications (40X, 100X, 200X, 400X), and another dataset containing about 278,000 histopathology images. Images processed with QS-LBP were further classified using Random Forest and Optimized Forest algorithms. The QS-LBP method achieved 94.58% accuracy, 92.3% F1 score, and 97.9% AUC/ROC, while the CNN model reached 98.27% accuracy, 98% F1 score, and 97% AUC/ROC. Both approaches outperform many existing techniques in distinguishing benign from malignant breast cancer images, offering promising tools for supporting early and reliable breast cancer diagnosis.
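The QS-LBP pattern itself is custom to the paper, so the sketch below uses skimage's standard uniform LBP as a stand-in to show the texture-histogram-plus-Random-Forest pipeline; images and labels are synthetic.

```python
# Sketch: classic LBP histogram features + Random Forest (the paper's QS-LBP
# neighborhood is custom; skimage's uniform LBP stands in here).
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.ensemble import RandomForestClassifier

def lbp_histogram(img, P=8, R=1):
    # Uniform LBP yields P + 2 distinct codes; the normalized histogram is the feature.
    lbp = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(60, 64, 64)).astype(np.uint8)  # stand-in patches
labels = rng.integers(0, 2, size=60)                               # benign / malignant

X = np.array([lbp_histogram(im) for im in images])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
print("train accuracy:", clf.score(X, labels))
```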
6. Early Diagnosis and Severity Assessment of Weligama Coconut Leaf Wilt Disease and Coconut Caterpillar Infestation Using Deep Learning-Based Image Processing Techniques
Coconut cultivation worldwide faces significant yield losses due to pest and disease outbreaks, notably Weligama Coconut Leaf Wilt Disease (WCWLD) and Coconut Caterpillar Infestation (CCI), which severely impact production in Sri Lanka and neighboring regions. Traditional detection methods rely on labor-intensive field observations, limiting timely diagnosis and intervention. This study explores advanced deep learning techniques for early and accurate detection of WCWLD and CCI. Transfer learning-based Convolutional Neural Networks (CNN) and Mask Region-based CNN (Mask R-CNN) were employed to identify the diseases and assess WCWLD severity. Additionally, the You Only Look Once (YOLO) object detection models were utilized to count caterpillar populations on affected leaves. Datasets from Matara, Puttalam, and Makandura regions in Sri Lanka were used for training and validation. The proposed methods achieved detection accuracies of 90% for WCWLD and 95% for CCI. The WCWLD severity classification attained 97% accuracy. For caterpillar counting, YOLOv5, YOLOv8, and YOLO11 models delivered accuracies of 96.87%, 96.1%, and 95.9%, respectively. These results demonstrate the potential of deep learning models to provide efficient, scalable, and precise monitoring solutions for coconut pest and disease management.
7. Machine Learning-Based Normal White Blood Cell Multi-Classification Optimization
White blood cell (WBC) classification is traditionally performed manually, relying on subjective judgment, which can lead to variability and inefficiency. To address this, this study proposes an optimized machine learning (ML) system for classifying five normal WBC types with high accuracy. Using open-source Raabin-WBC and private datasets, WBC images were segmented into nucleus, cytoplasm, and cell regions via U-Net, a deep learning model, achieving high segmentation accuracy (nucleus: 98.58%, Dice coefficient: 0.9233; cells: 99.47%, Dice coefficient: 0.9324). Feature extraction involved intensity histogram, hue saturation value, and CIE Lab features, with a final optimized set of 108 features determined through systematic experimentation. Among multiple classifiers, Support Vector Machine (SVM) yielded the best performance with an accuracy of 97.36%, and overall classification accuracy reached 98.22% using the selected features. The entire segmentation and classification process was implemented with a user-friendly graphical interface, requiring approximately 137 seconds per analysis. This ML-based system demonstrates significant potential to improve the efficiency and reliability of peripheral blood smear (PBS) tests, offering a valuable tool for clinical diagnostics.
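For reference, the Dice coefficient reported for the U-Net masks can be computed as below (a NumPy sketch on binary masks).

```python
# Sketch: the Dice coefficient used to score segmentation masks against ground truth.
import numpy as np

def dice(pred, target, eps=1e-7):
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((64, 64), dtype=np.uint8); a[10:40, 10:40] = 1   # predicted nucleus mask
b = np.zeros((64, 64), dtype=np.uint8); b[15:45, 15:45] = 1   # ground-truth mask
print(f"Dice = {dice(a, b):.4f}")
```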
8. FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning
Recently, fake audio detection has gained significant attention, as advancements in speech synthesis and voice conversion have increased the vulnerability of automatic speaker verification (ASV) systems to spoofing attacks. A key challenge in this task is generalizing models to detect unseen, out-of-distribution (OOD) attacks. Although existing approaches have shown promising results, they inherently suffer from overconfidence issues due to the usage of softmax for classification, which can produce unreliable predictions when encountering unpredictable spoofing attempts. To deal with this limitation, we propose a novel framework called fake audio detection with evidential learning (FADEL). By modeling class probabilities with a Dirichlet distribution, FADEL incorporates model uncertainty into its predictions, thereby leading to more robust performance in OOD scenarios. Experimental results on the ASVspoof2019 Logical Access (LA) and ASVspoof2021 LA datasets indicate that the proposed method significantly improves the performance of baseline models. Furthermore, we demonstrate the validity of uncertainty estimation by analyzing a strong correlation between average uncertainty and equal error rate (EER) across different spoofing algorithms.
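A minimal sketch of the evidential idea: softplus "evidence" is mapped to Dirichlet concentration parameters, whose total strength yields both class probabilities and an uncertainty score. This is the core mechanism, in PyTorch; FADEL's actual loss and architecture are not reproduced here, and the scores are illustrative.

```python
# Sketch: Dirichlet-based class probabilities and uncertainty from network evidence.
import torch
import torch.nn.functional as F

def evidential_outputs(logits):
    evidence = F.softplus(logits)          # non-negative evidence per class
    alpha = evidence + 1.0                 # Dirichlet concentration parameters
    strength = alpha.sum(dim=-1, keepdim=True)
    prob = alpha / strength                # expected class probabilities
    K = logits.shape[-1]
    uncertainty = K / strength.squeeze(-1) # high when total evidence is low
    return prob, uncertainty

logits = torch.tensor([[4.0, 0.5], [0.1, 0.2]])   # bonafide vs. spoof scores
prob, u = evidential_outputs(logits)
print(prob, u)  # the second, low-evidence sample gets higher uncertainty
```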
9. Identifying Human Factors in Aviation Accidents with Natural Language Processing and Machine Learning Models
The use of machine learning techniques to identify contributing factors in air incidents has grown significantly, helping to prevent accidents and improve air safety. In this paper, classifier models such as LS, KNN, Random Forest, Extra Trees, and XGBoost, which have proven effective in classification tasks, are used to analyze incident reports parsed with natural language processing (NLP) techniques, uncovering hidden patterns that can help prevent future incidents. Precision, recall, F1-score, and accuracy are used to assess the correctness of the predictive models, and hyperparameters are tuned with Grid Search and Bayesian Optimization. KNN had the best predictive rating, followed by Random Forest and Extra Trees. The results indicate that machine learning tools for classifying incidents and accidents help identify root causes, improving situational decision-making.
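A compact sketch of the NLP-plus-classifier pipeline with scikit-learn: TF-IDF features feed a KNN whose neighborhood size is grid-searched. The incident narratives and labels are invented placeholders.

```python
# Sketch: TF-IDF over incident narratives + KNN with grid-searched hyperparameters.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

reports = ["pilot fatigue during night approach",
           "fatigue cited after extended duty day",
           "crew miscommunication on taxi clearance",
           "readback error and unclear tower instruction",
           "improper checklist use before takeoff",
           "skipped checklist item during preflight"]
labels = ["fatigue", "fatigue", "communication", "communication",
          "procedure", "procedure"]

pipe = Pipeline([("tfidf", TfidfVectorizer()), ("knn", KNeighborsClassifier())])
grid = GridSearchCV(pipe, {"knn__n_neighbors": [1, 3]}, cv=2)
grid.fit(reports, labels)
print(grid.best_params_, grid.predict(["long duty day and fatigue on approach"]))
```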
10. CNN Issues in Skin Lesion Classification: Data Distribution and Quantity
This study investigates challenges faced by convolutional neural networks (CNNs) in classifying skin lesions, focusing on the impact of data distribution and dataset size. It highlights how imbalanced classes and limited sample quantities can cause model bias and reduce classification accuracy. The paper suggests strategies such as data augmentation, transfer learning, and careful dataset curation to address these issues and improve diagnostic reliability in skin cancer detection.
11. Screening Dyslexia Using Visual-Auditory Computer Games and Machine Learning
This research explores an innovative approach to dyslexia screening by combining visual-auditory computer games with machine learning analysis. The system collects gameplay data reflecting cognitive and sensory processing abilities, which are then analyzed using ML algorithms to identify patterns indicative of dyslexia. This non-invasive, engaging method aims to provide early detection, especially for children, facilitating timely intervention and support.
12. A Novel Deep Learning Approach for Accurate Cancer Type and Subtype Identification
Cancer, characterized by uncontrolled cell growth and spread, requires early and accurate diagnosis to improve patient outcomes. This study presents a novel deep learning framework for detecting and classifying eight primary cancer types and 26 subtypes using a large Kaggle dataset with over 130,000 images. Our approach integrates pre-trained Convolutional Neural Networks (CNNs), machine learning classifiers such as K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), and hybrid CNN-Long Short-Term Memory (LSTM) architectures. Two classification strategies were employed: one classifying main and subtypes simultaneously, and another predicting main classes first, followed by subtypes. Notably, KNN outperformed CNNs for the Lymphoma class. We further introduced novel models, Vception (VGG + Inception) and Vmobilnet (VGG + MobileNet), combined with LSTM, enhancing diagnostic accuracy. An X-OR Gate fusion technique applied post-prediction significantly reduced misclassification, achieving 99.95% accuracy for main classes and 99.13% for subtypes. Individually, KNN with Principal Component Analysis reached 97.14% accuracy for Lymphoma detection. These results demonstrate the potential of advanced multimodal deep learning models to set new benchmarks in cancer diagnosis, offering improved precision and promising better patient care.
13. Brain Age Prediction Using a Lightweight Convolutional Neural Network
In this study, we propose a deep learning-based model for brain age prediction in preterm infants using neonatal MRI scans. The goal is to develop an efficient system that can predict the biological brain age of preterm infants, providing valuable insights into their neurodevelopmental trajectory. Preterm infants are at higher risk of developmental delays, making accurate prediction of brain age crucial for early interventions. The model leverages advanced transfer learning techniques to handle both 2D and 3D MRI data. For 2D MRI images, architectures like DeepBrainNet, DenseNet169, and ResNet152 are utilized, while for 3D MRI images, DeepBrainNet, DenseNet169, and ResNet101 are explored to capture the volumetric nature of the data.
14. Predicting mortality and recurrence in colorectal cancer: Comparative assessment of predictive models
This study focuses on predicting mortality and recurrence in colorectal cancer (CRC) by leveraging deep learning models to analyze medical images. Specifically, convolutional neural networks (CNNs) are employed to classify whether an input image indicates the presence of colorectal cancer. Various models, such as ResNet, are comparatively assessed for their accuracy, sensitivity, and specificity in detecting CRC. The dataset comprises histopathological or radiological images labeled with clinical outcomes. Models are trained not only to detect cancer presence but also to predict the likelihood of recurrence or patient mortality. Data augmentation and preprocessing techniques are applied to enhance model performance. Evaluation metrics include AUC-ROC, F1-score, and confusion matrices. Transfer learning is utilized to improve training efficiency on medical image datasets. Results demonstrate that deep learning can provide reliable, non-invasive CRC diagnostics. This approach supports clinical decision-making and personalized patient management.
15. Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis
This study introduces a source-free collaborative domain adaptation approach to analyze functional MRI data, eliminating the need for labeled source data during adaptation. By employing multi-perspective feature enrichment, the method enhances feature representation across domains, facilitating accurate brain activity analysis. It addresses challenges in cross-domain fMRI analysis, improving generalization in clinical diagnostics.
16. A systematic review of machine learning-based tumor-infiltrating lymphocytes analysis in colorectal cancer: Overview of techniques, performance metrics, and clinical outcomes
This review systematically explores ML-based approaches for analyzing tumor-infiltrating lymphocytes (TILs) in colorectal cancer. It summarizes key techniques, models, and datasets used, alongside evaluation metrics like accuracy, AUC, and F1-score. The study also discusses clinical implications, highlighting how TIL analysis contributes to prognosis and personalized treatment in CRC.
17. Efficient Paddy Grain Quality Assessment Approach Utilizing Affordable Sensors
The paper proposes an efficient method for assessing paddy grain quality using low-cost sensors. It integrates sensor data with machine learning models to evaluate parameters like moisture, impurity, and grain integrity. The approach is affordable, scalable, and designed to benefit small and medium-scale rice producers, improving grain selection and post-harvest processes.
18. Detection of Brain Tumor based on Features Fusion and Machine Learning
This research presents a brain tumor detection system that combines multiple feature extraction techniques and fuses them for classification using ML algorithms. By integrating texture, shape, and intensity features, the method improves tumor detection accuracy in MRI images. It aims to support early diagnosis and reduce reliance on invasive procedures.
19. Skin Cancer Detection: Leveraging Hybrid Deep Learning Models and Traditional Machine Learning Classifiers
The study develops a hybrid model that merges deep learning feature extractors with traditional ML classifiers to detect skin cancer from dermoscopic images. CNNs capture deep features, which are then classified using traditional models such as Random Forest, as sketched below. This hybrid strategy enhances diagnostic precision and supports dermatological decision-making.
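A sketch of this hybrid pattern, assuming a recent torchvision: a pretrained ResNet-18 (standing in for whatever backbone the study used) provides frozen deep features, and a Random Forest does the final classification on synthetic stand-in data.

```python
# Sketch: a pretrained CNN as a frozen feature extractor + Random Forest classifier.
import torch
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.ensemble import RandomForestClassifier

backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()       # drop the ImageNet head, keep 512-d features
backbone.eval()

with torch.no_grad():
    imgs = torch.randn(8, 3, 224, 224)  # stand-ins for preprocessed dermoscopic images
    feats = backbone(imgs).numpy()      # (8, 512) deep feature vectors

labels = [0, 1, 0, 1, 0, 1, 0, 1]       # benign / malignant (synthetic)
clf = RandomForestClassifier(random_state=0).fit(feats, labels)
print(clf.predict(feats[:2]))
```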
20. Artificial Intelligence-Enabled Deep Learning Model for Multimodal Biometric Fusion
This paper introduces an AI-enabled deep learning framework that fuses multiple biometric traits—such as face, fingerprint, and iris—for robust authentication. It utilizes CNN-based feature extraction and fusion layers to combine data, improving recognition accuracy and resistance to spoofing, with potential applications in secure identity verification systems.
21. Symmetry Alignment–Feature Interaction Network for Human Ear Similarity Detection and Authentication
The study presents a deep learning-based method for human ear recognition using a symmetry alignment–feature interaction network. The model aligns ear structures and enhances discriminative features, enabling accurate similarity detection and user authentication. This method is particularly effective in scenarios requiring unobtrusive biometric identification.
22. ML-CSFR: A Unified Crop Selection and Fertilizer Recommendation Framework Based on Machine Learning
This research proposes ML-CSFR, a unified framework that leverages machine learning to recommend optimal crops and corresponding fertilizers based on soil and environmental data. It integrates predictive analytics and recommendation systems to help farmers make informed decisions, enhancing crop yield and sustainable agriculture practices.
23. Detection of Oral Cancer from Clinical Images Using Deep Learning
This study develops a deep learning model to detect oral cancer using clinical images, aiming for early and non-invasive diagnosis. The CNN-based approach identifies cancerous lesions with high accuracy, reducing dependency on biopsies and enabling mass screening, especially in regions with limited access to specialized healthcare.
24. Deep learning based uterine fibroid detection in ultrasound images
The research introduces a deep learning model to detect uterine fibroids in ultrasound images. Using CNN architectures, the system automates lesion identification and segmentation, supporting early diagnosis and treatment planning. It enhances diagnostic efficiency and reduces subjectivity in ultrasound interpretation by medical professionals.
25. From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios
This study focuses on generalizing food recognition systems from controlled environments (e.g., canteens) to real-world daily meal settings. It uses deep learning models trained on diverse datasets to recognize food items under varied lighting, occlusion, and presentation conditions, aiming to support applications in nutrition tracking and dietary assessment.
26. Comparative Study of Forensic Face Recognition and Fingerprint during Crime Scene investigation and the role of Artificial Intelligence tools in Forensics
This study compares the effectiveness of face recognition and fingerprint analysis in forensic investigations, emphasizing how AI tools enhance identification accuracy and investigative efficiency. It analyzes traditional forensic methods alongside AI-driven technologies such as facial recognition algorithms and fingerprint matching models. The paper highlights the complementary role of AI in crime scene analysis, enabling faster suspect identification, improved accuracy, and automation of evidence processing.
27. AI-driven microscopy: from classical analysis to deep learning applications
This research traces the evolution of microscopy from classical image analysis techniques to modern AI-driven approaches, particularly deep learning. It discusses how convolutional neural networks and other DL models are revolutionizing cellular and tissue-level image interpretation in biomedical research. AI-driven microscopy enables automated detection, classification, and quantification of microscopic structures, improving diagnostic speed and precision in fields such as pathology and molecular biology.
28. Empowering the Visually Impaired with a Mobile Application for Document Reading
This project presents a mobile application designed to assist visually impaired users in reading printed documents using AI and optical character recognition (OCR) technology. By converting text to speech, the app allows users to interact with written content in real time. The solution integrates voice navigation, text detection, and language processing, offering a user-friendly and affordable tool for independent living and accessibility enhancement.
29. CO2Wounds-V2: Extended Chronic Wounds Dataset from Leprosy Patients
This study introduces CO2WOUNDS-V2, an extended and annotated dataset of chronic wounds from leprosy patients, aimed at improving wound assessment through AI models. The dataset includes high-resolution images with labeled wound regions, severity gradings, and healing stages. It provides a valuable resource for developing machine learning models for wound detection, classification, and monitoring, contributing to better patient care in dermatology and infectious diseases.
30. Machine Learning-Enabled Drug-Induced Toxicity Prediction
This research explores the use of machine learning algorithms to predict drug-induced toxicity, a critical step in pharmaceutical development and safety assessment. It uses large-scale chemical and biological datasets to train models capable of identifying toxic compounds based on molecular structure and biological activity. The approach improves early-stage drug screening, reducing the cost and time associated with experimental toxicity testing.
31. Revolutionizing Diabetic Retinopathy Screening: Integrating AI-Based Retinal Imaging in Primary Care
This paper discusses the integration of AI-powered retinal imaging tools into primary healthcare settings for early detection of diabetic retinopathy. It highlights the use of deep learning algorithms that analyze fundus images to identify retinal abnormalities with high accuracy. The AI system facilitates timely screening, especially in underserved areas, enabling early intervention and potentially preventing vision loss in diabetic patients.
32. AcciAlert: Instant Accident Detection & Notification
The Traffic Accident Detection System is an automated solution designed to identify traffic accidents in real-time by analyzing video footage from traffic cameras. It employs computer vision and machine learning techniques, utilizing a pre-trained TensorFlow SSD MobileNet model to detect vehicles like cars, buses, and trucks. The system tracks vehicle movements, detects anomalies, and uses Intersection over Union (IoU) metrics to identify high-confidence collisions. Upon detecting an accident, it sends immediate email alerts with images and details to relevant authorities, improving emergency response times. Additionally, it offers an API for users to upload videos, monitor processing status, and access results, ensuring easy integration with traffic management systems. Scalable and capable of processing multiple streams concurrently, this system can be integrated into smart city infrastructures. Future plans include real-time streaming, multi-camera support, AI-enhanced anomaly detection, and linking with external traffic data. Ultimately, it aims to enhance road safety, speed up accident responses, and reduce human intervention in traffic monitoring.
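The IoU test at the heart of the collision check is simple to state; a plain-Python sketch follows (box coordinates and the alert threshold are illustrative).

```python
# Sketch: the Intersection-over-Union check used to flag overlapping vehicle boxes.
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) in pixels.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# Two tracked vehicles whose boxes suddenly overlap heavily may indicate a collision.
print(iou((100, 100, 200, 200), (150, 120, 260, 210)))  # ~0.25; threshold is tunable
```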
33. Detection of Cardiovascular Diseases in ECG Images Using Machine Learning and Deep Learning Methods
This study explores the application of machine learning and deep learning techniques to detect cardiovascular diseases (CVDs) through analysis of ECG (electrocardiogram) images. Techniques like CNNs and support vector machines are employed to extract patterns, classify abnormalities, and identify diseases such as arrhythmia and myocardial infarction. The approach enhances non-invasive diagnosis, providing faster and more accurate detection compared to traditional ECG interpretation.
34. Enhanced Security for ATMs with Facial Recognition Features and OTP
This project proposes a dual-layer ATM security system combining facial recognition and one-time passwords (OTP). Facial authentication ensures biometric verification, while the OTP adds an extra layer of dynamic user validation. By integrating AI-powered facial detection and secure OTP protocols, the system significantly reduces ATM fraud, ensuring safer transactions and improved user experience.
35. Tropical Cyclone Intensity Prediction Using Spatio-Temporal Data Fusion
This research focuses on predicting the intensity of tropical cyclones by fusing spatial and temporal meteorological data using machine learning models. It leverages satellite imagery, pressure readings, wind speeds, and time-series patterns to train predictive algorithms. The fusion technique enhances forecast accuracy, aiding in early disaster warning and effective resource allocation during extreme weather events.
36. Integrating IoT and Machine Learning to Provide Intelligent Security in Smart Homes
This study presents an intelligent smart home security system that combines IoT devices with machine learning algorithms for real-time threat detection and automation. Sensors and cameras collect data which is analyzed using ML to identify intrusions, detect anomalies, and control home devices. The system provides proactive security, energy management, and user convenience in smart living environments.
37. Systematic Review: AI Applications in Liver Imaging with a Focus on Segmentation and Detection
This systematic review evaluates recent advancements in AI techniques applied to liver imaging, particularly in the segmentation and detection of hepatic lesions, tumors, and structural anomalies. It highlights the use of convolutional neural networks and other DL models to enhance accuracy in CT, MRI, and ultrasound image analysis. The review also discusses clinical relevance, challenges, and the potential of AI to support hepatologists in early and precise diagnosis.
38. Hybrid Deep Learning for Detecting Fake Medical Images and Heart Sounds
This research proposes a hybrid deep learning framework to detect fake or manipulated medical images and synthetic heart sounds, addressing growing concerns around medical data integrity. It combines CNNs for image analysis and RNNs or spectrogram-based techniques for audio validation. The model improves healthcare security by ensuring the authenticity of diagnostic media used in AI-driven systems.
39. Review of Poultry Monitoring using Computer Vision
This review examines how computer vision techniques are applied in poultry farming to monitor bird health, behavior, and environmental conditions. It covers applications like weight estimation, disease detection, activity tracking, and automated feeding systems. The review highlights how AI-driven visual monitoring increases efficiency, reduces labor, and ensures animal welfare in modern poultry production.
40. Clinical applications of artificial intelligence in robotic surgery
This paper reviews the integration of artificial intelligence into robotic-assisted surgical systems, emphasizing improvements in precision, safety, and intraoperative decision-making. AI enhances robotic control through real-time image guidance, gesture recognition, and predictive analytics. The study highlights clinical use cases across specialties such as urology, gynecology, and general surgery, showing how AI augments surgeon performance and patient outcomes.
41. Phase I trial of hES cell-derived dopaminergic neurons for Parkinson’s disease
Parkinson's disease (PD) is a progressive neurodegenerative disorder characterized by motor symptoms, including tremors, rigidity, and bradykinesia. Early detection and accurate diagnosis of PD are crucial for effective management and treatment. This study explores the use of advanced machine learning techniques, specifically Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNN), for identifying Parkinson's disease from clinical data and medical imaging. Using the Parkinson.csv dataset, which includes various patient features such as speech and motor data, LSTM networks are employed to model temporal dependencies and predict the likelihood of Parkinson's disease. Additionally, CNNs are applied to Magnetic Resonance Imaging (MRI) scans of Parkinson's patients to detect subtle structural changes in the brain, which can serve as an indicator of disease progression. The study also includes data visualization to illustrate the distribution of Parkinson-affected individuals and compares the model's accuracy in detecting PD using both clinical data and MRI scans. The results show the potential of combining LSTM and CNN for improving the early detection and diagnosis of Parkinson's disease, paving the way for more effective, personalized treatment strategies.
42. A robust fragile watermarking approach for image tampering detection and restoration utilizing hybrid transforms
This project presents a comprehensive approach to image tampering detection and classification using deep learning techniques. The methodology encompasses three primary models: a Convolutional Neural Network (CNN) based on ResNet for binary classification of tampered versus non-tampered images, a Generative Adversarial Network (GAN) for generating synthetic tampered images, and an Autoencoder for detecting anomalies through reconstruction error analysis. The ImageMaskDataset class facilitates loading images and their corresponding labels by evaluating the folder structure. The dataset is split into training and testing subsets, and data loaders are employed for efficient batching. The ResNet model is fine-tuned for binary classification, achieving robust performance through a standard training loop, while the GAN is designed to generate realistic tampered images to augment the training dataset. The Autoencoder model is implemented for unsupervised anomaly detection, leveraging reconstruction errors to classify images as tampered or not. Evaluation metrics such as accuracy, precision, recall, and F1-score are calculated for each model, alongside ROC curves and confusion matrices to visualize performance.
Finally, a user-friendly interface is developed to allow users to upload images for real-time tampering detection, providing insights into the integrity of images based on the trained models. This work highlights the effectiveness of deep learning methodologies in addressing the critical issue of image tampering in various applications.
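A minimal sketch of the Autoencoder branch's logic in PyTorch: train to reconstruct clean images, then flag inputs whose reconstruction error exceeds a statistical threshold. A tiny dense autoencoder and random data stand in for the project's model.

```python
# Sketch: flagging tampered images by autoencoder reconstruction error.
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    def __init__(self, dim=784, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.dec(self.enc(x))

model = TinyAE()
x = torch.rand(16, 784)                     # flattened image batch (synthetic)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):                        # train to reconstruct "clean" images
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), x)
    loss.backward()
    opt.step()

with torch.no_grad():
    err = ((model(x) - x) ** 2).mean(dim=1) # per-image reconstruction error
    threshold = err.mean() + 2 * err.std()  # simple statistical cutoff
    print("tampered flags:", (err > threshold).tolist())
```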
43. A Comprehensive Review of Image Enhancement Techniques
With increasing demand for high-quality images in fields like computer vision, photography, and medical imaging, advanced image enhancement techniques are essential. This project introduces the Improved Representative Color Transform (IRCT) algorithm, designed to enhance the color and spatial quality of low-resolution images by applying a color transformation process.
The IRCT improves color accuracy, sharpness, and visual quality, making it suitable for resource-constrained environments.
To benchmark IRCT, the Super-Resolution Convolutional Neural Network (SRCNN) method is also evaluated. Using the DIV2K dataset, results show that IRCT outperforms SRCNN in key metrics such as PSNR, SSIM, and color accuracy, while being more computationally efficient.
This study highlights the IRCT’s ability to produce enhanced images with superior color accuracy, detail preservation, and reduced computational cost, positioning it as an effective solution for image enhancement.
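The PSNR and SSIM comparisons can be reproduced with scikit-image as sketched below; the arrays are synthetic stand-ins for DIV2K image pairs.

```python
# Sketch: computing the PSNR and SSIM comparison metrics with scikit-image.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((128, 128))                 # stand-in for a DIV2K ground truth
enhanced = np.clip(reference + rng.normal(0, 0.05, reference.shape), 0, 1)

print("PSNR:", peak_signal_noise_ratio(reference, enhanced, data_range=1.0))
print("SSIM:", structural_similarity(reference, enhanced, data_range=1.0))
```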
44. LFDT-Fusion - A latent feature-guided diffusion Transformer model for general image fusion
This paper proposes a novel multi-sensor image fusion method that enhances image contrast and preserves critical structural details across diverse imaging modalities, such as medical, infrared-visible, and multi-focus images.
The fusion process is based on a combination of a Guided Image Filter (GIF), Non-Subsampled Shearlet Transform (NSST), and a Deep Convolutional Neural Network (DCNN).
First, the source images are pre-processed using low-rank representation to decompose them into principal components, which are then enhanced with guided image filtering.
The deep features of these components are extracted using a DCNN to capture the salient characteristics of the images.
Fusion is performed using a weighted-average rule for principal components and a sum rule for salient features to ensure the preservation of complementary information from different sources.
The fused images are evaluated using multiple performance metrics, including Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Fusion Performance Index (FPI), and entropy, alongside subjective visual assessments.
Experimental results demonstrate the superiority of the proposed method in terms of improved image quality, enhanced visual interpretation, and robustness across various sensor modalities.
The method shows significant potential for applications in medical imaging and other real-world scenarios where accurate and high-quality image fusion is critical.
45. Classification of E-Waste Types Using Machine Learning and Digital Image Processing
This study presents a method for classifying different types of electronic waste (e-waste) by combining digital image processing techniques with machine learning algorithms. High-resolution images of e-waste items are processed to extract features such as shape, texture, and color. These features are then used to train classifiers like SVM, Random Forest, or CNNs to accurately identify and categorize various e-waste components. The approach aims to improve automated sorting in recycling facilities, promoting efficient waste management and environmental sustainability.
46. A robust underwater image enhancement algorithm
Underwater images often suffer from color distortion, low contrast, and diminished details due to light absorption and scattering. This study proposes an advanced underwater image enhancement algorithm combining adaptive color correction and an improved Retinex approach to address these challenges. Utilizing the UIEB dataset, the method first applies adaptive color correction to neutralize blue-green bias, followed by image decomposition and enhancement via Non-Subsampled Shearlet Transform (NSST) and Retinex-based techniques. Performance is evaluated using metrics like PCQI, UCIQE, UIQM, and Information Entropy (IE), with SIFT-based feature matching validating structural preservation. Results demonstrate superior performance over state-of-the-art methods in perceptual quality and detail enhancement. The implementation leverages Python (Spyder IDE), MySQL (WAMP Server), and Flask for front-end deployment, offering a robust solution for underwater image restoration.
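As one simple concrete form of adaptive color correction, a gray-world balance scales each channel toward the global mean. The sketch below (NumPy/OpenCV, hypothetical file names) is a baseline illustration, not the paper's exact adaptive scheme.

```python
# Sketch: gray-world color balance to neutralize the blue-green underwater cast.
import cv2
import numpy as np

def gray_world(img_bgr):
    img = img_bgr.astype(np.float32)
    means = img.reshape(-1, 3).mean(axis=0)          # per-channel means (B, G, R)
    gains = means.mean() / means                     # push each channel toward gray
    balanced = np.clip(img * gains, 0, 255)
    return balanced.astype(np.uint8)

img = cv2.imread("underwater.jpg")                   # hypothetical UIEB sample
cv2.imwrite("balanced.jpg", gray_world(img))
```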
47. Preserving medical information from doctor’s prescription ensuring relation among the terminology
This project focuses on automating the extraction of medicine names from handwritten prescriptions using transfer learning techniques. The approach involves pre-processing steps such as image resizing, augmentation, grayscale conversion, and noise reduction to prepare the dataset of handwritten prescription images. YOLO is employed for object detection to identify regions where medicine names are written, while transfer learning fine-tunes pre-trained models like MobileNetV2 and DenseNet121 for efficient feature extraction and text recognition. The system predicts medicine names by detecting relevant regions with YOLOv5, applying Optical Character Recognition (OCR) for text extraction, and performing post-processing to clean and standardize the results. Model performance is evaluated using metrics such as accuracy, precision, recall, F1-score, and AUC-ROC. The system also includes visualization techniques like bounding box overlays and confusion matrices to assess and display results effectively.
48. Toward a privacy-preserving face recognition system: A survey of leakages and solutions
This project develops an automated face recognition system to determine whether an individual is authorized based on facial features. It employs two algorithms: Eigenfaces (PCA-based) and Convolutional Neural Networks (CNNs).
The Eigenfaces approach reduces facial image dimensions using Principal Component Analysis (PCA) to capture key features and compares them to a database of authorized faces for classification. The CNN approach, on the other hand, uses deep learning to automatically learn and compare complex facial features for more robust and accurate recognition. The system outputs "authorized" or "unauthorized" based on the similarity between the input image and stored templates, with the CNN approach providing higher accuracy in varied and real-world conditions.
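A minimal Eigenfaces sketch with scikit-learn: PCA learns the eigenface basis from enrolled images, and a nearest-template distance decides authorization. The arrays, component count, and threshold are all illustrative.

```python
# Sketch: Eigenfaces pipeline - PCA projection plus nearest-template matching.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
enrolled = rng.random((20, 64 * 64))     # 20 authorized face images, flattened
pca = PCA(n_components=10).fit(enrolled) # eigenfaces = top principal components
templates = pca.transform(enrolled)

def authorize(face, threshold=2.0):
    coeffs = pca.transform(face.reshape(1, -1))
    dist = np.linalg.norm(templates - coeffs, axis=1).min()
    return "authorized" if dist < threshold else "unauthorized"

print(authorize(enrolled[3]))            # an enrolled face matches its own template
print(authorize(rng.random(64 * 64)))    # an unseen face is likely rejected
```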
49. Lung Cancer Detection Systems Applied to Medical Images: A State-of-the-Art Survey
This project aims to detect both lung and colon cancer from medical images using advanced machine learning techniques. The dataset, sourced from Kaggle, undergoes preprocessing steps such as resizing, normalization, grayscale conversion, denoising, and Gabor filtering to enhance feature extraction. The Tuna Swarm Algorithm (TSA) optimizes model parameters, while the Echo State Network (ESN) classifies whether an image indicates cancer (benign/malignant) for both lung and colon tissues. The data is split into training (70-80%) and test (20-30%) sets, with performance evaluated using accuracy, precision, recall, F1-score, AUC-ROC, and confusion matrix. A Flask-based web interface allows users to upload medical images for real-time predictions, which can then be reviewed by healthcare professionals for further diagnosis. The system is developed in Python using Anaconda (Spyder), ensuring an efficient and scalable solution for early cancer detection.
50. Alzheimer's disease neuropathology and its estimation with fluid and imaging biomarkers
Alzheimer's disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide. Early detection is critical for managing the disease and providing patients with timely interventions.
This study proposes a machine learning-based approach for detecting Alzheimer's disease using a dataset of neuroimaging and clinical data. The dataset includes MRI scans, genetic information, and cognitive test results from a large cohort of individuals with varying stages of Alzheimer's.
We explore the effectiveness of several machine learning models, including Random Forests, and Convolutional Neural Networks (CNN), to classify individuals as either having Alzheimer's or being cognitively healthy.
The results show that the proposed method achieves high classification accuracy, with CNN-based models outperforming traditional methods in terms of sensitivity and specificity. This approach offers a promising tool for early detection of Alzheimer's disease, contributing to more efficient diagnosis and personalized treatment planning.
51. LAMB: An open-source software framework to create artificial intelligence assistants deployed and integrated into learning management systems
This study benchmarks probabilistic deep learning methods for license plate recognition (LPR), focusing on enhancing accuracy and reliability under real-world conditions. Utilizing a dataset of license plate images, the approach includes comprehensive preprocessing steps such as resizing, normalization, augmentation, and super-resolution to handle low-quality inputs. The dataset is split into training, validation, and testing subsets, with the test set emphasizing out-of-distribution (OOD) scenarios. The system employs convolutional neural networks (CNNs) and probabilistic approaches such as SR2 methods to estimate prediction uncertainty. A multi-task learning model is introduced to simultaneously perform LPR and image super-resolution, leveraging shared features for improved performance. Evaluation metrics include accuracy, precision, recall, F1-score, mean squared error, and novel uncertainty-based measures such as prediction uncertainty and error detection rate. Results demonstrate a 109% accuracy improvement with the multi-task model and a 29% increase in error detection via uncertainty quantification, highlighting the system's robustness and practical value in uncertain environments.
52. Diagnosis and typing of leukemia using a single peripheral blood cell through deep learning
This study proposes a deep learning model that analyzes images of individual peripheral blood cells to diagnose and classify different types of leukemia. By leveraging convolutional neural networks (CNNs), the system extracts detailed morphological features from single-cell images, enabling precise identification of leukemia subtypes. This approach offers a rapid, non-invasive, and accurate alternative to traditional multi-cell and lab-intensive diagnostics.
53. Diabetic Foot Ulcers Detection Model Using a Hybrid Convolutional Neural Networks–Vision Transformers
This research introduces a hybrid deep learning architecture combining convolutional neural networks (CNNs) with vision transformers (ViTs) to detect diabetic foot ulcers from medical images. The CNN layers capture local spatial features while the transformer components model long-range dependencies, improving detection accuracy and robustness. This model aims to assist clinicians in early diagnosis and monitoring of diabetic foot complications, reducing the risk of severe outcomes.
54. IAE-CDNet: A Remote Sensing Change Detection Network for Buildings With Interactive Attention-Enhanced
Currently, the development of deep learning has had a positive impact on remote sensing image change detection tasks, but many current methods still face challenges in effectively processing global and local features, especially in the task of building change detection in high-resolution images containing complex scenes. The extraction of target-related features is typically difficult, and changes in scene conditions further increase the difficulty of identifying real changes. To address these challenges, we propose the interactive attention-enhanced change detection network (IAE-CDNet). We design the local–global interaction attention module, which effectively establishes the interactive relationship between local and global features and realizes information interaction between branches, enhancing the ability to obtain architectural detail features. Additionally, our change perception attention enhancement module enhances the feature perception ability of the real change area through the joint action of the internal comprehensive feature extractor and the fusion attention mechanism. We conduct extensive experiments on three datasets. Results indicate that the evaluation indicators and performance of our IAE-CDNet are better than those of other state-of-the-art methods.
55. AlertDriveNet - A Deep Learning Model for Predicting Driver Distraction
AlertDriveNet is a deep learning-based model designed to predict driver distraction by analyzing real-time visual and behavioral data. Using convolutional neural networks (CNNs) to process images or video frames from in-cabin cameras, the model detects signs of inattention such as gaze diversion, head pose, and hand movements. By accurately identifying distracted driving states, AlertDriveNet aims to enhance road safety through timely alerts and preventive interventions.
56. Plant Leaf Identification Using Feature Fusion of Wavelet Scattering Network and CNN With PCA Classifier
Accurate plant species identification is vital in medicine, agriculture, and the food industry, yet traditional machine learning methods often struggle to capture complex leaf features. Deep learning models, especially Convolutional Neural Networks (CNNs), excel in this task but typically require large datasets and significant computational power. To address these challenges, we propose a novel hybrid approach combining Wavelet Scattering Networks (WSNs) and MobileNetV2 for efficient leaf classification. WSNs extract texture patterns using fixed filters without needing training, making them effective with smaller datasets. MobileNetV2 complements this by capturing high-level features such as shapes and edges. The combined features are classified using a Principal Component Analysis (PCA)-based classifier, reducing feature redundancy and improving accuracy. The model was evaluated on Flavia and Folio datasets, achieving accuracies of 98.75% and 98.7%, respectively. Further testing on the Cope dataset demonstrated scalability across diverse classes, while evaluation on the UK Leaf dataset showed robustness under varying backgrounds and noise. This method balances high accuracy with reduced computational requirements, offering a practical solution for automated leaf classification in resource-constrained environments.
57. Jointly Learning From Unimodal and Multimodal-Rated Labels in Audio-Visual Emotion Recognition
Audio-Visual Emotion Recognition (AVER) plays a crucial role in advancing human-computer interaction by interpreting emotional states through multimodal data. Traditional AVER models rely on annotations derived from raters observing audio-visual stimuli, often overlooking the nuanced variations in human emotional perception across different stimulus modalities. This study explores the impact of integrating annotation labels obtained from audio-only, face-only, and combined audio-visual stimuli to enhance AVER system performance. We propose a novel two-stage training framework that leverages these diverse annotation stimuli, assigning them to corresponding model layers to effectively capture unimodal and multimodal emotional perceptions. Experiments conducted on the CREMA-D database demonstrate that our approach achieves superior macro and weighted F1-scores compared to conventional methods. Furthermore, we assess model calibration, performance bias, and fairness across demographic factors such as age, gender, and race, highlighting the robustness and equitable behavior of the proposed system. Our findings underscore the importance of incorporating multimodal annotation variability in training to better reflect human emotion perception and improve the accuracy and fairness of AVER systems.
58. Small Object Detection Method for UAV Remote Sensing Images Based on αS-YOLO
Accurate detection of small objects in remote sensing imagery, especially for unmanned aerial vehicle (UAV) applications, remains challenging due to difficulties in precise localization and capturing global feature dependencies while maintaining a lightweight network. To address these issues, we propose αS-YOLO, a novel object detection framework built on the YOLO architecture. αS-YOLO introduces a new cross-convolution design with two filters: a global context module and an efficient channel attention module. This combination enhances the modeling of long-range dependencies by aggregating global contextual information at the pixel level, all while reducing network parameters to preserve a lightweight structure. Additionally, we develop α-SIOU, a novel loss function featuring an adaptive angular control coefficient that dynamically adjusts the distance loss based on angular variations between predicted and ground truth bounding boxes. This adaptive mechanism improves small object localization accuracy by focusing the training on the most critical directions. Experiments on the VisDrone-DET2019 dataset demonstrate that αS-YOLO outperforms the state-of-the-art YOLOv8, achieving improvements of 1% in mAP and 1.1% in mAP50 for small vehicle detection. These results highlight the framework’s effectiveness in balancing accuracy and efficiency for UAV-based small object detection.
59. Continuous Sign Language Recognition With Multi-Scale Spatial-Temporal Feature Enhancement
Continuous Sign Language Recognition (CSLR) plays a vital role in bridging communication gaps for hard of hearing and mute individuals by translating gestures into natural language. Existing CSLR methods often underutilize fine-grained continuous frame information and lack effective multi-scale feature integration during decoding. To address these challenges, we propose STNet, a spatial-temporal feature-enhanced network tailored for CSLR tasks. STNet introduces a spatial resonance module based on the optimal transport algorithm to extract global spatial features across adjacent frames, enhancing continuous frame information capture. Additionally, a novel frame-wise loss preserves distinctive features of each frame, improving representation. On the decoder side, a multi-temporal perception module is designed to facilitate multi-scale feature fusion, enabling frames to attend over larger temporal ranges for richer contextual understanding. Extensive evaluations on three benchmark datasets—PHOENIX14, PHOENIX14-T, and CSL-Daily—show that STNet consistently outperforms state-of-the-art methods, achieving a significant 2.9% improvement in recognition accuracy. Ablation studies confirm the effectiveness of each proposed module. STNet lays a strong foundation for real-world applications such as sign language education and communication tools, advancing CSLR research and practical deployment.
60. Adversarial Network-Based Classification for Alzheimer’s Disease Using Multimodal Brain Images: A Critical Analysis
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that represents a significant and growing public health challenge. This work concisely summarizes AD, encompassing its pathophysiology, risk factors, clinical manifestations, diagnosis, treatment, and ongoing research. The main goal of managing AD is to reduce symptoms while improving the lives of those impacted. This letter presents a systematic review of AD prediction, conducted under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The major scientific databases Scopus, Web of Science (WoS), and IEEE Xplore are explored, considering publications from 2018–2023. The article selection process is based on keywords like “Alzheimer’s disease,” “Brain Images,” “Deep Learning (DL),” etc. After rigorous analysis, 946 articles were extracted, and 42 were identified for final consideration. Further, several investigations based on the previous work are discussed along with their Proposed Solutions (PS). Finally, a case study on AD detection using the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset and an AD Detection Network (ADD-NET) implementation is presented.
61. An End-to-End Concatenated CNN Attention Model for the Classification of Lung Cancer with XAI Techniques
Deep learning (DL) has significantly advanced cancer detection and classification in medical imaging, addressing the need for accurate and efficient diagnosis. Lung cancer detection, in particular, requires precise localization of cancerous regions to guide treatment decisions. Manual diagnosis is time-consuming and depends heavily on expert radiologists, limiting scalability amid rising case numbers. To overcome these challenges, we propose an end-to-end concatenated Convolutional Neural Network (CNN) attention model for automatic lung cancer classification. This model integrates two CNN architectures followed by a multi-layer perceptron (MLP) and a multi-head attention (MHA) mechanism to improve feature representation and classification accuracy. Additionally, explainable AI techniques such as gradient-weighted class activation mapping (grad-CAM) and Shapley additive explanations (SHAP) are employed to highlight image regions influencing the model’s predictions, enhancing interpretability. Evaluated on a comprehensive lung cancer dataset, the model achieves outstanding performance with 99.54% accuracy, 99.31% precision, 99.95% recall, 99.66% F1-score, and 99.97% AUC. This approach not only outperforms existing methods but also offers a reliable and interpretable diagnostic tool, reducing the need for manual intervention and enabling faster, more precise lung cancer detection, ultimately supporting timely and effective patient care.
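Of the two XAI techniques named, Grad-CAM is straightforward to sketch with forward/backward hooks in PyTorch; here a stock ResNet-18 and a random input stand in for the paper's concatenated attention model.

```python
# Sketch: a minimal Grad-CAM over a ResNet feature layer (PyTorch/torchvision).
import torch
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
acts, grads = {}, {}
layer = model.layer4  # last convolutional stage
layer.register_forward_hook(lambda m, i, o: acts.update(v=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(v=go[0]))

x = torch.randn(1, 3, 224, 224)       # stand-in for a preprocessed CT slice
score = model(x)[0].max()             # top class score
score.backward()                      # populate gradients at the hooked layer

weights = grads["v"].mean(dim=(2, 3), keepdim=True)   # global-average-pooled grads
cam = torch.relu((weights * acts["v"]).sum(dim=1))    # weighted activation map
cam /= cam.max()
print(cam.shape)  # (1, 7, 7): a coarse heatmap, upsampled over the input in practice
```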
62. Copy-Move Forgery Detection Technique Using Graph Convolutional Networks Feature Extraction
63. Achieving Faster and Smarter Chest X-Ray Classification With Optimized CNNs
X-ray imaging is essential in medical diagnostics, particularly for identifying anomalies like respiratory diseases. However, building accurate and efficient deep learning models for X-ray image classification remains challenging, requiring both optimized architectures and low computational complexity. In this paper, we present a three-stage framework to enhance X-ray image classification using Neural Architecture Search (NAS), Transfer Learning, and Model Compression via filter pruning, specifically targeting the ChestX-Ray14 dataset. First, NAS is employed to automatically discover the optimal convolutional neural network (CNN) architecture tailored to the ChestX-Ray14 dataset, reducing the need for extensive manual tuning. Subsequently, we leverage transfer learning by incorporating pre-trained models, which enhances the model’s generalizability and reduces dependency on large volumes of labeled X-ray data. Finally, model compression through filter pruning, driven by evolutionary algorithms, trims redundant parameters to improve computational efficiency while preserving model accuracy. Experimental results demonstrate that this approach not only boosts classification accuracy on the ChestX-Ray14 dataset but also significantly reduces model size, making it suitable for deployment in resource-constrained environments, such as mobile and edge devices. This framework provides a practical, scalable solution to improve both the accuracy and efficiency of medical image classification.
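The compression stage above prunes redundant filters. The paper drives pruning with evolutionary algorithms; the simpler sketch below uses an L1-magnitude criterion just to illustrate the mechanics of making a convolutional layer thinner.

```python
# Magnitude-based filter pruning: rank a conv layer's filters by L1 norm and
# keep only the strongest ones, producing a thinner layer.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a thinner Conv2d keeping the filters with the largest L1 norm."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per filter
    keep = torch.argsort(l1, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       conv.stride, conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

layer = nn.Conv2d(3, 64, 3, padding=1)
print(prune_conv_filters(layer, 0.25))  # Conv2d(3, 16, ...)
```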
64. Comprehensive Lung Disease Detection Using Deep Learning Models and Hybrid Chest X-ray Data with Explainable AI
Advanced diagnostic instruments are crucial for the accurate detection and treatment of lung diseases, which affect millions of individuals globally. This study examines the effectiveness of deep learning and transfer learning models using a hybrid dataset created by merging four individual datasets from Bangladesh and global sources. The hybrid dataset significantly enhances model accuracy and generalizability, particularly in detecting COVID-19, pneumonia, lung opacity, and normal lung conditions from chest X-ray images. A range of models, including CNN, VGG16, VGG19, InceptionV3, Xception, ResNet50V2, InceptionResNetV2, MobileNetV2, and DenseNet121, are applied to both the individual and hybrid datasets. The results show superior performance on the hybrid dataset, with VGG16, Xception, ResNet50V2, and DenseNet121 each achieving an accuracy, precision, recall, and F1-score of 99%. This consistent performance across the hybrid dataset highlights the robustness of these models in handling diverse data while maintaining high accuracy. To understand the models’ implicit behavior, explainable AI techniques are employed to illuminate their black-box nature.
65. AI-Powered IoT: A Survey on Integrating Artificial Intelligence With IoT for Enhanced Security, Efficiency, and Smart Applications
The Internet of Things (IoT) has revolutionized connectivity by linking everyday physical objects to the internet, enabling seamless communication and smarter services. The integration of Artificial Intelligence (AI) with IoT, known as AI-enabled IoT (AIoT), enhances this ecosystem by enabling intelligent data analysis and automation, simplifying complex tasks with improved efficiency. This paper surveys the foundational architectures of IoT and AIoT, highlighting their roles in building robust, scalable networks. Emphasizing security, the study reviews state-of-the-art machine learning (ML) and deep learning (DL) approaches tailored for IoT environments, including anomaly and intrusion detection, authentication, access control, DDoS attack prevention, and malware analysis. Additionally, the paper explores how AIoT optimizes network performance and secures infrastructures while addressing emerging challenges. Cutting-edge technologies such as blockchain, 6G-enabled AIoT, federated learning, and hyperdimensional computing are discussed for their transformative potential across various applications, including healthcare and autonomous systems. This comprehensive overview demonstrates how AIoT is poised to drive innovation and resilience in connected systems, ensuring smarter, safer, and more efficient IoT deployments.
66. Enhancing Chronic Disease Prediction in IoMT-Enabled Healthcare 5.0 Using Deep Machine Learning: Alzheimer’s Disease as a Case Study
Chronic diseases significantly affect health on a global scale, and deep machine learning algorithms have found widespread application in their diagnosis. Early diagnosis and treatment reduce the chance of a disease worsening and, as a result, lower the related mortality. The main objective of this work is to present a deep machine learning-based approach that delivers better accuracy. These findings have significance for tailored Healthcare 5.0, enabling healthcare professionals to predict chronic disease more efficiently. A comparative examination of the most recent methods reveals that it is advantageous to use the proposed model, in which the MRI is segmented using a U-Net architecture and classification is then performed using transfer learning for chronic disease prediction. The proposed model achieves 96.06% accuracy, advancing our understanding of deep machine learning’s potential for chronic disease prediction and emphasizing the need to tailor model selection to specific disease types using data from IoMT-enabled devices. To further advance Healthcare 5.0, future studies should focus on refining these models and investigating how well they generalize to a wider range of datasets.
67. D-DDPM: Deep Denoising Diffusion Probabilistic Models for Lesion Segmentation and Data Generation in Ultrasound Imaging
The Denoising Diffusion Probabilistic Model (DDPM) has gained significant attention for its powerful image generation and segmentation capabilities, particularly in biomedical applications where accuracy is critical. In breast cancer detection, ultrasound imaging is widely used due to its safety, affordability, and non-ionizing nature. However, the inherent challenges of ultrasound data, such as noise and artifacts, make accurate tumor segmentation difficult, often leading to misdiagnosis. We propose a novel Deep Denoising Probabilistic Diffusion Model (D-DDPM) designed to enhance tumor segmentation in breast ultrasound images to address these limitations. Our model incorporates a nested U-Net architecture with Residual U-blocks (RSU), significantly improving feature learning and segmentation precision. In addition to performing segmentation, D-DDPM generates synthetic data, augmenting existing real datasets to improve data size with a diverse range of high-quality samples. We validated D-DDPM on several breast ultrasound datasets, comparing its performance to state-of-the-art methods. The proposed D-DDPM achieves a Dice score improvement of 2.26%, 4.24%, and 5% over the runner-up model, demonstrating superior performance on all BUS datasets. Both qualitative and quantitative results demonstrate the ability of D-DDPM to deliver more accurate and reliable segmentation results, offering promising potential to improve clinical decision-making in cancer diagnosis.
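Every DDPM variant shares the closed-form forward (noising) process q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I). The numpy sketch below shows that step with a standard linear beta schedule; it is illustrative background, not the D-DDPM code, and the image is a random stand-in.

```python
# Closed-form forward diffusion: sample x_t directly from x_0.
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)      # abar_t = prod_{s<=t} (1 - beta_s)

def q_sample(x0, t, rng=np.random.default_rng(0)):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.random.rand(128, 128)             # stand-in for an ultrasound image
print(q_sample(x0, t=500).std())          # noise dominates at large t
```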
68. Facial Emotion Recognition (FER) Through Custom Lightweight CNN Model: Performance Evaluation in Public Datasets
Facial Emotion Recognition (FER) remains a challenging yet vital task across various applications such as human-computer interaction and affective computing. Traditional deep learning methods like Convolutional Neural Networks (CNNs) offer strong performance but often demand significant computational resources, limiting their practical use in real-time and resource-constrained environments. To address this, we propose a lightweight CNN model named Custom Lightweight CNN-based Model (CLCM), inspired by the MobileNetV2 architecture. CLCM significantly reduces parameters (2.3 million) compared to MobileNetV2 (3.5 million) and ShuffleNetV2 (3.9 million) while maintaining competitive accuracy. Evaluations on four public datasets—FER-2013, RAF-DB, AffectNet, and CK+—demonstrate that CLCM achieves comparable or better results than existing lightweight models. For example, on FER-2013, CLCM reached 63% accuracy versus MobileNetV2’s 58%, and on RAF-DB, it achieved 84% compared to MobileNetV2’s 73%. The model’s reduced computational demand and reliable performance make it well-suited for real-world applications such as real-time driver emotion detection, medical assessments, and care for vulnerable individuals, where efficient, fast, and accurate FER is critical.
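The parameter savings of MobileNetV2-inspired designs come largely from inverted residual blocks (expand, depthwise, project). Below is a minimal PyTorch sketch of that building pattern; the channel sizes are illustrative, not CLCM's actual layout.

```python
# MobileNetV2-style inverted residual block: 1x1 expand -> 3x3 depthwise ->
# 1x1 linear projection, with a skip connection when shapes allow.
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, ch_in, ch_out, expand=6, stride=1):
        super().__init__()
        hidden = ch_in * expand
        self.use_skip = stride == 1 and ch_in == ch_out
        self.block = nn.Sequential(
            nn.Conv2d(ch_in, hidden, 1, bias=False),        # expand
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1,
                      groups=hidden, bias=False),           # depthwise
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, ch_out, 1, bias=False),       # project (linear)
            nn.BatchNorm2d(ch_out))

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_skip else out

print(InvertedResidual(32, 32)(torch.randn(1, 32, 48, 48)).shape)
```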
69. Helmet And Number Plate Detection Using Deep Learning
In India, six two-wheeler riders die every hour in road accidents. We also observed during the pandemic that people wear masks and, to avoid congestion, skip helmets; this drew our concern, and we decided to build a system through which helmetless riders can be penalized for violating traffic rules. To achieve an efficient helmet detection model, we use the YOLOv5 object detection model with transfer learning. To check whether a biker is wearing a helmet, we apply two methods: first, checking for overlap between the rider and helmet bounding boxes, and second, checking whether a helmet exists within a specified range of coordinates above the motorcycle. Our model achieves a mAP of 0.995 and, to the best of our knowledge, is the first to use the overlap method to interlink objects and identify riders without helmets. For number plate recognition we use EasyOCR.
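A small sketch of the two linking checks described above: (a) bounding-box overlap between a detected helmet and a detected rider, and (b) reading the plate crop with EasyOCR. The box coordinates and the file name are hypothetical placeholders.

```python
# Boxes are (x1, y1, x2, y2) in pixel coordinates.
import easyocr

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter + 1e-9)

def rider_has_helmet(helmet_boxes, rider_box, thresh=0.1):
    """Method 1: any helmet box overlapping the rider box counts as worn."""
    return any(iou(h, rider_box) > thresh for h in helmet_boxes)

# Hypothetical plate recognition on a cropped image file.
reader = easyocr.Reader(['en'])
plate_text = reader.readtext('plate_crop.jpg', detail=0)  # list of strings
```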
70. Enhancing Plant Disease Detection Using Attention-Augmented Residual Networks and Faster Region–Convolutional Networks
Rapid and accurate detection of plant diseases is crucial for agricultural productivity and food security. Traditional methods are labor-intensive and often unreliable. To overcome these limitations, this research introduces an innovative approach that integrates attention mechanisms into residual networks (ResNets) and utilizes Generative Adversarial Networks (GANs) for data augmentation. The method incorporates Attention-Augmented Residual Networks (AARN), which enhance feature extraction and classification by focusing on critical image regions. A Conditional GAN (cGAN) generates synthetic images of diseased and healthy plants, increasing dataset diversity. By combining AARN with Faster Region-Convolutional Neural Network (Faster-RCNN), detection capabilities are further enhanced. Training the AARN model on this augmented dataset improves generalization, achieving an impressive 98.78% accuracy in plant disease classification. The attention-augmented residuals boost the Faster-RCNN’s effectiveness by 23.84%, improving feature relevance and reducing overfitting. Comparative analysis shows that this method outperforms existing techniques in accuracy, precision, recall, and F1-score, offering a robust solution for plant disease detection. This integration of advanced deep learning techniques significantly improves automated plant disease identification, benefiting agricultural management practices.
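For the detection stage, torchvision ships a ready Faster R-CNN that can be re-headed for a custom label set. The sketch below does that for a hypothetical set of plant-disease classes; the AARN backbone and cGAN augmentation from the paper are not reproduced here.

```python
# Load a pretrained Faster R-CNN and swap its box predictor for new classes.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 4  # e.g., background + 3 disease categories (assumed)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_feats = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feats, num_classes)

model.eval()
with torch.no_grad():
    preds = model([torch.rand(3, 512, 512)])   # one fake leaf image
print(preds[0]["boxes"].shape, preds[0]["labels"][:5])
```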
71. Next-Generation Automation in Neuro-Oncology: Advanced Neural Networks for MRI-Based Brain Tumor Segmentation and Classification
Brain tumors pose a critical healthcare challenge due to their complexity and lethal potential, requiring timely and accurate diagnosis for effective treatment. Magnetic Resonance Imaging (MRI) is the standard diagnostic tool but depends heavily on expert manual analysis, which can be time-consuming and inconsistent. To address these challenges, we propose a novel multi-task learning framework employing advanced neural network architectures to automate brain tumor detection, segmentation, and classification from MRI scans. Our approach aims to streamline diagnosis, reduce manual intervention, and provide rapid, reliable results to facilitate earlier treatment. We evaluated three neural architectures—UNet, Attention-UNet, and Residual-Attention-UNet—on benchmark datasets. The Residual-Attention-UNet outperformed others with superior segmentation accuracy and classification precision, achieving a Jaccard Similarity Index of 89.30%, Dice Coefficient of 91.10%, and overall accuracy of 93.35%. For binary classification, it attained 98.60% precision, 98.06% recall, 99.40% accuracy, and an F1 score of 96.57%. In multiclass tasks, the model consistently exceeded 95% across all metrics, demonstrating robustness. These results validate the effectiveness of our multi-task learning approach in enhancing brain tumor diagnosis accuracy and efficiency.
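For reference, the two segmentation metrics quoted above (Jaccard/IoU and Dice) are simple overlap ratios on binary masks; a minimal numpy sketch:

```python
# Dice coefficient and Jaccard index on binary segmentation masks.
import numpy as np

def dice(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def jaccard(pred, target, eps=1e-7):
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

pred = np.zeros((64, 64), bool); pred[10:40, 10:40] = True
gt = np.zeros((64, 64), bool);   gt[15:45, 15:45] = True
print(f"Dice={dice(pred, gt):.3f}, IoU={jaccard(pred, gt):.3f}")
```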
72. PDCNET: Deep Convolutional Neural Network for Classification of Periodontal Disease Using Dental Radiographs
Dental health is critical to overall well-being, with dental caries being one of the most widespread health issues globally. Traditional diagnostic methods, such as visual inspections and radiographic analysis, rely heavily on expert professionals and can be time-consuming and prone to inaccuracies. To address these limitations, this study proposes a novel deep learning model, the Periodontal Disease Classification Network (PDCNET), based on Convolutional Neural Networks (CNN), for accurate diagnosis of periodontal disease using dental radiographs. The PDCNET was evaluated on two publicly available dental caries datasets, with imbalanced classes addressed using the SMOTE TOMEK technique to generate synthetic samples for minority categories. The model achieved outstanding performance, including a 99.79% AUC, 98.39% accuracy, recall, precision, and a 98.31% F1-score. Comparative analysis with six baseline pre-trained models—EfficientNet-B0, DenseNet-201, VGG-16, VGG-19, Inception-V3, and MobileNet—showed PDCNET’s superior accuracy and robustness. These results demonstrate the effectiveness of PDCNET in supporting dental disease diagnosis, offering a reliable and efficient tool to assist dentists and improve patient care.
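The SMOTE-Tomek rebalancing step above is available off the shelf in imbalanced-learn; the sketch below applies it to a synthetic imbalanced dataset standing in for extracted radiograph features.

```python
# SMOTE oversampling followed by Tomek-link cleaning for class imbalance.
from collections import Counter
from imblearn.combine import SMOTETomek
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print("before:", Counter(y))
X_res, y_res = SMOTETomek(random_state=42).fit_resample(X, y)
print("after: ", Counter(y_res))  # minority class oversampled to near parity
```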
73. Detection Of Fake Indian Currency Using Deep Learning
Counterfeit currency remains a major threat due to the high face value, durability, and ease of storage and transfer of paper money. Although hardware-based detection methods exist, they are often complex and inaccessible to ordinary individuals. This paper presents a hybrid deep learning approach combining Convolutional Neural Networks (CNN) and Residual Networks (ResNet) to accurately detect counterfeit currency. The proposed system processes scanned currency images through stages of image preprocessing, feature extraction, and classification. Preprocessing enhances image quality and reduces noise, while CNN and ResNet architectures extract critical features such as width, colors, and serial numbers to distinguish genuine notes from counterfeit ones. Focusing on Indian currency, the system demonstrates strong applicability in banking and other sectors where counterfeit notes pose a rising concern. The deep learning model is optimized to operate on handheld devices such as smartphones and tablets, enabling convenient and real-time detection. Experimental results show the system achieves an impressive testing accuracy of approximately 98.3%, highlighting its potential as an accessible, efficient, and accurate tool for counterfeit currency identification.
74. uFOIL: An Unsupervised Fusion of Image Processing and Language Understanding
In academic institutions, processing and evaluating documents such as exam scripts remains a labor-intensive process susceptible to human error. Traditional digitization systems face significant challenges in handling the complexities of mixed handwritten and printed text and varying document structures. These challenges are exacerbated by the absence of annotated datasets due to privacy concerns, particularly in contexts involving sensitive exam evaluations. To address these issues, this study introduces uFOIL, an unsupervised ensemble-based framework that integrates advanced image and language processing techniques to automate the extraction and validation of key information. The framework employs a majority voting mechanism that combines four state-of-the-art optical character recognition systems. Furthermore, a transformer architecture is incorporated to enhance contextual understanding and the structured formatting of extracted text that follows a post-processing confidence scoring mechanism. The proposed framework achieves high performance, with accuracies of 95.77% and 96.48% for student names and IDs, respectively; and 95.07% for total mark validation based on a dataset of exam script samples. Additionally, experiments on the benchmark ICDAR2013 dataset suggest the framework’s strong applicability achieving precision, recall, and F1-scores of 95.89%, 97.86%, and 96.87%, respectively.
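The majority-voting idea over several OCR engines can be sketched in a few lines: each engine reads the same field, and the per-position consensus wins. The engine outputs below are made-up examples, and real systems would first align readings of unequal length.

```python
# Character-level majority vote across multiple OCR readings of one field.
from collections import Counter

def majority_vote(readings):
    width = max(len(r) for r in readings)
    padded = [r.ljust(width) for r in readings]
    return "".join(Counter(col).most_common(1)[0][0]
                   for col in zip(*padded)).rstrip()

readings = ["ID-20481", "IO-20481", "ID-20487", "ID-2O481"]  # four OCR engines
print(majority_vote(readings))  # -> "ID-20481"
```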
75. Machine Learning-Based Cardiovascular Disease Detection Using Optimal Feature Selection
Cardiovascular disease (CVD) remains a leading cause of global mortality, responsible for approximately 19.1 million deaths in 2022, according to the World Health Organization. Early detection is critical to reduce fatalities, with electrocardiogram (ECG) signals widely used for automated diagnosis through machine learning. However, selecting optimal features from ECG data presents challenges that impact predictive accuracy. To address this, we propose a scalable machine learning architecture for early CVD detection based on advanced feature selection methods. The system incorporates data collection, storage, and processing modules and employs feature selection techniques such as Fast Correlation-Based Filter (FCBF), Minimum Redundancy Maximum Relevance (mRMR), Relief, and Particle Swarm Optimization (PSO) to identify the most relevant features from ECG signals. Classifiers including Extra Trees and Random Forest trained on these optimized features achieve impressive accuracy rates of 100%. Comparative analyses with existing state-of-the-art methods on both small and large datasets demonstrate the superiority of the proposed approach. This architecture offers a promising solution to enhance timely diagnosis and treatment, potentially reducing CVD-related mortality and improving patient outcomes in advanced healthcare systems.
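As a rough illustration of the select-then-classify pipeline, the sketch below scores features by mutual information (a readily available stand-in for the paper's FCBF/mRMR/Relief/PSO methods) before training an Extra Trees classifier on synthetic stand-in features.

```python
# Feature selection followed by an Extra Trees classifier, cross-validated.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                           random_state=0)  # stand-in for ECG feature vectors
pipe = make_pipeline(SelectKBest(mutual_info_classif, k=10),
                     ExtraTreesClassifier(n_estimators=200, random_state=0))
print(cross_val_score(pipe, X, y, cv=5).mean())
```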
76. An Improved Framework for Detecting Thyroid Disease Using Filter-Based Feature Selection and Stacking Ensemble
77. Multi-Modal Biometric Authentication: Leveraging Shared Layer Architectures for Enhanced Security
In this study, we introduce a novel multi-modal biometric authentication system that integrates facial, vocal, and signature data to enhance security measures. Utilizing a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), our model architecture uniquely incorporates dual shared layers alongside modality-specific enhancements for comprehensive feature extraction. The system undergoes rigorous training with a joint loss function, optimizing for accuracy across diverse biometric inputs. Feature-level fusion via Principal Component Analysis (PCA) and classification through Gradient Boosting Machines (GBM) further refine the authentication process. Our approach demonstrates significant improvements in authentication accuracy and robustness, paving the way for advanced secure identity verification solutions.
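A minimal sketch of the fusion-and-classify tail of this pipeline: PCA over concatenated modality features, then a gradient-boosting classifier. The three feature blocks are random stand-ins for face, voice, and signature embeddings.

```python
# Feature-level fusion via PCA, then Gradient Boosting for authentication.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 400
face, voice, sig = (rng.normal(size=(n, d)) for d in (128, 64, 32))
y = rng.integers(0, 2, size=n)                    # genuine vs. impostor labels

fused = PCA(n_components=32).fit_transform(np.hstack([face, voice, sig]))
clf = GradientBoostingClassifier().fit(fused, y)
print(clf.score(fused, y))                        # training accuracy
```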
78. Randomization-Driven Hybrid Deep Learning for Diabetic Retinopathy Detection
Diabetic retinopathy (DR), a serious complication of diabetes, significantly increases the risk of vision impairment, making early detection vital to prevent irreversible vision loss. Current diagnostic approaches face challenges such as resource constraints, inconsistent accuracy, and limited accessibility, particularly in underserved areas. This study proposes a novel framework combining Multi-Scale Discriminative Robust Local Binary Pattern (MS-DRLBP) features with a hybrid Convolutional Neural Network-Radial Basis Function (CNN-RBF) classifier to improve DR detection. Drawing on randomization-based learning principles, the approach integrates stochastic modeling within the CNN-RBF architecture to optimize feature extraction and classification, benefiting from efficient non-iterative training. Advanced preprocessing techniques, including enhanced noise reduction, morphological operations, and Otsu’s thresholding for precise blood vessel segmentation, further enhance diagnostic accuracy. The proposed method outperforms traditional techniques across multiple publicly available datasets, achieving a precision of 96.10%, sensitivity of 95.35%, specificity of 97.06%, and overall accuracy of 96.10%. This work contributes to medical imaging by demonstrating the efficacy of hybrid, randomization-inspired neural networks for accurate and accessible DR diagnosis, offering promising potential to reduce the global burden of diabetic vision loss.
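Two of the classical building blocks named above are available in scikit-image: Otsu's threshold for vessel/background separation and local binary patterns (used here as a plain-LBP stand-in for the paper's MS-DRLBP descriptor). The test image is a library sample, not a fundus photograph.

```python
# Otsu binarization plus an LBP histogram as a texture descriptor.
import numpy as np
from skimage import data, filters
from skimage.feature import local_binary_pattern

img = data.camera()                               # stand-in grayscale image
mask = img > filters.threshold_otsu(img)          # Otsu binarization

lbp = local_binary_pattern(img, P=8, R=1, method='uniform')
hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
print(mask.mean(), hist.round(3))                 # foreground ratio + LBP hist
```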
79. MoEE: Mixture of Emotion Experts for Audio-Driven Portrait Animation
Recent advances in talking avatar generation have achieved impressive audio synchronization, yet generating lifelike talking head videos with rich emotional and facial expression diversity remains challenging. Existing methods struggle due to the lack of frameworks for modeling basic emotional expressions and insufficient datasets capturing a wide range of human emotions. To overcome these limitations, we introduce the Mixture of Emotion Experts (MoEE) model, which decomposes six fundamental emotions to enable accurate synthesis of both basic and compound emotional states. Complementing this, we present the DH-FaceEmoVid-150 dataset, specially curated to include six common emotions along with four types of compound emotions, thereby enhancing model training diversity. Additionally, we propose an emotion-to-latents module that integrates multimodal inputs—such as audio, text, and emotion labels—allowing flexible and fine-grained control of emotional expressions, including control solely via audio. Extensive evaluations demonstrate that the MoEE framework, combined with the DH-FaceEmoVid-150 dataset, achieves superior performance in generating complex emotional expressions and subtle facial details, setting a new standard in talking avatar synthesis. The dataset will be made publicly available to support further research.
80. Enhancing EfficientNet-YOLOv4 for Integrated Circuit Detection on Printed Circuit Board (PCB)
Automated visual inspection of printed circuit boards (PCBs) is essential for ensuring product quality and functionality during manufacturing, but detecting integrated circuits (ICs) remains challenging due to varying component sizes, types, and complex board markings. This study proposes an enhanced EfficientNet-YOLOv4 algorithm tailored specifically for IC detection on PCBs. Key improvements include replacing YOLOv4’s original backbone with the powerful EfficientNetv2-L feature extractor, along with extensive hyperparameter tuning such as optimized loss functions and anchor sizes. The approach also integrates diverse data augmentation techniques to improve model generalization across complex PCB layouts and varying lighting conditions. Experimental results demonstrate the robustness and effectiveness of the proposed EfficientNetv2-L-YOLOv4, achieving an F1-score of 99.22 with an inference speed of 0.14 seconds per image. Compared to EfficientNet-B7-FasterRCNN and the original YOLOv4, the model delivers superior accuracy and competitive speed. These findings emphasize the importance of advanced feature extraction networks in object detection tasks. The proposed method not only addresses IC detection challenges but also advances automated inspection technology, offering significant potential to streamline manufacturing by reducing reliance on manual inspection.
81. FsrGAN: A Satellite and Radar-Based Fusion Prediction Network for Precipitation Nowcasting
Precipitation nowcasting, which predicts small-scale precipitation events within 0 to 2 hours, is critical for mitigating impacts on daily life and human activities. Existing deep learning models primarily rely on single radar echo data, limiting their ability to capture complex and rapidly changing precipitation dynamics. To address this, we propose a two-stage fusion network named FsrGAN that integrates meteorological satellite and radar data for enhanced precipitation forecasting. The first stage, FsrNet, uses an encoder-fusion-decoder architecture incorporating a novel spatial-channel attention (SCA) mechanism to effectively filter and fuse multisource, multiscale features. The second stage, FusionGAN, leverages a generative adversarial network to refine and sharpen radar predictions by mining complementary information from satellite imagery. Experiments on the Yangtze River Delta meteorological dataset demonstrate that FsrGAN outperforms traditional optical flow-based methods and advanced deep learning approaches such as ConvLSTM, ConvGRU, TrajGRU, and PredRNN++ in both image quality and forecasting accuracy. Notably, our fusion model effectively predicts convective initiation, showcasing the benefits of multisource data integration for improving short-term precipitation nowcasting.
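In the spirit of the SCA mechanism described above, the PyTorch block below applies an SE-style channel gate followed by a spatial gate; the dimensions and gate designs are illustrative assumptions, not FsrNet's actual module.

```python
# Channel attention (squeeze-excitation) followed by a spatial attention map.
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_gate = nn.Sequential(          # SE-style channel weights
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.spatial_gate = nn.Sequential(          # 1-channel spatial map
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        w = self.channel_gate(x).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        x = x * w                                             # reweight channels
        return x * self.spatial_gate(x)                       # reweight locations

out = SpatialChannelAttention(64)(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```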
82. AI in Endoscopic Gastrointestinal Diagnosis: A Systematic Review of Deep Learning and Machine Learning Techniques
Gastrointestinal (GI) diseases are among the most prevalent worldwide, and early detection is crucial for reducing mortality. Endoscopy remains the gold standard for diagnosing and managing disorders in both the upper and lower GI tracts, enabling tissue biopsies for cancer detection, Helicobacter pylori infection identification, and polyp removal via colonoscopy. This systematic review surveyed 33 research papers from databases including PubMed, Scopus, Google Scholar, and IEEE Xplore, published up to May 2023. It provides comprehensive insights for clinicians and researchers by analyzing machine learning (ML) techniques—such as preprocessing, segmentation, feature extraction, and classification—applied to GI disease diagnosis. Furthermore, deep learning (DL) methods, including transfer learning, convolutional neural networks (CNNs), optimization strategies, transformers, and reinforcement learning, are evaluated, with CNNs identified as the most widely used architecture. The review also discusses specialized GI research fields and offers technological guidance for future studies. Overall, this work expands the understanding of current AI methodologies in gastroenterology and serves as a valuable reference for developing and evaluating AI models in GI disease diagnosis.
83. Segmentation and Classification of Interstitial Lung Diseases Based on Hybrid Deep Learning Network Model
Interstitial lung diseases (ILD) are a diverse group of disorders that share pathological, radiological, and clinical traits and involve interstitial fibrosis and inflammation, contributing significantly to lung disease morbidity and mortality. In most early ILD classification studies, the region of interest (ROI) had to be identified manually from the lung High-Resolution Computed Tomography (HRCT) image, which was time-consuming; moreover, the clinical signs of various disorders are so similar that precise detection is difficult. Recent studies have achieved outstanding results in categorizing medical images using deep learning techniques. For ILD classification, a hybrid deep learning network model has been developed in this research. The lung portion of the HRCT images is first segmented using an improved U-Net++ model; the multi-scale improved U-Net++ module enables effective lung segmentation even in the presence of lung anomalies. In the second stage, features of the segmented lung image are extracted for categorization using a Refined Attention Pyramid Network (RAPNet). A MobileUNetV3 is then developed to classify five ILD classes. The proposed approach is tested on the ILD database, and owing to the stage-by-stage improvements in the DL pipeline, the hybrid model's performance increases significantly.
84. A Lesion-Based Diabetic Retinopathy Detection Through Hybrid Deep Learning Model
Diabetic retinopathy (DR) is a leading cause of blindness worldwide, affecting approximately 191 million people due to diabetes-related damage to retinal blood vessels. While previous studies have focused primarily on early-stage lesions such as exudates, aneurysms, hemorrhages, and blood vessels, severe-stage lesions—including cotton wool spots, venous beading, advanced intraretinal abnormalities, and retinal pigment epithelium damage—have received less attention. This study proposes a deep learning framework for classifying DR severity levels using retinal fundus images. The approach combines GoogleNet and ResNet models optimized by an adaptive particle swarm optimizer (APSO) to enhance feature extraction. The features extracted by this hybrid model are subsequently classified using various machine learning algorithms, including random forest, support vector machine, decision tree, and linear regression. Experimental results on a benchmark dataset demonstrate that the proposed framework achieves a high classification accuracy of 94%, outperforming existing methods. The model also shows improvements in precision, recall, and F1 score across different DR severity levels, indicating its potential for more comprehensive and accurate DR diagnosis.
85. Thyroid Nodule Ultrasound Image Segmentation Based on Improved Swin Transformer
To address the issue of inaccurate segmentation caused by blurred edges and strong noise in thyroid nodule ultrasound images, an image segmentation method based on an improved Swin Transformer is proposed. First, depthwise convolutional layers are integrated into the encoder/decoder of the Swin Transformer to enhance global-local feature representations. Second, a multi-scale feature fusion module is introduced through skip connections between the encoder and decoder to improve information flow and feature integration. Additionally, a multi-level patch embedding convolution is designed to enable layer-by-layer feature extraction from coarse to fine levels. Experimental results show that the proposed method achieves superior segmentation accuracy compared to state-of-the-art methods such as Attention U-Net, with Dice scores of 82.26% and 78.64% and IoU values of 73.00% and 67.93% on the TN3K and DDTI datasets, respectively.
86. Vehicle Detection and Tracking Based on Improved YOLOv8
With the rapid increase in transportation demand, efficient traffic recognition and tracking systems are essential. Traditional approaches often face challenges such as heavy model weights and limited detection accuracy in complex traffic scenarios. To address these issues, we propose a novel method based on YOLOv8n. Our approach introduces SCC_Detect, leveraging SCConv in the detection head to reduce redundant feature computations. We also replace standard convolutional kernels with dual convolutional kernels to build a lightweight deep neural network. Additionally, the Focaler-EIoU loss function is incorporated to enhance detection accuracy. For tracking, the BotSORT algorithm is embedded during inference, providing more accurate and stable recognition and tracking results in traffic scenes. Experimental results on the UA-DETRAC dataset demonstrate that our model reduces parameters and weight by approximately 36.5% and 25%, respectively, with only a minor 0.2% drop in mAP@0.5 compared to YOLOv8n. Furthermore, BotSORT outperforms DeepSORT and ByteTrack in tracking metrics such as MOTA, IDF1, and MOTP, improving accuracy and reducing lost tracks. The proposed method shows strong potential for practical traffic detection and deployment applications.
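For orientation, the Ultralytics API already ships BoT-SORT and ByteTrack configurations, so detection-plus-tracking can be exercised in a few lines. The sketch below uses the stock yolov8n weights rather than the paper's modified model, and the video path is a placeholder.

```python
# Run YOLOv8 detection with BoT-SORT association across video frames.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                      # pretrained nano detector
results = model.track(source="traffic.mp4",     # placeholder video stream
                      tracker="botsort.yaml",   # BoT-SORT association
                      persist=True)             # keep IDs across frames
for r in results:
    print(r.boxes.id)                           # per-frame track IDs
```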
87. Vitamin Deficiency Detection Using Image Processing and Neural Network
This paper presents a cost-free, AI-based smartphone application designed to detect vitamin deficiencies in humans through images of specific body organs. Traditional vitamin deficiency detection relies on expensive laboratory tests, whereas many deficiencies manifest visually in areas such as the eyes, lips, tongue, and nails. The application enables users to self-diagnose potential vitamin deficiencies by analyzing photos of these body parts, eliminating the need for blood samples. Upon detection, it suggests nutritional sources to address the deficiencies and prevent related complications through targeted micro-nutritional correction. The AI model is trained to accurately identify and differentiate various vitamin deficiencies by recognizing structural changes in tissue imagery linked to nutritional deficits. Additionally, the platform supports collaboration with medical professionals who can contribute and verify patient image data, enhancing detection accuracy and refining feature extraction beyond human diagnostic capabilities. This tool addresses a widespread global health issue stemming from insufficient nutritional awareness and offers a valuable resource for individuals and healthcare workers to improve diagnosis and nutritional interventions.
88. Residual Channel-attention (RCA) network for remote sensing image scene classification
High-resolution remote sensing (HRRS) image scene classification is critical for many applications, yet traditional convolutional neural networks (CNNs) struggle to capture complex semantic relationships across varying scales and long-distance feature dependencies inherent in HRRS images. They also face challenges handling substantial intra-class variation and inter-class similarity. To address these limitations, we propose a novel Residual Channel-attention (RCA) network that integrates a lightweight residual structure for enhanced multi-scale spatial feature extraction and a channel attention mechanism to selectively emphasize relevant features while suppressing irrelevant ones. Additionally, a squeeze-and-excitation (SE) module is incorporated to enable self-attention, further refining the network’s focus on critical image regions and reducing background noise. Evaluations on three public datasets—RSSCN7, PatternNet, and EuroSAT—demonstrate superior classification accuracies of 97%, 99%, and 96%, respectively, outperforming state-of-the-art methods. Visualization via Grad-CAM++ confirms the effectiveness of the channel attention mechanism and the RCA network’s strong feature representation, highlighting its potential for advancing HRRS image classification.
89. Automatic Player Face Detection and Recognition for Players in Cricket Games
In this paper, we develop an augmented reality cricket broadcasting application that uses player face recognition during play to display personal player data. The system utilizes the AdaBoost algorithm for player and face detection, and employs a PAL-based face recognition model to recognize the faces of players on the field. Trained on a large dataset of cricket game footage, it achieves high accuracy in detecting and recognizing players’ faces even under challenging conditions such as occlusion, non-uniform illumination, and expression and pose variation. The system has the potential to enhance the viewing experience of cricket games by providing real-time player identification and statistics, and it can be applied to other sports for similar benefits. The paper discusses the system’s methodology, results, and implications for the future of sports broadcasting. Overall, the system offers a promising solution for automatic player face detection and recognition in sports broadcasting.
90. Underwater image enhancement via multiscale disentanglement strategy
Underwater images often suffer from color distortion, low contrast, and diminished details due to light absorption and scattering. This study proposes an advanced underwater image enhancement algorithm combining adaptive color correction and an improved Retinex approach to address these challenges. Utilizing the UIEB dataset, the method first applies adaptive color correction to neutralize blue-green bias, followed by image decomposition and enhancement via Non-Subsampled Shearlet Transform (NSST) and Retinex-based techniques. Performance is evaluated using metrics like PCQI, UCIQE, UIQM, and Information Entropy (IE), with SIFT-based feature matching validating structural preservation. Results demonstrate superior performance over state-of-the-art methods in perceptual quality and detail enhancement. The implementation leverages Python (Spyder IDE), MySQL (WAMP Server), and Flask for front-end deployment, offering a robust solution for underwater image restoration.
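The adaptive color-correction step rests on the classic gray-world assumption: scale each channel so its mean matches the overall gray mean, countering the blue-green cast. A minimal OpenCV/numpy sketch of that idea follows; the image path is a placeholder for a UIEB frame.

```python
# Gray-world white balance: equalize per-channel means to a common gray level.
import cv2
import numpy as np

def gray_world_balance(bgr):
    img = bgr.astype(np.float32)
    means = img.reshape(-1, 3).mean(axis=0)    # per-channel mean (B, G, R)
    gains = means.mean() / (means + 1e-6)      # push each mean to the gray mean
    return np.clip(img * gains, 0, 255).astype(np.uint8)

img = cv2.imread("underwater.jpg")             # placeholder UIEB frame
if img is not None:
    cv2.imwrite("balanced.jpg", gray_world_balance(img))
```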
91. Key Intelligent Pesticide Prescription Spraying Technologies for the Control of Pests, Diseases, and Weeds: A Review
This project develops a Plant Disease Detection, Prevention, and Pesticides Recommendation System using machine learning techniques. It employs transfer learning with the ResNet50 convolutional neural network and feature extraction using MobileNetV2 combined with a Random Forest classifier to accurately identify 38 classes of plant leaf diseases from images. The system includes image preprocessing steps for consistent input and uses evaluation metrics like accuracy and confusion matrices to validate performance. Additionally, it integrates Google’s Generative AI to provide detailed disease descriptions and recommend context-specific pesticides and integrated pest management strategies, enabling effective and sustainable disease control for improved crop health.
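A compact sketch of the "MobileNetV2 features + Random Forest" branch described above: globally pooled CNN embeddings feed a classical classifier. The arrays are random placeholders for a real batch of preprocessed leaf images, and the class count follows the 38 classes mentioned.

```python
# Extract pooled MobileNetV2 embeddings, then fit a Random Forest on them.
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier

backbone = tf.keras.applications.MobileNetV2(include_top=False,
                                             pooling="avg",
                                             input_shape=(224, 224, 3))
images = np.random.rand(16, 224, 224, 3).astype("float32")  # fake leaf batch
labels = np.random.randint(0, 38, size=16)                  # 38 disease classes

feats = backbone.predict(tf.keras.applications.mobilenet_v2.preprocess_input(
    images * 255.0), verbose=0)                             # (16, 1280) embeddings
clf = RandomForestClassifier(n_estimators=200).fit(feats, labels)
print(clf.score(feats, labels))
```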
Topic Highlights
Image Processing Projects for Students
Completing an engineering project is a must in your final year to procure your degree, and image processing projects are one of the best platforms to give a shot because the discipline is easy to understand. Therefore, ElysiumPro ECE Final Year Projects gives you better ideas in this field.
ElysiumPro Final Year Projects
Image processing is, in essence, the use of computer algorithms to act on digital images (including image segmentation projects) so that information can be extracted from an image for further use. Nowadays, many techniques are incorporated into or impacted by Signal Processing Projects. Common applications include the medical stream, color and video processing, as well as remote sensing, transmission, and encoding.

