Journal of Artificial Intelligence, Machine Learning and Neural Network

AI literacy in India through the lens of national policy: a four-dimensional evaluation

2026-01-22T07:43:52+00:00

As India seeks to become a global leader in Artificial Intelligence (AI), its national AI policy plays a crucial role in promoting AI literacy. However, a gap exists in the literature regarding analyses of AI policy through the lens of established AI literacy models. This study qualitatively evaluates India's national AI policy document, India AI 2023: Expert Group Report – First Edition, using a four-aspect AI literacy framework. A thematic analysis examines how the policy addresses the dimensions of understanding, application, evaluation, and ethics. Results show that the policy is in line with the four aspects of the theoretical model. The study also identifies three additional sociocultural themes: inclusion, equity, and AI for social good. Gaps that could hinder the policy's ability to fully promote inclusive AI literacy are discussed, along with possible solutions. The study highlights the strengths and actionability of India’s national AI policy and its relevance to fostering AI literacy in India.

A comparative study of cloud-native vs. edge computing architectures for real-time data processing

2026-04-03T11:09:28+00:00

The rapid adoption of Internet of Things (IoT) devices, autonomous systems, and latency-sensitive applications has increased the need for effective real-time data processing architectures. This paper provides a detailed comparative analysis of cloud-native and edge computing systems for real-time data processing, combining a systematic literature review (SLR) with empirical benchmarking experiments. Based on a PRISMA-guided review of 87 papers, of which 22 were ultimately included after screening, we evaluate the latency, throughput, energy consumption, scalability, fault tolerance, and security profiles of both paradigms. The experimental findings show that edge computing achieves a mean latency of 8.3 ms compared with 142.7 ms for cloud-native deployment, whereas the cloud-native architecture offers higher availability (99.95%) and scales 3.8× horizontally. A hybrid framework combining edge inference with cloud orchestration is recommended; it is 94.2% faster while retaining cloud-grade reliability. Analysis using ANOVA, regression modelling, and multi-criteria decision analysis (MCDA) shows that the optimal architecture depends on application-specific latency tolerance (α), data locality requirements, and infrastructure budget constraints. These results are relevant to system architects working in sectors such as smart healthcare, industrial IoT, autonomous vehicles, and smart grid management.

Hybrid gradient boosting with SMOTE-augmented feature engineering for high-accuracy cardiac arrhythmia detection: a comparative supervised machine learning study

2026-05-30T09:26:40+00:00

Background: Cardiac arrhythmias are a significant global health problem and account for around 15–20% of sudden cardiac deaths each year. Timely automated detection of arrhythmias from electrocardiogram (ECG) signals remains a major challenge in clinical practice because of signal complexity, inter-patient variation, and significant class imbalance in clinical data sets. Objective: This study proposes and tests a supervised machine learning pipeline for automated binary classification of cardiac arrhythmias based on multi-dimensional features extracted from the ECG. The pipeline combines gradient boosting classification, data augmentation using SMOTE, feature selection using SelectKBest, and systematic hyperparameter optimization using 5-fold stratified cross-validated grid search. Methods: A total of 2,000 ECG samples (970 normal and 1,030 arrhythmic) were collected and pre-processed by Z-score normalization and mean imputation, and the top 12 of 20 candidate features were selected using chi-squared feature selection. To deal with class imbalance, SMOTE was applied only to the training partition. Six classifiers (Gradient Boosting, Random Forest, Support Vector Machine, Decision Tree, K-Nearest Neighbors, and Logistic Regression) were trained, tuned, and benchmarked under the same experimental conditions. Results: The proposed Gradient Boosting model attained a classification accuracy of 95.8%, precision of 96.1%, recall of 95.4%, F1-score of 95.7%, and AUC-ROC of 0.989, an improvement of 1.6–11.6 percentage points over the baselines. The ablation experiments showed that each pipeline stage contributed significantly to overall performance and that the combination of SMOTE and hyperparameter optimization yielded a 5.3% F1 gain over the baseline configuration.
Conclusion: The proposed ECG arrhythmia detection framework shows performance competitive with recent state-of-the-art ECG classifiers and offers an interpretable and computationally efficient method for clinically deployable arrhythmia detection. The pipeline is generalizable to other bio-signal classification applications and is fully reproducible using open-source code.
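
The abstract stresses that SMOTE was applied only to the training partition. The following is a minimal standard-library sketch of that idea, not the authors' code: the function names, the simplified SMOTE interpolation, the 25% hold-out fraction, and k = 5 are all assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Simplified SMOTE: build n_new synthetic points by interpolating a
    randomly chosen minority sample toward one of its k nearest minority
    neighbours (squared Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def split_then_oversample(X, y, test_fraction=0.25, seed=0):
    """Hold out the test partition first, then balance ONLY the training part,
    so no synthetic point is derived from a test sample."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    counts = {label: y_train.count(label) for label in set(y_train)}
    minority_label = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority_label]
    minority = [x for x, lab in zip(X_train, y_train) if lab == minority_label]
    if deficit > 0 and len(minority) > 1:
        X_train = X_train + smote(minority, deficit, seed=seed)
        y_train = y_train + [minority_label] * deficit
    return X_train, y_train, X_test, y_test
```

The test partition is returned untouched, so evaluation metrics reflect the original class distribution rather than the rebalanced one.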

HAC-UML: A hybrid autoencoder-enhanced clustering framework for unsupervised anomaly detection in industrial IIoT sensor networks

2026-05-30T10:05:48+00:00

Industrial Internet-of-Things (IIoT) sensor networks generate massive, high-dimensional, and temporally correlated data streams wherein anomalous patterns often signal critical equipment failures, cyber-physical attacks, or process deviations. Conventional supervised anomaly detectors are impractical in IIoT environments due to the acute scarcity of labeled anomaly instances and the non-stationary nature of operational data. Unsupervised learning therefore represents the most tractable paradigm, yet existing methods suffer from limited representational capacity, susceptibility to the curse of dimensionality, and poor generalization across heterogeneous sensor modalities. This study proposes HAC-UML, a Hybrid Autoencoder-Enhanced Clustering framework for Unsupervised Machine Learning, designed to simultaneously learn compact latent representations of multivariate IIoT time series and perform joint deep clustering with adaptive anomaly scoring. HAC-UML integrates a Bi-directional Long Short-Term Memory (BiLSTM) autoencoder with a Deep Embedded Clustering (DEC) module trained via a composite loss function combining Mean Squared Error (MSE) reconstruction loss and Kullback–Leibler (KL) divergence-based cluster assignment loss. Anomaly scores are computed through reconstruction error thresholding at μ+3σ, complemented by cluster membership entropy analysis. Experiments were conducted on three public benchmarks: SWAT, WADI, and MSL, encompassing 87,004 multivariate sensor readings across 12 heterogeneous features. HAC-UML achieves a Precision of 0.937, Recall of 0.924, F1-Score of 0.930, and AUC-ROC of 0.963 on the SWAT benchmark, outperforming six state-of-the-art baselines including DAGMM, USAD, LSTM-AE, and OmniAnomaly by margins of 2.9%–8.3% in F1-Score. Ablation studies confirm the contribution of the joint clustering module (+4.1% F1 over AE-only) and the skip-connection mechanism (+2.3%). 
The proposed HAC-UML framework demonstrates strong generalizability, computational efficiency (inference latency <12 ms per window), and practical deployability on edge hardware.
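
The μ+3σ reconstruction-error threshold and the cluster membership entropy mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the threshold is assumed to be fitted on errors from normal training windows, and the abstract does not say how the two signals are combined, so they are computed separately here.

```python
import math
from statistics import mean, pstdev

def fit_threshold(train_errors, n_sigma=3.0):
    """Threshold = mu + n_sigma * sigma of the reconstruction errors
    observed on (assumed normal) training windows."""
    mu = mean(train_errors)
    sigma = pstdev(train_errors)
    return mu + n_sigma * sigma

def flag_anomalies(errors, threshold):
    """A window is flagged when its reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]

def membership_entropy(q):
    """Shannon entropy (in nats) of one soft cluster-assignment vector q.
    High entropy means the window does not clearly belong to any cluster."""
    return -sum(p * math.log(p) for p in q if p > 0)

# Example with made-up numbers:
threshold = fit_threshold([0.9, 1.0, 1.1, 1.0, 1.0])
print(flag_anomalies([1.0, 1.5], threshold))
print(membership_entropy([0.25, 0.25, 0.25, 0.25]))
```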

ClinFormer: a multi-modal clinical transformer for explainable major adverse cardiovascular event prediction from electronic health records

2026-05-30T11:28:35+00:00

Background: Major adverse cardiovascular events (MACE), including acute myocardial infarction, stroke, and cardiovascular death, account for over 8 million deaths globally each year. Conventional prediction models such as Framingham Risk Score, SCORE2, and Pooled Cohort Equations rely on limited traditional risk factors and linear assumptions, restricting their ability to capture complex temporal and non-linear relationships within longitudinal electronic health records (EHRs). Methods: We propose ClinFormer, a multi-modal clinical Transformer designed to integrate five EHR modalities: laboratory results, diagnosis codes, medication records, clinical notes, and vital signs. The model employs cross-modal attention mechanisms with 12 attention heads and a model dimension of 512. ClinFormer was pre-trained using contrastive patient similarity learning on 127,438 patients from MIMIC-IV and externally validated on 38,924 patients from the eICU database. Model interpretability was provided through SHAP analysis and calibrated probability outputs. Results: On external validation, ClinFormer achieved an AUROC of 0.943 (95% CI: 0.937–0.949), significantly outperforming the strongest baseline model, ClinicalBERT (AUROC: 0.912; p < 0.001). Calibration performance was strong with an expected calibration error (ECE) of 0.031. SHAP analysis identified BNP, troponin I, and eGFR as the most influential predictors. Conclusions: ClinFormer provides accurate, interpretable, and well-calibrated MACE prediction directly from routinely collected EHR data, supporting its potential deployment in both resource-rich and resource-constrained clinical environments.
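
For readers unfamiliar with the expected calibration error (ECE) reported above, the sketch below shows one common way to compute it for binary outcomes. The equal-width binning, the choice of 10 bins, and the function name are assumptions; the abstract does not state which ECE variant was used.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: group predictions into equal-width probability bins and
    average |observed event rate - mean predicted probability| per bin,
    weighted by the fraction of samples in the bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(p for p, _ in b) / len(b)
        event_rate = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(event_rate - mean_conf)
    return ece
```

A perfectly calibrated model yields an ECE of 0, so a value of 0.031 means predicted risks deviate from observed event rates by about three percentage points on average.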

GABP-net: a hybrid genetic algorithm–back propagation neural network with adaptive fitness-driven weight optimization for predictive fault detection in industrial internet of things

2026-06-01T07:34:27+00:00

Industrial Internet of Things (IIoT) environments generate massive streams of sensor data from rotating machinery, requiring highly reliable fault detection systems to prevent catastrophic failures and costly downtime. Conventional backpropagation (BP) neural networks often suffer from premature convergence to local optima, sensitivity to weight initialization, and poor generalization under noisy industrial conditions. To address these limitations, this study proposes a hybrid Genetic Algorithm–Backpropagation Network (GABP-Net) for intelligent fault diagnosis in IIoT applications. The proposed framework integrates a multi-objective Genetic Algorithm (GA) with an adaptive BP neural network to optimize network topology, initial weight matrices, layer-wise learning rates, and momentum coefficients simultaneously. GABP-Net employs a real-coded GA using tournament selection, blend crossover (BLX-α), and adaptive non-uniform mutation to evolve optimal neural configurations and synaptic weights. The evolved network is subsequently fine-tuned using the resilient Backpropagation (Rprop) algorithm, while isotonic-regression threshold calibration is applied to address class imbalance. Experimental evaluation was conducted on three benchmark datasets: the CWRU Bearing Fault Dataset, the PRONOSTIA Machine Degradation Dataset, and a proprietary IIoT motor dataset containing 1.2 million sensor observations. A total of 64 discriminative features were extracted through feature engineering, including time-domain statistics, frequency-domain spectral descriptors, and wavelet packet energy coefficients. The proposed GABP-Net achieved classification accuracies of 99.14%, 98.76%, and 97.83% across the three datasets, outperforming conventional BP (91.23%), PSO-BP (95.67%), Adam-DNN (96.12%), LSTM (96.45%), and CNN-LSTM hybrid models (97.21%) with statistical significance (p < 0.001). Furthermore, all fault categories obtained AUC-ROC values above 0.993.
The model contains only 8,247 parameters and achieves 1.23 ms inference latency on NVIDIA Jetson AGX Xavier, demonstrating suitability for real-time IIoT edge deployment with high computational efficiency and robust generalization performance.
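
Two of the genetic operators named in the abstract, tournament selection and blend crossover (BLX-α), can be sketched in a few lines. The function names, the tournament size k = 3, and α = 0.5 are illustrative assumptions and not the settings reported for GABP-Net.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Pick k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]

def blx_alpha(parent_a, parent_b, alpha=0.5, rng=random):
    """BLX-alpha for real-coded genes: each child gene is drawn uniformly from
    the parents' interval [lo, hi] widened by alpha * (hi - lo) on both sides."""
    child = []
    for a, b in zip(parent_a, parent_b):
        lo, hi = min(a, b), max(a, b)
        span = hi - lo
        child.append(rng.uniform(lo - alpha * span, hi + alpha * span))
    return child

# Example: breed one child weight vector from two selected parents.
rng = random.Random(42)
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
fitness = [-sum(w * w for w in ind) for ind in population]  # toy fitness
mother = tournament_select(population, fitness, rng=rng)
father = tournament_select(population, fitness, rng=rng)
print(blx_alpha(mother, father, rng=rng))
```

Widening the sampling interval beyond the parents lets the search explore weights neither parent holds, which is one reason BLX-α is a common choice for real-coded GAs.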

HAFEM: Hybrid attention-driven facial expression mapping for real-time multi-class emotion recognition in unconstrained environments

2026-06-01T10:44:06+00:00

Facial Expression Recognition (FER) is a major challenge in affective computing, with applications in healthcare monitoring, human-computer interaction, autonomous systems, and surveillance. Despite recent progress, many current approaches fall short under occlusion, large lighting changes, and class ambiguity, and struggle to meet real-time computation requirements. We introduce HAFEM (Hybrid Attention-Driven Facial Expression Mapping), a deep learning framework that combines an EfficientNet-B5 convolutional backbone with a lightweight multi-head self-attention Transformer block and a Convolutional Block Attention Module (CBAM). This hybrid design targets a balance between recognition quality and inference speed, reaching about 52 FPS on an NVIDIA RTX 3090 GPU, above the typical 30 FPS threshold for real-time operation. HAFEM is trained and evaluated on four standard benchmark datasets: FER2013, RAF-DB, AffectNet, and FERPlus. For robustness, we use 68-point facial landmark alignment, a broad set of data augmentation techniques, and a compound multi-objective loss that combines cross-entropy loss, center loss, and distribution-aware label smoothing to stabilize training. Hyperparameters are tuned with Bayesian search, and interpretability is provided through Grad-CAM visualizations and SHAP analysis, which show what the model attends to. On FER2013, RAF-DB, AffectNet, and FERPlus, HAFEM reports state-of-the-art accuracies of 94.7%, 95.1%, 88.9%, and 92.4%, respectively. Paired t-tests (p < 0.001) suggest that HAFEM outperforms all 10 competing methods in terms of precision, recall, F1-score, and AUC, with AUC reaching 0.982.
Overall, these outcomes indicate that the combination of hybrid attention components, an efficient backbone, and compound loss strategies can effectively address longstanding challenges in FER.

HATN: hierarchical adaptive transformer network for real-time medical image segmentation using hybrid CNN-ViT architecture with multi-scale attention and uncertainty-aware loss functions

2026-06-02T09:58:44+00:00

Medical image segmentation is a cornerstone of modern clinical medicine: it supports accurate volumetric delineation of anatomical structures in CT, MRI, and endoscopic imagery for diagnosis and treatment planning. Despite the substantial improvements brought by U-Net and its successors, three issues remain bottlenecks for real-world adoption: (i) global context modeling is still insufficient for long-range anatomical relationships; (ii) feature selection in skip connections is weak, so uninformative low-level signals can degrade decoder representations; and (iii) reliable uncertainty quantification is lacking, although it is effectively a prerequisite for clinical acceptance of AI-driven diagnostic systems. In this work, we propose HATN (Hierarchical Adaptive Transformer Network), a hybrid CNN-ViT segmentation architecture. It uses a Swin-Transformer-style hierarchical backbone and applies Multi-Scale Deformable Attention (MSDA) in the bottleneck. For skip connections, we add Multi-Scale Channel Attention (MSCA) so that the network retains relevant details while suppressing the rest. Training uses a compound uncertainty-aware objective, L_HATN = 0.50*L_CE + 0.35*L_Dice + 0.15*L_UC. We evaluate HATN on five well-known benchmark datasets: Synapse Multi-Organ CT, ACDC Cardiac MRI, Polyp Segmentation, ISIC Skin Lesion, and NIH Pancreas-CT. Bayesian hyperparameter selection is performed with Optuna, running 120 trials in total with 5-fold cross-validation to account for variation. For epistemic uncertainty, we use Monte Carlo Dropout with T = 20 forward passes, yielding uncertainty estimates that can be checked downstream. HATN reaches 92.38% Dice and 4.9 mm HD95 on Synapse, beating the closest competitor (89.16% Dice) by 3.22 Dice points and reducing HD95 by 3.7 mm.
For cross-dataset generalization, we obtain 91.74%, 88.62%, and 90.44% Dice on the ACDC, Polyp, and ISIC benchmarks, respectively, all without fine-tuning. At inference, the method runs at 48 FPS on an NVIDIA RTX 3090 with TensorRT FP16 optimization, meeting real-time clinical thresholds. All nine baseline comparisons are statistically significant (p < 0.001, Bonferroni corrected). Ablation studies support each HATN component, and SHAP and Grad-CAM produce attention maps that are anatomically sensible and consistent. The full codebase and pre-trained weights are openly released to accelerate community research.
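
The compound objective L_HATN = 0.50*L_CE + 0.35*L_Dice + 0.15*L_UC can be illustrated with a small sketch. Only the three weights come from the abstract; the binary cross-entropy and soft Dice formulations below are standard textbook forms assumed for illustration, and the uncertainty term L_UC is passed in as a number because its definition is not given.

```python
import math

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)

def soft_dice_loss(pred, target, eps=1e-6):
    """1 - soft Dice coefficient over flattened pixel probabilities."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)

def hatn_objective(l_ce, l_dice, l_uc):
    """Weighted sum with the weights stated in the abstract."""
    return 0.50 * l_ce + 0.35 * l_dice + 0.15 * l_uc

# Example on a four-pixel toy mask; l_uc=0.0 is a placeholder value.
pred = [0.9, 0.8, 0.2, 0.1]
target = [1, 1, 0, 0]
print(hatn_objective(binary_cross_entropy(pred, target),
                     soft_dice_loss(pred, target),
                     l_uc=0.0))
```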

Supervised machine learning models for cancer prognosis and treatment response prediction: A systematic review of algorithm performance, feature importance, and clinical deployment

2026-06-04T07:40:41+00:00

Supervised machine learning (SML) has recently become a promising paradigm in clinical oncology for building models that predict cancer outcomes and treatment responses from available clinical, genomic, imaging, and treatment data. Although numerous SML studies have been published, there has been no systematic evaluation of algorithm performance, consistency across studies and algorithms, calibration quality, or readiness for clinical implementation. This review aims to bridge this gap by summarising the findings of 36 studies across a variety of cancers.

Methods: We searched PubMed/MEDLINE, Embase, IEEE Xplore, Web of Science, and the ACM Digital Library for publications between January 2018 and January 2025, following the PRISMA 2020 guidelines; the review was registered in PROSPERO (CRD42025412104). Studies on cancer prognosis and/or treatment response prediction that developed models or reported models validated by other groups were included. A five-domain PROBAST-AI quality assessment was used.

Results: Thirty-six studies covering 34 different oncology datasets, comprising more than 3.1 million patient records, were eligible. Random Forest was the most frequently deployed algorithm (n = 24, 67%), followed by XGBoost (n = 21, 58%) and SVM (n = 16, 44%). The median best-reported AUC was 0.908 (IQR: 0.887–0.927). In 33 of 36 studies (92%), the SML models outperformed clinical staging, with an average AUC gain of 0.108. Tumour stage and a number of key biomarkers were consistently important predictors. Significant methodological gaps were found: calibration was reported in just 41% of studies.

Conclusions: SML models outperform conventional oncological models and provide consistent, clinically meaningful performance improvements. However, gaps remain in prospective validation, in reporting of the features used, and in standardized representation of feature importance. A 16-item checklist (SML-ONCO-Report) is proposed to help overcome these reporting failures. The results reported here can inform clinical trials of SML in oncology.

QEML-Net: Quantum-enhanced machine learning for predictive maintenance in industrial IoT environments using hybrid classical-quantum neural networks

2026-06-05T06:12:03+00:00

The economic value of predictive maintenance (PdM) for industrial IoT machinery is clear: unplanned equipment downtime is estimated to cost industry USD 50 billion a year worldwide. Deep learning techniques have achieved good fault classification results on benchmark datasets, but they are not robust enough to cope with noise in industrial environments, are computationally intensive for edge use, and do not exploit the capabilities of quantum computing. This paper proposes a new hybrid network, QEML-Net (Quantum-Enhanced Machine Learning Network), which combines the ResNet-50 deep residual network with Convolutional Block Attention Modules (CBAM), variational quantum feature enhancement, and a compound hybrid loss function for efficient fault diagnosis. The framework features a 6-layer, 4-qubit Parameterized Quantum Circuit (PQC) with angle encoding and linear CNOT entanglement, implemented using PennyLane. Experiments were performed on six harmonized public PdM datasets comprising 57,164 samples across eight unified fault categories. The framework was validated using Bayesian hyperparameter optimization, 5-fold stratified cross-validation, ablation studies, and statistical testing. With edge-optimized deployment, QEML-Net achieved 97.3% accuracy, a 96.9% macro-F1 score, an AUC of 0.988, and real-time inference with 19 ms latency and 52 FPS throughput on the benchmark dataset. Statistical analysis revealed significant improvement over other methods (p < 0.001). Even in cross-dataset evaluation without fine-tuning, the model generalized well, with an accuracy of 92.8–94.6%. The results show the advantages of combining variational quantum circuits with deep learning to enhance classification accuracy, interpretability, and deployment efficiency for fault diagnosis in industrial IoT.
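
To make the circuit description concrete, the sketch below simulates a 4-qubit circuit with angle encoding and a linear CNOT chain using a plain statevector in standard-library Python. It does not use PennyLane and is not the authors' circuit: the choice of RY rotations for both encoding and trainable layers, the Pauli-Z readout, and all function names are assumptions.

```python
import math

N_QUBITS = 4

def apply_ry(state, qubit, theta):
    """Apply RY(theta) to one qubit of a real-valued statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << qubit
    new = state[:]
    for i in range(len(state)):
        if not i & bit:          # i has the qubit in |0>, j has it in |1>
            j = i | bit
            new[i] = c * state[i] - s * state[j]
            new[j] = s * state[i] + c * state[j]
    return new

def apply_cnot(state, control, target):
    """Swap the target's |0>/|1> amplitudes wherever the control bit is 1."""
    cbit, tbit = 1 << control, 1 << target
    new = state[:]
    for i in range(len(state)):
        if i & cbit and not i & tbit:
            j = i | tbit
            new[i], new[j] = state[j], state[i]
    return new

def expval_z(state, qubit):
    """Expectation value of Pauli-Z on one qubit."""
    bit = 1 << qubit
    return sum((-1 if i & bit else 1) * a * a for i, a in enumerate(state))

def pqc(features, weights):
    """Angle-encode 4 features, then apply len(weights) layers of trainable
    RY rotations followed by a linear CNOT chain 0->1->2->3."""
    state = [0.0] * (2 ** N_QUBITS)
    state[0] = 1.0               # start in |0000>
    for q, x in enumerate(features):
        state = apply_ry(state, q, x)
    for layer in weights:
        for q, theta in enumerate(layer):
            state = apply_ry(state, q, theta)
        for q in range(N_QUBITS - 1):
            state = apply_cnot(state, q, q + 1)
    return [expval_z(state, q) for q in range(N_QUBITS)]

# Six layers of (here arbitrary) trainable angles, as in a 6-layer PQC.
weights = [[0.1 * (l + q) for q in range(N_QUBITS)] for l in range(6)]
print(pqc([0.3, 1.2, 0.7, 2.0], weights))
```

The four expectation values play the role of quantum-enhanced features that a classical head could consume; in a real hybrid model the weights would be trained by gradient descent.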

Pathformer: a hierarchical vision transformer for pan-cancer grade classification, survival prediction, and biomarker status inference from whole-slide histopathology images

2026-06-06T10:37:48+00:00

Computational pathology is a key application area of artificial intelligence (AI), assisting with the analysis of large and complex whole slide images (WSIs) that are typically challenging for pathologists to assess by eye. Current deep learning approaches to WSI analysis, however, still have a number of limitations, such as the inability to process gigapixel WSIs as a whole, difficulty accounting for the spatial context of WSI patches, and reliance on a single model optimized for one clinical goal. This research addresses these problems by presenting PathFormer, a hierarchical vision Transformer specifically tailored for efficient and interpretable WSI analysis. PathFormer features a windowed self-attention mechanism with 32 non-overlapping small patches in lower layers and global attention in higher layers, with computational complexity O(N log N). A gated attention-based multiple instance learning (MIL) aggregator provides a slide-level representation for variable-length patch sequences (from 8,000 to 25,000 patches per slide). A total of 4,312 WSIs from The Cancer Genome Atlas (TCGA), spanning seven cancer types, were used to train and validate the model, and 1,247 WSIs from CPTAC and institutional cohorts were used for external validation. PathFormer showed strong performance on multiple clinical tasks. It obtained a mean AUROC of 0.941 for cancer grade classification, significantly higher than the mean AUROC of 0.921 achieved by TransMIL (p < 0.01). For survival prediction, it had a mean C-index between 0.774 and 0.812, superior to SurvTRACE (0.748). Trained on the same data, the model also achieved an AUROC of 0.924 for MSI-H status prediction and 0.938 for IDH1 mutation inference. In addition, 79% of the activation maps produced by the model correlated with pathologist annotations, indicating good interpretability.
In summary, PathFormer offers a single interpretable and convenient framework for computational pathology.
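
A gated attention-based MIL aggregator of the kind mentioned above can be sketched as follows. This uses a commonly used gated-attention pooling formulation (a tanh branch multiplied by a sigmoid gate, then a softmax over patches); whether PathFormer uses exactly this form is an assumption, and the parameter names V, U, and w are hypothetical.

```python
import math

def gated_attention_pool(patches, V, U, w):
    """Pool a variable-length list of patch embeddings into one slide vector.
    score_k = w . (tanh(V h_k) * sigmoid(U h_k)); attention = softmax(scores)."""
    scores = []
    for h in patches:
        tanh_part = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        gate_part = [1.0 / (1.0 + math.exp(-sum(u * x for u, x in zip(row, h))))
                     for row in U]
        scores.append(sum(wj * t * g for wj, t, g in zip(w, tanh_part, gate_part)))
    top = max(scores)                         # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    dim = len(patches[0])
    slide = [sum(a * h[d] for a, h in zip(attention, patches)) for d in range(dim)]
    return slide, attention

# Example: three 2-D patch embeddings, one hidden attention unit.
slide, attention = gated_attention_pool(
    [[1.0, 2.0], [3.0, 4.0], [0.5, 0.1]],
    V=[[0.4, -0.2]], U=[[0.1, 0.3]], w=[1.5])
print(slide, attention)
```

Because the attention weights sum to one regardless of the number of patches, the same aggregator handles slides with 8,000 or 25,000 patches, and the weights themselves indicate which patches drove the slide-level prediction.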

SC-VAEGAN: spectral-constrained variational autoencoder with generative adversarial networks for robust unsupervised deep clustering with density-aware latent representations

2026-06-08T06:10:33+00:00

This paper introduces SC-VAEGAN, a principled and empirically effective deep unsupervised clustering framework that integrates variational generative modeling, adversarial latent space regularization, spectral graph topological constraints, UMAP-guided initialization, and contrastive auxiliary learning within a single end-to-end learnable objective. SC-VAEGAN is statistically validated on five heterogeneous datasets and extensively ablated, and it improves on state-of-the-art results by 5.0–6.0% over the best previous method on primary metrics. Beyond their application to this particular problem, the theoretical contributions provide a general framework for future work on topology-preserving generative clustering. Although SC-VAEGAN has shown good empirical results, it has some limitations. Constructing the k-NN graph costs O(B²·d_z) per batch, where d_z is the dimensionality of the data; approximate nearest-neighbor methods can alleviate this computational burden. The number of clusters K is a hyperparameter that must be specified in advance; automatic determination of the cluster number through non-parametric methods is left to future work. Possible future directions include: (i) incorporating hyperbolic geometry for hierarchically structured data; (ii) extending to federated clustering with guarantees on differential privacy; and (iii) extending to dynamic cluster number estimation using sequential hypothesis testing on eigenspectrum gaps.
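
The O(B²·d_z) cost of the per-batch k-NN graph follows directly from a brute-force construction, sketched below. This is an illustration only; the function name is hypothetical and squared Euclidean distance is assumed as the metric.

```python
def knn_graph(Z, k):
    """Brute-force k-NN graph for one batch of vectors Z (B rows, d_z columns).
    The nested loops evaluate B * (B - 1) distances of d_z terms each, which
    is where the O(B^2 * d_z) cost per batch comes from."""
    B = len(Z)
    graph = []
    for i in range(B):
        distances = []
        for j in range(B):
            if i != j:
                d = sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))
                distances.append((d, j))
        distances.sort()
        graph.append([j for _, j in distances[:k]])
    return graph

# Example: neighbours of three 1-D points.
print(knn_graph([[0.0], [1.0], [10.0]], k=1))
```

Doubling the batch size roughly quadruples the number of distance evaluations, which is why approximate nearest-neighbor search becomes attractive for large batches.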

Evaluating model performance and prediction accuracy of fuzzy logic and Bayesian networks with business expert inputs

2026-06-23T11:36:31+00:00

This paper evaluates and compares two expert-in-the-loop decision-support paradigms, Fuzzy Logic (FL) and Bayesian Networks (BN), for barangay-level business risk assessment using the BizLocator Analytics dataset of Cauayan City, Philippines. Local governments need transparent, data-driven tools that can operate under uncertainty, sparse data, and evolving economic conditions. FL and BN are both well-established approaches for modeling uncertainty, yet they are rarely examined side by side on the same dataset with the same expert knowledge. To address this gap, the study develops parallel FL and BN models grounded in identical features and informed by the same pool of business experts. The FL model uses expert-defined triangular and trapezoidal membership functions, together with a compact set of IF–THEN rules that encode linguistic concepts such as “Low Compliance,” “Vulnerable Barangay,” and “High Risk.” The BN model encodes expert-elicited causal relationships as a directed acyclic graph and learns conditional probability tables from data under Dirichlet priors. A unified preprocessing pipeline is applied, and nested stratified cross-validation is used to avoid optimistic bias and to support paired statistical tests. Both models are evaluated on discrimination (Accuracy, F1-score, ROC–AUC, PR–AUC) and probabilistic quality (Brier score, Expected Calibration Error, reliability diagrams). Results show that BN achieves slightly higher discrimination and notably better calibration, while FL offers superior case-level interpretability through rule and membership visualizations. Expert validation confirms that most BN edges are causally plausible and that FL rules cover the majority of decisions. The findings suggest that a hybrid deployment using BN as the calibrated scoring backbone and FL as an explanation and policy-communication layer can provide accurate, transparent, and actionable decision support for local business risk governance and long-term planning.
Overall, the study demonstrates how expert-guided artificial intelligence can strengthen evidence-based regulation while preserving human oversight and accountability in practice across diverse barangays.
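
The triangular and trapezoidal membership functions used by the FL model, and the Brier score used to assess probabilistic quality, have simple closed forms. The sketch below is illustrative: the function names and the breakpoints in the example are assumptions, not the expert-defined values from the study.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c],
    falls on [c, d] (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and the 0/1
    outcome; lower is better, 0 is perfect."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# Example: degree to which a compliance score of 35 counts as "low",
# using made-up breakpoints on a 0-100 scale.
print(trapezoidal(35, -1, 0, 30, 50))
```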

Cotton care: YOLO-powered cotton leaf disease detection

2026-07-08T10:25:29+00:00

Early detection of plant diseases is an important factor in maintaining crop health and enhancing agricultural productivity. In cotton cultivation, early observation of disease symptoms followed by accurate treatment can greatly reduce yield loss and avoid unnecessary pesticide use. Recent developments in deep learning and computer vision have made automated disease detection from leaf images scalable and applicable in real-world agricultural practice. In this paper, we introduce a cotton disease detection technique based on four YOLO (You Only Look Once) object detection models: YOLOv5, YOLOv6, YOLOv8, and the latest YOLOv11. These models are trained on a generated dataset of annotated cotton leaf images capturing multiple symptoms such as leaf enation, sooty mold, and various curly forms, as well as healthy leaves. The purpose is to accurately classify and localize disease-affected patches so as to enable real-time decisions in precision farming. Among the tested models, YOLOv11 showed the best performance, with 98.2% precision, 99.3% recall, 99.6% mAP@0.5, and 0.78 mAP@0.5:0.95. YOLOv8, YOLOv6, and YOLOv5 also performed well. The results also emphasize the significance of preprocessing in real-world applications for making the model robust to different lighting conditions. The results indicate that the proposed method can serve as an effective tool for automated cotton disease prediction and integrated pest management.
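
The mAP@0.5 and mAP@0.5:0.95 figures above rest on the intersection-over-union (IoU) between predicted and ground-truth boxes. A minimal sketch of that building block follows; the (x1, y1, x2, y2) corner format and the function names are assumptions, and the full mAP computation (ranking by confidence, precision-recall integration) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0

def is_match(pred_box, true_box, threshold=0.5):
    """A prediction counts as a match at mAP@0.5 when IoU >= 0.5;
    mAP@0.5:0.95 repeats this for thresholds 0.50, 0.55, ..., 0.95."""
    return iou(pred_box, true_box) >= threshold

# Example: a predicted lesion box against its annotation.
print(iou((10, 10, 50, 50), (20, 20, 60, 60)))
```

The stricter thresholds in mAP@0.5:0.95 demand tighter localization, which is why that score is typically much lower than mAP@0.5 for the same model.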