Document Type

Original Study

Keywords

Computer Engineering

Abstract

The k-Nearest Neighbors (kNN) algorithm is widely used for classification due to its simplicity and effectiveness. However, its computational cost remains a significant challenge, particularly for embedded systems with limited processing power and memory. To address this issue, we propose the Group-Based Sample Partitioning (k²NN) algorithm, which introduces a two-phase approach to reduce computational complexity while maintaining classification accuracy. In the first phase, the algorithm pre-groups training samples by iteratively selecting anchor points and partitioning their k-nearest neighbors, thereby reducing redundancy in the dataset. In the second phase, the test sample dynamically selects local anchor points, constructing a smaller, more relevant neighborhood for efficient classification. Experimental results on the Breast Cancer Dataset from Kaggle (KGBC) demonstrate that k²NN significantly reduces training and testing iterations while preserving high classification accuracy (95.78%), with a recall of 100%. Compared to exhaustive kNN, our approach substantially reduces distance computations (to 21.79% of those required by exhaustive kNN) without requiring additional storage. While tested on a relatively small dataset, k²NN shows promise for scalable implementation in embedded systems. In additional experiments on datasets ranging from 100 to 30,000 samples, the reduction in computation cost reached 75.5% for the larger datasets. Future work will explore an extended kⁿNN framework, introducing multiple k-parameters for adaptive scaling to high-dimensional datasets while maintaining computational efficiency. https://github.com/AyadMDalloo/K2NN.
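The two-phase idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that): the anchor-selection order, the number of anchors consulted at test time (`n_anchors`), and the tie-breaking behavior are all assumptions made here for the sake of a runnable example.

```python
# Hedged sketch of a two-phase, group-based kNN ("k2NN") as described in
# the abstract. Anchor selection, n_anchors, and tie-breaking are
# assumptions, not details taken from the paper.
import numpy as np

def build_groups(X, k):
    """Phase 1: iteratively pick an anchor from the remaining pool and
    group it with its k nearest remaining neighbors, removing the whole
    group from the pool."""
    remaining = list(range(len(X)))
    anchors, groups = [], []
    while remaining:
        a = remaining[0]  # assumption: first remaining point is the anchor
        rest = np.array(remaining[1:], dtype=int)
        if len(rest) > 0:
            d = np.linalg.norm(X[rest] - X[a], axis=1)
            nn = rest[np.argsort(d)[:k]]
        else:
            nn = np.array([], dtype=int)
        group = [a] + nn.tolist()
        anchors.append(a)
        groups.append(group)
        used = set(group)
        remaining = [i for i in remaining if i not in used]
    return anchors, groups

def classify(x, X, y, anchors, groups, k, n_anchors=2):
    """Phase 2: find the nearest anchors to the test sample, pool their
    groups into a small candidate neighborhood, then run plain kNN with
    majority voting inside that neighborhood only."""
    A = np.array(anchors)
    d = np.linalg.norm(X[A] - x, axis=1)
    chosen = np.argsort(d)[:n_anchors]
    cand = np.array(sorted({i for c in chosen for i in groups[c]}))
    dc = np.linalg.norm(X[cand] - x, axis=1)
    nn = cand[np.argsort(dc)[:k]]
    labels, counts = np.unique(y[nn], return_counts=True)
    return labels[np.argmax(counts)]

# Toy usage: two well-separated clusters.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
anchors, groups = build_groups(X, k=2)
print(classify(np.array([0.5, 0.5]), X, y, anchors, groups, k=3))
```

The cost saving comes from Phase 2 computing distances only to the anchors plus the members of a few selected groups, rather than to every training sample as exhaustive kNN does.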
