Quantum K‐Means algorithm - RPIQuantumComputing/QuantumCircuits GitHub Wiki

Quantum K-Means algorithm is an extension of classical k-means clustering algorithm into quantum mechanics, this algorithm in its classical state was mainly used for clustering data points into k numbers of distinct groups and is used mainly in the fields of data analysis, pattern recognition, and exploratory data mining, the quantum version leverages quantum computing principles to improve and provide computational advantages over its classical counterpart.

First we have to look at and understand how a classical k-means clustering algorithm works as its quantum version is built off of it, the k-means clustering algorithm is an iterative algorithm that aims to minimize the intra-cluster variance, or the representation of the sum of the distance between the data points and their assigned cluster centroids, this is similar to another algorithm, the support vector machines that also takes into account the distance between data points and their assigned hyperplane which also segregates the data sets. A step by step representation of how a k-means cluster algorithm works begins with determining the number of clusters or k, this can be done through user inputs, after getting the number of clusters, the algorithm will randomly assign k data points as a centroid and will represent as the center of the initial cluster, then each data point will be assigned to a cluster that is closest to it, this step is usually done with the Euclidean distance and the mean distance will also be calculated after this step ends to update the centroids of each cluster to better fit the data. The cluster assignment and the updating will be repeated until convergence occurs, basically when the centroids no longer change significantly or when a maximum number of iterations is reached, after convergence, the algorithm will output the final cluster result where each data point is assigned a cluster based on optimal proximity.

Quantum k-means algorithm, however, differs from classical k-means clustering algorithm, by tapping into quantum mechanics, quantum k-means algorithm is able to leverage quantum properties to become more efficient at going through data and training artificial intelligence. In the comparison between how a quantum k-means algorithm and its classical counterpart functions, there actually aren't too many differences. First in quantum k-means algorithm, the data are encoded into quantum states using a technique called quantum state preparation, each state represents each data point in a higher dimensional quantum space, the algorithm then applies techniques to manipulate these states to acquire the desired results. One of the most important components used here is the quantum distance calculation which when compared to the classic algorithm is the step where the distance is calculated, here quantum states are manipulated to calculate the distance between each data point and each centroid in a parallel and potentially more efficient manner. The algorithm starts by randomly picking quantum states as initial centroids and then quantum operations stated before are performed in order to assign data points to the nearest centroids and update the centroids based on the mean of the assigned data points, these steps are repeated until once again convergence is reached.

Quantum k-means clustering has many advantages, but one of the most prominent would be its ability to explore more data dimensions simultaneously due to the inherent parallelism in quantum computations as well as the ability to reach higher quantum dimensions thus being able to handle larger sets of inputted data at a faster and better accuracy than its classical counterparts. Overall, quantum k-means algorithm explores quantum computing principles to enhance the efficiency and effectiveness of the classical k-means clustering algorithm.