# InsightsResults
This page documents the insights and results we gain from experimenting with different ML techniques. It is split into the four stages of the ML pipeline: Data Preparation, Feature Engineering, Clustering, and Classification.
- Template
- Data Preparation
- Feature Engineering
- Clustering
  - Rot. Quantum KMeans highly depends on initial centroids
  - The number of qubits is not essential for Rot. Quantum KMeans
  - Rot. Quantum KMeans only uses a handful of iterations
  - Rot. Quantum KMeans does not need many circuit shots
  - Rot. Quantum KMeans performs better on the QASM than on the statevector simulator
  - Rot. Quantum KMeans centroids do not need to be standardized/normalized
  - Classical KMeans performs better than rot. QKMeans with the same hyper-parameters
  - Rot. Quantum KMeans performs worse when choosing data points as initial centroids
- Classification
## Template

In order to describe the results in a consistent way, each experiment should follow this template:

- A short and catchy name of the result or insight
- An explanation of how the experiment was prepared
- What has been observed in the experiment, i.e. the outcome
- How the observations can be explained; any plausible explanation is welcome
## Clustering

### Rot. Quantum KMeans highly depends on initial centroids

When performing clustering with the rot. Quantum KMeans algorithms, one often ends up with pictures like the left one below:
Obviously, from a human point of view, we would expect the right picture, i.e. a left and a right cluster. This is also the expected outcome when looking at the selected dataset with 40 costumes. This behavior can be seen for all rot. Quantum KMeans algorithms. All these algorithms have one thing in common: they use angles to calculate the distance between data points. More precisely, the data points are mapped onto a unit sphere and the angle is calculated with respect to a so-called "base-vector". In the implementation we always use the x-axis as base-vector. Taking this into consideration, one might argue that the picture on the left side is influenced by this base-vector.
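For illustration, the angle computation with respect to such a base-vector could look like the following sketch (plain NumPy; the function name is our own and not taken from the qhana code):

```python
import numpy as np

def angle_to_base_vector(point, base_vector=np.array([1.0, 0.0])):
    """Map a 2D data point onto the unit circle and return its angle
    with respect to the given (unit-length) base-vector, here the x-axis."""
    unit_point = point / np.linalg.norm(point)
    # clip guards against rounding errors slightly outside [-1, 1]
    return np.arccos(np.clip(np.dot(unit_point, base_vector), -1.0, 1.0))
```

Passing `np.array([0.0, 1.0])` as base-vector reproduces the y-axis variant discussed below.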
We observe that changing the base-vector to the y-axis produces pictures like the left one more often. However, even when using the x-axis as base-vector, we sometimes end up with the same outcome.
Taking the quantum circuits of the algorithms into account, we can see that rotations around the x-axis are used. Hence, choosing the x-axis as base-vector is required, and it is therefore not a hyper-parameter. Because of the randomness of the outcome, we conclude that the initial centroids are most likely responsible for pictures like the left one.
### The number of qubits is not essential for Rot. Quantum KMeans

Using datasets 25 and 40 as well as all three kinds of rot. Quantum KMeans algorithms, we keep all parameters at their defaults, except the number of qubits, which is increased from 2 to 16.
No significant difference in the clustering quality can be found.
The NegativeRotationQuantumKMeans algorithm uses only single, unentangled qubits for its computation; even if more qubits are available, they are not entangled. The StatePreparationQuantumKMeans and DestructiveInterferenceQuantumKMeans algorithms only entangle qubits pairwise. Hence, it does not matter whether these qubits are used in parallel or sequentially.
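To make the single-qubit structure concrete, here is a minimal sketch of such a rotation-based distance test, assuming a pre-1.0 Qiskit with the Aer provider; the function name and encoding are illustrative, not the exact qhana implementation:

```python
from qiskit import Aer, QuantumCircuit, execute

def rotation_distance_test(theta_point, theta_centroid, shots=100):
    """Estimate the closeness of two angles with a single qubit:
    rotate by the point's angle, rotate back by the centroid's angle,
    and measure. P(|0>) = cos^2((theta_point - theta_centroid) / 2),
    which approaches 1 when the angles agree."""
    qc = QuantumCircuit(1, 1)
    qc.ry(theta_point, 0)
    qc.ry(-theta_centroid, 0)
    qc.measure(0, 0)
    backend = Aer.get_backend("qasm_simulator")
    counts = execute(qc, backend, shots=shots).result().get_counts()
    return counts.get("0", 0) / shots
```

Since each test only ever touches one qubit, running several such circuits on separate qubits in parallel or one after another yields the same statistics.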
### Rot. Quantum KMeans only uses a handful of iterations

Using datasets 25 and 40 as well as all three kinds of rot. Quantum KMeans algorithms, we keep all parameters at their defaults, except the maximum number of iterations, which is increased from 10 to 20.
No significant difference in the clustering quality can be found.
A look at the logging output shows that the algorithms only need 2 to 5 iterations to converge. Hence, increasing the maximum number of iterations has no effect on the outcome.
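For illustration, a percentage-based convergence criterion of the kind used by the quantum versions (see the comparison with classical KMeans further below) could look like this; the threshold value is an assumption:

```python
def has_converged(old_labels, new_labels, threshold=0.02):
    """Stop when fewer than `threshold` (e.g. 2%) of the data points
    changed their cluster assignment between two iterations."""
    changed = sum(o != n for o, n in zip(old_labels, new_labels))
    return changed / len(new_labels) < threshold
```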
### Rot. Quantum KMeans does not need many circuit shots

Using datasets 25 and 40 as well as all three kinds of rot. Quantum KMeans algorithms, we keep all parameters at their defaults, except the number of shots per circuit, which is increased from 100 to 8192.
No significant difference in the clustering quality can be found.
This result is not intuitive, since increasing the shots per circuit always increases the statistical confidence. Further investigation is needed to understand this behavior.
### Rot. Quantum KMeans performs better on the QASM than on the statevector simulator

Using datasets 25 and 40 as well as all three kinds of rot. Quantum KMeans algorithms, we keep all parameters at their defaults, except the quantum backend, which is set to AER_QASM_Simulator and AER_Statevector_Simulator, respectively.
The outcome when using the statevector simulator shows more noise.
In order to explain this observation, the theory behind the statevector simulator needs to be studied in more detail.
### Rot. Quantum KMeans centroids do not need to be standardized/normalized

Using datasets 25 and 40 as well as all three kinds of rot. Quantum KMeans algorithms, we keep all parameters at their defaults. In the code, we use three different approaches:
- Standardize and normalize the centroids in every iteration
- Standardize the centroids in every iteration
- Take the centroids just as they are
The outcome for all three different approaches does not differ significantly.
Because all rot. Quantum KMeans algorithms rely on angle differences, the centroids are implicitly mapped onto the unit sphere, so a separate normalization step is not necessary. However, disabling the standardization should have some influence on the outcome, since the location of the centroids changes. One possible explanation lies in the fact that the datasets were chosen to show two distinct clusters. In this setting, the standardization could shift the centroids even closer to the real centers of the two clusters in order to fulfil zero mean and unit variance.
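For reference, the standardization and normalization steps could be sketched as follows (plain NumPy; our own illustration, not the qhana code):

```python
import numpy as np

def standardize(centroids):
    # zero mean and unit variance per dimension
    return (centroids - centroids.mean(axis=0)) / centroids.std(axis=0)

def normalize(centroids):
    # project each centroid onto the unit circle/sphere
    return centroids / np.linalg.norm(centroids, axis=1, keepdims=True)

# approach 1: normalize(standardize(centroids))
# approach 2: standardize(centroids)
# approach 3: centroids, just as they are
```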
### Classical KMeans performs better than rot. QKMeans with the same hyper-parameters

We observe that the classical KMeans implementation in sklearn always performs better than, or about the same as, the rot. Quantum KMeans implementations. Because the two approaches have different hyper-parameters, we adjust the ones of the classical approach and check whether its performance remains better. In concrete terms, we set
- "init" to "random" in the code, i.e. we do not use the KMeans++ approach but choose random centroids out of the data
- "n_init" to 1, i.e. no multi-start
- "algorithm" to "full", i.e. we perform a classical KMeans and no optimized version for well defined clusters
and compare the results with the rot. Quantum KMeans Algorithms (on perfectly simulated qubits). See hyper-parameters KMeans for reference.
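In code, the adjusted classical configuration looks roughly like this (parameter names as in scikit-learn's KMeans; note that recent scikit-learn versions rename algorithm="full" to "lloyd"):

```python
from sklearn.cluster import KMeans

kmeans = KMeans(
    n_clusters=2,
    init="random",     # random data points instead of the KMeans++ seeding
    n_init=1,          # a single run, no multi-start
    algorithm="full",  # plain Lloyd iterations, no optimized variant
)
labels = kmeans.fit_predict(points)  # `points`: the 2D MDS embedding (assumed)
```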
The left figure shows the output of the classical KMeans approach for all the different configurations (and mixes of them). The output always looks the same, i.e. the two clusters of the 40 costume subset are identified perfectly. The output of the quantum KMeans often shows 2-3 data points that are assigned to the wrong cluster.
After adjusting the hyper-parameters, both approaches should work the same. However, this statement only holds for the same input data. With the chosen set of hyper-parameters, the only differences between the classical and the quantum version are the data preparation and the convergence criteria. While the classical approach operates on the Euclidean space obtained from the MDS, the quantum versions project all data points onto the 2D unit sphere and perform a standardization. Additionally, the quantum versions generate random points as initial centroids, while the classical version chooses existing data points. Moreover, the quantum versions use the percentage of data points that change their cluster from one iteration to the next as convergence criterion, while the classical version uses a distance-based measure of the centroid movement.
The right figure above shows a representative output of the quantum KMeans. It is worth mentioning that the wrongly colored data points always lie at either the bottom or the top. Because the clustering itself is done on a circle, the top and bottom areas of the 2D figures are curved in the space where the clustering is performed. The two wrong data points in the right figure show a relatively large distance to the red cluster; on the circle, however, they likely lie close to each other. This indicates that the data preprocessing could indeed be responsible for the bad performance of the rotational quantum KMeans algorithms.
One possible reason why the rot. Quantum KMeans algorithms perform worse than the classical version could be the data preprocessing. Hence, in this experiment we perform the same preprocessing step for the classical approach, i.e. we project the data onto the 2D unit sphere before applying classical KMeans. This means we standardize and normalize the data.
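A minimal sketch of this setup, assuming the MDS output is available as a NumPy array `points` (our illustration, not the qhana code):

```python
import numpy as np
from sklearn.cluster import KMeans

# standardize (zero mean, unit variance), then project onto the unit circle
standardized = (points - points.mean(axis=0)) / points.std(axis=0)
prepared = standardized / np.linalg.norm(standardized, axis=1, keepdims=True)

labels = KMeans(n_clusters=2, init="random", n_init=1).fit_predict(prepared)
```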
Nothing changed: the performance of the classical KMeans algorithm remains good. Using the 40 subset and 20 runs, every output identified the two expected clusters perfectly.
No explanation so far.
### Rot. Quantum KMeans performs worse when choosing data points as initial centroids

The classical KMeans algorithm uses random data points as initial centroids, while the rot. Quantum KMeans algorithms use random points on the 2D unit sphere. In this experiment, we change the centroid initialization of the rot. QKMeans algorithms to the same approach as the classical KMeans algorithm.
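The two initialization strategies compared here could be sketched as follows (illustrative NumPy code, not the qhana implementation):

```python
import numpy as np

rng = np.random.default_rng()

def init_from_data(points, k=2):
    """Classical KMeans style: pick k existing data points as initial centroids."""
    return points[rng.choice(len(points), size=k, replace=False)]

def init_on_unit_circle(k=2):
    """Rot. QKMeans default: k random points on the 2D unit circle."""
    angles = rng.uniform(0.0, 2.0 * np.pi, size=k)
    return np.column_stack((np.cos(angles), np.sin(angles)))
```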
The performance becomes worse compared to before. One often, but not always, ends up with only a single cluster; see the figure above.
No explanation so far.