Cloud detection from multi- and hyperspectral satellite images using quantum algorithms
Abstract
Cloud detection is one of the necessary yet laborious tasks in the data reduction chain of Earth satellite imagery. In particular, it involves processing the vast amount of data generated by imaging satellites, e.g., within the European Copernicus constellation. Hence, developing new, potentially more efficient computational methods for this task is of both research and practical importance. The purpose of this project is to design and thoroughly investigate the utility of quantum algorithms in cloud detection. Specifically, we focus on the Quantum Support Vector Machine (QSVM), which offers an exponential speedup of supervised learning with respect to its classical counterpart. QSVM will be exploited to classify pixels from satellite images into cloud and non-cloud classes. While we will initially exploit multispectral images, an extension to hyperspectral data will also be explored. Our pilot studies indicate that the approach can be effectively implemented on the available quantum computers. To thoroughly verify our approaches on satellite data, we will utilize emulators of quantum computers and actual superconducting quantum computers provided in cloud services, and we will perform quantitative, qualitative, and statistical analysis of the experimental results. Finally, we will compare QSVM with deep learning cloud detection techniques.
Satellite imaging plays an increasing role in various aspects of human activity. The spectrum of applications ranges from cartography through meteorology, ecology, and agronomy to security. Consequently, dozens of terabytes of raw imaging data are generated daily by satellite constellations, such as the one built within the European Copernicus Programme. The data include both optical (including near-infrared, NIR) and synthetic-aperture radar (SAR) imaging. In the project, we will focus on the multi- and hyperspectral optical data.
An advantage of such imagery is the possibility to extract information about the complex physical properties of the reflecting surface, as we acquire multiple (possibly contiguous) spectral bands that effectively capture fine-grained characteristics of the scanned materials. On the other hand, the large volume of multi- and hyperspectral images makes them difficult to transfer, store, and ultimately analyze. Hence, reducing their size is a critical issue in real-life applications. An essential step in the data reduction processing chain is the identification of clouds, as cloudy regions may be pruned from further processing. This issue is especially relevant for high-quality imaging of the surface of the Earth. Because the reduction is conducted on the vast amount of raw data, the efficiency of the process is a key factor.
The project aims to propose and thoroughly investigate the utility of quantum algorithms for cloud detection from satellite images. While there are various possible approaches toward this task, we will focus on Quantum Support Vector Machines (QSVMs) [1], as it is possible to implement them on near-term quantum computers. Also, the computational efficiency of this approach has already been proven [1]. QSVM is a quantum version of a standard SVM, which is one of the most widely researched supervised learners [2]. Importantly, QSVM was implemented on a molecular quantum computer as early as 2015 [3].
The QSVM algorithm is composed of two main steps. The first one concerns calculating the so-called kernel function, which is a specific function of the scalar products between data vectors. The second step utilizes the kernel to solve a system of equations determining the position of the decision hyperplane. In the classical SVM, the computational complexity is polynomial in both the number of data vectors (M) and their dimension (N): O(M^2(N+M)) [1]. It has been proven that for QSVM, the computational complexity is reduced to a logarithmic one in both M and N: O(log(NM)) [1]. This is an example of the so-called exponential speedup.
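For reference, the system of equations in the second step can be written out explicitly. The form below assumes the least-squares SVM formulation used in [1]; the symbols K (kernel matrix), \gamma (regularization hyperparameter), \vec{y} (vector of class labels), b (bias), and \vec{\alpha} (weights of the training vectors) are notation introduced here for illustration.

```latex
\begin{pmatrix} 0 & \vec{1}^{\,T} \\ \vec{1} & K + \gamma^{-1} I \end{pmatrix}
\begin{pmatrix} b \\ \vec{\alpha} \end{pmatrix}
=
\begin{pmatrix} 0 \\ \vec{y} \end{pmatrix},
\qquad
K_{jk} = k(\vec{x}_j, \vec{x}_k), \quad j,k = 1,\dots,M,
\\[6pt]
y(\vec{x}) = \operatorname{sgn}\!\Big( \sum_{j=1}^{M} \alpha_j \, k(\vec{x}_j, \vec{x}) + b \Big).
```

The matrix on the left-hand side has dimension (M+1)x(M+1), and a new data vector \vec{x} is classified by the sign of the expression in the second line.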
Furthermore, the first step of QSVM can be performed with a quantum register whose number of logical qubits equals N (the dimension of the data vector). Therefore, in light of the present and near-term quantum computers, the method is suitable for multispectral data with several or a dozen wavebands. Moreover, by applying feature selection or extraction techniques, such as principal component analysis (PCA), QSVM can be made suitable for hyperspectral data [4]. Ultimately, future quantum computers operating on hundreds of qubits may allow for the direct application of QSVM to hyperspectral satellite data.
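To make the "one qubit per feature" picture concrete, the sketch below simulates a kernel evaluation classically with NumPy. It assumes a simple angle embedding (each feature rotates its own qubit with an RY gate) and takes the kernel to be the fidelity between two embedded states. Both choices are illustrative assumptions and not necessarily the embedding used in the pilot studies; the function names and the sample pixel values are hypothetical.

```python
import numpy as np

def embed(x):
    """Statevector of len(x) qubits, with qubit i prepared as RY(x[i])|0>."""
    state = np.array([1.0])
    for angle in x:
        qubit = np.array([np.cos(angle / 2), np.sin(angle / 2)])
        state = np.kron(state, qubit)  # tensor product adds one qubit
    return state

def quantum_kernel(x, z):
    """Fidelity |<phi(x)|phi(z)>|^2 between two embedded data vectors."""
    return abs(np.dot(embed(x), embed(z))) ** 2

def kernel_matrix(X):
    """Kernel matrix for all pairs of rows of X."""
    M = len(X)
    return np.array([[quantum_kernel(X[i], X[j]) for j in range(M)]
                     for i in range(M)])

# Three made-up pixels with four normalized band intensities each (N = 4).
pixels = np.array([[0.1, 0.4, 0.3, 0.9],
                   [0.2, 0.5, 0.3, 0.8],
                   [0.9, 0.1, 0.7, 0.2]])
angles = np.pi * pixels      # map intensities in [0, 1] to rotation angles
K = kernel_matrix(angles)    # 3x3 matrix, ones on the diagonal
```

For a product embedding like this one, the kernel factorizes into a product of cos^2((x_i - z_i)/2) terms and is therefore easy to compute classically; the sketch only illustrates the data flow. Embeddings with entangling gates, as in [8], do not factorize in this way.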
Nevertheless, the second step of QSVM, equivalent to finding the inverse of an (M+1)x(M+1) matrix, requires a number of qubits that scales linearly with the number of data vectors. With present and near-term quantum computers, it can therefore be applied to small training data sets only. For this reason, the second step of the QSVM algorithm will initially be conducted classically. In consequence, the computational advantage of the QSVM will enter only via the evaluation of the kernel. However, we plan to exploit our training set selection techniques in this context in order to drastically decrease the cardinality of the training set while maintaining the most important data points that are likely to be selected as support vectors (i.e., the vectors that determine the position of the decision hyperplane) [2,5-7]. In the next step, an attempt to implement the fully quantum QSVM, including the quantum matrix inversion step, will be made. We also plan to investigate another method, the variational quantum classifier (VQC) [8], which uses a variational quantum circuit to classify a training set in direct analogy to conventional SVMs. In this method, the first step is the same as in QSVM, but in the second step, a parametrized circuit is used instead of matrix inversion. The training consists of minimizing a certain cost function with respect to the circuit parameters. The number of qubits in VQC is independent of the number of training data.
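The hybrid scheme described above, with the kernel obtained from a quantum device and the linear system solved classically, could look roughly as follows. This is a minimal sketch assuming the least-squares SVM formulation of [1]. The function names, the regularization value gamma, and the toy data are hypothetical, and a classical scalar-product kernel stands in for the matrix that would be estimated on a quantum computer or emulator.

```python
import numpy as np

def train_from_kernel(K, y, gamma=10.0):
    """Classical step: solve the (M+1)x(M+1) system for bias b and weights alpha."""
    M = K.shape[0]
    F = np.zeros((M + 1, M + 1))
    F[0, 1:] = 1.0                      # first row: (0, 1, ..., 1)
    F[1:, 0] = 1.0                      # first column: (0, 1, ..., 1)^T
    F[1:, 1:] = K + np.eye(M) / gamma   # kernel block with regularization
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(F, rhs)
    return solution[0], solution[1:]

def classify(K_new, b, alpha):
    """Labels for new points; K_new[i, j] = kernel(new point i, training point j)."""
    return np.sign(K_new @ alpha + b)

# Toy training set: two "cloud" (+1) and two "non-cloud" (-1) pixels, N = 2.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
y = np.array([1.0, 1.0, -1.0, -1.0])

K = X @ X.T                    # placeholder for a quantum-estimated kernel matrix
b, alpha = train_from_kernel(K, y)
labels = classify(K, b, alpha) # predictions on the training pixels themselves
```

Because only the kernel matrix enters the classical step, the same code applies unchanged when K is replaced by values measured on quantum hardware.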
In the context of the cloud detection procedure, the data vectors are pixels of satellite images. The dimension of the vectors (N) is either equal to the number of considered wavebands or corresponds to the dimension of the reduced feature space (obtained, e.g., by applying PCA). The components of the raw data vectors are (normalized) intensities at the given wavebands. In this project, the data vectors will be classified as clouds and non-clouds. However, one has to keep in mind that in practice, the distinction is not sharp, since a pixel can be partially covered by clouds, with the covered fraction varying from pixel to pixel. Therefore, the binary classification should be understood with the caveat that pixels labeled as non-cloud may still contain a marginal contribution of clouds. Finally, it is possible to extend QSVMs to multi-class problems through, e.g., a one-vs-rest or one-vs-one approach. Note that the approaches developed within this project will be directly applicable to other classification tasks in, e.g., precision agriculture or environmental monitoring, provided that appropriate training data are available.
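The construction of data vectors from an image can be sketched as follows. The per-band min-max normalization and the 0.5 threshold on the cloud fraction are assumptions made for illustration only, since the text does not specify either; the function names and the tiny synthetic image are hypothetical.

```python
import numpy as np

def image_to_vectors(image):
    """Flatten an (H, W, N) image cube into an (H*W, N) matrix of pixel vectors
    and rescale every band to the range [0, 1]."""
    h, w, n = image.shape
    vectors = image.reshape(h * w, n).astype(float)
    lowest = vectors.min(axis=0)
    span = vectors.max(axis=0) - lowest
    span[span == 0] = 1.0   # avoid division by zero for constant bands
    return (vectors - lowest) / span

def fraction_to_labels(cloud_fraction, threshold=0.5):
    """Binary labels from a per-pixel cloud fraction: +1 cloud, -1 non-cloud."""
    return np.where(cloud_fraction.ravel() >= threshold, 1, -1)

# Synthetic 2x3 image with N = 4 bands and a made-up cloud-fraction map.
image = np.arange(24).reshape(2, 3, 4)
fraction = np.array([[0.0, 0.4, 0.5],
                     [0.9, 0.1, 1.0]])

vectors = image_to_vectors(image)      # M = 6 data vectors of dimension N = 4
labels = fraction_to_labels(fraction)  # one label per pixel, in the same order
```

With a threshold of this kind, a pixel labeled as non-cloud can still have a nonzero cloud fraction, which is the caveat mentioned above.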
References:
[1] P. Rebentrost, M. Mohseni, and S. Lloyd, Quantum Support Vector Machine for Big Data Classification, Phys. Rev. Lett. 113, 130503 (2014).
[2] J. Nalepa, M. Kawulok, Selecting training sets for support vector machines: a review, Artif. Intell. Rev. 52(2): 857-900 (2019).
[3] Z. Li, X. Liu, N. Xu, and J. Du, Experimental Realization of a Quantum Support Vector Machine, Phys. Rev. Lett. 114, 140504 (2015).
[4] P. Ribalta et int J. Nalepa, Hyperspectral Band Selection Using Attention-Based Convolutional Neural Networks, IEEE Access 8: 42384-42403 (2020).
[5] J. Nalepa, M. Kawulok, Adaptive memetic algorithm enhanced with data geometry analysis to select training data for SVMs, Neurocomputing 185: 113-132 (2016).
[6] J. Nalepa, M. Kawulok, A memetic algorithm to select training data for support vector machines, GECCO 2014: 573-580.
[7] J. Nalepa et al., Memetic Evolution of Training Sets with Adaptive Radial Basis Kernels for Support Vector Machines, Proc. IEEE ICPR 2021, 2021 (in press).
[8] V. Havlíček, A. D. Córcoles, K. Temme, et al., Supervised learning with quantum-enhanced feature spaces, Nature 567, 209–212 (2019).
To preliminarily verify the proposed approach, we performed pilot studies with the multispectral data from the Landsat 8 satellite. The images are characterized by a 30 m spatial resolution.
In the supplementary materials (available under this link), a Python notebook containing the results of the studies can be found. The studies considered a sample of 100 pixels, each represented by a four-dimensional vector. Each pixel was labeled with one of the two classes introduced above (cloud or non-cloud). The four wavebands under consideration are B (450-515 nm), G (520-600 nm), R (630-680 nm), and NIR (845-885 nm). Two alternative approaches were considered. In the first, for simplicity, the initial four-dimensional data space was reduced to the relevant two-dimensional (N=2) subspace. In the second, the original four-dimensional data vectors (N=4) were embedded in the quantum circuit directly, which required much more complex quantum circuits. The circuits were evaluated on an emulator of a quantum computer.
Then, 80% of the original data vectors (M=80) were used for training QSVM, and the remaining 20 vectors formed the test set. For the sample data under consideration, QSVM gave up to ~20% advantage in classification with respect to SVM for some kernel functions (for N=2 and N=4). The source of the difference is the form of the embedding of the data vectors in the quantum circuit. However, these conclusions cannot be considered generic and may change for a broader dataset. This will be investigated - quantitatively, qualitatively, and statistically (using appropriate statistical testing) - in the proposed project. Note that some further results have also been presented in the attached file.
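The evaluation protocol described above (an 80/20 split of 100 labeled pixels, followed by a comparison of classification accuracy) can be sketched as below. The helper names, the random seed, and the synthetic data are hypothetical; the notebook from the pilot studies may organize these steps differently.

```python
import random

def select(sequence, indices):
    """Items of `sequence` at the given positions, in the given order."""
    return [sequence[i] for i in indices]

def train_test_split(samples, labels, train_fraction=0.8, seed=0):
    """Shuffle the sample indices reproducibly and split them into two parts."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    train, test = indices[:cut], indices[cut:]
    return (select(samples, train), select(labels, train),
            select(samples, test), select(labels, test))

def accuracy(predicted, expected):
    """Fraction of test vectors whose predicted label matches the reference."""
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)

# 100 synthetic four-band pixels with alternating labels (+1 cloud, -1 non-cloud).
generator = random.Random(1)
samples = [[generator.random() for _ in range(4)] for _ in range(100)]
labels = [1 if i % 2 == 0 else -1 for i in range(100)]

x_train, y_train, x_test, y_test = train_test_split(samples, labels)
```

The same split would be passed to both the quantum and the classical classifier, so that `accuracy` is computed for each of them on identical test vectors.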
Based on the results of the pilot studies, the project will aim to verify the practical implementation of QSVM on multispectral and hyperspectral satellite data. In the project, we will utilize emulators of quantum computers and superconducting quantum computers provided by IBM in cloud services. The computations will be performed with the Qiskit package, operating in the Python environment.
The project is divided into the following research tasks:
- Preparing, documenting, and bundling a benchmark cloud detection dataset containing satellite images acquired using different satellites (at least Landsat and Sentinel-2 images) that will be used in the experimental study. Note that we plan to exploit the data provided by ESA (https://zenodo.org/record/4172871#.X_ThkdhKiUm).
- Implementation of the “hybrid” quantum-classical SVM algorithm for the multispectral satellite data. In this case, the kernel is evaluated with the use of a quantum algorithm, while the matrix inversion part is performed employing classical algorithms.
- Investigation of the role of various quantum embeddings of the data vectors and choices of kernels on the accuracy of classification.
- Elaborating training subsets from the full training sets using evolutionary algorithms (especially hybrid genetic algorithms, also referred to as memetic techniques).
- Full implementation of the QSVM, including the quantum matrix inversion algorithm. A variational approach to the inversion problem, implemented in the VQC algorithm, will also be considered.
- Quantitative, qualitative, and statistical comparison of QSVM and SVM for the datasets under consideration. This concerns both the quality of classification and time efficiency.
- Comparing (quantitatively, qualitatively, and statistically) QSVM with deep learning for cloud detection (especially U-Net-based deep architectures).
Jagiellonian University, KP Labs
jakub.mielczarek@uj.edu.pl
1000027509
The project is proposed in collaboration between Quantum Cosmos Lab (https://quantumcosmos.org/) at the Jagiellonian University and the space company KP Labs (https://www.kplabs.pl/).
Quantum Cosmos Lab (QCL) is an interdisciplinary research team exploring physics at the interface between quantum mechanics and the theory of gravity. The activity of QCL is mainly devoted to fundamental studies within theoretical physics and cosmology. However, QCL links this theoretical research to advanced technologies, such as quantum computing, quantum communication, and space technologies.
KP Labs was established in 2016 by a group of engineers and scientists from the Silesian University of Technology, Gliwice, Poland, who saw an upcoming business potential in applying machine learning algorithms to solve real-life problems. Our mission is to accelerate space exploration through the advancement of autonomous spacecraft operations and robotic technology. The main project of KP Labs is the EU-funded Intuition-1 project. In this space mission (period of implementation: January 2018-December 2023), we aim to observe the Earth using a satellite equipped with a hyperspectral instrument and to process the data on board the satellite using deep convolutional neural networks. Intuition-1 will be a 6U-class satellite in the shape of a cuboid with dimensions of 10x22x36 cm and a weight of approximately 10 kg. In 2018, the company received ESA funding for the HYPERNET project (contract no. 4000123760/18/NL/CBi/fg). The main technical objective of this project was to design and implement an extensive framework for experimental validation of hyperspectral deep learning segmentation algorithms, and to design, implement, and evaluate deep neural networks, alongside a methodology for developing new deep network architectures, with a particular emphasis on deep convolutional networks for segmentation of hyperspectral satellite images. In 2020, the company received ESA funding for BEETLES (contract no. 4000130210/20/I-DT), which is a continuation of HYPERNET. The general technical objective of this project is to implement, evaluate (using modern GPU architectures), improve, and integrate algorithms and techniques for creating robust deep neural networks for effective hyperspectral image segmentation. We were also awarded the DeepSent project by ESA, which aims to improve the resolution of multi-image data (several images of the same scene) using convolutional neural networks.
One of our latest projects is the Antelope project, co-financed by the National Centre for Research and Development. The project aims to develop an innovative onboard computer for nano- and microsatellites (from 1 to 100 kg) with increased reliability of operation in space and deep learning-powered intelligent detection of anomalies in telemetry data.
Members of the research team:
Grzegorz Czelusta - Ph.D. student in theoretical physics at the Jagiellonian University, specializing in research at the interface between quantum mechanics and gravitational physics.
Dr. hab. inż. Michał Kawulok is an Associate Professor at the Silesian University of Technology and a Research Scientist at KP Labs. He has led numerous successfully accomplished projects in industry and academia, including ESA-funded projects on super-resolution reconstruction of satellite images. He has also been the Principal Investigator in two research projects focused on selecting training sets for support vector machines using evolutionary algorithms. Dr. Kawulok has published over 100 papers (see goo.gl/EvyBzo for a full list). Recently, he gave a keynote speech at PhiWeek 2020, and he received the IEEE GRSS Symposium Interactive Session Prize Paper Award for his work presented at IGARSS 2019. His general research interests are concerned with image processing, pattern recognition, and machine learning, with particular attention given to support vector machines, training data selection, super-resolution, linear and non-linear dimensionality reduction.
Dr. hab. Jakub Mielczarek - a theoretical physicist, specializing in quantum gravity. He is currently developing a novel research direction of simulating quantum gravitational systems on quantum computers. Author and co-author of over 50 research articles (https://scholar.google.com/citations?user=jExSKHkAAAAJ&hl=en). He leads the Quantum Cosmos Lab at the Jagiellonian University. Dr. Mielczarek is also involved in numerous activities at the interface of basic science, new technologies, and entrepreneurship. In particular, he leads the Garage of Complexity makerspace at the Jagiellonian University, and he was CEO of the Space Garden space company.
Dr. Jakub Nalepa is currently an Assistant Professor at the Silesian University of Technology and Head of AI at KP Labs. He has been a PI in HYPERNET and BEETLES (ESA-funded projects on hyperspectral image analysis). His research interests encompass machine learning (with special emphasis put on deep learning and support vector machines) and satellite image analysis. So far, Dr. Nalepa has published more than 100 papers in these fields (the list of his publications is available here: https://scholar.google.com/citations?user=kt6EnKcAAAAJ&hl=en) and received several best paper awards, including the IEEE GRSS Symposium Interactive Session Prize Paper Award (2020).
We plan to publish the outcomes of the proposed research in scientific journals and at conferences (e.g., IEEE GRSL, IEEE TGRS, IEEE IGARSS, PhiWeek). No IP issues are expected at this stage. Also, there are no additional constraints on the image data that would be used in the project.
- Jan 7, 2021