Noisy labels often occur in vision datasets, especially when they are obtained from crowdsourcing or Web scraping. We propose a new regularization method, which enables learning robust classifiers in the presence of noisy data. To achieve this goal, we propose a new adversarial regularization scheme based on the Wasserstein distance. Using this distance allows taking into account specific relations between classes by leveraging the geometric properties of the label space. Our Wasserstein Adversarial Regularization (WAR) encodes a selective regularization, which promotes smoothness of the classifier between some classes, while preserving sufficient complexity of the decision boundary between others. We first discuss how and why adversarial regularization can be used in the context of noise and then show the effectiveness of our method on five datasets corrupted with noisy labels: in both benchmarks and real datasets, WAR outperforms the state-of-the-art competitors.
Shrub decline and expansion of wetland vegetation revealed by very high resolution land cover change detection in the Siberian lowland tundra
Rúna Í. Magnússon, Juul Limpens, David Kleijn, and
4 more authors
Vegetation change, permafrost degradation and their interactions affect greenhouse gas fluxes, hydrology and surface energy balance in Arctic ecosystems. The Arctic shows an overall “greening” trend (i.e. increased plant biomass and productivity) attributed to expansion of shrub vegetation. However, Arctic shrub dynamics show strong spatial variability and locally “browning” may be observed. Mechanistic understanding of greening and browning trends is necessary to accurately assess the response of Arctic vegetation to a changing climate. In this context, the Siberian Arctic is an understudied region. Between 2010 and 2019, increased browning (as derived from the MODIS Enhanced Vegetation Index) was observed in the Eastern Siberian Indigirka Lowlands. To support interpretation of local greening and browning dynamics, we quantified changes in land cover and transition probabilities in a representative tundra site in the Indigirka Lowlands using a timeseries of three very high resolution (VHR) (0.5 m) satellite images acquired between 2010 and 2019. Using spatiotemporal Potts model regularization, we substantially reduced classification errors related to optical and phenological inconsistencies in the image material. VHR images show that recent browning was associated with declines in shrub, lichen and tussock vegetation and increases in open water, sedge and especially Sphagnum vegetation. Observed formation and expansion of small open water bodies in shrub dominated vegetation suggests abrupt thaw of ice-rich permafrost. Transitions from open water to sedge and Sphagnum indicate aquatic succession upon disturbance. The overall shift towards open water and wetland vegetation suggests a wetting trend, likely associated with permafrost degradation. Landsat data confirmed widespread expansion of surface water throughout the Indigirka Lowlands. However, the increase in the area of small water bodies observed in VHR data was not visible in Landsat-derived surface water data, which suggests that VHR data is essential for early detection of small-scale disturbances and associated vegetation change in permafrost ecosystems.
2020
A deep learning framework for matching of SAR and optical imagery
Lloyd Haydn Hughes, Diego Marcos, Sylvain Lobry, and
2 more authors
ISPRS Journal of Photogrammetry and Remote Sensing, 2020
SAR and optical imagery provide highly complementary information about observed scenes. A combined use of these two modalities is thus desirable in many data fusion scenarios. However, any data fusion task requires measurements to be accurately aligned. While for both data sources images are usually provided in a georeferenced manner, the geo-localization of optical images is often inaccurate due to propagation of angular measurement errors. Many methods for the matching of homologous image regions exist for both SAR and optical imagery; however, these methods are unsuitable for SAR-optical image matching due to significant geometric and radiometric differences between the two modalities. In this paper, we present a three-step framework for sparse image matching of SAR and optical imagery, whereby each step is encoded by a deep neural network. We first predict regions in each image which are deemed most suitable for matching. A correspondence heatmap is then generated through a multi-scale, feature-space cross-correlation operator. Finally, outliers are removed by classifying the correspondence surface as a positive or negative match. Our experiments show that the proposed approach provides a substantial improvement over previous methods for SAR-optical image matching and can be used to register even large-scale scenes. This opens up the possibility of using both types of data jointly, for example for the improvement of the geo-localization of optical satellite imagery or multi-sensor stereogrammetry.
RSVQA: Visual Question Answering for Remote Sensing Data
Sylvain Lobry, Diego Marcos, Jesse Murray, and
1 more author
IEEE Transactions on Geoscience and Remote Sensing, 2020
This article introduces the task of visual question answering for remote sensing data (RSVQA). Remote sensing images contain a wealth of information, which can be useful for a wide range of tasks, including land cover classification, object counting, or detection. However, most of the available methodologies are task-specific, thus inhibiting generic and easy access to the information contained in remote sensing data. As a consequence, accurate remote sensing product generation still requires expert knowledge. With RSVQA, we propose a system to extract information from remote sensing data that is accessible to every user: we use questions formulated in natural language and use them to interact with the images. With the system, images can be queried to obtain high-level information specific to the image content or relational dependencies between objects visible in the images. Using an automatic method introduced in this article, we built two data sets (using low- and high-resolution data) of image/question/answer triplets. The information required to build the questions and answers is queried from OpenStreetMap (OSM). The data sets can be used to train (when using supervised methods) and evaluate models to solve the RSVQA task. We report the results obtained by applying a model based on convolutional neural networks (CNNs) for the visual part and a recurrent neural network (RNN) for the natural language part of this task. The model is trained on the two data sets, yielding promising results in both cases.
2019
Water Detection in SWOT HR Images Based on Multiple Markov Random Fields
Sylvain Lobry, Loïc Denis, Brent Williams, and
2 more authors
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019
One of the main objectives of the surface water and ocean topography (SWOT) mission, scheduled for launch in 2021, is to measure inland water levels using synthetic aperture radar (SAR) interferometry. A key step toward this objective is to precisely detect water areas. In this article, we present a method to detect water in SWOT images. Water is detected based on the relative brightness of the water and nonwater surfaces. Water brightness varies throughout the swath because of system parameters (i.e., the antenna pattern), as well as the phenomenology such as wind speed and surface roughness. To handle the effects of brightness variability, we propose to model the problem with one Markov random field (MRF) on the binary classification map, and two other MRFs to regularize the estimation of the class parameters (i.e., the land and water background power images). Our experiments show that the proposed method is more robust to the expected variations in SWOT images than traditional approaches.
Half a Percent of Labels is Enough: Efficient Animal Detection in UAV Imagery Using Deep CNNs and Active Learning
Benjamin Kellenberger, Diego Marcos, Sylvain Lobry, and
1 more author
IEEE Transactions on Geoscience and Remote Sensing, 2019
We present an Active Learning (AL) strategy for reusing a deep Convolutional Neural Network (CNN)-based object detector on a new data set. This is of particular interest for wildlife conservation: given a set of images acquired with an Unmanned Aerial Vehicle (UAV) and manually labeled ground truth, our goal is to train an animal detector that can be reused for repeated acquisitions, e.g., in follow-up years. Domain shifts between data sets typically prevent such a direct model application. We thus propose to bridge this gap using AL and introduce a new criterion called Transfer Sampling (TS). TS uses Optimal Transport (OT) to find corresponding regions between the source and the target data sets in the space of CNN activations. The CNN scores in the source data set are used to rank the samples according to their likelihood of being animals, and this ranking is transferred to the target data set. Unlike conventional AL criteria that exploit model uncertainty, TS focuses on very confident samples, thus allowing quick retrieval of true positives in the target data set, where positives are typically extremely rare and difficult to find by visual inspection. We extend TS with a new window cropping strategy that further accelerates sample retrieval. Our experiments show that with both strategies combined, less than half a percent of oracle-provided labels are enough to find almost 80% of the animals in challenging sets of UAV images, beating all baselines by a margin.
2018
Correcting rural building annotations in OpenStreetMap using convolutional neural networks
John E. Vargas-Muñoz, Sylvain Lobry, Alexandre X. Falcão, and
1 more author
ISPRS Journal of Photogrammetry and Remote Sensing, 2018
Rural building mapping is paramount to support demographic studies and plan actions in response to crises that affect those areas. Rural building annotations exist in OpenStreetMap (OSM), but their quality and quantity are not sufficient for training models that can create accurate rural building maps. The problems with these annotations essentially fall into three categories: (i) most commonly, many annotations are geometrically misaligned with the updated imagery; (ii) some annotations do not correspond to buildings in the images (they are misannotations or the buildings have been destroyed); and (iii) some annotations are missing for buildings in the images (the buildings were never annotated or were built between subsequent image acquisitions). First, we propose a method based on Markov Random Field (MRF) to align the buildings with their annotations. The method maximizes the correlation between annotations and a building probability map while enforcing that nearby buildings have similar alignment vectors. Second, the annotations with no evidence in the building probability map are removed. Third, we present a method to detect non-annotated buildings with predefined shapes and add their annotation. The proposed methodology shows considerable improvement in accuracy of the OSM annotations for two regions of Tanzania and Zimbabwe, being more accurate than state-of-the-art baselines.
Fine-grained landuse characterization using ground-based pictures: a deep learning solution based on globally available data
Shivangi Srivastava, John E. Vargas Muñoz, Sylvain Lobry, and
1 more author
International Journal of Geographical Information Science, 2018
We study the problem of landuse characterization at the urban-object level using deep learning algorithms. Traditionally, this task is performed by surveys or manual photo interpretation, which are expensive and difficult to update regularly. We seek to characterize usages at the single object level and to differentiate classes such as educational institutes, hospitals and religious places by visual cues contained in side-view pictures from Google Street View (GSV). These pictures provide geo-referenced information not only about the material composition of the objects but also about their actual usage, which otherwise is difficult to capture using other classical sources of data such as aerial imagery. Since the GSV database is regularly updated, this allows the landuse maps to be updated accordingly, at lower costs than those of authoritative surveys. Because every urban-object is imaged from a number of viewpoints with street-level pictures, we propose a deep-learning based architecture that accepts an arbitrary number of GSV pictures to predict the fine-grained landuse classes at the object level. These classes are taken from OpenStreetMap. A quantitative evaluation of the area of Île-de-France, France shows that our model outperforms other deep learning-based methods, making it a suitable alternative to manual landuse characterization.
2016
Multitemporal SAR Image Decomposition into Strong Scatterers, Background, and Speckle
Sylvain Lobry, Loïc Denis, and Florence Tupin
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016
Speckle phenomenon in synthetic aperture radar (SAR) images makes their visual and automatic interpretation a difficult task. To reduce strong fluctuations due to speckle, total variation (TV) regularization has been proposed by several authors to smooth out noise without blurring edges. A specificity of SAR images is the presence of strong scatterers having a radiometry several orders of magnitude larger than their surrounding region. These scatterers, especially present in urban areas, limit the effectiveness of TV regularization as they break the assumption of an image made of regions of constant radiometry. To overcome this limitation, we propose in this paper an image decomposition approach. There exist numerous methods to decompose an image into several components, notably to separate textural and geometrical information. These decomposition models are generally recast as energy minimization problems involving a different penalty term for each of the components. In this framework, we propose an energy suitable for the decomposition of SAR images into speckle, a smooth background, and strong scatterers, and discuss its minimization using max-flow/min-cut algorithms. We make the connection between the minimization problem considered, involving the L0 pseudonorm, and the generalized likelihood ratio test used in detection theory. The proposed decomposition jointly performs the detection of strong scatterers and the estimation of the background radiometry. Given the increasing availability of time series of SAR images, we consider the decomposition of a whole time series. New change detection methods can be based on the temporal analysis of the components obtained from our decomposition.
International Conferences
2023
An a contrario approach for plant disease detection (to appear)
Rebecca Leygonie, Sylvain Lobry, and Laurent Wendling
In Workshop on Machine Vision for Earth Observation at BMVC, 2023
Detecting plant diseases or abnormalities is not a trivial task, as they can be caused by multiple factors such as environmental conditions, genetics, pathogens, etc. Because there is a need to help farmers make decisions to maximize crop yields, many studies have emerged in recent years using deep learning on agricultural images to detect plant diseases, which can be considered as an anomaly detection task. However, these approaches are often limited by the availability of annotated data or prior knowledge of the existence of an anomaly. We propose an approach that can detect part of the anomalies without prior knowledge of their existence, thus overcoming some of these limitations. To this end, we train a model on an auxiliary prediction task (plants’ age regression). We then use an explicability model to retrieve heatmaps whose distributions are studied. For each new observation, we propose to study how closely its heatmap follows the desired distribution and we derive a score indicating potential anomalies. Experiments on the GrowliFlower dataset indicate how our proposed method can help potential end-users automatically find anomalies.
Multi-task prompt-RSVQA to explicitly count objects on aerial images (to appear)
Christel Chappuis, Charlotte Sertic, Nicolas Santacroce, and
4 more authors
In Workshop on Machine Vision for Earth Observation at BMVC, 2023
Introduced to enable a wider use of Earth Observation images using natural language, Remote Sensing Visual Question Answering (RSVQA) remains a challenging task, in particular for questions related to counting. To address this specific challenge, we propose a modular Multi-task prompt-RSVQA model based on object detection and question answering modules. By creating a semantic bottleneck describing the image and providing a visual answer, our model allows users to assess the visual grounding of the answer and better interpret the prediction. A set of ablation studies are designed to consider the contributions of different modules and evaluation metrics are discussed for a finer-grained assessment. Experiments demonstrate competitive results against literature baselines and a zero-shot VQA model. In particular, our proposed model predicts answers for numerical Counting questions that are consistently closer in distance to the ground truth.
Transforming multidimensional data into images to overcome the curse of dimensionality (to appear)
Rebecca Leygonie, Sylvain Lobry, Guillaume Vimont, and
1 more author
In IEEE International Conference on Image Processing ICIP, 2023
When dealing with high-dimensional multivariate time series classification problems, a well-known difficulty is the curse of dimensionality. In this article, we propose an original approach of transposition of multidimensional data into images to tackle the task of classification. We propose a lightweight hybrid model that takes this transposed data as an input. This model contains convolutional layers as a feature extractor followed by a recurrent neural network. We apply our method to a large dataset consisting of individual patient medical records. We show that our approach allows us to significantly reduce the size of a network and increase its performance by opting for a transformation of the input data.
Automatic simulation of SAR images: comparing a deep-learning based method to a hybrid method (to appear)
Nathan Letheule, Flora Weissgerber, Sylvain Lobry, and
1 more author
In IEEE International Geoscience and Remote Sensing Symposium IGARSS, 2023
Recent research in demography focuses on linking population data to environmental indicators. Satellite imagery can support such projects by providing data at a large scale and a high frequency. Moreover, population surveys often provide geolocations of households, yet sometimes with an offset, to guarantee data confidentiality. In such cases, the proper management of this uncertainty is required, to accurately link environmental indicators such as land cover/land use maps or spectral indices to population data. In this paper, we introduce a method based on the random sampling of possible household geolocations around the coordinates provided. Then, we link a land cover map generated using semi-supervised deep learning and a Malaria Indicator Survey in Burkina Faso. After linking households to their close environment, we distinguish several types of environment conducive to high malaria rates, beyond the urban/rural dichotomy.
Seasonal semi-supervised domain adaptation for linking population studies and Local Climate Zones
Basile Rousse, Sylvain Lobry, Géraldine Duthé, and
2 more authors
Environment and demographic dynamics are strongly linked. However, relevant data to study this interaction may be scarce especially in sub-Saharan Africa where it is not always possible to perform such studies with a high temporal frequency. Satellite imagery, when linked to demographic data, can be a significant asset to estimate missing data as it covers every country with both high spatial and temporal resolution. We aim to take advantage of satellite data to characterize the environment in inter-tropical areas. This environment is regulated by the changing of two seasons that are essential to consider. We introduce a semi-supervised domain adaptation strategy for neural networks based on seasonal changes. This strategy can be used to produce land cover maps in regions of the world where limited labeled datasets are available. We apply this method to produce environmental indicators and link them to malaria rates from the Malaria Indicator Survey of Burkina Faso. We show that malaria rates are correlated not only to urbanisation but also to the environmental characterisation of studied areas.
2022
Prompt–RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering
Christel Chappuis, Valerie Zermatten, Sylvain Lobry, and
2 more authors
In EarthVision at IEEE/CVF Conference on Computer Vision and Pattern Recognition CVPR, 2022
Remote sensing visual question answering (RSVQA) was recently proposed with the aim of interfacing natural language and vision to ease access to the information contained in Earth Observation data for a wide audience, which is granted by simple questions in natural language. The traditional vision/language interface is an embedding obtained by fusing features from two deep models, one processing the image and another the question. Despite the success of early VQA models, it remains difficult to control the adequacy of the visual information extracted by its deep model, which should act as a context regularizing the work of the language model. We propose to extract this context information with a visual model, convert it to text and inject it, i.e. prompt it, into a language model. The language model is therefore responsible for processing the question with the visual context and extracting features that are useful for finding the answer. We study the effect of prompting with respect to a black-box visual extractor and discuss the importance of training a visual model producing accurate context.
Embedding Spatial Relations in Visual Question Answering for Remote Sensing
Maxime Faure, Sylvain Lobry, Camille Kurtz, and
1 more author
In 26th International Conference on Pattern Recognition ICPR, 2022
Remote sensing images carry a wealth of information that is not easily accessible to end-users as it requires strong technical skills and knowledge.
Visual Question Answering (VQA), a task that aims at answering an open-ended question in natural language from an image, can provide an easier access to this information. Considering the geographical information contained in remote sensing images, questions often embed an important spatial aspect, for instance regarding the relative position of two objects. Our objective is to better model the spatial relations in the construction of a ground-truth database of image/question/answer triplets and to assess the capacity a VQA model has to answer these questions. In this article, we propose to use histograms of forces to model the directional spatial relations between geo-localized objects. This allows a finer modeling of ambiguous relationships between objects and to provide different levels of assessment of a relation (e.g. object A is slightly/strictly to the west of object B). Using this new dataset, we evaluate the performances of a classical VQA model and propose a curriculum learning strategy to better take into account the varying difficulty of questions embedding spatial relations. With this approach, we show an improvement in the performances of our model, highlighting the interest of embedding spatial relations in VQA for remote sensing applications.
Language transformers for remote sensing visual question answering
Christel Chappuis, Vincent Mendez, Eliot Walt, and
3 more authors
In IEEE International Geoscience and Remote Sensing Symposium IGARSS, 2022
Remote sensing visual question answering (RSVQA) opens new avenues to promote the use of satellite data, by interfacing satellite image analysis with natural language processing. Capitalizing on the remarkable advances in natural language processing and computer vision, RSVQA aims at finding an answer to a question formulated by a human user about a remote sensing image. This is achieved by extracting representations from images and questions, and then fusing them in a joint representation. Focusing on the language part of the architecture, this study compares and evaluates the adequacy to the RSVQA task of two language models, a traditional recurrent neural network (Skip-thoughts) and a recent attention-based Transformer (BERT). We study whether large transformer models are beneficial to the task and whether fine-tuning is needed for these models to perform at their best. Our findings show that the models benefit from fine-tuning language models and that RSVQA with BERT is slightly but consistently better when properly fine-tuned.
Matching environmental data produced from remote sensing images to demographic data in Sub-Saharan Africa
Lys Thay*, Basile Rousse*, Sylvain Lobry, and
3 more authors
Visual question answering (VQA) has recently been introduced to remote sensing to make information extraction from overhead imagery more accessible to everyone. VQA considers a question (in natural language, therefore easy to formulate) about an image and aims at providing an answer through a model based on computer vision and natural language processing methods. As such, a VQA model needs to jointly consider visual and textual features, which is frequently done through a fusion step. In this work, we study three different fusion methodologies in the context of VQA for remote sensing and analyse the gains in accuracy with respect to the model complexity. Our findings indicate that more complex fusion mechanisms yield an improved performance, yet that seeking a trade-off between model complexity and performance is worthwhile in practice.
RSVQA Meets Bigearthnet: A New, Large-Scale, Visual Question Answering Dataset for Remote Sensing
Sylvain Lobry, Begüm Demir, and Devis Tuia
In 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, 2021
Visual Question Answering is a new task that can facilitate the extraction of information from images through textual queries: it aims at answering an open-ended question formulated in natural language about a given image. In this work, we introduce a new dataset to tackle the task of visual question answering on remote sensing images: this large-scale, open access dataset extracts image/question/answer triplets from the BigEarthNet dataset. This new dataset contains close to 15 million samples and is openly available. We present the dataset construction procedure, its characteristics and first results using a deep-learning based methodology. These first results show that the task of visual question answering is challenging and opens new interesting research avenues at the interface of remote sensing and natural language processing. The dataset and the code to create and process it are open and freely available on https://rsvqa.sylvainlobry.com/
2020
Better Generic Objects Counting When Asking Questions to Images: A Multitask Approach for Remote Sensing Visual Question Answering
Sylvain Lobry, Diego Marcos, Benjamin Kellenberger, and
1 more author
In ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2020
Training deep neural networks requires well-annotated datasets. However, real world datasets are often noisy, especially in a multi-label scenario, i.e. where each data point can be attributed to more than one class. To this end, we propose a regularization method to learn multi-label classification networks from noisy data. This regularization is based on the assumption that semantically close classes are more likely to appear together in a given image. Hereby, we encode label correlations with prior knowledge and regularize noisy network predictions using label correlations. To evaluate its effectiveness, we perform experiments on a multi-label aerial image dataset contaminated with controlled levels of label noise. Results indicate that networks trained using the proposed method outperform those directly learned from noisy labels and that the benefits increase proportionally to the amount of noise present.
Interpretable Scenicness from Sentinel-2 Imagery
Alex Levering, Diego Marcos, Sylvain Lobry, and
1 more author
In IGARSS 2020-2020 IEEE International Geoscience and Remote Sensing Symposium, 2020
Landscape aesthetics, or scenicness, has been identified as an important ecosystem service that contributes to human health and well-being. Currently there are no methods to inventorize landscape scenicness on a large scale. In this paper we study how to upscale local assessments of scenicness provided by human observers, and we do so by using satellite images. Moreover, we develop an explicitly interpretable CNN model that allows assessing the connections between landscape scenicness and the presence of specific landcover types. To generate the landscape scenicness ground truth, we use the ScenicOrNot crowdsourcing database, which provides geo-referenced, human-based scenicness estimates for ground-based photos in Great Britain. Our results show that it is feasible to predict landscape scenicness based on satellite imagery. The interpretable model performs comparably to an unconstrained model, suggesting that it is possible to learn a semantic bottleneck that represents well the present landcover classes and still contains enough information to accurately predict the location’s scenicness.
2019
Semantically Interpretable Activation Maps: what-where-how explanations within CNNs
Diego Marcos, Sylvain Lobry, and Devis Tuia
In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019
A main issue preventing the use of Convolutional Neural Networks (CNN) in end user applications is the low level of transparency in the decision process. Previous work on CNN interpretability has mostly focused either on localizing the regions of the image that contribute to the result or on building an external model that generates plausible explanations. However, the former does not provide any semantic information and the latter does not guarantee the faithfulness of the explanation. We propose an intermediate representation composed of multiple Semantically Interpretable Activation Maps (SIAM) indicating the presence of predefined attributes at different locations of the image. These attribute maps are then linearly combined to produce the final output. This gives the user insight into what the model has seen, where, and a final output directly linked to this information in a comprehensive and interpretable way. We test the method on the task of landscape scenicness (aesthetic value) estimation, using an intermediate representation of 33 attributes from the SUN Attributes database. The results confirm that SIAM makes it possible to understand what attributes in the image are contributing to the final score and where they are located. Since it is based on learning from multiple tasks and datasets, SIAM improves the explainability of the prediction without additional annotation efforts or computational overhead at inference time, while keeping good performances on both the final and intermediate tasks.
Visual question answering from remote sensing images
Sylvain Lobry, Jesse Murray, Diego Marcos, and
1 more author
In IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, 2019
Remote sensing images carry vast amounts of information beyond land cover or land use. Images contain visual and structural information that can be queried to obtain high level information about specific image content or relational dependencies between the objects sensed. This paper explores the possibility to use questions formulated in natural language as a generic and accessible way to extract this type of information from remote sensing images, i.e. visual question answering. We introduce an automatic way to create a dataset using OpenStreetMap data and present some preliminary results. Our proposed approach is based on deep learning, and is trained using our new dataset.
Deep learning models to count buildings in high-resolution overhead images
Sylvain Lobry, and Devis Tuia
In 2019 Joint Urban Remote Sensing Event (JURSE), 2019
This paper addresses the problem of counting buildings in very high-resolution overhead true color imagery. We study and discuss the relevance of deep-learning based methods to this task. Two architectures and two loss functions are proposed and compared. We show that a model enforcing equivariance to rotations is beneficial for the task of counting in remotely sensed images. We also highlight the importance of robustness to outliers of the loss function when considering remote sensing applications.
2018
Scale equivariance in CNNs with vector fields
Diego Marcos, Benjamin Kellenberger, Sylvain Lobry, and
1 more author
In International Conference on Machine Learning (ICML)/FAIM workshop on Towards learning with limited labels: Equivariance, Invariance, and Beyond, 2018
We study the effect of injecting local scale equivariance into Convolutional Neural Networks. This is done by applying each convolutional filter at multiple scales. The output is a vector field encoding for the maximally activating scale and the scale itself, which is further processed by the following convolutional layers. This allows all the intermediate representations to be locally scale equivariant. We show that this improves the performance of the model by over 20% in the scale equivariant task of regressing the scaling factor applied to randomly scaled MNIST digits. Furthermore, we find it also useful for scale invariant tasks, such as the actual classification of randomly scaled digits. This highlights the usefulness of allowing for a compact representation that can also learn relationships between different local scales by keeping internal scale equivariance.
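A rough 1-D sketch of the idea of applying one filter at several scales and keeping, at each position, the strongest response together with the scale that produced it. It is only an illustration under stated assumptions: scaling is approximated here by dilating the kernel taps, positions refer to the left edge of the kernel, and the function names and toy signal are hypothetical rather than taken from the paper.

```python
def correlate_dilated(signal, kernel, dilation):
    """Valid-mode 1-D correlation with kernel taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def max_over_scales(signal, kernel, dilations):
    """At each position, keep (strongest response, scale that produced it)."""
    responses = {d: correlate_dilated(signal, kernel, d) for d in dilations}
    # Larger dilations give shorter outputs; keep the common prefix only.
    length = min(len(r) for r in responses.values())
    field = []
    for i in range(length):
        # Ties are resolved in favour of the first scale listed.
        best = max(dilations, key=lambda d: responses[d][i])
        field.append((responses[best][i], best))
    return field


# Toy signal with two impulses three samples apart (an assumption).
signal = [0, 1, 0, 0, 1, 0, 0, 0]
field = max_over_scales(signal, kernel=[1, 1], dilations=[1, 3])
```

In this toy case the two-tap kernel matches both impulses only at dilation 3, so the pair stored at that position records both the magnitude and the scale, which is the information a following layer would receive.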
Correcting misaligned rural building annotations in open street map using convolutional neural networks evidence
John E Vargas-Munoz, Diego Marcos, Sylvain Lobry, and
3 more authors
In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, 2018
Mapping rural buildings in developing countries is crucial to monitor and plan in those vulnerable areas. Despite the existence of some rural building annotations in OpenStreetMap (OSM), those are of insufficient quantity and quality to train models able to map large areas accurately. In particular, these annotations are very often misaligned with respect to the buildings that are present in updated aerial imagery. We propose a Markov Random Field (MRF) method to correct misaligned rural building annotations. To do so, our method uses i) the correlation between candidate aligned OSM annotations and buildings roughly detected on aerial images and ii) the local consistency of the alignment vectors.
Land-use characterisation using Google Street View pictures and OpenStreetMap
Shivangi Srivastava, Sylvain Lobry, Devis Tuia, and
1 more author
In 21st AGILE Conference on Geographic Information Science (2018), 2018
This paper presents a study on the use of freely available, geo-referenced pictures from Google Street View to model and predict land-use at the urban-objects scale. This task is traditionally done manually and via photointerpretation, which is very time consuming. We propose to use a machine learning approach based on deep learning and to model land-use directly from both the pictures available from Google Street View and OpenStreetMap annotations. Because of the large availability of these two data sources, the proposed approach is scalable to cities around the globe and presents the possibility of frequent updates of the map. As base information, we use features extracted from single pictures around the object of interest; these features are obtained from pre-trained convolutional neural networks. Then, we train various classifiers (linear and RBF support vector machines, multilayer perceptron) and compare their performance. We report on a study over the city of Paris, France, where we observed that pictures coming from both inside and outside the urban-objects capture distinct, but complementary features.
Speckle reduction in PolSAR by multi-channel variance stabilization and Gaussian denoising: MuLoG
Charles-Alban Deledalle, Loic Denis, Florence Tupin, and
1 more author
In EUSAR 2018; 12th European Conference on Synthetic Aperture Radar, 2018
Due to the speckle phenomenon, some form of filtering must be applied to SAR data prior to performing any polarimetric analysis. Beyond the simple multilooking operation (i.e., moving average), several methods have been designed specifically for PolSAR filtering. The specifics of speckle noise and the correlations between polarimetric channels make PolSAR filtering more challenging than usual image restoration problems. Despite their striking performance, existing image denoising algorithms, mostly designed for additive white Gaussian noise, cannot be directly applied to PolSAR data. We bridge this gap with MuLoG by providing a general scheme that stabilizes the variance of the polarimetric channels and that can embed almost any Gaussian denoiser. We describe the MuLoG approach and illustrate its performance on airborne PolSAR data using a very recent Gaussian denoiser based on a convolutional neural network.
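For reference, the baseline multilooking operation mentioned in this abstract (a moving average) can be sketched in a few lines. This is a simplified single-channel, 1-D illustration and not MuLoG itself; in PolSAR the average would be taken over covariance matrices in a 2-D window. The function name and sample values are hypothetical.

```python
def multilook(intensities, window):
    """Sliding-window mean over `window` consecutive intensity samples."""
    if window < 1 or window > len(intensities):
        raise ValueError("window must be between 1 and len(intensities)")
    return [
        sum(intensities[i:i + window]) / window
        for i in range(len(intensities) - window + 1)
    ]


# Hypothetical intensity samples.
smoothed = multilook([1.0, 3.0, 5.0, 7.0], window=2)
```

Averaging reduces speckle fluctuations at the cost of spatial resolution, which is the trade-off that more elaborate PolSAR filters aim to improve on.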
2017
Double MRF for water classification in SAR images by joint detection and reflectivity estimation
Sylvain Lobry, Loïc Denis, Florence Tupin, and
1 more author
In 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2017
Classification of SAR images is a challenging task as the radiometric properties of a class may not be constant throughout the image. The assumption made in most classification algorithms that a class can be modeled by constant parameters is then not valid. In this paper, we propose a classification algorithm based on two Markov random fields that accounts for local and global variations of the parameters inside the image and produces a regularized classification. This algorithm is applied on airborne TropiSAR and simulated SWOT HR data. Both quantitative and visual results are provided, demonstrating the effectiveness of the proposed method.
Unsupervised detection of thin water surfaces in SWOT images based on segment detection and connection
Sylvain Lobry, Florence Tupin, and Roger Fjortoft
In 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2017
The objective of the Surface Water and Ocean Topography (SWOT) mission is to regularly monitor the height of the earth’s water surfaces. One of the challenges toward obtaining global measurements of these surfaces is to detect small water areas. In this article we introduce a method for the detection of thin water surfaces, such as rivers, in SWOT images. It combines a low-level step (segment detection) with a high-level regularization of these features. The method is then tested on a simulated SWOT image.
Urban area change detection based on generalized likelihood ratio test
Weiying Zhao, Sylvain Lobry, Henri Maitre, and
2 more authors
In 2017 9th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), 2017
Change detection methods often use denoised data because the original speckle noise has a strong influence on the detection results. The effects of using different data sources (different equivalent number of looks, original data, denoised data) and different threshold methods are studied based on four kinds of generalized likelihood ratio test approaches. NL-SAR [1] denoised data and the corresponding spatially varying equivalent number of looks are taken into account in the detection procedure. The bi-temporal experimental results on simulated data and realistic synthetic Sentinel-1 SAR data show the improvement obtained by using the equivalent number of looks of the denoised data and the corresponding adaptive thresholds for change detection in urban areas.
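As an illustration of what a generalized likelihood ratio test looks like in this setting, the sketch below uses the textbook form for two intensity samples under gamma-distributed speckle with the same number of looks at both dates. This is an assumption made for illustration: the abstract does not specify which four variants the paper studies, and the function names and threshold are hypothetical.

```python
import math


def glrt_statistic(i1, i2, looks):
    """Log generalized likelihood ratio of 'reflectivity changed' vs 'unchanged'.

    Assumes gamma-distributed intensities with `looks` looks at both dates.
    Under 'unchanged' the maximum-likelihood reflectivity is the mean of the
    two intensities; under 'changed' it is each intensity itself.
    """
    mean = (i1 + i2) / 2.0
    return looks * (2.0 * math.log(mean) - math.log(i1) - math.log(i2))


def detect_change(i1, i2, looks, threshold):
    """Flag a change when the statistic exceeds a user-chosen threshold."""
    return glrt_statistic(i1, i2, looks) > threshold
```

The statistic is zero when both intensities are equal and grows with their ratio. Because it scales with the number of looks, a threshold adapted to the local equivalent number of looks is a natural choice when that number varies across a denoised image.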
2016
A decomposition model for scatterers change detection in multi-temporal series of SAR images
Sylvain Lobry, Florence Tupin, and Loïc Denis
In 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2016
This paper presents a method for strong scatterers change detection in synthetic aperture radar (SAR) images based on a decomposition for multi-temporal series. The formulated decomposition model jointly estimates the background of the series and the scatterers. The decomposition model retrieves possible changes in scatterers and the date at which they occurred. An exact optimization method of the model is presented and applied to a TerraSAR-X time series.
Non-Uniform Markov Random Fields for Classification of SAR Images
Sylvain Lobry, Florence Tupin, and Roger Fjortoft
In Proceedings of EUSAR 2016: 11th European Conference on Synthetic Aperture Radar, 2016
When dealing with SAR image classification, the class parameters may vary along the swath for several reasons. Traditional classification algorithms are then not well adapted, as they assume constant class parameters. In this paper, we propose a binary classification algorithm based on Markov Random Fields that takes into account the parameter variations in the swath, and we present results obtained on airborne TropiSAR and simulated SWOT HR data.
2015
Sparse + smooth decomposition models for multi-temporal SAR images
Sylvain Lobry, Loïc Denis, and Florence Tupin
In 2015 8th International Workshop on the Analysis of Multitemporal Remote Sensing Images (Multi-Temp), 2015
SAR images have distinctive characteristics compared to optical images: speckle phenomenon produces strong fluctuations, and strong scatterers have radar signatures several orders of magnitude larger than others. We propose to use an image decomposition approach to account for these peculiarities. Several methods have been proposed in the field of image processing to decompose an image into components of different nature, such as a geometrical part and a textural part. They are generally stated as an energy minimization problem where specific penalty terms are applied to each component of the sought decomposition. We decompose temporal series of SAR images into three components: speckle, strong scatterers and background. Our decomposition method is based on a discrete optimization technique by graph-cut. We apply it to change detection tasks.
National Conferences
2023
Évaluation du couvert neigeux à partir d’images SAR par apprentissage profond basé sur des images optiques de référence (to appear)
Mathias Montginoux, Flora Weissgerber, Sylvain Lobry, and
1 more author
Optical satellite images are commonly used to evaluate the snow cover. However, part of the information is lost due to clouds.
To fill this gap we propose to detect the snow from Sentinel-1 SAR images using a convolutional neural network trained with labels obtained from MODIS optical images. A binary semantic segmentation is computed from two polarimetric SAR inputs: a wet snow ratio and a dry snow ratio.
The model, called SESAR U-net, is trained on a small area and then tested over a whole watershed. The missing labels are interpolated and the uncertainty due to clouds is considered. Our proposed method gives an overall accuracy higher than 80%.
Transposition de données multidimensionnelles en images pour pallier le fléau de la dimension
Rebecca Leygonie, Sylvain Lobry, Guillaume Vimont, and
1 more author
When dealing with high-dimensional multivariate time series classification problems, a well-known difficulty is the curse of dimensionality.
In this article, we propose an original approach of transposition of multidimensional data into images to tackle the task of classification. We propose a small hybrid model containing convolutional layers as a feature extractor followed by a recurrent neural network that takes this transposed data as an input. We apply our method to a large dataset consisting of individual patient medical records. We show that our approach allows us to significantly reduce the size of a network and increase its performance by opting for a transformation of the input data.
2022
Apprentissage profond pour la classification de QR Codes bruités
Rebecca Leygonie, Sylvain Lobry, and Laurent Wendling
We wish to define the limitations of a classical classification model based on deep learning when applied on abstract images, which do not represent visually identifiable objects.
QR Codes fall into this category of abstract images: one bit corresponding to one encoded character, QR codes were not designed to be decoded by the naked eye. To understand the limitations of a deep learning-based model for abstract image classification, we train an image classification model on QR codes generated from the information obtained when reading a health pass. We compare the performance of a classification model with that of a classical (deterministic) decoding method in the presence of noise. This study allows us to conclude that a model based on deep learning can be relevant for the understanding of abstract images.
2021
Segmentation Sémantique pour la Simulation d’Images SAR
Nathan Letheule, Flora Weissgerber, Sylvain Lobry, and
1 more author
Simulation of Synthetic Aperture Radar (SAR) images is an essential component of SAR applications development. This can be done using style transfer methods or through physical simulators. We propose a hybrid approach: physical simulation of a SAR image from a material map obtained by a deep network taking the optical image as input. We compare the simulations with those from a style transfer method. The first results show the potential of our approach.
2017
Détection de l’eau dans les images radar du futur satellite SWOT
Sylvain Lobry, Roger Fjortoft, Loïc Denis, and
1 more author
One of the objectives of the SWOT mission conducted by CNES and JPL is to obtain a global measurement of water heights. In order to apply an interferometric processing on SWOT images over continents, a first step is to obtain a classification indicating the presence of water. We introduce two methods adapted to the unusual acquisition parameters of the sensor for the detection of compact areas (i.e. lakes) and linear networks (i.e. rivers).
2016
Un modèle de décomposition pour la détection de changement dans les séries temporelles d’images RSO
This paper presents a method for strong scatterers change detection in synthetic aperture radar (SAR) images based on a decomposition for multi-temporal series. The formulated decomposition model jointly estimates the background of the series and the scatterers. The decomposition model retrieves possible changes in scatterers and the date at which they occurred. An exact optimization method of the model is presented and applied to a TerraSAR-X time series.