Duplicate infrastructure locations in Infra API
Kris Moreau
At SkyTruth, we've observed duplicates in the Infrastructure data where single pieces of infrastructure appear multiple times in the DB.
When we compare these locations against Sentinel-1 imagery, only a single piece of infrastructure appears to be present. To see an example in Cerulean (which relies on the GFW API), show the infrastructure locations on the map using the squiggly line button next to each infrastructure source: https://cerulean.skytruth.org/?zoom=11.419613&lat=57.192989&lng=0.99397&slickId=3573395 (also shown in the screenshot). The Structure IDs in this example are 352531 and 116096, but we observe this relatively frequently.
This is an issue for our advocates and journalists investigating the causes of marine pollution, because it looks like there are two possible polluters when there is actually just one likely culprit.
--Kris
Jona Raphael
To be clear, this most often happens when an object has been in a single location for a number of years. The infra algorithm finds it for some amount of time (assigning it a Structure ID), then loses track of it for a long time (even though it is still there), then finds it again (assigning it a second Structure ID that is different from the first).
We would like to see these either be combined, or indicated as part of the same object in some way. Note, we would also need a record/history of any such recombinations over time so that our historic DB can be updated.
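
To illustrate the kind of behavior we have in mind, here is a rough Python sketch. It is not how the infra algorithm or the GFW API actually works. The record fields, the 100 m distance threshold, the `merge_duplicates` helper, and the sample IDs, coordinates, and dates are all made up for the example. It treats a later record as a duplicate when it sits within the threshold of an earlier one, folds it into the earlier Structure ID, and logs the merge so a downstream DB could replay it.

```python
import math
from dataclasses import dataclass, replace


@dataclass
class StructureRecord:
    # Hypothetical record shape; the real API fields may differ.
    structure_id: int
    lat: float
    lon: float
    first_seen: str  # ISO date, so string comparison orders correctly
    last_seen: str


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def merge_duplicates(records, max_dist_m=100.0):
    """Fold each record into the earliest-seen record within max_dist_m.

    Returns (canonical_records, merge_log). The threshold is an assumed
    value, not one taken from the infra pipeline.
    """
    canonical = []
    merge_log = []
    for rec in sorted(records, key=lambda r: r.first_seen):
        match = next(
            (c for c in canonical
             if haversine_m(c.lat, c.lon, rec.lat, rec.lon) <= max_dist_m),
            None,
        )
        if match is None:
            canonical.append(replace(rec))  # copy so the input is not mutated
        else:
            # Extend the surviving record's lifetime and record the merge.
            match.last_seen = max(match.last_seen, rec.last_seen)
            merge_log.append(
                {"merged_id": rec.structure_id, "into_id": match.structure_id}
            )
    return canonical, merge_log


# Made-up sample data: 101 and 102 are the same object found twice.
records = [
    StructureRecord(101, 57.1930, 0.9940, "2017-01-01", "2018-06-01"),
    StructureRecord(102, 57.1931, 0.9941, "2021-03-01", "2023-01-01"),
    StructureRecord(103, 57.3000, 1.2000, "2019-05-01", "2022-09-01"),
]
canonical, merge_log = merge_duplicates(records)
print([c.structure_id for c in canonical])
print(merge_log)
```

A distance-only rule like this would also merge two genuinely different structures that replaced one another at the same spot, so a real fix would probably need more evidence than position alone. The part that matters most to us is the `merge_log`: whatever rule is used, we need that history to keep our own records in sync.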