DSNE blog: Reasons for caution with large environmental datasets – Dan Clarkson

[Image: Greenland map]

Suppose, as part of DSNE’s work on ice melt, we want to examine surface temperature measurements for Greenland by applying extreme value models. Of the data sources available, mainly weather stations and satellite products, we choose MODIS satellite data for its high spatial and temporal resolution: around 20 years of daily data at 1 km resolution. This gives us over 7,000,000 observations on each of nearly 7,500 days. While this is more than enough data to get a clear overview of the ice sheet, the scale of the dataset quickly becomes a challenge as models grow more complex. This is particularly true for spatial extremes, where some models can currently handle only hundreds of points, compared with the millions that an analysis of the entire dataset would require. For more information, see the Data Science of Natural Environment (DSNE) blog post by Dan Clarkson.
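To give a rough sense of the scale involved, the short Python sketch below works through the arithmetic using the rounded figures quoted above. It is an illustration, not part of the original analysis. The 8-byte storage size and the idea of a dense matrix holding one dependence term per pair of locations are assumptions made here, chosen because pairwise structures of this kind are one common reason spatial models scale poorly. The function names are hypothetical.

```python
# Rounded figures quoted in the text, so results are order-of-magnitude only.
N_LOCATIONS = 7_000_000  # observations (grid cells) per day
N_DAYS = 7_500           # roughly 20 years of daily data

# Assumption: each stored value is a 64-bit float.
BYTES_PER_VALUE = 8


def total_observations(n_locations: int, n_days: int) -> int:
    """Total number of space-time observations in the dataset."""
    return n_locations * n_days


def pairwise_matrix_bytes(n_locations: int,
                          bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Memory for a dense n x n matrix with one value per pair of locations.

    Assumption: the model stores a full pairwise dependence matrix.
    """
    return n_locations ** 2 * bytes_per_value


def human(n_bytes: float) -> str:
    """Format a byte count using decimal (SI) units."""
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if n_bytes < 1000:
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1000
    return f"{n_bytes:.1f} PB"


n_obs = total_observations(N_LOCATIONS, N_DAYS)
print(f"Total observations: {n_obs:,}")
print(f"Raw storage: {human(n_obs * BYTES_PER_VALUE)}")
print(f"Pairwise matrix, 500 points: {human(pairwise_matrix_bytes(500))}")
print(f"Pairwise matrix, all points: {human(pairwise_matrix_bytes(N_LOCATIONS))}")
```

Under these assumptions, a pairwise matrix for a few hundred points fits in a couple of megabytes, while the same structure for every grid cell would need hundreds of terabytes. The quadratic growth, rather than the raw data volume alone, is what makes the full dataset impractical for such models.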

Dan Clarkson

Member (Student)
LU - School of Mathematics and Statistics