
Developing Enhanced Deep Super-Resolution Networks for DEM

  • john06025
  • Jun 20
  • 5 min read

Geospatial projects often rely on high-resolution digital elevation models (DEMs), which are generated via InSAR, Lidar, photogrammetry, or ground survey. Commercial very-high-resolution (VHR) DEMs, such as Airbus WorldDEM Neo (5m) [1], are suitable for small areas of interest (AOIs), but prohibitively costly for large regions (~$10+ AUD per km²). Global DEMs, such as Copernicus GLO-30 [2] and SRTM GL1 [3], are open-source, but limited to 30m resolution.


In Australia, DEMs can be accessed via 'Elvis Elevation and Depth' [4]. Many states provide access to at least some VHR DEMs, with NSW [5] providing a state-wide 5m DEM, and VIC a 10m DEM [6]. However, for large parts of Australia, including most of NT and QLD, coverage is limited to the specific areas that were subject to Lidar survey.


Super-resolution (SR) is an emerging technology for obtaining pseudo VHR DEMs. SR models aim to reconstruct higher-resolution images from lower-resolution inputs by enhancing spatial detail beyond the sensor’s native resolution. Traditional methods, like bicubic interpolation, upscale images by estimating new pixel values based on the weighted average of surrounding pixels. In contrast, SR models aim to learn image features and context from large training datasets. That is, these models are content-aware: they can infer plausible high-frequency details like edges, textures, or terrain structure, resulting in higher quality reconstructions (see Wang et al. for a review [7]).
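
To make the "weighted average of surrounding pixels" idea concrete, here is a minimal NumPy sketch of a classical interpolation upscaler. Note that it uses bilinear weights (the four nearest pixels) rather than bicubic (sixteen) to keep the example short, and that the function name `bilinear_upscale` and the toy input are illustrative assumptions, not part of any library or of our pipeline.

```python
import numpy as np

def bilinear_upscale(img, scale):
    """Upscale a 2D array by an integer factor using bilinear weights."""
    h, w = img.shape
    # Map each output pixel centre back into input pixel coordinates.
    ys = np.clip((np.arange(h * scale) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * scale) + 0.5) / scale - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    # Fractional distances act as the interpolation weights.
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy 2x2 "DEM" (elevations in metres), upscaled X3.
toy_dem = np.array([[0.0, 10.0], [20.0, 30.0]])
upscaled = bilinear_upscale(toy_dem, 3)
```

Every output value is a blend of existing neighbours, so the result can never contain detail that was absent from the input. This is the limitation that learned SR models attempt to overcome.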


Whereas the primary development of SR modalities continues in the RGB space, DEM SR is an active area of research. DEM SR publications have evaluated SRCNN [8], deep residual networks [9], EDSR [10], and ZSSR [11]. RCAN [12] and SwinIR [13] are two leading deep learning architectures, which have not yet been evaluated for DEM. Most recently, Yu et al. reported the development of a reversible SR network model based on normalizing flow [14].


We chose to implement EDSR for a minimal proof of concept (POC). EDSR was proposed by Lim et al. in 2017 [15] and still achieves state-of-the-art (SOTA) performance on SR benchmarks [7]. EDSR is less prone to hallucination than GAN-based models, it can be tailored for various SR scales (X2, X3, etc), and it can be modified to accept single-channel inputs after pre-training on 3-channel RGB benchmarks.
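
One possible way to adapt an RGB-pretrained input layer to single-channel data is to sum its kernels over the input-channel axis. The sketch below illustrates this with plain NumPy and random stand-in weights; it is an assumption about how such an adaptation could be done, not a description of our implementation, and the helper names `conv2d_valid` and `rgb_head_to_single_channel` are hypothetical.

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive 'valid' cross-correlation. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def rgb_head_to_single_channel(w_rgb):
    """Sum kernels over the RGB input axis: (C_out, 3, k, k) -> (C_out, 1, k, k)."""
    return w_rgb.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
w_rgb = rng.normal(size=(8, 3, 3, 3))      # stand-in for pretrained weights
dem_tile = rng.normal(size=(1, 16, 16))    # stand-in single-channel DEM tile
rgb_like = np.repeat(dem_tile, 3, axis=0)  # same tile copied to 3 channels

w_single = rgb_head_to_single_channel(w_rgb)
outputs_match = np.allclose(conv2d_valid(rgb_like, w_rgb),
                            conv2d_valid(dem_tile, w_single))
```

With summed kernels, the single-channel layer responds to a DEM tile exactly as the RGB layer would respond to that tile replicated across three channels. The output convolution and any RGB-specific normalisation would need a matching change, which is not shown here.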


We began by implementing a baseline EDSR model de novo. We then trained EDSR X2 and X3 models on the DIV2K dataset, a widely used RGB dataset for benchmarking SR models [16]. Based on peak signal-to-noise ratio (PSNR), our X2 and X3 baseline model performance was comparable with the paper results [15] (Fig.1).
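
For reference, PSNR is defined as 10·log10(MAX² / MSE), where MAX is the data range and MSE is the mean squared error between the SR output and the HR target. The following is a minimal NumPy sketch; the function name `psnr` and the toy arrays are illustrative, and benchmark protocols often add details such as border cropping or colour-space conversion that are not shown here.

```python
import numpy as np

def psnr(sr, hr, data_range):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    sr = np.asarray(sr, dtype=np.float64)
    hr = np.asarray(hr, dtype=np.float64)
    mse = np.mean((sr - hr) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Toy example: every pixel is off by one grey level on an 8-bit scale.
hr_img = np.zeros((4, 4))
sr_img = np.ones((4, 4))
score = psnr(sr_img, hr_img, data_range=255)
```

For DEMs the data range is not fixed at 255, so a choice has to be made (for example the elevation range of the tile or of the whole dataset), and PSNR values are only comparable when that choice is held constant.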


Fig.1. Top: EDSR X3 validation results (PSNR) for DIV2K. Bottom: Example EDSR X3 LR (Left), SR (middle), and HR (right).

The early convolutional layers of an EDSR model trained on DIV2K capture universal visual patterns like edges and textures, making them transferable to other data types, like DEMs. We selected a region of NSW (Howes Valley), and generated a dataset from Copernicus GLO-30 (30m) and the corresponding 5m DEM from NSW Spatial Services [5]. This required pixel-level alignment and tiling. We then ran chained X2 and X3 model inference using the EDSR DIV2K models, i.e. super-resolving from 30m to 5m DEM resolution (an overall X6 upscale).
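
The tiling and chaining logic can be sketched as follows. This is a simplified illustration that assumes the 30m and 5m rasters have already been resampled onto grids sharing the same extent and top-left origin; the helper names (`paired_tiles`, `chain`, `nearest_upscaler`) are hypothetical, and nearest-neighbour upscaling stands in for the trained X2 and X3 models.

```python
import numpy as np

LR_RES_M, HR_RES_M = 30.0, 5.0
SCALES = (2, 3)                       # chained X2 then X3
TOTAL_SCALE = int(np.prod(SCALES))    # 6, i.e. 30m -> 5m

def paired_tiles(lr, hr, lr_tile, scale):
    """Yield aligned (LR, HR) tile pairs from co-registered rasters."""
    if hr.shape != (lr.shape[0] * scale, lr.shape[1] * scale):
        raise ValueError("HR raster must be exactly `scale` times the LR raster")
    for r in range(0, lr.shape[0] - lr_tile + 1, lr_tile):
        for c in range(0, lr.shape[1] - lr_tile + 1, lr_tile):
            lr_patch = lr[r:r + lr_tile, c:c + lr_tile]
            hr_patch = hr[r * scale:(r + lr_tile) * scale,
                          c * scale:(c + lr_tile) * scale]
            yield lr_patch, hr_patch

def nearest_upscaler(factor):
    """Stand-in for a trained SR model: nearest-neighbour upscaling."""
    return lambda t: np.repeat(np.repeat(t, factor, axis=0), factor, axis=1)

def chain(models, tile):
    """Apply SR models in sequence, feeding each output into the next."""
    for model in models:
        tile = model(tile)
    return tile

lr_dem = np.zeros((64, 64))                               # placeholder 30m raster
hr_dem = np.zeros((64 * TOTAL_SCALE, 64 * TOTAL_SCALE))   # placeholder 5m raster
pairs = list(paired_tiles(lr_dem, hr_dem, lr_tile=32, scale=TOTAL_SCALE))
sr_tile = chain([nearest_upscaler(s) for s in SCALES], pairs[0][0])
```

Each 32×32 LR tile corresponds to a 192×192 HR tile, so the chained output can be compared pixel-for-pixel against the 5m ground truth.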


As expected, the DIV2K models delivered reasonable performance, based on PSNR and visual inspection. Note that X6 is likely at the upper limit of what is possible for DEM SR. The resolving of absolute elevation values, and slope preservation, appeared excellent. However, GLO-30 artifacts (such as the scan lines) were retained. Imputation of small terrain features was not optimal, reflecting their absence from the training set. We can also observe the model imputing image texture over regions of the DEM that should be smooth, reflecting domain-specific differences (Fig.2).


Fig.2. Chained EDSR X2X3 performance on GLO-30, versus 5m DEM ground truth. Validation PSNR (top), truncated. Example LR, SR, HR (bottom). Middle: note the GLO-30 scan line artifact (a), and lack of DEM-specific imputation of fine tributaries (b).

Future development would proceed through fine-tuning of the X2 and X3 EDSR DIV2K models on DEM datasets, and evaluation against 5m ground truth. DIV2K models can serve as a strong initialization, but fine-tuning should improve artifact handling, discourage the generation of inappropriate terrain patterns, and improve imputation of terrain-specific features.


Fine-tuning might also include the implementation of DEM-specific loss functions and training metrics, and this area has considerable room for innovation. Whereas traditional SR approaches focus on pixel-wise loss functions, DEM SR may benefit from task-aware composite objectives, which incorporate geomorphological knowledge (slope, aspect, terrain roughness index, etc) [17]. Lastly, approaches to manage GLO-30 artifacts should be considered, such as adding speckle, geometric distortions, and scan line artifacts to the image augmentation stack.
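
As a rough illustration of both ideas, the sketch below combines a pixel-wise elevation term with a slope term, and adds a simple row-wise striping augmentation. It is written in NumPy for readability (a training loop would need a differentiable version in the chosen deep learning framework). The function names, the `slope_weight` value of 0.1, and the sinusoidal stripe model are all assumptions for illustration; this is not the formulation used in [17].

```python
import numpy as np

def slope_deg(dem, cell_size):
    """Slope in degrees from finite-difference elevation gradients."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def composite_loss(sr, hr, cell_size, slope_weight=0.1):
    """Mean absolute elevation error plus a weighted mean absolute slope error."""
    elevation_term = np.mean(np.abs(sr - hr))
    slope_term = np.mean(np.abs(slope_deg(sr, cell_size) - slope_deg(hr, cell_size)))
    return float(elevation_term + slope_weight * slope_term)

def add_scan_lines(dem, amplitude, period, rng):
    """Augmentation: add horizontal stripes as a row-wise sinusoid with random phase."""
    rows = np.arange(dem.shape[0])
    phase = rng.uniform(0.0, 2.0 * np.pi)
    stripes = amplitude * np.sin(2.0 * np.pi * rows / period + phase)
    return dem + stripes[:, None]

# Toy 5m-resolution plane rising 5m per cell, i.e. a uniform 45 degree slope.
plane = np.tile(np.arange(8) * 5.0, (8, 1))
striped = add_scan_lines(plane, amplitude=0.5, period=4, rng=np.random.default_rng(0))
loss_offset = composite_loss(plane + 1.0, plane, cell_size=5.0)
loss_striped = composite_loss(striped, plane, cell_size=5.0)
```

A constant 1m offset changes only the elevation term, whereas the striped tile is penalised by both terms, which is the kind of behaviour a terrain-aware objective is meant to capture.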


In summary:


  • Large gaps exist in the availability of open-source VHR DEM over Australia, particularly in NT and QLD.

  • Commercial VHR DEMs, such as Airbus WorldDEM Neo (5m), are prohibitively expensive for large regions.

  • Applying SR to DEMs is an active area of research, with published studies providing support. There is also room for innovation (model selection, loss function, dataset).

  • A large amount of training data is available to leverage (open source state-wide 5m NSW DEM, smaller 5m DEM Lidar surveys in QLD and NT, corresponding GLO-30 30m DEM).

  • Initial POC produced reasonable results on GLO-30 using chained DIV2K X2X3 SR, versus 5m ground truth.

  • Fine-tuning on a DEM dataset can reasonably be expected to improve SR performance on GLO-30, versus the current DIV2K models.



References:


[1] Airbus WorldDEM Neo (5m), UP42. https://up42.com/marketplace/data/archive/worlddem-neo



[3] Shuttle Radar Topography Mission (SRTM GL1) Global 30m DEM.


[4] Elvis Elevation and Depth. https://elevation.fsdf.org.au/




[7] Wang, Z., Chen, J. and Hoi, S.C., 2020. Deep learning for image super-resolution: A survey. IEEE transactions on pattern analysis and machine intelligence, 43(10), pp.3365-3387.


[8] Chen, Z., Wang, X., Xu, Z. & Hou, W. Convolutional neural network based dem super resolution. ISPRS Arch. 41, 247–250 (2016).


[9] Jiao, D., Wang, D., Lv, H. & Peng, Y. Super-resolution reconstruction of a digital elevation model based on a deep residual network. Open Geosci. 12, 1369–1382 (2020).


[10] Zhou, A. et al. An enhanced double-filter deep residual neural network for generating super resolution DEMs. Remote Sens. 13, 3089. https://doi.org/10.3390/rs13163089 (2021).


[11] Lin, X. et al. A DEM super-resolution reconstruction network combining internal and external learning. Remote Sens. 14, 2181. https://doi.org/10.3390/rs14092181 (2022).


[12] Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B. and Fu, Y., 2018. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European conference on computer vision (ECCV) (pp. 286-301).


[13] Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L. and Timofte, R., 2021. Swinir: Image restoration using swin transformer. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 1833-1844).


[14] Yu, J., Li, Y., Bai, X., Yang, R., Cui, M., Wu, H., Li, Z., Su, F., Li, Z., Liang, T. and Yan, H., 2025. A DEM super resolution reconstruction method based on normalizing flow. Scientific Reports, 15(1), p.10681.


[15] Lim, B., Son, S., Kim, H., Nah, S. and Mu Lee, K., 2017. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 136-144).


[16] Agustsson, E. and Timofte, R., 2017. Ntire 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 126-135).


[17] Zhang, Y., Yu, W. and Zhu, D., 2022. Terrain feature-aware deep learning network for digital elevation model superresolution. ISPRS Journal of Photogrammetry and Remote Sensing, 189, pp.143-162.

 
 
 
