#---------------------#
# General information #
#---------------------#

Efficient deep learning surrogate method for predicting the transport of particle patches in coastal environments
Authors:
 J.M. Fajardo-Urbina
 Y. Liu
 S. Georgievska
 U. Grawe
 H.J.H. Clercx
 T. Gerkema
 M. Duran-Matute


Corresponding author:
  M. Duran-Matute (m.duran.matute@tue.nl) 
  Fluids and Flows group, Department of Applied Physics, Eindhoven University of Technology
  P.O. Box 513, 5600 MB Eindhoven, The Netherlands.


#--------------#
# Introduction #
#--------------#

The data provided in this repository can be used to run the main models described in the manuscript "Efficient deep learning surrogate method for predicting the transport of particle patches in coastal environments".

A small sample of this dataset is also stored in the GitHub repository (https://github.com/JeancarloFU/paper_Efficient_Deep_Learning_Surrogate_Method_For_Lagrangian_Transport). That repository also archives the scripts and notebooks (based on Python v3.8) used to run the main models described in the manuscript.


#----------------#
# Data structure #
#----------------#

The following NetCDF files are used to run the Python notebook "surrogate_and_optimal_prediction_example.ipynb" provided in the GitHub repository. This notebook implements the simplified Lagrangian model (Eq. (4) of the manuscript) and contains all the instructions to run the surrogate and optimal prediction experiments.

------
* dws_bathymetry_200x200m.nc 
This file contains information about the bathymetry and coordinates of the numerical domain:
 - Dimensions: xc: 820, yc: 486
 - Coordinates:
    xc(xc): x-position along the local axis of the numerical model GETM (rotated anti-clockwise 17 degrees with respect to the East direction)
    yc(yc): y-position along the local axis of the numerical model
    lonc(yc, xc): longitude of the center of each grid cell (degrees east)
    latc(yc, xc): latitude of the center of each grid cell (degrees north)
 - Variables: 
    bathymetry(yc, xc): bathymetry of the numerical domain (m)
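Because the local axes are rotated with respect to East, vector quantities expressed along (xc, yc) have to be rotated before they can be compared with east/north components. The following is a minimal sketch of that rotation, assuming a pure anti-clockwise rotation of 17 degrees as stated above; the function name local_to_east_north is hypothetical and is not part of the notebook.

```python
import math

# Anti-clockwise rotation of the local x-axis with respect to East (from this README).
GRID_ROTATION_DEG = 17.0


def local_to_east_north(dx, dy, angle_deg=GRID_ROTATION_DEG):
    """Rotate a vector from local (xc, yc) components to (east, north) components."""
    theta = math.radians(angle_deg)
    east = dx * math.cos(theta) - dy * math.sin(theta)
    north = dx * math.sin(theta) + dy * math.cos(theta)
    return east, north


# A 1000 m displacement along the local x-axis points slightly north of East.
east, north = local_to_east_north(1000.0, 0.0)
print(f"east = {east:.1f} m, north = {north:.1f} m")
```

This sketch only rotates vector components. Converting grid positions to longitude and latitude is better done with the lonc and latc coordinates stored in the file.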

------
* dws_boundaries_400x400m.nc
This file contains the xy-coordinates of the islands and mainland, which are used to identify whether particles are stuck inside them when running the surrogate and optimal prediction experiments. The boundary of the Dutch Wadden Sea (DWS) is also included. All the coordinates are based on the spatial resolution used to compute Lagrangian statistics (400m x 400m) instead of the original resolution of the numerical model GETM (200m x 200m).
 - Dimensions: xy: 2, np_dws: 941, np_bdr0: 572, np_bdr1: 185, np_bdr2: 87, np_bdr3: 173, np_bdr4: 151, np_bdr5: 95
 - Variables:
    bdr_dws(np_dws, xy): xy-coordinates of the boundary of the DWS (m)
    bdr_coast(np_bdr0, xy): xy-coordinates of the boundary of the mainland (m)
    bdr_texel(np_bdr1, xy): xy-coordinates of the boundary of the Texel island (m)
    bdr_vlieland(np_bdr2, xy): xy-coordinates of the boundary of the Vlieland island (m)
    bdr_terschelling(np_bdr3, xy): xy-coordinates of the boundary of the Terschelling island (m)
    bdr_ameland(np_bdr4, xy): xy-coordinates of the boundary of the Ameland island (m)
    bdr_schiermonnikoog(np_bdr5, xy): xy-coordinates of the boundary of the Schiermonnikoog island (m)
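One possible way to check whether a particle lies inside one of these boundaries is a ray-casting point-in-polygon test. The sketch below is only an illustration: the notebook may use a different method, the function name point_in_polygon is hypothetical, and the square "island" is made up rather than taken from the dataset.

```python
def point_in_polygon(x, y, polygon):
    """Return True if (x, y) lies inside the polygon (ray-casting test).

    polygon is a sequence of (x, y) vertices, e.g. the rows of bdr_texel.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical 400 m x 400 m square "island" (not taken from the dataset).
island = [(0.0, 0.0), (400.0, 0.0), (400.0, 400.0), (0.0, 400.0)]
print(point_in_polygon(200.0, 200.0, island))  # particle inside the square
print(point_in_polygon(600.0, 200.0, island))  # particle outside the square
```

With the real data, the polygon argument would be one of the (np_bdr*, xy) arrays listed above, with x and y given in the same local coordinates (m).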

------ 
* file_advection_dispersion_for_optimal_prediction.nc
This file is a sample dataset for running the optimal prediction experiment:
 - Dimensions: time: 16, yc: 121, xc: 294
 - Coordinates: 
    time(time): time in datetime64 format with a resolution of 12.42 h (seconds since 2015-01-01T04:53:48)
    xc(xc): local x-position of each 400m x 400m grid cell where statistics are computed
    yc(yc): local y-position of each 400m x 400m grid cell where statistics are computed
 - Variables:
     advx(time,yc,xc): x-component of advection (m)
     advy(time,yc,xc): y-component of advection (m)
     dispxx(time,yc,xc): xx-component of the symmetric 2D dispersion tensor (m2)
     dispyy(time,yc,xc): yy-component of the symmetric 2D dispersion tensor (m2)
     dispxy(time,yc,xc): xy-component of the symmetric 2D dispersion tensor (m2)
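The time coordinate is stored as seconds since the reference time given above, with consecutive fields separated by 12.42 h (44712 s). The sketch below shows how such offsets map to calendar times using only the standard library. The offsets in the example are hypothetical (the first time stamp in the file need not coincide with the reference time), and libraries such as xarray usually decode this kind of time axis automatically.

```python
from datetime import datetime, timedelta

# Reference time taken from the units of the time coordinate in this README.
REFERENCE_TIME = datetime(2015, 1, 1, 4, 53, 48)
# 12.42 h expressed in seconds (12.42 * 3600).
STEP_SECONDS = 44712


def decode_times(seconds_since_ref, ref=REFERENCE_TIME):
    """Convert offsets in seconds since the reference time to datetime objects."""
    return [ref + timedelta(seconds=float(s)) for s in seconds_since_ref]


# Hypothetical offsets: three consecutive fields starting at the reference time.
offsets = [i * STEP_SECONDS for i in range(3)]
for t in decode_times(offsets):
    print(t.isoformat())
```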

------ 
* file_advection_dispersion_for_surrogate_prediction.nc
This file is a sample dataset for running the surrogate prediction experiment. Its structure is identical to that of file_advection_dispersion_for_optimal_prediction.nc.
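Both sample files store the symmetric 2D dispersion tensor through its three independent components (dispxx, dispyy, dispxy). As an illustration of how these components combine, the sketch below computes the principal values and the orientation of the major axis for a single grid cell. The function name principal_dispersion and the input values are hypothetical, and this is not a step taken from the manuscript.

```python
import math


def principal_dispersion(dispxx, dispyy, dispxy):
    """Eigen-decomposition of the symmetric tensor [[dispxx, dispxy], [dispxy, dispyy]].

    Returns (major, minor, angle_deg): the two eigenvalues (m2) and the angle of
    the major axis, measured anti-clockwise from the local x-axis (degrees).
    """
    mean = 0.5 * (dispxx + dispyy)
    radius = math.hypot(0.5 * (dispxx - dispyy), dispxy)
    major = mean + radius
    minor = mean - radius
    angle_deg = math.degrees(0.5 * math.atan2(2.0 * dispxy, dispxx - dispyy))
    return major, minor, angle_deg


# Hypothetical tensor components for one grid cell (m2), not taken from the files.
major, minor, angle = principal_dispersion(5.0e6, 5.0e6, 3.0e6)
print(major, minor, angle)
```

The angle refers to the local (xc, yc) axes of the model grid, not to the East direction.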

------
* dws_boundaries_200x200m.nc
This file is only used for plotting and contains the coordinates of the boundary of the Dutch Wadden Sea (DWS):
 - Dimensions: xy: 2, np_dws: 1797
 - Variables:
    bdr_dws(np_dws, xy): xy-coordinates of the boundary of the DWS (m)


#-------------------------------------------------------------#
# Information about raw numerical data and the ConvLSTM model #
#-------------------------------------------------------------#

The NetCDF files provided in this repository are generated from the following raw data:

* Eulerian data from the GETM/GOTM model, whose set-up is described in:
    - Duran-Matute et al. (2014): https://doi.org/10.5194/os-10-611-2014
    - Grawe et al. (2016): https://doi.org/10.1002/2016JC011655
* Lagrangian data from the model Parcels v2.4.2, which can be installed from:
    - https://anaconda.org/conda-forge/parcels
    - https://oceanparcels.org
* The ConvLSTM model used in our study is built using PyTorch (https://anaconda.org/pytorch/pytorch), and its implementation is described in:
    - Liu et al. (2021): https://doi.org/10.1175/MWR-D-20-0113.1