Funding support from the National Science Foundation, Department of Energy, USGS, Google.org, the Gates Foundation, and other organizations is greatly appreciated. Thank you for enabling innovation!

Selected funded projects (in reverse chronological order of award)

 

12. Cooperative Institute for Research to Operations in Hydrology (CIROH)

Phase 1. Improving the integration of ML with physically-based hydrologic and routing modeling via large-scale parameter and structure learning schemes.

Penn State joins CIROH in developing capabilities to improve water management for the nation, with Shen as the main Penn State lead.   

11. National Science Foundation EAR-2221880  

Towards better understanding of global low flow dynamics under climate change with next-generation, differentiable global hydrologic models

10. US Dept of Interior G21AC10563-00. 

Process learning in stream temperature modeling.

 

 

9. Gates Foundation INV-018429. 

Low cost pest and climate change stress prediction using Artificial Intelligence

 

 

8. US Department of Energy DE-SC0021979. 

A highly efficient deep-learning-based parameter estimation and uncertainty reduction framework for ecosystem dynamics models

 

7. (Instrument grant) NSF PHY-2018280

MRI: Acquisition of a Purpose-Built Deep Learning Compute System to Advance Fundamental Research and Education at Penn State. 

6. National Science Foundation PRISM OAC #1940190

Website

The natural-human world is characterized by highly interconnected systems, in which a single discipline is not equipped to identify broader signs of systemic risk and mitigation targets. For example, what risks in agriculture, ecology, energy, finance and hydrology are heightened by climate variability and change? How might risks in, for example, space weather, be connected with energy, water and finance? Recent advances in computing and data science, and the data revolution in each of these domains have now provided a means to address these questions. The investigators jointly establish the PRISM Cooperative Institute for pioneering the integration of large-scale, multi-resolution, dynamic data across different domains to improve the prediction of risks (potentials for extreme outcomes and system failures). The investigators' vision is to develop a trans-domain framework that harnesses big data in the context of domain expertise to discover new critical risk indicators, holistically identify their interconnections, predict future risks and spillover potential, and to measure systemic risk broadly. This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity.  

 

5. Google AI Impact Challenge. Tides 1904-57775

This project is one of the 20 grantees of the Google AI Impact Challenge in 2019. The goals of the project are to (1) create a landslide database focusing on events that were not previously reported in the news; and (2) build a model that improves our ability to predict landslide hazards. We will extensively leverage modern AI technologies and big datasets to achieve these goals. The project starts in June 2019. The mission of this project is to minimize the societal impact of landslide hazards through better predictive capability. This site supports the project by providing progress-tracking documents (available to project personnel), hosting wiki materials, links to data, and tutorials, and announcing news, updates and a results gallery.

Personnel involved: many. See dedicated deepLDB website

4. National Science Foundation EAR#1832294 (finished)

Examining groundwater-flood and soil moisture-flood relationships across scales using national-scale data mining, deep learning and knowledge distillation

In many parts of the United States, it has been shown that groundwater levels and soil moisture, which quantifies the wetness of the soil, are connected via the mechanism of flood production. Water cannot infiltrate into the ground when groundwater is close to the surface and is thus forced to quickly run off to rivers, creating higher flooding risks. However, the relationship between groundwater and floods has been found to be highly diverse and difficult to predict. Depending on terrain, groundwater depth, and many other factors, floods lead groundwater increases in some cases, while groundwater can lead floods in others. Previous research from selected experimental watersheds has not resulted in a comprehensive and transferable understanding of the controlling processes. This project will take a big-data, machine learning approach to enhance our understanding of this relationship, allowing us to heuristically exploit previously under-utilized groundwater data to improve flood predictions and reduce damages. Using patterns learned from national-scale groundwater and streamflow data, the machine learning algorithms will create plausible groundwater-flood relationships. Taking advantage of the big hydrologic data from available satellite missions, this project will create shared undergraduate course modules to enhance students' ability to work with big data and increase their awareness of global water issues. This research advances hydrologic science by answering the following overarching question: at catchment scales, do groundwater levels in the catchment provide predictive power for flood threshold functions and baseflow? We will address this question in multiple small steps. We will identify the kinds of groundwater-rainfall-runoff (GW-P-Q) relations that can be found over the Continental United States.
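One simple way to screen whether groundwater leads floods or the reverse at a given site is a lagged cross-correlation. The sketch below is only an illustration of that idea and is not the project's method; the helper names `pearson` and `best_lag` and the synthetic series are assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_lag(groundwater, streamflow, max_lag):
    """Return (lag, r) maximizing corr(groundwater[t], streamflow[t + lag]).

    lag > 0 means groundwater leads streamflow; lag < 0 means the reverse.
    """
    n = len(groundwater)
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            g, q = groundwater[: n - lag], streamflow[lag:]
        else:
            g, q = groundwater[-lag:], streamflow[: n + lag]
        r = pearson(g, q)
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic (made-up) data: streamflow repeats the groundwater signal 3 steps later.
gw = [math.sin(0.3 * t) + 0.5 * math.sin(0.07 * t) for t in range(200)]
flow = [0.0] * 3 + gw[:-3]
lag, r = best_lag(gw, flow, max_lag=10)
```

A positive `lag` here indicates that the groundwater signal precedes the streamflow signal. Real records would need gap handling and detrending before such a screening is meaningful.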

Personnel involved: Dapeng Feng

List of papers

3. Department of Energy award DE‐SC0016605 (ongoing)

HyperFACETS: A Framework for Improving Analysis and Modeling of Earth System and Intersectoral Dynamics at Regional Scales

All work associated with this project has been categorized into seven interwoven tasks. To accomplish its objectives, Hyperion incorporates (1) continuous outreach and engagement to ensure a focus on stakeholder needs, (2) development and accumulation of metrics associated with processes, features, and outcomes, (3) a software suite capable of directly evaluating the quality of regional climate and hydrological datasets, (4) production of high-quality regional climate and hydrological data that can be used for broader applications, including future projections, (5) model optimization and sensitivity experiments so as to maximize their capability to credibly simulate the integrated hydroclimate system, (6) an assessment of uncertainty linking process-based representations, model coupling, and resolution to stakeholder-relevant outcomes and (7) sensitivity experiments to characterize the magnitude and scale of irrigation influences on the climate system. These tasks are addressed in detail below.

Shen's team is using both ML-based and process-based hydrologic models to examine future hydrologic trends and improve hydrologic understanding.

PSU personnel involved: Wen-Ping Tsai (just departed).

2. Department of Energy award DE‐SC0010620 (2013-2018, finished)

Scale-aware, improved hydrological and biogeochemical simulations of the Amazon under a changing climate

With this project, we examined soil moisture scaling across landscapes via fractal, moment matching and principal orthogonal decomposition (in collaboration). We studied the water balance of central Amazon basins, including evapotranspiration, streamflow and water storage, and found that annual precipitation and groundwater are the main controls of streamflow in the region. By validating our model with experimental data, we also gained substantial insight into the seasonal and inter-annual variability of water sources in a floodplain lake in the Amazon, helping to improve future designs of large-scale models for these systems. We also improved computational infrastructure and data structures in Fortran. Data mining efforts were included in this study. Fifteen peer-reviewed papers were published from this project, of which 12 involved PSU. One more paper is under review.

Here is the list of publications from this project.

PSU personnel involved: Xinye Ji, Kuai Fang

1. Evaluating the impact of renewable energy plants water use on groundwater resources in the Chuckwalla Basin, CA (finished)

We studied natural recharge in the Chuckwalla basin, CA, which has seen significant solar power development. Water and energy needs are in conflict in this region, so an adequate estimate of renewable recharge is important. Our paper, constrained by moisture and groundwater observational data, put the estimated recharge in the middle of the range of previous estimates.

Motivated by this study, a side effort examined the worth of hydraulic conductivity and groundwater head data.

PSU personnel involved: Kuai Fang, Tasnuva Mahjabin

Papers: 1. CA desert recharge paper; 2. Groundwater information content paper

 

Research Activities (under construction) 

We are interested in continually integrating process-level descriptions, including wetland, desert and mountain processes, into the Process-based Adaptive Watershed Simulator (PAWS). Nutrient/bacteria transport, novel datasets, and improved algorithms are also being incorporated. We strive toward a trans-disciplinary, computationally efficient tool that bridges process scientists and field scientists across a wide range of disciplines and helps discover potential linkages in the context of accurate hydrologic predictions. Below are some of our recent and current research activities.

I. Hydrologic Deep Learning

 

Fang, K.*, CP. Shen, D. Kifer and X. Yang, Prolongation of SMAP to Spatio-temporally Seamless Coverage of Continental US Using a Deep Learning Neural Network, Geophysical Research Letters, (2017), doi: 10.1002/2017GL075619, preprint accessible at: arXiv:1707.06611.

Here at the Multi-scale Hydrology, Processes and Intelligence group, we study how Mother Nature works using state-of-the-art machine learning techniques, especially time series deep learning. We first used time series deep learning to examine large, raw hydrologic data. Deep learning is not the end. It is a means to advance our understanding of hydrology and a path toward stronger predictive capability.

We used a Long Short-Term Memory (LSTM) network to reproduce soil moisture dynamics. We show it is not only viable but also more robust than simpler statistical methods. This research shows it is possible to hindcast soil moisture dynamics using deep learning, opening up a range of new possibilities for advancing science.
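For readers unfamiliar with the LSTM, the following is a minimal sketch of the standard LSTM cell update in plain Python. It is not the group's implementation: the weights are random and untrained, and the names `lstm_step`, `run_lstm` and the two-variable forcing series are assumptions made purely for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step.

    x: input vector, h: previous hidden state, c: previous cell state.
    W[k], U[k], b[k] are input weights, recurrent weights and biases for
    gate k in {"i", "f", "o", "g"} (input, forget, output, candidate).
    """
    def affine(gate):
        return [
            sum(w * xv for w, xv in zip(W[gate][j], x))
            + sum(u * hv for u, hv in zip(U[gate][j], h))
            + b[gate][j]
            for j in range(len(h))
        ]
    i_gate = [sigmoid(z) for z in affine("i")]
    f_gate = [sigmoid(z) for z in affine("f")]
    o_gate = [sigmoid(z) for z in affine("o")]
    cand = [math.tanh(z) for z in affine("g")]
    # The cell state carries memory: keep part of the old state, add new input.
    c_new = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
    h_new = [o * math.tanh(cv) for o, cv in zip(o_gate, c_new)]
    return h_new, c_new

def run_lstm(inputs, n_hidden, seed=0):
    """Run an untrained LSTM with random weights over a sequence of inputs."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    gates = ("i", "f", "o", "g")
    W = {k: [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
             for _ in range(n_hidden)] for k in gates}
    U = {k: [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
             for _ in range(n_hidden)] for k in gates}
    b = {k: [0.0] * n_hidden for k in gates}
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    states = []
    for x in inputs:
        h, c = lstm_step(x, h, c, W, U, b)
        states.append(h)
    return states

# Hypothetical daily forcing: [precipitation, temperature] (made-up numbers).
forcing = [[0.0, 0.3], [1.2, 0.1], [0.4, 0.2], [0.0, 0.5]]
hidden_states = run_lstm(forcing, n_hidden=3)
```

In practice the hidden state would be passed through a trained output layer to produce a soil moisture estimate, and the weights would be learned from data with a deep learning framework.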

LSTM

In all tests, LSTM performs better on the test set than a simple feedforward neural network, autoregressive (AR) models, and regularized linear regression (LR).

LSTM outperforms other methods

 

LSTM is also suitable for long-term hindcasting:

long-term hindcast using LSTM

 

II. Large-scale, integrated assessment of carbon-nutrient-water interactions

 

Strong controls and feedbacks have been identified between water, carbon and nitrogen biogeochemistry in various environments, particularly in ecosystems where vegetation growth is limited by nutrient availability [Galloway et al., 2004; C A Wilson et al., 1999]. The relative importance of nitrogen (N) is difficult to evaluate from field surveys alone due to the various time scales of the complex N biogeochemical reactions, its interactions with the carbon cycle, and its covariance with climatic input (e.g. see [Burke et al., 1997]).
Using a hydrologic model equipped with comprehensive land surface processes, we are able to identify and evaluate previously unrecognized linkages between carbon, nutrients and water. Using Analysis of Variance (ANOVA), we quantitatively determine the strengths of these controls on ET, net primary production (NPP) and other important variables. Groundwater flow is found to be the major control on runoff and infiltration, with soil texture ranking next, while vegetation type and nitrogen levels are found to dominate NPP, top soil temperature and transpiration. Soil texture and groundwater are found to have comparable influence on soil moisture, which is in agreement with analysis of field data in the literature. All controls are found to co-limit ET, which serves as the nexus for ecosystem-hydrology interactions. From the simulation results, we find that nitrogen significantly controls transpiration, through which it influences other hydrologic fluxes.
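To illustrate how ANOVA can rank controls, the sketch below computes the fraction of total variance explained by each factor's main effect over a full factorial set of runs. It is a generic illustration, not the analysis used in the study; the function `main_effect_fractions`, the factor levels, and the toy response standing in for a simulated NPP are all assumptions.

```python
from itertools import product

def main_effect_fractions(levels, response):
    """Fraction of total variance explained by each factor's main effect.

    levels:   dict mapping factor name -> list of levels (full factorial design)
    response: function mapping {factor: level} to a simulated output value
    """
    names = list(levels)
    runs = [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]
    y = [response(run) for run in runs]
    grand_mean = sum(y) / len(y)
    ss_total = sum((v - grand_mean) ** 2 for v in y)
    fractions = {}
    for name in names:
        ss_factor = 0.0
        for level in levels[name]:
            group = [v for run, v in zip(runs, y) if run[name] == level]
            ss_factor += len(group) * (sum(group) / len(group) - grand_mean) ** 2
        fractions[name] = ss_factor / ss_total if ss_total > 0 else 0.0
    return fractions

# Made-up factor levels and a made-up additive response (not model output).
levels = {"vegetation": [0, 1, 2], "nitrogen": [0, 1], "soil_texture": [0, 1]}

def toy_npp(run):
    return 3.0 * run["vegetation"] + 1.0 * run["nitrogen"] + 0.1 * run["soil_texture"]

frac = main_effect_fractions(levels, toy_npp)
```

Because the toy response is purely additive, the main-effect fractions sum to one; with a real model, the remainder would be attributed to interactions between controls.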

Enhancement: Wetland
As one of the recent changes, PAWS adds a lowland storage (LS) component to the overland flow equation, aimed at simulating wetlands, paddy fields, puddles, potholes and other low-lying features of the landscape. This compartment provides storage capability to the overland flow domain, depending on local topography and land cover. PAWS assumes that the wetlands are all connected to the main flow paths. The formulation is based on the experience that depression storage exists after rainfall even if soil on higher ground is dry. This compartment describes local concentration of water in a cell and allows groundwater to exfiltrate before the entire soil column is saturated. In reality, groundwater return flow to the surface occurs when groundwater head rises above the lowest point of surface elevation, and full saturation of the soil is not required. Since sub-cell variation of elevation is considered in this formulation, it can partially compensate for limited grid resolution.
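The exfiltration condition described above can be sketched as a comparison between groundwater head and sub-cell surface elevations. This is a simplified illustration and not the PAWS formulation; the function `exfiltration_state` and the elevation samples are hypothetical.

```python
def exfiltration_state(gw_head, subcell_elevations):
    """Return (can_exfiltrate, wet_fraction) for one grid cell.

    gw_head:            groundwater head in the cell (m)
    subcell_elevations: surface elevations sampled within the cell (m)
    """
    lowest = min(subcell_elevations)
    # Return flow can begin once head exceeds the lowest surface point,
    # even though head is still below the cell-mean elevation.
    can_exfiltrate = gw_head > lowest
    wet = sum(1 for z in subcell_elevations if z < gw_head)
    return can_exfiltrate, wet / len(subcell_elevations)

# Made-up cell: mean surface elevation 101.0 m, lowest point 100.2 m.
elev = [100.2, 100.6, 101.0, 101.4, 101.8]
state = exfiltration_state(100.8, elev)
dry_state = exfiltration_state(100.0, elev)
```

With a head of 100.8 m, which is below the cell-mean elevation, the low-lying part of the cell is already flagged as receiving return flow. A formulation based only on the mean elevation would miss this.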

wetland compartment sketch

Model validations (additional validations)

Clinton stream hydrograph

soil temperature

LAI

monthly ET comparison with MODIS
soil moisture
(soil moisture comparison is from Red Cedar River watershed)

Fluxes

spatial fluxes

Relative importance of controls

relative importance of controls for different variables

III. Water - energy nexus

(more details coming soon...)

IV. Multi-scale, high performance integrated modeling

 

We are in the process of parallelizing PAWS.

V. Multi-scale, high performance sub-surface reactive transport modeling

 

While working at Lawrence Berkeley National Lab, Dr. Shen developed a new approach that combines the robustness of algebraic multigrid with an efficient and accurate finite volume method to solve elliptic problems arising from microscale flow and transport in complex geometries. We use leadership computing facilities to simulate reactive transport at the pore-scale. The high performance PDE package CHOMBO and the geochemical package Crunchflow were employed.
This approach allows reactive transport to be resolved at the pore scale, providing better estimates of reaction rates and a tool to examine scale issues.
Report.
Deficiencies of geometric MG:

Geometric Multigrid
Discrete geometry of crushed calcite packed in a cylinder (not shown) obtained from image data using EB representation. Image obtained at the Advanced Light Source courtesy of Jonathan Ajo-Franklin and Li Yang, Lawrence Berkeley National Laboratory:

geometries of the crushed calcite.

 

The solutions below were obtained on NERSC's Hopper system using 2048 processors.

solution

Pressure

VI. Adaptive mesh refinement for multiscale modeling

 

Adaptive mesh refinement (AMR) is a class of computational fluid dynamics (CFD) methods that attempt to capture fine-scale features and improve numerical accuracy at lower computational cost by dynamically placing hierarchical computational grids where they are needed. This concept was popularized by Berger and Colella in their 1989 paper.
Shen (Journal of Computational Physics 2011) coupled AMR with the Weighted Essentially Non-Oscillatory (WENO) scheme, a shock-preserving high-order numerical scheme (3rd and 5th order were demonstrated), to utilize the salient features of both methods. The combined method can simulate hydrodynamics at very high accuracy and significantly reduced cost.
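As background on the WENO idea, the sketch below implements the widely used fifth-order reconstruction of Jiang and Shu at a single cell interface. It is a textbook formulation offered as an assumption about the general technique, not the specific variant or code from the paper cited above; the function name `weno5_interface` and the test data are hypothetical.

```python
def weno5_interface(v, i, eps=1e-6):
    """Left-biased fifth-order WENO value at interface i+1/2.

    v holds cell averages; requires 2 <= i <= len(v) - 3.
    """
    vm2, vm1, v0, vp1, vp2 = v[i - 2], v[i - 1], v[i], v[i + 1], v[i + 2]
    # Third-order candidate reconstructions on the three sub-stencils.
    q0 = vm2 / 3 - 7 * vm1 / 6 + 11 * v0 / 6
    q1 = -vm1 / 6 + 5 * v0 / 6 + vp1 / 3
    q2 = v0 / 3 + 5 * vp1 / 6 - vp2 / 6
    # Smoothness indicators: large where a sub-stencil crosses a discontinuity.
    s0 = 13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2
    s1 = 13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    s2 = 13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2
    # Nonlinear weights built from the linear weights (1/10, 6/10, 3/10).
    a0 = 0.1 / (eps + s0) ** 2
    a1 = 0.6 / (eps + s1) ** 2
    a2 = 0.3 / (eps + s2) ** 2
    return (a0 * q0 + a1 * q1 + a2 * q2) / (a0 + a1 + a2)

smooth = [float(j) for j in range(7)]        # cell averages of a linear profile
step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]        # discontinuity between cells 2 and 3
at_smooth = weno5_interface(smooth, 3)
at_step = weno5_interface(step, 2)
```

On the linear profile the reconstruction recovers the interface value between cells 3 and 4. Next to the jump, the weights shift almost entirely to the smooth sub-stencil, so the value stays near the left state instead of overshooting.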

Grid hierarchy:
Grid structure of AMR

Construction of fine grid and updating of coarse solution
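The two grid-transfer steps named above can be sketched in 1D as follows. This is a deliberately simplified illustration, assuming piecewise-constant injection and plain averaging with a refinement ratio of 2. Production AMR codes typically use higher-order interpolation and flux correction at coarse-fine boundaries. The function names and values are hypothetical.

```python
def prolong(coarse, ratio=2):
    """Construct fine cells by piecewise-constant injection from coarse cells."""
    return [value for value in coarse for _ in range(ratio)]

def restrict(fine, ratio=2):
    """Average each group of `ratio` fine cells into one coarse cell."""
    return [sum(fine[k:k + ratio]) / ratio for k in range(0, len(fine), ratio)]

def update_coarse(coarse, fine_patch, start, ratio=2):
    """Overwrite the coarse cells covered by a fine patch starting at coarse index `start`."""
    updated = list(coarse)
    averaged = restrict(fine_patch, ratio)
    updated[start:start + len(averaged)] = averaged
    return updated

coarse = [1.0, 2.0, 4.0, 8.0]
initial_fine = prolong(coarse[1:3])      # fine patch over coarse cells 1 and 2
evolved_fine = [2.0, 3.0, 3.0, 5.0]      # made-up result of advancing the fine patch
new_coarse = update_coarse(coarse, evolved_fine, start=1)
```

Averaging after injection returns the original coarse values, so this pair of operators conserves the cell-averaged quantity on a uniform grid.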

AMR demo

1D Euler equation (gas):

Shock/turbulence problem

1D shallow water equation, dam break problem:

Dam Break

2D Euler equation:

2D Euler double Mach reflection

2 level AMR patches