Predicting equilibrium channel geometry with a neural network

In an attempt to learn more about machine learning, I decided to just jump in and try a project: predicting equilibrium channel geometry with a simple neural network.

[source code]

All in all, I’m fairly sure I didn’t learn anything about equilibrium channel geometries, but I had some fun and learned an awful lot about machine learning and neural networks. A lot of the concepts I have been reading about for weeks finally clicked when I actually started to implement the simple network.

I decided to use the data from Li et al., 2015 [paper link here], which contains data for 231 river geometries.

The dataset includes bankfull discharge, width, depth, channel slope, and bed-material D50 grain size for each river.

We want to be able to predict the width, depth, and slope from the discharge and grain size alone. This is typically a problem, because we are trying to map two input features onto three output targets. In this case, though, the model works because two of the outputs, depth (H) and width (B), are highly correlated.

The network is a simple ANN with one hidden layer of three nodes, trained with stochastic gradient descent. The training curve is shown below.
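For anyone curious what that looks like under the hood, here is a minimal sketch of a 2-3-3 network trained with stochastic gradient descent. This is not the code linked above: the random placeholder data, the tanh activation, the learning rate, and the number of epochs are all assumptions for illustration, but the architecture and the update rule match the description.

% Minimal sketch (not the linked code) of a 2-input, 3-hidden-node,
% 3-output network trained with stochastic gradient descent.
rng(1);
X = randn(2, 231);                           % placeholder inputs: [discharge; D50]
Y = randn(3, 231);                           % placeholder targets: [width; depth; slope]

W1 = 0.1 * randn(3, 2);  b1 = zeros(3, 1);   % hidden layer, 3 nodes
W2 = 0.1 * randn(3, 3);  b2 = zeros(3, 1);   % linear output layer
lr = 0.01;                                   % learning rate (assumed)

for epoch = 1:500
    for i = randperm(size(X, 2))             % one sample at a time (SGD)
        x = X(:, i);  t = Y(:, i);
        h = tanh(W1 * x + b1);               % forward pass
        y = W2 * h + b2;
        e = y - t;                           % gradient of 0.5*||y - t||^2 w.r.t. y
        dW2 = e * h';        db2 = e;        % backpropagate to each layer
        dh = (W2' * e) .* (1 - h.^2);        % tanh derivative
        dW1 = dh * x';       db1 = dh;
        W2 = W2 - lr * dW2;  b2 = b2 - lr * db2;
        W1 = W1 - lr * dW1;  b1 = b1 - lr * db1;
    end
end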

Matlab speed comparison of switch-case statements, if-then statements, and hard-coding

I frequently use the extremely methodical approach in scientific programming of “just trying things”. This means that I create a lot of different ways to do something in order to find the best way to do that thing. Sometimes “best” means a realistic result, sometimes a stable result, and sometimes a fast computation.

Virtually all programming languages offer if-then statements, whereby a block of code is executed when a condition evaluates to true. An if-then statement can be extended with elseif clauses to test further conditions. Using this framework, different “cases” or “methods” of doing a single thing can be quickly tested by switching between them in a script. The purpose of all this is to make the code more readable and maintainable, by not having to comment/uncomment blocks of code to change the calculation method.

For example, two methods, selected by setting x = 'test1' or x = 'test2', could be executed by a function, depending on the definition of x:

if strcmp(x, 'test1')
    y = 1;
elseif strcmp(x, 'test2')
    y = 2;
end

Similar functionality can be obtained with a switch-case statement:

switch x
    case 'test1'
        y = 1;
    case 'test2'
        y = 2;
end

But which is faster? I suppose I always assumed switch-case statements were slower than if-then statements, but I wanted to know how much of a difference it could make. I also often have more than four of these cases as optional methods, and I wanted to know if the number of cases (or really, how “deep” down the list the matching case sat) made a difference in speed.

I designed a simple experiment in Matlab to test this out. I looped through a simple set of statements 1 million times, and timed each scenario. You can find the source code here.
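The linked script is the real test; the sketch below just illustrates the idea of the experiment (the tested value of x and the variable names are arbitrary), timing an if-then ladder, a switch-case, and a hard-coded assignment on the same input.

% Sketch of the timing experiment (see the linked source for the real thing):
% time 1e6 evaluations of an if-then ladder, a switch-case, and a hard-coded
% assignment, all selecting the same method.
n = 1e6;
x = 'test2';

tic
for k = 1:n
    if strcmp(x, 'test1')
        y = 1;
    elseif strcmp(x, 'test2')
        y = 2;
    end
end
t_if = toc;

tic
for k = 1:n
    switch x
        case 'test1'
            y = 1;
        case 'test2'
            y = 2;
    end
end
t_switch = toc;

tic
for k = 1:n
    y = 2;                                   % hard-coded, no branching
end
t_hard = toc;

fprintf('if-then: %.3f s, switch-case: %.3f s, hard-coded: %.3f s\n', ...
    t_if, t_switch, t_hard)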

It turns out that switch-case statements are about 30% slower than if-then statements. Both are more than an order of magnitude slower than simply hard-coding the calculation.

Most importantly though, the time increases linearly with each “layer” of the if-then or switch-case statement. To me, this stressed the importance of 1) keeping the number of cases that are not simply hard-coded in small, or 2) at least sorting the cases in order of how likely they are to be used during program execution.

Markov Chain stratigraphic model

I recently returned from the NCED2 Summer Institute for Earth Surface Dynamics at Saint Anthony Falls Laboratory at University of Minnesota, which is a 10-day long, intense workshop for early-career scientists to talk about novel methods and ideas in our field. This was the ninth, and hopefully not last, year of the meeting.

Typically the participants do some sort of physical experiment (the kind of work the lab is known for being the best in the world at), but this year’s theme was mathematics in Earth surface processes. We covered a range of subjects, but a lot of discussion came back to random walks, diffusion-like processes/simulations, and probability. Kelly Sanks and I worked on a project together that combined a lot of these ideas.

Our question was: are deltaic sedimentation, and the resulting stratigraphy, random? Specifically, we hypothesized that a pure “random walk” model cannot capture the effect of the “stratigraphic filter”, but that a biased random walk model, where Δz depends on the previous timestep’s Δz, can capture the dynamics of the system well enough to look like an actual stratigraphic sequence. The null hypothesis, then, is that both random walk models are indistinguishable from a real stratigraphic succession.

To attack the problem, we decided to use a simple biased random walk model: a Markov chain. These models incorporate no physics, only probability, which made the project simple and easy to complete over beers at night. A Markov chain can be summarized as a sequence of “states” where the next state is chosen at random, with some probability of switching to each other state or staying in the same state. Repeating this over and over gives a chain/sequence of states. In our simulation, the “states” were changes in elevation (Δz). Said another way, the next timestep’s change in elevation depends on the current timestep’s change in elevation. This hypothesis is grounded in the physics of river deltas, whereby the channel location (and thus the locus of deposition/erosion and elevation change) does not jump around immediately and at random, but rather migrates somewhat gradually.

When a system has more than a few states it becomes complicated to keep track of the probabilities, so we naturally use a matrix to define them. The so-called “transition matrix” records the probability that, given the current state (rows), the next state will be any particular state (columns).
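As a sketch of what building such a matrix looks like (the Δz series and bin edges here are placeholders, not the experimental data), the transitions can simply be counted and the rows normalized:

% Sketch of estimating a transition matrix from an observed dz series
% (placeholder data here; the real analysis pools dz from every location
% in the experiment). Rows are the current state, columns the next state.
dz = randn(10000, 1);                        % placeholder dz series (mm/hr)
edges = linspace(min(dz), max(dz), 21);      % bin edges defining 20 states
state = discretize(dz, edges);               % assign each dz to a state
nstates = numel(edges) - 1;

T = zeros(nstates);                          % transition counts
for k = 1:numel(state) - 1
    T(state(k), state(k+1)) = T(state(k), state(k+1)) + 1;
end

rowsum = sum(T, 2);
T = T ./ max(rowsum, 1);                     % normalize rows to probabilities
T(rowsum == 0, :) = 1 / nstates;             % unvisited states: assume uniform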

For our data we used the Tulane Delta Basin experiment 12-1, which has a constant Qw/Qs ratio and a constant relative sea-level rise (RSLR) rate. This is more or less necessary for our question, because we needed to know the elevation at some regular interval, and the resulting stratigraphy after deposition and erosion. The experimental delta surface was measured to mm accuracy every hour for 900 hours. We calculated Δz values over the entire spatiotemporal domain of the experiment to inform our main biased random walk model.

Markov transition matrix of Δz values calculated from the experimental elevation profiles (axes are in mm/hr).

The main data-trained transition matrix shows that states tend toward being slightly aggradational. This makes sense, since the system is deterministically net aggradational due to the RSLR forcing. We compared this data-trained model with two synthetic transition matrices: one drawn from a Gaussian distribution (intentionally similar to the data) and one from a uniform distribution (i.e., purely random).

We then simulated the elevation change sequences predicted by each of the transition matrices and used the stratigraphic filter concept to determine the resultant stratigraphy. We did this a bunch of times, and totally chose one of the more “convincing” results (i.e., where the slope difference between simulations was larger).
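The simulation itself is also only a few lines. Here is a self-contained sketch (the tiny three-state transition matrix and Δz values are placeholders, not our trained matrices) that draws a Δz sequence, accumulates it into an elevation history, and applies the stratigraphic filter by keeping only elevations that are never eroded below at any later time:

% Sketch of simulating a dz sequence from a transition matrix and applying
% the stratigraphic filter. The matrix and state values below are small
% placeholders; the real matrices come from the data or the synthetic
% Gaussian/uniform distributions.
centers = [-1, 0, 1];                        % dz (mm/hr) represented by each state
T = [0.6 0.3 0.1;                            % placeholder transition matrix
     0.2 0.5 0.3;                            % (each row sums to 1)
     0.1 0.3 0.6];
nsteps = 900;

seq = zeros(nsteps, 1);
s = 2;                                       % arbitrary starting state
for k = 1:nsteps
    s = find(rand <= cumsum(T(s, :)), 1);    % draw the next state
    seq(k) = centers(s);
end

eta = cumsum(seq);                           % synthetic elevation time series
strat = flipud(cummin(flipud(eta)));         % stratigraphic filter: keep only
                                             % elevations never eroded below
                                             % at any later time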

a SINGLE realization of the modeled Markov sequences.

We found that the data-trained model reproduced the stratigraphic sequences from the experiment (based only on comparing slopes…). The other models were not far off either, suggesting that deltaic sedimentation is in fact not that far from purely random.

Ultimately, we would want to simulate each model hundreds of times and interpret ensembles of the results. We would also want to compare model and data with more meaningful statistics of the stratigraphic sequences, such as the number of “channel cuts”, unconformity-bed thicknesses, “drowned” sequences, etc.

The main source code for all of this lives on GitHub here (https://github.com/amoodie/markov_delta) if you are interested in checking anything out or exploring some more. You will need to obtain the data from the SEN website listed above, or ask me to send you a copy of the data formatted into a `.mat` file (it’s too big to post here…). If you format the data yourself, produce a 3D matrix whose x-y-z dimensions are time, across-basin position, and along-basin position.

Authors of the code and project are Kelly Sanks (@kmsanks) and Andrew J. Moodie (@amoodie).

Rivers 2 Stratigraphy

Explore the construction of stratigraphy through the interaction of channel geometry, lateral migration rate, subsidence rate, and avulsion frequency — interactively!

Imagine there is a river in a subsiding basin. As the river laterally migrates, it leaves behind sandy bar deposits, which are encapsulated in a matrix of floodplain muds.

As the river deposits are lowered away from the basin surface by subsidence, they become part of the stratigraphic record (figure below). Depending on the amount of lateral migration, the frequency of avulsion, and the subsidence rate, the degree of overlap of channel bodies in the stratigraphic deposit will vary. The movie above shows a vertical slice across the basin width, with a channel (the colorful boxes) laterally migrating across the river floodplain.

The movie comes from an educational module I am working on called Rivers2stratigraphy (https://github.com/amoodie/rivers2stratigraphy). The module is open source and relies only on Python dependencies.

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1450681. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

A LaTeX package for peer review

I recently needed to do a peer review for a co-author’s manuscript. I rarely use MS Word or LO Writer anymore except for really simple documents (e.g., a quick abstract I will have to copy and paste into a browser form anyway), and instead use LaTeX for nearly all of my writing needs. So naturally, I needed a way to do my peer review in LaTeX.

There were no packages I could find that offered the functionality I was looking for: a simple environment for writing comments, automatic numbering of the comments, and a place to record the relevant line numbers.

Enter my custom package peer_review: a LaTeX package for commenting and responding throughout the peer review process (hosted on GitHub). The package offers a few simple environments and commands that make doing a peer review pretty painless, with LaTeX-quality typesetting to boot!

This is a demo of the package, and the demo file is included in the repository. This demo uses a class file from my collaborator Eric Barefoot called compact_proposal (also hosted on GitHub), but the package can be used on top of the base article class too.

If others end up picking up this package, or I get it really refined, I may try to release it on CTAN. But for now, it can be downloaded from GitHub as a zip or cloned with:

git clone https://github.com/amoodie/peer_review.git

There are installation instructions in the project README, and a man page with instructions on using the package, complete with examples.

Python 3 silly random name generator

I was recently working on a project to try to annoy my collaborator Eric with a scheduled script that sends him an email whenever there are open pull requests that I authored in a GitHub repository he owns (pull-request-bot).

Along the way, I created this fun little function for coming up with random, sort-of-realistic but mostly funny-sounding names in Python. I’m in the process of switching from Matlab to Python, so little toy functions and projects like this are a good way for me to learn the nuances of Python faster. In fact, I would argue this is the best way to learn a new programming language: no need for books or classes, just try things!

The script is super simple and I have posted it as a GitHub Gist, so I won’t describe it all again here. Below the Markdown description of the function in the Gist is the actual Python file, which you can grab there or from here.

Density stratification in fine-grained rivers

2017 was the year of my first talk at the fall meeting of the American Geophysical Union. It was pretty exciting and I was extremely nervous, but in the end it went okay, and I was able to present my work to a broad range of scientists. I presented my ongoing research based on field surveys of the Yellow River, China, during flood.

I hypothesized that the river would exhibit a density stratification in the flow. Density stratification occurs in a river because the entrainment of sediment into the flow affects the bulk properties of the flow. I’ll explain with the help of a few graphics below. Each image is a profile of an open channel flow, where the thick black line is the channel bed and the top of the graphic is the water surface.

a. The velocity profile of a steady and uniform open channel flow is well described by the logarithmic “law of the wall”, or log-law. The log-law takes the form $\bar{u}(z) = \frac{u_*}{\kappa} \ln(z / z_0)$, which predicts the time-averaged velocity ($\bar{u}$) as a function of the shear velocity ($u_*$) and the log of the height above the bed ($z$), where $z_0$ is a reference height very near the bed and $\kappa$ is a constant. The equation, evaluated over the flow depth, is shown on the left side of figure a. Higher velocities near the surface mean that the momentum ($p = mv$) of the flow is higher near the flow surface than near the bed. This condition is unstable, so momentum is redistributed from the surface to the bed through mass transfer. When the flowing mass of high-momentum fluid reaches the bed, it dissipates, forming turbulent eddies that shed off the channel bed and move up into the water column.

b. The turbulent eddies coming off the channel bed cause sediment to be entrained into the flow and carried up from the bed towards the surface. The vertical distribution of sediment through the water column depends on the size of the particles and the entraining velocity, and can be estimated by the Rouse equation, $\frac{\bar{c}(z)}{\bar{c_b}} = \left[ \frac{(H - z)/z}{(H - b)/b} \right]^{Z_R}$, which predicts the time-averaged concentration ($\bar{c}$) as a function of the time-averaged near-bed concentration ($\bar{c_b}$), the height above the bed ($z$) relative to the reference height ($b$) and the flow depth ($H$), and the Rouse number ($Z_R = w_s / (\kappa u_*)$), which balances the settling velocity of the particles ($w_s$) against the entraining shear velocity ($u_*$). The Rouse equation and log-law work well only in dilute suspensions, that is, flows in which the concentration of sediment is small enough to have no feedback on the flow.

c. In flows where the sediment concentration near the bed is significant enough to feed back on the flow, a density stratification develops. In short, the high concentration of sediment prevents momentum redistributed from the surface from fully reaching the bed, which has the net effect of reducing sediment suspension and enhancing flow velocities near the surface.

Because sediment transport ($q_s$, the width-averaged transport rate) is the product of the velocity and concentration profiles integrated over the flow depth, $q_s = \int_b^H \bar{u}(z)\, \bar{c}(z)\, dz$, this density stratification could significantly alter total sediment transport rates in fine-grained rivers from existing predictions. My ongoing research in this area is trying to resolve precisely what conditions lead to the development of a density stratification in a river.
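To make that concrete, here is a sketch of the dilute (unstratified) calculation, numerically integrating the product of the log-law and Rouse profiles over the flow depth; all of the parameter values are made up for illustration.

% Sketch of the dilute (unstratified) transport calculation: integrate the
% product of the log-law velocity profile and the Rouse concentration
% profile over the flow depth. All parameter values are made up.
H = 5;                                       % flow depth (m)
ustar = 0.05;                                % shear velocity (m/s)
z0 = 1e-4;                                   % log-law roughness height (m)
kappa = 0.41;                                % von Karman constant
ws = 0.005;                                  % particle settling velocity (m/s)
b = 0.05 * H;                                % near-bed reference height for c (m)
cb = 0.01;                                   % near-bed volumetric concentration

z = linspace(b, H, 1000);                    % vertical coordinate
u = (ustar / kappa) * log(z / z0);           % log-law velocity profile
ZR = ws / (kappa * ustar);                   % Rouse number
c = cb * (((H - z) ./ z) * (b / (H - b))).^ZR;   % Rouse concentration profile
qs = trapz(z, u .* c);                       % width-averaged transport (m^2/s)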

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1450681. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Hurricane Harvey update

The Greater Houston metro area just got pounded by one of the largest storms (in terms of rainfall) on record: Hurricane Harvey. There was widespread flooding across all parts of the metro, but my home and Rice University were largely spared. The skies are clear today, after roughly five days of nearly continuous rainfall. However, the cleanup effort for this disaster will last years and cost billions of dollars. Nonetheless, I am certain that Houston will come back stronger and better than ever. This ordeal has made me even more proud to be a Houstonian.

At my house, we had a small leak in an interior bathroom, but because we received so much rain over the duration of the event, the roof became saturated and the sheetrock partially collapsed. No one was hurt, thankfully, and repairs are already underway.

I am currently working on an NSF RAPID grant, for which I compiled the following figure. The figure shows total rainfall during the Harvey event, with the City of Houston (CoH) labeled. The precipitation data were collected from the National Weather Service and summed to produce the total rainfall values in the plot below. I am proposing some work on the Brazos River (BR), so this feature is also labeled.

Outreach module — Flooding risk in low-lying landscapes

I have put together an outreach module that describes some of the risks of flooding in low-lying landscapes. The module runs in Matlab, either within a licensed environment or with the Matlab Runtime (which is freely available to anyone).

Accompanying the GUI is a worksheet that steps through all the aspects of the GUI and attempts to demonstrate the principles of flooding in deltas without detailing the math or physics behind the model. My hope is that it will be helpful to high school educators in their coursework.

So far, I have only written a worksheet targeted at 9th-12th graders, but I plan to write two more worksheets in the near future (one for younger students, and one for more advanced undergraduate/graduate students).

Below is a demonstration of the GUI. The full information for the module (including the source code) is on my GitHub here. The project website is here.

This material is based upon work supported by the National Science Foundation (NSF) Graduate Research Fellowship under Grant No. 1450681 and NSF EAR-1427177. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Accepted article — Continental divide location and mobility

My first peer-reviewed manuscript was recently accepted for publication in Basin Research! The manuscript explores the controls on continental divide location and mobility. In particular, we explore the impacts of exogenic (allogenic) forcings that adjust topographic gradients in the landscape, and we ascertain the relevant length scales for driving continental divide migration.

See the full publication here: http://onlinelibrary.wiley.com/doi/10.1111/bre.12256/full

A preview (submission version) can be found here, in accordance with the terms of the license agreement.

Appalachian 150 km filter overlain with contours of calculated total rock deformation (thick gray lines) since 3.5 Ma, resulting from the combined effects of mantle induced dynamic topography and the flexural response of the lithosphere to unloading and loading of sediments across the surface (Moucha & Ruetenik, 2017). Prince et al. (2011) suggest that the Roanoke River (R) will eventually capture the headwaters of the New River (N), causing the actual divide to jump farther west, ultimately approaching or reaching the synthetic divide. This prediction is consistent with patterns of rock deformation (Moucha & Ruetenik, 2017) and calculated χ values (Fig. 9), and crudely co-located with the Central Appalachian Anomaly (CAA) tomographically imaged by Schmandt & Lin (2014). Inset shows the record of sediment flux off the Appalachians into the Atlantic passive margin Baltimore Canyon Trough basin (Pazzaglia & Brandon, 1996). The unsteady flux is characterized by pulses in increased sediment deposition that are interpreted to result from large-scale drainage captures that rapidly incise an enlarged Atlantic slope drainage area. FZ – Fall Zone, scarp – Orangeburg, Chippenham, and Thornburg scarps from Rowley et al. (2013).

 

——————————————————————————

Exogenic forcing and autogenic processes on continental divide location and mobility

The position and mobility of drainage divides is an expression of exogenic landscape forcing and autogenic channel network processes integrated across a range of scales. At the large scale, represented by major rivers and continental drainage divides, the organization of drainage patterns and divide migration reflects the long-wavelength gradients of the topography, which are exogenically influenced by tectonics, isostasy, and/or dynamic topography. This analysis utilizes long-wavelength topography synthesized by a low-pass filter, which provides a novel framework for predicting the direction of divide movement as well as an estimate of the ultimate divide location, that is complementary to recent studies that have focused on the χ channel metric. The Gibraltar Arc active plate boundary and Appalachian stable plate interior, two tectonically diverse settings with ongoing drainage system reorganization, are chosen to explore the length scales of exogenic forcings that influence continental drainage divide location and migration. The major watersheds draining both the active and decay-phase orogens studied here are organized by topographic gradients that are expressed in long-wavelength low-pass filtered topography (λ ≥ 100 km). In contrast, the river network and divide location is insensitive to topographic gradients measured over filtered wavelengths < 100 km that are set by local crustal structures and rock type. The lag time between exogenic forcing and geomorphic response and feedbacks cause divide migration to be unsteady, and occur through pulses of drainage capture and drainage network reorganization that are recorded in sedimentological, geomorphic, or denudation data.

This material is based upon work supported by the National Science Foundation (NSF) Graduate Research Fellowship under Grant No. 1450681 and NSF EAR-1427177. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.