Machine learning in Geoscience Seminar: syllabus and review

I led the organization of a “Machine Learning in Geosciences” seminar this fall (2018). I did not do it alone; I worked with my advisor Jeff Nittrouer and Texas A&M professor Ryan Ewing. My responsibilities included selecting the reading material for the course; Jeff and Ryan handled the invited speakers and student presentations.

The seminar was not for credit, but we did have a healthy 10-12 students and faculty participate throughout the semester. Ultimately I think the seminar was a huge success; I learned an absolute ton about machine learning, and consider myself to be literate on the subject now. In fact, I’ve even begun to incorporate some deep learning into my research and have written a proposal for funding to continue to pursue this endeavor. 

The course design was roughly as follows:

  • four (4) weeks of background reading on machine learning tools, techniques, vocabulary
  • four (4) weeks of primary scientific readings on various geoscience subjects
  • three (3) weeks of invited presenters
  • two (2) weeks of student project presentations

You can view the complete, detailed syllabus we created here:
This page contains a ton of resources.

AGU posters 2018: SedEdu and Density Stratification

I’m presenting two posters at AGU this year!

I’m really excited to be sharing a project I’ve been working on which integrates research and teaching. I am giving a poster Friday morning titled: “SedEdu: developing and testing a suite of computer-based interactive educational activities for introductory sedimentology and stratigraphy courses”. The poster will explain my SedEdu project and particularly emphasize the rivers2stratigraphy module. Here is the abstract. Below is the poster; click the image to view a pdf.

I’m also looking forward to sharing an update on my ongoing research on the sediment transport in the Yellow River, China. This poster will be on Friday afternoon, and is titled: “Suspended-sediment induced stratification inferred from concentration and velocity profile measurements in the flooding lower Yellow River, China”. Here is the abstract.

If you’re going to be at AGU, let’s chat!

The Rouse-Vanoni-Ippen concentration profile interactive module

I have made another interactive GUI toy model thing for the teaching of the Rouse concentration profile. The activity is really simple right now, but I may add more features in the future. You can see the source code and download the module here.

You can specify the grain size and shear velocity (two of the major controls on the Rouse profile) and see how the profile responds in a simplified model (helpful for teaching) and the full Rouse model.

The simplified model is useful for teaching the mechanics of the concentration profile and can be defined as:

where c is the concentration at height above the bed z, cb is a known reference concentration defined at height b above the bed, and ZR is the Rouse number given as:

where ws is the settling velocity of the grain size in question, α and κ are constants equal to 1 and 0.41, respectively, and u* is the shear velocity.

You can see from the form of this simplified model how, for a constant Rouse number, the concentration decays with height above the bed from the reference concentration (strictly a power-law decay, with the Rouse number as the exponent). It is also easy to see how changing the Rouse number, through a change in grain size (settling velocity) or shear velocity, will change the rate of decay.

The full model has just one additional term to consider, but has the same basic decay form. Here, H is just the total flow depth.
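For reference, here is how these relations are usually written. This is a sketch assuming the standard textbook forms (in particular, that the simplified model is the power-law form); the symbols follow the definitions given in the text.

```latex
% Simplified model (assumed power-law form)
\frac{c}{c_b} = \left( \frac{z}{b} \right)^{-Z_R}

% Rouse number
Z_R = \frac{w_s}{\alpha \kappa u_*}

% Full Rouse model, with total flow depth H
\frac{c}{c_b} = \left[ \frac{(H - z)/z}{(H - b)/b} \right]^{Z_R}
```

The only difference between the two profiles is the factor involving H, which forces the concentration to zero at the water surface.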


The module uses the Garcia-Parker entrainment relation to calculate near-bed concentration, and uses the Ferguson-Church relation for settling velocity.
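To make the two profiles concrete, here is a minimal Python sketch. It is not the module's source code: the function names, the example grain size, shear velocity, and flow depth are all hypothetical, the simplified model is assumed to be the power-law form, and the reference concentration is simply set to 1 rather than computed from the Garcia-Parker relation. The settling velocity uses the Ferguson-Church form with the constants usually quoted for natural sand.

```python
import math


def ferguson_church_ws(D, R=1.65, g=9.81, nu=1.0e-6, C1=18.0, C2=1.0):
    """Settling velocity (m/s) for grain diameter D (m).

    Ferguson-Church form; C1 = 18 and C2 = 1.0 are assumed values
    (those usually quoted for natural sand).
    """
    return (R * g * D**2) / (C1 * nu + math.sqrt(0.75 * C2 * R * g * D**3))


def rouse_number(ws, ustar, alpha=1.0, kappa=0.41):
    """Rouse number from settling velocity and shear velocity."""
    return ws / (alpha * kappa * ustar)


def simplified_profile(z, cb, b, ZR):
    """Assumed simplified model: power-law decay above the reference height b."""
    return cb * (z / b) ** (-ZR)


def full_profile(z, cb, b, H, ZR):
    """Full Rouse profile; concentration goes to zero at the surface z = H."""
    upper = max(H - z, 0.0)  # guard against z slightly above H
    return cb * ((upper / z) / ((H - b) / b)) ** ZR


# Hypothetical example values (not taken from the module)
D = 100e-6     # grain size, m
ustar = 0.05   # shear velocity, m/s
H = 5.0        # flow depth, m
b = 0.25       # reference height, m
cb = 1.0       # reference concentration, arbitrary units

ws = ferguson_church_ws(D)
ZR = rouse_number(ws, ustar)

n = 10
for i in range(n + 1):
    z = b + (H - b) * i / n
    print(f"z = {z:5.2f} m  simplified = {simplified_profile(z, cb, b, ZR):.3f}"
          f"  full = {full_profile(z, cb, b, H, ZR):.3f}")
```

Increasing `D` or decreasing `ustar` raises the Rouse number, and both profiles then drop off faster with height.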

The Graduate Interdisciplinary Earth Science Symposia: a year in review, and looking to the future

Below is an article I wrote for our department newsletter about the GIESS symposia. I’m publishing it here because it didn’t make the cut for the newsletter in this cycle, and it’s kind of a time-sensitive article because it justifies moving forward with GIESS in the current format.

There is a short description of what GIESS is in the third paragraph.


The Graduate Interdisciplinary Earth Science Symposia:
a year in review, and looking to the future

Andrew J. Moodie

Scientists are typically effective written communicators, since professional success in academia is so closely linked with funded grant proposals and published manuscripts. However, oral communication skills are frequently considered subordinate and not consciously developed and practiced by early-career scientists.

In the summer of 2017, a group of EEPS graduate students addressed this training gap and the shortcomings of a weekly departmental seminar series by launching the Graduate Interdisciplinary Earth Science Symposia, or GIESS.

GIESS (pronounced “geese”) provides a forum for oral communication skill development while simultaneously encouraging department-wide participation in advancing the presenter’s science. The GIESS committee’s plan was to decrease the burden by moving from a weekly to a monthly meeting, to limit each seminar to two speakers, and to intentionally add prestige to presenting, thereby increasing the quality of the talks. Furthermore, the plan introduced “pop-up talks”, brief presentations allowing no more than two slides for no more than two minutes, as a lower-stakes opportunity to practice oral communication. Finally, lunch would be provided for participants to bolster attendance, and year-end awards would be given to select speakers and participants.

To assess the inaugural year of the GIESS, department members were invited to provide feedback through an online survey at the end of the Spring semester. The survey garnered 22 responses, predominantly from graduate students (19 of 22, 86%).

Figure 1. a) Respondents overwhelmingly agreed that the GIESS was an improvement over the old seminar format. b) A paired t-test determines at above the 99% confidence level that the perceived quality of talks was improved by the GIESS format (p = 0.0001), based on respondents’ declared average quality of talks in the old seminar format and in the GIESS format on an integer scale from 1–10. c) Respondents agreed that the pop-ups were a good addition to the GIESS. d) Unsurprisingly, respondents were very happy with lunch. It should be noted that the survey designer (that’s me) has no training in survey design and the sample size is small, so the results presented herein should not be expected to withstand rigorous methodological or statistical scrutiny.


Overwhelmingly, participants agreed that the GIESS was an improvement over the old Friday Seminar format (Figure 1a, 68%), suggesting that the committee’s primary objective to improve the program was achieved. More importantly, the committee met the goal to improve the quality of the talks. A paired t-test determines at above the 99% confidence level that the perceived quality of talks was improved by the GIESS format (Figure 1b, p = 0.0001), based on respondents’ declared average quality of talks in the old seminar format and in the GIESS format on an integer scale from 1–10. The mean score given to talk quality improved from 4.9 to 7.6 between the old format and the GIESS format, respectively. The committee intends to have the presentation format in the GIESS remain largely the same in the 2018–19 year, possibly tweaking the duration of each talk based on respondent feedback.

Pop-up talks were a new addition, so a close look at participant opinions is warranted. Pop-up talks were scheduled between the two main presentations of the meeting, with typically three to five people participating. The addition of a short presentation style was decidedly favored (Figure 1c), with more than 80% of respondents agreeing that pop-ups were a positive addition. Three respondents (15%) felt strongly that pop-up talks were a bad idea, though none clarified their position in an open-ended response. Given the broad support, the committee will keep pop-ups as a permanent feature of the GIESS. The committee also surveyed respondents about allotting time for audience questions of pop-up presenters. Seven respondents stated there should be no questions, seven were neutral on the issue, and seven thought it would be a good idea; in the 2018–19 year, the committee will explore allowing questions directed at pop-up presenters.

Unsurprisingly, when more than 80% of survey respondents are hand-to-mouth graduate students, offering a free lunch to participants was favored; 20 out of 21 survey respondents agreed that providing lunch was a positive (Figure 1d). Attendance at the GIESS was vastly improved from the old seminar format; however, the number of students and faculty in attendance faltered late into the year. The committee tentatively attributes the increased attendance to lunch, though hopefully participants also came for the programming. In specific lunch-related feedback, one student asked for a “dessert table” next year, and more than one respondent added that they would like to see “platters of Chick-fil-A spicy chicken sandwiches with tangy Polynesian sauce.” Noted.

Finally, the GIESS committee would like to recognize the award winners who gave outstanding presentations and creative pop-ups, and who engaged throughout the symposia. Best Talk awards for the year go to Brandee Carlson and David Blank, whose research presentations are respectively titled “Tie channels on deltas: A case study from the Huanghe (Yellow River) delta, China” and “Discrete element method as a tool for simulating megathrust earthquakes: insights into stress transfer”. Chenliang Wu and Cailey Condit received the Best Pop-up awards for presentations that were both fun and informative. The Best Participant awards were given to two first-year students, Michael Lara and Patrick Phelps, because of their active engagement as participants in all aspects of the GIESS.

I would like to directly thank the committee members who helped make this inaugural year of the GIESS an outstanding success: Laura B. Carter, James Eguchi, Sahand Hajimirza, Harsh Vora, and Daniel Woodworth. Let’s make it even better next year.


Predicting equilibrium channel geometry with a neural network

In an attempt to learn more about ML, I decided to just jump in and try a project: predicting channel geometry with a simple neural network.

[source code]

All in all, I’m fairly sure I didn’t learn anything about equilibrium channel geometries, but I had some fun and learned an awful lot about machine learning and neural networks. A lot of the concepts I have been reading about for weeks finally clicked when I actually started to implement the simple network.

I decided to use the data from Li et al., 2015 [paper link here], which contains data for 231 river geometries.

The dataset includes the variables bankfull discharge, width, depth, channel slope, and bed-material D50 grain size.

We want to be able to predict the width, depth, and slope from the discharge and grain size alone. This is typically a problem, because we are trying to map two input features into three output features. In this case though, the model works because the outputs H (depth) and B (width) are highly correlated.

The network is a simple ANN with one hidden layer of 3 nodes, trained with stochastic gradient descent. The training curve is below.
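For anyone who wants to see the moving parts, here is a minimal sketch of a network with that shape (2 inputs, one hidden layer of 3 nodes, 3 outputs) trained by stochastic gradient descent, in plain Python. It is not the project's source code: the tanh activation, learning rate, number of epochs, and the synthetic "data" (made-up linear relations in scaled log space, not the Li et al. dataset) are all assumptions chosen only so the example runs on its own.

```python
import math
import random

random.seed(0)

N_IN, N_HID, N_OUT = 2, 3, 3  # discharge + grain size -> width, depth, slope


def init(n_rows, n_cols):
    """Small random initial weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(n_cols)] for _ in range(n_rows)]


W1, b1 = init(N_HID, N_IN), [0.0] * N_HID
W2, b2 = init(N_OUT, N_HID), [0.0] * N_OUT


def forward(x):
    """Hidden layer uses tanh (an assumed choice); output layer is linear."""
    h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j]) for j in range(N_HID)]
    y = [sum(w * hj for w, hj in zip(W2[k], h)) + b2[k] for k in range(N_OUT)]
    return h, y


def sgd_step(x, t, lr):
    """One stochastic gradient descent update on a single sample."""
    h, y = forward(x)
    dy = [y[k] - t[k] for k in range(N_OUT)]  # gradient of 0.5 * squared error
    # Backpropagate to the hidden layer before W2 is modified
    dh = [sum(dy[k] * W2[k][j] for k in range(N_OUT)) * (1.0 - h[j] ** 2)
          for j in range(N_HID)]
    for k in range(N_OUT):
        for j in range(N_HID):
            W2[k][j] -= lr * dy[k] * h[j]
        b2[k] -= lr * dy[k]
    for j in range(N_HID):
        for i in range(N_IN):
            W1[j][i] -= lr * dh[j] * x[i]
        b1[j] -= lr * dh[j]


def mse(data):
    """Mean squared error over all samples and outputs."""
    total = 0.0
    for x, t in data:
        _, y = forward(x)
        total += sum((yk - tk) ** 2 for yk, tk in zip(y, t))
    return total / (len(data) * N_OUT)


def make_sample():
    """Synthetic sample with made-up coefficients (hypothetical, not real rivers)."""
    q = random.uniform(-1.0, 1.0)  # scaled log discharge
    d = random.uniform(-1.0, 1.0)  # scaled log grain size
    width = 0.5 * q
    depth = 0.4 * q - 0.1 * d      # deliberately correlated with width
    slope = -0.3 * q + 0.3 * d
    return [q, d], [width, depth, slope]


data = [make_sample() for _ in range(200)]
loss_before = mse(data)
for epoch in range(50):
    random.shuffle(data)           # "stochastic": visit samples in random order
    for x, t in data:
        sgd_step(x, t, lr=0.05)
loss_after = mse(data)
print(f"MSE before training: {loss_before:.4f}, after: {loss_after:.4f}")
```

Recording `mse(data)` at the end of each epoch and plotting it against the epoch number gives a training curve of the kind mentioned above.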

Matlab speed comparison of switch-case and if-then statements and hard-code

I frequently use the extremely methodical approach in scientific programming of “just trying things”. This means that I create a lot of different ways to try to do something to find the best way to do that thing. Sometimes this means getting a realistic result, a stable result, or a fast computation.

Nearly all programming languages offer if-then statements, whereby sections of code are executed when a statement evaluates to True. An if-then statement can be extended with an elseif to test further conditions. Using this framework, different “cases” or “methods” of doing a single thing can be quickly tested by switching between them in a script. The purpose of all this is to make the code more readable and maintainable, by not having to comment/uncomment blocks of code to change the calculation method.

For example, two methods x = 'test1' | 'test2' could be executed by a function, depending on the definition of x:

if strcmp(x, 'test1')
    y = 1;
elseif strcmp(x, 'test2')
    y = 2;
end

A similar functionality can be obtained with a switch-case statement:

switch x
    case 'test1'
        y = 1;
    case 'test2'
        y = 2;
end

But which is faster?? I suppose I always knew switch-case statements were slower than if-then statements, but I wanted to know how much of a difference it could make. I also often have >4 of these case statements as optional methods, and I wanted to know if the number of cases (or really how “deep” down the list they were) made a difference in speed.

I designed a simple experiment in Matlab to test this out. I looped through a simple set of statements 1 million times, and timed each scenario. You can find the source code here.

It turns out that switch-cases are about 30% slower than if-then statements. Both are more than an order of magnitude slower than simply hard-coding the value.

Most importantly though, the time increases linearly with each “layer” added to the if-then or switch-case statement. To me, this stressed the importance of 1) having few cases that aren’t simply hard-coded in, or 2) at least sorting the cases in order of likelihood of being used during program execution.

Markov Chain stratigraphic model

I recently returned from the NCED2 Summer Institute for Earth Surface Dynamics at Saint Anthony Falls Laboratory at University of Minnesota, which is a 10-day long, intense workshop for early-career scientists to talk about novel methods and ideas in our field. This was the ninth, and hopefully not last, year of the meeting.

Typically the participants will do some sort of physical experiment (kind of what the lab is known for being the best in the world at), but this year’s theme was about mathematics in Earth surface processes. We covered a range of subjects, but a lot of discussion came back to random walks, diffusion-like processes/simulations, and probability. Kelly Sanks and I worked on a project together, which combined a lot of these ideas.

Our question was: is deltaic sedimentation, and the resulting stratigraphy, random? Specifically, we hypothesized that a pure “random walk” model cannot capture the effect of the “stratigraphic filter”. However, a biased random walk model, where Δz depends on the previous time’s Δz, can capture the dynamics of the system sufficiently to look like an actual stratigraphic sequence. The null hypothesis then is that both random walk models are indistinguishable from a stratigraphic succession.

To attack the problem, we decided to use a simple biased random walk model: Markov chains. These models have no physics incorporated, only probability, which made the project simple and easy to complete over beers at night. The Markov chain can be summarized as a sequence of “states” where the next state is chosen at random based on some probability to change to another given state or stay at the same state. Repeating this over and over gives a chain/sequence of states. In our simulation, the “states” were changes in elevation. Said another way, the next timestep’s change in elevation is dependent on the current timestep’s change in elevation. This hypothesis is grounded in the physics of river-deltas, whereby channel location (and thus locus of deposition/erosion elevation change) does not change immediately and at random, but rather is somewhat gradual.

When a system has more than a few states it becomes complicated to keep track of the probabilities, so we naturally use a matrix to define the probabilities. The so-called “transition matrix” records the probability that, given a current state (rows), the next state will be any given state (columns).
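As a rough illustration of the idea (not our project code), a transition matrix can be built by counting how often each state follows each other state and normalizing every row, and a chain can then be simulated by repeatedly drawing the next state from the current state's row. In this sketch the function names, the three coarse states, and the short observed sequence are all hypothetical; the real analysis used many finely binned Δz values.

```python
import random


def transition_matrix(states, n_states):
    """Row-normalized counts of transitions from states[t] to states[t + 1]."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states[:-1], states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # State never observed as a starting point: assume a uniform row
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([count / total for count in row])
    return matrix


def simulate_chain(matrix, start, n_steps, rng):
    """Draw each next state using the row of the current state as weights."""
    chain = [start]
    for _ in range(n_steps):
        row = matrix[chain[-1]]
        chain.append(rng.choices(range(len(matrix)), weights=row)[0])
    return chain


# Hypothetical binned dz sequence: 0 = erosion, 1 = no change, 2 = deposition
observed = [1, 2, 2, 1, 0, 1, 2, 2, 2, 1, 1, 0, 0, 1, 2]
P = transition_matrix(observed, n_states=3)
for row in P:
    print([round(p, 2) for p in row])

rng = random.Random(42)
chain = simulate_chain(P, start=1, n_steps=20, rng=rng)
print(chain)
```

A purely random (uniform) model corresponds to a matrix in which every entry is the same, so the next state does not depend on the current one at all.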

We used the Tulane Delta Basin experiment 12-1, which has a constant Qw/Qs ratio and RSLR rate, for our data. This is kind of necessary for our question because we needed to know what the elevation was at some regular interval, and what the resulting stratigraphy after deposition and erosion was. The experimental delta surface was measured to mm accuracy every hour for 900 hours. We calculated Δz values over the entire experiment spatiotemporal domain, to inform our main biased random walk model.

Markov transition matrix of dz values calculated from the experimental elevation profiles. (axes are in mm/hr)

The main data-trained transition matrix shows that states will tend towards slightly aggradational. This makes sense since this system is deterministically net aggradational due to an RSLR forcing. We compare this data-trained model with two synthetic transition matrices: a Gaussian distribution (intentionally similar to the data) and a uniform distribution (i.e., purely random).






We then simulated the elevation change sequences predicted by each of the transition matrices and used the stratigraphic filter concept to determine the resultant stratigraphy. We did this a bunch of times, and totally chose one of the more “convincing” results (i.e., where the slope difference between simulations was larger).
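The filtering step can be sketched in a few lines of Python. This assumes the usual reading of the stratigraphic filter, in which any later erosion removes earlier deposits, so the preserved surface at each time is capped by the lowest elevation reached afterwards. The function name and the Δz values are hypothetical, and this is not the project's own code.

```python
from itertools import accumulate


def stratigraphic_filter(elevation):
    """Cap each elevation by the lowest elevation reached at any later time."""
    preserved = list(elevation)
    # Sweep backward so each step only needs to compare with its successor
    for i in range(len(preserved) - 2, -1, -1):
        preserved[i] = min(preserved[i], preserved[i + 1])
    return preserved


# Hypothetical elevation-change sequence (e.g. mm per timestep)
dz = [0, 2, -1, 2, -1]
elevation = list(accumulate(dz))             # surface elevation through time
preserved = stratigraphic_filter(elevation)  # what survives in the record

# Thickness preserved from each timestep (zero where deposits were eroded)
thickness = [b - a for a, b in zip(preserved[:-1], preserved[1:])]

print("elevation:", elevation)
print("preserved:", preserved)
print("thickness:", thickness)
```

Timesteps with zero preserved thickness are the gaps in the record; counting and measuring them is one way to compare a simulated sequence with the experimental stratigraphy.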

a SINGLE realization of the modeled Markov sequences.

We found that the data-trained model reproduced the stratigraphic sequences from the experiment (based only on comparing slopes…). The other models were not too far off, suggesting that deltaic sedimentation is in fact not too far off from purely random.

Ultimately, we would simulate the models hundreds of times and make ensembles of the results to interpret. We also would use more meaningful statistics of the stratigraphic sequences to compare model and data, such as # of “channel cuts”, unconformity-bed thicknesses, “drowned” sequences, etc.

The main source code for all this lives on github here, if you are interested in checking anything out or exploring some more. You will need to obtain the data from the SEN website listed above, or ask me to send you a copy of the data formatted into a `.mat` file (it’s too big to put on here…).
If you format the data yourself, produce a 3D matrix whose three dimensions (x-y-z) are time, across, and along.

Authors of the code and project are Kelly Sanks (@kmsanks) and Andrew J. Moodie (@amoodie).

Rivers 2 Stratigraphy

Explore the construction of stratigraphy through the interaction of channel geometry, lateral migration rate, subsidence rate, and avulsion frequency — interactively!

Imagine there is a river in a subsiding basin. As the river laterally migrates, it leaves behind sandy bar deposits, which are encapsulated in a matrix of floodplain muds.

As the river deposits are lowered away from the basin surface by subsidence, they become part of the stratigraphic record (Figure below). Depending on the amount of lateral migration, frequency of avulsion, and subsidence rate, the degree of overlap of channel bodies in the stratigraphic deposit will vary. The movie above shows a vertical slice across the basin width, and a channel (the colorful boxes) laterally migrating across a river floodplain.

The movie comes from an educational module I am working on called Rivers2stratigraphy. The module is open source and relies only on Python dependencies.

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1450681. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

A LaTeX package for peer review

I recently needed to do a peer review for a co-author’s manuscript. I rarely use MS Word or LO Writer anymore except for really simple documents (e.g., a quick abstract I will have to copy and paste into a browser form anyway), and instead use LaTeX for nearly all of my writing needs. So naturally, I needed a way to do my peer review in LaTeX.

There were no packages I could find that offered the functionality I was looking for: a simple environment to place comments, numbering of comments, and a place to organize line numbers.

Enter my custom package peer_review: a LaTeX package for commenting and responding throughout the peer review process (hosted on GitHub). This package offers a few simple environments and commands that make doing a peer review pretty painless, with LaTeX-quality beauty, naturally!

This is a demo of the package, and the demo file is included in the repository. This demo uses a class file from my collaborator Eric Barefoot called compact_proposal (also hosted on GitHub), but the package can be used on top of the base article class too.

If others end up picking up this package, or if I really get it refined, I may try to release it on CTAN. But for now, it can be downloaded from GitHub as a zip or cloned with:

git clone

There are instructions for installing in the project README, and a man page with instructions on using the package, complete with examples.

Python 3 silly random name generator

I was recently working on a project to try to annoy my collaborator Eric with a scheduled script that sends him an email if there are open pull requests that I authored in a GitHub repository he owns (pull-request-bot).

Along the way, I created this fun little function for coming up with random sort-of-realistic-but mostly-funny-sounding names in Python. I’m in the process of switching from Matlab to Python, so little toy functions and projects like this are a good way for me to learn the nuances of Python faster. In fact, I would argue this is the best way to learn a new programming language–no need for books or classes, just try things!

The script is super simple and I have posted it as a GitHub Gist so I won’t describe it all again here. Below the markdown description of the function is an actual Python file you can grab, or grab it from here.
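If you just want the flavor of the idea without opening the Gist, here is a hypothetical sketch of a similar function. It is not the code from the Gist: the syllable lists, function names, and name lengths are all made up for illustration.

```python
import random

# Made-up syllable pieces (hypothetical; not taken from the Gist)
ONSETS = ["b", "d", "f", "gr", "k", "m", "pl", "sn", "t", "w", "z"]
VOWELS = ["a", "e", "i", "o", "u", "oo", "ee"]
CODAS = ["", "b", "ck", "dle", "g", "m", "n", "rt", "x"]


def silly_word(rng, n_syllables):
    """Glue together random onset + vowel + coda syllables."""
    syllables = (
        rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)
        for _ in range(n_syllables)
    )
    return "".join(syllables).capitalize()


def silly_name(rng=None):
    """Return a 'First Last' style name built from random syllables."""
    if rng is None:
        rng = random.Random()
    first = silly_word(rng, rng.randint(1, 2))
    last = silly_word(rng, rng.randint(2, 3))
    return f"{first} {last}"


if __name__ == "__main__":
    for _ in range(5):
        print(silly_name())
```

Passing a seeded `random.Random` instance makes the names repeatable, which is handy for testing.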