Fetching USGS Input Files

Required Input - Data Files

LoadRunner can do nothing without:
  1. Water quality data - the concentration of each element dissolved in the streamflow, and
  2. Flow data - how much water the stream carried that day.

The current version of LoadRunner assumes you've already fetched this data, for all the sites you're interested in, from the USGS Water Quality and Surface-Water websites.

The basic strategy is:

  1. Make a site list.
  2. Download the quality data.
  3. Download the flow data.

Fair warning: unless you're retrieving only a handful of sites, do not attempt this on a dial-up Internet connection.

The rest of this file walks through those steps.

Step 1: Make a USGS Site Number File

Because you need to fetch the data from two websites, you should construct a "File of Site Numbers" first. If you are fetching a lot of sites (for instance, a whole "hydrologic region"), I recommend splitting the site list itself into several batches. For instance, the South Atlantic Gulf HR is so large that I split it into 9 site lists, each about 20 KB long. Then I fetched the USGS data and ran LoadRunner one batch at a time.
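If you'd rather not split a big site list by hand, here is a minimal sketch of the batching idea in Python. It is not part of LoadRunner. It assumes the site list is plain text with one site number per line and no header block; if your file has header or comment lines, each batch would need them repeated. The function names and the 20,000-byte default are my own choices for illustration.

```python
from pathlib import Path


def split_site_list(lines, max_bytes=20_000):
    """Group lines into batches whose total size stays at or under max_bytes."""
    batches, current, size = [], [], 0
    for line in lines:
        line_size = len(line.encode("utf-8")) + 1  # +1 for the newline
        # Start a new batch when adding this line would exceed the limit.
        if current and size + line_size > max_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        batches.append(current)
    return batches


def write_batches(source="sitelist.txt", max_bytes=20_000):
    """Write sitelist_1.txt, sitelist_2.txt, ... next to the source file."""
    source = Path(source)
    lines = source.read_text().splitlines()
    paths = []
    for i, batch in enumerate(split_site_list(lines, max_bytes), start=1):
        path = source.with_name(f"{source.stem}_{i}.txt")
        path.write_text("\n".join(batch) + "\n")
        paths.append(path)
    return paths
```

Calling write_batches("sitelist.txt") would leave the original file alone and write the numbered batch files beside it, ready to upload one at a time.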

So, how do you fetch a site list?

On the first page of the USGS Water Quality site dialogue, you're presented with "Site Selection Criteria". You may have all sorts of criteria for what sites you want. But also, always select:

  • Number of observations

So you will have at least 2 checkboxes checked. In the picture, I chose "Site Name" and "Number of Observations" as my criteria.

Then I clicked Submit.

I filled in my two criteria (Number of observations 12, sites "Connecticut"), and told it to Save to file, then clicked Submit.

Note that LoadRunner needs a minimum of 12 quality data points to run, so there's no point in downloading sites that have fewer than 12 observations. Even so, some sites on the list won't have the right 12 observations to answer your load question, so there will still be sites in this list that cannot run.

The usual web browser file download dialog asks where to put the file. I browsed to a directory for my project, and replaced the USGS suggested file name of "qwdata" with "sitelist.txt".

Step 2: Download the Quality Data

Once you have the right group of sites, you use that file of site numbers to download both the quality and flow data.

Go back to the original USGS Water Quality page. This time, there's only one site selection criterion: File of Site Numbers. And Submit.

This brings us to a variant of the second page, where the top section asks us for the File of site numbers. I click the Browse button to give it the file I saved in Step 1. At the bottom, I click to request Tab-separated data. Everything else on the page I leave untouched, and go with their defaults.

LoadRunner tolerates most choices of file format, should you wish to alter other options on the Tab-separated data line. This may change (has changed...) as the USGS chooses to alter their file formats...

And Submit.

And wait. For my 3 KB file of site numbers for "Connecticut", this didn't take very long. I like to keep individual sitelist files down to 20 KB or smaller.

Eventually the browser file save dialog came up, and I named this one "qwdata.txt", right next to my "sitelist.txt". Then I waited again while the file downloaded: 7 MB of data.
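Once "qwdata.txt" is on disk, you can get a rough idea of which sites fall short of the 12 data points LoadRunner needs. The sketch below is not how LoadRunner itself checks; it simply counts data rows per site. It assumes the usual USGS tab-separated layout (comment lines starting with "#", a header row of column names, then a column-format row), a column named "site_no", and that one row is roughly one observation. All of those are assumptions, and the function names are mine.

```python
import csv
from collections import Counter


def count_rows_per_site(lines, site_column="site_no"):
    """Count data rows per site in USGS-style tab-separated text."""
    # Assumption: comment lines start with '#'; the first remaining line
    # is the header row that names the columns.
    data = (line for line in lines if line.strip() and not line.startswith("#"))
    reader = csv.DictReader(data, delimiter="\t")
    counts = Counter()
    for row in reader:
        site = (row.get(site_column) or "").strip()
        # Assumption: real site numbers are all digits. This skips the
        # column-format row (e.g. "15s") and any repeated header rows.
        if site.isdigit():
            counts[site] += 1
    return counts


def sites_below_minimum(counts, minimum=12):
    """Return the sites that have fewer than `minimum` rows, sorted."""
    return sorted(site for site, n in counts.items() if n < minimum)


# Example use, once the download has finished:
#   with open("qwdata.txt", newline="") as f:
#       counts = count_rows_per_site(f)
#   print(sites_below_minimum(counts))
```

A site that passes this count can still fail to run, since it may not have the right 12 observations for your load question.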

Before we leave the USGS Water Quality website, notice the links under the Submit button. You can use these to discuss problems with the USGS.

Step 3: Download the Flow Data

Now we move on to the USGS Surface-Water site. The first screen is site selection criteria.

Here, we select File of Site Numbers and Site type (the latter is pre-selected). Site type in this context means stream flow, well water, ground water, etc. We don't want "Number of observations" because we've already chosen our sites. Submit.

The next page is huge, and our needs very specific. We need:

  • File of Site Numbers
  • Site type of Stream/River data only
  • Streamflow, ft³/s
  • for a date range
  • Tab-separated data saved to file

Don't leave the date range blank. I copy the dates in parentheses (1855-04-01 through 2007-05-08) into the first and last date fields. Taking the blank defaults yields streamflow so far this year, which is of no use for LoadRunner. Alternatively, you might have specific years of interest. But taking all possible dates works for me.

And Submit.

This immediately came back with a browser file save dialog, and I specified a filename of "dv.txt", next to my "sitelist.txt" and "qwdata.txt" files. Then I waited for the download to finish. For my example, this file was about 8 MB.
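With all three files saved, a quick cross-check can tell you which sites came back with no quality rows or no flow rows. This is only a sketch under assumptions: both downloads are tab-separated with "#" comment lines and a header row containing a "site_no" column, and you supply the wanted site numbers yourself (for example, read from your site list). The function names are hypothetical and nothing here is part of LoadRunner.

```python
def site_numbers(lines, column="site_no"):
    """Collect the distinct site numbers found in USGS-style tab-separated text."""
    sites, index = set(), None
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\r\n").split("\t")
        if index is None:
            # First non-comment line is assumed to be the header row.
            index = fields.index(column)  # ValueError if the column is missing
            continue
        # Assumption: real site numbers are all digits, so format rows
        # and repeated header rows are ignored.
        if index < len(fields) and fields[index].strip().isdigit():
            sites.add(fields[index].strip())
    return sites


def sites_missing_data(wanted, quality_lines, flow_lines):
    """Report which wanted sites are absent from each download."""
    wanted = set(wanted)
    return {
        "no_quality": sorted(wanted - site_numbers(quality_lines)),
        "no_flow": sorted(wanted - site_numbers(flow_lines)),
    }


# Example use:
#   with open("qwdata.txt") as q, open("dv.txt") as d:
#       report = sites_missing_data(my_site_numbers, q, d)
```

Sites that show up in either list have nothing for LoadRunner to pair up, so they are the first place to look if a run produces less output than expected.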

I have input data. I can run LoadRunner.

In this picture, I told LoadRunner where my "qwdata.txt" and "dv.txt" files are. I could go ahead and click Run, but I probably need to adjust the options first. The options are discussed separately.

Note: Sometimes the USGS changes their output data format. If it changes too much, LoadRunner will stop working. If you believe this has happened, you might need to contact Ginger and Pete (email links below) to build another version of LoadRunner.

Ginger Booth for Peter Raymond, September, 2007