Wednesday, May 18, 2011

Python Script To Update Current Stream Gauge Data In New AGRC Flood Map

We use a Python script to scrape data from the USGS and NWS websites and update our data in the SGID. It runs every two hours as a Windows Scheduled Task. The script’s workflow is as follows:

First it loops through all of the features in our stream gauges feature class (SGID93.WATER.StreamGaugesNHD).

For each feature, it uses a USGS ID (SourceFeature_ID) to build a URL for the USGS Instantaneous Values web service:

# get json object
data = json.loads(urllib2.urlopen(r'' + id).read())

This is an example of one of the URLs:

The script then uses the json library to parse the response and pull out the values that we are interested in. These values are used to populate the appropriate fields in our feature class.

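def getJsonValue(variableCode, data):
    # return the first reading whose valueType matches variableCode
    for ts in data['value']['timeSeries']:
        if ts['variable']['valueType'] == variableCode:
            return ts['values'][0]['value'][0]['value']
    # no matching time series in the response
    return None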
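
To make that lookup concrete, here is a minimal sketch that runs the same kind of search against a hand-made payload. Everything in the sample is an assumption for illustration: the JSON is heavily trimmed and only mirrors the nesting the function walks (value, timeSeries, variable, values), the codes 'discharge' and 'gage_height' are placeholders rather than real USGS values, and find_series_value is a hypothetical stand-in name, not part of our script.

```python
import json

# Hypothetical, trimmed-down payload. A real response from the web service
# carries many more keys; only the nesting used by the lookup is kept here.
SAMPLE = '''
{"value": {"timeSeries": [
  {"variable": {"valueType": "discharge"},
   "values": [{"value": [{"value": "1250"}]}]},
  {"variable": {"valueType": "gage_height"},
   "values": [{"value": [{"value": "4.31"}]}]}
]}}
'''

def find_series_value(variable_code, data):
    """Return the first reading whose valueType matches, or None."""
    for ts in data['value']['timeSeries']:
        if ts['variable']['valueType'] == variable_code:
            # first entry of the first values block for this series
            return ts['values'][0]['value'][0]['value']
    return None

data = json.loads(SAMPLE)
print(find_series_value('gage_height', data))
print(find_series_value('water_temp', data))
```

Returning None for a missing series lets the caller decide whether to skip the field or write a null, instead of failing partway through the update.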

The NOAA data is served up via an RSS feed, which means XML. The minidom module from the xml.dom package came in handy here for parsing it.

# get noaa data
gaugeID = row.getValue('GuageID')
if gaugeID:
    ndata = minidom.parse(urllib2.urlopen('' + gaugeID.lower() + '.rss'))
    descriptionText = ndata.getElementsByTagName('description')[2].firstChild.nodeValue
    descriptionList = descriptionText.split('<br />')
    row.setValue('HIGHEST_FORECAST', descriptionList[5].split()[2].strip())
    row.setValue('HIGHEST_FORECAST_DATE', getNOAADate(descriptionList[6].split('Time:')[1].strip()))
    row.setValue('LAST_FORECAST', descriptionList[8].split()[2].strip())
    row.setValue('LAST_FORECAST_DATE', getNOAADate(descriptionList[9].split('Time:')[1].strip()))

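
The split-and-index approach is easier to follow on a small sample. The sketch below is only an illustration under stated assumptions: the feed text is invented and much shorter than the real one, so the element index and the line positions differ from the ones our script uses, and parse_description is a hypothetical helper name.

```python
from xml.dom import minidom

# Hypothetical feed, made up for this example. The real feed has more
# elements and more lines inside the description text.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example gauge</title>
<description>Channel description</description>
<item>
<title>Forecast</title>
<description>Highest Forecast: 5.2 ft&lt;br /&gt;Highest Forecast Time: May 19 2011 6:00 AM&lt;br /&gt;Last Forecast: 4.8 ft&lt;br /&gt;Last Forecast Time: May 21 2011 6:00 AM</description>
</item>
</channel></rss>"""

def parse_description(xml_text, index=1):
    """Split the index-th <description> element on its '<br />' markers."""
    doc = minidom.parseString(xml_text)
    node = doc.getElementsByTagName('description')[index]
    # the parser unescapes &lt;br /&gt; to a literal '<br />' in the text
    text = node.firstChild.nodeValue
    return [part.strip() for part in text.split('<br />')]

parts = parse_description(SAMPLE_RSS)
highest = parts[0].split()[2]                      # third token on the line
highest_time = parts[1].split('Time:')[1].strip()  # text after 'Time:'
print(highest)
print(highest_time)
```

Because everything depends on fixed positions, a change in the feed layout will silently shift the values, so it is worth checking the length of the split list before indexing into it.
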
So in the end we have one feature class that combines real-time data from multiple sources. You can check out a copy of the script here.