Adding Context

Before you submit your data to Geoconnex, make sure it is well formatted so that it is easier to parse and yields better insights. Good formatting also makes the data more maintainable and expressive for your own team.

Adding contextual information

To be most useful to the wider water data community, locations should have both descriptive and contextual information in the data published to geoconnex.us. Some useful descriptive information could include:

  1. identifier
  2. location geometry (point or polygon latitude/longitude, preferably in WGS84)
  3. short name
  4. long name or description
  5. organization
  6. URLs where observed or modeled data about the location can be accessed. These are of particular interest where available.

Contextual information could include:

  1. administrative geographies it is within (e.g. census tract, municipality, county, state, PLSS section)
  2. watershed boundary it is within (e.g. HUC12)
  3. for groundwater sites, relevant aquifers
  4. a relevant reference location. Many organizations publish data about the same feature, such as a common monitoring location that serves as both a streamgage and a water quality sampling site while also being fixed to a dam or bridge.
  5. for surface water sites, the hydrologic address on the National Hydrography Dataset stream network

Using reference cataloging features to add contextual information

Wherever possible, contextual data should be in the form of persistent identifiers (PIDs) for these features. For example, counties are often given as a name, but spelling errors, differences in capitalization or abbreviation, and other ambiguities can create barriers to interoperability between datasets that reference counties. In addition, these PIDs are already members of the knowledge graph, which makes adding your data to the knowledge graph simpler and more meaningful. Some sources of PIDs for these contextual features are provided at reference.geoconnex.us/collections. Some common patterns include:

Since many organizations publish data about the same feature, it is useful for these organizations to link their relevant data to a common identifier for that feature. The geoconnex project currently maintains two sets of reference location identifiers:

  • Reference gages for all surface stream monitoring locations (whether streamgages in the traditional sense or any water sampling site). These take the form https://geoconnex.us/ref/gages/{7-digit-integer}, e.g. https://geoconnex.us/ref/gages/1000001
  • Reference dams for all artificial dams impounding water bodies. These take the form https://geoconnex.us/ref/dams/{7-digit-integer}, e.g. https://geoconnex.us/ref/dams/1000001

Note that these identifiers follow somewhat arbitrary schemes that are maintained independently of the identifiers used in common national "authoritative" datasets such as USGS Gages II or the USACE National Inventory of Dams. This independence accommodates features that are not (yet) included in those datasets and keeps identifiers persistent when those systems change the identifier for a given real-world feature.
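The URI forms listed above can be checked before publishing. The following is a minimal sketch, not part of any Geoconnex tooling: the function name `parse_reference_uri` is hypothetical, and the 7-digit rule is taken only from the templates stated above, so it may need relaxing if the identifier scheme changes.

```python
import re

# Assumption: the pattern mirrors the templates in the text,
# https://geoconnex.us/ref/{gages|dams}/{7-digit-integer}.
REFERENCE_URI = re.compile(r"https://geoconnex\.us/ref/(gages|dams)/(\d{7})")


def parse_reference_uri(uri):
    """Return (collection, integer id) for a reference gage or dam URI.

    Returns None when the value does not match the expected form, for
    example when a plain name or a truncated identifier was supplied.
    """
    match = REFERENCE_URI.fullmatch(uri.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


print(parse_reference_uri("https://geoconnex.us/ref/gages/1000001"))
print(parse_reference_uri("https://geoconnex.us/ref/dams/123"))
```

A check like this only confirms that a value has the right shape. It does not confirm that the identifier actually exists in the reference collection.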

Using NHDPlus identifiers to represent hydrologic addresses

By using persistent identifiers for NHDPlus features, you can represent your locations' positions on a specific version of NHDPlus. This eliminates ambiguity about which version of the NHD the address pertains to and reduces common errors such as omitting leading zeros in reachcodes.
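Leading zeros are typically lost when a reachcode is stored as a number in a spreadsheet or database. The sketch below restores them and builds an identifier, assuming the standard 14-digit NHD reachcode length and the https://geoconnex.us/nhdpv2/reachcode/ URI form that appears in the example table on this page. The function name `reachcode_uri` and the sample reachcode are hypothetical.

```python
REACHCODE_LENGTH = 14  # NHD reachcodes are 14-digit strings
# Assumption: base URI taken from the example table on this page.
NHDPV2_REACHCODE_BASE = "https://geoconnex.us/nhdpv2/reachcode/"


def reachcode_uri(reachcode):
    """Build an NHDPlus V2 reachcode URI, restoring any dropped leading zeros."""
    text = str(reachcode).strip()
    if not (text.isascii() and text.isdigit()) or len(text) > REACHCODE_LENGTH:
        raise ValueError(f"not a valid reachcode: {reachcode!r}")
    # zfill pads on the left with zeros up to the full reachcode length.
    return NHDPV2_REACHCODE_BASE + text.zfill(REACHCODE_LENGTH)


# A hypothetical reachcode that lost its leading zero when read as an integer.
print(reachcode_uri(1020003000123))
```

Padding is only safe when the digits were dropped from the left, which is the usual result of numeric conversion. Values that were truncated in some other way cannot be repaired like this.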

Example:

Below is an example table based on streamgages with data published at the California Data Exchange Center (CDEC). The table is also available for download as a CSV here. Note the inclusion of descriptive information, links to various reference features, and the data_url linking to the CDEC data system entrypoint for each site.

| uri | id | name | organization | data_url | latitude | longitude | reachcode_nhdpv2 | measure_nhdpv2 | mainstem_river | reference_gage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| https://geoconnex.us/ca-gage-assessment/gages/AMC | AMC | Arcade Creek at Winding Way | California Department of Water Resources | http://cdec.water.ca.gov/dynamicapp/staMeta?station_id=AMC | 38.645447 | -121.347407 | https://geoconnex.us/nhdpv2/reachcode/18020111000048 | 0 | https://geoconnex.us/ref/mainstems/5147 | https://geoconnex.us/ref/gages/1185578 |
| https://geoconnex.us/ca-gage-assessment/gages/CSW | CSW | Kings River Below Crescent Weir | California Department of Water Resources | http://cdec.water.ca.gov/dynamicapp/staMeta?station_id=CSW | 36.3863018 | -119.875615 | https://geoconnex.us/nhdpv2/reachcode/18030012009243 | 0 | https://geoconnex.us/ref/mainstems/1796720 | https://geoconnex.us/ref/gages/1185619 |
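
Before submitting a table like this one, it can help to confirm that the contextual columns hold identifiers rather than free-text names. The sketch below is illustrative only: it uses a subset of the columns with values copied from the first row of the example, and both the function name `find_non_pid_values` and the rule that a PID must start with https://geoconnex.us/ are assumptions made for this example.

```python
import csv
import io

# Subset of the example table's columns; values copied from its first row.
SAMPLE_CSV = """uri,id,name,data_url,mainstem_river,reference_gage
https://geoconnex.us/ca-gage-assessment/gages/AMC,AMC,Arcade Creek at Winding Way,http://cdec.water.ca.gov/dynamicapp/staMeta?station_id=AMC,https://geoconnex.us/ref/mainstems/5147,https://geoconnex.us/ref/gages/1185578
"""

# Columns expected to contain persistent identifiers (assumption for this sketch).
PID_COLUMNS = ("uri", "mainstem_river", "reference_gage")
PID_PREFIX = "https://geoconnex.us/"


def find_non_pid_values(csv_text):
    """Return (row id, column) pairs whose value is not a geoconnex.us URI."""
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in PID_COLUMNS:
            if not row[column].startswith(PID_PREFIX):
                problems.append((row["id"], column))
    return problems


# An empty list means every checked column holds a geoconnex.us URI.
print(find_non_pid_values(SAMPLE_CSV))
```

For a real file, the same function can be given the text read from disk. The list of columns to check would need to match whichever contextual columns your table includes.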