Liking cljdoc? Tell your friends :D

Whither Weather?

...or "Choosing A New Weather API"

Ref: Issue 740

So far, the issues I've been exploring all relate to location specification, and what options, if any, there are for search operations or other strategies to disambiguate returned results.

At a quick first look, all these APIs return the basic weather data we want. I haven't done a careful assessment of their free plans beyond a quick gut check of N requests/interval seeming reasonable enough for my best guesses on typical use.

OpenWeatherMap

OpenWeatherMap was the first API I looked at, as it was suggested by @devth in #740.

The most basic set of endpoints provided are those for Current Weather Data. The problem I see here is that the API doesn't surfaces any notion of states, provinces, counties, prefectures, etc., returning only the city name, in name, and the country code in sys.country.

For the below examples I'm using "York", a city in Pennsylvania, US, and England, UK (country code: GB). "York" is also token in many city names, like "New York", a city in NY, US; relevant later when we look at the API's search functionality.

Full /data/2.5/weather Response, For Reference

 ❯ curl -s "https://api.openweathermap.org/data/2.5/weather?APPID=$owm_key&q=york" | jq .
 {
   "coord": {
     "lon": -1.08,
     "lat": 53.96
   },
   "weather": [
     {
       "id": 800,
       "main": "Clear",
       "description": "clear sky",
       "icon": "01d"
     }
   ],
   "base": "stations",
   "main": {
     "temp": 290.66,
     "pressure": 1013,
     "humidity": 51,
     "temp_min": 289.15,
     "temp_max": 292.15
   },
   "visibility": 10000,
   "wind": {
     "speed": 5.1,
     "deg": 220
   },
   "clouds": {
     "all": 0
   },
   "dt": 1539102000,
   "sys": {
     "type": 1,
     "id": 5101,
     "message": 0.0282,
     "country": "GB",
     "sunrise": 1539066114,
     "sunset": 1539105595
   },
   "id": 2633352,
   "name": "York",
   "cod": 200
 }

Complications When Trying To Qualify Locations

We'll use a trimmed down version for API responses, for discussion going forward.

❯ curl -s "https://api.openweathermap.org/data/2.5/weather?APPID=$owm_key&q=york" | jq -c '[.id,.name,.sys.country]'
[2633352,"York","GB"]

An unqualified "york" returns the city in England, UK. The problems arises when one tries to specify the city, by state (or similar), which is a common enough habit in the US.

Let's say you're trying to get York, PA's weather. One is likely to try "york, pa", or maybe "york, pa, us". Unfortunately, as OWM doesn't consider delineations other than countries, you're out of luck here.

"york, pa" will fail:

❯ curl -s "https://api.openweathermap.org/data/2.5/weather?APPID=$owm_key&q=york,pa" | jq .
{
  "cod": "404",
  "message": "city not found"
}

...and "york, pa, us" will succeed incorrectly:

❯ curl -s "https://api.openweathermap.org/data/2.5/weather?APPID=$owm_key&q=york,pa,us" | jq -c '[.id,.name,.sys.country]'
[2633352,"York","GB"]

Attempting to work around this, via the limited search, is challenging. Note the other two "York"s in the US, and one in Australia.

 ❯ curl -s  "https://api.openweathermap.org/data/2.5/find?APPID=$owm_key&q=york&type=like" | jq -c '.list[] | [.id,.name,.sys.country]' | sort
 [2057277,"York","AU"]
 [2633352,"York","GB"]
 [4098776,"York","US"]
 [4562407,"York","US"]
 [4601703,"York","US"]

OWM's Search Accuracy says they accept partial names, but I wasn't able to get this working in my tests.

You can use our geocoding system to find cities by name, country, zip-code or geographic coordinates. You can call also by part of the city name. To make the result more accurate just put the city name and country divided by comma.

I was expecting the above query ("find?q=york&type=like") to return "New York" but it doesn't. I also tried "New", "Yor", and "ork", without success.

Final Thoughts on OpenWeatherMap

There is a master file of locations that OWM makes available, but it doesn't provide any further resolution of locations, by name.

Without further location detail of states, provinces, counties, prefectures, etc., it didn't make sense to explore options for disambiguation of multiple returns, as you would do when trying to find your location for the first time.

I think it would be hard to support the existing Yetibot weather query interface, which feels very intuitive now, with OWM's API. My sense is that they have designed the API to work in concert with their mapping tools, which makes sense given the name. :)

Weatherbit.io

Weatherbit.io was the second API I looked at; I found it through some basic googling.

I wasn't initial that keen on Weatherbit.io as their free tier seems intended to push you into a paid subscription, but that's solely because of the wording on their plan pages.

Still sticking with the "york" examples.

Full /v2.0/current Response, For Reference

❯ curl -s "https://api.weatherbit.io/v2.0/current?key=$wb_key&city=york" | jq .
{
  "data": [
    {
      "wind_cdir": "NNW",
      "rh": 83,
      "pod": "d",
      "lon": -88.29642,
      "pres": 991.7,
      "timezone": "America/Chicago",
      "ob_time": "2018-10-10 13:00",
      "country_code": "US",
      "clouds": 25,
      "vis": 10,
      "state_code": "AL",
      "wind_spd": 2.57,
      "lat": 32.48625,
      "wind_cdir_full": "north-northwest",
      "slp": 1007.7,
      "datetime": "2018-10-10:13",
      "ts": 1539176400,
      "station": "E7440",
      "h_angle": -60,
      "dewpt": 19.2,
      "uv": 2.60677,
      "dni": 424.237,
      "wind_dir": 341,
      "elev_angle": 12.5555,
      "ghi": 164.134,
      "dhi": 71.3908,
      "precip": null,
      "city_name": "York",
      "weather": {
        "icon": "c02d",
        "code": "801",
        "description": "Few clouds"
      },
      "sunset": "23:28",
      "temp": 22.2,
      "sunrise": "11:55",
      "app_temp": 22.7
    }
  ],
  "count": 1
}

Qualify Locations

❯ curl -s "https://api.weatherbit.io/v2.0/current?key=$wb_key&city=york" | jq -c '.data[] | [.city_name,.state_code,.country_code]'
["York","AL","US"]

Again, an unqualified "york" will return something, in this case, a city in Alabama.

(York, AL has a population of ~2400, as of the 2016 US Census. I'm not sure why this was selected as the default "york"; I'm testing from an IP in New York City, NY, US. Just an interesting observation).

Adding a state, or state and country code, works as expected.

❯ curl -s "https://api.weatherbit.io/v2.0/current?key=$wb_key&city=york,pa" | jq -c '.data[] | [.city_name,.state_code,.country_code]'
["York","PA","US"]

❯ curl -s "https://api.weatherbit.io/v2.0/current?key=$wb_key&city=york,pa,us" | jq -c '.data[] | [.city_name,.state_code,.country_code]'
["York","PA","US"]

Note: proper URL encoding of the query, including spaces and commas, works nicely so there isn't too much preparation needed for the request, eg: "york, pa" as "york%2C%20pa" works fine.

Complications In Disambiguation Of Multiple Location Matches

Unfortunately, Weatherbit.io's doesn't seem to provide any sort of search functionality so users would have to specify enough detail to disambiguate, without guidance from YB.

Currently, YB will offer suggestions on which "york" you might want.

jereme 9:52 AM
!weather york

yetibot APP 9:52 AM
Found multiple locations: York, 10Jamaica; York, 13Jamaica; York, ABCanada; York, AL; York, ERYUnited Kingdom; York, FSSouth Africa; York, GA; York, IZGuatemala; York, ME; York, NBCanada; York, ND; York, NE; York, NLSouth Africa; York, NY; York, NYKUnited Kingdom; York, ONCanada; York, PA; York, PECanada; York, SSierra Leone; York, SC; York, WSierra Leone; York, WASAustralia; York, WV; York, YORUnited Kingdom; York, NLSouth Africa; York, WSierra Leone; York, SSierra Leone; York, YORUnited Kingdom; York, WV; York, GA; York, YORUnited Kingdom; York, ERYUnited Kingdom; York, NYKUnited Kingdom; York, PA

If we want to provide something similar, as a possible work around, WB publishes lists of locations, in varying sizes, by populations, which would be used to offer a local, simple search (just a string match, really).

Here are the lists, and counts provided by WB:

  • Cities >15,000 population - ~20,000 Cities
  • Cities >1,000 population - ~160,000 Cities
  • All Cities ~376,000

I pulled "All Cities", which is currently actually 378,815 locations, at ~18 MiB of uncompressed text, without much white space.

I'm not sure how often this data set changes (it has skewed little since /api/meta was published). My guess is this is sufficiently rare that shipping the current version with each YB release wouldn't be untenable.

With the data set stored locally, we could add our own simple search and disambiguation feature... but that also means shipping a big blob of EDN or similar, which is a bit gnarly. We could retrieve on demand but that seems inefficient to me, and overly complicated.

Final Thoughts On Weatherbit.io

I wound up liking this API for its qualified locations and simple request formatting requirements. Unfortunately, unless we want to build our own search, users will not be offered suggestions to help further qualify locations. That might not be a problem... it's just notably different than the current interface.

US National Weather Service API

NWS, part of NOAA, offers a rich API. I didn't dig into this as it's a beast that tries to support many different use cases via decomposed responses you stitch together by traversing JSON-LD links.

Why does the API require multiple requests for all the information?

There are many uses for the weather information provided by the API, and, historically, the service responded with everything but the kitchen sink. This design bloated bandwidth and make caching efforts difficult. One goal of the new API was a design that allowed repeat users of specific data the ability to access only the information needed. Another goal was to expire content based upon the information life cycle. The new approach using JSON-LD achieves both of these goals. While this requires additional requests, future enhancements, especially HTTP2, will make this design more efficient than a catch-all approach.

Next Steps

There are surely more APIs I could research, or things I'm missing about the two I looked at. I could devote more time there, or we could make a call based on the above.

The actual integration work is pretty trivial, regardless of which API we go with; I have a working WB impl now.

Can you improve this documentation?Edit on GitHub

cljdoc is a website building & hosting documentation for Clojure/Script libraries

× close