Discrepancies in SOTA Swagger API

K7IW · 24 July 2022 20:13

As I poke around the SOTA API, I have found a few discrepancies between various APIs and also the Summit List CSV file that’s provided elsewhere:

https://api2.sota.org.uk/docs/index.html
https://www.sotadata.org.uk/summitslist.csv

First of all, the BonusPoints field that is found in the CSV file does not appear to be provided anywhere in the JSON APIs documented through Swagger or in the actual data downloaded from them. This is not a big deal since I am already downloading and parsing the CSV data, but it means that I can’t completely replace it with the JSON data.

Second, while the Swagger docs show that the same data type, SummitViewModel, is used for summit data on both the GET region and GET summit JSON APIs, several of the summit fields returned by the region API are nulled out, including regionCode, regionName, etc. They are present, but have the value null. This is somewhat understandable, since they duplicate what can be found in the region section at the top, but the documentation does not make this obvious. I’ll need to rework my app a little to account for this.

The third big discrepancy is a little more interesting. There are fields present in the GET summit API that have the value null, but have the proper value in the GET region API and the CSV file data. In particular, activationCount, activationDate, and activationCallsign are null in the summit-specific API call, but have valid data elsewhere. This is data that is more likely to change regularly so I’m not sure why it’s not filled in on the summit-specific GET JSON call. On top of that, the field locator is null in the GET region and non-null in the GET summit API. Ultimately, I’d need to combine information from the CSV file, GET region API, and the GET summit API and then merge it where non-null data takes priority in order to get the full picture of a summit. I do realize that I can easily calculate the locator value myself, but it would still be good to fix.
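The "non-null data takes priority" merge described above could look roughly like the following. This is only a minimal sketch: the function name merge_summit is hypothetical, the three records are trimmed-down illustrations with made-up values for activationCount and BonusPoints, and real records carry many more fields.

```python
from typing import Any, Dict, Optional


def merge_summit(*sources: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge summit records field by field.

    Sources are given in priority order. For each field, the first
    non-null value wins; a null only survives if every source has null.
    """
    merged: Dict[str, Any] = {}
    for record in sources:
        if not record:
            continue
        for key, value in record.items():
            # Only fill fields that are still missing or null.
            if merged.get(key) is None:
                merged[key] = value
    return merged


# Illustrative records only; the values are invented for this example.
from_summit_api = {"summitCode": "HL/GN-001", "locator": "PM35ui", "activationCount": None}
from_region_api = {"summitCode": "HL/GN-001", "locator": None, "activationCount": 12}
from_csv = {"summitCode": "HL/GN-001", "BonusPoints": 3}

merged = merge_summit(from_summit_api, from_region_api, from_csv)
print(merged)
```

Because the first non-null value wins, the order of the arguments decides which source is trusted when two sources disagree on a non-null value.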

There may be other discrepancies as well, but this is what I’ve seen so far. Any ideas from the devs on this?

VK3ARR · 25 July 2022 04:59

The short answer is the API does exactly what it needs to do to handle the SOTA websites and nothing much more. This means there’s a bunch of these kinds of inconsistencies because they aren’t issues for, e.g., summits.sota and the like.

The longer answer is the same as the shorter answer, except that many folks are using the API in ways we aren’t sure about, and I’m not really in the mood to make too many breaking changes to the API at this stage. There’s a bit of work going on to improve the API, and it will most likely be released as a new API version rather than as fixes to the existing API. Timeframe indeterminate at this point.

K7IW · 25 July 2022 05:55

OK, in the meantime, I’m taking a closer look and will try to document all the discrepancies I find. Another one I just ran into is the valid field, which I missed since it’s never null. However, it appears to always be false for summits from the GET region API, but true in the data retrieved from the GET summit API. That means, with the exception of the locator and valid fields, the GET summit API is really just a subset of the data retrieved from the GET region API. With that, it’s probably not worth the effort for me to implement it per summit since I’m already caching the retrieved data offline. The locator can be calculated pretty easily from the latitude/longitude, and I presume the valid boolean is nothing more than the current date at retrieval time being in the range validFrom <= date <= validTo, so it could also be calculated as needed. I’ll just have to make sure I implement the refresh button on a summit at the region API level if someone is trying to get the latest activation data for a specific summit.
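Both derived fields could be computed locally along these lines. This is a sketch under assumptions: the meaning of valid is only the presumption stated above and has not been confirmed, the function names are hypothetical, and the locator uses the usual Maidenhead subdivision (18, 10, 24, 10, 24 cells per pair), so pairs=3, 4 or 5 gives 6, 8 or 10 characters.

```python
from datetime import date


def is_valid(valid_from: date, valid_to: date, today: date) -> bool:
    """Presumed meaning of `valid`: today lies within [validFrom, validTo]."""
    return valid_from <= today <= valid_to


def maidenhead(latitude: float, longitude: float, pairs: int = 3) -> str:
    """Return a Maidenhead locator with 2 * pairs characters."""
    if not 1 <= pairs <= 5:
        raise ValueError("pairs must be between 1 and 5")
    if not (-90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0):
        raise ValueError("coordinates out of range")

    # Shift both axes so they start at zero.
    lon = longitude + 180.0
    lat = latitude + 90.0
    lon_size, lat_size = 360.0, 180.0
    divisions = [18, 10, 24, 10, 24]  # cells per axis at each level

    chars = []
    for level in range(pairs):
        n = divisions[level]
        lon_size /= n
        lat_size /= n
        # Clamp so the upper edge (e.g. latitude 90) falls in the last cell.
        lon_idx = min(int(lon / lon_size), n - 1)
        lat_idx = min(int(lat / lat_size), n - 1)
        lon -= lon_idx * lon_size
        lat -= lat_idx * lat_size
        if level % 2 == 1:
            # Odd levels (square, extended square) use digits.
            chars += [str(lon_idx), str(lat_idx)]
        else:
            # Field uses upper-case letters, finer letter pairs lower-case.
            base = ord("A") if level == 0 else ord("a")
            chars += [chr(base + lon_idx), chr(base + lat_idx)]
    return "".join(chars)


# Coordinates of HL/GN-001 as returned by the GET summit call.
print(maidenhead(35.3369, 127.7307))
print(is_valid(date(2010, 1, 1), date(2099, 12, 31), date(2022, 7, 25)))
```

Longitude comes first in each character pair, which is why the function consumes the longitude index before the latitude index at every level.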

MM0FMF · 25 July 2022 06:41

This is how all the sites work. There is no point storing it, and thus having to enter it, maintain it, validate it, etc., when the data to generate it exists and we have already done that work to get the latitude and longitude. There’s also the question of whether we should store it at 6, 8 or 10 characters.

K7IW · 25 July 2022 06:56

Also, just to document it for others, it looks like restrictionMask and restrictionList only contain meaningful values at the GET summit level.

There’s also a minor, but generally insignificant, difference in the latitude/longitude between the two. One appears to be getting the data as a string from the database and printing it out as JSON floating point, and the other appears to be retrieving it as a floating point number, causing a slight error in the precision. The difference is less than a meter, but it can cause values to fluctuate if caching data from both APIs offline:

GET /api/summits/HL/GN-001

...
  "summitCode": "HL/GN-001",
  "longitude": 127.7307,
  "latitude": 35.3369,
...

GET /api/regions/HL/GN

...
      "summitCode": "HL/GN-001",
      "longitude": 127.73069763183594,
      "latitude": 35.33689880371094,
...
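The longer values from the region call look like what you get when the shorter values pass through a 32-bit float somewhere along the way. That is only an observation from the numbers above, not something confirmed about the server. The sketch below reproduces the effect and shows one way a client might normalize coordinates before caching; the function names and the choice of 4 decimal places (the precision seen in the GET summit output) are assumptions.

```python
import struct


def as_float32(value: float) -> float:
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def normalize(coord: float, places: int = 4) -> float:
    """Round a coordinate so both API variants compare equal in a cache."""
    return round(coord, places)


# Values from the GET summit response, pushed through single precision.
print(as_float32(127.7307))
print(as_float32(35.3369))

# After normalizing, the region-style value matches the summit-style value.
print(normalize(as_float32(127.7307)) == 127.7307)
```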

AB6D · 12 February 2023 17:09

Just now finding this thread. Thank you, K7IW, for your work, and SOTA team for the replies.

I’ve got a few wish list items for the API:

I’ll add my up-vote for adding a way to get summit bonus points information from the API (both when getting all summits in a Region, and when pulling data on a specific Summit).
My app (SOTAmat) locally caches the entire database so my users don’t crush the official site. I poll on a regular basis looking for changes since my last sync. It would reduce load on both your system and my system if the APIs allowed me to specify a “WHERE MODIFIED AFTER ” parameter. Your SOTA database can add an automatic Modification-Date column that the database itself maintains (so you don’t need to code anything). Any time you do a write operation to a database row where at least one value changes, the SQL database will auto-update the “Modified-Time” column for that row. You of course want an index on that column for fast lookup. The advantage is that when I make an API query, I can provide the UnixDateTime parameter, and I will instantly get just the rows that have changed since my last sync (if any). That would allow me to sync the data thousands of times faster, put far less load on your servers, and sync more frequently. I think you could do it without breaking the existing API (for Associations and Regions) since this could be an optional URL parameter, and would not need to change the format of the returned data (it only changes how many rows are returned). Other API consumers who don’t provide the parameter would get the same results as today.
How often is that CSV file updated?
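For illustration, the "modified after" idea could be sketched with SQLite from the Python standard library. Everything here is an assumption: the real SOTA database engine and schema are not known from this thread, the table and column names are hypothetical, and the summit names are made up. SQLite has no self-updating column, so a trigger stands in for it.

```python
import sqlite3

SCHEMA = """
CREATE TABLE summits (
    summit_code TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    modified_at INTEGER NOT NULL          -- Unix time of the last real change
);
CREATE INDEX idx_summits_modified_at ON summits (modified_at);

-- Bump modified_at only when the name actually changes.
CREATE TRIGGER summits_touch AFTER UPDATE OF name ON summits
FOR EACH ROW WHEN NEW.name IS NOT OLD.name
BEGIN
    UPDATE summits
    SET modified_at = CAST(strftime('%s', 'now') AS INTEGER)
    WHERE summit_code = NEW.summit_code;
END;
"""


def changed_since(conn: sqlite3.Connection, unix_time: int) -> list:
    """Return only the rows modified after the caller's last sync time."""
    cursor = conn.execute(
        "SELECT summit_code, name, modified_at FROM summits "
        "WHERE modified_at > ? ORDER BY summit_code",
        (unix_time,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Two rows last touched long ago (Unix time 1000); names are invented.
conn.executemany(
    "INSERT INTO summits VALUES (?, ?, ?)",
    [("W6/NC-001", "Example Peak A", 1000), ("W6/NC-002", "Example Peak B", 1000)],
)
# A write that changes nothing leaves modified_at alone ...
conn.execute("UPDATE summits SET name = name WHERE summit_code = 'W6/NC-001'")
# ... while a real change bumps it to the current time.
conn.execute("UPDATE summits SET name = 'Renamed Peak' WHERE summit_code = 'W6/NC-002'")
conn.commit()

print(changed_since(conn, 1000))
```

A client that omits the timestamp would simply run the existing unfiltered query, so the response format stays the same either way.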

73 de AB6D

MM0FMF · 12 February 2023 17:37

Every 24hrs.

Why? I’m curious why a spotting program needs to know this.

Do you mean just the summits? Just what is your app doing that it needs so much access that you are concerned you may “Slashdot” the servers? It’s nice you have considered this however.

AB6D · 12 February 2023 18:18

I’m curious why a spotting program needs to know this.

SOTAmat is (slowly) adding functions beyond spotting. For example, mountain-aware localized weather data, peak info, associated POTA park info, callsign info and stats, etc. The next version helps users select the summit ID they want to spot, and gives stats on that summit – and I can’t show bonus points alongside the other stats since I didn’t see it in the API. I want to show peak info to avoid fat-fingered (gloved-fingered) mistyped summit IDs, and the easiest way is to show the name of the peak the user selected. For people who might not know the summit ID, I can eventually allow search or GPS auto-lookup (but not yet). I don’t need to show the other stats (like bonus points), but as long as I’m showing some peak info, why not show all the peak info?

Do you mean just the summits? Just what is your app doing that it needs so much access that you are concerned you may “Slashdot” the servers? It’s nice you have considered this however.

I pull down a few tables:

The Associations list
The Regions list
The Summits list

When I sync, I track which rows have changes vs. which rows are the same. Each row gets a “Modified on” datetime and a “Created On” datetime. When a summit is added, that’s what the “Created On” is for. When metadata for a summit is changed (ex. LastActivator), that is what “Modified On” is for.
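A minimal sketch of that bookkeeping might look like this. The function name, the cache layout and the sample records are all hypothetical, and deleted summits are not handled.

```python
from typing import Any, Dict, Tuple


def sync(cache: Dict[str, Dict[str, Any]],
         fresh: Dict[str, Dict[str, Any]],
         now: int) -> Tuple[int, int]:
    """Update the cache in place and return (created, modified) counts.

    Each cache entry keeps the downloaded data plus two timestamps:
    created_on (first time the summit was seen) and modified_on
    (last time any of its data changed).
    """
    created = modified = 0
    for code, data in fresh.items():
        entry = cache.get(code)
        if entry is None:
            # New summit: both timestamps start at the sync time.
            cache[code] = {"data": dict(data), "created_on": now, "modified_on": now}
            created += 1
        elif entry["data"] != data:
            # Known summit whose metadata changed.
            entry["data"] = dict(data)
            entry["modified_on"] = now
            modified += 1
        # Unchanged rows keep their old timestamps.
    return created, modified


cache: Dict[str, Dict[str, Any]] = {}
first = sync(cache, {"W6/NC-001": {"activationCount": 5}}, now=100)
second = sync(
    cache,
    {"W6/NC-001": {"activationCount": 6}, "W6/NC-002": {"activationCount": 0}},
    now=200,
)
print(first, second)
```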

This gets used in several ways:

When users are preparing their configuration on the web site, they build a table of the Regions they plan to operate in the future. That defines one part of their “Configuration” that is taken offline to the mobile app later. In order to have a good user interface, my web site has a full searchable database that updates while they type. So I can start typing “W6/”… and it will search the database in real time on each character, auto-match the possibilities and display them in a selector. Very fast. SOTA’s API is not set up for this type of wildcard search.
When a configuration is stored for a specific user on the server and then transferred to their mobile app for offline use, I need to know the exact state of the database at that moment. SOTAmat encodes commands into just 4 callsign suffix characters: 4 characters is not enough to actually compress the data into it, so instead SOTAmat creates a mapping between the 1 million allowed suffix combinations and their meanings on a per-user, per-configuration basis. Since users pre-define their Regions of interest rather than their Summits of interest, both the server and the mobile app (which is offline) need to agree on that mapping, which is super sensitive to the number of summits inside a region (and that number changes over time). SOTAmat keeps a lot of timestamps to track everything for each user and for each summit. For example, I might load a configuration for W6/NC in January 2022, but the Association manager might add a summit in July 2022. Since the mobile app is fully offline and rarely updates its configuration, the server needs to track that the mobile app is out of date with reality and map callsign suffixes according to the old state of W6/NC. But another user might be using the fresh W6/NC mappings. Since this is happening for all users at the same time, and since different users are changing their configurations at different datetimes, I keep track of everything from my local timestamped copy of the data, since I keep extra columns of metadata beyond the official SOTA data for each summit. As long as I need these extra columns, it is easier to just have all the data in one place for simple queries.
When I get streamed data from PSKreporter (400 events per second via fiber optic feed!!!) I do a bunch of filtering and validation on reception reports that have suffixes. For example, is this a registered and enabled user? Does the suffix properly decode? Does it decode to a valid configuration? A valid Peak ID? That requires lookups, and those lookups need to be based on per-user timestamps. Having all the data local is helpful.
SOTAmat has a bunch of new 2-way commands for those times when you do have cell phone service or a Garmin inReach: users can pull summit info and statistics to their inReach (or cell phone) for example. For that I pull the data from my local cache.
The next version of the mobile app (as described earlier) allows the offline mobile app to display stats (most importantly the Name) of the peak the user is selecting to self-spot. To get that offline database of all possible peaks onto the mobile phone, each user syncs their mobile phone when online and downloads the entire database PER USER. Since the stats for each peak (ex. LastActivator, LastActivationDate) are changing, I don’t want each user to have to download the entire database on each online sync. Instead, the mobile apps send a timestamp to the server, and the server sends back just the changed records, which is often tiny. It also compresses the data for poor reception areas.
When someone uses the 2-way commands (via inReach, eMail, or SMS), such as when they ask for hyper-localized mountain-aware weather reports, I need to know the elevation and Lat/Lon value for the peak ID in question. I do this lookup locally from my cached database. Etc. etc. etc.
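As an illustration of why a local copy helps with the search-as-you-type case in the list above, a prefix lookup over a sorted list of summit codes needs only a binary search. This is a generic sketch, not how SOTAmat actually implements it; the function name and the sample codes are placeholders.

```python
import bisect
from typing import List


def prefix_search(sorted_codes: List[str], prefix: str, limit: int = 10) -> List[str]:
    """Return up to `limit` codes that start with `prefix`.

    In a sorted list, all strings sharing a prefix sit next to each
    other, starting at the position bisect_left finds for the prefix.
    """
    start = bisect.bisect_left(sorted_codes, prefix)
    matches = []
    for code in sorted_codes[start:start + limit]:
        if not code.startswith(prefix):
            break
        matches.append(code)
    return matches


# Sample codes used purely as strings for the demonstration.
codes = sorted(["W6/NC-001", "W6/NC-002", "W6/CT-001", "W7W/KG-001", "HL/GN-001"])

print(prefix_search(codes, "W6/"))
print(prefix_search(codes, "W6/N"))
```

Each keystroke costs one binary search plus at most `limit` comparisons, which is why this kind of lookup feels instant against a local cache.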

73 de AB6D - Brian