Missing Association / Regions?

G0LGS · 20 February 2011 23:31

According to my counting on the Summits List on SotaWatch site there are 51 Associations with 521 Regions, yet the Fact and Figures on the Database site show 52 Associations and 525 regions.

Where and which is the extra Association and the four Regions ??

Stewart G0LGS

G8ADD · 21 February 2011 00:37

In reply to G0LGS:

Yes, this difference does exist, but TBH I can’t be bothered to track down the “missing link”. All such numbers were no doubt correct at the time of entry but are rapidly made out of date by the expansion of SOTA - there will be even more soon! In the meanwhile, the valid Associations are the ones in the database; that is the list that counts!

73

Brian G8ADD

G0LGS · 21 February 2011 01:41

After a re-check - I found I had missed one Region in EA4 from my list (and hence my count).

I have also found:

S2 Association which has NO regions listed is shown in the database, but not on the SotaWatch site. (Hence 51 v 52).

ZS/MI and W4/WV appear to be empty regions in the Database and are not shown on SotaWatch.

Which still leaves me one region unaccounted for.

Stewart G0LGS

MM0FMF · 21 February 2011 01:47

In reply to G0LGS:

W4/WV definitely exists. It’s just not where you’re looking! And yes, it’s down for a fix.

However, the database is always right, even when it’s wrong.

Rather than rummaging about, why not download the full list of summits, which is updated daily?

Andy
MM0FMF

G0LGS · 21 February 2011 02:15

Rather than rummaging about, why not download the full list of summits,
which is updated daily?

Because it is just too big for what I want.

Stewart G0LGS

MM0FMF · 21 February 2011 02:44

In reply to G0LGS:

Because it is just too big for what I want.

Too big? What are you programming, a PIC?

It’s about 45000 rows and 6MB, hardly big. Why not download, extract needed data, use data. It contains every summit, association and region and when each summit is valid. Moreover, it’s the gospel according to St. SOTA.

If you email me I can probably provide what you need for little effort on your part. Put DATABASE in the subject.

Andy
MM0FMF

G1INK · 21 February 2011 03:48

In reply to MM0FMF:

What happened to the Bangladesh association - was it a wind up?
I nearly booked a hol in Chittagong!

G0LGS · 21 February 2011 05:01

In reply to MM0FMF:

45000 rows of information to read and store in an Array in memory, for my little program is a bit too much.

However (between my earlier post and this one) I have downloaded the full list and written my own program to convert this to a file in a form that works for me - it takes that program a few minutes to run, but I guess when it is finished and compiled without all the debugging stuff it will work much quicker (and I won’t need to watch it either).

That process gives me 51 Associations with 524 Regions (even if the names of some are suspect - such as W4\WV)

You will see what I’m doing with it when I release my SOTA CSV Logging program.

Stewart G0LGS

MM0FMF · 21 February 2011 14:15

In reply to G0LGS:

45000 rows of information to read and store in an Array in memory, for my
little program is a bit too much.

Are you using the wrong language? Here is the whole summit list read into memory and searchable on the summit reference. Uses <8MB of memory including 5.5MB of data and that includes all the runtime etc. That reads the whole file into memory before constructing a dictionary keyed on the summit ref. So it probably peaked at 12MB before garbage collecting.

# Python 2 syntax (print statement), as originally posted.
summitinfo = {}

def loadsummits():
    # Naive parse: split every line on commas, key on the summit reference.
    file = open("summitslist.csv", "r")
    for lines in file:
        summitline = lines.strip().split(',')
        summit = summitline[0]
        summitinfo[summit] = summitline[1:]
    return

print "poor mans database!"
loadsummits()

print "GM/SS-001\n", summitinfo['GM/SS-001']
print "OE/SB-215\n", summitinfo['OE/SB-215']
print "G/LD-057\n", summitinfo['G/LD-057']

Here’s it running with the time taken to load all the data and then find 3 summits. PC is 5yr old 3GHz Pentium 4 with 1GB of memory.

[andys@gb50linux01 ~]$ time ./filer.py
poor mans database!
GM/SS-001
['Scotland', 'Southern Scotland', 'Ben More', '1174', '3852', 'NN 432244', '', '-4.5414', '56.3858', '10', '3', '01/07/2002', '31/12/2099', '18', '29/06/2010', 'GM7PKT/P', '']
OE/SB-215
['Austria', 'Salzburg', 'Weidschober', '1789', '5869', '13.9297', '47.1703', '13.9297', '47.1703', '6', '3', '01/01/2008', '31/12/2099', '0', '', '', '']
G/LD-057
['England', 'Lake District', 'Swinside', '244', '802', 'NY 243224', '', '-3.1730', '54.5911', '1', '0', '06/03/2002', '31/12/2099', '8', '19/02/2011', 'G4MD/P', '']

real 0m0.406s
user 0m0.373s
sys 0m0.033s
[andy@gb50linux01 ~]$

Simples! The whole thing took 10mins to write just now. I used to program in C and assembler by preference. I have to use C++ at work but I don’t like it, it’s an ugly language. Tried Java, just as complex as C++ for no benefit in reality. Now, wherever possible, I use Python. It’s so simple and expressive. On top of that is the sheer power of the language to solve problems that take lots of coding in other languages. And that program runs unaltered on Windows, Linux, OS X and even on my Nokia phone. Hey if Python is good enough to run Google it’s good enough for me!

If you need some bits of data etc. just ask.

Andy
MM0FMF

M1MAJ · 21 February 2011 23:14

In reply to MM0FMF:
…

Rather than rummaging about, why not download the full list of summits,
which is updated daily?

Obviously that’s the right way to solve the problem.

BUT, it neatly sidesteps the actual question that was asked. As a flat file which is (I hope) the result of a JOIN operation on normalised data, it cannot possibly represent an empty association or region. The list gives no information about whether S2 is or is not in your list of associations.

Of course the difference between an association with no summits and a non-existent association is fairly subtle. It does matter if you’re counting.
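
A minimal sketch of that point, using an in-memory SQLite database from Python 3. The two-table schema, the column names and the sample rows are assumptions made purely for illustration; the real SOTA database layout is not shown anywhere in this thread. An inner join (which the flat file is presumed to resemble) silently drops an association that has no summits, while a left join keeps it.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE associations (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE summits (ref TEXT PRIMARY KEY, assoc TEXT, name TEXT);
    INSERT INTO associations VALUES ('G', 'England'), ('S2', 'empty association');
    INSERT INTO summits VALUES ('G/LD-057', 'G', 'Swinside');
""")

# Inner join: an association with no summits produces no row at all.
flat = con.execute(
    "SELECT a.code, s.ref FROM associations a "
    "JOIN summits s ON s.assoc = a.code ORDER BY a.code"
).fetchall()

# Left join: the empty association survives, with NULL for the summit.
full = con.execute(
    "SELECT a.code, s.ref FROM associations a "
    "LEFT JOIN summits s ON s.assoc = a.code ORDER BY a.code"
).fetchall()

print(flat)
print(full)
```

Counting distinct association codes in `flat` gives one; counting them in `full` gives two, which is the same kind of off-by-one seen between the two sites.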

G0LGS · 21 February 2011 23:40

Andy,

The reason I was using the information shown on the SotaWatch pages to generate my own list of Associations & Regions was because I had not realised that I could actually download the full summit list.

The scripting tool I am using is BASIC-like in syntax.

A simple test loop that reads the CSV (one line at a time) and loads the parts of the summit list that I need into a 2D array takes around 2 minutes (on my AMD 9650 Quad Core 2.3GHz) - clearly waiting 2 minutes for a program to load is just not acceptable.

Using my own list (524 Regions) which I can now extract from the full summit list takes only a few ms.
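
For comparison, a rough Python 3 sketch of the same extraction step: deriving the list of associations and regions from the summit references alone. The inline sample stands in for the downloaded file, the `XX/YY-...` rows are made up, and the assumption that every reference has the form ASSOCIATION/REGION-NUMBER comes from the examples quoted earlier, not from any published specification. The function name is hypothetical.

```python
import csv
import io
import re

# Stand-in for the downloaded summit list; XX/YY rows are invented.
SAMPLE = """\
Some header line (not a summit)
GM/SS-001,Scotland,Southern Scotland,Ben More
G/LD-057,England,Lake District,Swinside
OE/SB-215,Austria,Salzburg,Weidschober
XX/YY-001,Example,Example Region,Example Hill
XX/YY-002,Example,Example Region,Another Hill
"""

# Assumed shape of a summit reference: ASSOCIATION/REGION-NUMBER.
REF = re.compile(r"([A-Z0-9]+)/([A-Z0-9]+)-(\d+)")

def regions_by_association(lines):
    """Map each association code to the set of region codes seen."""
    result = {}
    for row in csv.reader(lines):
        match = REF.fullmatch(row[0]) if row else None
        if match is None:
            continue  # header lines, blank lines, anything unexpected
        assoc, region = match.group(1), match.group(2)
        result.setdefault(assoc, set()).add(region)
    return result

regions = regions_by_association(io.StringIO(SAMPLE))
print(len(regions), "associations,",
      sum(len(r) for r in regions.values()), "regions")
```

Only the small dictionary of codes is kept in memory, so the result can be written out once and loaded quickly afterwards, which is the approach described above.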

Stewart G0LGS

MM0FMF · 22 February 2011 04:46

In reply to M1MAJ:

Yes it’s a subtle difference and the lack of S2 summits in the file tells you there are none in the current database. It doesn’t tell you if there is an S2 association or not, just no current summits. Why there aren’t any summits is a different question and I don’t know the answer to that. The S2 association exists from the database’s view: it has an entry in the associations table, so it exists. But not in a usable form.

In reply to G0LGS:

I don’t know when Gary introduced the downloadable summits list. I don’t think it’s always been available. But it is very useful for people who want to use it to craft some other resource.

You’ll have to tell me the scripting system you’re using as it seems a bit slow. Or perhaps you’re doing lots of arithmetic computing array indices. I try and use higher level data abstractions in my programming now. Most object oriented languages come with a useful container library of some kind offering queues, lists, associative arrays etc. and they’ll all be well tested. You can get all the gen about performance (memory and CPU cycles) and know that you can pick the abstraction that suits your needs best without having to reinvent the wheel. And then debug the wheel you’ve just reinvented because it’s wonky in some way!

Andy
MM0FMF

G0LGS · 22 February 2011 05:33

Andy,

You’ll have to tell me the scripting system you’re using as it seems a bit slow

Auto-IT (Home - AutoIt)

Stewart G0LGS

M1MAJ · 22 February 2011 20:38

In reply to MM0FMF:

I don’t know when Gary introduced the downloadable summits list. I
don’t think it’s always been available.

I was told about it in June 2010. I was informed at the time that it had been announced about a month before that, but I don’t recall seeing that announcement and don’t know where it was made. I’d been nagging for such a facility for several years…

The facility is very useful, though it’s fairly well hidden! As you know from private correspondence, I have some small niggles with it, but having the data easily downloadable in one go is the most important thing.

M1MAJ · 22 February 2011 20:48

In reply to MM0FMF:

Simples! The whole thing took 10mins to write just now.

Indeed. But I think you’ll find it doesn’t actually parse every line correctly. I realise it was only proof of concept, but it’s a trifle misleading to claim that that code is sufficient.

I agree that Python is an excellent tool for this sort of thing. I tend to use Perl, largely because I’d already become fluent in it by the time Python became popular. They have many of the same facilities; I think Perl is probably better for quick one-off throw-away jobs, and Python better for serious code that you expect to keep and want to be able to understand next year.

Both are powerful, operating system neutral, and free.

MM0FMF · 23 February 2011 02:43

In reply to M1MAJ:

Oh it can’t parse them all correctly as some of the data contains the delimiter character. That’s a known issue. As for claims, I think it shows that loading and processing the whole summit list in memory is not an issue. The same code running on a 266MHz PPC603 with 128MB of memory (the SMSBOT server) runs in 6.5 secs compared to about 0.5 before. Which nicely shows that the same code is portable and usable not only on modern whizz-bang machines but also on old tin boxes constrained in memory and compute power.

However, your point got me thinking, and after a quick tweak to that code and another 10mins of development it appears there are at least 714 summits with hard to parse data out of a total 43731. That proof of concept code has nicely identified many of the summits where the data could do with a clean up and revision. Also it indirectly helped Stewart get further with his coding. So that’s a win-win situation. Definitely not bad for a few minutes musing. Sadly it’s another job on the list of tasks for me!
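
The tweaked code was not posted, so the following is only a guess at the idea, written in Python 3: flag every row where a naive split on commas disagrees with a proper CSV parse. It assumes the file uses ordinary double-quote CSV quoting, which is not confirmed here, and the second sample row is invented. The function name `hard_to_parse` is hypothetical.

```python
import csv
import io

# First row is from the output above (truncated); second row is made up.
SAMPLE = (
    "G/LD-057,England,Lake District,Swinside,244\n"
    'XX/YY-001,Example,Example Region,"Hill, The",100\n'
)

def hard_to_parse(lines):
    """Return references of rows where str.split(',') and csv.reader differ."""
    suspects = []
    for line in lines:
        naive = line.rstrip("\n").split(",")
        proper = next(csv.reader([line]))
        if naive != proper:
            suspects.append(proper[0])
    return suspects

print(hard_to_parse(io.StringIO(SAMPLE)))
```

Rows that come back from this check are the ones where the quoted field contains a comma or a quote, so the list doubles as a to-do list for anyone wanting to review the names.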

Andy
MM0FMF

M1MAJ · 23 February 2011 03:50

In reply to MM0FMF:

As for claims, I think it
shows that loading and processing the whole summit list in memory is
not an issue.

Sure. I do it all the time. The daemon which generates the Twitter feed of the spots has all the summit names and coordinates in memory as it runs.

However, your point got me thinking, and after a quick tweak to that code and
another 10mins of development it appears there are at least 714
summits with hard to parse data out of a total 43731.

Well yes, I told you that (well I didn’t give the numbers, but could have if you had asked). Nevertheless, as I also said, I don’t think the data should be restricted. If the name of a summit happens to contain a comma or a quote, so be it. It’s our job to cope. It’s not difficult. I do parse the existing file, with a bit of tweaking - I haven’t found any actual ambiguities.

G0LGS · 23 February 2011 05:45

I have improved my parsing of the 43729 summits (+ 2 header lines) by changing the way it was being done - from an iterative loop through each line to using a RegExp (that should handle quotes and embedded commas).

Whilst that reduced the time to read all the required information into memory from around 120 seconds to around 35 seconds, that is still much too long.
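
The actual RegExp was not posted, so this is only a sketch, in Python 3 rather than AutoIt, of what a pattern that copes with quotes and embedded commas might look like. It assumes standard CSV conventions (fields optionally wrapped in double quotes, with a doubled quote standing for a literal one); the sample line and the name `split_csv_line` are made up.

```python
import re

# One field: either a double-quoted string (with "" as an escaped quote)
# or a run of characters containing no comma and no quote.
FIELD = re.compile(r'(?:^|,)(?:"((?:[^"]|"")*)"|([^",]*))')

def split_csv_line(line):
    """Split one CSV line into fields, honouring double-quoted fields."""
    fields = []
    for m in FIELD.finditer(line):
        quoted, bare = m.group(1), m.group(2)
        if quoted is not None:
            fields.append(quoted.replace('""', '"'))
        else:
            fields.append(bare)
    return fields

print(split_csv_line('XX/YY-001,"Hill, The",100'))
```

This does not detect malformed lines (for example a stray quote in the middle of an unquoted field), so a ready-made CSV parser is the safer choice where one is available.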