ZL Hunting (Part 1)

G5OLD · 25 January 2025 13:05

I’ve put together calculations accounting for summit height and the bulge of the earth.

The Earth’s equatorial radius is about 21km more than at the poles, or about 34km further to travel if your S2S is on the equator vs pole to pole. This, combined with your summit height, makes the largest difference.
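As a rough check on those figures, here is a minimal sketch comparing half of the equatorial circumference with a pole-to-pole path along a meridian. It assumes the WGS-84 radii (not stated in the post) and uses Ramanujan’s approximation for the perimeter of an ellipse; summit height is left out.

```python
import math

# Assumed WGS-84 radii in km; the post only quotes the ~21 km difference.
EQUATORIAL_RADIUS_KM = 6378.137
POLAR_RADIUS_KM = 6356.752


def ellipse_perimeter(a, b):
    """Ramanujan's approximation to the perimeter of an ellipse."""
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))


radius_difference = EQUATORIAL_RADIUS_KM - POLAR_RADIUS_KM
half_equator = math.pi * EQUATORIAL_RADIUS_KM
pole_to_pole = ellipse_perimeter(EQUATORIAL_RADIUS_KM, POLAR_RADIUS_KM) / 2
extra_km = half_equator - pole_to_pole

print(f"radius difference: {radius_difference:.1f} km")
print(f"extra distance along the equator: {extra_km:.1f} km")
```

Under those assumptions both numbers come out close to the 21km and 34km quoted above.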

But I agree, let’s see how small the margins are and whether activation zones make a difference … reminds me of the kilometers of walking in the AZ of Ben Avon GM/ES-006

When @GM5ALX has cranked the data we can have a look!

If there are two near-equatorial antipodal summits, say South America to a Philippine or Japanese island, they will win, without a doubt.

GM4LLD · 25 January 2025 14:39

I’m not! You’re a scientist and I’m an engineer. Scientists want measurements to be precise and accurate and engineers want them to be precise and accurate enough. That’s why I’m happy to use a simple average for the circumference of the Earth and ignore heights. Likewise the distance calculation used by the DB was the famous Haversine formula with an accepted value of 6371km for the Earth’s radius.
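For reference, a minimal sketch of the Haversine calculation on a sphere of radius 6371km. The function name and the sample coordinates are illustrative assumptions; this is not the database’s actual code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius quoted above; heights are ignored


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp in case rounding pushes a fractionally above 1 near antipodes.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


# Two exactly antipodal points on the equator (hypothetical coordinates).
half_circumference = haversine_km(0.0, 0.0, 0.0, 180.0)
```

On this simple sphere the largest possible distance is pi times 6371km, a little over 20,015km.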

I was unaware of the geopy library. That’s the great thing about “modern” languages, the almost infinite number of libraries available to do the heavy lifting for you. The Haversine formula is somewhat special to me as it was the first bit of complex software (for various definitions of complex) that I wrote in October 1979 for my shiny new TI59 programmable calculator. The next was a program to display the positions of Jupiter’s 4 largest moons relative to Jupiter. So calling dst = geopy.distance.distance(loc1, loc2).km to get the answer seems a bit lightweight to me. Ideally it should be written in old school FORTRAN where there would be about 20-40 lines of code and hundreds and hundreds of lines of JCL guff to precede it so the damn thing would compile and run. For the record I’ve written versions in TI59 speak, GW-Basic, C, Java, C#, C++ and I am now desperately trying to avoid pip installing geopy and wasting the rest of the day playing with that.

MM7MOX · 25 January 2025 16:49

We (MM7MOX and MM0UHR) did the GM/SS-104 to GM/SS-105 pair last year. We could have done the s2s with flags but the database doesn’t have optical wavelengths as a logging option !
It makes for an instant complete when you swap summits at half-time though.
Andy
MM7MOX

GM4LLD · 25 January 2025 16:56

Damn you Alex @GM5ALX I couldn’t resist and have a simple piece of code iterating through the whole summitslist.csv. Except it’s a bit slow. Like 61 days to generate all 180000 values.

The first optimisation is that I don’t need to calculate both GM/SS-104 to GM/SS-105 and GM/SS-105 to GM/SS-104, as they are the same, which halves the number to calculate. And I need a parallel algorithm so I can use more cores. And a faster computer as it’s running on my low power 2.8GHz i5 Linux desktop right now. That’s fast enough and low power enough not to worry about the cost of leaving it running 24/7 but not fast enough for this!

I do have a dual 40-core AMD EPYC (80 cores, 512GB RAM, 7TB disk) server sitting idle at work…

GM5ALX · 25 January 2025 17:27

geopy is quite slow, I think it has various loops, so I implemented haversine with numpy (uses Fortran underneath!). Whilst geopy uses an accurate radius, it still only uses one and ignores altitude by default.

@G5OLD got sucked into calculating the distance on oblate spheroid including height. He made a spreadsheet for that. So I’ll get the top 1,000 then reprocess in his sheet for “a better answer”.

Use python itertools to generate a pairwise list, multiprocessing to add cores and then chunking to avoid running out of RAM.
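A minimal sketch of the pairwise generation and chunking idea, using only itertools. The function name, chunk size and sample list are assumptions for illustration; in a fuller version each chunk could be handed to a worker from multiprocessing.Pool and written to disk before the next one is built.

```python
from itertools import combinations, islice


def chunked_pairs(items, chunk_size):
    """Yield lists of at most chunk_size unordered pairs from items."""
    pairs = combinations(items, 2)  # lazy: pairs are produced on demand
    while True:
        chunk = list(islice(pairs, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical short list of summit references.
summits = ["GM/SS-104", "GM/SS-105", "GM/ES-006", "EA1/LU-019", "ZL3/CB-406"]
chunks = list(chunked_pairs(summits, chunk_size=4))
```

Because combinations is lazy, only one chunk of pairs needs to be held in RAM at a time.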

My first list of activated summits was 40,000 summits. This was 800 million calcs and it took 30 minutes. Now with all 160,000 summits there are 12.8 billion calcs and that’s estimated to take 9 hours.
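The pair counts follow from “choose 2 from N”, which can be checked with a few lines. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import math


def unordered_pairs(n):
    """Number of distinct S2S pairs among n summits, order ignored."""
    return math.comb(n, 2)


activated = unordered_pairs(40_000)     # about 800 million
all_summits = unordered_pairs(160_000)  # about 12.8 billion
ratio = all_summits / activated         # four times the summits, ~16x the work
```

Scaling the 30 minutes by that ratio gives roughly 8 hours, in the same region as the 9 hour estimate.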

The first calc generated a 23GB csv file; the current process… I guess 350+GB?

Then I’ll have to wait for sort to put them in order.

So many ways to speed it up but when the family is visiting for the day it’s quicker to just let the computer crunch the numbers.

GM4LLD · 25 January 2025 17:33

I do have some BigIron machines sitting idle whilst some other hardware is made ready in the data centre at work. Seems a shame not to use them

G5OLD · 25 January 2025 17:34

My portable computer (aka iPhone) says no to even working out the number of permutations

You could spend even more time working out how to optimise the number of computations to just the key summits. But looks like @GM5ALX is doing proper engineering… multi cores and brute force!

GM5ALX · 25 January 2025 17:35

It’s the computer version of:

If I Had More Time, I Would Have Written a Shorter Letter.

VK3ARR · 25 January 2025 17:41

As much as I respect the use of old school Unix tools, surely maintaining a running list of top 10 would have to be easier on your SSD

And of course you could use some AABB bounding boxes to narrow the comparison list quickly
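A running top-N list can be sketched with a small min-heap from the standard library, so that nothing but the current best N results is ever stored. The function name and the sample stream are assumptions for illustration.

```python
import heapq


def top_n_pairs(scored_pairs, n=10):
    """Keep only the n largest (distance, pair) items from a stream."""
    heap = []  # min-heap: heap[0] is the smallest distance currently kept
    for item in scored_pairs:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # Drop the current smallest and keep the new, larger item.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


# Hypothetical stream of (distance_km, (summit_a, summit_b)) results.
stream = ((float(d), ("A", "B")) for d in [5, 1, 9, 3, 7, 8, 2])
best = top_n_pairs(stream, n=3)
```

The standard library’s heapq.nlargest does the same job in one call; the explicit loop just shows why no large file or sort is needed.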

GM4EVS · 25 January 2025 21:41

Every once in a while, you stick your head above the parapet, fully expecting it to be blown off … yet you still do it

Based on my limited understanding of the problem, here goes:

1 Assume we are going to do the full set of worldwide S2S calcs, comprising some 160,000 summits.

2 First, pull the relevant info (Lat, Long, name) from the SQL database into a flat CSV file.

3 For the distance calculations, we’ll use C++ as the standard library makes multi-threading notably straightforward. My Linux desktop has 12 hyper-threaded cores. We can use them all in parallel because the 12.8 billion distance calcs have no interdependency.

4 Create a simple C++ class object to represent a summit. Include data members for the input (which we’ll read from our flat file), and the result (pointer to 159,999 values?).

5 Assuming 8-byte doubles for lat, long and distance, plus 20+ bytes for the name, and a bit of overhead, that’s about 50 bytes per object.

6 Total size 160,000 * 50 = 8,000,000 bytes ie, the whole lot fits easily into memory on a typical 8GB machine.

7 We use the C++ std::vector type to store all the objects - access times are super-fast compared to linked-lists etc.

8 Write out the results to a CSV file. Then sort …

Execution time hopefully rather less than 9 hours, but …

On reflection, storing then sorting 12.8 billion results could be somewhat more problematic than this …

Using 4-byte floats for the output, a simple binary results file would be 4 * 12.8 = 51.2 billion bytes.

73 Dave

PY2VM · 25 January 2025 21:42

Hello Tim,

I was happy with your response to my request. Normally the UK only appears in the early hours of the morning in Brazil, 08:00 UTC.

And I am happy to hear you very well, exchanging names and information about the Peaks.

I hope to hear you in other activations and I hope to make contact with VK. Every day I look at the cluster for VK, but I don’t hear anything.

I was using EFHW 20.5m long at 5 meters high, Radio FT857d with 60W.

73
carlos
PY2VM

GM5ALX · 25 January 2025 23:38

I’m sure the computation could be massively reduced. I only know python (and whilst ChatGPT could probably recommend some C++ code if it doesn’t work then I don’t know what I’m doing).

The final csv is 360 GB, so I’m not sure doing it in memory (with a mortal’s computer) is possible, and so I chunk it and write to a file every 50,000 S2Ss.

GNU sort does the job simply enough afterwards, although I’m not sure how fast it is. I’ve since found DuckDB which looks to be much much faster, and has processed 1/3 of the csv in about 5 minutes. Compare that to sort, which has been chugging away for 40 minutes and I think has done 1/3, based upon the 100GB of temp files it has made.

GM5ALX · 25 January 2025 23:58

Using @G5OLD’s bulge calculator after processing all the summits:

Summit 1 Ref	Name Summit 1	Summit 2 Ref	Name Summit 2	Arc Distance in km (with Height and Bulge)
EA1/OU-006	O Xistral	ZL3/CB-101	Mount Una	20043.17
EA1/OU-006	O Xistral	ZL3/TM-011	ZL3/TM-011	20042.59
EA1/ZA-003	Moncalvo	ZL3/MB-085	ZL3/MB-085	20042.28
EA1/OU-015	Peña Nofre	ZL3/TM-022	Mahanga Range	20041.96
EA4/CR-066	Rayo	ZL1/MW-002	Mount Ngauruhoe	20041.88
EA1/OU-001	Peña Trevinca	ZL3/MB-161	ZL3/MB-161	20041.75
EA1/LE-267	Las Colinas	ZL3/CB-293	Snowflake	20041.59
EA1/LU-019	Faro	ZL3/CB-406	The Nelson Tops	20041.57
CT/TM-004	Armada	ZL3/WC-454	Lyell Range	20041.49
EA5/MU-009	Gato	ZL1/BP-039	ZL1/BP-039	20041.45
EA1/ZA-003	Moncalvo	ZL3/MB-066	ZL3/MB-066	20041.43
EA1/OU-009	Chao do Porco	ZL3/MB-359	ZL3/MB-359	20041.41
EA1/OU-017	Majedo	ZL3/MB-025	Mount Weld	20041.37
EA5/MU-042	Los Odres	ZL1/GI-007	ZL1/GI-007	20041.32
EA1/OU-003	Manzaneda	ZL3/CB-291	ZL3/CB-291	20041.17
EA1/CR-019	Espiñeira	ZL3/WC-109	Mount Rosamond	20041.07
CT/TM-005	Coroa	ZL3/MB-033	ZL3/MB-033	20041.05
EA5/AB-016	La Atalaya	ZL1/BP-124	ZL1/BP-124	20041.02
EA1/ZA-006	Alto de Marabón	ZL3/MB-027	ZL3/MB-027	20041.01
EA4/CR-060	Pico Casa de Valdebenito (norte)	ZL1/MW-011	ZL1/MW-011	20041.01
EA1/CR-031	Alto de Fernandiña	ZL3/CB-159	Hinge Peak	20040.98
EA1/OU-014	Meda	ZL3/WC-227	Mount Freyberg	20040.97
CT/TM-020	Coroto	ZL3/MB-049	ZL3/MB-049	20040.97
CT/TM-016	Minhéu	ZL3/WC-447	The Needle	20040.97
EA1/LE-251	Pico de la Fraga	ZL3/CB-473	Mount Clear	20040.95
EA1/OU-028	Outeiro de Fariñeiros	ZL3/TM-030	Emily Peaks	20040.95
EA5/AB-019	Loma de las Yeguas	ZL1/BP-119	ZL1/BP-119	20040.92
EA1/LE-153	Llagarino	ZL3/CB-506	ZL3/CB-506	20040.92
EA1/CR-026	Faro	ZL3/WC-119	Mt Beaumont	20040.91
EA1/LU-023	O Piorno	ZL3/WC-349	Fetlock Shrouds	20040.90
EA5/AB-017	Peña de Moratalla	ZL1/BP-147	ZL1/BP-147	20040.88
EA7/JA-051	El Majalón	ZL1/BP-185	ZL1/BP-185	20040.81
EA1/LU-023	O Piorno	ZL3/WC-218	Mount Ajax	20040.74
EA1/OU-045	O Fial	ZL3/MB-144	ZL3/MB-144	20040.72
EA5/MU-011	Gigante	ZL1/GI-084	ZL1/GI-084	20040.71
EA1/LE-038	Picón	ZL3/MB-120	Mount Jackson	20040.70
EA1/OU-032	O Picoto	ZL3/MB-087	ZL3/MB-087	20040.65
EA5/AB-034	Las Allanadas	ZL1/GI-175	ZL1/GI-175	20040.65
EA7/JA-090	Cerro de Gontar	ZL1/BP-141	ZL1/BP-141	20040.64
EA7/JA-058	Cerro de los Calarejos	ZL1/BP-094	Pokaikiri	20040.61
EA1/LE-226	Pico de las Yeguas	ZL3/CB-576	Mount Gooch	20040.60
EA1/ZA-011	Peña del Castro	ZL3/MB-044	ZL3/MB-044	20040.57
EA5/AB-016	La Atalaya	ZL1/BP-123	Kapuarangi	20040.57
EA1/OU-001	Peña Trevinca	ZL3/MB-014	Dillon Cone	20040.56
EA1/LU-013	Legúa	ZL3/CB-634	Mount Saul	20040.54
EA5/AB-017	Peña de Moratalla	ZL1/BP-121	Te Reinga	20040.53
EA5/AB-017	Peña de Moratalla	ZL1/BP-258	ZL1/BP-258	20040.51
EA7/JA-065	El Rayo	ZL1/BP-084	ZL1/BP-084	20040.51
EA5/AB-034	Las Allanadas	ZL1/GI-042	ZL1/GI-042	20040.51
EA7/GR-068	Cabras	ZL1/WK-217	ZL1/WK-217	20040.50

It’s not until around the 650th S2S that you get a pair that doesn’t involve Europe:

Summit 1 Ref	Name Summit 1	Summit 2 Ref	Name Summit 2	Arc Distance in km (with Height and Bulge)
JA6/KG-177	Ontake	PY3/PA-010	Morro Cantagalo	20037.55
JA6/KG-167	Kogajyajima	PY3/PA-055	PY3/PA-055	20037.54
JA6/KG-138	Ooyama	PP5/RS-089	PP5/RS-089	20037.40
JA6/KG-177	Ontake	PY3/PA-020	PY3/PA-020	20036.58
JA6/KG-177	Ontake	PY3/PA-014	PY3/PA-014	20035.08

GM4EVS · 26 January 2025 00:01

To get a better feel for the problem, I decided to start with a simpler task.

As you may know, the Python ‘itertools’ library has a function called ‘combinations’. This can be used to generate all the S2S pairs based on ‘pick any 2 from 160,000 where order doesn’t matter.’

With just a few lines of code, you can get this working correctly at the outset with just 10 summits. Then you try 100, then a 1000.

As this is an O(N^2) problem, you quickly realise that the full 160,000 could be quite a long wait.

With a typical output record like (123567, 64517), this is some 15 bytes. As such, the full file for 12.8 billion pairs is going to be around 12.8 * 15 = 192 billion bytes.
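A minimal sketch of that experiment: count the pairs that ‘combinations’ actually produces for small N, then scale the size of one output record up to the full run. Summits are stood in for by plain integer indices, which is an assumption for illustration.

```python
from itertools import combinations


def count_pairs(n):
    """Count the pairs actually generated for n summits."""
    return sum(1 for _ in combinations(range(n), 2))


# Ten times the summits gives roughly a hundred times the pairs: O(N^2).
counts = {n: count_pairs(n) for n in (10, 100, 1000)}

record = str((123567, 64517))   # the example record quoted above
bytes_per_record = len(record)  # characters in the record, newline excluded
estimated_bytes = bytes_per_record * 12_800_000_000
```

A line terminator after each record would add a further byte per pair on top of this estimate.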

I got tired of waiting, and Ctrl-C’d the task

73 Dave

GM4LLD · 26 January 2025 00:25

Probably. But brute force is also fun!

There’s 177000 summits in the summitslist.csv file ISTR. You need to work out the distance for every summit against every other summit, which is approx 31 billion Haversines. That entails 7 transcendental functions, 4 FP mults and an FP add and FP subtract per Haversine.

I now prototype all code in Python as the language doesn’t get in the way of the problem. What you learn is that as Python is interpreted, it’s not fast for big iterative tasks. So the above makes me think I need to not do the Haversine in Python at least.

However, what is good is the simplicity and speed you can throw something together. Considering I’ve never written code to read a CSV file in Python and to handle the data, it took about 10mins to get something that loaded the summitslist.csv and started accessing the summits and work them against each other. 5 mins to install geopy and get something throwing out distances.

At this point you do the performance calc which said a long time.

The next thought was do I rewrite this in C++ which will go like the clappers but is a pain. Or is there an easy way to parallelise the task. It takes my crude code about 30 secs to check one summit against all the others on the 1ltr-computer Linux machine, a 2.5GHz i5. There’s a faster i7 on the same desktop that also has Python dev tools but is Windows so we don’t want to really use that. There’s a faster i7 Linux laptop but I’d have to walk upstairs to get it. So I started the slow code running and went and watched an episode of Vera from earlier this month and 3/4 of Match of the Day whilst ruminating.

The most fun way to solve the problem is to split the summitslist.csv into 106 files, each containing only 1 association, and write a wee Bash script to run 106 copies of the Python script in parallel on the AMD EPYC server. It will take much less time that way. And producing any kind of workload that gets all 80 cores to 100% for any appreciable time (more than a second or two) is most amusing. I should work out how long this would have taken on the first powerful computer I used at university, a PrIme 750 supermini/mainframe.
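The split by association can be sketched as a simple grouping step. The layout of summitslist.csv isn’t shown in the thread, so this assumes a plain list of summit references and takes the association to be the part before the “/”; writing each group to its own file and launching the scripts is left out.

```python
from collections import defaultdict


def group_by_association(summit_refs):
    """Group summit references by the association prefix before the '/'."""
    groups = defaultdict(list)
    for ref in summit_refs:
        association = ref.split("/", 1)[0]
        groups[association].append(ref)
    return dict(groups)


# References taken from this thread, used here only as sample input.
refs = ["GM/SS-104", "GM/SS-105", "GM/ES-006",
        "EA1/LU-019", "ZL3/CB-406", "ZL3/CB-101"]
groups = group_by_association(refs)
```

Associations differ a lot in size, so the per-association jobs would not all finish at the same time.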

ZL4NVW · 26 January 2025 03:16

Well, I suspect that the top item (ZL3/CB-101) on the list might not happen for a while. Not at greyline times at least!

https://climbnz.org.nz/nz/si/nelson-lakes/spenser-mountains/mt-una

Second on the list - ZL3/CB-406 - I will volunteer for, if anyone wants to try come the March equinox. It’s probably a 2-day walk in: day 1 about 20km to Top Hope Hut then a short day up onto the tops on day 2 for the activation. There’s even some nice little tarns to camp by nearby so two chances for both evening and morning grey-line.

EA1/LU-019 looks likely too - it has activations and maybe even a road to the top! Assume from the name that it’s a historic beacon site?

Who’s keen?

VK3AFW · 26 January 2025 06:24

Alex,
If Tim’s mast is significant then the Moon’s tidal bulging of the crust should be considered. Contacts under a full moon are better.

73
Ron
VK3AFW

EA2BD · 26 January 2025 07:19

No need for brute-force computer processing to confirm that EA1/LU-019 is a drive-on summit in a wind farm.

Good access to the top with a modern fire lookout tower, not an historic site though.

Ready for a record trial then?
In my S2S logs I have only VK (long path) and JA (not sure of the path), but no ZL yet.

73 Ignacio

MM0EFI · 26 January 2025 07:32

I did mention that to him in a WhatsApp, along with the gravitational pull from the sun.

Not heard back…

GM4EVS · 26 January 2025 07:43

Following an initial skirmish with this problem, reducing the amount of required computation is certainly worth looking at.

For someone in a given country, all the large S2S distances are likely to occur with summits that are, say, more than +/-150 degrees longitude away from your own country. With the UK, for example, the Greenwich meridian would probably make an adequate reference longitude for ‘filtering’ purposes.

As such, the list of S2S distance calculations falls dramatically. And, with a bit more testing, the 150 degree figure could probably be increased somewhat without missing anything distant.

With the filtered list of ‘far away’ summits, you could then check each of these against all the individual UK summits. A much smaller result set.
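A minimal sketch of that longitude filter. The (name, lat, lon) tuples and the sample coordinates are hypothetical, and the filter looks at longitude only, exactly as described above, so latitude plays no part in it.

```python
def longitude_gap(lon_a, lon_b):
    """Smallest absolute difference between two longitudes, 0 to 180 degrees."""
    diff = abs(lon_a - lon_b) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def far_away(summits, reference_lon=0.0, min_gap=150.0):
    """Keep summits at least min_gap degrees of longitude from the reference."""
    return [s for s in summits if longitude_gap(s[2], reference_lon) >= min_gap]


# Hypothetical (name, lat, lon) records; Greenwich is the reference longitude.
candidates = [
    ("near", 57.0, -3.4),
    ("east", -41.0, 172.5),
    ("west", -20.0, -160.0),
    ("mid", 35.0, 135.0),
]
kept = far_away(candidates)
```

Only the summits in the kept list would then be checked against each individual UK summit.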

AFAIK, in the days of yore, use of the Haversine-based (rather than Cosine-based) spherical distance algorithm provided a more accurate result.

The cosines of very small angles are all numbers just less than 1.0. Using low-precision tables, or even 32-bit REALs (about 7 DP), you don’t see much difference between them; this motivates the use of the Haversine algorithm.

For 64-bit REALs (about 15 DP), the precision of small-angle cosines is fine; you get the same distance result as the Haversine approach.
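The 64-bit claim can be checked with a short comparison of the two formulas. This is a sketch only: Python floats are 64-bit, the radius is the 6371km sphere, and the two points a few kilometres apart are hypothetical.

```python
import math

R_KM = 6371.0


def hav(x):
    """Haversine of an angle in radians."""
    y = math.sin(x / 2.0)
    return y * y


def haversine_km(lat1, lon1, lat2, lon2):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    a = hav(p2 - p1) + math.cos(p1) * math.cos(p2) * hav(dl)
    return 2.0 * R_KM * math.asin(math.sqrt(min(1.0, a)))


def cosine_km(lat1, lon1, lat2, lon2):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    c = (math.sin(p1) * math.sin(p2)
         + math.cos(p1) * math.cos(p2) * math.cos(dl))
    # Clamp so rounding can never take the argument outside acos's domain.
    return R_KM * math.acos(max(-1.0, min(1.0, c)))


# Two hypothetical points a few kilometres apart.
d_hav = haversine_km(56.0, -4.0, 56.03, -4.05)
d_cos = cosine_km(56.0, -4.0, 56.03, -4.05)
difference_km = abs(d_hav - d_cos)
```

At this separation the two results differ by far less than a millimetre; the gap only becomes noticeable for points much closer together than any pair of summits.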

73 Dave

PS

In Python, I have used this for many years, where x is in radians.

    import math

    def hav(x):
        '''Haversine function'''
        y = math.sin(x / 2.0)
        return y * y

(with appropriate indentation which I have yet to master here)