developing a notational comma popularity metric

User avatar
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)
Contact:

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

That meets my definition of "not fruitful", as in: I could add it to the code, but it would only waste resources. Thanks for illustrating.
User avatar
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

cmloegcmluin wrote: Sun Jul 12, 2020 2:37 am Without further ado, I gotcha down to 0.006136416. Unfortunately my code does not yet support locking the value of a given parameter across all submetrics it is used in, but that's on the agenda for today. In other words, my result has caused v and y to deviate: y=0.8735, v=0.8901, w=-1.4957, b=-2.0463. So perhaps you're uninterested in that result.
Yeah. I'm interested in the 3-parameter case. I look forward to your future result.

In the meantime, I found the following minimum for the above-mentioned 4-parameter (6-chunk) metric.
g×lb(n) + h×lb(d) + sopfar(n) + k×sopfar(d), where ar = r → r^y

g = 0.6385, h = -1.6518, k = 1.5289, y = 0.8023, SoS = 0.006700181

This is a different use of g and h from pages of this thread prior to the last two.
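
For concreteness, here's a minimal TypeScript sketch of sopfar and this metric. The primeFactorize helper and all names are hypothetical, and the 2,3-reduction of the ratios is assumed to have already been applied to n and d:

```typescript
// lb() is the binary logarithm.
const lb = Math.log2;

// Hypothetical helper: prime factorization as [prime, repeatCount] pairs.
function primeFactorize(n: number): Array<[number, number]> {
    const factors: Array<[number, number]> = [];
    for (let p = 2; p * p <= n; p++) {
        let r = 0;
        while (n % p === 0) { n /= p; r++; }
        if (r > 0) factors.push([p, r]);
    }
    if (n > 1) factors.push([n, 1]);
    return factors;
}

// sopfar: sum of prime factors with altered repetitions, ar = r → r^y.
function sopfar(n: number, y: number): number {
    return primeFactorize(n).reduce((sum, [p, r]) => sum + p * r ** y, 0);
}

// The 4-parameter (6-chunk) metric above, at the minimum found.
function metric6Chunk(n: number, d: number): number {
    const g = 0.6385, h = -1.6518, k = 1.5289, y = 0.8023;
    return g * lb(n) + h * lb(d) + sopfar(n, y) + k * sopfar(d, y);
}
```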
User avatar
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

The lowest SoS I can find by eliminating any one parameter from the above is when I eliminate k, obtaining this 3-parameter 5-chunk metric:
g×lb(n) + h×lb(d) + sopfar(nd), where ar = r → r^y

g = 0.6845, h = -0.3909, y = 0.8006, SoS = 0.006926345

That's not as good as this earlier 3-parameter 4-chunk metric:
soa1pfar(n) + soa2pfar(d), where a1p = p → lb(p)+w, a2p = p → lb(p)+b, ar = r → r^y

w = −1.4457, b = −1.8630, y = 0.8568, SoS = 0.006282743
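
A corresponding sketch of soa1pfar/soa2pfar, reusing the hypothetical primeFactorize and lb helpers from the sketch above:

```typescript
// soapfar: sum of altered prime factors with altered repetitions,
// here with ap = p → lb(p) + shift and ar = r → r^y.
function soapfar(n: number, shift: number, y: number): number {
    return primeFactorize(n)
        .reduce((sum, [p, r]) => sum + (lb(p) + shift) * r ** y, 0);
}

// The 3-parameter 4-chunk metric above, with w on n and b on d.
function metric4Chunk(n: number, d: number): number {
    const w = -1.4457, b = -1.8630, y = 0.8568;
    return soapfar(n, w, y) + soapfar(d, b, y);
}
```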
User avatar
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

But I'm now thinking that
soa1pfar(n) + soa2pfar(d), where a1p = p → lb(p)+w, a2p = p → lb(p)+b, ar = r → r^y
is really a 5-chunk metric too.
1. soapfar()
2. lb()
3, 4, 5. w, b, y

And despite its higher SoS,
g×lb(n) + h×lb(d) + sopfar(nd), where ar = r → r^y
has the nice property that it can be described as a two-stage correction to good old sopfr(nd).

The first stage is the compression of the prime repeat-counts by raising them to the 0.8006 power. This only affects ratios with repeat-counts greater than 1.

The second stage is the addition of g×lb(n) + h×lb(d), which can also be written as:
lb(n^g/d^−h)
= lb(n^0.6845/d^0.3909)

I'm investigating whether these parameters can be rounded to simple ratios without suffering too much of a hit to the SoS. Maybe the first stage of correction can be r → r^(4/5) and the second stage can be to add \(\operatorname{lb}\Big(\sqrt[3]{\frac{n^2}{d}}\Big)\).
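
As a sketch in code (reusing the earlier hypothetical helpers; note that sopfar(nd) = sopfar(n) + sopfar(d) when n and d are coprime, as they are in a reduced ratio):

```typescript
// Two-stage correction to sopfr(nd), as described above.
function correctedSopfr(n: number, d: number): number {
    // Stage 1: compress the prime repeat-counts, r → r^0.8006.
    const y = 0.8006;
    const stage1 = sopfar(n, y) + sopfar(d, y); // = sopfar(nd) for coprime n, d
    // Stage 2: add g×lb(n) + h×lb(d) = lb(n^g/d^−h).
    const g = 0.6845, h = -0.3909;
    return stage1 + g * lb(n) + h * lb(d);
}
```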
User avatar
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

It turns out that with g = 2/3 and h = -1/3, r → r^(5/6) is better than r → r^(4/5).

So here's my favourite:

sopfar(nd) + 2/3 × lb(n) - 1/3 × lb(d), where ar = r → r^(5/6) and n ≥ d
= sopfar(nd) + (2×lb(n) - lb(d))/3
= sopfar(nd) + lb(cuberoot(n^2/d))

SoS = 0.006790864
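
A sketch of this favourite, on top of the earlier hypothetical helpers:

```typescript
// sopfar(nd) + lb(cuberoot(n^2/d)), where ar = r → r^(5/6) and n ≥ d.
function favouriteMetric(n: number, d: number): number {
    const y = 5 / 6;
    return sopfar(n, y) + sopfar(d, y) + lb(Math.cbrt(n ** 2 / d));
}
```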
User avatar
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)
Contact:

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Sorry for the delayed response. I have been immersed in the code all weekend. I will respond in more detail later.

My current update:

I spent Saturday cleaning up the code base. The conversion to TypeScript left a bunch of things in annoyingly disjointed states (including the bounds analysis module, and the modules which I developed for finding and analyzing tina commas).

I then spent today (Sunday) refactoring the automatic metric finder to work asynchronously (non-blocking), so that it can simultaneously push to and pop from the arrays of metric configurations to explore. Otherwise it would exceed the maximum heap size allotted to the JavaScript engine if it tried to prepare everything it needed in the populate-the-possibilities step before it started the next step of processing them all (searching for metrics).
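
A minimal sketch of that producer/consumer idea (all names hypothetical; Node's setImmediate is used to yield to the event loop):

```typescript
type MetricConfig = Record<string, number>;

const queue: MetricConfig[] = [];
const yieldToEventLoop = (): Promise<void> =>
    new Promise(resolve => setImmediate(resolve));

async function produce(configs: Iterable<MetricConfig>): Promise<void> {
    for (const config of configs) {
        queue.push(config);
        // Yield periodically so the consumer can drain the queue,
        // keeping memory bounded instead of materializing everything up front.
        if (queue.length % 1000 === 0) await yieldToEventLoop();
    }
}

async function consume(search: (config: MetricConfig) => void): Promise<void> {
    while (queue.length > 0) {
        search(queue.pop()!);
        await yieldToEventLoop();
    }
}
```

The two would be kicked off together, e.g. with Promise.all([produce(allConfigs), consume(searchForMetrics)]); a real implementation would also need the consumer to wait out moments when the queue is empty but the producer isn't finished.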

It then took me all of 5 minutes to apply the fix for the search space resolution I detailed recently. There was one change, though: it turns out that thousandths-place guaranteed accuracy is not very feasible even across 3 chunks (i.e. a maximum of 2 parameters), so I geared down to a hundredths-place guarantee.

As a consequence, I now have updated best 2- and 3-chunk metrics:

`(\text{sopfr}(n))^{1.0955} + \text{sopfr}(d)` gives an SoS of 0.009100971. That's the entire sopfr(n) raised to the power. It beats the 0.009491243 SoS I found with the 2-chunk metric which was just sopfr with k≈0.79. It is pretty fascinating that all you have to do is raise the sopfr of the numerator to a barely superlinear power and you get less than 2/3rds the SoS of the old standard (again, that's around 0.014206087)! It's also cool that it uses one of the parameters we only recently came up with (i.e. j or k as a power instead of a coefficient).
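
That 2-chunk metric as a one-liner, with the sketch helpers from earlier in the thread:

```typescript
// sopfr is sopfar with y = 1; the whole sopfr(n) is raised to the power.
const best2Chunk = (n: number, d: number): number =>
    sopfar(n, 1) ** 1.0955 + sopfar(d, 1);
```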

Well, I spoke too soon. I began drafting this message before my script completed. It took several hours, but then it came back with a 3-chunk metric which is not as good as the one I already knew about. So... perhaps something is wrong...

Or perhaps I spoke too soon about speaking too soon, as I just re-ran the 3-chunk one I reported previously (the one with the -1.11111 k on the copfr), and I got the exact same SoS as you got for it! 0.018902287. So my number must have been bogus somehow. I'm really not sure what I would have fixed in that aspect of the code between then and now, but... okay. So here's the 3-chunk one I just found:

`1.407 × \text{sopfar}(n) + \text{sopfar}(d)`, where ar = r → r^0.9666, gives SoS of 0.008543707

So... that's not too exciting I guess.

You have probably noticed that I put off confronting the deeper issues w/r/t parameter relationships that lead to blue threads of death. Unfortunately I'm about to be traveling for a little while, and I probably won't be able to resolve those before departing. Realistically I should be able to complete the smaller task of refactoring the search to be breadth-first rather than depth-first. I can dig back into this problem in a couple weeks, after my trip.

I feel you re: being heartily sick of this. My energy on this task has been off and on, as you know. I think a little time away from it will be good for me. I only wish I could have timed it so that during my trip I could have left running a script I was confident about, so I could come home to some fun results. I think if we want good answers for 5- or 6-chunk metrics it could take weeks for the scripts to complete.
User avatar
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

Dear Douglas,

Your latest had its greatest effect on me in your revelation that I could expect little progress from your end for "a couple weeks".

This galvanised me into searching desperately for some way the Excel Solver could be made to do Simulated Annealing, or a similar algorithm that would find global minima with a high degree of confidence.

During my searches I came across an article entitled New and Improved Solver. It contained the following words:
...the Evolutionary Solver can be used for any Excel formulas or functions, even when they are not linear or smooth nonlinear. Spreadsheet functions such as IF and VLOOKUP fall into this category.
And this Solver appears as standard in Excel 2010, the version immediately after the 2007 version I currently had. I say "had" because I coughed up the AU$199 that Microsoft wanted for Office 2019 so fast it would make your head spin. After downloading and installing, there was a moment of panic when I couldn't find any Solver in the new version! The help soon told me it was an optional Add-In (but included in the download), so I added it in.

Then I opened the same spreadsheet I've been using all along, which was set up for my current favourite, the 3-parameter metric described above. I opened the Solver, chose the Evolutionary algorithm and switched the cell-to-be-minimised from the smooth-but-approximate SoS, to the discontinuous-but-true SoS. Then I was told that the Evolutionary algorithm needed upper and lower bounds for all parameters, so in they went.

Then with my heart in my mouth I hit the "Solve" button. It ran for the default time or number of iterations and came back with a lower SoS than I had ever seen before (for that metric). What joy! So I hit "Solve" again, and again a lower SoS! A few more runs and there was no further change.

The result:
g = 0.692773045, h = -0.245303445, y = 0.827031908, SoS = 0.006372713 [hyg]

When I get a chance, I will revisit some old favourites to see how they compare.

All the best for your trip. :)

-- Your friend Dave
User avatar
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)
Contact:

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Yes, I'm sorry for the surprise notice. It is not a well-planned trip. We weren't sure we'd free the foster kittens of their ringworm affliction in time before my partner starts her new job. But we found an opening so we're taking a road trip up the coast.

AU$199? I guess that makes this your 2nd most expensive hobby, after that car of yours – you've put like a million dollars into it, right? ;)

How long does this New and Improved Evolutionary Solver Algorithm take to run? Sounds like not weeks or hours, but just moments!

That's a really nice result you found. Please correct me if I'm wrong, but your g and h here are coefficients on what in my code I would call a "poapfr" where ap = p → lb(p). You may have proved that equivalent to something my code is currently capable of, since you recommended that I not implement poapf(a)r. If it is equivalent to something in my code already, I'd prefer not to implement it as well, because any new parameter or metric I add, as I've explained, exponentially decreases my solver's ability to succeed. But I would like to check any of your work on my end. So even if we present it to people as lb(n), do we have a way of expressing this in terms of soapfar and coapfar? You had said:
Dave Keenan wrote: Sun Jul 12, 2020 12:37 pm For the same reason (just take the log of everything) I don't see any point in poapfar.
But I can't quite figure out myself how taking the log of poapfar would be equivalent to a soapfar. It sort of makes sense, because of shifting from multiplicative space to additive space, but there's something about it being broken across primes that makes me think that can't be true. Because the difference between `\log_a 25`, `\log_a 26`, and `\log_a 27` is going to be smoothly changing whatever your choice of `a`, while sopfr gives 10, 15, and 9. I don't think the correspondence gets any better or worse when you include the 5-roughness concept. I can write some tests to confirm this. Maybe I misunderstood what you meant about poapfar being irrelevant. Of course, if it's not, then I will certainly include it in my code and re-run my solver for 3-chunks.
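
A quick numeric check of that point, using the sketch helpers from earlier in the thread:

```typescript
// lb changes smoothly across 25, 26, 27, while sopfr jumps around.
for (const n of [25, 26, 27]) {
    console.log(n, lb(n).toFixed(3), sopfar(n, 1));
}
// 25 4.644 10   (5+5)
// 26 4.700 15   (2+13)
// 27 4.755 9    (3+3+3)
```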
User avatar
volleo6144
Posts: 81
Joined: Mon May 18, 2020 7:03 am
Location: Earth
Contact:

Re: developing a notational comma popularity metric

Post by volleo6144 »

cmloegcmluin wrote: Tue Jul 14, 2020 1:05 am But I can't quite figure out myself how taking the log of poapfar would be equivalent to a soapfar. It sort of makes sense, because of shifting from multiplicative space to additive space, but there's something about it being broken across primes that makes me think that can't be true. Because the difference between `\log_a 25`, `\log_a 26`, and `\log_a 27` is going to be smoothly changing whatever your choice of `a`, while sopfr gives 10, 15, and 9. I don't think the correspondence gets any better or worse when you include the 5-roughness concept. I can write some tests to confirm this. Maybe I misunderstood what you meant about poapfar being irrelevant. Of course, if it's not, then I will certainly include it in my code and re-run my solver for 3-chunks.
...I'm sure there isn't any: even if you 2,3-reduce the ratios, `49^a` is still between `47^a` and `53^a`.
I'm in college (a CS major), but apparently there's still a decent amount of time to check this out. I wonder if the main page will ever have 59edo changed to green...
User avatar
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

cmloegcmluin wrote: Tue Jul 14, 2020 1:05 am But we found an opening so we're taking a road trip up the coast.
Sounds great. Janelle and I did a weekend trip up there in the summer of '94, when I was working at Interval Research. Looking at the map now, I can't remember where we went. We can't have got far. Maybe Mendocino. But I remember it was beautiful.
AU$199? I guess that makes this your 2nd most expensive hobby, after that car of yours – you've put like a million dollars into it, right? ;)
Hah! I'd been perfectly happy with the capabilities and familiar user-interface of Excel 2007 for the past 12 years, and I wasn't in any hurry to upgrade to Office 365 (now called Microsoft 365), since its name refers to the fact that they bill you for it all over again, every 365 days. But for an evolutionary or GA solver I was willing to consider it. I looked at LibreOffice Calc, but its solver was pathetic: linear only. I considered second-hand Excels, but I couldn't trust their licensing. Then I learned that Microsoft was again offering a one-off-payment version of Office, and I was in like a shot.
How long does this New and Improved Evolutionary Solver Algorithm take to run? Sounds like not weeks or hours, but just moments!
Yes, each of the four or five runs took about 30 seconds. There's a good article here on its limitations. It isn't quite magic. But it's close enough. :)
That's a really nice result you found. Please correct me if I'm wrong, but your g and h here are coefficients on what in my code I would call a "poapfr" where ap → lb(p).
No. g×lb(n) + h×lb(d) + sopfar(nd), where ar = r → r^y
But I would like to check any of your work on my end. So even if we present it to people as lb(n), do we have a way of expressing this in terms of soapfar and coapfar?
Yes. Because lb(n) = soapfr(n), where ap = p → lb(p): since n = ∏p^r across its prime factors, lb(n) = Σ r×lb(p).
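
A sketch of that identity, with the hypothetical helpers from earlier in the thread:

```typescript
// soapfr with ap = p → lb(p): since n = ∏ p^r, Σ r×lb(p) = lb(n).
function soapfrLb(n: number): number {
    return primeFactorize(n).reduce((sum, [p, r]) => sum + lb(p) * r, 0);
}
console.log(soapfrLb(360), lb(360)); // both ≈ 8.4919, since 360 = 2^3 × 3^2 × 5
```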