developing a notational comma popularity metric

User avatar
Dave Keenan
Site Admin
Posts: 1287
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia
Contact:

Re: developing a notational comma popularity metric

Post by Dave Keenan »

Do you have the z=1 SoS for these new metrics (with parameters optimised for z= -1)?

User avatar
cmloegcmluin
Site Admin
Posts: 984
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer
Contact:

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Dave Keenan wrote: Mon Aug 10, 2020 10:01 pm Do you have the z=1 SoS for these new metrics (with parameters optimised for z= -1)?
I can get those to you soon.

------

Sadly I woke up and my chunk count 5 run is still going, and only at 13% completion. I should have had it measure % completion in terms of estimated samples as well as known scope count, since some stretches of scopes have many more samples than others. Since it's running on my work laptop, performance may be impacted by my workday: developing web applications while remotely video-chatting constantly with my pair programming buddy already usually taxes the thing to its limits (or I may have to cancel the run if it compromises my ability to do my job).

It occurred to me to run a third experiment: how fast does the code run on my crummy 5-year-old personal Lenovo laptop, the dirt-cheap one I got in an emergency when my college laptop finally died in the middle of my web development bootcamp, which I now keep by my bed just for watching movies with my partner? Well, to my surprise, it ran almost as fast as my brand-new spiffy work Mac laptop! So there's something about my primary workhorse desktop PC, which I built myself for developing VR applications, that makes it no good at calculating a bunch of numbers in Node.js. I can print all the stats here for you, but in terms of memory, core count, and GHz, it would seem like the desktop computer should be the strongest machine... so I'm not sure why it runs less than half as well. Bummer that it's the one I've been using to run these commands all this time!

I could restart the chunk count 5 command on my personal laptop, then. The work laptop would have an 8-hour head start on it, but it's probably smart to do just in case. (Edit: I've done that.)

------

So I've determined that I was wrong about the 7x performance improvement from eliminating those hurtful parameters and tightening scopes. It was really more like ~3x. The extra ~2.5x was the difference between running it on my work laptop (while brewing my morning coffee) vs. running it downstairs on my desktop PC.

If we got another 2x improvement from improving performance per the profiler (let's be liberal), we'd still only be at a total 3×2×2 = 12x improvement. To get a ~75-day 5-chunk run down to a few days, we'd still need a 25x improvement. Perhaps the shift from a resolution of 0.1 to 0.333 would do the trick, but I doubt it.

------

So here's another thing I could try: a single 10-chunk metric, like a cwyksxl, or a wyblaux, etc., combining basically everything we've found to be useful so far, with some reasonable ranges for each parameter value, just to see what the lowest SoS it can come up with is. Such a thing would not take too long to run, since it's just a single metric. Perhaps you're not interested in such a thing with so many chunks; it sounds like 5 chunks is either your max or your sweet spot.


Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Here's the updated table, with z=1 SoS for the new metrics I found (ak, kj, aux, wab, and lak... lak being the best one among the ones which combined lb and gpf in various ways).

Metric name   Lowest SoS found (z=-1)   SoS with z=1   Chunk count
sopfr         0.014206087               19845.0        1
k             0.009491243               18757.5        2
j             0.009100365               18637.5        2
kj            0.007969499               16814.5        3
ak            0.007593747               15492.5        3
wyk           0.007460443               17077.5        5
wb            0.007345361               16520.5        3
cwyk          0.007300195               16890.5        7
kl            0.006970591               17464.5        4
aux           0.006815232               15516.5        4
wab           0.006802205               15276.5        4
wyks          0.006406639               14125.5        6
hyg           0.006372713               17867.5        5
lak           0.006210745               15550.5        4
wyb           0.006057649               15638.5        4
xwyks         0.00553892                14309.5        7
cwyks         0.004059522               13440.5        8


Re: developing a notational comma popularity metric

Post by cmloegcmluin »

FYI: As an experiment on my lunch break, I wanted to see how much faster the code would get if I ripped out the asynchronous stuff. (It took me all of 5 minutes to rip it all out, by the way.) At least for chunk count 2, it was a 3x speedup, which is quite a bit more than I would have expected, given that the profiler identified only 30% of the runtime as being spent on asynchronous management.

(So I ctrl-c'd the run on my work laptop and restarted it with this change)


Re: developing a notational comma popularity metric

Post by Dave Keenan »

Good work cutting down the 5-chunk time!

I feel like I'm Peter Higgs, but you've built the Large Hadron Collider. :) The Large Function Collider?

I will add my two recent metrics to the table, and post it below. I will call them "c" and "cef", where the latter is:

cef(n, d) = soapfr(n*d) + c*copfr(d), where ap = if p<e then p else e + f*(p-e)

c = -1.814421168, e=15.24816546, f=0.354541887, SoS = 0.006847526, SoS(1) = 13523, 5 chunks.

It's the low SoS(1) with the low chunk count, that impresses me about cef.
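In Node.js terms, a sketch of that cef definition might look like the following (helper names like primeFactors, soapfr, and copfr are my own illustrative choices, not from the actual search code):

```javascript
// Sketch of cef(n, d) = soapfr(n*d) + c*copfr(d),
// where ap = p if p < e, else e + f*(p - e).
// Parameter values as given above.
const C = -1.814421168;
const E = 15.24816546;
const F = 0.354541887;

// Prime factorization with repetition, e.g. 12 -> [2, 2, 3].
function primeFactors(n) {
  const factors = [];
  for (let p = 2; p * p <= n; p++) {
    while (n % p === 0) { factors.push(p); n /= p; }
  }
  if (n > 1) factors.push(n);
  return factors;
}

// The "altered prime" ap: identity below the cutoff e, compressed above it.
function ap(p) {
  return p < E ? p : E + F * (p - E);
}

// Sum of altered prime factors, with repetition.
function soapfr(n) {
  return primeFactors(n).reduce((sum, p) => sum + ap(p), 0);
}

// Count of prime factors without repetition (distinct primes).
function copfr(n) {
  return new Set(primeFactors(n)).size;
}

function cef(n, d) {
  return soapfr(n * d) + C * copfr(d);
}
```

For example, cef(5, 1) is just soapfr(5) = 5, since copfr(1) = 0.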

Would you please give equations like the above for your new metrics, giving the soapfar etc submetrics and how they go together with the parameters. I find I can't unambiguously parse your descriptions.

Have you updated the chunk counts in the table above, to agree with your definition? If not, please update them in the table below.


Re: developing a notational comma popularity metric

Post by Dave Keenan »

I've resorted the table on SoS(1).

Metric name   Lowest SoS found (z=-1)   SoS with z=1   Chunk count
sopfr         0.014206087               19845.0        1
k             0.009491243               18757.5        2
j             0.009100365               18637.5        2
c             0.009004                  18153          3
hyg           0.006372713               17867.5        5
kl            0.006970591               17464.5        4
wyk           0.007460443               17077.5        5
cwyk          0.007300195               16890.5        7
kj            0.007969499               16814.5        3
wb            0.007345361               16520.5        3
wyb           0.006057649               15638.5        4
lak           0.006210745               15550.5        4
aux           0.006815232               15516.5        4
ak            0.007593747               15492.5        3
wab           0.006802205               15276.5        4
xwyks         0.00553892                14309.5        7
wyks          0.006406639               14125.5        6
cef           0.006847526               13523          5
cwyks         0.004059522               13440.5        8
μyδ           0.00338924                13296.5        >20


Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Dave Keenan wrote: I feel like I'm Peter Higgs, but you've built the Large Hadron Collider. :) The Large Function Collider?
What I've built is complex, but honestly, pretty dumb. Not that I'm saying I'm dumb for building it. I'm just saying that my approach is fundamentally one of brute force. The artificial life type stuff the Excel solver is doing is a more intellectually interesting approach. There are clearly immense depths of thought-provoking possibilities for automating searches intelligently in such problem spaces which I only have the foggiest notion of at this time. One day I'd like to master this stuff. Perhaps one day I can look back fondly on this project and chuckle at myself a bit.
Dave Keenan wrote: Have you updated the chunk counts in the table above, to agree with your definition? If not, please do.
The chunk counts in those columns already agree with my conception, except for cef, which I'd give 6 (sopfr, copfr, c, e, f, and what I would call a j = 0, though you can call it whatever you need to get copfr to apply only to d).

Is "c" just sopfr(n*d) + c*copfr(d), then? If so, I would give that 4 chunks.
Dave Keenan wrote: Would you please give equations like the above for your new metrics, giving the soapfar etc. submetrics and how they go together with the parameters. I find I can't unambiguously parse your descriptions.
How're these:

wab: lb(n^a + w) + lb(d^a + b), a = 2.0791902, w = -0.2209040, b = -1.9497175
aux: lb((n + x)^a) + lb((d + u)^a), a = 1.9713225, x = -0.4582444, u = -1.5128479
ak: lb(n^a) + k⋅lb(d^a), a = 2.0717828, k = 0.7981481
kj: sopfr(n)^j + sopfr(d)^k, j = 1.3673258, k = 1.4690207
lak: lb(n^a) + lb(d^a)^k + gpf(n,d)^a, a = 0.6165725, k = 0.5242938

Please verify these for me on your end.
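In case it helps the verification, here are two of those as literal Node.js functions, assuming I have the reading right (the trailing a and k as exponents, lb as log base 2). These are illustrative sketches only, not the actual search code:

```javascript
// Literal sketches of two of the simplified metrics, taking the
// trailing parameters as exponents and lb as log base 2.
const lb = Math.log2;

// wab: lb(n^a + w) + lb(d^a + b)
function wab(n, d, a = 2.0791902, w = -0.2209040, b = -1.9497175) {
  return lb(Math.pow(n, a) + w) + lb(Math.pow(d, a) + b);
}

// ak: lb(n^a) + k*lb(d^a)
function ak(n, d, a = 2.0717828, k = 0.7981481) {
  return lb(Math.pow(n, a)) + k * lb(Math.pow(d, a));
}
```

As a sanity check, ak(2, 1) should come out to a = 2.0717828 (to floating-point precision), since lb(2^a) = a and lb(1^a) = 0.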


Re: developing a notational comma popularity metric

Post by Dave Keenan »

So the prime factorisations, i.e. the monzos, aren't used by the first 3?


Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Oh, uh... no, they are. I guess it's been my assumption that whenever we show n or d here, it's short for n_p and d_p, respectively, except in the case of gpf, as you pointed out before. I just didn't want to put the whole mess of the big sigma inline for each one of these here. Hopefully I haven't misunderstood something subtle about how soapfr where ap is lb(p) can be simplified to just lb.
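To spell out the simplification I mean: soapfr with ap = lb(p) sums eᵢ·lb(pᵢ) over the factorization, which equals lb(∏ pᵢ^eᵢ) = lb(n). A throwaway Node.js check of that identity (primeFactors is an assumed helper here, not from the codebase):

```javascript
// Checking that soapfr with ap = lb(p) collapses to plain lb:
// the sum of e_i * lb(p_i) over the factorization equals lb(prod p_i^e_i) = lb(n).

// Prime factorization with repetition, e.g. 360 -> [2, 2, 2, 3, 3, 5].
function primeFactors(n) {
  const factors = [];
  for (let p = 2; p * p <= n; p++) {
    while (n % p === 0) { factors.push(p); n /= p; }
  }
  if (n > 1) factors.push(n);
  return factors;
}

// soapfr with the altered prime ap(p) = lb(p), summed with repetition.
function soapfrLb(n) {
  return primeFactors(n).reduce((sum, p) => sum + Math.log2(p), 0);
}
```

e.g. soapfrLb(360) and Math.log2(360) agree to floating-point precision. Of course, once there's an offset inside the lb, as in lb(n^a + w), the sum over primes no longer collapses this way.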


Re: developing a notational comma popularity metric

Post by Dave Keenan »

Yeah. I think you've overgeneralised that. Could you please give the un-"simplified" versions of these metrics, with the ap = and ar = definitions spelled out? No sigmas required, just soapfars.
