## developing a notational comma popularity metric

Dave Keenan
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

### Re: developing a notational comma popularity metric

cmloegcmluin wrote:
Fri Sep 11, 2020 1:40 am
Is that another puzzle? I don't quite see the relationship between sqrt and the log-log plot.
Old modeller's trick. A straight line on a log-log plot means a power function, with the slope being the power, in this case 2. So sqrt is its inverse, to convert N2D3P9 back to sopfr.
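
A minimal sketch of the log-log trick, using made-up data rather than the actual N2D3P9 and sopfr values. The function name `loglog_slope` and the sample numbers are assumptions for illustration only. It fits a straight line to log(y) against log(x) and reads the power off the slope:

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) plotted against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den

# Hypothetical data following y = 3 * x^2 (not real comma data).
xs = [5, 7, 10, 12, 16, 22]
ys = [3.0 * x ** 2 for x in xs]

slope = loglog_slope(xs, ys)  # close to 2, the power

# Because the power is 2, sqrt undoes it (up to the constant factor 3).
recovered = [math.sqrt(y / 3.0) for y in ys]
```

The constant factor only shifts the line up or down on the log-log plot; it does not change the slope.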

cmloegcmluin
Posts: 721
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer

### Re: developing a notational comma popularity metric

Oh okay, yeah, that makes sense. Cool trick!

But would you prefer sqrt to lb for any reason? Otherwise, I don't see that we require a substitute.

Dave Keenan

### Re: developing a notational comma popularity metric

cmloegcmluin wrote:
Fri Sep 11, 2020 6:31 am
But would you prefer sqrt to lb for any reason? Otherwise, I don't see that we require a substitute.
I think we need to reconfigure the LHC as an electron positron collider.

By which I mean you could use similar methods to those you employed in finding N2D3P9, but the set of functions to be tried would be much smaller, and we would not be minimising a sum of squared errors but maximising the count of "correct" comma assignments below the half-apotome in the Extreme JI notation.

The set of functions can be summarised as:
compress(N2D3P9) + t × expand1(ATE) + s × expand2(AAS)

We would try various functions for "compress" such as lb(N2D3P9) and sqrt(N2D3P9) and perhaps a parameterised N2D3P9^a where 0<a≤1/2. And the inverse of those functions would be candidates for "expand1" and "expand2".
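
As a rough sketch of how that family of candidates might be laid out in code: the names `COMPRESS`, `EXPAND`, `power` and `score`, the example exponent 0.4, and the sample inputs are all assumptions, not anything from the actual code base. ATE and AAS are treated simply as non-negative numbers here.

```python
import math

def power(a):
    """Return the function x -> x^a."""
    return lambda x: x ** a

A = 0.4  # hypothetical exponent in the range 0 < a <= 1/2

# Candidate "compress" functions for N2D3P9.
COMPRESS = {
    "lb": math.log2,
    "sqrt": math.sqrt,
    "pow_a": power(A),
}

# Their inverses, as candidates for "expand1" and "expand2".
EXPAND = {
    "exp2": lambda x: 2.0 ** x,   # inverse of lb
    "square": power(2),           # inverse of sqrt
    "pow_inv_a": power(1 / A),    # inverse of x^a
}

def score(n2d3p9, ate, aas, t, s, compress, expand1, expand2):
    """compress(N2D3P9) + t * expand1(ATE) + s * expand2(AAS)."""
    return compress(n2d3p9) + t * expand1(ate) + s * expand2(aas)

# Example with made-up inputs: lb(16) + 0.5 * 2^3 + 0.25 * 2^2 = 4 + 4 + 1
example = score(16, 3, 2, t=0.5, s=0.25,
                compress=COMPRESS["lb"],
                expand1=EXPAND["exp2"],
                expand2=EXPAND["exp2"])
```

A search would then loop over the combinations of compress, expand1 and expand2, tuning t and s (and a, where used) for each.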

cmloegcmluin

### Re: developing a notational comma popularity metric

Dave Keenan wrote:
Fri Sep 11, 2020 5:43 pm
cmloegcmluin wrote:
Fri Sep 11, 2020 6:31 am
But would you prefer sqrt to lb for any reason? Otherwise, I don't see that we require a substitute.
I think we need to reconfigure the LHC as an electron positron collider.
Aw man, of course that's what you would've meant. I knew this was the goal but something about the word "substitute" threw me off. Maybe if you'd said "alternative" it'd've felt more to me like we were still keeping the lb around too. Anyway, not your fault; if I'd thought about it just a bit more it should've been obvious.

And no big deal. Don't worry: even when I create a bit of unnecessary back-and-forth on this, I've still got plenty of work to do in the code base during the interim.
By which I mean you could use similar methods to those you employed in finding N2D3P9, but the set of functions to be tried would be much smaller, and we would not be minimising a sum of squared errors but maximising the count of "correct" comma assignments below the half-apotome in the Extreme JI notation.
Sure. I could take the straight count of corrects, or I could take some weighted correctness. Like, if you're not correct, the square of the difference between your usefulness score and the correct comma's usefulness? The latter has a bit more of the SED feel from the approach we took with the popularity metric; I think it's just going to have a more aggressively stepped shape.

Another thing to consider is whether we care more about getting the lower precision symbols right. I might suggest we proportion the penalty to the size of the symbol's zone.
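
To make those options concrete, here is a hedged sketch. The `Zone` structure, the comma labels, and the assumption that a lower score wins a zone are all hypothetical, not taken from the real code.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class Zone:
    scores: Dict[str, float]  # candidate comma label -> score under the metric
    actual: str               # the comma actually assigned in the notation
    width: float              # size of the symbol's zone

def best(zone):
    """Candidate the metric would pick (assumes lower score is better)."""
    return min(zone.scores, key=zone.scores.get)

def count_correct(zones):
    """Option 1: straight count of zones where the metric picks the actual comma."""
    return sum(1 for z in zones if best(z) == z.actual)

def squared_penalty(zones, weight_by_width=False):
    """Option 2: squared score gap between the actual comma and the metric's pick.

    A correct zone contributes 0. With weight_by_width, the penalty is
    proportioned to the size of the symbol's zone.
    """
    total = 0.0
    for z in zones:
        gap = z.scores[z.actual] - z.scores[best(z)]
        weight = z.width if weight_by_width else 1.0
        total += weight * gap ** 2
    return total

# Made-up zones: the first is matched, the second is not.
zones = [
    Zone(scores={"a": 1.0, "b": 3.0}, actual="a", width=2.0),
    Zone(scores={"c": 5.0, "d": 2.0}, actual="c", width=4.0),
]
```

The count is to be maximised and the penalty minimised, so an optimiser would need to flip the sign of one of them to compare like with like.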
The set of functions can be summarised as:
compress(N2D3P9) + t × expand1(ATE) + s × expand2(AAS)

We would try various functions for "compress" such as lb(N2D3P9) and sqrt(N2D3P9) and perhaps a parameterised N2D3P9^a where 0<a≤1/2. And the inverse of those functions would be candidates for "expand1" and "expand2".
Yes, that all makes sense.

Dave Keenan

### Re: developing a notational comma popularity metric

cmloegcmluin wrote:
Sat Sep 12, 2020 1:09 am
Sure. I could take the straight count of corrects, or I could take some weighted correctness. Like, if you're not correct, the square of the difference between your usefulness score and the correct comma's usefulness? The latter has a bit more of the SED feel from the approach we took with the popularity metric; I think it's just going to have a more aggressively stepped shape.

Another thing to consider is whether we care more about getting the lower precision symbols right. I might suggest we proportion the penalty to the size of the symbol's zone.
Those are both excellent ideas.

cmloegcmluin

### Re: developing a notational comma popularity metric

No update on this front yet. I'm still buried in chores, cleaning up the code base. I could always plunge ahead and add more stuff, but sometimes I reach a tipping point and just feel like if I don't pay down some of the tech debt I'll go nuts.

I did come across this interesting number yesterday: Mills' constant. It's related to prime numbers, and similar to the actual r we were finding for the votes on comma popularities (we were finding something close to -1.3, but in the end went with -1). I'm not suggesting we need to reconsider any of the decisions made in developing N2D3P9, or even reference it on the wiki page, but I did think it was a curiosity worth sharing here.

Dave Keenan

### Re: developing a notational comma popularity metric

cmloegcmluin wrote:
Mon Sep 14, 2020 5:23 am
No update on this front yet. I'm still buried in chores, cleaning up the code base. I could always plunge ahead and add more stuff, but sometimes I reach a tipping point and just feel like if I don't pay down some of the tech debt I'll go nuts.
Fair enough.
I did come across this interesting number yesterday: Mills' constant. It's related to prime numbers, and similar to the actual r we were finding for the votes on comma popularities (we were finding something close to -1.3, but in the end went with -1). I'm not suggesting we need to reconsider any of the decisions made in developing N2D3P9, or even reference it on the wiki page, but I did think it was a curiosity worth sharing here.
I assume you're referring to the Zipf's law exponent, which we called "z", that you found to be approximately -1.37. While Mills' constant is fascinating, I don't see how it could possibly relate in any way to our "z". For one thing, z is an exponent and Mills' constant is a base. For another, Mills' constant only generates a single prime greater than 2 of any musical relevance, namely 11. And there's nothing special about the middle exponent of 3 that generates Mills' constant, so there are many similar constants.

At first Mills' and similar constants seem almost magical, but in fact they are merely a way of encoding a (very sparse) list of primes into the continuing digits of their decimal fraction, since there is no way to obtain their value without first finding the primes.

This can be made more explicit, and used to generate all primes, with the unnamed constant and recursive function described here: https://en.wikipedia.org/wiki/Formula_f ... all_primes

You might as well just list the primes. The constant does provide some data-compression, if you don't mind doing the work to unpack it. But then we already have efficient algorithms to "unpack" the primes from nothing.
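
A small sketch of the "encoding" point, assuming only the first four known Mills primes and ordinary floating-point arithmetic. The variable names are made up for illustration. The constant is estimated from the primes themselves, and the first few primes are then "unpacked" from that estimate with floor(A^(3^n)):

```python
import math

# The first four Mills primes. The constant is built from these, not the other way round.
MILLS_PRIMES = [2, 11, 1361, 2521008887]

# The n-th Mills prime p_n satisfies p_n = floor(A^(3^n)), so p_n^(1/3^n)
# is an estimate of A that improves as n grows.
estimates = [p ** (1 / 3 ** n) for n, p in enumerate(MILLS_PRIMES, start=1)]

a = estimates[-1]  # roughly 1.30637788

# Unpack the first three primes again from the estimate.
# (Double precision is not enough to recover the fourth one this way.)
recovered = [math.floor(a ** (3 ** n)) for n in (1, 2, 3)]  # -> [2, 11, 1361]
```

Each extra prime you want out requires more digits of the constant, and those digits can only be obtained by finding that prime first.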

I wasn't aware of any of this until this morning. Thanks for an interesting excursion.