## developing a notational comma popularity metric

- cmloegcmluin
- Site Admin
**Posts:** 535 **Joined:** Tue Feb 11, 2020 3:10 pm **Location:** San Francisco, California, USA **Real Name:** Douglas Blumeyer

### Re: developing a notational comma popularity metric

The simple copfr experiment was not fruitful. I couldn't find any example of a weight > 0 on it helping. Cool idea, but it doesn't look like it's our winner.

- cmloegcmluin
- Site Admin

### Re: developing a notational comma popularity metric

> Dave Keenan wrote: ↑Tue Jun 30, 2020 10:14 am
>
> ...I'm hot on the trail of a prime weighting function that may eliminate the need to include prime-limit/gpf in the mix. It looks a lot like log_{3}(p)-1.

Oh, interesting! So you adjust each prime by a constant as well (in this case -1)? I'll play around with that too.

I don't think it makes as much sense to try that on the term.

- Dave Keenan
- Site Admin
**Posts:** 802 **Joined:** Tue Sep 01, 2015 2:59 pm **Location:** Brisbane, Queensland, Australia

### Re: developing a notational comma popularity metric

I did a thing where I left prime-limit out of the mix (forced s=0), but I kept y, where the number of repetitions r of a given prime is replaced with r^{y} where y is slightly less than one. And when summing the r^{y}'s for each prime p, instead of weighting each according to p^{a} or log_{α}(p) I let the solver adjust the weights of all the primes separately, along with k and y, to minimise the sum of squared errors in the reciprocals of the ranks (with my usual fudge to avoid sorting).

Then I plotted these prime weights against the primes and saw that it did indeed look roughly logarithmic. It turned out it was an even better fit to a function of the form log_{α}(p)+w.

BTW, changing that variable name from α to e is a really bad idea, Mexican food notwithstanding, because using e as a variable log base is exactly the place where it would be confused with the constant log base e ≈ 2.718.

[Edit: Oops! You suggested epsilon ε, not e. That wouldn't be so bad, except it is traditionally used to represent an infinitesimal.]

So I then weighted each r^{y} as (log_{α}(p)+w) × r^{y} before summing them. This was done separately for the numerator and denominator as usual, which were then summed as usual, with the smaller term first being multiplied by k. And I added the prime-limit back in (multiplied by s). Here are some approximate optima I found. The underlined numbers were held constant.

| α | w (your d) | k | y | s | SoS |
|---|---|---|---|---|---|
| 3.956349187 | -0.619217685 | 0.638243216 | 0.883788532 | 0.020609268 | 0.006160415 |
| 3 | -0.774993871 | 0.638278131 | 0.883803886 | 0.025836729 | 0.006160415 |
| 2.718281828 | -0.851411926 | 0.638277637 | 0.883804124 | 0.028385603 | 0.006160415 |
| 3.018652175 | -0.904768274 | 0.618447635 | 0.874496057 | 0 | 0.007488211 |
| 3 | -0.909855998 | 0.618460475 | 0.874485023 | 0 | 0.007488211 |
| 3 | -1 | 0.67017005 | 0.955080391 | 0 | 0.008473958 |

I'm keen to know how low you can get the SoS in the vicinity of these results.
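For anyone wanting to reproduce the calculation, here is a minimal Python sketch of the metric as described in this post. The function names (`prime_factors`, `weighted_sum`, `metric`) are hypothetical, chosen only for illustration. It assumes that "the smaller term" means the smaller of the numerator and denominator sums, and that the prime limit is the greatest prime appearing in the ratio. It does not attempt the rank-based SoS fit, only the per-ratio score.

```python
import math


def prime_factors(n: int) -> dict[int, int]:
    """Return {prime: repetition count r} for a positive integer n."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def weighted_sum(n: int, alpha: float, w: float, y: float) -> float:
    """Sum of (log_alpha(p) + w) * r**y over the prime factors of n."""
    return sum((math.log(p, alpha) + w) * r**y for p, r in prime_factors(n).items())


def metric(num: int, den: int, alpha: float, w: float, k: float, y: float, s: float) -> float:
    """Larger weighted sum, plus k times the smaller one, plus s times the prime limit."""
    a = weighted_sum(num, alpha, w, y)
    b = weighted_sum(den, alpha, w, y)
    # Assumption: prime limit = greatest prime in numerator or denominator (0 for 1/1).
    primes = set(prime_factors(num)) | set(prime_factors(den))
    prime_limit = max(primes, default=0)
    return max(a, b) + k * min(a, b) + s * prime_limit
```

A call such as `metric(7, 5, alpha=3.956, w=-0.619, k=0.638, y=0.884, s=0.0206)` uses rounded values from the first row of the table; the ratio 7/5 is just an example input.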

- Dave Keenan
- Site Admin

### Re: developing a notational comma popularity metric

> cmloegcmluin wrote: ↑Tue Jun 30, 2020 11:15 am
>
> The simple copfr experiment was not fruitful. I couldn't find any example of a weight > 0 on it helping. Cool idea, but it doesn't look like it's our winner.

No, but the fact that a negative weight on it helped, was interesting. I realised that if instead of the simple c × copfr(n/d) you split it into

w×copfr^{y}(n) + k×w×copfr^{y}(d), then when you add it to the existing

sop_{α}fr^{y}(n) + k×sop_{α}fr^{y}(d), where p_{α} = log_{α}(p), you get

sop_{αw}fr^{y}(n) + k×sop_{αw}fr^{y}(d), where p_{αw} = log_{α}(p)+w.

[Edit: I changed the first occurrence of "w × copfr(n/d)" above to "c × copfr(n/d)", because the first is a coefficient of copfr while those that follow are coefficients of copfr^{y}.]
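Written out term by term, the consolidation needs only distributivity. The sketch below assumes n = ∏ p_i^{r_i}, that copfr^{y}(n) means the sum of the r_i^{y}, and that sop_{α}fr^{y}(n) means the sum of log_{α}(p_i) × r_i^{y}; the summation notation is introduced here for illustration and is not used elsewhere in the thread.

```latex
\begin{align*}
\operatorname{sop}_{\alpha}\operatorname{fr}^{y}(n) + w \cdot \operatorname{copfr}^{y}(n)
  &= \sum_{i} \log_{\alpha}(p_i)\, r_i^{\,y} \;+\; w \sum_{i} r_i^{\,y} \\
  &= \sum_{i} \bigl(\log_{\alpha}(p_i) + w\bigr)\, r_i^{\,y} \\
  &= \operatorname{sop}_{\alpha w}\operatorname{fr}^{y}(n)
\end{align*}
```

The same step applies to the denominator d, with both of its terms carrying the common factor k.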

> cmloegcmluin wrote: ↑Tue Jun 30, 2020 11:18 am
>
> Oh, interesting! So you adjust each prime by a constant as well (in this case -1)? I'll play around with that too.

Yes. Although I found the optimum to be around -0.8 to -0.9.

> I don't think it makes as much sense to try that on the term.

Not sure what you mean by that. But I should mention that I also tried weighting the prime exponent *before* raising it to the power y, i.e. ((log_{α}(p)+w)×r)^{y}, but I couldn't get the SoS quite as low with that as with (log_{α}(p)+w)×r^{y}.
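The two weightings compared above differ only in whether the prime weight sits inside or outside the power y. A small sketch of the per-prime term in each case follows; the function names are hypothetical, and it assumes log_{α}(p)+w is positive so that the fractional power stays real.

```python
import math


def term_weight_outside(p: int, r: int, alpha: float, w: float, y: float) -> float:
    """(log_alpha(p) + w) * r**y : weight applied after raising r to the power y."""
    return (math.log(p, alpha) + w) * r**y


def term_weight_inside(p: int, r: int, alpha: float, w: float, y: float) -> float:
    """((log_alpha(p) + w) * r)**y : weight applied before raising to the power y."""
    return ((math.log(p, alpha) + w) * r) ** y
```

The two agree when y = 1; otherwise the "inside" form equals the "outside" form multiplied by (log_{α}(p)+w)^{y-1}, so for y < 1 it compresses the differences between prime weights as well as between repetition counts.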

- cmloegcmluin
- Site Admin

### Re: developing a notational comma popularity metric

I have to go make dinner now but I will respond to your post soon. I just wanted to share that I got SoS of 0.004250806!

This is using sopafry, with logarithmic a and exponential y, and this new constant we're adjusting the prime by (which I call w... really running out of single-letter variables here!)

k = 0.038 (I know... extremely low... which is weird, but I'm going to treat it as 0)

s = 0 (no prime limit)

a = 1.994 (so basically log_{2}, which is certainly psychologically motivated!!)

y = 0.455 (so basically square root)

c = 0.577 (that's copfr... no a or y in it. so it did, after all, play a part in this, apparently doing better where k made sense too)

w = -2.08 (so basically subtract 2 from each prime)

max(so{log_{2}p-2}f√r(num), so{log_{2}p-2}f√r(den)) + ⅗copfr(num/den)

- Dave Keenan
- Site Admin

### Re: developing a notational comma popularity metric

The name "d" is problematic, as I've used it for denominator many times above. Can we call it "w"? My earlier use of "w" for what you're now calling "c", was short lived, as I decided it was simpler to roll copfr into so{..p..}fr, whereupon your "c" was shown to be *almost* equivalent to your "d" (my "w"). I have gone back and edited my earlier copfr coefficients from "w" to "c".

The *almost*ness is due to the fact that copfr was not previously split into separate calculations for numerator and denominator, and was not previously applied to prime exponents raised to the power y.

We're definitely in trunk-waggling territory now, with 5 or 6 parameters. We should try to cut that down if possible without going over SoS = .0055.

It seems fairly insensitive to the value of alpha (the log base), so we might claim that as a constant rather than a parameter. But then I thought it would be 3 where you are finding it better as 2.

I was hoping you could set c=0 without too much damage because your d (my w) should pick up the slack. But then I realised that k=0 means the denominator is irrelevant, except in the count of its primes, including repetitions. So if you set c=0 then you will need k≠0.

I will investigate this k=0, c≠0 regime.

- Dave Keenan
- Site Admin

### Re: developing a notational comma popularity metric

I'm afraid this:

so{log_{2}p-2}f√r(num)

is really pushing our idiosyncratic function-naming scheme beyond the breaking point.

No one coming to this for the first time would ever suspect that "so{log2p-2}f√r" was the name of a single function.

It would be read as so × {log_{2}p-2} × f × sqrt(r(num)).

As long as we were only using subscripts and superscripts, we could always fall back to the *real* name by unsubsupering them (and ungreeking them). Hence my use of sop_{αw}fr^{y} → sopawfry for this above. The sub and superscripting works as a mnemonic for us, but to new-comers, even that will be read as so × p_{αw} × f × r^{y}.

- Dave Keenan
- Site Admin

### Re: developing a notational comma popularity metric

I assume you mean "so basically subtract 2 from the *log* of each prime".

- Dave Keenan
- Site Admin

### Re: developing a notational comma popularity metric

I confirm your result of

| α | w (your d) | k | y | s | c | SoS |
|---|---|---|---|---|---|---|
| 1.994 | -2.08 | 0.038 | 0.455 | 0 | 0.577 | 0.004250806 |

What do you get when you set k=0, c≠0, and when you set c=0, k≠0?

- cmloegcmluin
- Site Admin

### Re: developing a notational comma popularity metric

> Dave Keenan wrote: ↑Tue Jun 30, 2020 1:07 pm
>
> "d" is problematic, as I've used it for denominator many times above. Can we call it "w"? My earlier use of "w" for what you're now calling "c", was short lived ... I'm happy to go back and edit my earlier copfr coefficients from "w" to "c".

You're right, and I should have noticed that. Yes, let's make it w. I can go back and edit my previous post from "d" to "w" too.

> Dave Keenan wrote: ↑Tue Jun 30, 2020 1:32 pm
>
> I'm afraid this ... is really pushing our idiosyncratic function-naming scheme beyond the breaking point.
>
> ...we could always fall back to the real name

> Dave Keenan wrote: ↑Tue Jun 30, 2020 12:07 pm
>
> BTW, changing that variable name from α to e is a really bad idea, Mexican food notwithstanding, because using e as a variable log base is exactly the place where it would be confused with the constant log base e ≈ 2.718.
>
> [Edit: Oops! You suggested epsilon ε, not e. That wouldn't be so bad, except it is traditionally used to represent an infinitesimal.]

I'm not sure what "real" name you're referring to. As soon as we went beyond sopfr and sopf, aren't these all new things without established/"real" names? Or do you mean something else by that?

These unwieldy names are helpful during the development process. Once we reach the final step of naming these functions nicely for the outside world we can rely a lot more on the descriptions of what exactly the functions do, and reduce the name to something optimized for pronounceability, e.g. "soapfar" for "sum of adjusted prime factors (with) adjusted repetition", or something like that.

It might be cool if we added a plugin to the forum for LaTeX or MathJax. That might help get these formulas across in a less disgusting and/or intimidating way

> Dave Keenan wrote: ↑Tue Jun 30, 2020 12:07 pm
>
> I'm keen to know how low you can get the SoS in the vicinity of these results.

Do you still want me to check those, even though I've since found one with 0.004250806?

Actually, scratch that. I think your method will need to fulfill the role of the "home stretch": finding the exact values down to the millionths place or whatnot. The way I'm doing things, it's not really tractable to look deeper than the thousandths place. So if you're already working with SoS-billionths, I'm not going to be able to help you get any more precise.

I could at least double-check them, though, if you want.

> Dave Keenan wrote: ↑Tue Jun 30, 2020 12:31 pm
>
> > cmloegcmluin wrote: ↑Tue Jun 30, 2020 11:15 am
> >
> > The simple copfr experiment was not fruitful. I couldn't find any example of a weight > 0 on it helping. Cool idea, but it doesn't look like it's our winner.
>
> No, but the fact that a negative weight on it helped, was interesting.

I actually didn't check it with negative values, FYI. I should have tried that, though, and I should have been more clear.

> Dave Keenan wrote: ↑Tue Jun 30, 2020 12:31 pm
>
> I realised that if instead of the simple c × copfr(n/d) you split it into
>
> w×copfr^{y}(n) + k×w×copfr^{y}(d), then when you add it to the existing
>
> sop_{α}fr^{y}(n) + k×sop_{α}fr^{y}(d), where p_{α} = log_{α}(p), you get
>
> sop_{αw}fr^{y}(n) + k×sop_{αw}fr^{y}(d), where p_{αw} = log_{α}(p)+w.

> it was simpler to roll copfr into so{..p..}fr, whereupon your "c" was shown to be almost equivalent to your "d". I'm happy to go back and edit my earlier copfr coefficients from "w" to "c".
>
> The almostness is due to the fact that copfr was not previously split into separate calculations for numerator and denominator, and was not previously applied to prime exponents raised to the power y.

I can't figure out how that works. Is it using a logarithmic identity I'm not familiar with? I'm interested, certainly, since it seems like you found a way to consolidate the count of primes into their sum.

> Dave Keenan wrote: ↑Tue Jun 30, 2020 12:31 pm
>
> > I don't think it makes as much sense to try that on the term.
>
> Not sure what you mean by that. But I should mention that I also tried weighting the prime exponent *before* raising it to the power y, i.e. ((log_{α}(p)+w)×r)^{y}, but I couldn't get the SoS quite as low with that as with (log_{α}(p)+w)×r^{y}.

Dang, I was afraid that might not be clear enough, but I was in a rush. What I meant was that while it feels right to adjust the primes by some constant – if only because we're starting with 5, just kind of floating out there in Fiveland – adjusting the terms of the monzos (AKA the repetitions, or "r") by a constant doesn't make as much sense (whereas I *do* think adjusting them by a power or log does make enough sense).

> Dave Keenan wrote: ↑Tue Jun 30, 2020 1:44 pm
>
> Do you mean "so basically subtract 2 from the *log* of each prime"?

Yes, what I meant was subtracting 2 outside the quotes, like log_{α}(p) + w. I did *not* mean log_{α}(p + w). In my defense, I'm pretty sure logarithms come before subtraction in order of operations, so my "log_{2}p-2" was accurate (and I think the only reason I left off the parens was because I was trying to have it double as part of something like a name... the problem which we've already covered above).

> Dave Keenan wrote: ↑Tue Jun 30, 2020 12:31 pm
>
> I should mention that I also tried weighting the prime exponent *before* raising it to the power y, i.e. ((log_{α}(p)+w)×r)^{y}, but I couldn't get the SoS quite as low with that as with (log_{α}(p)+w)×r^{y}.

Ah. I'm glad for your "i.e." clause because I would have interpreted what precedes it differently otherwise. I thought by "weighting the prime exponent before raising it to the power" you meant log_{α}(p + w). I thought that because we're still entertaining both sublinear exponents and logarithms (though logarithms seem to be winning now) and so when you said "before raising it to the power" I thought that was standing in for either raising to a power or putting to a logarithmic base. I had not even yet considered the prospect of raising the whole thing to some power (or to some base), even after the repetition count had been raised to some power (or put to some base). All of these things are real and distinct possibilities though! We could have:

((log_{α}(p + v)+w)×r^{x})^{y} or log_{y}(((p + v)^{α}+w)×log_{x}r)

or any other combination of logs and exps, where v is just some other constant and x is just some other exponent...

Yeah... let's not go down that path...

> Dave Keenan wrote:
>
> We're definitely in trunk-waggling territory now, with 5 or 6 parameters. We should try to cut that down if possible without going over SoS = .0055.

I agree with the sentiment here.

Honestly, I think that while "max(so{log2p-2}f√r(num), so{log2p-2}f√r(den)) + ⅗copfr(num/den)" looks kind of gross, it wouldn't actually be that bad once you got it into LaTeX. Especially if you have devised a way to fold in copfr. And I think the log base 2 on the prime is kind of pretty, even, in its recognition of human base-2 pitch perception. The square root bit of r makes less immediate musical sense, but it offends me less than the ⅗ on the copfr, insofar as I sense there might be an underlying truth to something like a square root...

> Dave Keenan wrote:
>
> I was hoping you could set c=0 without too much damage because d (or w?) should pick up the slack. But then I realised that k=0 means the denominator is irrelevant, except in the count of its primes, including repetitions. So if you set c=0 then you will need k≠0.
>
> I will investigate this k=0, c≠0 regime.

I look forward to your results. I'm not really sure what more I can produce tonight until I hear back from you in more detail about your consolidation of copafry into sopafry, so that I can understand it well enough to implement it myself.

> Dave Keenan wrote: ↑Tue Jun 30, 2020 2:59 pm
>
> What do you get when you set k=0, c≠0, and what when you set c=0, k≠0?

Welp, you snuck one more in before I could get my post out.

with k = 0, I can get 0.004609100.

that's with c = 0.723, a = 1.753, y = 0.473, w = -2.620

those parameters are disconcertingly different from the c = 0.577, a = 1.994, y = 0.455, w = -2.08 which got us 0.004250806 with k = 0.038. If you just take those parameters and set k = 0 then you get 0.004749566 which is also quite close.

with c = 0, I can only get 0.006251296.

that's with k = 0.635, a = 1.430, y = 0.850, w = -2.770

So it would seem that the right path forward is to obliterate the smaller of the num and den, and use the count of primes to account for the presence of harmonic information on the other side.
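For reference, the k = 0 form described above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the code that produced the numbers in this post: the function names are hypothetical, copfr(num/den) is taken to be the total count of primes (with repetition) across numerator and denominator of a ratio in lowest terms, and the inputs are assumed to contain only primes of 5 or greater.

```python
import math


def prime_exponents(n: int) -> dict[int, int]:
    """Return {prime: repetition count r} for a positive integer n, by trial division."""
    exponents: dict[int, int] = {}
    p = 2
    while n > 1:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    return exponents


def sopawfry(n: int, a: float, w: float, y: float) -> float:
    """Sum of (log_a(p) + w) * r**y over the primes of n."""
    return sum((math.log(p, a) + w) * r**y for p, r in prime_exponents(n).items())


def copfr(n: int) -> int:
    """Count of prime factors of n, with repetition."""
    return sum(prime_exponents(n).values())


def metric_k0(num: int, den: int, a: float, w: float, y: float, c: float) -> float:
    """max of the two weighted sums (the smaller side is dropped, k = 0), plus c * copfr(num/den)."""
    larger = max(sopawfry(num, a, w, y), sopawfry(den, a, w, y))
    # Assumption: num/den is in lowest terms, so copfr of the ratio is the sum of both counts.
    return larger + c * (copfr(num) + copfr(den))
```

With the values reported above, a call would look like `metric_k0(35, 1, a=1.753, w=-2.620, y=0.473, c=0.723)`, where 35/1 is just an example ratio.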