Magrathean diacritics

volleo6144
Posts: 81
Joined: Mon May 18, 2020 7:03 am
Location: Earth

Re: Magrathean diacritics

Post by volleo6144 »

Dave Keenan wrote: Fri Jun 05, 2020 12:54 pm munged_error(error) = 1/(0.5 - error) - 1/0.5

munged_3_exp(3_exp) = 1/(14 - 3_exp) - 1/14

Code: Select all

Error Munged x5.5
----- ------ ----
0.000 0.0000    0
0.050 0.2222    1
0.100 0.5000    3
0.150 0.8571    5
0.200 1.3333    7
0.250 2.0000   11
0.300 3.0000   16
0.350 4.6667   26
0.400 8.0000   44
0.450 18.000   99
0.480 48.000  264
0.490 98.000  539
0.495 198.00 1089
0.499 998.00 5489
0.500      ∞    ∞

Code: Select all

3exp Munged x154
---- ------ ----
   0 0.0000    0 [0.0028]
   1 0.0055    1 [0.0055]
   2 0.0119    2 [0.0110]
   3 0.0195    3 [0.0221]
   4 0.0286    4 [0.0442]
   5 0.0397    6 [0.0884]
   6 0.0536    8 [0.1768]
   7 0.0714   11 [0.3536]
   8 0.0952   15 [0.7071]
   9 0.1286   20 [1.4142]
  10 0.1786   28 [2.8284]
  11 0.2619   40 [5.6569]
  12 0.4286   66 [11.314]
  13 0.9286  143 [22.627]
  14      ∞    ∞ [45.255]
  15 -1.071 -165 [90.510]
  16 -0.571  -88 [181.02]
  17 -0.405  -62 [362.04]
   ∞ -0.071  -11 [     ∞]
Something tells me that 1/(limit - x) - 1/limit isn't a good fit for some of these...

(the bracketed entries are the coefficients multiplied by ln(SoPF>3 + 2) in George's old metric; they're all the successive square roots of odd powers of 2, starting with 2^-8.5, and this term with interpolation was also used for apotome slope)
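For anyone wanting to reproduce the tables, here is a quick Python sketch of the two munge functions (the ∞ rows are where the denominator hits zero; the function and variable names are mine, not from the thread):

```python
# Sketch of the munge functions quoted above; both blow up as x
# approaches its limit (0.5 for the tina error, 14 for the 3-exponent).
def munged(x, limit):
    return 1 / (limit - x) - 1 / limit

# A few rows of the error table, scaled by 5.5 as above:
for error in (0.0, 0.1, 0.25, 0.45):
    m = munged(error, 0.5)
    print(f"{error:.3f} {m:.4f} {round(m * 5.5):4d}")

# A few rows of the 3-exponent table, scaled by 154:
for exp3 in (0, 7, 13):
    m = munged(exp3, 14)
    print(f"{exp3:4d} {m:.4f} {round(m * 154):4d}")
```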
I'm in college (a CS major), but apparently there's still a decent amount of time to check this out. I wonder if the main page will ever have 59edo changed to green...
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)

Re: Magrathean diacritics

Post by cmloegcmluin »

Not ready with results yet, but I wanted to drop a line on this thread to say: I've been experimenting a bunch in the last couple of days and I have a technique now which may prove fruitful.

I mentioned to @Dave Keenan yesterday that my partner does Product Marketing for a living, a deeply data-driven occupation. I regularly turn to her for answers to stats-related problems. So I asked her this afternoon if she knew a better way to do regression analyses than manually copying and pasting data series into Wolfram online, because I was going to need to start testing a ton of variations of metric combinations. She said: just use Google Sheets! Whatever I say about Sheets is probably also true for Excel.

Indeed they have various formulas, e.g. SLOPE for linear fit and GROWTH for exponential fit. But those turned out to only be good for predicting future values in a series. I needed something to give me the actual formula for the best fit curve.

Google Sheets did in the end have the answer, but not in a formula. The solution was found inside their Charts feature. If you give it a data series and Customize the chart, one of the options it provides is a Trendline. Enabling the Trendline gives you a bunch of options: linear, exponential, logarithmic, power, etc. You can generally eyeball which is the best shape for your data, but an objective measure is found in the R² value, or coefficient of determination. It can go as high as 1, or 100%.

And if you change the Label of the Trendline in the dropdown to "Equation" then you can get its equation. But what I ultimately needed was the goodness-of-fit; I was only after the equation as a means of calculating goodness-of-fit myself. So that Sheets calculated R² for me was even better than I was expecting!

So anyway, my next steps will be to come up with a ton of different combinations of metrics (SoPF>3, Benedetti height, Tenney height, n+d ("length"?), abs(n-d), abs(SoPF>3(n) - SoPF>3(d)), etc. etc. etc.) and then just compare all of their R² and see which one has the best fit with respect to the frequency statistics from Scala.

By the way, the R² for the frequency statistics themselves is an impressive 0.991 when fit to the equation 8041x⁻¹⋅³⁷, where x is the index of the comma in the list of commas sorted by descending frequency. Dunno if there's any significance to that coefficient, but there ya go.
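Incidentally, what Sheets does for a power-series trendline can be reproduced by hand: take logs of both axes, fit a least-squares line, and compute R² on the transformed values. A Python sketch using made-up rank/frequency data (not the actual Scala stats):

```python
import math

# Hypothetical rank/frequency data (NOT the real Scala stats); made up
# to roughly follow freq ≈ 8000 * rank^-1.4.
ranks = [1, 2, 3, 4, 5, 6, 7, 8]
freqs = [8000, 3100, 1700, 1150, 830, 660, 520, 440]

# Fit freq = a * rank^b by ordinary least squares on (ln rank, ln freq).
xs = [math.log(r) for r in ranks]
ys = [math.log(f) for f in freqs]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
ln_a = my - b * mx
a = math.exp(ln_a)

# Coefficient of determination of the log-log fit.
ss_res = sum((y - (ln_a + b * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot

print(f"fit: {a:.0f} * x^{b:.2f}, R² = {r2:.4f}")
```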

So I guess the moral of the story is: trust your partner for assists, and the solution is often right under your nose.
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: Magrathean diacritics

Post by Dave Keenan »

I look forward to further results from this approach.
cmloegcmluin wrote: Tue Jun 16, 2020 11:14 am By the way, the R² for the frequency statistics themselves is an impressive 0.991 when fit to the equation 8041x⁻¹⋅³⁷, where x is the index of the comma in the list of commas sorted by descending frequency. Dunno if there's any significance to that coefficient, but there ya go.
It relates to my observation that the frequency falls off faster than Zipf's law. Zipf's law implies kx⁻¹.

But I note that we don't really care if the complexity function is a good fit to the frequency, only that it produces (nearly) the same rank ordering.

[Edit: Off-topic stuff about superscript decimal points deleted.]
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)

Re: Magrathean diacritics

Post by cmloegcmluin »

Dave Keenan wrote: Wed Jun 17, 2020 1:12 pm It relates to my observation that the frequency falls off faster than Zipf's law. Zipf's law implies kx⁻¹.
Ah! Yes, that's an excellent observation.
But I note that we don't really care if the complexity function is a good fit to the frequency, only that it produces (nearly) the same rank ordering.
Right. We're not trying to predict further (appended or interpolated) entries in the frequency list. I submitted this 0.991 value as an indicator that we have enough data in the Scala stats that they come out pretty smooth. It's also a sign that people use these commas in their scales in a remarkably predictable way (I was vaguely wondering whether there might be some deeper mathematical/harmonic meaning to this 1.37 number).
Sorry for the off-topic stuff. I'll move this to the Admin subforum eventually.
Dave added a post here: viewtopic.php?f=15&t=489, and I've updated my previous post to use the [sup] BBCode. Thanks for the helpful suggestion... that off-kilter decimal point was irking me too.
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: Magrathean diacritics

Post by Dave Keenan »

volleo6144 wrote: Sat Jun 06, 2020 12:48 am
Dave Keenan wrote: Fri Jun 05, 2020 12:54 pm munged_error(error) = 1/(0.5 - error) - 1/0.5

munged_3_exp(3_exp) = 1/(14 - 3_exp) - 1/14
[Tables elided]

Something tells me that 1/(limit - x) - 1/limit isn't a good fit for some of these...
Thanks for making those tables. I agree, it looks useless when applied to the 3-exponent. But not so bad on the error.
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)

Re: Magrathean diacritics

Post by cmloegcmluin »

Dave Keenan wrote: Wed Jun 17, 2020 5:50 pm
volleo6144 wrote: Sat Jun 06, 2020 12:48 am
Dave Keenan wrote: Fri Jun 05, 2020 12:54 pm munged_error(error) = 1/(0.5 - error) - 1/0.5

munged_3_exp(3_exp) = 1/(14 - 3_exp) - 1/14
[Tables elided]

Something tells me that 1/(limit - x) - 1/limit isn't a good fit for some of these...
Thanks for making those tables. I agree, it looks useless when applied to the 3-exponent. But not so bad on the error.
I could be wrong, but I think @volleo6144 might have been saying that if we created a similar type of metric but for comparing the prime limits of the commas, it might not work well? Or did I get that wrong? What exactly did you mean by "some of these" (my italics), volleo6144? Some of these munged values, or some of these comma candidates?

@Dave Keenan could you say more about why you think this metric looks useless on the 3-exponent?

And just to make sure we're on the same page (and this articulation is as much for my own benefit, to clarify what the heck is happening here, haha): the investigations I described my plans for above are focused on producing an improvement to the SoPF>3 metric – a better frequency heuristic – and they may likely take the closely-related prime limit property of a comma into consideration. It is a parallel task to this development of munged tina error and abs3exp metrics.

In the end, whatever I come up with will then be combined with (perhaps munged in some way) metrics on tina error and abs3exp. The result will become our consolidated badness metric to aid us in choosing the best primary commas for the tinas.

I think it makes sense to keep the tina error and abs3exp metrics separate from the frequency heuristic I'm working on, which may have applications beyond Sagittal (while tina error is certainly Sagittal-specific, and our specific needs for our abs3exp are somewhat Sagittal-specific).

I may be totally off-base on any of this. Please correct me if any of my intel is bad.

And also, if anyone else is interested in the stuff I said I was doing, by all means, if you want to take a crack at it, even beat me to it, I say go for it! I think it's good to get this stuff straight so we don't duplicate work undesirably, but to be clear, I won't feel like anyone is stepping on my toes.
volleo6144
Posts: 81
Joined: Mon May 18, 2020 7:03 am
Location: Earth

Re: Magrathean diacritics

Post by volleo6144 »

cmloegcmluin wrote: Thu Jun 18, 2020 7:31 am
Dave Keenan wrote: Wed Jun 17, 2020 5:50 pm
volleo6144 wrote: Sat Jun 06, 2020 12:48 am

[Tables elided]

Something tells me that 1/(limit - x) - 1/limit isn't a good fit for some of these...
Thanks for making those tables. I agree, it looks useless when applied to the 3-exponent. But not so bad on the error.
I could be wrong, but I think @volleo6144 might have been saying that if we created a similar type of metric but for comparing the prime limits of the commas, it might not work well? Or did I get that wrong? What exactly did you mean by "some of these" (my italics), volleo6144? Some of these munged values, or some of these comma candidates?
...I meant "some of the munged values".
Dave Keenan
Site Admin
Posts: 2180
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: Magrathean diacritics

Post by Dave Keenan »

cmloegcmluin wrote: Thu Jun 18, 2020 7:31 am @Dave Keenan could you say more about why you think this metric looks useless on the 3-exponent?
I feel it punishes increasing 3-exponents too harshly too soon.
And just to make sure we're on the same page (and this articulation is as much for my own benefit, to clarify what the heck is happening here, haha): the investigations I described my plans for above are focused on producing an improvement to the SoPF>3 metric – a better frequency heuristic – and they may likely take the closely-related prime limit property of a comma into consideration. It is a parallel task to this development of munged tina error and abs3exp metrics.
Yes. That's the page I'm on too. But I guess I'm making the additional assumption that whatever alternative complexity measure we may come up with will not be too different from the current SoPF>3. You need not conform to that. It's just that in munging the error and 3-exponent I'm attempting to put them in a form that is in some way comparable to SoPF>3 so that it makes sense to add them to it, to obtain an overall badness.
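To make the intent concrete, the additive badness described here might look something like the following sketch (hypothetical, not an agreed formula; the 5.5 and 154 scale factors are the ones from the munge tables earlier in the thread):

```python
# Hypothetical combination of the three ingredients into one badness
# number. The munge puts error and 3-exponent on a scale comparable to
# SoPF>3, so that simply adding them makes sense.
def munged(x, limit):
    return 1 / (limit - x) - 1 / limit

def badness(sopf_gt3, tina_error, abs_3_exp):
    return (sopf_gt3
            + 5.5 * munged(abs(tina_error), 0.5)   # tina error, limit 0.5
            + 154 * munged(abs_3_exp, 14))         # 3-exponent, limit 14

# A comma with SoPF>3 = 16, tina error 0.25, |3-exponent| = 7
# contributes 16 + 11 + 11:
print(badness(16, 0.25, 7))  # ≈ 38
```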

But now, I think I'll wait until we have the improved PF>3 measure before doing anything more on the error or 3-exponent.
In the end, whatever I come up with will then be combined with (perhaps munged in some way) metrics on tina error and abs3exp. The result will become our consolidated badness metric to aid us in choosing the best primary commas for the tinas.
Yes. That is my hope.
I think it makes sense to keep the tina error and abs3exp metrics separate from the frequency heuristic I'm working on, which may have applications beyond Sagittal (while tina error is certainly Sagittal-specific, and our specific needs for our abs3exp are somewhat Sagittal-specific).
Agreed.
cmloegcmluin
Site Admin
Posts: 1700
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer (he/him/his)

Re: Magrathean diacritics

Post by cmloegcmluin »

Dave Keenan wrote: Fri Jun 19, 2020 12:08 am
cmloegcmluin wrote: Thu Jun 18, 2020 7:31 am @Dave Keenan could you say more about why you think this metric looks useless on the 3-exponent?
I feel it punishes increasing 3-exponents too harshly too soon.
Got it.

We could play around with decreasing the inside radius of the curve, by increasing the power:

abs3exp	1/(14-x)-1/14	×154	1/(14^1.5-x^1.5)-1/(14^1.5)	×154	1/(14^2-x^2)-1/(14^2)	×154
0	0		0	0			0	0			0
1	0.005494505495	1	0.0003715239113		0	0.0000261643119		0
2	0.0119047619	2	0.001089600817		0	0.000106292517		0
3	0.01948051948	3	0.002102165925		0	0.0002455527666		0
4	0.02857142857	4	0.003440957342		1	0.0004535147392		0
5	0.03968253968	6	0.005180069101		1	0.0007459124		0
6	0.05357142857	8	0.007444777539		1	0.001147959184		0
7	0.07142857143	11	0.0104407162		2	0.001700680272		0
8	0.09523809524	15	0.01451682008		2	0.002473716759		0
9	0.1285714286	20	0.02030604202		3	0.003593611358		1
10	0.1785714286	28	0.02907847782		4	0.00531462585		1
11	0.2619047619	40	0.0438016849		7	0.008231292517		1
12	0.4285714286	66	0.07338276838		11	0.01412872841		2
13	0.9285714286	143	0.1623639698		25	0.03193499622		5
14	∞		∞	∞			∞	∞			∞
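The generalized table can be reproduced with a small sketch (the name munged_pow is mine, not from the thread):

```python
# Sketch of the generalized munge: raising the power p flattens the
# curve for small x and steepens the "wall" as x approaches the limit.
def munged_pow(x, limit=14, p=1.5):
    return 1 / (limit ** p - x ** p) - 1 / limit ** p

# Reprint the three ×154 columns of the table above:
for p in (1, 1.5, 2):
    print(p, [round(154 * munged_pow(x, p=p)) for x in range(14)])
```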
And just to make sure we're on the same page (and this articulation is as much for my own benefit, to clarify what the heck is happening here, haha): the investigations I described my plans for above are focused on producing an improvement to the SoPF>3 metric – a better frequency heuristic – and they may likely take the closely-related prime limit property of a comma into consideration. It is a parallel task to this development of munged tina error and abs3exp metrics.
Yes. That's the page I'm on too. But I guess I'm making the additional assumption that whatever alternative complexity measure we may come up with will not be too different from the current SoPF>3. You need not conform to that. It's just that in munging the error and 3-exponent I'm attempting to put them in a form that is in some way comparable to SoPF>3 so that it makes sense to add them to it, to obtain an overall badness.
I believe it will be quite close to SoPF>3, judging only from the fact that in the list linked earlier, other factors besides SoPF>3, such as prime limit and vincular(?) balance of the primes, accounted only for sorting within a SoPF>3 tier; in other words, their effect on the metric would only ever budge it by less than 1.
But now, I think I'll wait until we have the improved PF>3 measure before doing anything more on the error or 3-exponent.
Probably wise. As you can see from the above, I lack such wisdom.
In the end, whatever I come up with will then be combined with (perhaps munged in some way) metrics on tina error and abs3exp. The result will become our consolidated badness metric to aid us in choosing the best primary commas for the tinas.
Yes. That is my hope.
I forgot to include a couple of other factors that were suggested earlier: 5-schisma-slope, and a bias toward 3-exponents close to +8 to counteract the 5-schisma's 3-exponent of -8.
volleo6144
Posts: 81
Joined: Mon May 18, 2020 7:03 am
Location: Earth

Re: Magrathean diacritics

Post by volleo6144 »

Modified the table a little:
abs3exp	1/(14-x)-1/14	×154	1/(14^1.5-x^1.5)-1/(14^1.5)	×1054	1/(14^2-x^2)-1/(14^2)	×6468
0	0		0	0			0	0			0
1	0.005494505495	1	0.0003715239113		0	0.0000261643119		0
2	0.0119047619	2	0.001089600817		1	0.000106292517		1
3	0.01948051948	3	0.002102165925		2	0.0002455527666		2
4	0.02857142857	4	0.003440957342		4	0.0004535147392		3
5	0.03968253968	6	0.005180069101		5	0.0007459124		5
6	0.05357142857	8	0.007444777539		8	0.001147959184		7
7	0.07142857143	11	0.0104407162		11	0.001700680272		11
8	0.09523809524	15	0.01451682008		15	0.002473716759		16
9	0.1285714286	20	0.02030604202		21	0.003593611358		23
10	0.1785714286	28	0.02907847782		31	0.00531462585		34
11	0.2619047619	40	0.0438016849		46	0.008231292517		53
12	0.4285714286	66	0.07338276838		77	0.01412872841		91
13	0.9285714286	143	0.1623639698		171	0.03193499622		207
14	∞		∞	∞			∞	∞			∞
The point of the 154 multiplier was that Dave considered "two prime 5's, or a prime 11" (I went with 11) to be what we were willing to trade off for a tina error of 1/4 or a 3-exponent of ±7. After changing the multiplier to account for this (to 1053.568 for the x^1.5-based metric or 6468 for the x^2-based metric, as the remainder of the post will assume), we get the table above.
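The rescaled multipliers fall straight out of that trade-off: each is chosen so that a 3-exponent of ±7 costs exactly 11. A Python sketch (munged_pow is a hypothetical helper name):

```python
# Derive the rescaled multipliers: pick each multiplier so that a
# 3-exponent of +-7 maps to 11, the SoPF>3 cost of one prime 11.
def munged_pow(x, limit=14, p=1.0):
    return 1 / (limit ** p - x ** p) - 1 / limit ** p

for p in (1, 1.5, 2):
    mult = 11 / munged_pow(7, p=p)
    print(f"p = {p}: multiplier = {mult:.3f}")
# p=1 gives 154, p=1.5 gives ~1053.568, p=2 gives 6468
```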

Did @Dave Keenan mean before or after the ±7 mark by "too harshly too soon"?

Going to extremes, an exponent too high makes anything beyond our benchmark completely unbearable (a 16th-power metric counts ±8 as being worse than ±7 by 82 = 41+41), and low exponents ... apparently asymptote until you get into floating-point imprecision territory (the 64th root and the 1,000,000,000,000th root both count the difference between ±13 and ±12 as a number that rounds to 54 = 23+31).