Magrathean diacritics


Post by Dave Keenan »

Here's the real metacomma count I think we need:

Find the lowest-badness comma (N2D3P9 <= 5298, ATE <= 15) for every half-tina-wide bucket (boundaries at quarter and three quarter tinas). That's list-A. 809 commas.

Take the existing commas below the half-apotome, but only for the symbols that have no mina accents. i.e. the Ultra-level symbol commas. Round their size to the nearest integer multiple of a semitina (half-tina). That's list-B. 47 commas (must include 1u).

Calculate the metacommas between list-A commas and list-B commas that differ by 1 to 19 semitinas. At most 47 × 38 = 1786 metacomma occurrences.

I assume the following is what you did before, and you should do it again: Count the number of times each metacomma occurs. Group them by semitina. Ignore their actual size. Just go by the integer difference between their list-A and list-B integer semitinas. Within each semitina group, sort them by occurrence count. That's primarily to give us preferred commas for the whole tina accents. But then ...

Take the most common metacomma for each semitina. That's list-C. 19 commas.

Calculate the meta-meta-comma between each pair of consecutive metacommas in list-C, giving 18 meta-meta-comma occurrences. Count occurrences of each meta-meta-comma. That's to give us preferred commas for the 0.5 tina dot.
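In code terms, here's a minimal sketch of that count, with hypothetical helpers (best_comma_in_zone, minaless_symbol_commas, subtract, and the comma.tinas attribute) standing in for whatever routines we actually use:

Code:

from collections import Counter, defaultdict

SEMITINA_ZONE_COUNT = 809  # half-tina-wide zones up to the half-apotome

# List-A: the lowest-badness comma (N2D3P9 <= 5298, ATE <= 15) per zone.
list_a = [best_comma_in_zone(zone, max_n2d3p9=5298, max_ate=15)
          for zone in range(SEMITINA_ZONE_COUNT)]

# List-B: Ultra-level (minaless) symbol commas, tagged with their sizes
# rounded to the nearest integer semitina; must include the unison, 1u.
list_b = [(comma, round(2 * comma.tinas)) for comma in minaless_symbol_commas()]

# Metacommas between list-A and list-B commas 1 to 19 semitinas apart,
# grouped by integer semitina difference and counted
# (at most 47 * 38 = 1786 occurrences).
occams = defaultdict(Counter)
for a_zone, a in enumerate(list_a):
    for b, b_zone in list_b:
        semitinas = abs(a_zone - b_zone)
        if 1 <= semitinas <= 19:
            occams[semitinas][subtract(a, b)] += 1

# List-C: the most common metacomma per semitina, 1 through 19.
list_c = [occams[s].most_common(1)[0][0] for s in range(1, 20)]

# Meta-meta-commas between consecutive list-C entries: 18 occurrences,
# counted to suggest preferred commas for the 0.5 tina dot.
metameta = Counter(subtract(hi, lo) for lo, hi in zip(list_c, list_c[1:]))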

Post by cmloegcmluin »

Dave Keenan wrote: Mon Nov 02, 2020 8:35 am Thanks for the most common metacommas. I'm leaning towards using them.
Excellente.
cmloegcmluin wrote: Mon Nov 02, 2020 7:03 am Let me know if you want me to re-run this but only gathering the best comma every whole tina, i.e. that we should only care about counting metacommas for commas that would actually appear in the Insane notation.
Sure. If it's easy.
It would be a matter of minutes, yes. But I'm blocked on getting other questions answered here before proceeding with it.
It would also be good to look for the most common meta-meta-comma for the half-tina dot. i.e. find the most common metacommas for the odd half-tina multiples as well as the whole tinas, then find the most common difference between metacommas for consecutive half tinas.
Okay, that makes sense I think.

The thing that I still need to talk through is whether to use ±0.25 tina zones or ±0.5 tina zones for the whole tinas. The odd semitina zones will for sure be ±0.25 tinas, but I'm not sure whether the zones for the whole tinas we compare them against should overlap with them or not. The argument for "yes, overlap" is that these whole tina buckets potentially represent the actual Insane precision level notation's commas and therefore can very well go up close to the edges of their capture zones. The argument for "no overlap" is that it just seems to make sense: we're more likely to get semitina-sized results that way, because otherwise 50% of the odd semitinas will have best commas that are the same as the best commas in one or the other overlapping whole tina zone, making the metacomma the unison; and in general the average metacomma is going to be significantly smaller than a semitina.

(It looks like the suggestion you posted before I finished this post speaks to bucket sizes such as these, but I don't think it's directly relevant, other than that said suggestion might outmode this previous suggestion of yours)
Dave Keenan wrote: Mon Nov 02, 2020 10:12 am Here's the result of an earlier metacomma search of a different kind:
viewtopic.php?f=10&t=430&p=2365&hilit=25%2F11n#p2365

1 tina as 121/1225n
2 tinas as 275/29n [This comma is rubbish. It's closer to 1.5 tinas.]
3 tinas as 1/455n
4 tinas as 3025/7n
5 tinas as 2401/25n
6 tinas as 65/77n
7 tinas as 2125/7n
7 tinas as 7/425n (a close second, the 5s-complement of the first)
8 tinas as 253/5n
9 tinas as 1/539n
Right. That was when we were taking an approach whereby the best notating comma for each 2,3-free class was a target, and we sought good tina-sized metacommas between them. Thanks for resurfacing this. And please correct me if I've inaccurately characterized that past effort. I did hesitate to use the term "metacomma" again since I knew we had used it before, but I do think it's a good word for what it describes, and we should be comfortable enough with it to be able to handle multiple breeds of metacomma.
Those for 1, 2 and 4 tinas above help notate low-N2D3P9 ratios that are not yet exactly notated.
Some information about how this list was derived is in this earlier post:
viewtopic.php?p=2353#p2353
Yes, and reviewing the spreadsheet shared just a few posts later has convinced me that I remembered this correctly.
Dave Keenan wrote: Mon Nov 02, 2020 11:50 am Of your 2 tina metacommas that are within 20% of the max count, most can be rejected because they are outside the range 1.75 to 2.25 tinas.
Well... The 121/1225n is just outside the ±0.25 tina range of 1 tina, at 1.26 tinas, and we're still considering it. (I see that you've noticed this yourself; more on this topic later then).

But you make a good point that none of the commas I've suggested on account of their occurrence count as metacommas have had their own badness vetted in any way.

Why not use the same error metric against them that we use against the best commas per zone in the first place? (More on this topic later, too...)

But I will take a moment to suggest that we could refer to these most recent results of mine as "occams" for short: occurrence count as metacommas. Or just "occurrences as metacommas" (which I find slightly preferable for some reason to simply "oam"s).
I have checked that these all map to the correct number of tinas, using the zeta-peak mapping, including the comma for the 0.5 tina dot mapping to zero.
Thanks for remembering that consistency constraint!


I believe this may be an exhaustive list, then, of all the things we've ever taken into consideration in evaluating commas:
  1. unpopularity (compressed N2D3P9)
  2. uselessness (expanded ATE & AAS)
  3. error
  4. inconsistency (zeta-map)
  5. low "occam" (occurrence count as metacommas)
  6. high newly notated 2,3-free class count
Overall, this list has a greatest prime factor of 37. The 0.5 tina comma is the only one with a prime greater than 17.
But are they all superparticular, though? ;)

Wow. I wouldn't have bet on that. But I like it.

Just kidding. Of course, being 37-limit and superparticular are not actually scoreable. Just style points I suppose you could say :)
Dave Keenan wrote: Mon Nov 02, 2020 1:12 pm Oops!

I just noticed that 121/1225n is 1.26 tinas, and so is outside the ±0.25 range that I used to eliminate 2 tina candidates. i.e. it is slightly closer to an odd multiple of a half-tina and so could be notated with a dot.
Alright. But here's the thing. Mightn't we care more about being able to exactly notate 27 good commas using the 121/1225n (internally, as the value of the tina mark inside that Insane precision level symbol), rather than only 18 with the 10241/5n (that's 50% more), than about whether we need a dot to notate 121/1225n itself? The 121/1225n, i.e. as the primary comma of the symbol which looks like a single "horn" against a bare shaft, is just a single comma, and thus of like 1/27th as much importance, no?


I think we should use a slightly more nuanced error metric here, punishing those past ±0.25 tinas, sure, but not ruling them out entirely. Only once you reach ±0.5 tinas would it be ∞, maybe. I do like your (abs(2 * tinavalue - 2 * round(tinavalue))) error. Maybe it should look like a semicircle, or maybe it should look like a cosine curve.
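To make those shapes concrete, here's a quick sketch; the normalizations and the diverging variant are just my guesses at what "punish past ±0.25 tinas, but only ∞ at ±0.5 tinas" could look like:

Code:

import math

def tina_distance(tinas: float) -> float:
    """Distance from the nearest whole tina, in [0, 0.5]."""
    return abs(tinas - round(tinas))

def triangle_error(tinas: float) -> float:
    """Linear ramp: 0 at a whole tina, 1.0 at +/-0.5 tinas."""
    return 2 * tina_distance(tinas)

def cosine_error(tinas: float) -> float:
    """Smooth cosine bump: 0 at a whole tina, 1.0 at +/-0.5 tinas."""
    return (1 - math.cos(2 * math.pi * tina_distance(tinas))) / 2

def diverging_error(tinas: float) -> float:
    """Gentle below +/-0.25 tinas, infinite only at +/-0.5 tinas."""
    d = tina_distance(tinas)
    return math.inf if d >= 0.5 else d / (0.5 - d)

print(round(diverging_error(1.26), 3))  # 121/1225n at 1.26 tinas: finite, ~1.083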
I hate to say it, after all we've been through to get here, but I don't think the N2D3P9 of a metacomma matters one whit. All that matters is that when added or subtracted it converts one comma with low N2D3P9 into another comma with low N2D3P9. Same with its AAS and ATE. An example of this can be seen here:
viewtopic.php?p=2393#p2393
Let me verify my understanding of your example. The example is that the 6-tina 65/77n has a low N2D3P9 of 200.818, and there's a desirable 6.5-tina candidate out there in 13/37n with a low N2D3P9 of 329.574, but to get there from 65/77n requires a metacomma with quite a high N2D3P9 of 1626.744, the 77/185n (the two commas share only a single prime factor in 13, and they don't even cancel out! they're on the same side. so sad). But your point is that this high N2D3P9 is unimportant.

I agree. I think we could say that generally speaking, the deeper we go in accent precision, the less important unpopularity becomes (I may have made a claim like this before, and you may have agreed with it, if memory serves). The same goes for uselessness (and therefore complexity). Error, however, gets more important the deeper you go. I'm not sure I could make a strong argument for whether inconsistency, occam, or newly-notating gets more or less important the deeper you go... maybe I don't need to worry about it.

However, the way you say "after all we've been through to get here" makes me think N2D3P9 plays no role in the final decision. Maybe you're not saying that, but if you are, I don't agree with that. I mean, we use N2D3P9 in the badness metric to find the best commas from which the metacommas are drawn, which feeds the occurrence count as metacommas, which we use to rate tina candidates; so N2D3P9 is still there, just indirectly.

So here's a suggestion: what if it really is a 2-step process? Since you found that error had barely any effect on match count (101 to 102) for Extreme commas, and therefore was unlikely involved in their selection, perhaps we leave error out of that part. We collect the best commas per zone by complexity (unpopularity & uselessness), gather metacomma counts, and then rate the final tina candidates by a metric which is a combination of occam, error, inconsistency, and newly notated 2,3-free class count.

In other words, we use the #1 and #2 metrics — popularity and usefulness — to determine the commas which produce metacommas which become indirectly consolidated up into the occam score, which is #5, which along with #3, #4, and #6 become their own metric.

I feel like my brain is being taken over by machine elves, or passing through the stage in Penrose's CCC where the death of the universe is physically equivalent to the Big Bang. Kind of crazy, but might be right.
Dave Keenan wrote: Mon Nov 02, 2020 2:11 pm Here's the real metacomma count I think we need:

Find the lowest-badness comma (N2D3P9 <= 5298, ATE <= 15) for every half-tina-wide bucket (boundaries at quarter and three quarter tinas). That's list-A.

Take the existing commas below the half-apotome only for the symbols that have no mina accents. Round their size to the nearest integer multiple of a semitina. That's list-B.

Determine the metacommas between a list-A comma and a list-B comma that differ by 1 to 19 semitinas.
I'm with you so far. Though I can't yet guess why we'd round to the nearest integer multiple of a semitina. Maybe it'll become clear if I re-read this a few more times. (I can, btw, understand why we only care about symbols with no mina accents).

I really like "semitina". I've gone back and updated this post to use it. Thanks for that.
I assume the following is what you did above, and you should do it again: Count the number of times each metacomma occurs. Group them by semitina. Ignore their actual size. Just go by the integer difference between their list-A and list-B integer semitinas. Within each semitina group, sort them by occurrence count.
Yes, that is what I did above. Except that I used their actual size to group them. Is your integer difference trick supposed to make code run more efficiently or somehow be written more succinctly, or would it actually change the results? If I understand correctly, I'm still taking metacommas between list-A and the unrounded list-B, and just using the rounded version of list-B as a grouping helper.
Take the most common metacomma for each semitina from 1 to 19. That's list-C.

Determine the meta-meta-comma between each pair of consecutive commas in list-C. Count occurrences of each meta-meta-comma.
Oooooh, okay. So this whole bit here is exclusively in order to find the pesky semitina comma. Right? So this really is a direct alternative to the previous metametacomma suggestion you gave me which I wrangled with earlier in this post.

Post by Dave Keenan »

On the assumption that I wouldn't hear from you until tomorrow, I was editing all those posts like crazy — anticipating many of your questions and objections below.

But to reward you for responding today, I won't force you to reread all those posts. :) I'll just answer your questions below.
cmloegcmluin wrote: Mon Nov 02, 2020 3:23 pm The thing that I still need to talk through is whether to use ±0.25 tina zones or ±0.5 tina zones for the whole tinas. The odd semitina zones will for sure be ±0.25 tinas, but I'm not sure whether the zones for the whole tinas we compare them against should overlap with them or not. The argument for "yes, overlap" is that these whole tina buckets potentially represent the actual Insane precision level notation's commas and therefore can very well go up close to the edges of their capture zones. The argument for "no overlap" is that it just seems to make sense: we're more likely to get semitina-sized results that way, because otherwise 50% of the odd semitinas will have best commas that are the same as the best commas in one or the other overlapping whole tina zone, making the metacomma the unison; and in general the average metacomma is going to be significantly smaller than a semitina.
All 809 comma zones ±0.25 tina, no overlaps. The imaginary insane precision level could just as well be based on semitinas as tinas.
(It looks like the suggestion you posted before I finished this post speaks to bucket sizes such as these, but I don't think it's directly relevant, other than that said suggestion might outmode this previous suggestion of yours)
Yes. My later post outmodes my previous suggestion.
Well... The 121/1225n is just outside the ±0.25 tina range of 1 tina, at 1.26 tinas, and we're still considering it. (I see that you've noticed this yourself; more on this topic later then).
I now think the 0.26 tina error isn't necessarily a problem, provided the 121/1225n metacomma almost-always causes a comma to jump between semitina buckets that are 2 semitinas apart (rarely 3 semitinas). Of course those times it causes a 3 semitina jump cannot count towards its occam as 1 tina, but will instead count towards its (hopefully-small) occam as 1.5 tinas. So a given comma can appear in more than one group, in this search.
But you make a good point that none of the commas I've suggested on account of their occurrence count as metacommas have had their own badness vetted in any way.
I no longer think metacommas need to have their own badness vetted in any way.
But I will take a moment to suggest that we could refer to these most recent results of mine as "occams" for short: occurrence count as metacommas. Or just "occurrences as metacommas" (which I find slightly preferable for some reason to simply "oam"s).
"Occams" works for me. :)
I believe this may be an exhaustive list, then, of all the things we've ever taken into consideration in evaluating commas:
  1. unpopularity (compressed N2D3P9)
  2. uselessness (expanded ATE & AAS)
  3. error
  4. inconsistency (zeta-map)
  5. low "occam" (occurrence count as metacommas)
  6. high newly notated 2,3-free class count
Good work! I can't think of anything else.
Dave Keenan wrote: Mon Nov 02, 2020 1:12 pm I just noticed that 121/1225n is 1.26 tinas, and so is outside the ±0.25 range that I used to eliminate 2 tina candidates. i.e. it is slightly closer to an odd multiple of a half-tina and so could be notated with a dot.
Alright. But here's the thing. Mightn't we care more about being able to exactly notate 27 good commas using the 121/1225n (internally, as the value of the tina mark inside that Insane precision level symbol), rather than only 18 with the 10241/5n (that's 50% more), than about whether we need a dot to notate 121/1225n itself?
If you need to use a dot, in what sense would it be the comma for the horn without a dot? But there is still a chance we can use 121/1225n without a dot, as described above.
The 121/1225n, i.e. as the primary comma of the symbol which looks like a single "horn" against a bare shaft, is just a single comma, and thus of like 1/27th as much importance, no?
My earlier preference for 10241/5n had nothing to do with the horn-and-bare-shaft symbol. It was because it had the highest occam of the metacommas within 1±0.25 tinas.
I think we should use a slightly more nuanced error metric here, punishing those past ±0.25 tinas, sure, but not ruling them out entirely. Only once you reach ±0.5 tinas would it be ∞, maybe. I do like your (abs(2 * tinavalue - 2 * round(tinavalue))) error. Maybe it should look like a semicircle, or maybe it should look like a cosine curve.
I prefer the thing I've described above, where it is the difference in the integer semitina bucket numbers of the two commas (that differ by the metacomma under consideration) that matters. This will allow metacommas slightly outside the ±0.25 tina tolerance.
Let me verify my understanding of your example. The example is that the 6-tina 65/77n has a low N2D3P9 of 200.818, and there's a desirable 6.5-tina candidate out there in 13/37n with a low N2D3P9 of 329.574, but to get there from 65/77n requires a metacomma with quite a high N2D3P9 of 1626.744, the 77/185n (the two commas share only a single prime factor in 13, and they don't even cancel out! they're on the same side. so sad). But your point is that this high N2D3P9 is unimportant.
You got it.
I agree. I think we could say that generally speaking, the deeper we go in accent precision, the less important unpopularity becomes (I may have made a claim like this before, and you may have agreed with it, if memory serves). The same goes for uselessness (and therefore complexity). Error, however, gets more important the deeper you go. I'm not sure I could make a strong argument for whether inconsistency, occam, or newly-notating gets more or less important the deeper you go... maybe I don't need to worry about it.
Maybe. But I agree we needn't worry about it.
However, the way you say "after all we've been through to get here" makes me think N2D3P9 plays no role in the final decision. Maybe you're not saying that, but if you are, I don't agree with that. I mean, we use N2D3P9 in the badness metric to find the best commas from which the metacommas are drawn, which feeds the occurrence count as metacommas, which we use to rate tina candidates; so N2D3P9 is still there, just indirectly.
Agreed. I removed the "I hate to say it ...".
So here's a suggestion: what if it really is a 2-step process? Since you found that error had barely any effect on match count (101 to 102) for Extreme commas, and therefore was unlikely involved in their selection, perhaps we leave error out of that part. We collect the best commas per zone by complexity (unpopularity & uselessness), gather metacomma counts, and then rate the final tina candidates by a metric which is a combination of occam, error, inconsistency, and newly notated 2,3-free class count.
I'm willing to ignore newness here. When we multiply the number of commas by a factor of (809×2)/233 ≈ 7, we're bound to notate lots of new 2,3-equivalent classes. And with my integer bucket-jump scheme for assigning metacomma occurrences to semitinas, we don't need to consider metacomma error at all. So I'm willing to go with the consistent metacomma that has the highest occam for each tina accent, and the zero zeta-mapped metametacomma that has the highest occam for the 0.5 tina dot.

And I wouldn't mind running the whole thing again, using the LPEI (u = 1.5) badness to choose the best comma for each zone, to see if it changes the result at all.
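A tiny sketch of that selection rule, with occams as a per-bucket Counter of metacommas and zeta_tina_mapping a hypothetical stand-in for mapping a comma through the zeta-peak val:

Code:

# Sketch: highest-occam candidate that the zeta-peak val maps to the
# intended number of tinas; returns None if no candidate is consistent.
def pick_accent_comma(bucket_occams, intended_tinas):
    for metacomma, _count in bucket_occams.most_common():
        if zeta_tina_mapping(metacomma) == intended_tinas:
            return metacomma
    return None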
In other words, we use the #1 and #2 metrics — popularity and usefulness — to determine the commas which produce metacommas which become indirectly consolidated up into the occam score, which is #5, which along with #3, #4, and #6 become their own metric.
Cool.
I feel like my brain is being taken over by machine elves, or passing through the stage in Penrose's CCC where the death of the universe is physically equivalent to the Big Bang. Kind of crazy, but might be right.
I love that feeling. Just not all the time. :)
I'm with you so far. Though I can't yet guess why we'd round to the nearest integer multiple of a semitina. Maybe it'll become clear if I re-read this a few more times. (I can, btw, understand why we only care about symbols with no mina accents).
It's so we're looking at how many semitina buckets the metacomma jumps, for each occurrence, rather than worrying about the size of the metacomma itself.
Yes, that is what I did above. Except that I used their actual size to group them. Is your integer difference trick supposed to make code run more efficiently or somehow be written more succinctly,
No.
or would it actually change the results?
Yes.
If I understand correctly, I'm still taking metacommas between list-A and the unrounded list-B, and just using the rounded version of list-B as a grouping helper.
No. It's integers (not turtles) all the way down.
Oooooh, okay. So this whole bit here is exclusively in order to find the pesky semitina comma. Right?
Right. :)
So this really is a direct alternative to the previous metametacomma suggestion you gave me which I wrangled with earlier in this post.
Actually, no. That part is the same. I've just explained it less ambiguously.

Post by cmloegcmluin »

Dave Keenan wrote: Mon Nov 02, 2020 4:47 pm On the assumption that I wouldn't hear from you until tomorrow, I was editing all those posts like crazy — anticipating many of your questions and objections below.

But to reward you for responding today, I won't force you to reread all those posts. :) I'll just answer your questions below.
That was nice of you, thanks. Perhaps, though, it is I who should have dropped a quick courtesy post alerting you to my presence, since I know as you do that it's hard to see who else is on the forum at a given moment.
Well... The 121/1225n is just outside the ±0.25 tina range of 1 tina, at 1.26 tinas, and we're still considering it. (I see that you've noticed this yourself; more on this topic later then).
I now think the 0.26 tina error isn't necessarily a problem, provided the 121/1225n metacomma almost-always causes a comma to jump between semitina buckets that are 2 semitinas apart (rarely 3 semitinas). Of course those times it causes a 3 semitina jump cannot count towards its occam as 1 tina, but will instead count towards its (hopefully-small) occam as 1.5 tinas. So a given comma can appear in more than one group, in this search.
I think we should use a slightly more nuanced error metric here, punishing those past ±0.25 tinas, sure, but not ruling them out entirely. Only once you reach ±0.5 tinas would it be ∞, maybe. I do like your (abs(2 * tinavalue - 2 * round(tinavalue))) error. Maybe it should look like a semicircle, or maybe it should look like a cosine curve.
I prefer the thing I've described above, where it is the difference in the integer semitina bucket numbers of the two commas (that differ by the metacomma under consideration) that matters. This will allow metacommas slightly outside the ±0.25 tina tolerance.
It's so we're looking at how many semitina buckets the metacomma jumps, for each occurrence, rather than worrying about the size of the metacomma itself.
But you make a good point that none of the commas I've suggested on account of their occurrence count as metacommas have had their own badness vetted in any way.
I no longer think metacommas need to have their own badness vetted in any way.
And with my integer bucket-jump scheme for assigning metacomma occurrences to semitinas, we don't need to consider metacomma error at all. So I'm willing to go with the consistent metacomma that has the highest occam for each tina accent, and the zero zeta-mapped metametacomma that has the highest occam for the 0.5 tina dot.
*click* That all makes sense to me now. Thanks. I mean, I understand it well enough that I can implement it and I get the gist of why it's a substitute for error. But I don't feel it in my bones yet how it works, or maybe what I mean is, I don't understand it well enough that I could imagine having come up with it myself. Maybe I will attain this level of understanding through implementing in the code. At the moment, it seems like there's probably something genius about it, like it's better than a substitute, but actually an amazing upgrade, like you got right at the root of the special kind of error particular to the problem we most truly care about. It certainly at least addresses my concerns about dismissing the 121/1225n for being just over that ±0.25 tina edge.

I'm willing to ignore newness here. When we multiply the number of commas by a factor of (809×2)/233 ≈ 7, we're bound to notate lots of new 2,3-equivalent classes.
I agree we should ignore newness. If we wanted to do newness right, we couldn't treat each comma assignment independently; each comma's choice would influence each other comma's, which I'm pretty sure would increase the complexity of our task a million fold or something like that.

Even if we're not going to use it moving forward, for what it's worth: when speaking of what you're calling "newness" for short here in the negative sense (aligned with unpopularity, uselessness, inconsistency, etc., underscoring how, in the inverted score, points are bad), I think we could call it "redundancy". That seems to me to be the best word to capture "having a low count of newly exactly notated 2,3-free classes".
And I wouldn't mind running the whole thing again, using the LPEI (u = 1.5) badness to choose the best comma for each zone, to see if it changes the result at all.
To be clear: I did use LPEI with u=1.5 in my previous results to choose the best comma for each zone. I only suggested we might simplify from LPEI to LPE as an offshoot of my suggestion that we introduce this 2-phase process (1. gather metacommas, 2. score metacommas) when I noticed it was maybe weird that we use error in both phases and error hadn't had much effect previously when finding best commas in zones (which is involved in phase 1 of gathering metacommas).

Regardless which came first, I will run the thing once with LPEI and once with LPE, and see how much things change. I would expect: very little.

To summarize, here's what I actually did previously:

phase 1, gather metacommas: unpopularity, uselessness, error
phase 2, score metacommas: low occam (as judged by buckets based on actual metacomma size)

Then I proposed we do it this way instead:

phase 1, gather metacommas: unpopularity, uselessness, error
phase 2, score metacommas: low occam (as judged by buckets based on actual metacomma size), error, inconsistency, redundancy

But before I ever went back to the code to do it that way, we discussed further and we're now thinking something more like this:

phase 1, gather metacommas: unpopularity, uselessness, try once w/ error and once w/o
phase 2, score metacommas: low occam (as judged by new and improved integer-jumping, magically-error-sensing buckets, rather than buckets based on actual metacomma size), error, inconsistency, redundancy

And then, as the cherry on top, find the most common metametacomma between consecutive most common metacommas per semitina bucket, and that should be our semitina winner. :lol:

I don't mean to be a stickler for terminology, or obsessed with terminology as an end in itself. I'm trying to only bring up names for things as I feel they become almost necessary for me to hold the entire problem clearly in my head at once. We've been using the words "zone", "bucket", and "group" roughly interchangeably, though, and I think it might help to settle on one of them for each particular purpose. I've almost certainly been guilty of mixing them around myself. And it looks like you're already standardizing around what I'm going to suggest here, but again, I'm just surfacing it so it's one less thing we have to disambiguate moving forward.

We've been using "zone" for "capture zone" and "secondary comma zone" which are of the same nature: continuous bands of pitch between a lower and upper bound within which commas are found. So I'd like to continue using that word in phase 1 when we find the best commas per zone, which we use to get metacommas.

Then in phase 2, those metacommas get organized before they get sorted by descending occurrence count. I think "bucketing" is a great word for this purpose. These are the buckets which previously I was using the actual metacomma size to bucket by, but now I'm going to use the integer tina difference from tina-rounded mina-less symbols to bucket by. I'll call them the "semitina buckets". Only the whole tina buckets are candidates for tinas directly; the semitina's candidates come from the metametacommas thereupon.
"Occams" works for me. :)
For the record, I am aware of the irony.

Post by Dave Keenan »

To be clear: I was using "bucket" as synonymous with "zone", because I thought you were. I'm happy to stick with "zone" for the things that hold commas. I understand you want to re-purpose "bucket" for what I was calling "group", i.e. the things that hold metacommas. That's fine with me. So metacomma occurrences are put into buckets according to how many zones they jump. e.g. The metacomma occurrence that is the difference between the existing minaless-symbol comma for zone 5 and the best (LPE or LPEI) comma for zone 7 (or zone 3), gets put into bucket abs(7-5) = 2. These zone and bucket numbers all have units of semitinas.

I'm worried by what I've bolded below.
cmloegcmluin wrote: Tue Nov 03, 2020 3:26 am These are the buckets which previously I was using the actual metacomma size to bucket by, but now I'm going to use the integer tina difference from tina-rounded mina-less symbols to bucket by. I'll call them the "semitina buckets". Only the whole tina buckets are candidates for tinas directly; the semitina's candidates come from the metametacommas thereupon.
I'd say instead:
These are the buckets which previously I was using the actual metacomma size to bucket by, but now I'm going to use the integer semitina difference from semitina-rounded mina-less symbol [commas] to bucket by. I'll call them the "semitina buckets". Only the even semitina buckets are candidates for tina [accents] directly; the half-tina dot's candidates come from the metametacommas [between consecutive semitina buckets].

Post by cmloegcmluin »

I agree with your revisions to my paragraph, and that's how I actually did it in the code.

Results are almost ready, my next post is 95% done.

Post by cmloegcmluin »

Actually, got a question for you, before I polish anything up:

I've decided that when calculating the error in phase 1, I'll still go by tina error, not semitina error, even though, since we're slicing into semitina-sized zones, we'll never make it as high as the 0.5 tina error we used to, instead maxing out at 0.25 tina error. Does that work for you, or should I go by semitina error?

Post by Dave Keenan »

cmloegcmluin wrote: Tue Nov 03, 2020 8:12 am Actually, got a question for you, before I polish anything up:

I've decided that when calculating the error in phase 1, I'll still go by tina error, not semitina error, even though, since we're slicing into semitina-sized zones, we'll never make it as high as the 0.5 tina error we used to, instead maxing out at 0.25 tina error. Does that work for you, or should I go by semitina error?
I take it this error is only used to calculate LPEI badness, where it is multiplied by u = 1.5. You noted earlier that I was using semitina error for this purpose in my spreadsheet, so that's what I was expecting you to use now. It's whatever the zone width is. I note that what you've done is equivalent to using semitina error but with u = 0.75. I'd prefer to see it with u = 1.5, so if we find that going from LPE to LPEI doesn't change the most common metacomma in each bucket, then we'll know it's that much more robust.

Post by cmloegcmluin »

Dave Keenan wrote: Tue Nov 03, 2020 8:22 am I take it this error is only used to calculate LPEI badness, where it is multiplied by u = 1.5.
Correct.
You noted earlier that I was using semitina error for this purpose in my spreadsheet, so that's what I was expecting you to use now.
Ha. To you, it may have looked like I "noted" that, but I certainly wasn't noting it intentionally at that time. I guess I just expended all my energy learning the Excel syntax to decipher your fancy formula, and then double-checking a few values I was getting from my code against your spreadsheet. So I failed to notice that your AERR column actually hits its maxes of 0.5 at n.25 and n.75 tinas, not at n.5 tinas. I had noticed that you were multiplying both the actual tina value and the rounded tina value by 2, and I think I even offhandedly claimed that I "liked" your formula at some point, but I didn't take the time to realize that the multiplying by 2 wasn't a mere scaling of a result, but actually a clever trick to change the resolution of the rounding, so that you were actually taking the semitina error, not the tina error. Cool. I get it now. Sorry about that. I guess that's what I get for blitzing through all this in a couple of days rather than taking the time to build it up carefully one piece at a time.
It's whatever the zone width is. I note that what you've done is equivalent to using semitina error but with u = 0.75. I'd prefer to see it with u = 1.5, so if we find that going from LPE to LPEI doesn't change the most common metacomma in each bucket, then we'll know it's that much more robust.
So actually I was doing semitina error all along without understanding that, and by not changing my code, it's still correct (and was before).
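For my own sanity, a quick check of that factor-of-two bookkeeping:

Code:

# With semitina-wide zones, "tina error to the nearest semitina" is exactly
# half the semitina error, so weighting it by u = 1.5 behaves like semitina
# error weighted by u = 0.75, just as Dave noted.
def semitina_error(tinas: float) -> float:
    """Distance from the nearest semitina, in semitinas (max 0.5)."""
    return abs(2 * tinas - round(2 * tinas))

def tina_error_to_nearest_semitina(tinas: float) -> float:
    """The same distance expressed in tinas (max 0.25)."""
    return abs(tinas - round(2 * tinas) / 2)

for t in (0.1, 1.26, 4.74):
    assert 1.5 * tina_error_to_nearest_semitina(t) == 0.75 * semitina_error(t)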

Post by cmloegcmluin »

Alright, done!


Dave Keenan wrote: Sat Sep 05, 2020 10:50 am
Here's the full val I used, up to the 59-limit, if you'd like to double-check it. Graham Breed's tool (http://x31eq.com/temper/net.html) seems to be not working for me, but it wasn't hard to find it myself.

Code:

⟨ 8539 13534 19827 23972 29540 31598 34903 36273 38627 41482 42304 44484 45748 46335 47431 48911 50232 50643 51798 52513 52855 53828 54437 55296 56357 56855 57096 57565 57794 58238 59676 60058 60610 60789 61645 61809 62289 62751 63050 63484 63904 64041 64704 64832 65085 65209 65931 66612 66831 66939 67152 67466 67568 68069 68360 68644 68922 69014 69283 69460 ]
I checked it up to prime 59 and I concur.
I just noticed that what I sent you previously was actually up to the 281-limit. I suspect my confusion must have been related to the fact that 281 is the 60th prime, and the final index of a zero-indexed array of length 60 is 59. *shrug*

Point is: I'm filtering inconsistent metacommas out myself now, so you don't have to worry about it.

I decided to filter them as a final step, that is, I didn't filter them since I didn't think inconsistency of a metacomma should affect whether it counts toward the metametacomma choice.

None of the most common metacommas per bucket were inconsistent anyway, though.
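For reference, here's roughly how I understand the consistency check, using the first few entries of the val above. One step of this mapping is one tina, since the apotome, 3^7/2^11, maps to 7 × 13534 − 11 × 8539 = 809 steps. The 5-schisma here is just a familiar example, not one of our candidates:

Code:

import math

# First few entries of the val above, keyed by prime.
VAL = {2: 8539, 3: 13534, 5: 19827, 7: 23972, 11: 29540, 13: 31598}

def mapped_tinas(monzo: dict) -> int:
    """Map a comma's monzo through the val: steps = sum of exponent * val[p]."""
    return sum(exp * VAL[p] for p, exp in monzo.items())

def actual_tinas(monzo: dict) -> float:
    """Size in tinas: cents divided by (apotome cents / 809)."""
    cents = 1200 * sum(exp * math.log2(p) for p, exp in monzo.items())
    tina = 1200 * math.log2(3**7 / 2**11) / 809
    return cents / tina

# Example: the 5-schisma, 32805/32768 = [-15 8 1>, is consistent:
schisma = {2: -15, 3: 8, 5: 1}
assert mapped_tinas(schisma) == round(actual_tinas(schisma))  # both 14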



Man, this *really* would have been a great thing to pair program on. I think I struggled a lot on exactly the kind of stuff you could have guided me through.



I'm going to list results per whole tina bucket, only including those within the top 20% of the maximum occam in the given bucket, as you did before.

The following comes from the badness metric being LPEI. Shortly will follow the LPE results too.

So according to these results, the 10241/5n narrowly edges out the 121/1225n:

CANDIDATES FOR SEMITINA 2
[
    ["10241/5n", 13],
    ["121/1225n", 12],

The one you declared the winner for 2 tinas yesterday wins here too:

CANDIDATES FOR SEMITINA 4
[
    ["1/5831n", 12],

Of course the 1-mina wins:

CANDIDATES FOR SEMITINA 6
[
    ["1/455n", 37],

The 4-tina result agrees with your most recent summary of the results bucketed by actual metacomma size:

CANDIDATES FOR SEMITINA 8
[
    ["3025/7n", 17],
    ["49/1045n", 15],

Same goes for the 5-tina:

CANDIDATES FOR SEMITINA 10
[
    ["2401/25n", 15],

Of course we have our 2-mina:

CANDIDATES FOR SEMITINA 12
[
    ["65/77n", 23],

The 7/425n barely ekes out a victory here too (as it did for the bucketing based on actual metacomma size):

CANDIDATES FOR SEMITINA 14
[
    ["7/425n", 12],
    ["143/1715n", 11],
    ["1729n", 10],
    ["119/11n", 10],

Interesting. The 77/13n does not win here. You recently wrote:
We agreed long ago that 8 tinas should be 77/13n because it is the schisma-complement of 6 tinas as 65/77n. And you've just shown that 77/13n is also a fairly common metacomma.
viewtopic.php?f=10&t=430&p=1493&hilit=s ... ment#p1493
Being the tina complement of a mina might be just another rule that trumps any of these considerations. Which I'm fine with. (I may go back and edit that list of considerations I drew up recently to include this one last thing, which we did talk about a bit in the earlier days of this project). What do you think?

CANDIDATES FOR SEMITINA 16
[
    ["187/175n", 16],
    ["385/19n", 13],
    ["77/13n", 13],

The 9-tina was a landslide victory and again agrees with the buckets based on actual metacomma size:

CANDIDATES FOR SEMITINA 18
[
    ["1/539n", 34],

Alright, then how about metametacommas? There are actually 19 of them, not 18, since I include the one from the unison to the most common metacomma for the first semitina (the half-tina dot's bucket). Please let me know if you specifically, intentionally wanted to exclude that metametacomma for some reason. In this case it is the 77/185n. That results in there being a tie for the most common metametacomma: the 77/185n, with 2 occamms (that's occurrences as meta-metacomma), one from 0 to 0.5 as just described, and one from 9 to 9.5 (the metametacomma from 6 to 6.5 is not actually the 77/185n here, because the 2125/7n was found to be the most common metacomma in the 6.5 bucket). The other metametacomma is the 21385/11n, which occurred between 3 and 3.5, and between 5.5 and 6. Between these two I certainly prefer the 77/185n, though this particular metametacomma technique didn't give an obvious winner.
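In sketch form (subtract hypothetical as before), the count I ran was:

Code:

from collections import Counter

# 19 meta-metacommas, not 18: the unison is prepended so the difference
# from semitina 0 to semitina 1 (the half-tina dot's bucket) is included.
def count_metametacommas(most_common_per_semitina, unison):
    chain = [unison] + most_common_per_semitina  # entries for semitinas 0..19
    return Counter(subtract(hi, lo) for lo, hi in zip(chain, chain[1:]))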



Alright, how about with the badness metric being LPE? (Without error, it's not a full-fledged badness metric, only a complexity metric, but since we're using it in a position where metrics up to badness get used, I think I'll just treat it as one.)


The answers, thankfully, are remarkably similar:

For the 1-tina, it's again super close between the 10241/5n and 121/1225n. This time they tie.

CANDIDATES FOR SEMITINA 2
[
    ["10241/5n", 12],
    ["121/1225n", 12],

For the 2-tina, the 1/5831n still wins, though it's a bit closer this time.

CANDIDATES FOR SEMITINA 4
[
    ["1/5831n", 11],
    ["35/1573n", 9],

(skipping the 3-mina)


Very similar for the 4-tina, again, just a bit closer. I'm beginning to wonder whether the fact that including error in the badness metric very slightly increases the margins of victory for the most common metacommas (in other words, consolidates their turning up) means it's doing a good job.

CANDIDATES FOR SEMITINA 8
[
    ["3025/7n", 17],
    ["49/1045n", 16],

Same exact result for 5-tina.

Skipping the 2-mina.

Here the 7-tina actually gets a slightly stronger victory, running against the pattern we've seen so far, but it's the same victor.

CANDIDATES FOR SEMITINA 14
[
    ["7/425n", 13],
    ["119/11n", 11],
    ["143/1715n", 11],

Again our preferred 8-tina does not win, but again it's close.

CANDIDATES FOR SEMITINA 16
[
    ["187/175n", 16],
    ["77/13n", 15],

And again a landslide victory for the 9-tina.

Alright, and so then what do we see for the metametacommas? This time, not a single dupe: 19 different metametacommas. Now, I do notice that we have several ties for most common metacomma per semitina bucket, so I could re-run that and try each one tied for first. But I suspect this won't be worth the trouble, as this method isn't seeming super fruitful. And/or you'll say that because the 77/185n turned up previously both in a direct badness search and in my bucketing-by-actual-metacomma-size search, and because its competitors almost never show their heads anywhere, we should just go with it.



Here is the list I think we should go with:

0.5 tina: 77/185n
1 tina: 10241/5n
2 tinas: 1/5831n
3 tinas: 1/455n
4 tinas: 3025/7n
5 tinas: 2401/25n
6 tinas: 65/77n
7 tinas: 7/425n
8 tinas: 77/13n
9 tinas: 1/539n

This is still 37-limit. The whole tinas are 19-limit. There are 7 superparticulars out of the 10.