developing a notational comma popularity metric

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

I found this 3 parameter (4 chunk?) maybe-minimum by fixing `k = 1` and `v = y`, (`c = 0, s=0`, log base is `2`, numerator ≥ denominator), but having a separate `w` for the denominator, which I call `b` (mnemonic: `w`hite and `b`lack).

$$\text{metric}(n,d) = \sum_{p=5}^{p_{max}} \big((\operatorname{lb}{p}+w){n_p}^y + (\operatorname{lb}{p}+b){d_p}^y\big)$$
$$y=0.839, w=-1.465, b=-1.867, \text{ gives } SoS=0.00676$$

Only a little is gained by freeing up `k` and/or `v`. [Hey, those backquotes for AsciiMath are really handy. :)]
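As a rough illustration of the formula above, here is a minimal Python sketch. The function names `prime_factors` and `metric` are hypothetical (they are not taken from either author's code), and the sketch assumes the caller has already arranged the ratio so that numerator ≥ denominator. Primes below 5 are skipped to match the lower bound of the sum.

```python
import math

def prime_factors(n):
    """Return {prime: exponent} for a positive integer n (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def metric(n, d, y=0.839, w=-1.465, b=-1.867):
    """Sum over primes p >= 5 of (lb p + w) * n_p^y + (lb p + b) * d_p^y.

    Assumption: n >= d has already been arranged by the caller.
    Primes absent from n or d contribute nothing, so only the primes
    that actually occur need to be visited.
    """
    total = 0.0
    for p, n_p in prime_factors(n).items():
        if p >= 5:
            total += (math.log2(p) + w) * n_p ** y
    for p, d_p in prime_factors(d).items():
        if p >= 5:
            total += (math.log2(p) + b) * d_p ** y
    return total

# Example: the ratio 7/5 has one 7 in the numerator and one 5 in the denominator.
print(metric(7, 5))
```

Computing the SoS from these values needs the popularity data and ranking procedure discussed elsewhere in the thread, so it is left out here.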

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

cmloegcmluin wrote:
Fri Jul 10, 2020 2:21 am
I have added another layer on top of the code. My previous update was that it could now take an initial configuration (a set of submetrics, parameters, and ranges of settings for the parameters) and recursively bisect along every dimension of the search space and follow all local minima branches until it found a global minimum. My latest update is the ability for it to start from zero chunks and work its way up, for each chunk count calculating every possible such initial configuration (for the parameter ranges, it starts with the widest reasonable range for each), performing the recursive search for each one, and returning the best possible metric for that chunk count.
Awesome work.
I'm not ready to share out any grand conclusions, though:
  • It can't get past 3 chunks yet, because it keeps drowning in infinite blue threads. This was happening before but I wasn't able to figure it out because I was straight away attacking 8- and 9- chunk metrics which would run so slowly with so much noise I couldn't understand where things were going wrong. It has become clear now that I will eventually need to get my head around every parameter relationship in order to proceed. But that's fine. I've already identified a few more combinations that don't play nice.*
  • I am butting up against the limits of the JavaScript engine in various ways that are going to force some performance-related changes.
  • I am butting up against the limits of my own faculties to reason about this system in ways that are going to force me to statically type the code now, which will take a bit of time to retroactively apply.
Re the blue-threads: To eliminate them, you have to express the metric with the \(\sum_p\) on the outside, then inside that you need to fully expand it, by distributing multiplications over additions. Then look at all the terms of the resulting inner sum, and ensure that at least one term is either parameter-free, or has no parameter that corresponds to a scale factor. That fixes the scale of the metric.

If all the terms have parameters corresponding to scale factors, then all parameters can be scaled by the same amount without changing the metric, and we then have a line minimum instead of a point minimum. To fix this, just replace any one of these parameters with a constant.

For example, you can't have \(jpn_p + kpd_p\) (which is the inner sum of \(j\operatorname{sopfr}n + k\operatorname{sopfr}d\) ). You'd have to change that to \(\operatorname{sopfr}n + k\operatorname{sopfr}d\) or \(j\operatorname{sopfr}n + \operatorname{sopfr}d\). And you can't have \(jpn_p + kpd_p + cn_p + cd_p\) (which is the inner sum of \(j\operatorname{sopfr}n + k\operatorname{sopfr}d + c\operatorname{copfr}(nd)\) ). But you can have \(jpn_p + kpd_p + n_p + cd_p\) or \(jpn_p + kpd_p + n_p + d_p\).
[It gives me such joy to be able to typeset these math expressions]
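The line-minimum problem described above can be seen in a small Python sketch. The names `sopfr`, `two_param_metric` and `ranking`, the sample ratios, and the values of `j` and `k` are all made up for illustration. It assumes the SoS depends only on the ranking the metric produces: scaling `j` and `k` together leaves that ranking, and so the SoS, unchanged.

```python
def sopfr(n):
    """Sum of prime factors of n with repetition (n is assumed 2,3-free here)."""
    total, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            total += p
            n //= p
        p += 1
    return total + (n if n > 1 else 0)

def two_param_metric(n, d, j, k):
    """The problematic form j*sopfr(n) + k*sopfr(d): every term carries a scale factor."""
    return j * sopfr(n) + k * sopfr(d)

def ranking(ratios, j, k):
    """Ratios sorted from lowest to highest metric value."""
    return sorted(ratios, key=lambda r: two_param_metric(r[0], r[1], j, k))

# Hypothetical sample ratios (numerator, denominator), not the thread's data set.
RATIOS = [(5, 1), (7, 1), (25, 1), (7, 5), (11, 1), (35, 1), (11, 5)]

j, k, scale = 1.0, 0.79, 2.0
same_order = ranking(RATIOS, j, k) == ranking(RATIOS, scale * j, scale * k)
print(same_order)  # True: (j, k) and (2j, 2k) lie on the same "blue thread"
```

Fixing `j = 1` removes that freedom, which is why only `k` remains as a parameter in the allowed forms.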

The trouble is, it's sometimes not obvious when a parameter corresponds to a scale factor, such as when we had variable log bases. It wasn't immediately obvious that a log base `a` was equivalent to a scale factor of \((\operatorname{lb}a)^{-1}\). But at least we've fixed that one.
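For reference, the log-base equivalence follows from the change-of-base identity, so a variable base `a` only multiplies every logarithmic term by the same constant:

$$\log_a p = \frac{\operatorname{lb}p}{\operatorname{lb}a} = (\operatorname{lb}a)^{-1}\operatorname{lb}p$$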
That said, I do have at least a couple fun tidbits that I'm confident enough about to share.

Among the 1-chunk metrics, my code found that the best choice was...
:drum:...
`\text{sopfr}`
aka SoPF>3. Right, no surprises there. It gives the SoS as 0.014206087, which is a bit higher than the 0.011375524 value found back here, but I attribute that to it occurring B.F.R. (Before Fractional Ranks). In this case though I'm happy to get the SoS nicked, as it just means whatever better SoS's we find with our new metrics will be relatively better than that which we're improving upon!
At least it's a sanity check of your code. I confirm that the F.R. SoS of plain `\text{sopfr}(nd)` is 0.014206087. That makes it easier than previously thought, to do twice as good as that, with a low chunk-count metric.
And among the 2-chunk metrics, my code found that the best choice was...
:drum:...
`\text{sopfr}(n) + k × \text{sopfr}(d), k = 0.7901234567901236`
(where n is the numerator and d is the denominator, not the numinator and diminuator)
that decimal for `k` has a suspicious pattern to its digits that a continued fraction calculator assisted me in determining is almost exactly `\frac{64}{81}`. Fascinating! That metric gets you an SoS of 0.009491243, about `\frac{2}{3}` that of SoPF>3.
That is fascinating. It reminds me of another 3-smooth ratio, that occurs in wind-turbine engineering. The Betz limit is the maximum efficiency of any wind turbine, irrespective of geometry. It is \(\frac{16}{27}\), or approximately 59%.

Sorry to be a party pooper, but while I confirm that SoS, I see that it persists for `0.790 <= k <= 0.799`, so several ratios simpler than \(\frac{64}{81}\) are in range. The simplest I can find is \(\frac{19}{24}\). But yes, it gives a worthwhile improvement for only one extra chunk.
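Both observations can be checked with Python's standard `fractions` module. This is only a sketch: the helper `simplest_fraction_in` is a hypothetical name, "simplest" is taken to mean smallest denominator, and the denominator limit of 100 is an arbitrary choice.

```python
from fractions import Fraction
import math

def simplest_fraction_in(lo, hi):
    """Smallest-denominator fraction m/d with lo <= m/d <= hi (lo, hi are Fractions)."""
    d = 1
    while True:
        m = math.ceil(lo * d)  # smallest numerator with m/d >= lo
        if Fraction(m, d) <= hi:
            return Fraction(m, d)
        d += 1

# The decimal reported for k, recovered as a nearby simple ratio.
k_found = 0.7901234567901236
print(Fraction(k_found).limit_denominator(100))  # 64/81

# The simplest ratio inside the plateau 0.790 <= k <= 0.799.
print(simplest_fraction_in(Fraction(790, 1000), Fraction(799, 1000)))  # 19/24
```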
As for 3-chunk metrics, the result was pretty amusing. It spit out
`\text{sopfr} + \text{copfr}(n) + k × \text{copfr}(d), k = -1.\overline{1}`
which gives an SoS of 0.008291286. So it would seem that adding the counts of the primes in the numerator while subtracting the counts of the primes in the denominator gets you a pretty good fit. Of course this doesn't feel psychologically motivated, so I think we should pass over this one.
I can't reproduce that SoS. I assume that by `\text{sopfr}` you mean `\text{sopfr}(nd)`? I get a much higher SoS of 0.018902287. But it is indeed a minimum, and it exists for \(-2<k<-1\).
Speaking of `\text{copfr}` (and `\text{copf}`), I discovered yesterday while chatting about this project with a friend that there's an established name for this function: the prime omega function. This might explain why Dave was using that fancy lookin' w earlier, because that's actually what a lowercase Greek letter omega looks like; little omega `\omega(p)` is our `\text{copf}` and big omega `\Omega(p)` is our `\text{copfr}`. I don't suggest we stop using `\text{copf}` and `\text{copfr}` (or `\text{coapf}` and `\text{coapfar}`, as it were) since those make the relationship with `\text{soapfar}` and `\text{soapf}` clear. But just thought I'd put it out here.
Which Dave are you talking about? Because I certainly didn't know about those. Thanks for letting me know.
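For anyone wanting to compare the two naming schemes, here is a minimal Python sketch of the two prime omega functions. The function names are hypothetical, and the sketch counts all primes, whereas the thread's `\text{copf}` and `\text{copfr}` are normally applied to 2,3-free numbers.

```python
def prime_factorization(n):
    """Return {prime: exponent} for a positive integer n (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def little_omega(n):
    """Count of distinct prime factors (the thread's copf)."""
    return len(prime_factorization(n))

def big_omega(n):
    """Count of prime factors with repetition (the thread's copfr)."""
    return sum(prime_factorization(n).values())

# 175 = 5 * 5 * 7
print(little_omega(175), big_omega(175))  # 2 3
```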
*An interesting example: t and w get stuck in an infinite blue thread, though of a different nature I believe. When t is very tiny, you can nudge it by a tiny amount, and then nudge w by just the right corresponding tiny amount, even though this won't result in the exact same antivotes for each individual ratio, it will result in the aggregate across all 80 ratios an SoS which is very close. So even though w affects each prime and t affects each repetition, and those two counts don't match on each ratio, it can still continuously work out into a subtly different and thus "new" local minimum according to my code even as the space it is searching contracts into an ever tinier and tinier space. None of the obvious fixes I've thought of yet have solved the problem (max recurse depth? lose out on valuable results elsewhere. only pursue local minima which are lower than the one you're coming from? also lose out on results you actually do care about, i.e. sometimes as you stab around you do have to step up and out a little bit to find the correct pit that goes the deepest). My current solution: scratch t from the picture altogether, since we had determined it was a bit problematic anyway.
Definitely of a different nature. I don't understand that one. I found this 5 parameter (6 chunk?) approximate minimum by fixing `k = 1` and `v = y`, (`c = 0, s=0`, log base is `2`, numerator ≥ denominator), but having a separate `w` and `t` for the denominator, which I call `b` and `u`.

$$\text{metric}(n,d) = \sum_{p=5}^{p_{max}} \big((\operatorname{lb}(p+t)+w){n_p}^y + (\operatorname{lb}(p+u)+b){d_p}^y\big)$$
$$y=0.861, w=-2.656, b=-2.829, t=4.471, u=3.850 \text{ gives } SoS=0.00659$$

So it gives barely any improvement over the 3 parameter (4 chunk?) metric in my previous post (SoS = 0.00676), where `t = u = 0`.

cmloegcmluin
Site Admin
Posts: 721
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Dave Keenan wrote:
Fri Jul 10, 2020 12:37 pm
I found this 3 parameter (4 chunk?) maybe-minimum by fixing `k = 1` and `v = y`, (`c = 0, s=0`, log base is `2`, numerator ≥ denominator), but having a separate `w` for the denominator, which I call `b` (mnemonic: `w`hite and `b`lack).

$$\text{metric}(n,d) = \sum_{p=5}^{p_{max}} \big((\operatorname{lb}{p}+w){n_p}^y + (\operatorname{lb}{p}+b){d_p}^y\big)$$
$$y=0.839, w=-1.465, b=-1.867, \text{ gives } SoS=0.00676$$
I get 0.007085843 for that on my end.
Re the blue-threads: ... But at least we've fixed that one.
I'm afraid I just can't understand any of this. We might need to go to the whiteboard at this point.
while I confirm that SoS, I see that it persists for `0.790 <= k <= 0.799`, so several ratios simpler than \(\frac{64}{81}\) are in range. The simplest I can find is \(\frac{19}{24}\). But yes, it gives a worthwhile improvement for only one extra chunk.
That's just fine. I do wonder what compelled my code to pluck 64/81 as its ratio of choice from that stepped area. But importantly, yes, I agree re: that simple of a metric representing a worthwhile improvement. Not that we should give up looking for >2 chunk metrics!
As for 3-chunk metrics, the result was pretty amusing. It spit out
`\text{sopfr} + \text{copfr}(n) + k × \text{copfr}(d), k = -1.\overline{1}`
which gives an SoS of 0.008291286. So it would seem that adding the counts of the primes in the numerator while subtracting the counts of the primes in the denominator gets you a pretty good fit. Of course this doesn't feel psychologically motivated, so I think we should pass over this one.
I can't reproduce that SoS. I assume that by `\text{sopfr}` you mean `\text{sopfr}(nd)`? I get a much higher SoS of 0.018902287. But it is indeed a minimum, and it exists for \(-2<k<-1\).
I do mean `\text{sopfr}(nd)`, yes. I just double-checked and I'm definitely getting that. Well, damn. We've been able to reproduce each other's results more often than not. I still never reproduced your "mcopfr" ones. I wonder where we're going wrong now. Perhaps it's a communication issue or perhaps one of us is doing something funky in our machinery.
I found this 5 parameter (6 chunk?) approximate minimum by fixing `k = 1` and `v = y`, (`c = 0, s=0`, log base is `2`, numerator ≥ denominator), but having a separate `w` and `t` for the denominator, which I call `b` and `u`.

$$\text{metric}(n,d) = \sum_{p=5}^{p_{max}} \big((\operatorname{lb}(p+t)+w){n_p}^y + (\operatorname{lb}(p+u)+b){d_p}^y\big)$$
$$y=0.861, w=-2.656, b=-2.829, t=4.471, u=3.850 \text{ gives } SoS=0.00659$$

So it gives barely any improvement over the 3 parameter (4 chunk?) metric in my previous post (SoS = 0.00676), where `t = u = 0`.
I can reproduce this one, yes.

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

cmloegcmluin wrote:
Sat Jul 11, 2020 3:38 pm
Dave Keenan wrote:
Fri Jul 10, 2020 12:37 pm
$$y=0.839, w=-1.465, b=-1.867, \text{ gives } SoS=0.00676$$
I get 0.007085843 for that on my end.
Whadya know. So do I! It seems I lost the local minimum by rounding the parameters for publication. But I've now found a lower minimum, by manually nudging the parameters around.

$$y=0.8568, w=-1.4457, b=-1.8630, \text{ gives } SoS=0.006282743$$
Maybe your software can find an even lower minimum for this metric. It is my favourite metric so far.

It's up against:

$$k = 0.1796875, y = 0.4921875, w = -2.0197149, c = 0.570961685, \text{ gives } SoS=0.004059522$$
This is more accurate, but it has 2 more chunks (the `\text{copfr}()` and the `k` parameter) and seems harder to explain psychologically, or perhaps we should say "psychoacoustically".
Re the blue-threads: ... But at least we've fixed that one.
I'm afraid I just can't understand any of this. We might need to go to the whiteboard at this point.
I'm afraid I don't know what I could do on a whiteboard that would make any difference. But if you do as I say, for the specific examples I gave (repeated in hopefully more familiar terms below), then that's probably all that matters.

You can't have \(j\operatorname{sopfr}(n) + k\operatorname{sopfr}(d)\). You'd have to change that to \(\operatorname{sopfr}(n) + k\operatorname{sopfr}(d)\). And you can't have \(j\operatorname{sopfr}(n) + k\operatorname{sopfr}(d) + c\operatorname{copfr}(nd)\). But you can have \(\operatorname{sopfr}(n) + k\operatorname{sopfr}(d) + c\operatorname{copfr}(nd)\) or \(j\operatorname{sopfr}n + k\operatorname{sopfr}d + \operatorname{copfr}(n) + c\operatorname{copfr}(d)\). At least one term must have a constant multiplier (equal to 1 in these examples), to fix the scale of the metric. And there should be no variable log bases.
As for 3-chunk metrics, the result was pretty amusing. It spit out
`\text{sopfr} + \text{copfr}(n) + k × \text{copfr}(d), k = -1.\overline{1}`
which gives an SoS of 0.008291286. So it would seem that adding the counts of the primes in the numerator while subtracting the counts of the primes in the denominator gets you a pretty good fit. Of course this doesn't feel psychologically motivated, so I think we should pass over this one.
I can't reproduce that SoS. I assume that by `\text{sopfr}` you mean `\text{sopfr}(nd)`? I get a much higher SoS of 0.018902287. But it is indeed a minimum, and it exists for \(-2<k<-1\).
I do mean `\text{sopfr}(nd)`, yes. I just double-checked and I'm definitely getting that. Well, damn. We've been able to reproduce each other's results more often than not. I still never reproduced your "mcopfr" ones. I wonder where we're going wrong now. Perhaps it's a communication issue or perhaps one of us is doing something funky in our machinery.
Is this with `n` ≥ `d` or sopfr(`n`) ≥ sopfr(`d`)? In any case, do you agree that we can ignore this one, since it's both psychologically implausible and not very good?
$$y=0.861, w=-2.656, b=-2.829, t=4.471, u=3.850 \text{ gives } SoS=0.00659$$
So it gives barely any improvement over the 3 parameter (4 chunk?) metric in my previous post (SoS = 0.00676), where `t = u = 0`.
I can reproduce this one, yes.
Good. Thanks. Do you agree it's not worth pursuing `t` or `u` any further, since the extra parameters don't seem to give much improvement, and 6 chunks is too many anyway?

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

We should look at metrics of the form
n × d^y × k^sopfr × c^copfr
And related metrics which are linear or sublinear in n and d rather than logarithmic.

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

Or equivalently, and more conveniently, try including the non-monzo-based g×lb(n) + h×lb(d) along with our existing soapfar and copfr terms.

cmloegcmluin
Site Admin
Posts: 721
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Dave Keenan wrote:
Sat Jul 11, 2020 6:30 pm
cmloegcmluin wrote:
Sat Jul 11, 2020 3:38 pm
Dave Keenan wrote:
Fri Jul 10, 2020 12:37 pm
$$y=0.839, w=-1.465, b=-1.867, \text{ gives } SoS=0.00676$$
I get 0.007085843 for that on my end.
Whadya know. So do I! It seems I lost the local minimum by rounding the parameters for publication.
Phew. That's a relief.
But I've now found a lower minimum, by manually nudging the parameters around.

$$y=0.8568, w=-1.4457, b=-1.8630, \text{ gives } SoS=0.006282743$$
Maybe your software can find an even lower minimum for this metric. It is my favourite metric so far.
Yes. I always intend to when you send me cool stuff like that. I just didn't have the time last night. : D

And I'm especially glad I did this example, because it exposed a problem in my code with how I subdivide the search space! So last night, the first thing I did was plug your metric configuration into my "sum-of-squares for metric" command, just to confirm that exact configuration. But when I plugged the same configuration into my "recursive best metric finder from metric configuration" command — giving it some reasonable nearby ranges in which to search — the code came back with a totally bogus SoS: one that was waaaayyy higher. Of course, the expectation of such a command should be that the best SoS it finds must be at least as low as the one you already know exists in that search space! Otherwise, who knows how many other possibilities it may be failing to find? Having just ported the entire codebase from JavaScript to TypeScript over the last couple days, I was concerned that maybe I had borked something.

Before digging into any code, I tried your configuration again, but with slightly different ranges for the parameters. I also dramatically increased the resolution of the sampling of the search space, from a bisection — 2 points in each dimension, so that it searches `2^n` points where `n` is the number of dynamic parameters — to 5 points in each dimension. Now I got a much closer SoS, but it was still greater than the one I already knew about.

I realized I had been thinking about searching the space wrong. Not that I hadn't thought about it carefully already. Just not carefully enough! The way I have it now, whether it bisects, trisects, or novendecisects, it iterates on each local minimum, searching a space which is `\frac{2}{3}` the size of the previous space. I chose that proportion carefully after triple-checking by graph paper, spreadsheet, and Wolfram Alpha. It is the solution, `x`, in:

$$\sum_{n=1}^{\infty} \frac{1}{2}x^n = 1$$

In other words, assume the worst case: one of the points you sampled, A, was not a local minimum, and yet a point just "next" to it is somehow still the global minimum. So you need to be able to get back to that point, starting from one of the neighboring points; let's pick one and call it B. So if you continue recursion at B and at each recursive step find a local minimum at one of the sample points which is closest to A, you can theoretically get all the way back to A. That's because at each step the search space gets smaller by a factor of `\frac{2}{3}`, and because this search space is centered on the point, only `\frac{1}{2}` of it gets you closer back to A.
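As a check on that proportion, the sum is a geometric series (convergent for \(|x|<1\)), so it can be solved in closed form:

$$\sum_{n=1}^{\infty} \frac{1}{2}x^n = \frac{1}{2}\cdot\frac{x}{1-x} = 1 \implies x = 2(1-x) \implies x = \frac{2}{3}$$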

The critical issue with the previous logic is "at each recursive step find a local minimum at one of the sample points which is closest to A". You won't necessarily find a path from B back to A this way! You could still miss out on valuable metrics. It's another example of that idea that sometimes you need to step up and out a bit to find the best pit down. And no reduction of search space per recursive step `\frac{2}{3} ≤ x ≤ 1` would solve that problem. Probably if I just read a book on optimization it would be clear that if I want total confidence that I found the best SoS, I may be setting myself up for disappointment. We could never know that for sure.

It's clear now that it's super important at the outset to have a very high resolution, because as I described before, my automated finder starts with the widest reasonable range for each parameter. So I will definitely need to redo my previous results for 2- and 3-chunk metrics. I didn't bork the code by refactoring it; it was already borked! So I'll get back to you soon on that!

The problem with setting the resolution really high, though, is that it slows the code down a ton. This is exponential growth of sample points. It's intractable to set the resolution much higher than 5.

But this morning, addressing this issue, I had a good idea. I am going to change my code so that instead of subdividing the search space by a flat constant — 2, 15, or 31 — it chooses a different subdivision count at each recursive iteration. It will choose a count which is at least 2, but one which also satisfies a condition of dividing that dimension into units no greater than a new constant. I am thinking since we've generally been sharing our parameters with decimals expanded to the thousandths place, I'll start with 0.001. In other words, the search unit will be the thing with primacy, not the subdivision count. That way I'll know that the only possible better metrics would be found by adjusting parameters past the thousandths place. And it will also cause the resolution to slope off gracefully, where before I had stuck myself with the choice of either ratcheting up the resolution and having things crash or run for hours, or getting no-good results.
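A minimal Python sketch of that idea follows. It is not the actual TypeScript implementation: the names `subdivision_count` and `sample_points` are hypothetical, and "subdivision count" is assumed here to mean the number of equal intervals a parameter's range is cut into.

```python
import math

def subdivision_count(lo, hi, max_unit=0.001, min_count=2):
    """Fewest equal subdivisions of [lo, hi] whose width is at most max_unit,
    but never fewer than min_count (so the search always at least bisects)."""
    width = hi - lo
    return max(min_count, math.ceil(width / max_unit))

def sample_points(lo, hi, max_unit=0.001):
    """Evenly spaced sample points across [lo, hi], including both ends."""
    count = subdivision_count(lo, hi, max_unit)
    step = (hi - lo) / count
    return [lo + i * step for i in range(count + 1)]

# A wide range gets many samples; a tiny range still gets bisected.
print(subdivision_count(0.0, 1.0, 0.25))    # 4
print(subdivision_count(0.0, 0.125, 0.25))  # 2
print(sample_points(0.0, 1.0, 0.25))        # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Because the count is derived from the remaining width, it shrinks toward the minimum as the recursion narrows the range, which is the graceful slope-off described above. The total number of points per step is still the product of the counts over all dynamic parameters.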

Without further ado, I gotcha down to 0.006136416. Unfortunately my code does not yet support locking the value of a given parameter across all submetrics it is used in, but that's on the agenda for today. In other words, my result has caused v and y to deviate: y=0.8735, v=0.8901, w=-1.4957, b=-2.0463. So perhaps you're uninterested in that result.
perhaps we should say "psychoacoustically"
I'm onboard with "psychoacoustically" over "psychologically", yes.
I'm afraid I don't know what I could do on a whiteboard that would make any difference. But if you do as I say, for the specific examples I gave (repeated in hopefully more familiar terms below), then that's probably all that matters.
Yes, I think in this case my command of the concepts is insufficient to work with abstractions. But these concrete examples make sense to me. And so I do not think there's anything important you understand about what needs to be done that I don't at least have some semblance of understanding about. I mean, the code won't run to completion until I resolve these things on a parameter-combo-by-parameter-combo basis.
Is this with `n` ≥ `d` or sopfr(`n`) ≥ sopfr(`d`)? In any case, do you agree that we can ignore this one, since it's both psychologically implausible and not very good?
Don't you mean "psychoacoustically implausible"? ;)

I meant `n` ≥ `d`. But yes, I agree we can ignore it. Besides, as I said above, I'll need to redo my code to feel more confident that this is actually the best result for 3 chunks anyway.
Do you agree it's not worth pursuing `t` or `u` any further, since the extra parameters don't seem to give much improvement, and 6 chunks is too many anyway?
Well, `t` and `u` are my `x`, one of the parameters that was giving me trouble the other day and which I have already temporarily stricken from the code. But I'm not ready to banish it forever yet. Just because it doesn't help enough in this specific case, and because it from time to time drowns me in infinite rivers of minima, isn't cause yet for total dismissal. But it is on my short list!
Dave Keenan wrote:
Sun Jul 12, 2020 1:10 am
We should look at metrics of the form
n × d^y × k^sopfr × c^copfr
And related metrics which are linear or sublinear in n and d rather than logarithmic.
Dave Keenan wrote:
Sun Jul 12, 2020 1:46 am
Or equivalently, and more conveniently, try including the non-monzo-based g×lb(n) + h×lb(d) along with our existing soapfar and copfr terms.
Confused again. It sounds like you're proposing a new submetric. Which is welcome! I just don't really get what it is yet. When you write "k^sopfr" it makes me realize that while I currently support k as a coefficient, exponent, or logarithmic base, I do not yet support it as a power base, i.e. I have not enabled the code to instead of raising a submetric to an exponent k, have the submetric be the exponent and k be the base. That's an interesting and distinct idea. I can definitely try that. But I get the sense that you were trying to say something completely different.

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

cmloegcmluin wrote:
Sun Jul 12, 2020 2:37 am
Dave Keenan wrote:
Sun Jul 12, 2020 1:46 am
Or equivalently, and more conveniently, try including the non-monzo-based g×lb(n) + h×lb(d) along with our existing soapfar and copfr terms.
Confused again. It sounds like you're proposing a new submetric. Which is welcome! I just don't really get what it is yet. When you write "k^sopfr" it makes me realize that while I currently support k as a coefficient, exponent, or logarithmic base, I do not yet support it as a power base, i.e. I have not enabled the code to instead of raising a submetric to an exponent k, have the submetric be the exponent and k be the base. That's an interesting and distinct idea. I can definitely try that. But I get the sense that you were trying to say something completely different.
Sorry I mentioned n × d^y × k^sopfr × c^copfr, since I realised we can do the equivalent while staying in log space. Please ignore it. A new sub-metric, yes. g×lb(n) + h×lb(d). But it's not really new. It's equivalent to g×soapfr(n) + h×soapfr(d) where ap = p → lb(p). I just think it might be worth trying that plus a j×sopfar(n) + k×sopfar(d) submetric with ar = r → r^y.

But one of g, h, j, k must be set to 1 to avoid the blue thread of death. e.g. either
g×soapfr(n) + h×soapfr(d) + sopfar(n) + k×sopfar(d) = g×lb(n) + h×lb(d) + sopfar(n) + k×sopfar(d)
or
soapfr(n) + h×soapfr(d) + j×sopfar(n) + k×sopfar(d) = lb(n) + h×lb(d) + j×sopfar(n) + k×sopfar(d)

So 4 parameters, either g, h, k, y or h, j, k, y.
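The equivalence between lb(n) and soapfr(n) with ap = p → lb(p) is just the log-of-a-product rule, and a short Python sketch can confirm it numerically. The helper names and the signature of `soapfr` here are assumptions for illustration only.

```python
import math

def prime_factors_with_repetition(n):
    """List of prime factors of n, repeated by multiplicity (trial division)."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def soapfr(n, ap):
    """Sum of altered prime factors with repetition: each prime p contributes ap(p)."""
    return sum(ap(p) for p in prime_factors_with_repetition(n))

# With ap = lb, the sum of the logs of the prime factors is the log of n itself.
n = 175  # 5 * 5 * 7
print(soapfr(n, math.log2), math.log2(n))
```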

cmloegcmluin
Site Admin
Posts: 721
Joined: Tue Feb 11, 2020 3:10 pm
Location: San Francisco, California, USA
Real Name: Douglas Blumeyer

Re: developing a notational comma popularity metric

Post by cmloegcmluin »

Dave Keenan wrote:
Sun Jul 12, 2020 11:47 am
Sorry I mentioned n × d^y × k^sopfr × c^copfr, since I realised we can do the equivalent while staying in log space. Please ignore it.
Do you have reason to believe raising a constant to the sopfr (or any other submetric result) would not be fruitful?
A new sub-metric, yes. g×lb(n) + h×lb(d). But it's not really new. It's equivalent to g×soapfr(n) + h×soapfr(d) where ap = p → lb(p). I just think it might be worth trying that plus a j×sopfar(n) + k×sopfar(d) submetric with ar = r → r^y.
Okay, I'll get there.

This does make me think that "poapfar" (where the initial 'p' is "product") might be a worthwhile submetric to include in general, whether or not ap = p → lb(p).

Dave Keenan
Site Admin
Posts: 1024
Joined: Tue Sep 01, 2015 2:59 pm
Location: Brisbane, Queensland, Australia

Re: developing a notational comma popularity metric

Post by Dave Keenan »

cmloegcmluin wrote:
Sun Jul 12, 2020 12:22 pm
Dave Keenan wrote:
Sun Jul 12, 2020 11:47 am
Sorry I mentioned n × d^y × k^sopfr × c^copfr, since I realised we can do the equivalent while staying in log space. Please ignore it.
Do you have reason to believe raising a constant to the sopfr (or any other submetric result) would not be fruitful?
No reason whatsoever. It's just that it would be exactly as fruitful as the log of that, because log is a monotonically-increasing function, and a monotonically-increasing function can't possibly make any difference to the ranking and hence the SoS.

First, I note that this y, k and c are not your standard y, k and c.

Now let's take the log of the whole thing,

lb(n × d^y × k^sopfr × c^copfr)
= lb(n) + y×lb(d) + lb(k)×sopfr + lb(c)×copfr

Now we just replace y with h, lb(k) with k and lb(c) with c to obtain
= lb(n) + h×lb(d) + k×sopfr + c×copfr
= soapfr(n) + h×soapfr(d) + k×sopfr + c×copfr, where ap = p -> lb(p)
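The claim that taking the log cannot change the ranking can be checked numerically. The Python sketch below uses made-up sample ratios and made-up values for y, k and c (both k and c must be positive for the log to exist); the function names are hypothetical, and sopfr and copfr are taken over the product nd.

```python
import math

def prime_factors_with_repetition(n):
    """List of prime factors of n, repeated by multiplicity (trial division)."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def sopfr(n):
    return sum(prime_factors_with_repetition(n))

def copfr(n):
    return len(prime_factors_with_repetition(n))

def product_form(n, d, y, k, c):
    """n × d^y × k^sopfr × c^copfr."""
    nd = n * d
    return n * d ** y * k ** sopfr(nd) * c ** copfr(nd)

def log_form(n, d, y, k, c):
    """lb(n) + y×lb(d) + lb(k)×sopfr + lb(c)×copfr, i.e. lb of the product form."""
    nd = n * d
    return (math.log2(n) + y * math.log2(d)
            + math.log2(k) * sopfr(nd) + math.log2(c) * copfr(nd))

# Hypothetical sample ratios and parameter values, for illustration only.
RATIOS = [(5, 1), (7, 1), (25, 1), (7, 5), (11, 1), (35, 1)]
Y, K, C = 0.5, 1.2, 1.1

by_product = sorted(RATIOS, key=lambda r: product_form(r[0], r[1], Y, K, C))
by_log = sorted(RATIOS, key=lambda r: log_form(r[0], r[1], Y, K, C))
print(by_product == by_log)  # True: same ranking, hence same SoS
```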

For the same reason (just take the log of everything) I don't see any point in poapfar. I'm also getting heartily sick of this whole thing and would like to settle on a metric soon.

Also see the recent edits to my previous post.

Not hiking today. Sister doing other things.
