cmloegcmluin wrote: ↑Wed Jul 08, 2020 1:37 am
Dave Keenan wrote: ↑Tue Jul 07, 2020 3:23 pm
This is missing the point. You can't calculate a from c and w. The point is:
I'm pretty confused by this section. I've gone over it several times (enough to notice your edits to it). I understand all the points you make about monotonic functions not changing the ranking, and those arithmetic laws, but I don't understand how these add up or function in relation to the introduction of this section. In other words, I don't understand how these two things explain that one can't calculate a from c and w, how I've missed the point, how these two things are the point, or how the difference in what is the point is related to not being able to calculate a from c and w.
Perhaps I overstated the case. There is a sense in which you can calculate a from c and w; in fact you can calculate the other two from any one of them. But that's only after you have found one set of them, call them a_1, c_1 and w_1. Then, given a different c, you can calculate the corresponding w and a as:
w = w_1 × c/c_1
a = a_1^(c_1/c)
But in regard to finding the first set, you must fix a, not calculate it anew every time you try a new c or w during the search. Or you could choose to fix c, and vary a and w. Or fix w and vary a and c. But I find that fixing a (as 2) and using the lb(p) notation* makes things easier to think about.
* And when necessary g × lb(p), e.g. when two submetrics would otherwise have needed different log bases.
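The rescaling rule can be checked numerically. The following is only a minimal sketch: the function names `term` and `rescale` and the sample (prime, exponent) pairs are made up for illustration, and the starting tuple takes a_1 = 2 with the w, c and y values quoted elsewhere in this thread. It assumes a per-prime contribution of the form (log_a(p) + w) × n^y + c × n and confirms that moving to a different c rescales every contribution by the same factor c/c_1, which is why the ranking cannot change.

```python
import math

def term(p, n, a, w, c, y):
    """One prime's contribution: (log_a(p) + w) * n**y + c * n."""
    return (math.log(p, a) + w) * n ** y + c * n

def rescale(a1, w1, c1, c):
    """Given one known tuple (a1, w1, c1) and a new c, return the matching (a, w)."""
    return a1 ** (c1 / c), w1 * c / c1

# Starting tuple: a_1 = 2 (i.e. lb), with w, c, y as quoted in the thread.
a1, w1, c1, y = 2.0, -2.02, 0.571, 0.493
c = 1.0                      # hypothetical new value of c
a, w = rescale(a1, w1, c1, c)
scale = c / c1

# Hypothetical (prime, exponent) pairs: every term scales by the same factor.
for p, n in [(5, 1), (7, 2), (11, 1), (13, 3)]:
    before = term(p, n, a1, w1, c1, y)
    after = term(p, n, a, w, c, y)
    assert math.isclose(after, scale * before, rel_tol=1e-9)
```

Because every (a, w, c) tuple generated this way gives the same metric up to overall scale, a search that leaves all three free has a whole line of equally good answers, which is the reason for fixing one of them.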
Would it not have to be a condition that all of the parameters to the coapfar for which c is a coefficient be the same as the parameters to the soapfar with the related a and w?
No. You only need to be able to scale the soapfar and the coapfar by the same factor, so that you are scaling the whole metric without changing the proportions of each.
Well... but I think we slice & dice this stuff differently. See below.
But you agree the two ways of "slicing and dicing" are mathematically equivalent.
I think this means that I waste resources searching for metrics which use all three.
Yes. And it means that the minimum, instead of being a bowl, is a canyon of infinite length, the same depth all the way along. Your search could waste a lot of time zig-zagging along the bottom of that canyon, in the hope of finding a drop-off, or a rising end, that will never come. All that would happen is that the metric would grow and grow, or shrink and shrink, in overall scale, without changing the ranking or the SoS, until some number overflows or underflows.
I can't imagine this with the same clarity that you can, but it might explain why my script ran overnight and when I woke up it was still running but hadn't found anything better than it had found in the first fraction of a second.
I'm guilty of mixing metaphors. Sorry. I earlier talked more realistically about hyperpolyhedral regions where SoS didn't change. In that metaphor, you had to imagine the SoS as a scalar field (physics sense, not mathematical sense) over the multidimensional parameter space. In that case you could at least visualise a 3D space (for 3 parameters) with the SoS being the colour (e.g. like a 3D heat map) at every point of that space. And we are trying to find the coldest/bluest point.
But in the metaphor used above, I am taking the SoS to be a spatial dimension (the vertical one), and so you can only visualise 2 parameters. The SoS forms a surface, a landscape, with hills and valleys, and we are trying to find the lowest point.
If I describe the above problem using the heatmap metaphor, it is that when you don't fix a, there is no single point of minimum temperature. Instead there is an infinite blue thread.
Setting w or c to zero doesn't work because that's not consistent with applying a monotonic function to the metric.
I thought that we had shown that c can be redundant with w because c is a coefficient on the count and w is a constant added to each prime and those are equivalent. So assuming you get the other parameters (k and j in particular, I think) in the right place there shouldn't be anything you could express with w that you couldn't express with c, and vice versa, and that's why I thought you could set one or the other to 0.
I think I said "almost" equivalent, and I think that was when the y values under discussion were closer to 1 than they are now. The only time c and w are equivalent is when y = 1 [or when the same y is applied to both soapfar and copfar]. That's one reason I wrote it with the Σ_p outside of everything, so you could see more clearly the relationship between w and c.
(lb(p)+w) × n_p^y + c × n_p
Unless y=1, it's polynomial in n_p, so you can't combine the coefficients (lb(p)+w) and c.
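A small numerical check may make this concrete. This is a sketch only: `term` and `merged` are hypothetical helper names, and the sample primes and exponents are arbitrary. It compares the two-coefficient form (lb(p)+w) × n^y + c × n against an attempt to fold c into w, i.e. (lb(p)+w+c) × n^y.

```python
import math

def term(p, n, w, c, y):
    """Two separate coefficients: (lb(p) + w) * n**y + c * n."""
    return (math.log2(p) + w) * n ** y + c * n

def merged(p, n, w_total, y):
    """Attempt to fold c into w: (lb(p) + w_total) * n**y."""
    return (math.log2(p) + w_total) * n ** y

w, c = -2.02, 0.571
cases = [(5, 1), (5, 2), (7, 3)]   # hypothetical (prime, exponent) pairs

# With y = 1 the two forms agree for every exponent n.
assert all(math.isclose(term(p, n, w, c, 1.0), merged(p, n, w + c, 1.0))
           for p, n in cases)

# With y != 1 they agree only where n**y == n (here n = 1).
y = 0.493
diffs = [term(p, n, w, c, y) - merged(p, n, w + c, y) for p, n in cases]
assert abs(diffs[0]) < 1e-12   # n = 1: no difference
assert diffs[1] > 0.1          # n = 2: the gap is c * (n - n**y)
```

The leftover gap c × (n − n^y) depends on n, so no single replacement value of w can absorb c once y differs from 1.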
I still don't really get most of this. Sorry. It's probably best for me not to try to wrangle with natural language to increase my understanding. I do understand my code and so I'll write some tests to prove facts about how the parameters relate to each other.
Good idea. You should be able to easily confirm that you get the same SoS with the different (a, w, c) sets (really tuples) I have given above (with your original a, a = 2 and a = e). And confirm that you will never get the same SoS with c = 0, although you may still get a usefully low minimum.
Dave Keenan wrote: ↑Wed Jul 08, 2020 12:13 am
So our best 4-parameter metric so far is of the form:
Σ_p(
1 × (lb(p)+w) × n_p^y + c × n_p +
k × (lb(p)+w) × d_p^y + c × d_p
)
where
k ≈ 0.180
y ≈ 0.493
w ≈ -2.02
c ≈ 0.571
Is that correct?
I believe your articulation of it is equivalent, where you slice it by prime first, submetric second, I slice it by submetric first, prime second. I could have designed the code the other way. I don't see any particular reason why it would work better or worse that way. But in any case, what I've got works more like this:
Σ_p(
1 × (lb(p)+w) × n_p^y +
k × (lb(p)+w) × d_p^y
) +
c × Σ_p(
n_p + d_p
)
My spreadsheet actually does it in exactly the same way as your code. Of course it makes no difference how the code or spreadsheet does it. My rewriting was purely intended to aid human comprehension, in figuring out how parameters can or can't be combined or eliminated by rescaling the whole metric.
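The equivalence of the two groupings is easy to confirm in code. The sketch below is illustrative only: the function names, the dict-of-exponents representation, and the sample numerator and denominator are assumptions, not anyone's actual script or spreadsheet. It uses the parameter values quoted above.

```python
import math

def by_prime(num, den, k, y, w, c):
    """Slice by prime first: each prime contributes its weighted and count terms."""
    total = 0.0
    for p in set(num) | set(den):
        n, d = num.get(p, 0), den.get(p, 0)
        total += (math.log2(p) + w) * n ** y + c * n
        total += k * (math.log2(p) + w) * d ** y + c * d
    return total

def by_submetric(num, den, k, y, w, c):
    """Slice by submetric first: weighted sum over primes, plus c times the count."""
    primes = set(num) | set(den)
    weighted = sum((math.log2(p) + w) * num.get(p, 0) ** y
                   + k * (math.log2(p) + w) * den.get(p, 0) ** y
                   for p in primes)
    count = sum(num.get(p, 0) + den.get(p, 0) for p in primes)
    return weighted + c * count

# Hypothetical example: numerator 5^2 * 7, denominator 11, as {prime: exponent}.
num, den = {5: 2, 7: 1}, {11: 1}
params = dict(k=0.180, y=0.493, w=-2.02, c=0.571)
assert math.isclose(by_prime(num, den, **params), by_submetric(num, den, **params))
```

Both functions return the same value up to floating-point rounding, so the choice between them is purely about which layout is easier to read.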
I think the next question is, what's the best one we can find that is less complex than that?
How well can we do with:
Σ_p(
( lb(p)+w) × n_p^y +
(g×lb(p)+ z ) × d_p^y
)
Still 4 parameters, but arguably simpler. It could be tried with a simple n>d numinosity as well as with sopfr(n)>sopfr(d).
Why is it called g now instead of k?
It's not very important. I thought it would be confusing if I called it k because k is traditionally outside the brackets of (lb(p)+w). But yes, I could have written say k × (lb(p) + b). But I wanted a multiplier directly on the lb(p), so that it corresponded directly to a change of log base. And yes, I could have written (k×lb(p) + k×b), but it seemed strange to have k×b when it could just be a single parameter z. But if I write (k×lb(p) + z) then the k can no longer be taken outside the brackets and so it is no longer the same kind of k that we have been using, and so I thought it best to use g instead, as (g×lb(p) + z), because I had already been using g in my attempted explanation of change-of-base being equivalent to change-of-scale.
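The correspondence between a multiplier g on lb(p) and a change of log base can be sketched as follows. The helper name `base_for` and the value g = 0.5 are hypothetical, chosen only so that the resulting base comes out as a round number.

```python
import math

def base_for(g):
    """Log base b such that log_b(p) == g * lb(p), i.e. b = 2**(1/g)."""
    return 2.0 ** (1.0 / g)

g = 0.5              # hypothetical multiplier
b = base_for(g)      # 2**(1/0.5) = 4.0

# Multiplying lb(p) by g gives the same number as taking the log in base b.
for p in (5, 7, 11):
    assert math.isclose(g * math.log2(p), math.log(p, b))
```

So writing g × lb(p) in one submetric is just a way of letting that submetric use a different log base while keeping the lb notation everywhere.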
We found something as low as 0.00622 with an equivalent way of writing that, back when you asked me to try a c=0 variant on the lowest one I'd found at that time.
OK. But I thought you might set your wonderful new machine churning away to try to find a lower SoS for that form of metric.
By the way, I would call that a 6-parameter one myself, in terms of chunks of complexity presented to the user, as you've got
- the lb
- y
- w
- g
- z (because it's different than w)
- a point for the numinator/diminuator concept which people have to understand, unless this is truly n and d, but I assume they are numinator and diminuator because we've found better results that way and I'm pretty sure we've been assuming this for a while
Why don't we call it a 6-chunk metric, and keep the term "parameter" for the numbers you change during a search for a SoS minimum? You may know that "chunk" is actually a respectable term in cognitive psychology.
https://en.wikipedia.org/wiki/Chunking_(psychology)