developing a notational comma popularity metric

Post by cmloegcmluin »

volleo6144 wrote: Fri Aug 07, 2020 6:30 am Wait, why would you be getting NaNs from antivotes calculations anyway?
Oh, gosh, who knows... probably getting some complex numbers in there when something negative gets its root taken or whatever. That antivotes calculation is beastly with all the possible parameters in the mix together.
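
For what it's worth, here's a minimal JavaScript illustration of that failure mode, assuming the culprit is something like a fractional power applied to a negative intermediate value (the actual offending expression could be anything in that antivotes formula):

```
// A negative base raised to a fractional exponent has no real result, so in
// JavaScript these all come back NaN rather than a complex number:
console.log(Math.pow(-2, 0.5)); // NaN
console.log((-2) ** 1.5);       // NaN
console.log(Math.sqrt(-2));     // NaN

// And once a NaN shows up, it propagates through the rest of the calculation:
console.log(NaN + 1);           // NaN
```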

Post by cmloegcmluin »

Dave Keenan wrote: Fri Aug 07, 2020 7:50 am ...the vast majority of those metrics must be complete rubbish. There ought to be some way to reject them very quickly. Is that what you're saying you do, "in a few milliseconds"? Do the others then go on to take many seconds?
Off the top of my head I can think of a few ways I could make it smarter in order to run faster. Since it would be making it smarter it hardly feels right to call it quicker and dirtier, but I could try these sorts of tricks.

I'll have to get back to you with details on the runtimes. I recently changed it so that it doesn't recursively search during the first pass. It would have been nice to know the average runtime per combination it checked. Unfortunately I don't have the total counts of metrics it checked handy either, sorry. Perhaps I should have just waited until after work to try to respond.
Dave Keenan wrote: Does that mean that wb should have come out of your 3-chunk search, along with laj (which I call kl)?
I have only a minute before I have to get back to work, but I can confirm that yes, wb came through (in the top 5). My quick reference for the definition of laj says it uses log base 4/3 instead of 2, so it wouldn't have shown up in my results, since I locked the base down to 2. I don't find the equivalent of your kl as defined here, but I'll have to investigate soon, because I expect it should be there.

Post by cmloegcmluin »

Alright, so:
  • `text{kl}` didn't show up in the 3-chunk results, because it's a 4-chunker: gpf itself, the l weight, the lb, and the k.
  • `text{laj}` shouldn't really be a 3-chunk metric either, by our newly shared understanding re: logarithms; we need to lock the base down to 2 (rather than 4/3) and then use a coefficient to correct.
Dave Keenan wrote: I didn't realise how many 4-chunk metrics you were examining. I figure it can't be more than 25x24x23x22 = 303600. And 57 h 10 min is 205800 s. So that's an average of 678 ms per metric, which is faster than Excel. Well done.
I'll work out the total metrics it considered for 4 chunks:

It's 126 + 5600 + 61425 + 140400 = 207551. So just about 1 second per metric on average.

126 +

all combinations of 4 submetrics = 6 choose 4 w/ repeated elements = ((4+6-1)!)/((4!)((6-1)!)) = 126
but that times all combinations of 0 parameters = 25 choose 0 w/ repeated elements = ((0+25-1)!)/((0!)((25-1)!)) = 1
so 126 * 1 = 126, but then that times 1 b/c for each one you can distribute the parameters across the submetrics 5^0 ways
so 126 * 1 = 126

5600 +

all combinations of 3 submetrics = 6 choose 3 w/ repeated elements = ((3+6-1)!)/((3!)((6-1)!)) = 56
but that times all combinations of 1 parameters = 25 choose 1 w/ repeated elements = ((1+25-1)!)/((1!)((25-1)!)) = 25
so 56 * 25 = 1400, but then that times 4 b/c for each one you can distribute the parameters across the submetrics 4^1 ways
so 1400 * 4 = 5600

61425 +

all combinations of 2 submetrics = 6 choose 2 w/ repeated elements = ((2+6-1)!)/((2!)((6-1)!)) = 21
but that times all combinations of 2 parameters = 25 choose 2 w/ repeated elements = ((2+25-1)!)/((2!)((25-1)!)) = 325
so 21 * 325 = 6825, but then that times 9 b/c for each one you can distribute the parameters across the submetrics 3^2 ways
so 6825 * 9 = 61425

140400

all combinations of 1 submetric = 6 choose 1 w/ repeated elements = ((1+6-1)!)/((1!)((6-1)!)) = 6
but that times all combinations of 3 parameters = 25 choose 3 w/ repeated elements = ((3+25-1)!)/((3!)((25-1)!)) = 2925
so 6 * 2925 = 17550, but then that times 8 b/c for each one you can distribute the parameters across the submetrics 2^3 ways
so 17550 * 8 = 140400

I'm not sure how you came up with that 25x24x23x22 = 303600 estimate, but it wasn't far off!
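
Here's that bookkeeping spelled out in JavaScript, in case it helps check my arithmetic (this is just an illustrative sketch, not the actual solver code; the 6 submetric types and 25 parameter types are the counts used above):

```
// multichoose(n, k): combinations of k items drawn from n types, with
// repetition allowed, i.e. C(n + k - 1, k), computed multiplicatively
function multichoose(n, k) {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n + k - i)) / i;
  }
  return result;
}

const SUBMETRIC_TYPES = 6;
const PARAMETER_TYPES = 25;
const TOTAL_CHUNKS = 4;

let total = 0;
for (let submetrics = 1; submetrics <= TOTAL_CHUNKS; submetrics++) {
  const parameters = TOTAL_CHUNKS - submetrics;
  const submetricCombos = multichoose(SUBMETRIC_TYPES, submetrics);
  const parameterCombos = multichoose(PARAMETER_TYPES, parameters);
  // the distribution step in the breakdown above uses a base of (submetrics + 1)
  const distributions = Math.pow(submetrics + 1, parameters);
  total += submetricCombos * parameterCombos * distributions;
}

console.log(total); // 126 + 5600 + 61425 + 140400 = 207551
```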

------

For 4-chunk metrics, the maximum count of samples for a scope (which is the term for a metric where each dynamic parameter hasn't resolved to a specific value yet, but is instead a range of values to sample from a min to a max) is currently 600^3 = 216,000,000, since there are a few parameters which check every value in a range that is 6 wide, at a resolution of 0.01. The smallest count of samples for a scope would be 1, since some parameters are not dynamic (e.g. using the numinator instead of the numerator, or your modified count trick, or using the prime index instead of the prime itself, etc.), i.e. they are either present or not. And I'm going to very roughly guesstimate, based on the proportions and the submetric-count breakdowns detailed above, that it checks an average of 10,000,000 samples per scope. So I'm checking about 2 trillion possibilities for 4 chunks, in the first pass.
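
In code, that sample-count arithmetic looks something like this (purely illustrative; the 6-wide range and 0.01 resolution are the ones mentioned above, and the function name is made up):

```
// samples for one dynamic parameter swept across its min..max range
const samplesForParameter = (rangeWidth, resolution) =>
  Math.round(rangeWidth / resolution);

const perParameter = samplesForParameter(6, 0.01); // 600
console.log(Math.pow(perParameter, 3));            // 216000000 for a 3-parameter scope
```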

Since I changed it to not recursively search during the first pass, it doesn't actually hit blue threads of death anymore, so for the vast majority of the time it runs it never hits the 10 s timeout. It's only at the very tail-end step, where it runs that recursive mode for the winners (only about 15 thousand scopes, each with an average of more like 200 samples, since it starts out way more focused: everything within one step of the first run's resolution, but at an order of magnitude higher resolution), that it'll hit blue threads of death and time out.

Post by cmloegcmluin »

The re-run for chunk count 4 is running, by the way. I fixed the problem with NaNtivotes so this run should produce good results. See you again Sunday morning (my time). Sorry again for the delay.

I also verified that none of the results in chunk counts 1 through 3 were corrupted by NaNtivotes.

I'm not quite ready to share the final results from chunk count 3 because the perfecter layer doesn't actually work yet. I really wanted to get this chunk count 4 run off though, so I re-envisioned the perfecter layer as something I'd run separately afterwards on the output from the solver.

Post by Dave Keenan »

cmloegcmluin wrote: Fri Aug 07, 2020 1:33 pm Alright, so:
  • `text{kl}` didn't show up in the 3-chunk results, because it's a 4-chunker: gpf itself, the l weight, the lb, and the k.
I'm glad we now agree that kl is more complex than wb, as I suggested here:
viewtopic.php?f=4&t=493&p=2121#wb

So kl has the same chunk count as wyb, and wyb beats it on both SoS figures.

Post by cmloegcmluin »

Indeed, we are in agreement there.

Edit to the above: The maximum sample count for 4-chunk metrics is actually only 60^3 = 216,000, a thousand times fewer than I said previously. I forgot I'd ratcheted the resolution down by an order of magnitude per parameter. So it's actually only about 2 billion possibilities I'm checking, not 2 trillion.

Post by Dave Keenan »

Is this the simplest way to adjust sopfr to improve it? Has this been mentioned before? How many chunks do you give it?

sopfr(n*d) + c*copfr(d)
c = -1.75, SoS = 0.009004, SoS(1) = 18153

sopfr(n*d) + c*copfr(d)
= sopfr(n) + soapfr(d), where ap = p + c
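
For concreteness, here's that metric sketched in JavaScript (the primeFactors helper and the example ratio are just for illustration; this isn't anyone's actual code):

```
// primeFactors(n): prime factors of n with multiplicity, e.g. 12 -> [2, 2, 3]
function primeFactors(n) {
  const factors = [];
  let remaining = n;
  for (let p = 2; p * p <= remaining; p++) {
    while (remaining % p === 0) {
      factors.push(p);
      remaining /= p;
    }
  }
  if (remaining > 1) factors.push(remaining);
  return factors;
}

// sopfr: sum of prime factors with repetition
const sopfr = (n) => primeFactors(n).reduce((sum, p) => sum + p, 0);

// copfr: count of prime factors with repetition
const copfr = (n) => primeFactors(n).length;

// the metric above, for a ratio n/d
const c = -1.75;
const metric = (n, d) => sopfr(n * d) + c * copfr(d);

// e.g. for 5/7: sopfr(35) + c*copfr(7) = (5 + 7) - 1.75 = 10.25
console.log(metric(5, 7)); // 10.25
```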

Post by Dave Keenan »

Here's a kind of metric we haven't considered before. The "altered-prime" function is piecewise linear.

soapfr(n*d) + c*copfr(d), where ap = if p<15 then p else p/3+10

c = -1.909228732, SoS = 0.00695829, SoS(1) = 13709

soapfr(n*d) + c*copfr(d), where ap = if p<15 then p else p/3+10
= soapfr(n) + sompfr(d)
where ap = if p<15 then p else p/3+10
and mp = ap + c = if p<15 then p+c else p/3+10+c

The 15 and the 1/3 are rounded parameters. i.e. they were suggested by the data. The 10 is just a combination of the 15 and the 1/3 that makes the "then" and the "else" be the same when p = 15. i.e. it makes the two linear pieces meet at (15, 15).
(p-15)/3 + 15 = p/3 + 10
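
Sketched in JavaScript, purely for illustration (the helpers mirror the sketch a few posts up; none of this is anyone's actual code):

```
// ap: the piecewise-linear "altered prime": p below 15, then slope 1/3,
// with the two linear pieces meeting at (15, 15)
const ap = (p) => (p < 15 ? p : p / 3 + 10);

// prime factors with multiplicity
function primeFactors(n) {
  const factors = [];
  let remaining = n;
  for (let p = 2; p * p <= remaining; p++) {
    while (remaining % p === 0) {
      factors.push(p);
      remaining /= p;
    }
  }
  if (remaining > 1) factors.push(remaining);
  return factors;
}

const soapfr = (n) => primeFactors(n).reduce((sum, p) => sum + ap(p), 0);
const copfr = (n) => primeFactors(n).length;

// the metric above
const c = -1.909228732;
const metric = (n, d) => soapfr(n * d) + c * copfr(d);

console.log(ap(13)); // 13 (below the breakpoint, unchanged)
console.log(ap(17)); // 15.666... (17/3 + 10)
```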

By rounding c to -2, you could calculate this mentally in many cases.

This piecewise linear function is actually a better fit to the data than the offset log functions we've been using, such as ap = lb(p)+w.

I think this is psychoacoustically plausible. i.e. I think it's plausible that we hear primes 17 and above differently from primes 13 and below.

When I free up the parameters which are set to 15 and 1/3 above, I get:

soapfr(n*d) + c*copfr(d), where ap = if p<15.24816546 then p else 15.24816546 + 0.354541887*(p-15.24816546)

c = -1.814421168, SoS = 0.006847526, SoS(1) = 13523

Post by cmloegcmluin »

Dave Keenan wrote: Sat Aug 08, 2020 7:04 pm sopfr(n*d) + c*copfr(d)
c = -1.75, SoS = 0.009004, SoS(1) = 18153

sopfr(n*d) + c*copfr(d)
= sopfr(n) + soapfr(d), where ap = p + c
Dave Keenan wrote: Is this the simplest way to adjust sopfr to improve it?
I'm not sure what you mean by that. One way of defining "the simplest way to adjust sopfr to improve it" would be a ten-way tie between all the 2-chunk metrics for which one chunk is sopfr and the other chunk is anything else (a list of ten things which does not include this metric, which I would define as 3 chunks, the two extra chunks being the c and either the copfr or the soapfr).
Dave Keenan wrote: Has this been mentioned before?
I'm not sure, but I've been aware of this relationship.
Dave Keenan wrote: How many chunks do you give it?
3; see above.

------
Dave Keenan wrote: Sat Aug 08, 2020 10:57 pm Here's a kind of metric we haven't considered before. The "altered-prime" function is piecewise linear.

soapfr(n*d) + c*copfr(d), where ap = if p<15 then p else p/3+10

c = -1.909228732, SoS = 0.00695829, SoS(1) = 13709
It's not my favorite, but I'd like to look into it. I understand it. I like how it's easy to calculate. That's interesting that it's a better fit. I accept it to be psychoacoustically plausible, though I doubt anyone would say it seems any more plausible than an offset log.

Would you want to count it as 2 chunks then, since the 10 is simply a byproduct of the 1/3 and 15 (15 - 1/3 * 15)? Or did you think the entire mechanism was just 1 chunk? I don't have a strong opinion on this.

Thanks for staying creative about the metrics while I've got to get creative on the implementation of our finder!

------

Speaking of which, in trying to extract the layer of the code which perfects the metrics with a higher-resolution and recursive search (after the cheaper first pass to identify the best place to start with that) I hit upon a great idea: pull the plug on a recursive search not when a timeout is hit, but whenever the difference between a sum of squares and the next sum of squares is smaller than a given threshold!

I set this threshold at 0.000000001 (one billionth) currently, since that's the precision we've regularly been giving our sums of squares to. But I don't think we need that precise an answer to make our decisions, so I'll see how much of a difference it makes to set it higher. A millionth would probably be fine, and wouldn't result in skipping any important local-minima paths deeper into the SoS canyon for any given metric.
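
Roughly, the stopping rule looks like this (names are hypothetical, not my actual code; `search` stands in for one higher-resolution pass over a scope):

```
const SOS_IMPROVEMENT_THRESHOLD = 0.000000001; // one billionth, for now

// keep recursing into higher-resolution searches of a scope until the
// improvement in the sum of squares drops below the threshold
function refine(scope, previousSos, search) {
  const { bestSos, narrowedScope } = search(scope);
  if (previousSos - bestSos < SOS_IMPROVEMENT_THRESHOLD) {
    return bestSos; // not worth digging deeper into the SoS canyon
  }
  return refine(narrowedScope, bestSos, search);
}
```

Calling it with previousSos = Infinity ensures the cutoff can't trigger on the very first pass.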

I think this is going to work great, and I may even be able to remove the timing-out mechanisms altogether, which as I explained before cause the code to slow down dramatically. It turns out it's really expensive, at least in JavaScript, at least with my level of command of software, to constantly interrupt sets of big calculations to check on their progress and then elegantly abort things if necessary. I can barely get it to work at all, honestly, in terms of actually interrupting at the specified time anyway... usually it gets the message to stop, but it can't manage to actually stop all the things it already kicked off.

I've already seen a ~20% increase in speed just from refactoring the code a bit to remove all traces of timers from the path the first non-recursive phase takes, since as I mentioned I realized it never times out anymore now that it's been made non-recursive. Had I managed to ship this improvement before kicking off chunk count 4 a second time, it might only have taken about 48 hours.

Post by Dave Keenan »

cmloegcmluin wrote: Sun Aug 09, 2020 2:12 am
Dave Keenan wrote: Sat Aug 08, 2020 7:04 pm Is this the simplest way to adjust sopfr to improve it?
I'm not sure what you mean by that. One way of defining "the simplest way to adjust sopfr to improve it" would be a ten-way tie between all the 2-chunk metrics for which 1 chunk is sopfr and the other chunk is anything else (a list of ten things which do not include this, which I would define as 3 chunks, the two extra chunks being the c and either the copfr or soapfr).
Ah yes. Quite ambiguous. Sorry. I meant "adjust the value given by sopfr(n*d)", i.e. calculate sopfr(n*d) then adjust it. So soapfrs don't qualify unless they can be rewritten as adjusted sopfrs in that sense.
Dave Keenan wrote: Sat Aug 08, 2020 10:57 pm Here's a kind of metric we haven't considered before. The "altered-prime" function is piecewise linear.

soapfr(n*d) + c*copfr(d), where ap = if p<15 then p else p/3+10

c = -1.909228732, SoS = 0.00695829, SoS(1) = 13709
cmloegcmluin wrote: It's not my favorite, but I'd like to look into it. I understand it. I like how it's easy to calculate. That's interesting that it's a better fit. I accept it to be psychoacoustically plausible, though I doubt anyone would say it seems any more plausible than an offset log.

Agreed.
cmloegcmluin wrote: Would you want to count it as 2 chunks then, since the 10 is simply a byproduct of the 1/3 and 15 (15 - 1/3 * 15)? Or did you think the entire mechanism was just 1 chunk? I don't have a strong opinion on this.
Yes. 2 chunks for the 2 parameters of the piecewise-linear function, so 5 total. The two parameters are 1. where it stops being p, and 2. the slope after that.
cmloegcmluin wrote: I've already seen a ~20% increase in speed just from refactoring the code a bit to remove all traces of timers from the path the first non-recursive phase takes, since as I mentioned I realized it never times out anymore now that it's been made non-recursive. Had I managed to ship this improvement before kicking off chunk count 4 a second time, it might only have taken about 48 hours.
You don't need a 20% speedup, you need a 100-times speedup. Can't you rewrite it in something that runs on the bare metal, like C++, instead of running on a virtual machine?

But while you're in virtual-machine-land, do you have any code profiling tools, so you can find out where it's spending most of its time? And just speed that up.

I'm guessing a lot of time is spent computing the base-2 logs of prime numbers. How about just computing them once and sticking them in a sparse array, indexed by the prime number?
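
Something like this, perhaps (a sketch of the caching idea, not a claim about how your actual code is organized):

```
// compute lb(p) once per prime and reuse it; a plain object keyed by the
// prime acts as the sparse array
const lbCache = {};

function lb(p) {
  if (!(p in lbCache)) {
    lbCache[p] = Math.log2(p);
  }
  return lbCache[p];
}

console.log(lb(3)); // 1.584962500721156 (computed once)
console.log(lb(3)); // 1.584962500721156 (from the cache)
```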