Here's my experience:
I generally feel that for every three or four things that we try, one works.
Then we take the thing that works and try to improve on it or at least do a lot more of it.
And we study the ones that didn't and see what we can learn from them.
Then, of course, whatever worked the first time usually stops working.
The things that didn't work get tried again and work the second time.
Is it random? Maybe.
My personal theory is that the logic is in the numbers. We are usually working with small samples, so the statistics are misleading. Patterns only become clear once we have tens of thousands of examples, but we're often trying to interpret results based on dozens or hundreds, which is far too few.
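To make that concrete, here's a rough sketch of my own (the 25% success rate is just a made-up illustration): how wide the pure-noise band around an observed rate is at different sample sizes, using a simple normal-approximation 95% interval.

```python
# Illustration only: how much an observed success rate can wander from
# sampling noise alone, at different sample sizes.
# Assumes a hypothetical true success rate of 25%.
import math

true_rate = 0.25  # made-up underlying "one in four things works" rate

for n in (10, 100, 1000, 10000):
    se = math.sqrt(true_rate * (1 - true_rate) / n)  # standard error of the observed rate
    margin = 1.96 * se                               # ~95% margin of error
    print(f"n={n:>6}: observed rate is roughly {true_rate:.2f} +/- {margin:.3f}")
```

At a hundred trials the band is roughly plus or minus eight or nine percentage points, so anything from the mid-teens to the low thirties looks like the same underlying process; only somewhere around ten thousand trials does it shrink to under a point.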
BTW, if I ever get to design my own math curriculum, interpreting data and statistics will move from a peripheral topic to a central one. It horrifies me how many people can take differentials (i.e., do calculus), calculate sines and cotangents (geometry 2), and solve linear and quadratic equations (algebra 1 & 2), but can't answer a basic question like: "We need a 25% response rate for this effort to be cost-effective. We've sent out a hundred random emails and ten people responded. Should we keep testing, or is it decided?" (I'm not sure that I know how to answer it myself.)
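For what it's worth, one way to frame that email question (a sketch of one possible framing, not necessarily the right answer, and binom_cdf is just a helper I made up here) is as an exact binomial tail probability: if the true response rate really were 25%, how likely would it be to see ten or fewer responses out of a hundred?

```python
# One possible framing of the email question, illustration only:
# if the true response rate were 25%, how surprising is 10 responses out of 100?
# Uses an exact binomial lower-tail probability; math.comb needs Python 3.8+.
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, responses, needed_rate = 100, 10, 0.25
p_value = binom_cdf(responses, n, needed_rate)
print(f"P(10 or fewer responses out of 100, if the true rate is 25%) = {p_value:.5f}")
```

That probability comes out very small (well under one percent), so under this framing ten out of a hundred is strong evidence that the true rate falls short of 25%; whether to keep testing anyway then becomes a question about the cost of sending more emails rather than about the statistics.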