
It's not a 'hottest tech stack', but I would suggest people take 2020 to learn testing. TDD/unit/browser/whatever - look to incorporate testing into your work more often. For me, that has meant making sure the code I'm writing is testable first. I don't do hardcore TDD, but I often write tests more or less concurrently with the little bits of code as I'm writing them.

I don't do this for every single project all the time - I do work on systems that are, essentially, non-unit-testable. While refactoring could be done, clients/owners refuse to give appropriate time/resources to move in that direction. That's their choice, and they pay the productivity price (and often, they are acutely aware of the situation but soldier on anyway).

However, for my own projects, testing/testable code is an increasing focus, and it has helped make my own code/projects easier to think about up front, and easier to modify/maintain/refactor later.



How is this something to learn? It's more of something to try. Anyone who can write code can write tests, and anyone who can write tests can write tests before they code. It's trivial.

Instead, learn formal methods. Learn how to prove your code correct for all cases rather than verifying your code for one test case. This is learning, and it won't be rehashing what you already know the way TDD does. Formal methods are brutally hard.


The concept of correctness by proof, rather than by spraying tests at the code and hoping, is a shift of perspective, but it doesn't have to be brutally hard and doesn't require going all the way to direct application of formal methods (which is often impractical). I encourage people to go partway in the right direction. Instead of telling me your test coverage, tell me how you can prove that the core algorithm of your product is correct. Or how you can prove that it is secure. This kind of thinking is the only thing I've ever seen lead to quality code.
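
To give a flavor of what I mean without going all the way to formal methods, here's a toy sketch (my own illustrative example, not from the product I described): writing down the loop invariant and arguing that each branch preserves it is already a correctness proof in miniature, even with no proof assistant involved.

    def binary_search(xs, target):
        """Return an index of target in sorted xs, or -1 if absent."""
        lo, hi = 0, len(xs)
        while lo < hi:
            # Invariant: if target is in xs, its index lies in [lo, hi).
            # It holds on entry (lo=0, hi=len(xs)) and each branch below
            # preserves it -- that argument is the correctness proof.
            mid = (lo + hi) // 2
            if xs[mid] < target:
                lo = mid + 1
            elif xs[mid] > target:
                hi = mid
            else:
                return mid
        return -1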


I don't mean to say that it's hard in the sense that you can't learn it. I mean it's hard in the sense that it's like you're learning programming from scratch again.

It will be a very different and much more challenging path than learning another framework/language, which is what most people just do over and over again.


I suppose this is true. I haven't had a chance to work with or teach anyone who is learning to think this way, and I don't really remember what it was like for me. However, I've noticed that when I talk to some people who are big advocates of TDD and so on, they seem to have such a different way of looking at things that there's almost no common ground.


The variance arises from the fact that none of it is formalized or theoretical. It's just a bunch of opinions.


>Anyone who can write code can write tests, and anyone who can write tests can write tests before they code. It's trivial.

It's also trivial to create an absolutely brittle mess of a test suite. Building a solid, performant and reliable test suite is an art that, in my experience, the vast majority of devs do not seem to have much skill in.


A test suite is just some code iterating across some test functions.

If you want to add fancy scoping and contexts and assertion shortcuts, go for it, but ultimately this is also trivial. I wouldn't spend too much effort in this area.


Writing good test plans and building testable code is actually a skill with some underlying theory. It's just not usually taught that way.


There's no theory behind testable code. Mathematical theory exists only for formal methods.

There's a bunch of made-up patterns and techniques for writing testable code though. Most of these techniques are actually bad.

Dependency injection with mocks is the one I hear about the most, and it is also the worst possible way to organize your code. Do not write your code using this pattern... the complexity of this pattern hides the fact that it is, in fact, not improving anything.
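
For anyone who hasn't run into it, the pattern I'm criticizing looks roughly like this (illustrative names only, not from any real codebase):

    class EmailSender:                       # "interface" the real and fake senders share
        def send(self, to, body): ...

    class Signup:
        def __init__(self, sender):          # the dependency is injected
            self.sender = sender

        def register(self, email):
            self.sender.send(email, "welcome")

    class MockSender(EmailSender):           # test double standing in for the real sender
        def __init__(self):
            self.sent = []
        def send(self, to, body):
            self.sent.append((to, body))

    def test_register_sends_welcome_email():
        mock = MockSender()
        Signup(mock).register("a@b.com")
        assert mock.sent == [("a@b.com", "welcome")]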


> There's no theory behind testable code. Mathematical theory exists only for formal methods.

<snip>

> Dependency injection with mocks is the one I hear about the most

Correction: you don't happen to know the theory. Nor is it a mathematically super-complicated theory. It's not a replacement for formal methods. The heuristic I use is "formal methods as far as can be straightforwardly done, tests thereafter."

It boils down to how to choose what elements of a parameter space to run experiments on so you can reason by induction with some confidence. I teach it as "boundary and bulk". If I have a parameter that is a list, then the boundary (empty list, one element list, two element list) needs to be probed carefully but in most cases the bulk (fifty element list vs fifty one element list) just needs a couple of samples. Then factorial designs to combine parameters. You reduce the combinatorial explosion of factorial designs by splitting parameters via formal methods. You reduce things like external service dependencies to something susceptible to boundary and bulk using Parnas's trace assertion method.
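
A toy sketch of "boundary and bulk" for a list parameter (hypothetical function, my own example):

    import random

    def count_positives(xs):           # hypothetical function under test
        return sum(1 for x in xs if x > 0)

    # Boundary: probe the edge of the parameter space carefully.
    assert count_positives([]) == 0
    assert count_positives([5]) == 1
    assert count_positives([-1, 3]) == 1

    # Bulk: a couple of samples from the middle is usually enough;
    # check against an independent oracle rather than the code itself.
    for n in (50, 51):
        xs = [random.randint(-100, 100) for _ in range(n)]
        assert count_positives(xs) == len([x for x in xs if x > 0])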

From this point of view, writing testable code is a statement about controlling the complexity of test plans. Things like instead of having a function take a few representations, make it only take a canonical representation and provide adapter functions. For example, if you have a function f(t0, tn) that takes two timestamps, you could have t0 and tn be seconds since epoch, offsets relative to now, or some kind of text date format. If f accepts all three, then you have a test plan of size 9*N. If it accepts just seconds since the epoch, you have N + 2 (for the adapter functions). This kind of calculation provides concrete statements about making code more testable.
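
A rough sketch of the timestamp example (illustrative names and formats, not from any particular codebase):

    import time

    # Canonical representation: f only ever sees seconds since the epoch.
    def f(t0, tn):                      # hypothetical core function
        return tn - t0

    # Adapter functions, each tested separately with a tiny plan of its own.
    def from_offset(seconds_from_now, now=None):
        now = time.time() if now is None else now
        return now + seconds_from_now

    def from_text(text):
        return time.mktime(time.strptime(text, "%Y-%m-%d"))

    # Callers convert at the edges, so f's test plan doesn't multiply
    # by the number of accepted formats.
    duration = f(from_text("2019-03-15"), from_offset(0, now=1_600_000_000))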


>If I have a parameter that is a list, then the boundary (empty list, one element list, two element list) needs to be probed carefully but in most cases the bulk (fifty element list vs fifty one element list) just needs a couple of samples

Isn't this just a design methodology? You set the boundary parameter as the beginning elements and you arbitrarily choose a sample of a 50 element list. I wouldn't call this theory. Your boundary and bulk idea doesn't seem theoretically sound; it's more of a personal strategy. Additionally, it doesn't even seem sufficiently random/scientific. Why would a one element list be more effective to test than a 3452 element list? Your tests are biased towards lower ordinal elements.

If testing has any theory behind it I would think it would be the same as the theory behind science/experimentation in general: probability. But it seems like you're getting into something else here.

>Then factorial designs to combine parameters. You reduce the combinatorial explosion of factorial designs by splitting parameters via formal methods. You reduce things like external service dependencies to something susceptible to boundary and bulk using Parnas's trace assertion method.

Can you point me to a resource explaining the trace assertion method? I can't parse your language here. What do you mean by "splitting a parameter?" Here's what I can make of it: You're talking about using some method (Parnas's) to modularize external services like IO away from testable logic... is this correct? What is your condition for an optimal test?

>From this point of view, writing testable code is a statement about controlling the complexity of test plans. Things like instead of having a function take a few representations, make it only take a canonical representation and provide adapter functions. For example, if you have a function f(t0, tn) that takes two timestamps, you could have t0 and tn be seconds since epoch, offsets relative to now, or some kind of text date format. If f accepts all three, then you have a test plan of size 9*N. If it accepts just seconds since the epoch, you have N + 2 (for the adapter functions). This kind of calculation provides concrete statements about making code more testable.

Your statements are inconsistent here; can you clarify with a more detailed example? You talk about a function that takes two variables then you suddenly say f takes all three. What is your definition of the size of a test plan? What is N? Is it the cardinality of the parameters? What is your definition of code that is "more testable"?

Can you just write out a full example of the thing you're testing and how you are using the theory to make the code more testable? It will give me a clearer understanding of what you're talking about.

And/or, better yet, point me to a resource on the mathematical theory behind software testing.

From what I can make out you're overall reducing the cardinality of the types of the parameters to a function but it's not clear to me exactly how or what you're doing.


> You set the boundary parameter as the beginning elements and you arbitrarily choose a sample of a 50 element list.

That isn't what I was trying to express. I was saying you would use: [], [5], [12, 3], and then a few long lists.

> I would think it would be the same as the theory behind science/experimentation in general: probability.

Probability isn't the underlying theory behind experiment selection in general. It's used in what's called design of experiments in statistics to calculate optimal sampling points for continuous variates, but if you look at what scientists actually do to choose what experiments to run, it is not based in probability.

> Can you point me to a resource explaining the trace assertion method?

There are a bunch of papers. A quick Google search should suffice.

> You're talking about using some method (Parnas's) to modularize external services like IO away from testable logic

No, I'm saying that you can use the trace assertion method to produce a description of a service that is amenable to choosing a set of test conditions the way you would for a list or a tree.

> You talk about a function that takes two variables then you suddenly say f takes all three.

No, I'm saying it takes two parameters, but we let each parameter accept all three of: seconds since an epoch, a relative time reference (e.g., "2 days ago"), or a text description ("march 15, 2019").

The size of a test plan is the number of conditions to run. N is a constant characterizing the test plan. This is just a scaling argument so it kind of doesn't matter.

> From what I can make out you're overall reducing the cardinality of the types of the parameters to a function but it's not clear to me exactly how or what you're doing.

I was just trying to give an example. Obviously failed.


>That isn't what I was trying to express. I was saying you would use: [], [5], [12, 3], and then a few long lists.

Yes, and I'm saying this is an arbitrary design choice, and therefore NOT part of some mathematical theory. What is it that made you choose these as test cases? How does choosing those test cases make your tests better?

>I was just trying to give an example. Obviously failed.

Yeah, sorry. I'm saying: can you just give a clearer example? Rather than using sentences to describe it, write out a full example, test cases and all. I may not be able to parse your descriptions, but I could more readily understand a complete code example that is made more "testable" under your definition.

>No, I'm saying it takes two parameters, but we let each parameter accept all three of: seconds since an epoch, a relative time reference (e.g., "2 days ago"), or a text description ("march 15, 2019").

Ok I see what you're saying now. The type of each parameter is a tuple of three values.

This doesn't make any sense in terms of test plan size. How are you choosing N? It seems to me that you're implying a lower N is a more optimal test.

Let's make that example simpler. Let's reduce the cardinality of the types and make them bools so we can measure it. The cardinality of a bool is 2 (true, false). f(t0 bool, tn bool) will therefore have a total cardinality of 4 (2 times 2), meaning 4 possible variations of inputs (we are disregarding possible outputs and only testing expected output, which removes the exponential increase in cardinality of the function type). Now let's make this a tuple of three values each: f((t0,t1,t2), (t3,t4,t5)), where the t's are all bools. All possible input cases are now 64 in total: (2 times 2 times 2) times (2 times 2 times 2).

Your test space of possible inputs to measure goes from 4 possible tests to 64 possible tests. This is the measure of the total possible tests you can ever run on the function before you have exhausted every possibility.
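
Spelling out those two input spaces in code (just enumerating the combinations):

    from itertools import product

    bools = (True, False)

    # f(t0, tn) with plain bool parameters: 2 * 2 = 4 possible inputs.
    simple_cases = list(product(bools, repeat=2))
    assert len(simple_cases) == 4

    # f((t0, t1, t2), (t3, t4, t5)) with bool triples: 2**3 * 2**3 = 64.
    triples = list(product(bools, repeat=3))
    tuple_cases = list(product(triples, triples))
    assert len(tuple_cases) == 64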

If you have N conditions, why does increasing the number of test cases required to fully test the experiment (which in your example is nearly infinite, but reduced down to 4 and 64 in my example) suddenly increase the N by a multiple of nine? This makes no sense. Also, why do the adapter functions have a test size of 1? What is your metric for determining N?

>It's used in what's called design of experiments in statistics to calculate optimal sampling points for continuous variates, but if you look at what scientists actually do to choose what experiments to run, it is not based in probability.

It's based off of statistics, which is itself based off of probability. Probability is the mathematical theory, and statistics is that theory applied to the real world. Both are math, but the latter isn't theory in the sense I'm talking about.

I'm not really talking about applied experimental design here. I'm talking about a theory that will give me the shortest possible path between points A and B in a Cartesian plane. I don't need "design" to help me here; calculation and theory will give me the optimal answer.

In your examples, there seems to be no exact definition of "optimal," and you're making a bunch of arbitrary test choices to try to converge your tests onto this blurry definition of "optimal."

This is what I mean by there being no "theory" behind tests. Even if you have formulas that give you a bunch of other metrics like "test size," it doesn't mean anything unless N is a concrete number derived from concrete measures. If your "test theory" centers on just reducing an arbitrary N, then I'll give you that, but right now I'm not clear about how this number scales up or down with "testable code."


The concept is easy to grasp, but for me the challenge is figuring out which testing framework/suite is best for the language I'm dealing with (and then learning the intricacies of how the tooling works).


There are millions of frameworks out there. I don't think it's worth learning the details of those things. It's like learning one person's very specific way of folding clothes. If you want to use one, go for it, but to use one for learning? Waste of time.

You don't need a framework to do TDD. Can't you put your assertions and functions and tests in some iterative loop?
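
For example, a minimal sketch (no framework, just a loop over test functions):

    def test_addition():
        assert 1 + 1 == 2

    def test_string_upper():
        assert "abc".upper() == "ABC"

    if __name__ == "__main__":
        failures = 0
        for test in (test_addition, test_string_upper):
            try:
                test()
                print("PASS", test.__name__)
            except AssertionError:
                failures += 1
                print("FAIL", test.__name__)
        raise SystemExit(failures)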



