> If programmer didn't know that repetitions are allowed, they wouldn't appear in the input to the fuzzer as well.
It totally would. The point of a fuzzer is to test the system with every technically possible input, to avoid bias and blind spots in the programmer's thinking.
Furthermore, assuming that no duplicates exist is a rather strong assumption that should always be questioned. Unless you know all about the business rules of an external system, you can't trust its data and can't assume much about its behavior.
Anyways, we are discussing about the wrong issue. Bugs happen, even halting the whole system can be justified, but the operators should have had an easier time figuring out what was actually going on, without the vendor having to pore through low-level logs.
No... that's not the point of fuzzing... You cannot write individual functions in such a way that they keep revalidating input handed to them. Because then, invariably, the validations will be different function to function, and once you have an error in your validation logic, you will have to track down all function that do this validation. So, functions have to make assumptions about input, if it doesn't come from an external source.
I.e. this function wasn't the one which did all the job -- it already knew that the input was valid because the function that provided the input already ensured validation happened.
It's pointless to deliberately send invalid input to a function that expects (for a good reason) that the input is valid -- you will create a ton of worthless noise instead of looking for actual problems.
> Furthermore, assuming that no duplicates exist is a rather strong assumption that should always be questioned.
How do you even come up with this? Do you write your code in such a way that any time it pulls a value from a dictionary, you iterate over the dictionary keys to make sure that they are unique?... There are plenty of things that are meant to be unique by design. The function in question wasn't meant to check if the points were unique. For all we know, the function might have been designed to take a map and the data was lost even before this function started processing it...
You really need to try doing what you suggest before suggesting it.
I am not going to comment the first paragraph since you turned my words around.
> How do you even come up with this? Do you write your code in such a way that any time it pulls a value from a dictionary, you iterate over the dictionary keys to make sure that they are unique?
A dictionary in my program is under my control and I can be sure that the key is unique since... well, I know it's a dictionary. I have no such knowledge about data coming from external systems.
> There are plenty of things that are meant to be unique by design. The function in question wasn't meant to check if the points were unique. For all we know, the function might have been designed to take a map and the data was lost even before this function started processing it...
"Meant to be" and "actually are" can be very different things, and it's the responsibility of a programmer to establish the difference, or to at least ask pointed questions. Actually, the programmers did the correct thing by not sweeping this unexpected problem under the rug. The reaction was just a big drastic, and the system did not make it easy for the operators to find out what went wrong.
Edit: as we have seen, input can be valid, but still not be processable by our code. That not fine, but it's a fact of life since specs are often unclear or incomplete. Also, the rules can actually change without us noticing. In these cases, we should make it as easy as possible to figure out what went wrong.
It totally would. The point of a fuzzer is to test the system with every technically possible input, to avoid bias and blind spots in the programmer's thinking.
Furthermore, assuming that no duplicates exist is a rather strong assumption that should always be questioned. Unless you know all about the business rules of an external system, you can't trust its data and can't assume much about its behavior.
Anyways, we are discussing about the wrong issue. Bugs happen, even halting the whole system can be justified, but the operators should have had an easier time figuring out what was actually going on, without the vendor having to pore through low-level logs.