Hacker News

I think it's dreadfully naive to assume there's a single, identifiable reward function.


You can always map lots of small reward functions onto a single large one.

For a fictitious example, total_reward = sin(time_of_day) × dopamine_reward + (oxytocin_reward / adrenaline_reward)
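A minimal sketch of that fictitious formula in Python, just to show several signals collapsing into one number. The function name, the argument names, treating time_of_day as an angle in radians, and the guard against a zero adrenaline term are all assumptions for illustration, not a model of anything real.

```python
import math

def total_reward(time_of_day, dopamine_reward, oxytocin_reward, adrenaline_reward):
    """Combine three made-up reward signals into one scalar.

    time_of_day is treated as an angle in radians (an assumption).
    """
    if adrenaline_reward == 0:
        # The fictitious formula divides by this term, so zero is undefined.
        raise ValueError("adrenaline_reward must be non-zero")
    return (math.sin(time_of_day) * dopamine_reward
            + oxytocin_reward / adrenaline_reward)

# sin(pi/2) * 2.0 + 3.0 / 1.5
print(total_reward(math.pi / 2, 2.0, 3.0, 1.5))  # 4.0
```

The division already hints at the objection raised elsewhere in the thread: the combined value is only as meaningful as the choice of how to combine the parts.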


How do you even add them? It's like asking what the utility of the Statue of Liberty plus the utility of NAFTA is.

Just because it's a mathematician's or economist's wish doesn't mean it's correct.

Why did the drunken man look under the streetlight for his keys? "Didn't lose them here, but that's where the light is"


> How do you even add them? It's like asking what the utility of the Statue of Liberty plus the utility of NAFTA is.

You add them by asking how much each part of the utility function makes you want to do a thing.

As TeMPOraL says, for your specific example you could do that in dollars. A more emotionally affecting example than USA icons might be asking how many oranges someone would need to offer you to convince you to have a hand amputated without anaesthetic. If you’re well fed, it’s ridiculous even to ask, since even a lifetime supply of citrus fruit won’t come close; but if you’re literally starving to death, you might well choose to lose a hand to gain anything edible.
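A rough sketch of the oranges example in Python, assuming a made-up utility scale. HAND_COST, food_value, accepts_trade, and every number below are hypothetical; the only point is that the same pile of oranges maps to very different values depending on context.

```python
import math

HAND_COST = 1000.0  # hypothetical disutility of losing a hand

def food_value(oranges, starving):
    """Hypothetical value of a pile of oranges in the given context."""
    if oranges <= 0:
        return 0.0
    if starving:
        # Assumption: any food at all means survival, which outweighs the hand.
        return 10 * HAND_COST
    # Well fed: diminishing returns that level off far below HAND_COST.
    return 100.0 * (1.0 - math.exp(-oranges / 50.0))

def accepts_trade(oranges, starving):
    """True if the oranges are worth more than the hand in this context."""
    return food_value(oranges, starving) > HAND_COST

print(accepts_trade(1_000_000, starving=False))  # False
print(accepts_trade(1, starving=True))           # True
```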


If you start by assuming a utility function exists, you can have it do anything you want.

But we haven't quite solved every problem in the world by creating a utility function, have we?

Why is that? Could it be that some domains resist mathematization because the objects are incommensurate?

Why don't we just create utility functions to solve politics?

We can't, because politics is unsolvable in terms of well-defined mathematization, and it could very well be that human intelligence is like that too.


If you agree that starting from the assumption that a utility function exists leads to being always able to define one, then you can’t simultaneously take the position that there can’t be two incommensurate things.

You could argue that utility functions are “too powerful” on the grounds that being able to explain anything is equivalent to being able to explain nothing.

> Why don't we just create utility functions to solve politics?

What do you mean by “solve”? I reckon the utility function of politics is approximately “democracy” in many cases.

> it could very well be that human intelligence is [unsolvable in terms of well-defined mathematization] too

That’s equivalent to saying “whatever human intelligence depends on isn’t limited to the laws of physics”. The laws of physics are written in maths, and as we invent new maths for new understanding of physics, that new understanding is also available for modelling our own minds, as they are physical objects. A similar argument also applies if we have an immortal soul. ;)


> If you agree that starting from the assumption that a utility function exists leads to being always able to define one, then you can’t simultaneously take the position that there can’t be two incommensurate things.

> What do you mean by “solve”? I reckon the utility function of politics is approximately “democracy” in many cases.

I don't understand what you're getting at here, and I feel like it's missing the broader point I'm making anyway.

I think this conversation ends here.


> How do you even add them..

In dollars. It's literally what currency is for.

Of course there's no one set of dollar values for most items. They very much depend on context. But in a particular, well-defined context, you can very much convert both the Statue of Liberty and NAFTA to dollars and sum them up to something that makes sense in that context.

> Why did the drunken man look under the streetlight for his keys? "Didn't lose them here, but that's where the light is"

Well, drunk people tend to have problems calculating expected value.



Not a single reward function, but multiple, although for each decision/action you could in theory calculate a single value that represents the weight toward making that decision. This value comes from multiple systems.
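One way to picture this, as a sketch only: each system scores each candidate action, and a weighted sum gives the single per-decision value. The system names, weights, and scores below are invented for illustration, and a weighted sum is just one assumed way of combining them.

```python
def decision_weight(signals, weights):
    """Collapse per-system scores for one action into a single number."""
    return sum(weights[name] * value for name, value in signals.items())

# Hypothetical systems and how much each one counts.
weights = {"hunger": 1.0, "fear": 2.0, "social": 0.5}

# Hypothetical scores each system assigns to each candidate action.
actions = {
    "eat": {"hunger": 3.0, "fear": 0.0, "social": 1.0},
    "flee": {"hunger": -1.0, "fear": 2.0, "social": -1.0},
}

scores = {name: decision_weight(sig, weights) for name, sig in actions.items()}
best = max(scores, key=scores.get)

print(scores)  # {'eat': 3.5, 'flee': 2.5}
print(best)    # eat
```

Changing the weights changes which action wins, so there are still several reward systems underneath even though each decision ends up with one number.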



