I think you're ignoring a lot of ways in which this system will not easily extend to more complex tasks.
- While the retrieval heuristic is sensible for the domain, it's not applicable to all domains. In what situations should you favor more recent memories over more relevant ones?
- The prompt for evaluating importance is domain-specific, asking the model to rate on a scale of 1 to 10 how important a life event is, giving examples like "brushing teeth" (a specific action in the domain) as a 1 and college acceptance as a 10. How do you extend that to a real-world agent?
- The process of running importance evaluation over all memories is only tractable because the agents receive a very small number of short memories over the course of a day. This can't scale to a continuous stream of observations.
- Reflections help add new inferences to the agent's memory, but they can only be generated in limited quantities, guided by a heuristic. In more complex domains, where many steps of reasoning may be required to solve a problem, how can an agent that relies on this sort of ad hoc reflection make progress?
- The planning step requires that the agent's actions be decomposable from high-level to fine-grained. In more challenging domains, the agent will need to reason about the fine-grained details of potential plan items to determine their feasibility.
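For concreteness, the retrieval heuristic the first point refers to can be sketched as a weighted sum of recency, importance, and relevance scores. This is a minimal illustration, not the paper's implementation: the field names, weights, and decay factor below are assumptions.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieval_score(memory, query_embedding, now_hours,
                    w_recency=1.0, w_importance=1.0, w_relevance=1.0,
                    decay=0.995):
    """Score one memory for retrieval; all three components are
    normalized to roughly [0, 1] before weighting (sketch only)."""
    # Recency: exponential decay per hour since the memory was last accessed.
    recency = decay ** (now_hours - memory["last_access_hours"])
    # Importance: the stored 1-10 rating, rescaled to [0, 1].
    importance = memory["importance"] / 10.0
    # Relevance: similarity between the query and the memory's embedding.
    relevance = cosine_similarity(query_embedding, memory["embedding"])
    return (w_recency * recency
            + w_importance * importance
            + w_relevance * relevance)
```

In this framing, the question of when to favor recent memories over relevant ones reduces to how you tune `w_recency` against `w_relevance`, and there is no obvious domain-general answer.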
> This can't scale to a continuous stream of observations.
My mind doesn’t scale to a continuous stream either.
While I’m typing this on my phone, 99.99% of my observations are immediately discarded, and since this memory ranks near zero in importance, I very much doubt I’ll remember writing this tomorrow.
I did not read the original post, but your reflections are a great enrichment of what I take the post to be about, so congratulations on a good addition.