JA7. Learning Journal 7
The Learning Journal is a tool for self-reflection on the learning process. The Learning Journal will be assessed by your instructor as part of your Final Grade.
Answer the following questions
1. Describe what you did. You need to describe what you did and how you did it.
This was the 7th week of this course; it was all about how to plan an agent's moves under uncertainty, which affects reasoning in open or incomplete worlds. I started with the materials assigned in the learning guide, then I did the discussion assignment, which was about designing a decision network. I also did the self-quizzes along with the learning journal and the programming assignment.
2. Describe your reactions to what you did
I found the topics presented so far to be of great importance, both theoretically and practically. However, they seem too complex and condensed to be covered in a single week. The ideas of uncertainty and conditional probability are very big and need more time to be understood, let alone putting the concepts of deterministic and probabilistic decision-making into practice using decision networks.
3. Describe any feedback you received or any specific interactions you had while participating in the discussion forum or the assignment. Discuss how they were helpful.
The discussion assignment asked us to create a decision network for a coffee robot that pours coffee and delivers it to a person. I found myself creating a decision network with 5 nodes, so the table of possible utilities included more than 20 rows. I used optimization techniques to reduce the number of nodes, which made the decision network easier to read and understand, along with utilizing historical data to predict the utilities of future delivery trips.
I was surprised that my classmates were able to answer the discussion assignment in fewer words; my submission was very long compared to theirs. I received feedback that I needed to make it shorter, but I was not able to omit any of the information I wrote, as I thought it was all necessary to explain my answers.
4. Describe your feelings and attitudes
A key focus of the week was evaluating the Markov Decision Process (MDP) in fully observable environments for modeling sequential decision-making scenarios with probabilistic dynamics. We had the opportunity to try this in the programming assignment, which asked us to use an MDP in a maze-navigating agent that aims for the diamond.
I felt that the programming assignment was a good exercise, but it was also hard to implement in a standard way: where do you draw the lines between the different components of the agent when translating the theory into code? For example, do I need to extract the reward system into its own function, or can I keep it within the current function? I ended up abstracting lots of things into their own functions, which caused my code to be over-abstracted, but it worked.
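For illustration, here is a minimal value-iteration sketch for a maze of this kind. It is not my actual submission: the 3x3 grid layout, the slip probability, and the discount factor are assumptions I am making up for the example.

GRID = [
    ["Start", "", "Blocked"],
    ["", "", ""],
    ["Fire", "", "Diamond"]
]
REWARDS = {"Fire": -1000, "Diamond": 100, "Move": -1}
TERMINAL = {"Fire", "Diamond"}
GAMMA = 0.9   # discount factor (assumed)
SLIP = 0.1    # chance of veering to each side of the intended move (assumed)
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}
SIDES = {"UP": ("LEFT", "RIGHT"), "DOWN": ("LEFT", "RIGHT"),
         "LEFT": ("UP", "DOWN"), "RIGHT": ("UP", "DOWN")}

def step(r, c, action):
    # Deterministic effect of a move; bump into walls or blocked cells and stay put.
    dr, dc = MOVES[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < 3 and 0 <= nc < 3 and GRID[nr][nc] != "Blocked":
        return nr, nc
    return r, c

def q_value(V, r, c, action):
    # Expected value of an action: 80% intended direction, 10% each side.
    left, right = SIDES[action]
    outcomes = [(1 - 2 * SLIP, step(r, c, action)),
                (SLIP, step(r, c, left)),
                (SLIP, step(r, c, right))]
    return sum(p * (REWARDS["Move"] + GAMMA * V[s]) for p, s in outcomes)

def value_iteration(sweeps=100):
    # Repeatedly back up V(s) = max over actions of the expected value.
    V = {(r, c): 0.0 for r in range(3) for c in range(3)}
    for _ in range(sweeps):
        for (r, c) in V:
            cell = GRID[r][c]
            if cell in TERMINAL:
                V[(r, c)] = REWARDS[cell]     # terminal payoff
            elif cell != "Blocked":
                V[(r, c)] = max(q_value(V, r, c, a) for a in MOVES)
    return V

print(value_iteration()[(0, 0)])              # value of the start cell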
5. Describe what you learned. You can think of one or more topics and explain your understanding in writing.
Planning introduces the idea that an agent must be able to reason about the future. In the study of AI, planning is the decision-making process performed by intelligent agents, such as robots or computer programs, when trying to achieve a goal state. Planning determines a sequence of necessary actions and when those actions are necessary to accomplish the goal (UoPeople, n.d.).
When there is certainty, there are a number of different ways to represent planning, including explicit state-space representations, feature-based representations of actions, and the STRIPS representation. In some cases a goal can be achieved with a single action, while in many cases several actions must be performed to complete a goal. Developing these sequences of actions requires a planning strategy such as forward planning, regression planning, partial-order planning, or planning as CSPs (Constraint Satisfaction Problems) (UoPeople, n.d.).
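As a toy illustration of the STRIPS idea, an action can be written as a set of preconditions plus add and delete lists, and forward planning applies applicable actions to grow the state. The action and proposition names below are hypothetical ones I made up for the coffee-robot domain:

pour_coffee = {
    "name": "pour_coffee",
    "preconditions": {"robot_at_machine", "cup_empty"},   # must hold before
    "add": {"cup_full"},                                  # becomes true after
    "delete": {"cup_empty"}                               # stops being true after
}

def applicable(state, action):
    # An action is applicable when all of its preconditions hold in the state.
    return action["preconditions"] <= state

def apply_action(state, action):
    # Forward planning applies an action by updating the set of true propositions.
    return (state - action["delete"]) | action["add"]

state = {"robot_at_machine", "cup_empty"}
if applicable(state, pour_coffee):
    state = apply_action(state, pour_coffee)   # now cup_full replaces cup_empty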
When there is uncertainty, the best we sometimes have is a strong likelihood of an outcome from an action. In such uncertain situations we act based upon two factors: our beliefs and our preferences. A belief is an estimate of the probability that an event will occur. A preference is the value of the goal that we are trying to attain. Utility can be seen as a measure of the relationship between belief and preference (UoPeople, n.d.).
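As a small sketch with made-up numbers: the expected utility of an action weights each outcome's utility (the preference) by its probability (the belief), and a rational agent picks the action with the highest expected utility.

def expected_utility(outcomes):
    # outcomes: list of (probability, utility) pairs for one action.
    return sum(p * u for p, u in outcomes)

# E.g. delivering coffee: 90% belief in success, 10% belief in a spill.
deliver = [(0.9, 100), (0.1, -50)]
wait = [(1.0, 0)]

for name, outcomes in [("deliver", deliver), ("wait", wait)]:
    print(name, expected_utility(outcomes))   # deliver 85.0, wait 0.0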
Decision networks are belief networks that have been extended to include decision variables and the utility of actions. Belief networks are directed acyclic graphs (DAGs) that include potential states, the probability (belief) that each state will occur, and the dependencies between these states. Decision networks draw chance nodes as ovals, decision nodes as rectangles, and utility nodes as diamonds (Poole & Mackworth, 2017).
The arcs coming into decision nodes represent the information that will be available when the decision is made. The arcs coming into chance nodes represent probabilistic dependence. The arcs coming into the utility node represent what the utility depends on. Decision networks are used to determine the best course of action: the action that produces the greatest expected utility (Poole & Mackworth, 2017).
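To make this concrete, here is a toy one-decision network in code, in the spirit of the coffee-robot discussion; the chance-node probabilities and the utility table are invented for illustration:

P_thirsty = {True: 0.7, False: 0.3}           # chance node: is the person thirsty?

UTILITY = {                                   # utility node: depends on both parents
    ("deliver", True): 100,
    ("deliver", False): -10,
    ("stay", True): -40,
    ("stay", False): 0
}

def best_decision():
    # Choose the decision value with the greatest expected utility.
    def eu(decision):
        return sum(P_thirsty[t] * UTILITY[(decision, t)] for t in (True, False))
    return max(("deliver", "stay"), key=eu)

print(best_decision())                        # deliver (EU 67 vs. -28)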
6. Did you face any challenges while doing the discussion or the development assignment? Were you able to solve them by yourself?
The programming assignment was particularly challenging for me. Not the working code itself, but rather trying to design my code in a way similar to what we learned this week. That is, I tried to keep my code as modular as possible, with different layers and hierarchies calling each other, but that was difficult.
I overcame this challenge by recording everything in the environment in one central state and passing this state to all the different functions in my code. I also abstracted a lot of things into their own functions even where it may not have been necessary, which may have made my code over-abstracted, but it worked.
Here is the shape of the central state I used; you can see that it includes information about the internals of the agent, the external environment, and some abstract information about the goal and the history of moves:
state = {
    "current_pos": {"x": 0, "y": 0},          # where the agent is now
    "cols": 3,                                # grid dimensions
    "rows": 3,
    "grid": [                                 # the external environment
        ["Start", "", "Blocked"],
        ["", "", ""],
        ["Fire", "", "Diamond"]
    ],
    "history": [],                            # moves taken so far
    "target_pos": {"x": 2, "y": 2},           # the goal: the diamond cell
    "goal_achieved": False,
    "rewards": {                              # the reward system
        "Start": 0,
        "Blocked": -100,
        "Fire": -1000,
        "Diamond": 100,
        "Move": -1                            # cost of every step
    },
    "actions": ["UP", "DOWN", "LEFT", "RIGHT"],
    "total_reward": 0
}
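And here is a simplified sketch of how functions consumed that state; the function names below are hypothetical stand-ins for the (more numerous) functions in my actual submission:

def cell_at(state, pos):
    # Look up the grid label at a position (the grid is indexed [y][x]).
    return state["grid"][pos["y"]][pos["x"]]

def reward_for(state, pos):
    # The question from earlier: the reward system, extracted into its own function.
    cell = cell_at(state, pos)
    return state["rewards"].get(cell, 0) + state["rewards"]["Move"]

def move(state, action):
    # Apply one action: update position, history, total reward, and the goal flag.
    dx, dy = {"UP": (0, -1), "DOWN": (0, 1),
              "LEFT": (-1, 0), "RIGHT": (1, 0)}[action]
    pos = state["current_pos"]
    nx, ny = pos["x"] + dx, pos["y"] + dy
    if not (0 <= nx < state["cols"] and 0 <= ny < state["rows"]):
        return state                          # off the grid: ignore the move
    new_pos = {"x": nx, "y": ny}
    if cell_at(state, new_pos) == "Blocked":
        return state                          # blocked cell: ignore the move
    state["current_pos"] = new_pos
    state["history"].append(action)
    state["total_reward"] += reward_for(state, new_pos)
    state["goal_achieved"] = cell_at(state, new_pos) == "Diamond"
    return state

state = move(state, "DOWN")                   # e.g. one step toward the diamond
print(state["current_pos"], state["total_reward"])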
References
- Poole, D. L., & Mackworth, A. K. (2017). Artificial intelligence: Foundations of computational agents (2nd ed.). Cambridge University Press. https://artint.info/2e/html/ArtInt2e.html (Chapter 9: Planning with uncertainty)
- UoPeople. (n.d.). CS4408: Artificial Intelligence. Learning Guide Unit 7: Introduction. https://my.uopeople.edu/mod/book/view.php?id=454716&chapterid=555065