# Intelligent Systems – Assignment 4 Problem Solving Solutions

- **Course**: Intelligent Systems
- **Assignment**: 4 – Problem Solving
- **Due date**: December 1st
- **Max grade**: 50 points

This document contains complete solutions with step-by-step explanations for Assignment 4.

---

## Problem 1 [15 pts]: Probabilities

### A. [5 pts] Probability Table Sizes
**Question:** Let $X$, $Y$, and $Z$ be discrete random variables with domains:

- $X \in \{x_1, x_2, x_3\}$ (3 values)
- $Y \in \{y_1, y_2, y_3\}$ (3 values)
- $Z \in \{z_1, z_2, z_3, z_4\}$ (4 values)

How many entries are in each probability table, and what is the sum of values?
**Solution:**

| Table | Full Size (all entries) | Slices | Sum per slice |
|-------|------------------------|--------|---------------|
| $P(X,Z \mid Y)$ | $3 \times 4 \times 3 = 36$ | 3 (one per $y$) | $1$ |
| $P(Y \mid X,Z)$ | $3 \times 3 \times 4 = 36$ | 12 (one per $x,z$) | $1$ |
| $P(z_1 \mid X)$ | $3$ | 1 | N/A (not a distribution over $x$) |
| $P(X, z_3)$ | $3$ | 1 | $P(z_3)$ |
| $P(X \mid y_2, z_3)$ | $3$ | 1 | $1$ |

**Step-by-step explanation:**

1. **$P(X,Z \mid Y)$**: This is a conditional distribution of $(X,Z)$ given $Y$.
   - Full table size: one entry for each combination $(x,z,y)$, so $3 \times 4 \times 3 = 36$
   - Each slice $P(X,Z \mid Y=y_i)$ is a joint distribution over $X$ and $Z$
   - Each slice sums to 1 (normalization)
2. **$P(Y \mid X,Z)$**: Conditional distribution of $Y$ given $(X,Z)$.
   - Full table size: $|Y| \times |X| \times |Z| = 3 \times 3 \times 4 = 36$
   - There are $|X| \times |Z| = 12$ slices (one per conditioning value)
   - Each slice $P(Y \mid X=x_i, Z=z_j)$ is a distribution over $Y$, summing to 1
3. **$P(z_1 \mid X)$**: Probability of the specific value $z_1$ given each value of $X$.
   - Size: 3 (one probability per value of $X$)
   - The sum is NOT constrained to 1 (this is not a distribution over $X$)
4. **$P(X, z_3)$**: Joint probability of $X$ and the specific value $z_3$.
   - Size: 3 (one value per $x$)
   - Sum: $\sum_x P(x, z_3) = P(z_3)$ (marginal probability)
5. **$P(X \mid y_2, z_3)$**: Distribution of $X$ given specific values $y_2, z_3$.
   - Size: 3 (one probability per value of $X$)
   - Sums to 1 (conditional distribution)

---

### B. [5 pts] True or False
**Question:** State whether each statement is True or False. No independence assumptions are made.
**1.** $P(A, B) = P(A \mid B) P(A)$
**Answer: False**

**Explanation:** The correct formula is $P(A, B) = P(A \mid B) P(B)$, not $P(A \mid B) P(A)$. This is the definition of conditional probability rearranged: $P(A \mid B) = \frac{P(A,B)}{P(B)}$.
---

**2.** $P(A \mid B) P(C \mid B) = P(A, C \mid B)$
**Answer: False**

**Explanation:** This equality only holds when $A \perp C \mid B$ ($A$ and $C$ are conditionally independent given $B$). In general:

$$P(A, C \mid B) = P(A \mid B) P(C \mid A, B)$$

The given expression implicitly assumes independence, which is not stated.
---

**3.** $P(B,C) = \sum_{a \in A} P(B, C \mid A)$
**Answer: False**

**Explanation:** The law of total probability requires multiplying by the prior:

$$P(B, C) = \sum_a P(B, C \mid A=a) P(A=a)$$

The expression $\sum_{a \in A} P(B, C \mid A)$ is missing the $P(A=a)$ term.
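Statements 1–3 fail for essentially any joint distribution. A quick numeric counterexample, using an arbitrary (hypothetical) random joint over three binary variables:

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary joint P(A, B, C) over binary variables, axes (a, b, c).
joint = rng.random((2, 2, 2))
joint /= joint.sum()

p_ab = joint.sum(axis=2)   # P(A, B)
p_a = p_ab.sum(axis=1)     # P(A)
p_b = p_ab.sum(axis=0)     # P(B)
p_bc = joint.sum(axis=0)   # P(B, C), axes (b, c)

# Statement 1: P(A, B) vs P(A | B) P(A)  (generally unequal).
wrong1 = (p_ab / p_b) * p_a[:, None]
print(np.allclose(p_ab, wrong1))      # False

# Statement 2: P(A | B) P(C | B) vs P(A, C | B)  (assumes A ⊥ C | B).
wrong2 = np.einsum("ab,bc->abc", p_ab / p_b, p_bc / p_b[:, None])
print(np.allclose(joint / p_b[None, :, None], wrong2))   # False

# Statement 3: P(B, C) vs sum_a P(B, C | A=a)  (missing the P(A=a) weight).
p_bc_given_a = joint / p_a[:, None, None]   # P(B, C | A=a)
wrong3 = p_bc_given_a.sum(axis=0)
print(np.allclose(p_bc, wrong3))      # False

# The corrected law of total probability does hold:
right3 = (p_bc_given_a * p_a[:, None, None]).sum(axis=0)
print(np.allclose(p_bc, right3))      # True
```

Because the joint is drawn at random, none of the implicit independence assumptions hold, so the "wrong" identities fail while the corrected one holds exactly.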
---

**4.** $P(A, B, C, D) = P(C) P(D \mid C) P(A \mid C, D) P(B \mid A, C, D)$
**Answer: True**

**Explanation:** This is a valid chain rule factorization. We can verify:

1. Start with $P(C)$
2. Multiply by $P(D \mid C)$ → gives $P(C, D)$
3. Multiply by $P(A \mid C, D)$ → gives $P(A, C, D)$
4. Multiply by $P(B \mid A, C, D)$ → gives $P(A, B, C, D)$ ✓
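The telescoping argument above can be confirmed numerically for an arbitrary (hypothetical) joint over four binary variables:

```python
import numpy as np

rng = np.random.default_rng(1)
# Arbitrary joint P(A, B, C, D), axes (a, b, c, d).
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()

p_c = joint.sum(axis=(0, 1, 3))   # P(C)
p_cd = joint.sum(axis=(0, 1))     # P(C, D), axes (c, d)
p_acd = joint.sum(axis=1)         # P(A, C, D), axes (a, c, d)

# P(C) * P(D | C) * P(A | C, D) * P(B | A, C, D), assembled entry by entry:
rebuilt = np.empty_like(joint)
for a, b, c, d in np.ndindex(2, 2, 2, 2):
    rebuilt[a, b, c, d] = (p_c[c]
                           * (p_cd[c, d] / p_c[c])             # P(D | C)
                           * (p_acd[a, c, d] / p_cd[c, d])     # P(A | C, D)
                           * (joint[a, b, c, d] / p_acd[a, c, d]))  # P(B | A, C, D)

print(np.allclose(rebuilt, joint))   # True: the factorization is exact
```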
---

**5.** $P(C \mid B, D) = \frac{P(C) P(B \mid C) P(D \mid C, B)}{\sum_{c'} P(C=c') P(B \mid C=c') P(D \mid C=c', B)}$
**Answer: True**

**Explanation:** This is Bayes' rule applied correctly:

1. Numerator: $P(C) P(B \mid C) P(D \mid C, B) = P(C, B, D)$ by the chain rule
2. Denominator: $\sum_{c'} P(c') P(B \mid c') P(D \mid c', B) = P(B, D)$ by marginalization
3. Result: $\frac{P(C, B, D)}{P(B, D)} = P(C \mid B, D)$ ✓
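This identity can also be checked numerically against an arbitrary (hypothetical) joint over $(B, C, D)$:

```python
import numpy as np

rng = np.random.default_rng(2)
# Arbitrary joint P(B, C, D) over binary variables, axes (b, c, d).
joint = rng.random((2, 2, 2))
joint /= joint.sum()

p_c = joint.sum(axis=(0, 2))   # P(C)
p_bc = joint.sum(axis=2)       # P(B, C), axes (b, c)
p_bd = joint.sum(axis=1)       # P(B, D), axes (b, d)

for b, d in np.ndindex(2, 2):
    # Numerator of statement 5, as a vector over c:
    num = np.array([p_c[c]
                    * (p_bc[b, c] / p_c[c])           # P(B=b | C=c)
                    * (joint[b, c, d] / p_bc[b, c])   # P(D=d | C=c, B=b)
                    for c in range(2)])
    lhs = joint[b, :, d] / p_bd[b, d]                 # P(C | B=b, D=d)
    assert np.allclose(num / num.sum(), lhs)

print("Bayes' rule identity holds for every (b, d)")
```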
---

### C. [5 pts] Probability Expressions

#### (i) Compute $P(A, B \mid C)$
**Given:** $P(A)$, $P(A \mid C)$, $P(B \mid C)$, $P(C \mid A, B)$ with no independence assumptions.
**Solution:**

$$P(A, B \mid C) = \frac{P(A, B, C)}{P(C)} = \frac{P(C \mid A, B) P(A, B)}{P(C)}$$

However, we don't have $P(A, B)$ directly and cannot derive it from the given tables.

**Answer: Not possible** with only the given tables.

---

#### (ii) Compute $P(B \mid A, C)$
**Given:** $P(A)$, $P(A \mid C)$, $P(B \mid A)$, $P(C \mid A, B)$ with no independence assumptions.
**Solution:**

**Step 1:** Apply the definition of conditional probability:

$$P(B \mid A, C) = \frac{P(A, B, C)}{P(A, C)}$$

**Step 2:** Expand the numerator using the chain rule:

$$P(A, B, C) = P(C \mid A, B) \cdot P(B \mid A) \cdot P(A)$$

**Step 3:** Compute the denominator by marginalization:

$$P(A, C) = \sum_b P(A, B=b, C) = \sum_b P(C \mid A, b) \cdot P(b \mid A) \cdot P(A)$$

**Step 4:** Combine and simplify (note that $P(A)$ cancels):

$$P(B \mid A, C) = \frac{P(C \mid A, B) \cdot P(B \mid A)}{\sum_b P(C \mid A, b) \cdot P(b \mid A)}$$

---

#### (iii) Compute $P(C)$
**Given:** $P(A \mid B)$, $P(B)$, $P(B \mid A, C)$, $P(C \mid A)$ with assumption $A \perp B$.
**Solution:**

**Step 1:** Use the independence $A \perp B$:

$$P(A, B) = P(A) \cdot P(B)$$

**Step 2:** From the definition of conditional probability:

$$P(A \mid B) = \frac{P(A, B)}{P(B)} = \frac{P(A) \cdot P(B)}{P(B)} = P(A)$$

**Step 3:** Marginalize over $A$, substituting $P(a) = P(a \mid B)$ from Step 2 since only $P(A \mid B)$ is given:

$$P(C) = \sum_a P(C, a) = \sum_a P(C \mid a) \cdot P(a) = \sum_a P(C \mid a) \cdot P(a \mid B)$$

---

#### (iv) Compute $P(A, B, C)$
**Given:** $P(A \mid B, C)$, $P(B)$, $P(B \mid A, C)$, $P(C \mid B, A)$ with assumption $A \perp B \mid C$.
**Solution:**

From $A \perp B \mid C$:

$$P(A, B \mid C) = P(A \mid C) \cdot P(B \mid C)$$

Also, from the assumption, $P(A \mid B, C) = P(A \mid C)$.

We would still need $P(C)$ to compute:

$$P(A, B, C) = P(A, B \mid C) \cdot P(C) = P(A \mid C) \cdot P(B \mid C) \cdot P(C)$$

**Answer: Not directly possible**: we cannot derive $P(C)$ from the given tables.

---

## Problem 2 [15 pts]: BN Representation

### A. [2 pts] Joint Probability Distribution
**Question:** Write the joint probability distribution for the given Bayes net.

```mermaid
graph TD
    A --> C
    B --> C
    C --> D
    B --> E
    C --> E
```
**Solution:**

$$P(A, B, C, D, E) = P(A) \cdot P(B) \cdot P(C \mid A, B) \cdot P(D \mid C) \cdot P(E \mid B, C)$$

**Step-by-step derivation:**

1. $A$ has no parents → $P(A)$
2. $B$ has no parents → $P(B)$
3. $C$ has parents $A, B$ → $P(C \mid A, B)$
4. $D$ has parent $C$ → $P(D \mid C)$
5. $E$ has parents $B, C$ → $P(E \mid B, C)$

---

### B. [2 pts] Draw Bayes Net
**Question:** Draw the Bayes net for $P(A) \cdot P(B) \cdot P(C \mid A, B) \cdot P(D \mid C) \cdot P(E \mid B, C)$
**Solution:**

```mermaid
graph TD
    A --> C
    B --> C
    C --> D
    B --> E
    C --> E
```

**Explanation:** Each conditional term determines the edges:

- $P(A)$, $P(B)$: root nodes (no parents)
- $P(C \mid A, B)$: edges $A → C$ and $B → C$
- $P(D \mid C)$: edge $C → D$
- $P(E \mid B, C)$: edges $B → E$ and $C → E$

---

### C. [5 pts] Space Complexity
**Question:** For $N$ binary variables, compare space needed for joint distribution vs. Bayes net.
**Key concept:** For $N$ binary variables:

- Joint distribution: $2^N$ entries ($2^N - 1$ free parameters, since the entries must sum to 1)
- Bayes net: depends on the parent set sizes

#### (ii) Bayes Net with Less Space

```mermaid
graph TD
    A --> B
    A --> C
    A --> D
```

**Space calculation:**

| CPT | Entries | Free Parameters |
|-----|---------|-----------------|
| $P(A)$ | 2 | 1 |
| $P(B \mid A)$ | 4 | 2 |
| $P(C \mid A)$ | 4 | 2 |
| $P(D \mid A)$ | 4 | 2 |
| **Total** | 14 | **7** |

Joint distribution over 4 binary variables: $2^4 - 1 = 15$ free parameters. **7 < 15** ✓

---

#### (iii) Bayes Net with More/Equal Space

```mermaid
graph TD
    A --> B
    A --> C
    A --> D
    B --> C
    B --> D
    C --> D
```

**Space calculation:**

| CPT | Entries | Free Parameters |
|-----|---------|-----------------|
| $P(A)$ | 2 | 1 |
| $P(B \mid A)$ | 4 | 2 |
| $P(C \mid A, B)$ | 8 | 4 |
| $P(D \mid A, B, C)$ | 16 | 8 |
| **Total** | 30 | **15** |

This equals the joint distribution (15 free parameters), as expected: a fully connected Bayes net encodes the full joint with no savings.

---

### D. [2 pts] Factor Multiplication

#### (i) $P(A) \cdot P(B \mid A) \cdot P(C \mid A) \cdot P(E \mid B, C, D)$
**Question:** What factor is missing to form a valid joint distribution?
**Solution:**

**Step 1:** Identify the variables in each factor:

- $P(A)$: covers $A$
- $P(B \mid A)$: covers $A, B$
- $P(C \mid A)$: covers $A, C$
- $P(E \mid B, C, D)$: covers $B, C, D, E$

**Step 2:** Note that $D$ appears only in the conditioning of $P(E \mid B, C, D)$ but has no factor defining its distribution.

**Answer: $P(D)$** (or $P(D \mid \text{parents})$ if $D$ has dependencies)

---

#### (ii) $P(D) \cdot P(B) \cdot P(C \mid D, B) \cdot P(E \mid C, D, A)$

**Solution:**

**Step 1:** Identify the variables in each factor:

- $P(D)$: covers $D$
- $P(B)$: covers $B$
- $P(C \mid D, B)$: covers $B, C, D$
- $P(E \mid C, D, A)$: covers $A, C, D, E$

**Step 2:** Variable $A$ appears only in the conditioning of $P(E \mid C, D, A)$ but has no factor defining its distribution.

**Answer: $P(A)$**

---

## Problem 3 [10 pts]: BN Independence
**Question:** Use d-separation to determine independence in the following Bayes net:

```mermaid
graph TD
    A --> B
    A --> C
    B --> D
    C --> D
    B --> E
    C --> F
    D --> G
    E --> H
    F --> H
    G --> H
```
**1. Is $A \perp B$ guaranteed?**
**Answer: False**

There is a direct edge $A → B$, so $A$ and $B$ are directly dependent.
---

**2. Is $A \perp C$ guaranteed?**
**Answer: False**

There is a direct edge $A → C$, so $A$ and $C$ are directly dependent.
---

**3. Is $A \perp D \mid \{B, H\}$ guaranteed?**
**Answer: False**

**Explanation:** Consider the path $A → C → D$:

- $C$ is not in the conditioning set $\{B, H\}$
- This is a causal chain whose middle node is not observed, so the path is **active**

Therefore, $A$ and $D$ are NOT guaranteed conditionally independent given $\{B, H\}$.
---

**4. Is $A \perp E \mid F$ guaranteed?**
**Answer: False**

**Explanation:** Consider the path $A → B → E$:

- $B$ is not in the conditioning set $\{F\}$
- This is a causal chain with its middle node not observed
- The path is **active**

Since there exists an active path, $A$ and $E$ are NOT guaranteed independent given $F$.
---

**5. Is $C \perp H \mid G$ guaranteed?**
**Answer: False**

**Explanation:** Consider the path $C → F → H$:

- $F$ is not in the conditioning set
- This causal chain is **active**

The path $C → D → G → H$, in contrast, is blocked: at $G$ it is a chain ($D → G → H$) and $G$ is observed. But the first path $C → F → H$ remains active, so $C$ and $H$ are NOT guaranteed independent given $G$.
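All of these queries can be checked mechanically. Below is a minimal d-separation sketch: it enumerates simple undirected paths and applies the standard chain/fork/collider blocking rules. This is fine for a graph this small, though production implementations use Bayes-ball-style reachability instead.

```python
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("B", "E"),
         ("C", "F"), ("D", "G"), ("E", "H"), ("F", "H"), ("G", "H")]
children, parents = {}, {}
for u, v in edges:
    children.setdefault(u, []).append(v)
    parents.setdefault(v, []).append(u)

def descendants(n):
    """All strict descendants of n."""
    out, stack = set(), [n]
    while stack:
        for c in children.get(stack.pop(), []):
            if c not in out:
                out.add(c)
                stack.append(c)
    return out

def simple_paths(src, dst, path=None):
    """Yield every simple undirected path from src to dst."""
    path = path or [src]
    if path[-1] == dst:
        yield path
        return
    for n in children.get(path[-1], []) + parents.get(path[-1], []):
        if n not in path:
            yield from simple_paths(src, dst, path + [n])

def active(path, observed):
    """Is this path active given the observed set?"""
    for left, mid, right in zip(path, path[1:], path[2:]):
        if mid in children.get(left, []) and mid in children.get(right, []):
            # Collider: active only if mid or one of its descendants is observed.
            if not ({mid} | descendants(mid)) & observed:
                return False
        elif mid in observed:
            # Chain or fork: blocked when the middle node is observed.
            return False
    return True

def d_separated(x, y, observed):
    return not any(active(p, observed) for p in simple_paths(x, y))

print(d_separated("A", "D", {"B", "H"}))   # False: A -> C -> D is active
print(d_separated("A", "E", {"F"}))        # False: A -> B -> E is active
print(d_separated("C", "H", {"G"}))        # False: C -> F -> H is active
print(d_separated("B", "C", {"A"}))        # True: every B-C path is blocked
```

The last query is an example where d-separation does hold: the fork $B ← A → C$ is blocked by observing $A$, and every other $B$–$C$ path passes through an unobserved collider ($D$ or $H$).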
---

## Problem 4 [10 pts]: BN Inference

### a. [7 pts] Medical Diagnosis Bayes Net
**Given Bayes net:**

```mermaid
graph TD
    G[Gene G] --> A[Disease A]
    G --> B[Disease B]
    A --> S[Symptom S]
    B --> S
```
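The symbolic derivations in parts (i)–(iv) below can be sanity-checked by brute-force inference by enumeration. Every CPT value in this sketch is hypothetical, invented purely for illustration:

```python
# Hypothetical CPTs for the G -> {A, B} -> S network (all numbers invented):
p_g = {True: 0.1, False: 0.9}                    # P(G = g)
p_a_g = {True: 0.8, False: 0.05}                 # P(A = true | G = g)
p_b_g = {True: 0.6, False: 0.02}                 # P(B = true | G = g)
p_s_ab = {(True, True): 0.95, (True, False): 0.7,
          (False, True): 0.8, (False, False): 0.01}   # P(S = true | a, b)

def joint(g, a, b, s):
    """P(g, a, b, s) via the chain-rule factorization for this network."""
    pa = p_a_g[g] if a else 1 - p_a_g[g]
    pb = p_b_g[g] if b else 1 - p_b_g[g]
    ps = p_s_ab[(a, b)] if s else 1 - p_s_ab[(a, b)]
    return p_g[g] * pa * pb * ps

TF = (True, False)

# P(A = true) by summing out G, B, S.
p_a_true = sum(joint(g, True, b, s) for g in TF for b in TF for s in TF)
# Matches the closed form: 0.8 * 0.1 + 0.05 * 0.9 = 0.125

# P(A = true | S = true, B = true) by enumeration and normalization.
num = sum(joint(g, True, True, True) for g in TF)
den = sum(joint(g, a, True, True) for g in TF for a in TF)
p_a_given_sb = num / den

print(p_a_true, p_a_given_sb)
```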
#### (i) Compute $P(g, a, b, s)$

**Solution:** By the chain rule for Bayes nets:

$$P(g, a, b, s) = P(G=g) \cdot P(A=a \mid G=g) \cdot P(B=b \mid G=g) \cdot P(S=s \mid A=a, B=b)$$

---

#### (ii) Compute $P(A)$

**Solution:** Marginalize over the hidden variable $G$:

$$P(A) = \sum_g P(A \mid G=g) \cdot P(G=g)$$

**Step-by-step:**

1. $P(A=\text{true}) = P(A \mid G=\text{true}) \cdot P(G=\text{true}) + P(A \mid G=\text{false}) \cdot P(G=\text{false})$

---

#### (iii) Compute $P(A \mid S, B)$

**Solution:**

**Step 1:** Apply the definition of conditional probability:

$$P(A \mid S, B) = \frac{P(A, S, B)}{P(S, B)}$$

**Step 2:** Expand using the BN structure:

$$P(A \mid S, B) \propto \sum_g P(g) \cdot P(A \mid g) \cdot P(B \mid g) \cdot P(S \mid A, B)$$

**Step 3:** Since $P(S \mid A, B)$ does not depend on $g$:

$$P(A \mid S, B) \propto P(S \mid A, B) \cdot \sum_g P(g) \cdot P(A \mid g) \cdot P(B \mid g)$$

**Step 4:** Normalize over the values of $A$:

$$P(A \mid S, B) = \frac{P(S \mid A, B) \cdot \sum_g P(g) P(A \mid g) P(B \mid g)}{\sum_{a'} P(S \mid a', B) \cdot \sum_g P(g) P(a' \mid g) P(B \mid g)}$$

---

#### (iv) Compute $P(G \mid B)$

**Solution:**

**Step 1:** Apply Bayes' rule:

$$P(G \mid B) = \frac{P(B \mid G) \cdot P(G)}{P(B)}$$

**Step 2:** Compute the denominator:

$$P(B) = \sum_g P(B \mid g) \cdot P(g)$$

**Step 3:** Final expression:

$$P(G \mid B) = \frac{P(B \mid G) \cdot P(G)}{\sum_g P(B \mid g) \cdot P(g)}$$

---

### b. [3 pts] Variable Elimination
**Given Bayes net (all variables binary):**

```mermaid
graph TD
    A --> C
    B --> C
    C --> D
    D --> E
    E --> F
```
#### (i) Size of Bayes Net

**Solution:** Count the free parameters for each CPT:

| CPT | Parents | Rows | Free Params |
|-----|---------|------|-------------|
| $P(A)$ | 0 | 2 | 1 |
| $P(B)$ | 0 | 2 | 1 |
| $P(C \mid A, B)$ | 2 | 8 | 4 |
| $P(D \mid C)$ | 1 | 4 | 2 |
| $P(E \mid D)$ | 1 | 4 | 2 |
| $P(F \mid E)$ | 1 | 4 | 2 |
| **Total** | | | **12** |

**Answer:** 12 free parameters

---

#### (ii) Variable Elimination Ordering for $P(C \mid F)$

**Solution:** Variables to eliminate: $A, B, D, E$ (not the query variable $C$ or the evidence variable $F$).

**Optimal ordering: $A, B, D, E$**

**Step-by-step analysis:**

| Step | Eliminate | Factors Involved | New Factor | Size |
|------|-----------|------------------|------------|------|
| 1 | $A$ | $P(A), P(C \mid A,B)$ | $f_1(B,C)$ | 4 |
| 2 | $B$ | $P(B), f_1(B,C)$ | $f_2(C)$ | 2 |
| 3 | $D$ | $P(D \mid C), P(E \mid D)$ | $f_3(C,E)$ | 4 |
| 4 | $E$ | $f_3(C,E), P(F \mid E)$ | $f_4(C,F)$ | 4 |

**Maximum intermediate factor size: 4**

Under this ordering, every generated factor has at most two variables in scope, so no intermediate table exceeds $2^2 = 4$ entries.

---

## Summary

- All solutions verified for mathematical correctness
- Each solution includes a step-by-step derivation
- Probability distributions properly normalized
- Independence relationships determined using d-separation
- Variable elimination ordering minimizes intermediate factor size
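As a closing check, the elimination schedule from Problem 4b(ii) can be sketched directly with `numpy.einsum`, using randomly generated (hypothetical) CPTs. The printed factor sizes match the step-by-step table in that part.

```python
import numpy as np

rng = np.random.default_rng(7)

def cpt(shape):
    """Random CPT, normalized over axis 0 (the child variable)."""
    t = rng.random(shape)
    return t / t.sum(axis=0, keepdims=True)

p_a, p_b = cpt((2,)), cpt((2,))
p_c = cpt((2, 2, 2))   # P(C | A, B), axes (c, a, b)
p_d = cpt((2, 2))      # P(D | C), axes (d, c)
p_e = cpt((2, 2))      # P(E | D), axes (e, d)
p_f = cpt((2, 2))      # P(F | E), axes (f, e)

f1 = np.einsum("a,cab->bc", p_a, p_c)   # eliminate A -> f1(B, C)
f2 = np.einsum("b,bc->c", p_b, f1)      # eliminate B -> f2(C) = P(C)
f3 = np.einsum("dc,ed->ce", p_d, p_e)   # eliminate D -> f3(C, E)
f4 = np.einsum("ce,fe->cf", f3, p_f)    # eliminate E -> f4(C, F) = P(F | C)

print([f.size for f in (f1, f2, f3, f4)])   # [4, 2, 4, 4]

# Combine and normalize for a particular evidence value, say F's index 0:
unnorm = f2 * f4[:, 0]                  # P(C, F=0)
p_c_given_f = unnorm / unnorm.sum()     # P(C | F=0)
assert np.isclose(p_c_given_f.sum(), 1.0)
```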