[AIMA] Russell, Stuart J., and Peter Norvig, "Artificial Intelligence: A Modern Approach", 3rd Ed., Prentice Hall, New Jersey, 2010
KI091322 Artificial Intelligence — Module 13: Learning Probabilistic Models (Bayesian Networks). Teknik Informatika, Fakultas Teknologi Informasi, Institut Teknologi Sepuluh Nopember, Surabaya, 2012. Courseware developed in support of Student Centered Learning (SCL) through e-learning: Share ITS
Course Outline
1. Introduction to Intelligent Agents, Uninformed Search (search without additional information)
2. Informed Search (Heuristics)
3. Local Search & Optimization
4. Adversarial Search (search that accounts for the other party's state)
5. Constraint Satisfaction Problems (search under constraints)
6. Reasoning: propositional logic
7. Reasoning: predicate logic
8. Reasoning: first order logic
9. Reasoning: inference in first order logic (forward and backward chaining)
10. Learning Probabilistic Models (1+2)
Background
• Learning probabilistic models = a way to make decisions under uncertain conditions
• Real-world problems come with incomplete information
• A solution to a real-world problem can be found by learning the probabilities of events
• Practical step: represent the real-world problem as a probability-based model and reason over it (probabilistic inference)
• Basic approach: Bayesian techniques
Uncertainty
Define At = leaving for the airport t minutes before the flight. Does action At get the passenger there on time?
Problems: traffic jams, flat tires, and the like.
A possible conclusion:
"A25 will get the passenger there on time IF there is no accident on the road, no traffic jam, no rain, no flat tire, etc."
(A1440 would also get them there on time, but they would have to spend the previous night at an airport hotel …)
Decisions under Uncertainty
Suppose we believe the following:
P(A25 gets me there on time | …) = 0.04
P(A90 gets me there on time | …) = 0.70
P(A120 gets me there on time | …) = 0.95
P(A1440 gets me there on time | …) = 0.9999
Which action to choose depends on preferences for missing the flight vs. waiting time at the airport, etc.
– Utility theory is used to represent and infer preferences
– Decision theory = probability theory + utility theory
Probability
• The probability of a coin flip (head or tail):
P(Coin = tail) = 1/2, abbreviated P(tail) = 1/2
Probability
• The probability of having cancer:
P(has cancer) = 0.02 ⇒ P(¬has cancer) = 0.98
Joint Probability
• Multiple events: cancer, test result
– Based on field data (e.g. 1000 patients)

P(has cancer, test positive):

Has cancer?   Test positive?   P(C, TP)
yes           yes              0.018
yes           no               0.002
no            yes              0.196
no            no               0.784

(Of the 1000 patients: 0.018 × 1000 = 18 have cancer and test positive; 0.784 × 1000 = 784 have no cancer and test negative.)
Conditional Probability
• Diagnostic question: how likely is cancer, given a positive test result?

P(has cancer | test positive) = ?

Has cancer?   Test positive?   P(C, TP)
yes           yes              0.018
yes           no               0.002
no            yes              0.196
no            no               0.784

P(C | TP) = P(C, TP) / P(TP) = 0.018 / (0.018 + 0.196) = 0.018 / 0.214 ≈ 0.084
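The computation above can be carried out mechanically from the joint table; a minimal Python sketch (variable names are illustrative):

```python
# Sketch: recover a conditional probability from the joint table above.
joint = {
    ("yes", "yes"): 0.018,  # (has cancer, test positive)
    ("yes", "no"):  0.002,
    ("no",  "yes"): 0.196,
    ("no",  "no"):  0.784,
}

# Marginal: P(test positive) = sum over both cancer values
p_tp = sum(p for (c, tp), p in joint.items() if tp == "yes")

# Conditional: P(cancer | test positive) = P(cancer, test positive) / P(test positive)
p_c_given_tp = joint[("yes", "yes")] / p_tp

print(round(p_tp, 3))          # 0.214
print(round(p_c_given_tp, 3))  # 0.084
```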
Bayes Network (BN) vs Independence
• MODEL — BN: P(cancer) and P(Test positive | cancer)
• Independence model: Cancer and Test positive are unconnected nodes, so P(C, TP) = P(C) P(TP)
• Bayes network: Cancer → Test positive
• PREDICTION — BN: compute P(Test positive)
• DIAGNOSTIC REASONING — BN: compute P(Cancer | test positive)
Graphical Notation: Coin
• N independent coin flips: nodes X1, X2, …, Xn with no arcs between them
• No flip depends on any other flip: absolute independence
Example: Coin Flips
Each node X1, X2, …, Xn has the same distribution:

Xi   P
h    0.5
t    0.5

Only distributions whose variables are absolutely independent can be represented by a Bayes' net with no arcs.
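With no arcs, the joint probability of any full assignment is simply the product of the individual coin probabilities; a small sketch (the n = 4 outcome is illustrative):

```python
# With no arcs, every variable is independent, so the joint is a plain product.
from functools import reduce

p = {"h": 0.5, "t": 0.5}        # the same CPT for every X_i
outcome = ["h", "t", "h", "h"]  # one full assignment, here n = 4 flips

joint = reduce(lambda acc, x: acc * p[x], outcome, 1.0)
print(joint)  # 0.0625 = 0.5 ** 4
```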
Graphical Notation: Traffic
• Variables:
– R: It rains
– T: There is traffic
• Model 1: independence (R and T unconnected)
• Model 2 (better): rain causes traffic (R → T)
• Model 2 is better because it matches real-world conditions more closely
Example: Traffic
R → T, with CPTs:

P(R):            P(T | +r):        P(T | ¬r):
+r   1/4         +t   3/4          +t   1/2
¬r   3/4         ¬t   1/4          ¬t   1/2
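The two-node net R → T encodes the full joint via the chain rule, P(r, t) = P(r) P(t | r); a quick sketch with the CPT values above:

```python
# CPTs for the two-node net R -> T (values from the traffic example).
p_r = {"+r": 0.25, "-r": 0.75}
p_t_given_r = {
    "+r": {"+t": 0.75, "-t": 0.25},
    "-r": {"+t": 0.50, "-t": 0.50},
}

# Chain rule: P(r, t) = P(r) * P(t | r)
joint = {(r, t): p_r[r] * p_t_given_r[r][t]
         for r in p_r for t in ("+t", "-t")}

print(joint[("+r", "+t")])  # 0.1875 = 1/4 * 3/4
print(sum(joint.values()))  # 1.0 (a valid distribution)
```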
Independence
• Two variables are independent if:
P(X, Y) = P(X) P(Y) for all values x, y
– In other words: P(X | Y) = P(X)
– Written as: X ⟂ Y
Example: Independence
• N coin flips (fair, independent); every Xi has the same distribution:

Xi   P
h    0.5
t    0.5

so the joint factors as P(X1, …, Xn) = P(X1) ⋯ P(Xn).
Another Example: Independence?
T = temperature, W = weather, with marginals:

T      P          W      P
warm   0.5        sun    0.6
cold   0.5        rain   0.4

Two candidate joint distributions P(T, W):

Table 1:              Table 2:
T     W     P         T     W     P
warm  sun   0.4       warm  sun   0.3
warm  rain  0.1       warm  rain  0.2
cold  sun   0.2       cold  sun   0.3
cold  rain  0.3       cold  rain  0.2

Table 2 equals the product of the marginals in every cell (e.g. 0.5 × 0.6 = 0.3), so its variables are independent; Table 1 does not factor, so its variables are not independent.
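Whether a joint table factors into its marginals can be checked cell by cell; a sketch using the two candidate tables above (the names joint1/joint2 follow the left/right order on the slide):

```python
# Marginals from the temperature/weather example.
p_t = {"warm": 0.5, "cold": 0.5}
p_w = {"sun": 0.6, "rain": 0.4}

# The two candidate joints over (T, W).
joint1 = {("warm", "sun"): 0.4, ("warm", "rain"): 0.1,
          ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}
joint2 = {("warm", "sun"): 0.3, ("warm", "rain"): 0.2,
          ("cold", "sun"): 0.3, ("cold", "rain"): 0.2}

def independent(joint):
    # T and W are independent iff P(t, w) == P(t) * P(w) for every cell.
    return all(abs(p - p_t[t] * p_w[w]) < 1e-9 for (t, w), p in joint.items())

print(independent(joint1))  # False
print(independent(joint2))  # True
```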
Example Bayes Network: Toothache
Cavity → Toothache, Cavity → Catch

P(cavity): 1 parameter
P(toothache | cavity): 2 parameters
P(catch | cavity): 2 parameters

Versus: 2^3 − 1 = 7 parameters for the full joint.

P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
Conditional Independence
• P(Toothache, Cavity, Catch) has 2^3 − 1 = 7 independent entries
• If I have a toothache, a dental probe might be more likely to catch
• But: if I have a cavity, the probability that the probe catches doesn't depend on whether I have a toothache:
– P(+catch | +toothache, +cavity) = P(+catch | +cavity)
• The same independence holds if I don't have a cavity:
– P(+catch | +toothache, ¬cavity) = P(+catch | ¬cavity)
• Catch is conditionally independent of Toothache given Cavity:
– P(Catch | Toothache, Cavity) = P(Catch | Cavity)
• Equivalent statements:
– P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
– P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
– One can be derived from the other easily
• Written as: Catch ⟂ Toothache | Cavity
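This conditional independence can be verified numerically. The slide shows only the toothache entries of the joint (0.108, 0.012, 0.016, 0.064); the remaining values below are the ones from the AIMA textbook's toothache example:

```python
# Full joint over (cavity, toothache, catch); the non-toothache entries are
# taken from the AIMA toothache example, not shown on the slide.
joint = {
    (1, 1, 1): 0.108, (1, 1, 0): 0.012,
    (1, 0, 1): 0.072, (1, 0, 0): 0.008,
    (0, 1, 1): 0.016, (0, 1, 0): 0.064,
    (0, 0, 1): 0.144, (0, 0, 0): 0.576,
}

def p(pred):
    # Sum the joint over all assignments satisfying the predicate.
    return sum(v for k, v in joint.items() if pred(*k))

# P(+catch | +toothache, +cavity) vs P(+catch | +cavity)
lhs = p(lambda c, t, k: c and t and k) / p(lambda c, t, k: c and t)
rhs = p(lambda c, t, k: c and k) / p(lambda c, t, k: c)
print(round(lhs, 3), round(rhs, 3))  # 0.9 0.9 -> conditionally independent
```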
Example Bayes Network: Grass & Sprinkler
(figure: the sprinkler network, with Rain and Sprinkler as causes of the grass being wet)
• Given the fact that the grass is wet, what is the probability that the sprinkler was on, and what is the probability that it rained?
• Each answer is obtained by summing the joint over the remaining variables and dividing by a normalizing constant (the probability of the evidence).
Example Bayes Network: Car
(figure: a car-diagnosis network; the diagram is not recoverable from the slide)
Example: Alarm Network
• Variables:
– B: Burglary
– A: Alarm goes off
– M: Mary calls
– J: John calls
– E: Earthquake!

Network: Burglary → Alarm ← Earthquake; Alarm → John calls; Alarm → Mary calls

Independence questions:
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? No
P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)? No
P(E | B, A, J, M) = P(E | A, B)? Yes
Example: Alarm Network (CPTs)
Burglary → Alarm ← Earthquake; Alarm → John calls; Alarm → Mary calls

P(B):             P(E):
+b   0.001        +e   0.002
¬b   0.999        ¬e   0.998

P(A | B, E):
B    E    A    P(A | B, E)
+b   +e   +a   0.95
+b   +e   ¬a   0.05
+b   ¬e   +a   0.94
+b   ¬e   ¬a   0.06
¬b   +e   +a   0.29
¬b   +e   ¬a   0.71
¬b   ¬e   +a   0.001
¬b   ¬e   ¬a   0.999

P(J | A):              P(M | A):
+a   +j   0.9          +a   +m   0.7
+a   ¬j   0.1          +a   ¬m   0.3
¬a   +j   0.05         ¬a   +m   0.01
¬a   ¬j   0.95         ¬a   ¬m   0.99
Bayes Net Semantics
• A set of nodes, one per variable X
• A directed, acyclic graph: parents A1, …, An pointing into X
• A conditional distribution for each node
– A collection of distributions over X, one for each combination of the parents' values
– CPT: conditional probability table
– Description of a noisy "causal" process

A Bayes net = Topology (graph) + Local Conditional Probabilities
Probabilities in BNs
• Bayes nets implicitly encode joint distributions
– As a product of local conditional distributions
– To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:
P(x1, …, xn) = ∏i P(xi | parents(Xi))
– Example (alarm network):
P(+b, ¬e, +a, +j, +m) = P(+b) P(¬e) P(+a | +b, ¬e) P(+j | +a) P(+m | +a)
• This lets us reconstruct any entry of the full joint
• Not every BN can represent every joint distribution
– The topology enforces certain conditional independencies
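Using the CPT values from the alarm-network tables earlier, the product of local conditionals for the assignment (+b, ¬e, +a, +j, +m) works out as follows (a sketch; the dictionary layout is illustrative):

```python
# CPTs from the alarm network tables (True = "+", False = negated).
p_b = {True: 0.001, False: 0.999}
p_e = {True: 0.002, False: 0.998}
p_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(+a | b, e)
p_j = {True: 0.9, False: 0.05}                      # P(+j | a)
p_m = {True: 0.7, False: 0.01}                      # P(+m | a)

# P(+b, -e, +a, +j, +m) = P(+b) P(-e) P(+a|+b,-e) P(+j|+a) P(+m|+a)
prob = p_b[True] * p_e[False] * p_a[(True, False)] * p_j[True] * p_m[True]
print(prob)  # ~0.000591
```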
Bayes’ Nets
• A Bayes’ net is an efficient encoding of a probabilistic model of a domain
• Questions we can ask:
– Inference: given a fixed BN, what is P(X | e)?
– Representation: given a BN graph, what kinds of distributions can it encode?
– Modeling: what BN is most appropriate for a given domain?
Causal Chains
• This configuration is a "causal chain": X → Y → Z
– X: Low pressure, Y: Rain, Z: Traffic
– Is X independent of Z given Y? Yes!
– Evidence along the chain "blocks" the influence
Common Cause
• Another basic configuration: two effects of the same cause: X ← Y → Z
– Y: Alarm, X: John calls, Z: Mary calls
– Are X and Z independent? No, not in general
– Are X and Z independent given Y? Yes!
– Observing the cause blocks influence between effects.
Common Effect
• Last configuration: two causes of one effect (v-structure): X → Y ← Z
– X: Raining, Z: Ballgame, Y: Traffic
– Are X and Z independent?
• Yes: the ballgame and the rain cause traffic, but they are not correlated
• (Still need to prove they must be independent — try it!)
– Are X and Z independent given Y?
• No: seeing traffic puts the rain and the ballgame in competition as explanations!
– This is backwards from the other cases
• Observing an effect activates influence between possible causes.
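"Explaining away" can also be seen numerically in the alarm network: conditioning on the alarm makes burglary fairly likely, but additionally observing an earthquake drives the burglary probability back down. A sketch using the CPTs from the earlier tables:

```python
# Alarm-network CPTs (True = "+", False = negated).
p_b = {True: 0.001, False: 0.999}
p_e = {True: 0.002, False: 0.998}
p_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(+a | b, e)

# Joint over (B, E) with A = +a fixed, by enumeration.
joint_a = {(b, e): p_b[b] * p_e[e] * p_a[(b, e)]
           for b in (True, False) for e in (True, False)}

# P(+b | +a): normalize over all (b, e); P(+b | +a, +e): normalize over b only.
p_b_given_a = sum(v for (b, e), v in joint_a.items() if b) / sum(joint_a.values())
p_b_given_ae = joint_a[(True, True)] / (joint_a[(True, True)] + joint_a[(False, True)])

print(round(p_b_given_a, 3))   # 0.374
print(round(p_b_given_ae, 3))  # 0.003  (the earthquake "explains away" the alarm)
```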
The General Case
• Any complex example can be analyzed using these three canonical cases
• General question: in a given BN, are two variables independent (given evidence)?
• Solution: analyze the graph
Reachability
• Recipe: shade evidence nodes
• Attempt 1: Remove shaded nodes. If two nodes are still connected by an undirected path, they are not conditionally independent
• Almost works, but not quite
– Where does it break?
– Answer: the v-structure at T doesn't count as a link in a path unless "active"
(figure: example graph over nodes L, R, B, D, T, with a v-structure meeting at T)
Reachability (D-Separation)
• Question: Are X and Y conditionally independent given evidence vars {Z}?
– Yes, if X and Y are "separated" by Z
– Look for active paths from X to Y
– No active paths = independence!
• A path is active if each triple along it is active:
– Causal chain A → B → C where B is unobserved (either direction)
– Common cause A ← B → C where B is unobserved
– Common effect (aka v-structure) A → B ← C where B or one of its descendants is observed
• All it takes to block a path is a single inactive segment
(figure: the active vs. inactive triples)
Example
(figure: a graph over R, B, T, T' with independence queries; only one answer, "Yes", survives extraction)
Example
(figure: a graph over L, R, D, B, T, T' with independence queries; three answers, "Yes", survive extraction)
Example
• Variables:
– R: Raining
– T: Traffic
– D: Roof drips
– S: I'm sad
• Questions: (figure: a network over R, T, D, S with independence queries; only one answer, "Yes", survives extraction)
Causality? • When Bayes’ nets reflect the true causal patterns: – Often simpler (nodes have fewer parents) – Often easier to think about – Often easier to elicit from experts
• BNs need not actually be causal – Sometimes no causal net exists over the domain – End up with arrows that reflect correlation, not causation
• What do the arrows really mean? – Topology may happen to encode causal structure – Topology only guaranteed to encode conditional independence