KI091322 Kecerdasan Buatan - Share ITS

KI091322 Kecerdasan Buatan Materi 13: Learning Probabilistic Models (Bayesian Network) Teknik Informatika, Fakultas Teknologi Informasi Institut Tekno...

3 downloads 604 Views 727KB Size
[AIMA] Russel, Stuart J., Peter Norvig, "Artificial Intelligence, A Modern Approach" 3rd Ed., Prentice Hall, New Jersey, 2010

KI091322 Kecerdasan Buatan Materi 13: Learning Probabilistic Models (Bayesian Network) Teknik Informatika, Fakultas Teknologi Informasi Institut Teknologi Sepuluh Nopember Surabaya 2012 Pengembangan Bahan Ajar sebagai Pendukung Student Centered-Learning (SCL) melalui e-Learning : Share ITS

Daftar Materi 1.

Pengenalan Intelligent Agents, Uninfomed Search (pencarian tanpa informasi tambahan) 2. Informed Search (Heuristics) 3. Local Search & Optimization (pencarian dengan optimasi) 4. Adversarial Search (pencarian dengan melihat status pihak lainnya) 5. Constraint Satisfaction Problems (pencarian dengan batasan kondisi) 6. Reasoning: propositional logic 7. Reasoning: predicate logic 8. Reasoning: first order logic 9. Reasoning: inference in first order logic (forward-backward chaining untuk inference) 10. Learning Probabilistic Models (1+2) 2

Latar Belakang • Learning Probabilistic Models = solusi yang diambil dari ketidakpastian kondisi • Kurangnya informasi di problem nyata • Solusi dari problem nyata dapat ditemukan jika ada pembelajaran kemungkinan kejadian • Langkah praktis: buat representasi problem nyata menjadi model dugaan berdasarkan probabilitas (probabilistic inference ) • Pendekatan dasar: teknik Bayesian 3

Ketidakpastian (Uncertainty) Didefinisikan At = ke bandara t menit sebelum penerbangan Apakah kejadian At membuat penumpang tidak terlambat? Permasalahan: jalan macet, ban bocor dan semacamnya Kemungkinan dugaan kejadian: “A25 memungkinkan penumpang tidak terlambat penerbangan JIKA tidak ada kecelakaan di jalan, tidak ada kemacetan, tidak ada hujan, tidak terjadi ban bocor, dll.” (A1440 juga memungkinkan tepat waktu tetapi harus bermalam hari sebelumnya di hotel bandara …)

Keputusan dengan Ketidakpastian Kondisi Semisal dugaan kemungkinan P(A25 jadi on time | …) P(A90 jadi on time | …) P(A120 jadi on time | …) P(A1440 jadi on time | …)

= 0.04 = 0.70 = 0.95 = 0.9999

Keputusan yang diambil bergantung pada preferences untuk missing flight vs. waktu tunggu di bandara, etc. – Utility theory is used to represent and infer preferences – Decision theory = probability theory + utility theory

Probabilitas • Probabilitas koin (bagian depan, head, dan belakang, tail)

1 P(COIN = tail) = 2 1 P(tail) = 2

6

Probabilitas • Probabilitas penderita kanker

P(has cancer) = 0.02 Þ P(Ø has cancer) = 0.98

7

Probabilitas Gabung, Joint Probability • Variasi kejadian (event): cancer, test result – Berdasarkan data di lapangan (misal 1000 pasien)

P(has cancer, test positive) Has cancer?

Test positive?

P(C,TP)

yes

yes

0.018

yes

no

0.002

no

yes

0.196

no

no

0.784

18 org + dari hasil tes

784 org - dari hasil tes

8

Probabilitas Bersyarat, Conditional Probability • Pertanyaan diagnosa: seberapa mungkin indikasi kanker ditemukan dengan tes?

P(has cancer | test positive) = ? Has cancer?

Test positive?

P(TP, C)

yes

yes

0.018

yes

no

0.002

no

yes

0.196

no

no

0.784

P(C | TP) = P(C, TP) / P(TP) = 0.018 / 0.214 = 0.084 9

Bayes Network (BN) vs Independence MODEL BN: P(cancer) and P(Test positive | cancer)

P(C, TP )  P(C)  P(TP ) Independence

Bayes Network

PREDICTION BN: Hitung P(Test positive)

Cancer

DIAGNOSTIC REASONING BN: Hitung P(Cancer | test positive)

Cancer

versus

Test positive

Test positive 10

Notasi Grafik: Coin • N independent coin flips

X1

X2

Xn

• Tidak ada ketergantungan untuk setiap kejadian pelemparan: absolute independence 11

Example: Coin Flips X1

X2

Xn

h

0.5

h

0.5

h

0.5

t

0.5

t

0.5

t

0.5

Only distributions whose variables are absolutely independent can be represented by a Bayes’ net with no arcs. 12

Notasi Grafik: Traffic • Variables: – R: It rains – T: There is traffic

R

• Model 1: independence • Model 2 (better): rain causes traffic

T

• Model 2 lebih baik karena lebih sesuai dengan kondisi nyata 13

Example: Traffic R

+r

T r

+r

1/4

r

3/4

+t

3/4

t

1/4

+t

1/2

t

1/2

14

Independence • 2 variabel disebut independent jika: – Dengan kata lain:

– Tertulis :

15

Contoh: Independence • N pelemparan koin (fair, independent): h

0.5

h

0.5

h

0.5

t

0.5

t

0.5

t

0.5

16

Contoh Lain: Independence? T = temperature W = weather

T

P

warm

0.5

cold

0.5

T

W

P

T

W

P

warm

sun

0.4

warm

sun

0.3

warm

rain

0.1

warm

rain

0.2

cold

sun

0.2

cold

sun

0.3

cold

rain

0.3

cold

rain

0.2

W

P

sun

0.6

rain

0.4 17

Contoh Bayes Network: Sakit Gigi P(cavity) Cavity

P(catch | cavity)

1 parameter

P(toothache | cavity)

2 parameters

2 parameters

Catch

Headache

Versus: 23-1 = 7 parameters

P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2

18

Conditional Independence • P(Toothache, Cavity, Catch) has 23 – 1 = 7 independent entries • If I have a Toothache, a dental probe might be more likely to catch • But: if I have a cavity, the probability that the probe catches doesn't depend on whether I have a toothache: – P(+catch | +toothache, +cavity) = P(+catch | +cavity)

• The same independence holds if I don’t have a cavity: – P(+catch | +toothache, cavity) = P(+catch| cavity)

• Catch is conditionally independent of Toothache given Cavity: – P(Catch | Toothache, Cavity) = P(Catch | Cavity)

• Equivalent statements: – P(Toothache | Catch , Cavity) = P(Toothache | Cavity) – P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity) – One can be derived from the other easily

• Tertulis : 19

Contoh Bayes Network: Rumput & Air

20

Contoh Bayes Network: Rumput & Air • Apabila fakta bahwa kondisi rumput adalah basah, maka bagaimana kemungkinan terjadinya sprinkler menyala, dan kemungkinan terjadinya hujan. normalizing constant

kemungkinan

Contoh Bayes Network: Car

22

Contoh: Alarm Network • Variables – B: Burglary – A: Alarm goes off – M: Mary calls – J: John calls – E: Earthquake!

Earthquake

Burglary

Alarm

John calls

P(J | M) = P(J)? No P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No P(B | A, J, M) = P(B | A)? Yes P(B | A, J, M) = P(B)? No P(E | B, A ,J, M) = P(E | A)? No P(E | B, A, J, M) = P(E | A, B)? Yes

Mary calls

23

Example: Alarm Network B

P(B)

+b

0.001

b

0.999

Burglary

Earthqk

E

P(E)

+e

0.002

e

0.998

B

E

A

P(A|B,E)

+b

+e

+a

0.95

+b

+e

a

0.05

+b

e

+a

0.94

Alarm

John calls

Mary calls

A

J

P(J|A)

A

M

P(M|A)

+b

e

a

0.06

+a

+j

0.9

+a

+m

0.7

b

+e

+a

0.29

+a

j

0.1

+a

m

0.3

b

+e

a

0.71

a

+j

0.05

a

+m

0.01

b

e

+a

0.001

a

j

0.95

a

m

0.99

b

e

a

0.999

Bayes Net Semantics • A set of nodes, one per variable X

• A directed, acyclic graph

A1

An

• A conditional distribution for each node – A collection of distributions over X, one for each combination of parents’ values

X – CPT: conditional probability table – Description of a noisy “causal” process

A Bayes net = Topology (graph) + Local Conditional Probabilities 25

Probabilities in BNs • Bayes nets implicitly encode joint distributions – As a product of local conditional distributions – To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:

– Example:

• This lets us reconstruct any entry of the full joint • Not every BN can represent every joint distribution – The topology enforces certain conditional independencies 26

Bayes’ Nets • A Bayes’ net is an efficient encoding of a probabilistic model of a domain • Questions we can ask: – Inference: given a fixed BN, what is P(X | e)? – Representation: given a BN graph, what kinds of distributions can it encode? – Modeling: what BN is most appropriate for a given domain? 27

Causal Chains • This configuration is a “causal chain” X: Low pressure

X

Y

Z

Y: Rain

Z: Traffic

– Is X independent of Z given Y?

Yes! – Evidence along the chain “blocks” the influence 28

Common Cause • Another basic configuration: two effects of the same cause

Y

– Are X and Z independent? – Are X and Z independent given Y?

X

Z

Y: Alarm

X: John calls Z: Mary calls

Yes! – Observing the cause blocks influence between effects. 29

Common Effect • Last configuration: two causes of one effect (v-structures) – Are X and Z independent? • Yes: the ballgame and the rain cause traffic, but they are not correlated • Still need to prove they must be (try it!)

– Are X and Z independent given Y? • No: seeing traffic puts the rain and the ballgame in competition as explanation?

– This is backwards from the other cases

X

Z

Y X: Raining Z: Ballgame

Y: Traffic

• Observing an effect activates influence between possible causes. 30

The General Case • Any complex example can be analyzed using these three canonical cases • General question: in a given BN, are two variables independent (given evidence)? • Solution: analyze the graph

31

Reachability • Recipe: shade evidence nodes L

• Attempt 1: Remove shaded nodes. If two nodes are still connected by an undirected path, they are not conditionally independent

R

B

• Almost works, but not quite – Where does it break? – Answer: the v-structure at T doesn’t count as a link in a path unless “active”

D

T

32

Reachability (D-Separation) • Question: Are X and Y conditionally independent given evidence vars {Z}? – Yes, if X and Y “separated” by Z – Look for active paths from X to Y – No active paths = independence!

• A path is active if each triple is active:

– Causal chain A  B  C where B is unobserved (either direction) – Common cause A  B  C where B is unobserved – Common effect (aka v-structure) A  B  C where B or one of its descendents is observed

• All it takes to block a path is a single inactive segment

Active Triples

Inactive Triples

Example

Yes

R

B

T

T’

34

Example L Yes R

Yes

D

B

T

Yes T’ 35

Example • Variables: – R: Raining – T: Traffic – D: Roof drips – S: I’m sad

R

T

• Questions:

D

S Yes 36

Causality? • When Bayes’ nets reflect the true causal patterns: – Often simpler (nodes have fewer parents) – Often easier to think about – Often easier to elicit from experts

• BNs need not actually be causal – Sometimes no causal net exists over the domain – End up with arrows that reflect correlation, not causation

• What do the arrows really mean? – Topology may happen to encode causal structure – Topology only guaranteed to encode conditional independence

37