Propagating Belief Functions in AND Trees (Dempster-Shafer Theory)

Srivastava & Shenoy, 1995.

General propagation method is quite complex and can be simplified when the network is an AND-tree.

Consider simple evidential network with binary variables XX, O1O_1 and O2O_2 with frames (values) ΘX={x,x}\Theta_X=\{x, \sim x\}, ΘO1={o1,o1}\Theta_{O_1}=\{o_1, \sim o_1\}, ΘO2={o2,o2}\Theta_{O_2}=\{o_2, \sim o_2\}.

simple network

We assume O1,O2O_1, O_2 are related to XX through and AND node: X=xX=x iff O1=o1O_1=o_1 and O2=o2O_2=o_2. This relationship is incorporated by assuming that the frame of the relational node RR is:

ΘR={(x,o1,o2),(x,o1,o2),(x,o1,o2),(x,o1,o2)}\Theta_R=\{(x, o_1, o_2), (\sim x, \sim o_1, \sim o_2), (\sim x, o_1, \sim o_2), (\sim x, o_1, \sim o_2)\}

Evidence for a variable is represented by a basic probability assignment (bpa) function:

mX=(m(x),m(x),m({x,x}))m_X = \big(m(x), m(\sim x), m(\{x, \sim x\})\big)

We assumed one item of evidence for each variable. If more than one item of evidence bear on a node, we need to first combine the items of evidence using Dempster's rule of combination.

Recall from Dempster-Shafer theory

Vacuous Extension

Whenever a set of m-values is propagated from a smaller node (fewer variables) to a bigger node (more variables), the m-values are said to be vacuously extended onto the frame of the bigger node.

Suppose we have mO1(o1),mO1(o1),mO1({o1,o1})m_{O_1}(o_1), m_{O_1}(\sim o_1), m_{O_1}(\{o_1, \sim o_1\}) defined on the frame ΘO1={o1,o1}\Theta_{O_1} = \{o_1, \sim o_1\}

We want to vacuously extend them to a bigger node consisting of O1O_1 and O2O_2. Entire frame of combined node is given by cartesian product: ΘO1O2={(o1,o2),(o1,o2),(o1,o2),(o1,o2)}\Theta_{O_1O_2}=\{(o_1, o_2), (\sim o_1, o_2), (o_1, \sim o_2), (\sim o_1, \sim o_2)\}. The vacuous extension gives:

m({(o1,o2),(o1,o2)})=mO1(o1)m(\{(o_1, o_2), (o_1, \sim o_2)\}) = m_{O_1}(o_1)

m({(o1,o2),(o1,o2)})=mO1(o1)m(\{(\sim o_1, o_2), (\sim o_1, \sim o_2)\}) = m_{O_1}(\sim o_1)

m(ΘO1O2)=mO1(ΘO1) (entire frame=uncertainty)m(\Theta_{O_1O_2}) = m_{O_1}(\Theta_{O_1}) \text{ (entire frame=uncertainty)}

m-values for other subsets of ΘO1O2\Theta_{O_1O_2} are zero.

Marginalization

Propagating m-values from a node to a smaller node. Suppose we are marginalizing ΘO1O2\Theta_{O_1O_2} onto ΘO1\Theta_{O_1}. Similar to marginalization of probabilities, we sum all m-values over elements of the bigger frame that intersect with elements of the smaller frame:

m(o1)=m(o1,o2)+m(o1,o2)+m({(o1,o2),(o1,o2)})m(o_1) = m(o_1, o_2) + m(o_1, \sim o_2) + m(\{(o_1, o_2), (o_1, \sim o_2)\})

Same for o1\sim o_1 and {o1,o1}\{o_1, \sim o_1\}

Propagation in AND-tree

Note that:

  • evidence bearing directly on node XX will impact indirectly O1O_1 and O2O_2
  • evidence at O1O_1 or O2O_2 will not affect XX by themselves because of the AND
  • evidence at O1O_1 alone will not affect O2O_2 and vice-versa

These properties are special features of AND-trees: in general trees, each node is indirectly affected by the evidence at the other nodes (see Propagating belief functions with local computations, Shenoy and Shafer).

Let's denote:

mXT={mY variable Y}Xm_X^T = \oplus \{m_Y \forall \text{ variable } Y\}^{\downarrow X} (combine all nodes and marginalize on XX).

Goal is to compute mXTm_X^T for all nodes XX given mYm_Y for all nodes YY.

mX{O1,,On}m_{X\leftarrow \{O_1, \dots, O_n\}} denotes bpa function for XX representing marginal of the combination of bpa functions mOim_{O_i}.

Proposition 1

Propagation of m-values from sub-objectives OiO_i to main objective XX.

mXall O’s(x)=i=1nmOi(oi)m_{X\leftarrow \text{all O's}} (x) = \prod_{i=1}^n m_{O_i}(o_i)

mXall O’s(x)=1i=1n[1mOi(oi)]plausibility of oim_{X\leftarrow \text{all O's}} (\sim x) = 1-\prod_{i=1}^n \underbrace{[1-m_{O_i}(\sim o_i)]}_{\text{plausibility of }o_i}

mXall O’s({x,x})=1mXall O’s(x)mXall O’s(x)m_{X\leftarrow \text{all O's}}(\{x, \sim x\}) = 1-m_{X\leftarrow \text{all O's}} (x)-m_{X\leftarrow \text{all O's}} (\sim x)

Proof of Proposition 1

See paper

Proposition 2

Propagation of m-values to a given sub-objective OiO_i from the main objective XX and other subobjectives Oj,jiO_j, j\ne i.

m_{O_i\leftarrow \text{X & all other O's}}(o_i) = K_i^{-1} m_X(x)\prod_{j\ne i}[1-m_{o_j}(\sim o_j)]

m_{O_i\leftarrow \text{X & all other O's}} (\sim o_i) = K_i^{-1} m_X(\sim x)\prod_{j\ne i}m_{O_j}(o_j)

m_{O_i\leftarrow \text{X & all other O's}}(\{o_i, \sim o_i\}) = 1-m_{O_i\leftarrow \text{X & all other O's}}(o_i) - m_{O_i\leftarrow \text{X & all other O's}}(\sim o_i)

where KiK_i is the normalization constant given by Ki=[1mX(x)Ci]K_i=[1-m_X(x)C_i] and Ci=1ji[1moj(oj)]C_i = 1-\prod_{j\ne i}[1-m_{o_j}(\sim o_j)] (measure of conflict).

Proof of Proposition 2

See paper