Jasper de Bock’s PhD thesis on Credal Networks under Epistemic Irrelevance

On 13 May 2015, after four years of intensive research under the enthusiastic supervision of Gert de Cooman, I successfully defended my PhD thesis, entitled “Credal Networks under Epistemic Irrelevance: Theory and Algorithms”. The jury was composed of Fabio Cozman, Enrique Miranda, Serafín Moral, Joris Walraevens, Dirk Aeyels, Dries Benoit, Jan Van Campenhout and Rik Van de Walle.

Very brief summary

My dissertation presents a detailed study of credal networks under epistemic irrelevance [1], which are probabilistic graphical models that can compactly and intuitively represent the uncertainty that is associated with the key variables in some domain, and which can then be used to answer various domain-specific queries (compute inferences) that are of interest to the user. They share many of the nice features of Pearl’s celebrated Bayesian networks [2], but have the added advantage that they can represent uncertainty in a more flexible and realistic way.

Unlike credal networks under strong independence, which generalise Bayesian networks in a similar way, very little was known so far about credal networks under epistemic irrelevance and their properties, and efficient inference was possible only for specific inferences in networks with a tree structure [3]. The goal of my PhD was to change this, that is, to study the properties of credal networks under epistemic irrelevance, and to use these properties to develop efficient inference algorithms for them. If you don’t feel like reading through the more detailed summary that I am about to provide, let me just say that I did indeed achieve this goal: I have unravelled the theoretical properties of credal networks under epistemic irrelevance and I have developed multiple efficient exact inference algorithms for them.

Modelling uncertainty

Since a credal network under epistemic irrelevance is a special type of imprecise probability model, my dissertation starts with a general introduction to the theory of imprecise probabilities [4]. Simply put, whenever it is infeasible to reliably estimate a single probability, this theory allows for the use of a set of probabilities instead, each of whose elements is regarded as a candidate for some ideal ‘true’ probability. However, this simplified view is only one of the many ways to look at or interpret imprecise probabilities. In fact, uncertainty can be expressed without any reference to probabilities, using other imprecise-probabilistic frameworks such as sets of desirable gambles, lower previsions and sets of linear previsions. In the first part of my dissertation, I provide a detailed overview of these different frameworks and their interpretation, and discuss how they are connected to each other. I pay special attention to conditional models, which I regard as primitive concepts whose connection with unconditional models should be established by means of rationality criteria. The main advantage of the resulting so-called coherent conditional models is that they do not suffer from the traditional problems that arise when some of the conditioning events have probability zero. This is especially important in the context of imprecise probabilities, where probability zero cannot be ignored because it may easily happen that an event has lower probability zero but positive upper probability.

Updating and conditioning

Although my overview of imprecise probability theory contains new results that fill some gaps in the literature, its contribution mainly consists in bringing together results from various existing frameworks and connecting them to each other. The first real contribution of this dissertation is my discussion of updating, which is the act of changing a model based on the information that some event has occurred. In probability theory, it has become standard practice to do this by conditioning on that event using Bayes’s rule. Similarly, in an imprecise-probabilistic setting, updating is typically performed by applying an imprecise conditioning rule such as regular or natural extension. However, little argumentation is usually given as to why such an approach would make sense. I help address this problem by providing a firm philosophical justification for using natural and regular extension as updating rules. What makes this justification especially powerful is that I derive it directly in terms of sets of desirable gambles. In this way, I avoid making some of the unnecessarily strong assumptions that are traditionally adopted, such as the existence of an ideal ‘true’ probability mass function.

Multivariate models

In order to apply imprecise probabilities in a multivariate context, additional tools are needed, such as marginalisation, as well as ways of combining these tools with concepts such as conditioning and updating. All of this is well known and relatively easy in terms of probabilities, but it becomes more challenging for some of the other imprecise-probabilistic frameworks that I consider. My dissertation devotes a complete chapter to these issues. I gather the existing tools, add new ones whenever something is missing, and connect all of them with one another. The result is a complete and well-founded theory of multivariate imprecise probabilities that is, to the best of my knowledge, novel in its completeness, generality and consistency. Using this theory, I then formally introduce one of the most important concepts of my dissertation: epistemic irrelevance, which is an imprecise-probabilistic generalisation of independence that is asymmetric. I recall several existing definitions for this notion, argue why only one of them is really adequate, and compare epistemic irrelevance to other imprecise-probabilistic independence notions. Finally, I explain how structural assessments such as epistemic irrelevance can be combined with direct or local partial probability assessments to construct a multivariate uncertainty model.
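To make this notion concrete, here is a minimal numerical sketch in Python (all numbers are invented for illustration): in a small credal set in which X is epistemically irrelevant to Y, the conditional probability bounds on Y after observing X coincide with the unconditional ones.

```python
from itertools import product

# Vertices of a small credal set on (X, Y), built from interval assessments:
# P(x1) in {0.4, 0.7} and, for each x, P(y1 | x) in {0.3, 0.6}, all chosen
# independently.  These numbers are illustrative assumptions.
vertices = []
for px1, qy1_x1, qy1_x2 in product([0.4, 0.7], [0.3, 0.6], [0.3, 0.6]):
    vertices.append({
        ("x1", "y1"): px1 * qy1_x1,
        ("x1", "y2"): px1 * (1 - qy1_x1),
        ("x2", "y1"): (1 - px1) * qy1_x2,
        ("x2", "y2"): (1 - px1) * (1 - qy1_x2),
    })

def bounds(values):
    return min(values), max(values)

# Unconditional bounds on P(Y = y1).
marg = bounds([p[("x1", "y1")] + p[("x2", "y1")] for p in vertices])

# Conditional bounds on P(Y = y1 | X = x), computed vertex by vertex
# (for a convex set, the extrema of this ratio are attained at vertices).
cond = {
    x: bounds([p[(x, "y1")] / (p[(x, "y1")] + p[(x, "y2")]) for p in vertices])
    for x in ("x1", "x2")
}

print(marg, cond)  # all three pairs are (0.3, 0.6) up to rounding
```

Observing X thus teaches us nothing about Y, which is exactly what the irrelevance assessment expresses; the converse need not hold, reflecting the asymmetry of the notion.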

Credal networks under epistemic irrelevance

The rest of my dissertation is concerned with one particular type of multivariate uncertainty model: the irrelevant natural extension of a credal network under epistemic irrelevance. The basic idea is very similar to that of a Bayesian network. The starting point is a collection of domain-specific variables that are connected by means of arrows that reflect how these variables depend on each other. The arrows form a Directed Acyclic Graph (DAG), which simply means that there are no directed cycles. The interpretation of the graph is that for any variable, conditional on its parents, its non-parent non-descendants are epistemically irrelevant. Each variable also has a given local imprecise-probabilistic model, conditional on the values of its parents in the graph. In combination with the assessments of epistemic irrelevance that correspond to the DAG, these local models form a credal network under epistemic irrelevance. The most conservative global uncertainty model that is compatible with all these assessments is called the irrelevant natural extension of the network. This concept was first introduced by Cozman [1], who defined it in terms of sets of probabilities under the simplifying assumption that all probabilities are strictly positive. I drop this positivity assumption and provide definitions in terms of three other frameworks as well: sets of desirable gambles, lower previsions and sets of linear previsions. These different definitions turn out to be closely related, which allows me to translate results that are proved in one framework to analogous results in other frameworks.

Why I studied this type of credal networks

Credal networks under epistemic irrelevance are not the only imprecise-probabilistic generalisations of Bayesian networks. In fact, they are not even all that popular. Most authors prefer to consider a different type of credal networks, called credal networks under strong independence. I believe that the main reason for this lack of popularity of credal networks under epistemic irrelevance has been a profound lack of known theoretical properties, which has surely inhibited the development of tractable inference algorithms. In fact, until now, only one inference algorithm was available, and even then, only for a particular type of inference and for networks whose DAG has a tree structure [3]. Nevertheless, due to the remarkable efficiency of this particular algorithm, which is linear in the size of the network, and because that same inference problem is NP-hard in credal networks under strong independence [5], credal networks under epistemic irrelevance are regarded as a promising alternative that requires—and deserves—further research [6, Section 10.6]. This further research is what my dissertation is all about.

Theoretical properties

One of the main contributions of my dissertation is a detailed study of the theoretical properties of the multivariate uncertainty model that corresponds to a credal network under epistemic irrelevance: the irrelevant natural extension. By focusing on the framework of sets of desirable gambles, I was able to derive some remarkable properties of this model, which I then managed to translate to other frameworks as well. A first important example is a fundamental separating hyperplane result that establishes a connection between the irrelevant natural extension of a complete network and that of its subnetworks. This result leads to various marginalisation, factorisation and external additivity properties. A second important result is that the irrelevant natural extension satisfies a collection of epistemic irrelevancies that is induced by AD-separation, an asymmetric adaptation of d-separation that is proved to satisfy all graphoid properties except symmetry. I also establish connections with the notions of independent natural extension and marginal extension and study the updated models that are obtained by applying regular extension to the irrelevant natural extension of a credal network.

Inference algorithms

In the final part of my dissertation, I show how the theoretical properties that I have proved can be used to develop efficient inference algorithms for credal networks under epistemic irrelevance. A first important contribution consists of two preprocessing techniques that can be used to simplify inference problems before the actual algorithm is applied. I explain how and when it is possible to translate an inference problem in a large network into a similar problem in a smaller network, and show how solving a conditional inference problem can be reduced to solving a series of unconditional ones. In a second set of results, I rephrase inference as a linear optimisation problem. As was already mentioned by Cozman [1], every unconditional inference can be computed by solving a linear program. However, in order to establish this result, he required a simplifying positivity assumption. I show that this positivity assumption is not needed; unconditional inferences can always be characterised as the solution of a linear program. For conditional inferences, multiple such linear programs need to be solved. Unfortunately, the size of these linear programs is exponential in the size of the network, so this method, although generally applicable in principle, is only tractable for small networks. For the specific case of a network that consists of two disconnected binary variables, I was able to solve the corresponding linear program symbolically. In this way, I obtained closed-form expressions for the extreme points of the independent natural extension of two binary models. Fortunately, the intractability of brute-force linear programming methods can often be circumvented by developing other, more efficient and often recursive computational techniques. I illustrate this by means of a number of examples.
My most important algorithmic contribution, and the proverbial icing on the cake, is a collection of recursive algorithms that can efficiently compute various inferences in credal networks under epistemic irrelevance whose graphical structure is a recursively decomposable DAG, which is a new type of DAG that includes trees as a special case.
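The linear-programming characterisation described above can be illustrated on a toy example (invented numbers, and a single ternary variable rather than an actual network): the lower and upper expectations of a gamble over a credal set described by probability intervals are the optima of two linear programs.

```python
import numpy as np
from scipy.optimize import linprog

# Toy credal set on one ternary variable, given by probability intervals:
# p1 in [0.2, 0.5], p2 in [0.1, 0.4], p3 free in [0, 1], with p1 + p2 + p3 = 1.
# (Illustrative numbers only; in a network, the constraint set is much larger.)
f = np.array([1.0, 0.0, 2.0])          # the gamble whose expectation we bound
box = [(0.2, 0.5), (0.1, 0.4), (0.0, 1.0)]
A_eq, b_eq = [[1.0, 1.0, 1.0]], [1.0]  # probabilities sum to one

lower = linprog(f, A_eq=A_eq, b_eq=b_eq, bounds=box).fun
upper = -linprog(-f, A_eq=A_eq, b_eq=b_eq, bounds=box).fun
print(lower, upper)  # 0.7 and 1.6: lower and upper expectation of f
```

For an actual credal network the variables of the program are joint probabilities, so its size grows exponentially with the network, which is precisely why the recursive algorithms matter.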

Conclusion

The main conclusion of my dissertation is that credal networks under epistemic irrelevance satisfy surprisingly many powerful theoretical properties, and that these properties can be exploited to develop efficient exact inference algorithms for large classes of inference problems that were previously presumed to be intractable. Since many of these inference problems are NP-hard in credal networks under strong independence, these results turn credal networks under epistemic irrelevance into a serious, practically feasible alternative type of credal network that should enable practitioners to solve real-life problems for which the necessary inferences were hitherto regarded as intractable.

References

[1] Fabio G. Cozman. Credal networks. Artificial Intelligence, 120(2):199–233, 2000.
[2] Judea Pearl. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, San Mateo, 1988.
[3] Gert de Cooman, Filip Hermans, Alessandro Antonucci, and Marco Zaffalon. Epistemic irrelevance in credal nets: the case of imprecise Markov trees. International Journal of Approximate Reasoning, 51(9):1029–1052, 2010.
[4] Peter Walley. Statistical reasoning with imprecise probabilities. Chapman and Hall, London, 1991.
[5] Denis D. Mauá, Cassio P. de Campos, Alessio Benavoli, and Alessandro Antonucci. Probabilistic inference in credal networks: new complexity results. Journal of Artificial Intelligence Research, 50:603–637, 2014.
[6] Alessandro Antonucci, Cassio P. de Campos, and Marco Zaffalon. Probabilistic graphical models. In Thomas Augustin, Frank P. A. Coolen, Gert de Cooman, and Matthias C. M. Troffaes, editors, Introduction to Imprecise Probabilities, pages 207–229. John Wiley & Sons, Chichester, 2014.

About the author

Jasper De Bock is currently a Post-Doctoral Researcher of the Research Foundation – Flanders (FWO), working at the Electronics and Information Systems department of Ghent University. On 13 May 2015, he obtained his PhD in Mathematical Engineering at the same university.

Ignacio Montes’s PhD thesis on Comparison of Alternatives under Uncertainty and Imprecision

This thesis, supervised by Enrique Miranda and Susana Montes, was defended on May 16th. The jury was composed of Susana Díaz, Serafín Moral and Bernard De Baets.

Summary

This thesis deals with the problem of comparing alternatives defined under a lack of information, which may take the form of uncertainty, imprecision, or both.

  • Alternatives defined under uncertainty are modeled by means of random variables, and are therefore compared using stochastic orders [4].
  • When the alternatives are defined under both uncertainty and imprecision, they are modeled by means of sets of random variables, and tools from the theory of imprecise probabilities [7] are used.
  • When the alternatives to be compared are defined under imprecision, but without uncertainty, they can be modeled using fuzzy sets or any of their extensions, such as Atanassov Intuitionistic Fuzzy Sets [1].

Comparison of random variables

Stochastic orders are methods that allow the comparison of random quantities. In this thesis we have focused on stochastic dominance and statistical preference, two stochastic orders that possess totally different interpretations. The former is based on the direct comparison of the cumulative distribution functions associated with the random variables, so it only uses marginal information, while the latter is based on a probabilistic relation providing preference degrees and using the joint distribution.
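The two orders can be contrasted on a toy example with invented numbers: for discrete variables, first-degree stochastic dominance is a pointwise comparison of CDFs, while statistical preference compares the probabilistic index Q(X, Y) = P(X > Y) + 0.5·P(X = Y) with 0.5, using the joint distribution.

```python
from itertools import product

# Two independent discrete random variables (illustrative numbers):
X = {1: 0.5, 3: 0.5}
Y = {0: 0.5, 2: 0.5}

def cdf(pmf, t):
    """Cumulative distribution function of a discrete pmf at t."""
    return sum(p for v, p in pmf.items() if v <= t)

# First-degree stochastic dominance of X over Y: F_X(t) <= F_Y(t) for all t.
support = sorted(set(X) | set(Y))
dominates = all(cdf(X, t) <= cdf(Y, t) for t in support)

# Statistical preference: Q(X, Y) = P(X > Y) + 0.5 * P(X = Y) >= 0.5,
# computed from the joint distribution (here the product, by independence).
Q = sum(px * py * ((x > y) + 0.5 * (x == y))
        for (x, px), (y, py) in product(X.items(), Y.items()))

print(dominates, Q)  # True 0.75
```

Here X dominates Y and is also statistically preferred (Q = 0.75), consistent with the thesis result that, for independent random variables, stochastic dominance implies statistical preference.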

Although these stochastic orders are not related in general, we have found conditions under which stochastic dominance implies statistical preference. These conditions are based on the type of variables and the copula [5] that links them: independent random variables, simple or continuous and comonotone random variables, simple or continuous and countermonotone random variables or continuous random variables coupled by an Archimedean copula.

Both stochastic dominance and statistical preference are intended for the pairwise comparison of random variables. The reason is that stochastic dominance imposes too strong a condition for comparing three (or more) cumulative distribution functions (they must be ordered!) and statistical preference does not prevent the existence of cycles, that is, it is not transitive [2]. For this reason we have proposed an extension of statistical preference for the comparison of more than two random variables at the same time. It preserves the same advantages as pairwise statistical preference: it is a complete order, it provides degrees of preference (the greater the degree, the more preferred the variable), and it uses all the available information because it is based on the joint distribution of all the random variables. Furthermore, we have shown that this general statistical preference can be applied to decision making problems with linguistic labels.

Comparison of sets of random variables

When the alternatives are defined under both uncertainty and imprecision, they are modeled by means of sets of random variables with an epistemic interpretation, meaning that all we know about the real (but unknown) random variable is that it belongs to the set.

In this framework, the first thing we have done is to extend stochastic orders to the comparison of sets of random variables instead of single ones. Each stochastic order has been extended in six different ways, and the choice between these extensions depends on the given interpretation (pessimistic or optimistic). We have seen that when the stochastic order to be extended is expected utility, our six extensions are closely related to the usual criteria of decision making with imprecise probabilities [6], such as the interval dominance, maximax and maximin criteria.
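A sketch of the decision criteria mentioned above, for invented interval-valued expected utilities (how such intervals arise from a set of random variables is exactly what the six extensions formalize):

```python
# Interval-valued expected utilities [lower, upper] for three alternatives
# (invented numbers for illustration):
E = {"a": (0.2, 0.6), "b": (0.5, 0.9), "c": (0.1, 0.45)}

maximin = max(E, key=lambda k: E[k][0])  # best lower expectation (pessimistic)
maximax = max(E, key=lambda k: E[k][1])  # best upper expectation (optimistic)

# Interval dominance keeps every alternative whose upper expectation is not
# beaten by some other alternative's lower expectation.
undominated = [k for k in E
               if all(E[k][1] >= E[j][0] for j in E if j != k)]

print(maximin, maximax, undominated)  # b b ['a', 'b']
```

Note that interval dominance returns a set of alternatives ('c' is ruled out because its upper expectation 0.45 falls below 'b''s lower expectation 0.5), whereas maximin and maximax each commit to a single choice.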

When the stochastic order to be extended is stochastic dominance, the comparison of the sets of random variables is made by means of the comparison of their associated sets of cumulative distribution functions. Since each set of cumulative distribution functions can be summarized by means of its associated p-box, we have seen that there is a strong connection between the six extensions of stochastic dominance and the comparison of the bounds of the associated p-boxes by means of stochastic dominance. Finally, when we extend statistical preference, the extensions are related to the comparison of the lower or upper medians of the corresponding sets of random variables.

Our general approach can be applied to three particular situations:

1)    When we want to compare belief functions, we can consider their associated credal sets, and apply there our six extensions of stochastic dominance. In this framework, our six extensions give rise to the four possibilities considered by Denoeux in [3] for the comparison of belief functions with respect to stochastic dominance.

2)    When we want to compare random variables with imprecise utilities, we can consider random sets modeling our lack of information. Then, since we are using an epistemic interpretation, all we know about the real (but unknown) random variables is that they belong to the sets of measurable selections. Thus, in order to compare the random sets we can compare their sets of measurable selections by means of the extension of the adequate stochastic order.

3)    When we want to compare random variables and we have imprecise knowledge about the probability of the initial space, we can consider a credal set. In this situation, any random variable defines a set of random variables formed by the combinations of the random variable with the different probabilities in the credal set.

The next step was to investigate how to compare random variables with an imprecise joint distribution. In this situation we use p-boxes and bivariate p-boxes to model the imprecise knowledge about the marginal and joint distribution functions, respectively, and we use sets of copulas, whose information is summarized by an imprecise copula, to model the unknown copula. Sklar’s Theorem [5] is a well-known result in probability theory that allows expressing the joint distribution function in terms of the marginals. However, when there is imprecision about the joint (or marginal) distribution functions, it cannot be applied. For this reason, we have extended Sklar’s Theorem to an imprecise setting. This result has proven very useful in two situations:

1)    When we have two marginal p-boxes and we want to compute their natural extension, we can apply the imprecise Sklar Theorem to the marginals with the imprecise copula determined by the Łukasiewicz and minimum copulas.

2)    When we want to compute the strong product of the marginal p-boxes, we can apply the imprecise Sklar Theorem to the marginals with the product copula.
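Situation 1) can be sketched numerically: with the Łukasiewicz copula W(u, v) = max(u + v − 1, 0) and the minimum copula M(u, v) = min(u, v), applying W to the lower marginal bounds and M to the upper ones yields a bivariate p-box. The grid and bounds below are invented for illustration.

```python
# Marginal p-boxes on a small grid: (lower, upper) bounds on each marginal
# CDF (invented numbers).
xs = [0, 1, 2]
F1 = {0: (0.1, 0.3), 1: (0.4, 0.7), 2: (1.0, 1.0)}  # p-box of X1
F2 = {0: (0.2, 0.5), 1: (0.5, 0.8), 2: (1.0, 1.0)}  # p-box of X2

W = lambda u, v: max(u + v - 1.0, 0.0)  # Lukasiewicz copula (lower Frechet bound)
M = lambda u, v: min(u, v)              # minimum copula (upper Frechet bound)

# Bivariate p-box obtained by combining the marginal p-boxes through the
# imprecise copula [W, M], as in situation 1) above.
joint = {(x, y): (W(F1[x][0], F2[y][0]), M(F1[x][1], F2[y][1]))
         for x in xs for y in xs}

print(joint[(1, 1)])  # (0.0, 0.7)
```

At (1, 1) the joint lower bound collapses to 0 because 0.4 + 0.5 − 1 < 0: without any assumption on the dependence structure, the marginals alone cannot force a positive joint probability there.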

Comparison of intuitionistic fuzzy sets

Finally, when the alternatives to be compared are defined under imprecision, but without uncertainty, they can be modeled using fuzzy sets or any of their extensions, such as Atanassov Intuitionistic Fuzzy Sets (AIFS, for short). Several measures of comparison of this kind of sets can be found in the literature, and they are classified into two main families: distances and dissimilarities. This thesis introduces a new family of measures of comparison of AIFS: divergences, which impose stronger conditions than dissimilarities. We have also seen that these three types of measures can be included in a more general measure of comparison of AIFS, in the sense that, depending on the required conditions, we may obtain a distance, a dissimilarity or a divergence.

A particular type of divergences, those satisfying a local property, is studied in detail, shown to have interesting properties, and applied to decision making and pattern recognition.

Conclusions

The main contributions of this thesis can be summarized as follows:

  • Investigation of the properties of stochastic dominance and statistical preference. In particular, study of conditions under which both are related.
  • Extension of statistical preference for the comparison of more than two random variables simultaneously.
  • Extension of stochastic orders to the comparison of sets of random variables instead of single ones.
  • Particular cases: imprecise stochastic dominance, related to the comparison of bounds of p-boxes, and imprecise statistical preference, related to the comparison of lower and upper medians.
  • The definitions introduced by Denoeux are included as particular case of our more general approach.
  • Two particular situations in decision making: comparison of random variables with imprecise utilities and beliefs.
  • Imprecise version of Sklar’s Theorem.
  • Divergences as a new measure of comparison of AIFS.
  • Local divergences applied to decision making and pattern recognition.

Basic references

[1] K. Atanassov. Intuitionistic Fuzzy Sets. Fuzzy Sets and Systems, 20:87-96, 1986.

[2] B. De Schuymer, H. De Meyer, B. De Baets. A fuzzy approach to stochastic dominance of random variables. Lecture Notes in Artificial Intelligence 2715, 253-260, 2003.

[3] T. Denoeux. Extending stochastic ordering to belief functions on the real line. Information Sciences, 179: 1362-1376.

[4] A. Müller and D. Stoyan. Comparison Methods for Stochastic Models and Risks. Wiley, 2002.

[5] R. Nelsen. An introduction to copulas. Springer, New York, 1999.

[6] M. C. M. Troffaes. Decision making under uncertainty using imprecise probabilities. International Journal of Approximate Reasoning, 45(1):17-29, 2007.

[7] P. Walley. Statistical reasoning with imprecise probabilities. Chapman and Hall, London, 1991.

About the author

Ignacio Montes started his PhD in the UNIMODE Research Unit (http://unimode.uniovi.es) in 2010 with a grant of the Spanish Ministry of Science. He is currently a member of the Department of Statistics and Operations Research of the University of Oviedo, Spain.

Andrea Wiencierz’s PhD thesis on Regression Analysis with Imprecise Data

My PhD thesis deals with the statistical problem of analyzing the
relationship between a response variable and one or more explanatory
variables when these quantities are only imprecisely observed.

Regression methods are some of the most popular and commonly
employed methods of statistical data analysis. Like most statistical
tools, regression methods are usually based on the assumption that
the analyzed data are precise and correct observations of the
variables of interest. In statistical practice, however, often only
incomplete or uncertain information about the data values is
available. In many cases, the incomplete or uncertain information
about the precise values of interest can be expressed by subsets of
the observation space. For example, interval-censored and rounded
data can be represented by intervals. As the representation by
subsets allows considering many different forms of uncertainty about
data within the same framework, this representation is adopted in my
thesis and set-valued observations of real-valued variables are
simply called imprecise data.


The aim of my PhD research was to find a regression method that provides reliable insights about the analyzed relationship, even if the variables are only imprecisely observed.

After a review of different approaches proposed in the literature, in my thesis I present the likelihood-based approach to regression with imprecisely observed variables that we developed in Cattaneo and Wiencierz (2012) and that is named Likelihood-based Imprecise Regression (LIR). In the LIR framework, the regression problem is formalized as a decision problem whose actions are the possible regression functions, whose states are the considered probability distributions, and whose loss function is usually a characteristic of the residuals’ distribution. The LIR methodology consists in determining likelihood-based confidence regions for the loss of the regression problem on the basis of imprecise data and in regarding the set of all regression functions that are not strictly dominated as the imprecise result of the regression analysis. The confidence regions consist of the loss values associated with all probability measures that are (to a chosen degree) plausible in the light of the imprecise data, where the relative plausibility of the considered probability measures is measured by the likelihood function induced by the observations. Given the imprecise decision criteria, the Interval Dominance principle is applied to identify the set-valued result. Hence, a LIR analysis usually yields an imprecise result, which can be interpreted as a confidence set for the unknown regression function.

From the general LIR methodology, a robust regression method was derived in Cattaneo and Wiencierz (2012), where quantiles of the residuals’ distribution are considered as loss. Furthermore, an exact algorithm to implement this regression method for the special case of simple linear regression with interval data was developed in Cattaneo and Wiencierz (2013) and implemented in an R package (Wiencierz, 2012). In my thesis, I present and discuss these in detail. Moreover, I study several statistical properties of the robust LIR method. It turns out that this LIR method is robust in terms of a high breakdown point and that it yields highly reliable results in the sense that the coverage probability of the resulting set of regression functions seems to be generally rather high.
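The set-valued spirit of a LIR analysis can be hinted at with a crude sketch. This is not the actual algorithm of Cattaneo and Wiencierz (2013), which works with quantile loss and likelihood-based confidence regions; here we merely fit least-squares lines to many precise datasets compatible with invented interval data, to show that imprecise observations naturally induce a set of plausible regression functions rather than a single one.

```python
import random

random.seed(1)

# Interval-valued observations (x, [y_lower, y_upper]); invented data.
data = [(0.0, (0.5, 1.5)), (1.0, (1.0, 2.0)),
        (2.0, (2.5, 3.5)), (3.0, (3.0, 4.5))]

def ols(points):
    """Ordinary least-squares slope and intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return b, my - b * mx

# Fit a line to many precise datasets compatible with the intervals; the
# spread of the fitted slopes hints at the set-valued result that LIR
# formalizes (and interprets as a confidence set).
slopes = [ols([(x, random.uniform(lo, hi)) for x, (lo, hi) in data])[0]
          for _ in range(2000)]

print(min(slopes), max(slopes))  # a band of plausible slopes, not one fit
```

The width of this band reflects the data imprecision; in LIR proper, the extent of the set-valued result additionally accounts for statistical uncertainty.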

In addition to the robust LIR method, I also investigate an alternative approach that was proposed in Utkin and Coolen (2011). This approach generalizes Support Vector Regression (SVR) to situations where the response variable is imprecisely observed. It consists in applying a Minimin or a Minimax rule to the imprecise maximum likelihood estimates of the loss values associated with the regression functions, and in both cases yields a precise regression estimate. The set-valued decision criteria here consist of the loss values associated with all marginal probability distributions of the unobserved precise data that are compatible with the empirical distribution of the imprecise data. After discussing this approach in detail, I develop an alternative adaptation of SVR to this situation by following the LIR approach, which further generalizes these methods. In contrast to the Minimin and Minimax methods, the LIR method for SVR usually yields a set-valued result, which appears to be more appropriate when dealing with imprecise data. Moreover, the LIR framework has the advantage that it allows reflecting the data imprecision and accounting for statistical uncertainty at the same time, since both types of uncertainty are expressed by the extent of the set-valued result of the regression analysis.

Finally, I apply the different regression methods to two practical data sets from the contexts of social sciences and winemaking, respectively. In both cases, the LIR analyses provide very cautious inferences.

References

  • M. Cattaneo and A. Wiencierz (2012). Likelihood-based Imprecise Regression. International Journal of Approximate Reasoning 53, 1137-1154.
  • M. Cattaneo and A. Wiencierz (2013). On the implementation of LIR: the case of simple linear regression with interval data. Computational Statistics, in press.
  • L. V. Utkin and F. P. A. Coolen (2011). Interval-valued Regression and Classification Models in the Framework of Machine Learning. Proceedings of the 7th International Symposium on Imprecise Probability: Theories and Applications. SIPTA. pp. 371-380.
  • A. Wiencierz (2012). linLIR: linear Likelihood-based Imprecise Regression. R-package.

About the author

Andrea Wiencierz works in the Working Group Methodological Foundations of Statistics and their Applications at the Department of Statistics of the LMU Munich, Germany, where she defended her PhD thesis on December 13, 2013.

Denis D. Mauá’s PhD Thesis on Algorithms and Complexity Results for Discrete Probabilistic Reasoning Tasks

The Three Dots

My PhD thesis is about connecting three hard computational problems that arise in tasks involving graph-based probabilistic reasoning, namely, the problems of

  • maximum a posteriori (MAP) inference in Bayesian networks,
  • planning with influence diagrams, and
  • belief updating in (strong) credal networks.

Roughly speaking, in the MAP inference problem we seek the most probable explanation of a complex phenomenon represented as a Bayesian network, a graph-based description of a multivariate joint probability distribution in which nodes are identified with random variables and are associated with local conditional probability distributions. By extending a Bayesian network with actions and utilities, we get an influence diagram. The problem of planning with influence diagrams is to select a set of actions that maximizes expected utility. A (strong) credal network is a Bayesian network whose numerical parameters (the local conditional probability distributions) have been imprecisely specified as convex sets (instead of point estimates). Belief updating in (strong) credal networks is the task of computing tight upper and lower bounds on the (conditional) probability of a certain event, and is largely equivalent to the problem of assessing the sensitivity of a probabilistic inference in a Bayesian network to global changes in the model parameters. The similarities and differences among these three problems are more easily captured by the following simple example in fault analysis.

Probabilistic Fault Analysis

Consider the simple electric circuit below, in which a light bulb is
powered by a battery according to the state of four switches.

[Figure: electric circuit]

Suppose we are interested in estimating the probable causes of the
light being off. By inspecting the circuit, we see that such an event
could have been caused by a burned light bulb, a depleted battery, or
simply because there is no power at the light bulb socket (among other
causes that we ignore here). The last event could have been caused by an
interruption of the energy flow on the left or right trail: the trail
on the left is interrupted only if both switches 1 and 2 are off, while
the right trail is interrupted if either switch 3 or switch 4 is off.
Such a collection of causal relations can be graphically represented by
the acyclic directed graph below.

[Figure: graph-based causal reasoning]
If we assign to each node in that graph a conditional probability
distribution of the corresponding event given its parent events (i.e.,
its immediate predecessors in the graph), the result is a fully
specified Bayesian network. If, instead, we associate a closed and
convex set of probability distributions with each node and each
configuration of its parents, we obtain a (separately specified) credal
network. To get an influence diagram, we extend the Bayesian network
with actions (for example, possible fixes or information-gathering
routines) and utilities (e.g., the costs of replacing the light bulb or
the battery, or the loss caused by having no light), as in the figure
below.

[Figure: the influence diagram]
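The Bayesian-network version of this example is small enough to solve by brute force. The sketch below (with made-up probabilities; the thesis gives no numbers) computes one belief-updating query, P(light off), and the MAP explanation of the light being off:

```python
from itertools import product

# Made-up marginals for the root causes (all roots are independent here).
p_on = {'s1': 0.9, 's2': 0.9, 's3': 0.95, 's4': 0.95}  # switch i is on
p_burned = 0.01    # light bulb burned
p_depleted = 0.1   # battery depleted

def joint(state):
    """Probability of one full configuration of the root variables."""
    p = 1.0
    for s, q in p_on.items():
        p *= q if state[s] else 1 - q
    p *= p_burned if state['burned'] else 1 - p_burned
    p *= p_depleted if state['depleted'] else 1 - p_depleted
    return p

def light_off(state):
    """Deterministic 'light off' node, following the circuit description."""
    left_interrupted = not state['s1'] and not state['s2']   # parallel pair
    right_interrupted = not state['s3'] or not state['s4']   # series pair
    no_power = left_interrupted or right_interrupted
    return state['burned'] or state['depleted'] or no_power

names = ['s1', 's2', 's3', 's4', 'burned', 'depleted']
states = [dict(zip(names, bits)) for bits in product([False, True], repeat=6)]

# Belief updating: P(light off) by exhaustive marginalisation.
p_off = sum(joint(s) for s in states if light_off(s))

# MAP inference: most probable root configuration given that the light is off.
map_state = max((s for s in states if light_off(s)), key=joint)
```

With these numbers the MAP explanation is a depleted battery with all four switches on. Exhaustive enumeration is exponential in the number of variables, which is exactly what the algorithms studied in the thesis avoid.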
Connecting the dots

At first sight, these three problems seem connected only through their
combinatorial or optimization nature, and through the use of graphs as
a concise yet intuitive representation of complex phenomena. Nevertheless, correspondences between instances of these problems have long been noticed in the literature. For instance, it was previously known that belief updating in strong credal networks can be reduced to MAP inference in Bayesian networks [Cano, Cano & Moral, 1994] and vice versa [De Campos & Cozman, 2005]. De Campos & Ji [2008] showed that planning with influence diagrams can be reduced to belief updating in credal networks, while the converse was proved by Antonucci & Zaffalon [2008].

In my thesis, jointly with Cassio de Campos and Marco Zaffalon,
we proved the two missing correspondences, namely, that MAP
inference and planning with influence diagrams can be reduced to one
another. We now know that these three problems are strongly tied by
their computational equivalences. These equivalences extend the
algorithmic toolset available for each problem with the algorithms
developed for the other problems, and allow us to derive bounds on the
computational hardness of each problem. Moreover, they provide an
interesting view of each problem instance from the perspective of the
corresponding instances of the other problems. For example, Cano, Cano &
Moral [1994] reduced belief updating in strong credal networks to MAP
problem instances in order to use the algorithms available for the
latter. In a similar fashion, De Campos & Ji [2008] reduced planning
in influence diagrams to belief updating in credal networks so that the
former problem could be solved using algorithms designed for the
latter. Antonucci & Zaffalon [2008] reduced belief updating in
credal networks to planning in influence diagrams in order to provide a
decision-theoretic view of credal networks. De Campos & Cozman
[2005] showed NP-hardness of belief updating in strong credal networks
by a reduction from MAP inference.

More recently, we were able to show that a certain class of planning
problems can be solved in polynomial time by reducing them to instances
of credal belief updating that are known to be polynomial-time
computable. Using the converse reduction, we were able to prove the
NP-hardness of structurally very simple instances of credal belief
updating. Notably, these correspondences allowed us to extend De
Campos’ proof of the polynomial-time approximability of MAP inference
in Bayesian networks of bounded treewidth and variable cardinality to
planning in influence diagrams and belief updating in strong credal
networks. These are the first results concerning the approximability of
these problems. On the more practical side, we were able to develop a
unified algorithmic framework that approximately solves any of the
three problems in polynomial time when both the treewidth and the
variable cardinality are small.

Conclusion

In summary, the main contributions of my work are:

  • a state-of-the-art anytime algorithm for MAP inference;
  • a state-of-the-art exact algorithm for strategy selection in influence diagrams;
  • a proof of NP-hardness of strategy selection in polytree-shaped influence diagrams of bounded treewidth, even in the approximate case;
  • an FPTAS for the strategy selection problem on diagrams of bounded
    treewidth and bounded cardinality;
  • a proof of NP-hardness of strategy selection even in polytree-shaped diagrams with only binary variables, and a result of tractability of the case in which there is a single value node;
  • a proof of NP-hardness of approximate belief updating in credal networks of bounded treewidth (and unbounded variable cardinality);
  • an FPTAS for belief updating in strong credal networks of bounded treewidth and variable cardinality.

About the author

Denis D. Mauá is currently a Post-Doctoral Researcher at the Decision Making Lab of the University of São Paulo. He obtained his PhD in Informatics from the University of Lugano (Università della Svizzera italiana) in September 2013. From September 2009 to December 2013 he was a Research Fellow at IDSIA’s Imprecise Probability Group.