In \cite{haugeland1989artificial}, Haugeland introduced a popular categorization of AI methodologies into symbolic and sub-symbolic AI, simultaneously coining the term Good Old-Fashioned AI (GOFAI) for the former. Symbolic AI systems work with knowledge explicitly declared by humans using symbols in knowledge bases (KBs); manipulation rules are applied to the elements of such a KB to answer queries or solve problems. Sub-symbolic methodologies do not require humans to explicitly encode knowledge, but instead infer model structure and/or parameters from data. Logic programming is a prominent form of symbolic AI, while neural networks are the best-known sub-symbolic models. \par
\cite{lighthill1973ai} infamously concluded that none of the discoveries made in symbolic AI by 1973 had produced the promised impact. After its publication, the UK government cut funding for all but two of its universities' AI research groups, a move widely regarded as the start of AI's first winter; disillusionment with machine translation in particular ran deep. Interest in symbolic AI resurged in the 1980s with expert systems, but when expert systems were found to lack common sense to a problematic degree, the second AI winter began and symbolic AI's reputation suffered damage that lasts to this day. Ever more researchers agreed with Lighthill's diagnosis and conclusions: symbolic AI suffered from a severe combinatorial explosion, both in KB creation and in KB use, owing to its heavy reliance on search algorithms, and researchers' failure to mitigate this issue meant symbolic AI was unlikely to ever scale up to real-world problems. After the second AI winter, sub-symbolic methods came to dominate new AI efforts and breakthroughs. In contrast to symbolic approaches, sub-symbolic methods had a hard and slow start, but since the past decade's deep learning revolution they have tackled industrial challenges such as image classification, text generation and machine translation, and have become the go-to methods for regression and classification problems. \par
However, in recent years researchers have come to reappreciate symbolic methods: while they might not scale well enough on their own, they offer several unique and powerful features and properties. For all the leaps forward in sub-symbolic AI, incorporating a priori knowledge into it remains very difficult, and the ability to actively reason and to explain that reasoning is still largely missing. We should thus see symbolic and sub-symbolic methods not as competing with but as complementary to one another. Works such as \cite{garcez2015neural} even argue that combining symbolic and sub-symbolic ideas is a necessity for building AI that can effectively learn and reason. These claims are supported by independent observations, such as the one by \cite{susskind2021neuro} that such combined methods dominate image and video question answering workloads. This view led to the subfield of neuro-symbolic integration. Many means of integration are being explored, but we are particularly interested in integration through extending the paradigm of probabilistic logic programming (PLP), since it would benefit fields that KU Leuven's DTAI Research Group (for whom this thesis is written) is active in, such as statistical relational learning (SRL) and probabilistic inference. \cite{ng1992probabilistic} already considered PLP, which could be a powerful SRL framework, in 1992. \cite{fierens2015inference} also discusses how, because both PLP inference and general probabilistic inference can be reduced to the same problem of weighted \#SAT, PLP is attractive for the latter given the potential ease of modelling problems in it. \par
Logic programs are KBs containing logical facts and rules (the clauses that together define predicates), declared in a formal logic programming language such as Prolog. To execute a logic program is to query the KB with a logical statement that may contain variables, i.e. a first-order logical statement. The program then returns whether or not the statement is true and, optionally, the set of variable assignments that make it so. Ideally, logic programs have no procedural meaning, but in practice they do: one has to consider the query evaluation algorithm used (such as Prolog's SLD resolution, which is built on unification) to ensure the KB is written in a way that yields the correct results. \par
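As a minimal illustration (a hypothetical family KB, not drawn from any particular source), consider the following Prolog program:
\begin{verbatim}
parent(ann, bob).        % fact: ann is a parent of bob
parent(bob, carol).      % fact: bob is a parent of carol
grandparent(X, Z) :-     % rule: X is a grandparent of Z if
    parent(X, Y),        %   X is a parent of some Y
    parent(Y, Z).        %   and that Y is a parent of Z
\end{verbatim}
The query \texttt{?- grandparent(ann, Z).} succeeds and returns the variable assignment \texttt{Z = carol} that makes the statement true. \par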
There are extensions of logic programming that add further semantics to facts, rules and query results; PLP is one such extension. In PLP, facts and rules are annotated with probabilities of holding, and query results are extended with the probabilities of the returned models. Our extension of interest here is the integration of neural networks into PLP. In 2018, \cite{manhaeve2018deepproblog} demonstrated such an extension of ProbLog (a PLP language based on Prolog, developed by KU Leuven's DTAI Research Group) called DeepProbLog. DeepProbLog introduces neural predicates, which take several input variables and an output variable. Given the instantiated input variables, the output variable is unified either with an underlying neural network's output or with a symbol from a set representing a classification made by an underlying neural network. In the latter case, so-called neural annotated disjunctions (nADs) are created behind the scenes at query time to achieve this. When the parameters of a DeepProbLog program (such as unknown probabilities) are learned, the neural networks included in the model are also trained from the same queries, which may now contain complex tensor inputs as well. \par
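To make this concrete, the following sketch is modelled on the MNIST addition example of \cite{manhaeve2018deepproblog} (the exact syntax may differ between DeepProbLog versions):
\begin{verbatim}
% Neural predicate: classifier mnist_net maps an image X to a
% digit Y from the set 0..9; this expands to a nAD at query time.
nn(mnist_net, [X], Y, [0,1,2,3,4,5,6,7,8,9]) :: digit(X, Y).

% An ordinary logic rule on top of the neural predicate.
addition(X, Y, Z) :- digit(X, DX), digit(Y, DY), Z is DX + DY.
\end{verbatim}
A training query such as \texttt{addition(img1, img2, 7)} then supervises the network only through the label of the sum, never through the individual digit labels. \par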
In appendix \ref{problog_deepproblog_essentials} we review the essentials of ProbLog and DeepProbLog for readers who want more background. \par
DeepProbLog still faces a major challenge, though. While the use of deeper architectures has dramatically improved classification and regression performance, \cite{guo2017calibration} points out that this has come at the cost of model calibration: the degree to which the probabilities a model associates with its predictions match the ground-truth likelihood of those predictions being correct. Good calibration is naturally crucial when incorporating neural networks into PLP (or any other probabilistic model, for that matter). In the case of DeepProbLog we are indeed primarily interested in the neural networks producing accurate probability distributions over the target space given the inputs, rather than concrete deterministic predictions, since we are performing probabilistic rather than deterministic queries. \par
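Following \cite{guo2017calibration}, this can be stated formally. A model with prediction $\hat{Y}$ and associated confidence $\hat{P}$ is perfectly calibrated when
\begin{equation*}
\Pr\left(\hat{Y} = Y \mid \hat{P} = p\right) = p, \qquad \forall p \in [0, 1].
\end{equation*}
The same work measures miscalibration with the expected calibration error (ECE): partition the $n$ predictions into $M$ equal-width confidence bins $B_m$ and compute
\begin{equation*}
\mathrm{ECE} = \sum_{m=1}^{M} \frac{|B_m|}{n} \left| \mathrm{acc}(B_m) - \mathrm{conf}(B_m) \right|,
\end{equation*}
the weighted average gap between each bin's accuracy and its average confidence. \par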
In this work we aim to answer a two-fold research question: can applying a top-performing calibration method to (the relevant parts of) a DeepProbLog model improve its query inference performance? And if so, how specifically should it be applied? We hope the answer will provide valuable heuristics and guidance on this aspect of neuro-symbolic integration to future system designers using DeepProbLog.