PracMLN and Markov Logic Networks
May 26, 2018 15:25 · 510 words · 3 minute read
Markov Logic Networks are a powerful generalisation of probabilistic graphical models and first-order logic. I first came across them a few months ago, when I discovered PracMLN through GSoC. As far as I understand, a Markov Logic Network - MLN for short - consists of a set of statements in first-order logic, like any logical model. What makes it special, and immensely more powerful, is that each statement in the model is assigned a weight. These weights translate into probabilities, so the resulting model of the world can handle uncertainty.
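To make this concrete, here is a minimal sketch in plain Python (not PracMLN's API) of how weighted formulas turn into probabilities: a possible world's probability is proportional to the exponential of the sum of formula weights, each multiplied by the number of that formula's groundings that are true in the world. The domain, formula, and weight below are purely illustrative.

```python
import math

# Toy version of the classic "smoking" domain from the MLN literature.
# Each weighted formula pairs a weight with a function that counts how many
# of its groundings hold in a given world.
def n_smokes_implies_cancer(world):
    # number of people for whom "Smokes(p) => Cancer(p)" is satisfied
    return sum(1 for p in world["people"]
               if (p not in world["smokes"]) or (p in world["cancer"]))

weighted_formulas = [
    (1.5, n_smokes_implies_cancer),  # weight 1.5: smoking tends to cause cancer
]

def unnormalised_probability(world):
    # P(world) is proportional to exp(sum_i w_i * n_i(world)),
    # where n_i counts the true groundings of formula i.
    return math.exp(sum(w * n(world) for w, n in weighted_formulas))

world_a = {"people": {"Anna", "Bob"}, "smokes": {"Anna"}, "cancer": {"Anna"}}
world_b = {"people": {"Anna", "Bob"}, "smokes": {"Anna"}, "cancer": set()}

# world_a satisfies the formula for both people, world_b only for Bob,
# so world_a gets a higher unnormalised probability.
print(unnormalised_probability(world_a), unnormalised_probability(world_b))
```

Normalising these values over all possible worlds gives the actual probability distribution defined by the MLN.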
PracMLN
PracMLN is a statistical relational learning toolbox written in Python. It was developed at the Institute for Artificial Intelligence at the University of Bremen. Using MLNs typically involves a two-step process. In the first step, an MLN is designed, and training assigns weights to its logical formulae based on an evidence database. In the second step, the trained MLN is queried to perform the inference required for a specific task. PracMLN provides implementations of many algorithms for both exact and approximate learning and inference.
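As a rough sketch of what this two-step workflow looks like in code, the snippet below follows the pattern I remember from the pracmln documentation; the class names, keyword arguments, file names, and method strings are assumptions from memory rather than verified against the current API.

```python
# Hedged sketch of the two-step PracMLN workflow; the class names, keyword
# arguments, file names, and method strings are assumptions and may differ
# from the actual pracmln API.
from pracmln import MLN, Database, MLNLearn, MLNQuery

# Step 1: learning -- fit formula weights to an evidence database.
mln = MLN.load(files='smoking.mln')        # hand-designed weighted formulas
train_db = Database.load(mln, 'training.db')  # evidence (ground atoms)
learned = MLNLearn(mln=mln, db=train_db, method='BPLL').run()

# Step 2: inference -- query the trained MLN for a specific task.
query_db = Database.load(learned, 'evidence.db')
result = MLNQuery(mln=learned, db=query_db,
                  queries='Cancer(x)', method='MC-SAT').run()
```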
Python for AI
AI and Python mix very well. As of May 2018, Python is the 4th most popular language on the TIOBE index. Within the data science and AI community, Python is by far the most popular language, with open-source libraries such as Scikit-Learn, NumPy, and Pandas surging in popularity. It is worth noting, however, that these libraries owe much of their performance to compiled extension code rather than to pure Python. As part of Google Summer of Code (GSoC) 2018, we aim to bring the same kind of speedup to PracMLN.
GSoC 2018
As part of this project, under GSoC 2018, I plan to port parts of PracMLN to Cython. Cython compiles Python code to C and lets variables be statically typed, which can dramatically reduce runtimes if used effectively. I will be working under the guidance and mentorship of Daniel Nyga, the creator and maintainer of PracMLN.
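To illustrate the kind of change involved, here is a small, self-contained Cython sketch; it is not taken from the PracMLN codebase, and the function name and loop are purely illustrative of how static type declarations remove Python-object overhead from a numeric inner loop.

```cython
# illustrative .pyx file -- not from the PracMLN codebase
# cython: language_level=3

def weighted_sum(double[:] weights, double[:] counts):
    """Sum of weights[i] * counts[i], the kind of inner loop that
    weighted-formula computations tend to rely on."""
    cdef Py_ssize_t i        # typed loop index: compiled as a C integer
    cdef double total = 0.0  # typed accumulator: stays a C double
    for i in range(weights.shape[0]):
        total += weights[i] * counts[i]
    return total
```

Because every variable in the loop has a C type, Cython can compile it down to plain C arithmetic instead of repeated Python object allocations and method calls.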
Work Ahead and Anticipated Challenges
Programming itself may be easy, but it is always challenging to write good code that not only works, but works correctly and fast. For PracMLN this is especially true, since the entire purpose of the project is to optimise the code for speed. I have used Cython before, but only in a very limited capacity, and never on a codebase this large and complicated. I suspect the outcome of this project will depend significantly on how well I understand the MLN learning and inference algorithms implemented in PracMLN, and on my ability to type variables appropriately. To build up the necessary MLN theory, I am going to read Dominik Jain's doctoral dissertation on Markov Logic Networks, which is available online.
This is a challenging GSoC project, but also an immense opportunity for me to learn the intricacies of MLNs and of writing fast, high-quality code in Cython. I hope I will be able to rise to the challenge!