The Logic Module

Jun 29, 2018 13:54 · 1321 words · 7 minute read

As mentioned in the last post, converting EnumerationAsk to an extension type required converting all of its dependencies first. These included the logic module.

The complication

The <LOGIC> in the following figure refers not to a single file, but to the entire logic subdirectory. The important files here are common.pyx, which contains the Logic superclass, and fol.py and fuzzy.py, which contain the derived FirstOrderLogic and FuzzyLogic subclasses respectively.

Cython conversions required for EnumerationAsk

Speeding up Enumeration-Ask essentially required the logic-related classes to be converted to extension types first. This involved two main challenges:

  1. Widespread use of inner classes.
  2. Use of multiple inheritance.

Handling these was quite difficult. I will not detail all the techniques I tried, but will skip straight to the final solution actually used.

Inner Classes

Here is an idea of the inner classes present in Logic, in common.pyx. The FirstOrderLogic(Logic) and FuzzyLogic(Logic) subclasses have corresponding inner classes too.

... pracmln/python3/pracmln/logic $ cat common.py | grep class
class Logic(object):
    class Constraint(object):
    class Formula(Constraint): 
    class ComplexFormula(Formula):
    class Conjunction(ComplexFormula):
    class Disjunction(ComplexFormula):
    class Lit(Formula):
    class LitGroup(Formula):
    class GroundLit(Formula):
    class GroundAtom:
    class Equality(ComplexFormula):
    class Implication(ComplexFormula):
    class Biimplication(ComplexFormula):
    class Negation(ComplexFormula):
    class Exist(ComplexFormula):
    class TrueFalse(Formula):
    class NonLogicalConstraint(Constraint):
    class CountConstraint(NonLogicalConstraint):
    class GroundCountConstraint(NonLogicalConstraint):
# this is a little hack to make nested classes pickleable

The Error

The Cython compiler wouldn’t allow declaring any classes inside other cdef classes. Defining the Logic outer class with cdef, and then compiling common.pyx yields the following error:

... pracmln/python3/pracmln/logic $ python3 setup.py build_ext --inplace
Compiling common.pyx because it changed.
[1/1] Cythonizing common.pyx

Error compiling Cython file:
------------------------------------------------------------
...
        self.__dict__ = d
        self.grammar = eval(d['grammar'])(self)
        
    
#  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #  #   
    class Constraint(object):
   ^
------------------------------------------------------------

common.pyx:81:4: class definition not allowed here
Traceback (most recent call last):
  File "setup.py", line 5, in <module>
    ext_modules=cythonize( ['*.pyx'] )
  File "/home/kaivalya/ ... /python3.5/site-packages/Cython/Build/Dependencies.py", line 1026, in cythonize
    cythonize_one(*args)
  File "/home/kaivalya/ ... /python3.5/site-packages/Cython/Build/Dependencies.py", line 1146, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: common.pyx

I couldn’t find any mention online of a similar problem. Cython seemed to be disallowing a class definition inside another, yet the official user guide on extension types made no note of this except in the case of wrapping C++ code. A few arcane Google searches later, I hit upon this self-admittedly unreliable resource from 2008, which reads:

... if inner class support gets added to Cython (I don't think it is now? but not 100% sure), then ...

So apparently I had to rewrite the logic classes and remove the inner classes. It is important to remember here that Python supports inner classes, but they are not essential to the language. Anything that can be achieved with inner classes can be achieved without them - they just improve code readability and organisation for humans.

After discussing this with my mentor Daniel, I decided to remove the inner classes and re-add them as instance attributes. I thus moved all the inner classes to a separate file, misc.pyx, and then imported them back for use inside Logic.

Pickle and Multiprocessing

This caused a problem with the original ‘hack’ introduced to allow pickling. Pickling is more than just an extra feature in PracMLN: the multicore versions of PracMLN use multiprocessing, which requires pickle to work properly. Thus, this functionality was important to retain.

Python inner classes do not add any features to the language, unlike, say, Java inner classes. As stated in this answer on StackOverflow, inner classes can simply be declared at the same level as normal ‘outer’ classes, and a reference to each can then be placed inside the ‘outer’ class. The equivalence between Python-with-inner-classes and Python-without-inner-classes isn’t merely theoretical, in the sense of both being Turing complete; it is much simpler and more practical.

I thus moved the inner classes back into the common.pyx file, but this time at the outermost level. This further simplified the code structure and made the extra misc.pyx file redundant, while also automatically solving the Python pickling problem.
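A minimal pure-Python sketch of that restructuring follows. The class bodies are empty placeholders rather than PracMLN's real code, and NestedLogic is a hypothetical name used only for contrast. It shows that a class defined at module level has a plain qualified name, which makes it easy for pickle to locate by module and name, while the owning class can still hold a reference to it.

```python
# Illustrative stand-ins only; the real classes in common.pyx have full bodies.

class NestedLogic:
    """The original layout: the class is defined inside its owner."""
    class Constraint:
        pass


class Constraint:
    """The flattened layout: the same kind of class at module level."""
    pass


class Logic:
    def __init__(self):
        # Keep the familiar access path alive as an instance attribute.
        self.Constraint = Constraint


# A nested class carries its owner in its qualified name ...
print(NestedLogic.Constraint.__qualname__)  # NestedLogic.Constraint
# ... while a module-level class is found under its own name.
print(Constraint.__qualname__)              # Constraint
# The reference held by a Logic instance is the very same class object.
print(Logic().Constraint is Constraint)     # True
```

Call sites that previously wrote something like logic.Constraint on an instance keep working, and nothing about the class itself depends on its former owner.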

Multiple Inheritance

This second issue was more challenging. There are many classes defined in the logic module that use Python’s multiple inheritance.

 ... pracmln/python3/pracmln/logic $ grep 'class.*[,].*:' *.py
fol.py:    class Formula(Logic.Formula, Constraint): 
fol.py:    class ComplexFormula(Logic.ComplexFormula, Formula): pass
fol.py:    class Lit(Logic.Lit, Formula): pass
fol.py:    class Litgroup(Logic.LitGroup, Formula): pass
fol.py:    class GroundLit(Logic.GroundLit, Formula):
fol.py:    class Disjunction(Logic.Disjunction, ComplexFormula):
fol.py:    class Conjunction(Logic.Conjunction, ComplexFormula):
fol.py:    class Implication(Logic.Implication, ComplexFormula):
fol.py:    class Biimplication(Logic.Biimplication, ComplexFormula):
fol.py:    class Negation(Logic.Negation, ComplexFormula): pass
fol.py:    class Exist(Logic.Exist, ComplexFormula): pass
fol.py:    class Equality(Logic.Equality, ComplexFormula): pass
fol.py:    class TrueFalse(Logic.TrueFalse, Formula):
fuzzy.py:    class Negation(Logic.Negation, ComplexFormula):
fuzzy.py:    class Conjunction(Logic.Conjunction, ComplexFormula):
fuzzy.py:    class Disjunction(Logic.Disjunction, ComplexFormula):
fuzzy.py:    class Implication(Logic.Implication, ComplexFormula):
fuzzy.py:    class Biimplication(Logic.Biimplication, ComplexFormula):
fuzzy.py:    class TrueFalse(Formula, Logic.TrueFalse):
fuzzy.py:    class Exist(Logic.Exist, Logic.ComplexFormula):

As the Cython extension types introduction notes,

Cython requires to know the complete inheritance hierarchy ... and restricts it to single inheritance.

So it was impossible to convert this code to Cython without an extensive amount of rewriting. This was a roadblock that was hard to get past.

Running the tests at this point in time … succeeded!

It turned out that no modifications were needed to counter this ‘problem’. The goal of this exercise was to eventually convert EnumerationAsk to an extension type. This required the Logic class to be an extension type, which it now was. At compile time, the compiler could compile the code because all the types were known. At runtime, a variable declared as the Logic type would actually hold a FirstOrderLogic or FuzzyLogic instance. Operations on the subclass attributes would then run at ordinary Python speed, but at all other times, and elsewhere in the code, Cython speed would apply.
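The situation can be sketched in plain Python, with the caveat that the class names and methods here are simplified placeholders and that describe is a hypothetical helper. The base class stands in for the extension type; the subclass uses multiple inheritance in ordinary Python, and code written against the base type still accepts it.

```python
class Formula:
    """Stands in for a base class that was converted to an extension type."""
    def kind(self):
        return "formula"


class FolMixin:
    """Hypothetical plain-Python second base, echoing fol.py's class lists."""
    def logic_name(self):
        return "first-order"


class FolFormula(Formula, FolMixin):
    """Multiple inheritance stays on the pure-Python side."""
    pass


def describe(formula):
    # In Cython this parameter could be declared with the base type;
    # a subclass instance still satisfies that declaration.
    if not isinstance(formula, Formula):
        raise TypeError("expected a Formula")
    return formula.kind()


f = FolFormula()
print(describe(f))                                # formula
print(f.logic_name())                             # first-order
print([c.__name__ for c in FolFormula.__mro__])   # method resolution order
```

Only the base needs to be known at compile time; the subclass is resolved through normal Python attribute lookup when the code runs.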

Wrapping Up

Thus, the logic module was converted to Cython. However, running the code at this point still led to errors.

Code Coupling

For instance, the following error occurred at runtime shortly after these edits:

Traceback (most recent call last):
  File "test.py", line 41, in <module>
    main(sys.argv[1])
  File "test.py", line 28, in main
    query(queries='Cancer,Smokes,Friends', method='EnumerationAsk', mln=mln, db=db, verbose=False, multicore=False).run()
  File "/home/kaivalya/ ... /python3/pracmln/mlnquery.py", line 241, in run
    mrf = mln_.ground(db)
  File "base.pyx", line 374, in pracmln.mln.base.MLN.ground
  File "mrf.pyx", line 339, in pracmln.mln.mrf.MRF.gndatom
  File "mrf.pyx", line 378, in pracmln.mln.mrf.MRF.new_gndatom
  File "mrf.pyx", line 351, in pracmln.mln.mrf.MRF.variable
AttributeError: type object 'pracmln.logic.common.Logic' has no attribute 'GroundAtom'

The issue here is that the Logic module is closely coupled with the rest of pracmln. The solutions are trivial though, and the code can be edited quite easily to reflect the modified logic classes. I have run multiple tests to ensure I don’t miss any cases, and tried to scrub all old-style usage of the inner classes, but I may have missed a few edge cases.
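To illustrate the kind of edit involved, here is a small pure-Python sketch, assuming the class references now live on instances as described earlier. GroundAtom and Logic are reduced to placeholders, and the name argument is made up for the example. Code that still reaches through the class fails in the same way as the traceback, while code that goes through an instance, or uses the module-level name directly, works.

```python
class GroundAtom:
    """Placeholder for the real class, now defined at module level."""
    def __init__(self, name):
        self.name = name  # hypothetical attribute, for illustration only


class Logic:
    def __init__(self):
        # The reference exists on instances only, not on the type itself.
        self.GroundAtom = GroundAtom


# Old-style call site: looks the class up on the type and fails.
try:
    Logic.GroundAtom("Smokes")
except AttributeError as err:
    print(err)

# Updated call sites: go through an instance or the module-level name.
logic = Logic()
atom = logic.GroundAtom("Smokes")
same = GroundAtom("Smokes")
print(type(atom) is type(same))  # True
```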

A Final ‘type’ Trick

A fundamental question now arose: how many attributes should be typed? The main portions of the dependency tree for Enumeration-Ask had been successfully converted to Cython, including the computationally intensive _evidence attribute of the MRF class and the logic module. Typing every attribute throughout PracMLN is impractical and unnecessary. At the same time, the Cython compiler complains, as expected, unless all attributes of an extension type are known before runtime. The Cython documentation suggests an easy workaround: for each class with untyped attributes, add a single __dict__ attribute of type dict to the class definition in the corresponding .pxd file:

cdef class foo:
    # lets instances accept attributes that are not declared with cdef
    cdef dict __dict__

While this is definitely slower than typing each attribute, expensive or heavily used attribute lookups can now be identified with ease later on and typed gradually.
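A loose pure-Python analogy, not Cython itself, is a class that declares __slots__: it rejects undeclared attributes until __dict__ is added to the slots. The class and attribute names below are made up for illustration.

```python
class Strict:
    # Only the declared attribute exists, much like a fully typed class.
    __slots__ = ("evidence",)


class Flexible:
    # The declared attribute keeps its slot; anything else lands in __dict__.
    __slots__ = ("evidence", "__dict__")


s = Strict()
s.evidence = [0.0, 1.0]
try:
    s.extra = 1
except AttributeError:
    print("Strict rejects undeclared attributes")

f = Flexible()
f.evidence = [0.0, 1.0]
f.extra = 1              # stored in f.__dict__
print(f.__dict__)        # {'extra': 1}
```

The declared attribute keeps its dedicated storage while everything else falls back to the dictionary, which mirrors the trade-off described above: correctness first, then selective typing of whatever turns out to be hot.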

This last trick allowed the entire code to compile and run, and restored the functionality that the gsoc18-cython branch had lost, bringing it back in line with the master branch. Finally, EnumerationAsk was compiled and running entirely in Cython, as an extension type.