The Logic Module
Jun 29, 2018 13:54 · 1321 words · 7 minute read
As mentioned in the last post, converting EnumerationAsk to an extension type required converting all its dependencies first. These included the logic module.
The complication
The <LOGIC> in the following figure refers not to a single file, but to the entire logic subdirectory. The important files here are common.pyx, which contains the Logic superclass, and fol.py and fuzzy.py, which contain the derived FirstOrderLogic and FuzzyLogic subclasses respectively.
Speeding up Enumeration-Ask essentially required the logic-related classes to be converted to extension types first. This involved two main challenges:
- Widespread use of inner classes.
- Use of multiple inheritance.
Handling these was quite difficult. I will not detail all the techniques I tried, but will skip straight to the final solution actually used.
Inner Classes
Here is an idea of the inner classes present in Logic, in common.pyx. The FirstOrderLogic(Logic) and FuzzyLogic(Logic) subclasses have corresponding inner classes too.
... pracmln/python3/pracmln/logic $ cat common.py | grep class
class Logic(object):
class Constraint(object):
class Formula(Constraint):
class ComplexFormula(Formula):
class Conjunction(ComplexFormula):
class Disjunction(ComplexFormula):
class Lit(Formula):
class LitGroup(Formula):
class GroundLit(Formula):
class GroundAtom:
class Equality(ComplexFormula):
class Implication(ComplexFormula):
class Biimplication(ComplexFormula):
class Negation(ComplexFormula):
class Exist(ComplexFormula):
class TrueFalse(Formula):
class NonLogicalConstraint(Constraint):
class CountConstraint(NonLogicalConstraint):
class GroundCountConstraint(NonLogicalConstraint):
# this is a little hack to make nested classes pickleable
The Error
The Cython compiler wouldn’t allow declaring any classes inside cdef classes. Defining the Logic outer class with cdef, and then compiling common.pyx, yields the following error:
... pracmln/python3/pracmln/logic $ python3 setup.py build_ext --inplace
Compiling common.pyx because it changed.
[1/1] Cythonizing common.pyx
Error compiling Cython file:
------------------------------------------------------------
...
self.__dict__ = d
self.grammar = eval(d['grammar'])(self)
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
class Constraint(object):
^
------------------------------------------------------------
common.pyx:81:4: class definition not allowed here
Traceback (most recent call last):
File "setup.py", line 5, in <module>
ext_modules=cythonize( ['*.pyx'] )
File "/home/kaivalya/ ... /python3.5/site-packages/Cython/Build/Dependencies.py", line 1026, in cythonize
cythonize_one(*args)
File "/home/kaivalya/ ... /python3.5/site-packages/Cython/Build/Dependencies.py", line 1146, in cythonize_one
raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: common.pyx
I couldn’t find any mention online of a similar problem. Cython seemed to be disallowing a class definition inside another, yet the official user guide on extension types made no note of this except in the case of wrapping C++ code. A few arcane Google searches later, I hit upon this self-admittedly unreliable resource from 2008, which pointed to the same restriction.
So apparently I had to rewrite the logic classes and remove the inner classes. It is important to remember here that Python supports inner classes, but they are not essential to the language. Anything that can be achieved with inner classes can be achieved without them; they just improve code readability and organisation for humans.
After discussing with my mentor Daniel, I decided to remove the inner classes and re-add them as instance attributes. I thus moved all the inner classes to a separate file, misc.pyx, and then imported them back for use inside Logic.
Pickle and Multiprocessing
This caused a problem with the original ‘hack’ introduced to allow pickling. Pickling is more than just an extra feature in PracMLN. The multicore versions of PracMLN use multiprocessing, and this requires pickle to work properly. Thus, this functionality was important to retain.
Python inner classes do not add any features to the language, unlike, say, Java inner classes. As stated in this answer on StackOverflow, inner classes can just be declared at the same level as normal ‘outer’ classes, and then a reference to each can be placed inside the ‘outer’ class. The equivalence between Python-with-inner-classes and Python-without-inner-classes isn’t merely theoretical, in the sense of both being Turing complete, but much simpler and more practical.
I thus moved the inner classes back into the common.pyx file, but this time at the outermost level. This further simplified the code structure and made the extra misc.pyx file redundant, while also automatically solving the Python pickling problem.
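As a minimal sketch of this arrangement in plain Python (the class bodies and the `roundtrip` helper are illustrative assumptions, not PracMLN's actual code), formerly nested classes can live at module level, where pickle can look them up by name:

```python
import pickle


class Constraint(object):
    """Formerly an inner class of Logic; now defined at module level."""


class Formula(Constraint):
    def __init__(self, text):
        self.text = text


class Logic(object):
    # Illustrative only: references kept so that Logic.Formula still resolves.
    Constraint = Constraint
    Formula = Formula


def roundtrip(obj):
    """Serialise and restore an object, as happens when objects cross processes."""
    return pickle.loads(pickle.dumps(obj))


restored = roundtrip(Logic.Formula("Smokes(x)"))
print(type(restored).__qualname__, restored.text)
```

Because `Formula` is importable by its plain module-level name, no special handling is needed for pickling it.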
Multiple Inheritance
This second issue was more challenging. There are many classes defined in the logic module that use Python’s multiple inheritance.
... pracmln/python3/pracmln/logic $ grep 'class.*[,].*:' *.py
fol.py: class Formula(Logic.Formula, Constraint):
fol.py: class ComplexFormula(Logic.ComplexFormula, Formula): pass
fol.py: class Lit(Logic.Lit, Formula): pass
fol.py: class Litgroup(Logic.LitGroup, Formula): pass
fol.py: class GroundLit(Logic.GroundLit, Formula):
fol.py: class Disjunction(Logic.Disjunction, ComplexFormula):
fol.py: class Conjunction(Logic.Conjunction, ComplexFormula):
fol.py: class Implication(Logic.Implication, ComplexFormula):
fol.py: class Biimplication(Logic.Biimplication, ComplexFormula):
fol.py: class Negation(Logic.Negation, ComplexFormula): pass
fol.py: class Exist(Logic.Exist, ComplexFormula): pass
fol.py: class Equality(Logic.Equality, ComplexFormula): pass
fol.py: class TrueFalse(Logic.TrueFalse, Formula):
fuzzy.py: class Negation(Logic.Negation, ComplexFormula):
fuzzy.py: class Conjunction(Logic.Conjunction, ComplexFormula):
fuzzy.py: class Disjunction(Logic.Disjunction, ComplexFormula):
fuzzy.py: class Implication(Logic.Implication, ComplexFormula):
fuzzy.py: class Biimplication(Logic.Biimplication, ComplexFormula):
fuzzy.py: class TrueFalse(Formula, Logic.TrueFalse):
fuzzy.py: class Exist(Logic.Exist, Logic.ComplexFormula):
As the Cython extension types introduction notes, an extension type can have only one base class; multiple inheritance is not supported.
So it was impossible to convert this code to Cython without an extensive amount of rewriting. This was a roadblock that was hard to get past.
Running the tests at this point in time … succeeded!
It turned out that there was no need for any modifications to counter this ‘problem’. The goal of this exercise was to eventually convert EnumerationAsk to an extension type. This required the Logic class to be an extension type, which it was. At compile time, the compiler was able to compile the code since all the types were known. At runtime, one of these variables, previously known to be of the Logic type, would actually become a FirstOrderLogic or FuzzyLogic type variable. At that point, operations on the subclass attributes would run at Python speed; at all other times, and elsewhere in the code, Cython speed would apply.
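Why this works can be sketched in plain Python. The single-base restriction applies to extension types themselves; an ordinary Python class may still list an extension type among several bases. In the sketch below, the built-in `dict` stands in for a compiled extension type, which is an assumption made purely for illustration, and `BaseFormula`, `Mixin`, `FolFormula` and `truth_count` are hypothetical names:

```python
class BaseFormula(dict):
    """Stand-in for an extension type such as a compiled Formula class."""


class Mixin(object):
    """Ordinary Python class used as a second base."""

    def describe(self):
        return "%s with %d entries" % (type(self).__name__, len(self))


# A regular Python class may combine an extension-type base with Python bases.
class FolFormula(BaseFormula, Mixin):
    pass


def truth_count(formula):
    # Code written against the base type accepts any subclass at runtime.
    assert isinstance(formula, BaseFormula)
    return len(formula)


f = FolFormula(a=1, b=0)
print(f.describe(), truth_count(f))
print([cls.__name__ for cls in FolFormula.__mro__])

# By contrast, CPython refuses to combine two unrelated built-in types.
try:
    type("Broken", (int, str), {})
    layout_conflict = False
except TypeError:
    layout_conflict = True
print(layout_conflict)
```

The subclasses in fol.py and fuzzy.py are in the first situation: Python classes layered on top of extension-type bases.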
Wrapping Up
Thus, the logic module was converted to Cython. However, running the code at this point of time still led to errors.
Code Coupling
For instance, the following error occurred at runtime shortly after these edits:
Traceback (most recent call last):
File "test.py", line 41, in <module>
main(sys.argv[1])
File "test.py", line 28, in main
query(queries='Cancer,Smokes,Friends', method='EnumerationAsk', mln=mln, db=db, verbose=False, multicore=False).run()
File "/home/kaivalya/ ... /python3/pracmln/mlnquery.py", line 241, in run
mrf = mln_.ground(db)
File "base.pyx", line 374, in pracmln.mln.base.MLN.ground
File "mrf.pyx", line 339, in pracmln.mln.mrf.MRF.gndatom
File "mrf.pyx", line 378, in pracmln.mln.mrf.MRF.new_gndatom
File "mrf.pyx", line 351, in pracmln.mln.mrf.MRF.variable
AttributeError: type object 'pracmln.logic.common.Logic' has no attribute 'GroundAtom'
The issue here is that the Logic module is closely coupled with the rest of pracmln. The solutions are trivial though, and the code can be edited quite easily to reflect the modified logic classes. I have run multiple tests and tried to scrub all old-style usage of the inner classes, but I may have missed a few edge cases.
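The kind of edit involved can be sketched as follows. This is a simplified illustration rather than the actual pracmln code: the class bodies and the two `is_ground_atom_*` helpers are hypothetical, and only the names `Logic` and `GroundAtom` come from the traceback above.

```python
class GroundAtom(object):
    """Now a module-level class rather than an attribute of Logic."""

    def __init__(self, predname):
        self.predname = predname


class Logic(object):
    """No longer exposes GroundAtom as a nested class."""


def is_ground_atom_old(obj):
    # Old-style access through the outer class; fails after the refactoring.
    return isinstance(obj, Logic.GroundAtom)


def is_ground_atom_new(obj):
    # Updated call site: refer to the module-level class directly.
    return isinstance(obj, GroundAtom)


atom = GroundAtom("Smokes")
try:
    is_ground_atom_old(atom)
except AttributeError as exc:
    print("old call site:", exc)
print("new call site:", is_ground_atom_new(atom))
```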
A Final ‘type’ Trick
A fundamental question arose now about how many attributes should be typed. The main portions of the dependency tree for Enumeration-Ask had been successfully converted to Cython, and this included the computationally intensive _evidence attribute from the MRF class, as well as the logic module. It is impractical and unnecessary to type all the attributes throughout PracMLN. At the same time, the Cython compiler would, as expected, complain unless all attributes were known before runtime. The Cython documentation suggests an easy hack to work around this. For each class with untyped attributes, add a single __dict__ attribute of type dict to the class definition in the corresponding .pxd file:
cdef class foo:
    cdef dict __dict__
While this is definitely slower than typing each attribute, expensive or heavily used attribute lookups can be identified later and typed gradually.
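Plain Python offers a rough analogue that may help in seeing what the `__dict__` declaration buys. A class with `__slots__` behaves a little like an extension type with a fixed attribute set, and listing `'__dict__'` among the slots re-enables dynamic attributes. This is only an analogy, not Cython code, and the class and attribute names are hypothetical:

```python
class Strict(object):
    # Fixed attribute set, loosely comparable to a class with declared attributes only.
    __slots__ = ("evidence",)


class Flexible(object):
    # Adding '__dict__' keeps the declared slot and allows arbitrary extra attributes.
    __slots__ = ("evidence", "__dict__")


strict = Strict()
strict.evidence = [1, 0]
try:
    strict.verbose = True  # not declared, so this is rejected
    strict_accepts_new = True
except AttributeError:
    strict_accepts_new = False

flexible = Flexible()
flexible.evidence = [1, 0]
flexible.verbose = True  # stored in the instance __dict__
print(strict_accepts_new, flexible.__dict__)
```

The declared slot stays out of the instance dictionary, while undeclared attributes fall back to it, which mirrors the idea of typing hot attributes and leaving the rest dynamic.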
This last trick allowed the entire code to compile and run, and restored the functionality of the gsoc18-cython branch to match that of the master branch. Finally, EnumerationAsk was compiled and running entirely in Cython, as an extension type.