====================
Wounded hero revived
====================
Lessons learned from porting M2Crypto to Py3k
:Author: Matěj Cepl <mcepl@cepl.eu>
Origins
=======
* Mitch Kapor sold Lotus to IBM and decided to perpetrate good.
One of his projects was Chandler_.
* The project is gone_. The only remainder of it is
M2Crypto_, a full Python binding for OpenSSL.
.. class:: handout
Once upon a time, one Mitch Kapor sold Lotus to IBM and, with
the money he got, decided to perpetrate good. He did many
truly good things for the computer world: he co-founded the
EFF and helped with the Mozilla Foundation, but he also founded
a rather unsuccessful project, a Python-based universal PIM
called Chandler_.
The project's Subversion repository has been mirrored a couple
of times, once by me.
.. _Chandler:
https://en.wikipedia.org/wiki/Chandler_(software)
.. _gone:
https://gitlab.com/mcepl/chandler
.. _M2Crypto:
https://gitlab.com/m2crypto/m2crypto/
M2Crypto
========
* M2Crypto was maintained by `Heikki Toivonen`_ for a few years
after Chandler folded, but his last release, 0.21.1, dates from 2011.
* Maintained in Red Hat by `Miloslav Trmač`_.
* I took over the project in May 2015.
.. class:: handout
Maintained in Red Hat by `Miloslav Trmač`_, who collected
all patches in the RHEL package.
I took over the project in May 2015 with the intention just
to publish all the patches and be a point of contact for any
issue reports. I did not expect much activity, because the
package was very quiet in RHEL.
.. _`Heikki Toivonen`:
https://www.heikkitoivonen.net/
.. _`Miloslav Trmač`:
https://github.com/mtrmac
Strengths and weaknesses
========================
.. Strengths
* Backed by a stable C library
* Rather large coverage of OpenSSL API
* Surprisingly widespread use
* Large test suite
.. Weakness
* Unknown issues
* Python 3
* M2Crypto API copies OpenSSL too closely
* Support for Mac OS X and Windows (not to mention ``*BSD``) was
broken.
.. class:: handout
Backed by … comparing to PyCrypto and other horrors.
Opportunities & Threats
=======================
.. Opportunities
* Satisfying current user base
* Replacing horrors like PyCrypto
* The goal of maintenance is to maintain the API
* Extend support to non-Linux platforms
.. Threats
* Python ``ssl`` module
* Python cryptography_
.. class:: handout
A distribution bug tracker (especially an enterprise one) is
not a good measure of the real state of use and quality of
a package.
There are apparently many programmers of custom software
who use M2Crypto (it is still one of the most complete
bindings for OpenSSL).
The threats are “competing” projects that may replace
M2Crypto.
.. _cryptography:
https://github.com/pyca/cryptography
Unicode
=======
* The biggest problem of all Python 2 programs: complete
confusion over when a py2k ``str`` means a py3k ``str`` and when
it means ``bytes``.
* There are numerous uses of both in M2Crypto, because of course
both strings and binary data are present in all functions of
OpenSSL.
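The distinction can be illustrated with a minimal sketch, assuming Python 3.
The helper ``to_bytes`` is hypothetical and not part of M2Crypto; it only
shows the kind of explicit decision each former py2k ``str`` now requires.

```python
# Hypothetical helper (not M2Crypto API): normalise a value that is about
# to be handed to a C function expecting raw bytes.
def to_bytes(data, encoding="utf-8"):
    if isinstance(data, bytes):
        # Already binary data (a digest, a DER blob, ...): pass it through.
        return data
    if isinstance(data, str):
        # Text (a PEM header, a file name, ...): encode it explicitly.
        return data.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(data).__name__)


pem_header = "-----BEGIN CERTIFICATE-----"  # text in py3k
raw_digest = b"\x00\xff\x10"                # binary data in py3k

print(type(to_bytes(pem_header)).__name__)
print(to_bytes(raw_digest) is raw_digest)
```

In Python 2 both values would have been plain ``str``, so nothing in the
code recorded which of the two was meant.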
Strategy
========
* Type Hints
* CI
* Extension of platform support
Type Hints
==========
* `PEP 484`_ provides **optional** type annotations. Quite
controversial, but clearly very useful for libraries.
* Native for Python >= 3.5, but supports py2k-compatible syntax::

      def sum(x, y):
          # type: (int, int) -> int
          return x + y

* Especially useful for our situation: marking types helps us
analyze what each individual py2k ``str`` actually means.
.. _`PEP 484`:
https://www.python.org/dev/peps/pep-0484/
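As a rough sketch of how such comments separate the two meanings of a py2k
``str``, the hypothetical function below (not taken from M2Crypto) takes
binary data and returns text. The comment syntax is the PEP 484 one shown
above and is ignored at run time.

```python
import binascii


def hex_digest(data):
    # type: (bytes) -> str
    """Return the hexadecimal text form of the binary ``data``.

    Hypothetical example: the type comment records that the argument is
    binary data and the result is text, which a bare py2k ``str`` hid.
    """
    return binascii.hexlify(data).decode("ascii")


print(hex_digest(b"\x00\xff"))  # prints 00ff
```

A type checker that understands PEP 484 comments can then flag callers that
pass text where ``bytes`` is declared.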
C API
=====
* All Unicode/bytes translation happens on the C level as well.
* Based on ``swig``, which fortunately natively supports
``-py3``.
* We also need to support two versions of the OpenSSL API: 1.1 and
older.
* Minimize the use of ``#ifdef``\ s and rather use included shims
for missing functions.
C shims of missing functions
============================
* For OpenSSL < 1.1
* For Python 2
- ``PyLong_FromLong()`` and ``PyUnicode_AsUTF8()`` are just simple
``#define``\ s.
- All Pythons >= 2.6 contain a whole set of Py3k function stubs
in ``bytesobject.h``.
* For Python 3
- ``PyFile_AsFile()``: I have no idea why it was removed from
the py3k API.