Tuesday, 16 September 2014

The art of python setup

Author: <limkokhole@gmail.com>

The mission was to import pylab (or matplotlib.pyplot) in an ipython notebook and draw something, but something went wrong.
......


UnicodeUCS2_AsEncodedString? Stack Overflow has similar questions, but the answers there are no help, so let me rely on my own digging. Based on the documentation at https://docs.python.org/2/faq/extending.html#when-importing-module-x-why-do-i-get-undefined-symbol-pyunicodeucs2:

When importing module X, why do I get “undefined symbol: PyUnicodeUCS2*”?

You are using a version of Python that uses a 4-byte representation for Unicode characters, but some C extension module you are importing was compiled using a Python that uses a 2-byte representation for Unicode characters (the default).
If instead the name of the undefined symbol starts with PyUnicodeUCS4, the problem is the reverse: Python was built using 2-byte Unicode characters, and the extension module was compiled using a Python with 4-byte Unicode characters.
This can easily occur when using pre-built extension packages. RedHat Linux 7.x, in particular, provided a “python2” binary that is compiled with 4-byte Unicode. This only causes the link failure if the extension uses any of the PyUnicode_*() functions. It is also a problem if an extension uses any of the Unicode-related format specifiers for Py_BuildValue() (or similar) or parameter specifications for PyArg_ParseTuple().
You can check the size of the Unicode character a Python interpreter is using by checking the value of sys.maxunicode:
>>> import sys
>>> if sys.maxunicode > 65535:
...     print 'UCS4 build'
... else:
...     print 'UCS2 build'
The only way to solve this problem is to use extension modules compiled with a Python binary built using the same size for Unicode characters.


OK, let's check my python. I have a lot of python versions installed, but to my surprise they all report the same 65535:
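With several interpreters installed, the same check can be scripted instead of typed into each one. This is only a sketch: the function names `unicode_build` and `compare_builds` are made up for illustration, and the interpreter paths you pass in are whatever exists on your own machine.

```python
import subprocess
import sys


def unicode_build(interpreter):
    """Ask one interpreter for sys.maxunicode and label the build."""
    out = subprocess.check_output(
        [interpreter, "-c", "import sys; print(sys.maxunicode)"]
    )
    maxunicode = int(out.decode("ascii").strip())
    # Same threshold as the FAQ snippet: above 65535 means a 4-byte build.
    label = "UCS4" if maxunicode > 65535 else "UCS2"
    return maxunicode, label


def compare_builds(interpreters):
    """Map each interpreter path to its (maxunicode, label) pair."""
    return {path: unicode_build(path) for path in interpreters}
```

Calling `compare_builds` with a list of interpreter paths gives one row per interpreter, which makes an odd one out easy to spot. Note that from Python 3.3 on every build reports 1114111, so the distinction only matters for Python 2 builds like the ones in this post.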




Weird, I thought the default python would be different. Let's dive in a bit:



Nope, ipython uses a different python, so we need to locate ipython:


/usr/local/bin/python might not be the python that ipython uses by default, so let's check ipython:


Hmm... /usr/local, nothing special. Let's check the file type:


Yup, it is an ASCII text file. So just open it in vi:
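The header line can also be read without an editor. A minimal sketch, assuming the launcher is a text script whose first line is a `#!` header; `parse_shebang` and `read_shebang` are hypothetical helper names, not part of any tool used here.

```python
def parse_shebang(first_line):
    """Return the interpreter named on a '#!' line, or None if there is none."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    # For a header like '#!/usr/bin/env python' this returns '/usr/bin/env'.
    return parts[0] if parts else None


def read_shebang(script_path):
    """Read only the first line of a script and parse its '#!' header."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace")
    return parse_shebang(first_line)
```

Pointing `read_shebang` at the ipython launcher shows which python it will actually start, which is the question that matters for the UCS check.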


The answer is obvious now: it uses /bin/python. So let's find out its UCS setting:


Yup, 1114111, so it uses UCS4. So just sudo vi the file and change its header to /usr/local/bin/python
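For the record, the header edit itself is a one-line change. Here is a sketch of it as a pure text transformation; `with_interpreter` is a name invented for this example, and writing the result back to a file under /usr/local/bin would still need the same sudo rights as the vi approach.

```python
def with_interpreter(script_text, interpreter):
    """Return script_text with its '#!' header pointing at interpreter."""
    lines = script_text.splitlines(True)  # True keeps the line endings
    new_first = "#!" + interpreter + "\n"
    if lines and lines[0].startswith("#!"):
        lines[0] = new_first       # replace the existing header
    else:
        lines.insert(0, new_first)  # no header yet, so add one
    return "".join(lines)
```

Only the first line changes; the rest of the launcher script is passed through untouched.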



OK, its UCS changed successfully:


There's another error when importing pylab, at "from PyQt4 import QtCore, QtGui":



There is no quick way to solve it: the PyQt4 source code obviously uses UCS4 a lot, and there is no such thing as UCS3:


Unless I can compile PyQt3, but that gave me more annoying errors that are hard to fix at this moment:

================== bonus note1 ===================
The PyQt4 source code's configure-ng.py has a bug:


You have to vi configure-ng.py and add a default value for pylib_dir under the else condition:
==============================================

================== bonus note2 ===================
UCS is not the only problem I met. A core dump also occurred and made the situation worse:

So how did I fix it? Let's use `py278 -v` to make it clear:


Now I know something is wrong with the numpy module, so let's import it directly:


Combined with the UCS problem I mentioned above, this broke my brain. But I tried hard and experimented with various python versions, i.e. python, ipython and py278:


It does not happen with every python, so there is some hidden rule. python and ipython behaved the same and had no problem importing numpy. (That settled the numpy question, but "they are the same" was the wrong conclusion, since ipython uses /bin/python, and it misled me on the UCS problem. Luckily I figured it out, as already mentioned above.)
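This kind of experiment is easier to repeat if each import runs in a child process, so a crash in one interpreter cannot take the test down with it. A rough sketch, assuming a POSIX system; `try_import` is a hypothetical helper name.

```python
import subprocess
import sys


def try_import(interpreter, module):
    """Run `import module` in a child interpreter and report how it ended."""
    proc = subprocess.Popen(
        [interpreter, "-c", "import " + module],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    _, err = proc.communicate()
    code = proc.returncode
    if code == 0:
        return "ok"
    if code < 0:
        # On POSIX a negative return code means the child died from a signal,
        # which is how a core dump looks from the parent's side.
        return "killed by signal %d" % -code
    lines = err.decode("utf-8", "replace").strip().splitlines()
    # The last traceback line carries the exception type and message.
    return lines[-1] if lines else "exit code %d" % code
```

Running it over every interpreter and module pair gives a small table of "ok", an error message, or a signal number, which is enough to see which builds share a problem.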

So the rule is how I configured the python source code: --with-pydebug makes numpy fail. Someone opened a discussion at https://bugzilla.redhat.com/show_bug.cgi?id=1030830 and created a bug report at https://bugzilla.redhat.com/show_bug.cgi?id=1031998 (Originally I thought it might be due to either "--with-pydebug" or `make CC="gcc -g"`, but I tried it and proved the latter is not the cause.)
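To tell whether a given interpreter is a debug build without remembering how it was configured, something like the following can be run inside it. It is a sketch under two assumptions: that `sys.gettotalrefcount` is present only on CPython debug builds, and that the `CONFIG_ARGS` build variable is recorded (it may be missing on some platforms). The function names are invented for this example.

```python
import sys
import sysconfig


def is_pydebug_build():
    """True when this CPython was built with debugging support."""
    # sys.gettotalrefcount is only defined on debug builds of CPython.
    return hasattr(sys, "gettotalrefcount")


def configure_args():
    """Return the recorded ./configure arguments, or '' if unavailable."""
    return sysconfig.get_config_var("CONFIG_ARGS") or ""
```

On a build like the ones described here, `configure_args()` should show flags such as --with-pydebug if they were used.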
==============================================

Actually the ipython notebook's pylab was ready to work after I fixed numpy's UCS (by changing the header of /usr/local/bin/ipython) and recompiled python without --with-pydebug. Of course, this is a new build, so I had to change the /usr/local/bin/ipython header, add a py2782 alias to ~/.bashrc, and do some extra work (I don't want to add the entire site-packages, I'm a geek :-p)
[UPDATE] But there's an alternative way using links (so you only need to include this directory path in your .pth file):

vi /home/ack0hole/.localpython2.7.8.2/lib/python2.7/site-packages/kokhole.pth and add the required modules, not the entire python2.7/site-packages/
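For anyone unfamiliar with .pth files: each line that names an existing directory gets appended to sys.path when the site directory is processed. The sketch below shows the mechanism with throwaway temporary directories rather than the real paths above; `write_pth` and `demo` are hypothetical names.

```python
import os
import site
import sys
import tempfile


def write_pth(site_dir, name, directories):
    """Write site_dir/<name>.pth with one directory per line."""
    pth_path = os.path.join(site_dir, name + ".pth")
    with open(pth_path, "w") as handle:
        for directory in directories:
            handle.write(directory + "\n")
    return pth_path


def demo():
    """Show that a directory listed in a .pth file ends up on sys.path."""
    site_dir = tempfile.mkdtemp()   # stands in for site-packages
    extra_dir = tempfile.mkdtemp()  # stands in for one module's directory
    write_pth(site_dir, "example", [extra_dir])
    site.addsitedir(site_dir)       # processes every .pth file in site_dir
    return os.path.abspath(extra_dir) in sys.path
```

At normal startup the interpreter does this for its own site-packages, so only the directories listed in the .pth file become importable, not everything that happens to live next to them.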


Let's start `ipython notebook` and create a new notebook. Import pylab and do some math.


OK, whatever, I need to fix "from PyQt4 import QtCore, QtGui" too. But I realized something was off: even with /usr/local/bin/python (the original ipython header) this error should not appear, because the newly compiled /home/ack0hole/.localpython2.7.8.2/bin/python2.7 (let's call it py2782) imports pylab without this problem. I finally found the difference when I tried `from PyQt4 import QtGui` there: the error is "No module named sip", not the UCS4 error.
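Since the two failures look alike at a glance, it helps to sort error messages by what they really mean. A small sketch, assuming the messages contain the same key phrases seen in this post; `classify_import_error` and its return labels are invented for illustration.

```python
def classify_import_error(message):
    """Sort an import error message into one of the cases from this post."""
    # Undefined PyUnicodeUCS2* / PyUnicodeUCS4* symbols mean the extension
    # was compiled for a different Unicode width than the interpreter.
    if "UnicodeUCS2" in message or "UnicodeUCS4" in message:
        return "unicode-width mismatch"
    # A missing module is a path problem, not a build problem.
    if "No module named" in message:
        return "missing module"
    return "other"
```

A "missing module" result points at sys.path or .pth files, while a "unicode-width mismatch" means something has to be recompiled.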


After I ran `sudo mv /usr/local/lib/python2.7/site-packages/sip.so /usr/local/lib/python2.7/site-packages/sip/` and added the sip path to py2782's .pth file, the error became the same as with /usr/local/bin/python. Most importantly, importing pylab now fails in the first place, so the behaviour of the two matches 100%:


So, to recap: if you `import pylab` without sip (i.e. the sip module is not found), it still succeeds. But if you `import pylab` with sip available, it fails, because with sip it goes on to import QtGui and QtCore.

So I configured python with --enable-unicode=ucs4 (without pydebug, of course :) and recompiled it. And now a new error appears:


So I need to recompile sip.so, but I can't simply use hg-git to clone it:

Does that mean a lot of recompiling? No. hg-git is simply not using python2783, and I don't want to bother with how to change the default python for hg-git, so I just downloaded the tar.gz source code and ran `py2783 configure.py && make && make install`.


Finally, QtGui and QtCore work. What surprised me is that I did not have to recompile PyQt4. But the next error looks so familiar: it's back to the first scenario I mentioned above. Previously I fixed it by changing the header of `which ipython`, but what now? I have to make it UCS4 compatible with py2783.

So I ran `make clean`, `py2783 setup.py build` and `sudo ~/.localpython2.7.8.3/bin/python setup.py install` inside the matplotlib source code, and now another error comes out on `import pylab`:


`import numpy` gives the same error, as expected. It's obvious that numpy has to be recompiled with UCS4. So just `git clone http://github.com/numpy/numpy` and `py2783 setup.py build`, but another error appears:


I have to install Cython first. So, just do `sudo easy_install cython`.

Now I downloaded the numpy source code and ran `sudo /home/ack0hole/.localpython2.7.8.3/bin/python setup.py build` and `sudo /home/ack0hole/.localpython2.7.8.3/bin/python setup.py install`. But still no luck: numpy failed to import and still reported the UCS2 error. But I had already compiled it with UCS4, so what the heck was going on? I finally realized something when running `clean --all`


I checked py2783's sys.path and found 2 numpy paths. So I disabled the numpy path inside '/usr/lib/python2.7/site-packages/numpy-1.9.0-py2.7.egg-info/' and ran `sudo rm -rf numpy*` inside the '/home/ack0hole/.localpython2.7.8.3/lib/python2.7/site-packages' directory. (I don't think the first one was the main cause, because numpy-1.9.0-py2.7-linux-i686.egg is located in the latter directory.) When a numpy setup fails (I tried a few times), it does not clean up correctly! You have to clean it manually.
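Leftover copies like these can be hunted down by scanning every sys.path entry for the package name. A sketch only: `find_package_copies` is a made-up helper, and the name matching is deliberately crude (a directory called `numpy`, or anything starting with `numpy-` such as an .egg or .egg-info), so it can also hit unrelated packages with a similar prefix.

```python
import os


def find_package_copies(package, search_paths):
    """List entries in search_paths that look like installs of package."""
    hits = []
    for directory in search_paths:
        if not os.path.isdir(directory):
            continue  # sys.path may contain zip files or stale entries
        for entry in sorted(os.listdir(directory)):
            if entry == package or entry.startswith(package + "-"):
                hits.append(os.path.join(directory, entry))
    return hits
```

Calling `find_package_copies("numpy", sys.path)` (after `import sys`) from the interpreter in question lists every copy it could pick up. More than one hit is a hint that an old build is shadowing the new one.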


Let's try it now. Wow! numpy imports successfully. But still no luck with matplotlib/_path.so, which was still UCS2.
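A stale extension module can be spotted without importing it, because the symbol names it needs are stored as plain bytes inside the file. This is a heuristic sketch, not a real symbol-table parser; `unicode_markers` and `unicode_markers_in_file` are hypothetical names.

```python
def unicode_markers(data):
    """Return which PyUnicodeUCS* prefixes occur in a blob of bytes."""
    found = []
    for marker in (b"PyUnicodeUCS2", b"PyUnicodeUCS4"):
        if marker in data:
            found.append(marker.decode("ascii"))
    return found


def unicode_markers_in_file(path):
    """Apply the same check to a compiled extension such as a .so file."""
    with open(path, "rb") as handle:
        return unicode_markers(handle.read())
```

If a .so file still reports `PyUnicodeUCS2` after a rebuild for a UCS4 python, the file on disk is an old one and the rebuild never replaced it.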

Even after I rebuilt matplotlib (it depends on numpy), still no luck. What to do? Let's remove the existing matplotlib inside 2783's site-packages directory, i.e. `sudo rm -rf /home/ack0hole/.localpython2.7.8.3/lib/python2.7/site-packages/matplotlib-1.4.0-py2.7-linux-i686.egg`

After a few rebuild attempts I knew something was wrong with the existing matplotlib. So I decided to (sudo) `make clean`, sudo rm -rf the entire matplotlib source code, and download the matplotlib source code again with `git clone git://github.com/matplotlib/matplotlib.git`. cd into it and `git pull` to double-confirm it's the latest version (it seems like a redundant step, but I don't want to mess up again; minimizing the factors is a MUST when debugging).
So, `py2783 setup.py build` and `/home/ack0hole/.localpython2.7.8.3/bin/python setup.py install` (something is wrong: it needs sudo)


Yup, sudo makes it work, but installing system-wide is bad practice. So let's change the owner to my home user, i.e. `sudo chown ack0hole:ack0hole /home/ack0hole/.localpython2.7.8.3/lib/python2.7/site-packages/easy-install.pth`.


But it doesn't end there: I realized four files in my local python bin are owned by root. So just chown all of them.
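Files left behind by earlier sudo installs can be listed by comparing each file's owner with your own user id. A sketch for POSIX systems; `files_not_owned_by` is a name made up for this example, and it only reports, it does not chown anything.

```python
import os


def files_not_owned_by(directory, uid):
    """Return every path under directory whose owner is not uid."""
    foreign = []
    for root, dirs, files in os.walk(directory):
        for name in dirs + files:
            path = os.path.join(root, name)
            # lstat checks the entry itself rather than a symlink's target.
            if os.lstat(path).st_uid != uid:
                foreign.append(path)
    return sorted(foreign)
```

Running `files_not_owned_by` on the local python prefix with `os.getuid()` as the uid lists whatever still belongs to root, so nothing gets missed by a manual look through the bin directory.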


All imports work perfectly now, let's celebrate :)



[oops].. I forgot to mention scipy: just download the Tempita source code and run `py2783 setup.py install --prefix=/home/ack0hole/.localpython2.7.8.3/`, then `sudo yum install lapack-debuginfo lapack-devel lapack-static lapack`, then download the scipy source code and run `py2783 setup.py build` and `py2783 setup.py install --prefix=/home/ack0hole/.localpython2.7.8.3/`. That's all.

[Last words] This article is not meant to cover all of the possible exceptions you might encounter. Your setup process could face other issues, such as 32/64-bit incompatibility, "No module named xxx", etc.