Tuesday, 16 September 2014

The art of python setup

Author: <limkokhole@gmail.com>

The mission was to import pylab (or matplotlib.pyplot) in an ipython notebook and draw something, but something went wrong.
......


UnicodeUCS2_AsEncodedString? Stack Overflow has a similar question, but the answers there are poor, so let's rely on ourselves. Based on the documentation at https://docs.python.org/2/faq/extending.html#when-importing-module-x-why-do-i-get-undefined-symbol-pyunicodeucs2:

When importing module X, why do I get “undefined symbol: PyUnicodeUCS2*”?

You are using a version of Python that uses a 4-byte representation for Unicode characters, but some C extension module you are importing was compiled using a Python that uses a 2-byte representation for Unicode characters (the default).
If instead the name of the undefined symbol starts with PyUnicodeUCS4, the problem is the reverse: Python was built using 2-byte Unicode characters, and the extension module was compiled using a Python with 4-byte Unicode characters.
This can easily occur when using pre-built extension packages. RedHat Linux 7.x, in particular, provided a “python2” binary that is compiled with 4-byte Unicode. This only causes the link failure if the extension uses any of the PyUnicode_*() functions. It is also a problem if an extension uses any of the Unicode-related format specifiers for Py_BuildValue() (or similar) or parameter specifications for PyArg_ParseTuple().
You can check the size of the Unicode character a Python interpreter is using by checking the value of sys.maxunicode:
>>> import sys
>>> if sys.maxunicode > 65535:
...     print 'UCS4 build'
... else:
...     print 'UCS2 build'
The only way to solve this problem is to use extension modules compiled with a Python binary built using the same size for Unicode characters.


OK, let's check my python. I have a lot of python versions installed, but to my surprise they all report the same 65535:
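The same check can be run against several interpreters in one go. The following is only a minimal sketch: the helper name `unicode_width` is made up for illustration, and the list of interpreters is a placeholder that you would replace with the paths on your own machine. Note that Python 3.3 and later always report 1114111, so the UCS2/UCS4 split only matters for older builds such as the 2.7 ones discussed here.

```python
import subprocess
import sys


def unicode_width(interpreter):
    """Ask the given interpreter for its sys.maxunicode and classify the build."""
    out = subprocess.check_output(
        [interpreter, "-c", "import sys; print(sys.maxunicode)"]
    )
    maxunicode = int(out.decode().strip())
    # 65535 (0xFFFF) means a narrow UCS2 build; 1114111 (0x10FFFF) means UCS4.
    return maxunicode, ("UCS4" if maxunicode > 65535 else "UCS2")


if __name__ == "__main__":
    # Placeholder list: add the interpreters you want to compare,
    # e.g. the result of `which python` and the one ipython really uses.
    for path in [sys.executable]:
        print(path, unicode_width(path))
```

Comparing the interpreter found on PATH with the one named in the ipython launcher is the comparison that matters in this story.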




Weird, I thought the default python would be different. Let's dig a bit deeper:



Nope, ipython uses a different python, so we need to locate ipython:


/usr/local/bin/python might not be the python that ipython uses, so let's check ipython:


Hmm... /usr/local, nothing special. Let's check the file type:


Yup, it is ASCII text. So just open it in vi:


The answer is obvious now: it uses /bin/python. So let's find out its UCS setting:
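Instead of opening the launcher in an editor, its first line can be read programmatically. This is a small sketch, not anything ipython itself provides; `read_shebang` is a hypothetical helper name, and it assumes the launcher is a plain text script that starts with a `#!` line.

```python
def read_shebang(script_path):
    """Return the interpreter named on the script's first line, or None."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return None  # not a script with a shebang (e.g. a compiled binary)
    return first_line[2:].strip()
```

Calling it on the path printed by `which ipython` would show which python the launcher really starts, which can then be checked for its sys.maxunicode value.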


Yup, 1114111, so it uses UCS4. So just sudo vi the launcher and change its header to /usr/local/bin/python:



OK, its UCS has been changed successfully:


There's another error when importing pylab, at "from PyQt4 import QtCore, QtGui":



There is no quick way to solve it. The PyQt4 source code obviously uses UCS4 in a lot of places, and there is no such thing as UCS3:


unless I can compile PyQt3, but that gave me even more annoying errors that are hard to fix at the moment:

================== bonus note1 ===================
The configure-ng.py in the PyQt4 source code has a bug:


You have to edit configure-ng.py and add a default value for pylib_dir under the else branch:
==============================================

================== bonus note2 ===================
UCS is not the only problem I met. A core dump also occurred and made the situation worse:

So how did I fix it? Let's use `py278 -v` to make things clear:


Now I know something is wrong with the numpy module, so let's import it directly:


Combined with the UCS problem mentioned above, this broke my brain. But I kept trying and experimented with various python versions, i.e. python, ipython and py278:


It does not happen with every python, so there is some hidden rule. At this point I concluded that python and ipython are the same and have no problem importing numpy. (This solved the numpy problem, but the conclusion was wrong, since ipython uses /bin/python, and it misled me on the UCS problem. Luckily I figured that out, as already mentioned above.)

So the rule is how I configured the python source code: --with-pydebug makes numpy fail. Someone opened a discussion at https://bugzilla.redhat.com/show_bug.cgi?id=1030830 and filed a bug report at https://bugzilla.redhat.com/show_bug.cgi?id=1031998 (Originally I thought the cause might be either "--with-pydebug" or `make CC="gcc -g"`, but I tried and proved that the latter is not the cause.)
==============================================
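To tell whether a given interpreter was built with --with-pydebug without digging up its configure line, a check like the one below may help. It is a sketch with a made-up helper name (`is_pydebug_build`); it relies on `sys.gettotalrefcount`, which CPython only provides in debug builds, and on the `Py_DEBUG` build variable, which may be missing on some platforms.

```python
import sys
import sysconfig


def is_pydebug_build():
    """Return True if the running interpreter looks like a --with-pydebug build."""
    # sys.gettotalrefcount only exists in debug builds of CPython.
    if hasattr(sys, "gettotalrefcount"):
        return True
    # Py_DEBUG is recorded in the build configuration; it may be None or 0.
    return bool(sysconfig.get_config_var("Py_DEBUG"))


if __name__ == "__main__":
    print(sys.executable, "pydebug build:", is_pydebug_build())
```

Running it with each candidate interpreter (py278, /bin/python, and so on) shows which ones are debug builds and therefore likely to hit the numpy crash described above.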

Actually, pylab in the ipython notebook was ready to work after I fixed numpy's UCS problem (by changing the header of /usr/local/bin/ipython) and recompiled python without --with-pydebug. Of course, this is a new build, so I had to change the /usr/local/bin/ipython header, add a py2782 alias to ~/.bashrc, and do some extra work (I don't want to add the entire site-packages, I'm a geek :-p)
[UPDATE] There is an alternative way using links (so you only need to include this directory path in your .pth file):

vi /home/ack0hole/.localpython2.7.8.2/lib/python2.7/site-packages/kokhole.pth and add the required modules, not the entire python2.7/site-packages/
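For readers unfamiliar with .pth files: each line names a directory, and Python's `site` module appends every listed directory that exists to sys.path. The sketch below demonstrates the mechanism with temporary directories standing in for site-packages and for a module directory; the helper name `add_paths_via_pth` and the file name `example.pth` are invented for the demonstration.

```python
import os
import site
import sys
import tempfile


def add_paths_via_pth(site_dir, pth_name, extra_dirs):
    """Write a .pth file listing extra_dirs and let the site module process it."""
    pth_path = os.path.join(site_dir, pth_name)
    with open(pth_path, "w") as f:
        for directory in extra_dirs:
            f.write(directory + "\n")  # one directory per line
    # addsitedir appends site_dir itself, plus every existing directory
    # listed in the .pth files it finds there, to sys.path.
    site.addsitedir(site_dir)
    return pth_path


if __name__ == "__main__":
    # Temporary stand-ins for site-packages and for a directory of modules.
    fake_site_packages = tempfile.mkdtemp()
    module_dir = tempfile.mkdtemp()
    add_paths_via_pth(fake_site_packages, "example.pth", [module_dir])
    print(module_dir in sys.path)
```

In a real installation the interpreter processes the .pth files in its site-packages directory at startup, so only the file itself has to be written.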


Let's start `ipython notebook` and create a new notebook. Import pylab and do some math.


OK, whatever, I need to fix "from PyQt4 import QtCore, QtGui" too. But I realized something was wrong: even with /usr/local/bin/python (the original ipython header) this error should not appear, because the newly compiled /home/ack0hole/.localpython2.7.8.2/bin/python2.7 (let's just call it py2782) has no such problem importing pylab. I finally figured out the difference after I tried `from PyQt4 import QtGui`: the error is "No module named sip", not a UCS4 error.


After I ran `sudo mv /usr/local/lib/python2.7/site-packages/sip.so /usr/local/lib/python2.7/site-packages/sip/` and added the sip path to py2782's .pth file, the error became the same as with /usr/local/bin/python. Most importantly, importing pylab now fails in the first place, so the behaviour of the two matches 100%:


So, to recap: if you `import pylab` without sip (i.e. the sip module is not found), it still succeeds. But if you `import pylab` with sip available, it fails, because sip additionally needs to import QtGui and QtCore.

So I configured python with --enable-unicode=ucs4 (without pydebug, of course :) and recompiled it. Now a new error appears:


So I need to recompile sip.so, but I can't simply use hg-git to clone it:

Does that mean a lot of recompiling? No. hg-git fails because it does not use python2783, and I don't want to bother with changing the default python for hg-git, so I just downloaded the tar.gz source code and ran `py2783 configure.py && make && make install`.


Finally QtGui and QtCore work. What surprised me is that I didn't have to recompile PyQt4. But one error looks very familiar: we are back to the first scenario mentioned above. Previously I fixed it by changing the header of `which ipython`, but what now? I have to make it UCS4 compatible with py2783.

So I ran `make clean`, `py2783 setup.py build` and `sudo ~/.localpython2.7.8.3/bin/python setup.py install` inside the matplotlib source code, and now another error comes out on `import pylab`:


`import numpy` gives the same error, as expected. Obviously numpy has to be recompiled with UCS4. So just `git clone http://github.com/numpy/numpy` and `py2783 setup.py build`, but another error appears:


I have to install Cython first. So, just do `sudo easy_install cython`.

Now I downloaded the numpy source code and ran `sudo /home/ack0hole/.localpython2.7.8.3/bin/python setup.py build` and `sudo /home/ack0hole/.localpython2.7.8.3/bin/python setup.py install`. But still no luck: importing numpy failed and still reported the UCS2 error. But I had already compiled it with UCS4, so what the heck was going on? I finally realized something when running `clean --all`:


I checked py2783's sys.path and found two numpy paths. So I disabled the numpy path inside '/usr/lib/python2.7/site-packages/numpy-1.9.0-py2.7.egg-info/' and ran `sudo rm -rf numpy*` inside the '/home/ack0hole/.localpython2.7.8.3/lib/python2.7/site-packages' directory. (I don't think the first one is the main cause, because numpy-1.9.0-py2.7-linux-i686.egg is located in the latter directory.) When a numpy setup fails (I tried a few times), it is not cleaned up correctly! You have to clean it up manually.
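Spotting leftover copies by eye is error-prone, so a small scan of sys.path can help. This is a rough sketch rather than a complete tool: the helper name `find_package_copies` is invented, and the name matching (the exact name, or the name followed by a hyphen) is an assumption that covers directories like `numpy`, `numpy-1.9.0-py2.7.egg-info` and `numpy-1.9.0-py2.7-linux-i686.egg`.

```python
import os
import sys


def find_package_copies(name, search_path=None):
    """List every entry under the search path that looks like a copy of the package."""
    hits = []
    for entry in (sys.path if search_path is None else search_path):
        if not os.path.isdir(entry):
            continue  # skip zip files, missing directories and empty entries
        for item in sorted(os.listdir(entry)):
            # Matches e.g. numpy/, numpy-1.9.0-py2.7.egg-info, numpy-...egg
            if item == name or item.startswith(name + "-"):
                hits.append(os.path.join(entry, item))
    return hits


if __name__ == "__main__":
    for hit in find_package_copies("numpy"):
        print(hit)
```

If more than one location shows up for the interpreter you are fixing, a stale build may be shadowing the freshly compiled one.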


Let's try it now. Wow! numpy imports successfully. But still no luck with matplotlib/_path.so, which is still UCS2.

Even after I rebuilt matplotlib (it depends on numpy), still no luck. What to do? Let's remove the existing matplotlib inside 2783's site-packages directory, i.e. `sudo rm -rf /home/ack0hole/.localpython2.7.8.3/lib/python2.7/site-packages/matplotlib-1.4.0-py2.7-linux-i686.egg`

After a few rebuild attempts I knew something was wrong with the existing matplotlib. So I decided to run (sudo) `make clean`, sudo rm -rf the entire matplotlib source tree, and download the matplotlib source code again with `git clone git://github.com/matplotlib/matplotlib.git`. cd into it and `git pull` to double-check that it is the latest version (it seems like a redundant step, but I don't want to mess up again; minimizing the variables is a MUST when debugging).

So, `py2783 setup.py build` and `/home/ack0hole/.localpython2.7.8.3/bin/python setup.py install` (something is wrong: it needs sudo):


Yup, sudo makes it work, but installing system-wide is bad practice. So let's change the owner to my own user, i.e. `sudo chown ack0hole:ack0hole /home/ack0hole/.localpython2.7.8.3/lib/python2.7/site-packages/easy-install.pth`.


But it doesn't end there. I realized that four files in my local python bin directory are owned by root. So just chown all of them.
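Finding the files that ended up owned by root does not have to be done by reading `ls -l` output. Here is a minimal sketch, assuming a POSIX system where root has uid 0; `files_owned_by` is a hypothetical helper name, and it only looks at the entries directly inside the directory.

```python
import os


def files_owned_by(directory, uid=0):
    """Return the paths directly under directory whose owner has the given uid."""
    owned = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # lstat so that a symlink is judged by its own owner, not its target's.
        if os.lstat(path).st_uid == uid:
            owned.append(path)
    return owned
```

Pointing it at the local python bin directory should list the files that still need a chown.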


All imports are working perfectly now, let's celebrate :)



[oops].. I forgot to mention scipy. Just download the Tempita source code and run `py2783 setup.py install --prefix=/home/ack0hole/.localpython2.7.8.3/`, then `sudo yum install lapack-debuginfo lapack-devel lapack-static lapack`, then download the scipy source code and run `py2783 setup.py build` and `py2783 setup.py install --prefix=/home/ack0hole/.localpython2.7.8.3/`. That's all.

[Last words] This article is not meant to cover all the possible exceptions you might encounter. Your setup process could face other issues such as 32/64-bit incompatibility, "No module named xxx", etc.


Friday, 30 May 2014

Facebook - A Tour of the Photo Library


Click on the profile picture and choose Open Image in New Tab




and you will see the profile picture image.




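Look at the part after the rightmost slash / in https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/c10.10.130.130/181592_10150105700371961_7986881_n.jpg, that is, 181592_10150105700371961_7986881_n.jpg

The underscores _ split it into three parts: 181592, 10150105700371961_7986881, and n.jpg.

n.jpg denotes the largest version of the image. Change it to s.jpg ('s' for small) and you get a small one.
Then why is this n.jpg still so small? Because the link contains an extra /c10.10.130.130 size segment. Just remove it. (Sometimes the extra segment is a /p160x160/ size instead.)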

https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/181592_10150105700371961_7986881_n.jpg
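Removing the size segment by hand can be scripted. The sketch below only encodes the two patterns mentioned in this post (a `c` followed by four dot-separated numbers, and a `p` followed by width x height); the function name `strip_size_segments` is invented, and nothing here is a documented Facebook interface, so the URL scheme may well differ today.

```python
import re

# Path segments such as c10.10.130.130 or p160x160, as seen in this post.
_SIZE_SEGMENT = re.compile(r"^(c\d+\.\d+\.\d+\.\d+|p\d+x\d+)$")


def strip_size_segments(url):
    """Drop size path segments from an image URL and keep everything else."""
    scheme, sep, rest = url.partition("://")
    kept = [part for part in rest.split("/") if not _SIZE_SEGMENT.match(part)]
    return scheme + sep + "/".join(kept)


if __name__ == "__main__":
    print(strip_size_segments(
        "https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/"
        "c10.10.130.130/181592_10150105700371961_7986881_n.jpg"
    ))
```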


(Note: besides jpg, there is also png.)
(Note: besides s.jpg and n.jpg, there are also a.jpg, b.jpg, g.jpg, o.jpg, q.jpg, t.jpg ('t' for tiny), and x.jpg.)

Facebook also offers q1 to q100 ('q' for quality, 1-100 for the quality level):

https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/q1/181592_10150105700371961_7986881_n.jpg

1 is the worst quality, so why is there no visible effect? You must also add r180 ('r' for rotate, 180 for the degrees):

https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/q1/r180/181592_10150105700371961_7986881_n.jpg




Of course, you can also change it to 90 or 270 degrees:

https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/q1/r270/181592_10150105700371961_7986881_n.jpg
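The quality and rotation segments sit right before the file name, so building such a link is a simple string operation. This is only a sketch of the pattern shown in the example links above; `with_quality_and_rotation` is a made-up name, and the accepted ranges (1 to 100, and 90/180/270 degrees) are taken from this post rather than from any official documentation.

```python
def with_quality_and_rotation(url, quality, rotation):
    """Insert q<quality>/r<rotation> segments just before the file name."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    if rotation not in (90, 180, 270):
        raise ValueError("rotation must be 90, 180 or 270")
    base, _, filename = url.rpartition("/")
    return "{0}/q{1}/r{2}/{3}".format(base, quality, rotation, filename)
```

The input is expected to be a link that has already had its size segment removed.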

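(Note: a q100 image is huge (possibly megabytes), but q75 is actually the official top quality, because a rotated image cannot be compared with an unrotated one.)



Here comes the question

Suppose this image link falls out of the sky:

https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfa1/t1.0-1/181592_10150105700371961_7986881_n.jpg

How do you find out which Facebook user this image belongs to?

As I said above,
the underscores _ split the file name into three parts: 181592, 10150105700371961_7986881, and n.jpg.
Take 10150105700371961 from the middle part and turn it into https://facebook.com/10150105700371961

It will redirect to https://www.facebook.com/photo.php?fbid=10150105700371961&set=a.437816051960.224734.216311481960&type=1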




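That is how you find the Facebook account of the image owner. Of course, the image must be public. If it is not public, as with Mark Zuckerberg for example, this does not work.




In fact, relying on https://facebook.com alone sometimes fails, so the most reliable approach is to add photo.php?fbid= and get https://www.facebook.com/photo.php?fbid=10150105700371961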
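Going from an image link to its photo page can be written as a few lines of Python. This is a sketch of the rule described in this post only: it assumes the second underscore-separated number of the file name is the photo ID, and `photo_page_url` is a made-up helper name.

```python
def photo_page_url(image_url):
    """Build the photo.php link from the second number in the image file name."""
    # Drop any query string, then keep only the file name.
    filename = image_url.split("?", 1)[0].rsplit("/", 1)[-1]
    fields = filename.split("_")
    if len(fields) < 3 or not fields[1].isdigit():
        raise ValueError("unexpected file name: " + filename)
    return "https://www.facebook.com/photo.php?fbid=" + fields[1]
```

The function only builds the link; whether the page actually opens still depends on the photo being public.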




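With this basic knowledge in hand, is there any other use for it?



A tour of the default profile picture

Using this logic, you can trace upstream and find the user behind the default profile picture.

Isn't the user of the default profile picture just me? Of course not, I'm not the one who uploaded it :P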




https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xpa1/t1.0-1/c47.0.160.160/p160x160/252231_1002029915278_1941483569_n.jpg




Pull out 1002029915278 and turn it into https://www.facebook.com/photo.php?fbid=1002029915278


What is this, nothing there?

Don't let it fool you. Press F5 to refresh several times.




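The hidden big boss is finally exposed. Let's look at its home page.




It joined Facebook on 2 June 2012.




His only friend became a friend at that same time.




As expected, his only friend is a Facebook engineer.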




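A bit of stalking with Graph Search finds them (the people followed by the people he follows).

All of them are verified Facebook employees.




Browse https://www.facebook.com/will.chengberg/photos (do not click the Will's Photos button, that one is incomplete) to go through the history of Facebook's photo library in one go.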

Old ones




New ones




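You cannot open them directly, because these links hang very badly.




Right-click and choose to open the link.




Same as before, you usually have to press F5 to refresh the page a few times before it shows up.




Then right-click again and open it in a new tab.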



In short, the most important point is that by modifying the link (as I taught above) you can get the default user picture in the highest resolution.


(Note: some links look like https://fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-xaf1/v/t1.0-9/1005878_10150001179405071_1239501253_n.jpg?oh=xxxxxxxxxxxxxxxxxxxxxxxxxxx&oe=xxxxxxx&__gda__=1111111111_xxxxxxxxxxxxxxxxxxxxxxxxxx. Remove the "/v/t1.0-9/" in the middle and the "?oh=xxxxxxxxxxxxxxxxxxxxxxxxxxx&oe=xxxxxxx&__gda__=1111111111_xxxxxxxxxxxxxxxxxxxxxxxxxx" after .jpg, and it becomes https://fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-xaf1/1005878_10150001179405071_1239501253_n.jpg.)
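The clean-up in the note above can be sketched with the standard library's URL parser. The function name `strip_signed_parts` and its default list of segments to drop (`v` and `t1.0-9`, taken from the single example in the note) are assumptions for illustration; other links may use different segment names.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_signed_parts(url, drop_segments=("v", "t1.0-9")):
    """Remove the listed path segments and the whole query string from a URL."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s not in drop_segments]
    # Empty query and fragment: everything after the file name is discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/".join(segments), "", ""))
```

This uses `urllib.parse`, so it needs Python 3.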



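What does the name Will Chengberg mean? It takes the "berg" from Mark Zucker"berg" and adds "Cheng", which sounds like the surname of his wife, Priscilla Chan. "Will" has four letters, and so does "Mark".

With the old search engine (Graph Search cannot find them), there are a full one hundred entries with the same name (different IDs).




With the API it is even crazier... five hundred user IDs.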




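These several hundred IDs are all lurkers with no username. They do have links, though. Their actual purpose is unknown.


Having no username makes sense. They can't very well be named Will Chengberg1, Will Chengberg2 =.=

The completely blank home page of a lurker account:




Even so, some of the lurker accounts do work,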
https://www.facebook.com/profile.php?id=89000000613342




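Those IDs cannot be turned into a USER_ID@facebook.com email address for logging in (not in this case, at least). A USERNAME@facebook.com address would work, but as noted above, these accounts have no username =.=

The head account, will.chengberg, can still be turned into an email address, though.


“Test Users can only be accessed from Facebook networks.”
In other words, logging in is only allowed from inside Facebook's own network.
That is far enough. The photo library tour ends here.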