An Introduction to gevent, a Coroutine-Based Python Networking Library

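Continuing the series on Python coroutines, this article covers gevent, a concurrent networking library. Its coroutines are built on greenlet, and it uses libev for a fast event loop (epoll on Linux, kqueue on FreeBSD, select on Mac OS X). gevent makes coroutines far easier to use: there is no need to switch explicitly as with raw greenlet, because whenever a coroutine blocks, gevent schedules another one automatically and handles the low-level details. A small two-greenlet program is enough to get a feel for it.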

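gevent.spawn() creates a new greenlet object and schedules it to run. gevent.joinall() waits until all the greenlets passed to it have finished; it accepts a timeout argument, in seconds. Consider a program that spawns two greenlets, test1 and test2: test1 prints 12, calls gevent.sleep(0), then prints 34, while test2 prints 56, calls gevent.sleep(0), then prints 78. The execution order is:

  1. Execution enters test1, which prints 12.
  2. At gevent.sleep(0), test1 yields and gevent switches to test2, which prints 56.
  3. test2 then yields at its own gevent.sleep(0); test1 is ready again, so gevent switches back to it and it prints 34.
  4. When test1 returns, test2 is ready again, so gevent switches back to it and it prints 78.
  5. All greenlets have finished and the program exits.

So the program prints 12, 56, 34, and 78, in that order.

Note that this differs from the first example in the previous article on greenlet. With raw greenlet you must switch explicitly, and when a coroutine finishes, control returns to its parent. With gevent, when a coroutine finishes, the remaining unfinished coroutines are scheduled automatically.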

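A more meaningful example looks up the IP addresses of three websites, one per coroutine. Resolving a remote address blocks on I/O, which gives gevent the chance to switch between the coroutines. The return value of each coroutine function is available through the value attribute of its greenlet object.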

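Monkey patching

Careful readers who run that example will notice that it takes just as long as the version without coroutines: the sum of the three lookup times. In theory coroutines are non-blocking, so shouldn't the total be the time of the slowest lookup? The reason is that the socket module in the Python standard library is blocking, so the DNS lookups cannot run concurrently; the same goes for libraries such as urllib. Used this way, coroutines bring no benefit. What can be done?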

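One option is to use gevent's own socket module, imported with "from gevent import socket". The more common approach is monkey patching.

Calling monkey.patch_socket() from the gevent.monkey module before the rest of the program runs patches the standard socket module: its classes and functions are replaced with cooperative, non-blocking versions, and no other code needs to change. This is where coroutines really pay off. Other standard library modules block as well, so gevent provides monkey.patch_all() to patch all of the modules it supports at once.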
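The effect of a patch can be observed without any network access. The sketch below is a stand-in for the socket case, not the original example: it assumes monkey.patch_time(), which replaces time.sleep() with a cooperative version, and lets time.sleep() play the role of a blocking network call. The fetch function and its delay are made up for the illustration.

```python
from gevent import monkey
monkey.patch_time()  # time.sleep() now yields to other greenlets

import time
import gevent

def fetch(delay):
    # Hypothetical stand-in for a blocking network call.
    time.sleep(delay)
    return delay

start = time.time()
jobs = [gevent.spawn(fetch, 0.2) for _ in range(3)]
gevent.joinall(jobs)
elapsed = time.time() - start

# The three waits overlap, so the total is close to the longest single
# wait rather than the sum of all three.
print("elapsed: %.2f seconds" % elapsed)
print([job.value for job in jobs])
```

Without the patch_time() call, each time.sleep() would block the whole process and the three calls would run one after another.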

Opinions on monkey patching are mixed, but the official site still recommends calling patch_all(), and doing so at the very start of the program.

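Getting the state of a coroutine

A coroutine is either started or stopped, which you can check with the started attribute and the ready() method of the greenlet object. For a stopped coroutine, successful() tells you whether it ran to completion without raising an exception. A return value, if any, is available through the value attribute. Exceptions raised inside a greenlet do not propagate outside it, so use the greenlet's exception attribute to retrieve them.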
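These attributes and methods can be tried out with two small greenlets, one that returns normally and one that raises. The function names win and fail are chosen for this sketch only.

```python
import gevent

def win():
    return "You win!"

def fail():
    raise Exception("You failed!")

winner = gevent.spawn(win)
loser = gevent.spawn(fail)

# Both greenlets have been scheduled but have not finished yet.
print(winner.started, loser.started)

# The exception raised in fail() is reported on stderr but is not
# re-raised here, so joinall() returns normally.
gevent.joinall([winner, loser])

print(winner.ready(), loser.ready())            # both have stopped
print(winner.successful(), loser.successful())  # only winner succeeded
print(winner.value, loser.value)                # loser has no return value
print(repr(loser.exception))                    # the exception object itself
```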

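Coroutine timeouts

As mentioned earlier, gevent.joinall() accepts a timeout argument. A timeout can also be set with a Timeout object: create Timeout(2) and call its start() method.

With the timeout set to 2 seconds, a Timeout exception is raised in the greenlet that started the timer if it is still blocked when the 2 seconds elapse. A Timeout can also be used in a with statement, in which case it applies only inside the with block.

You can also specify the exception to raise in place of the default Timeout, for example a custom TooLong exception.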
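The with-statement form and the custom exception can be sketched as follows. TooLong matches the name used in the text; the helper functions run_with_limit and run_with_default, and the short durations, are assumptions made so the sketch finishes quickly.

```python
import gevent
from gevent import Timeout

class TooLong(Exception):
    """Custom exception raised instead of the default Timeout."""

def run_with_limit(limit, work):
    """Sleep for `work` seconds, giving up after `limit` seconds."""
    try:
        with Timeout(limit, TooLong):   # applies only inside this block
            gevent.sleep(work)
    except TooLong:
        return "timed out"
    return "finished"

def run_with_default(limit, work):
    """Same as above, but with the default Timeout exception."""
    try:
        with Timeout(limit):
            gevent.sleep(work)
    except Timeout:
        return "timed out"
    return "finished"

print(run_with_limit(0.05, 1))
print(run_with_limit(1, 0.01))
print(run_with_default(0.05, 1))
```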

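Communication between coroutines

Greenlets can communicate asynchronously through Event objects. The wait() method blocks the current coroutine, and set() wakes every coroutine blocked on the event. For example, five waiter coroutines can wait on an event evt; when a setter coroutine sets evt after 3 seconds, all of the waiters wake up.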
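A minimal sketch of that setter/waiter arrangement follows. The delay is shortened from 3 seconds to 0.1 seconds, and the woken list is an addition of this sketch for recording which waiters resumed.

```python
import gevent
from gevent.event import Event

evt = Event()
woken = []

def setter():
    gevent.sleep(0.1)   # the text uses 3 seconds; shortened here
    evt.set()           # wakes every greenlet blocked in evt.wait()

def waiter(n):
    evt.wait()          # blocks until the event is set
    woken.append(n)

greenlets = [gevent.spawn(setter)]
greenlets += [gevent.spawn(waiter, i) for i in range(5)]
gevent.joinall(greenlets)

print(sorted(woken))
```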

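Besides Event, gevent provides AsyncResult, which can carry a value to the waiters when it wakes them. The setter passes the value to set(), and each waiter receives it from get().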
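A small sketch of AsyncResult, under the same assumptions as before (short delay, illustrative names):

```python
import gevent
from gevent.event import AsyncResult

result = AsyncResult()

def setter():
    gevent.sleep(0.1)
    result.set("Hello!")   # wakes the waiters and hands them a value

def waiter():
    return result.get()    # blocks until set() is called

waiters = [gevent.spawn(waiter) for _ in range(3)]
gevent.joinall([gevent.spawn(setter)] + waiters)

received = [w.value for w in waiters]
print(received)
```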

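Queue

Queues are a familiar concept: put() adds an element and get() removes one. gevent's queue objects can be shared safely between greenlets. With three consumers reading from one queue, each product is consumed exactly once and is never handed to a second consumer.

put() and get() block by default; the non-blocking versions are put_nowait() and get_nowait(). Calling get() on an empty queue blocks until an item arrives, whereas get_nowait() raises gevent.queue.Empty right away.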
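The producer/consumer pattern can be sketched like this. The consumer names and the consumed dictionary are illustrative; the producer is run to completion first so that the consumers never find the queue empty too early.

```python
import gevent
from gevent.queue import Queue, Empty

products = Queue()
consumed = {}

def producer():
    for i in range(1, 10):
        products.put(i)

def consumer(name):
    consumed[name] = []
    while True:
        try:
            item = products.get_nowait()   # raises Empty instead of blocking
        except Empty:
            return
        consumed[name].append(item)
        gevent.sleep(0)                    # let the other consumers run

gevent.spawn(producer).join()
gevent.joinall([gevent.spawn(consumer, n) for n in ("a", "b", "c")])

all_items = sorted(i for items in consumed.values() for i in items)
print(consumed)
```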

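Semaphores

A semaphore limits how many coroutines run concurrently. It has two methods, acquire() and release(). As the names suggest, acquire() takes the semaphore and release() gives it back. Once the semaphore's count is used up, the remaining coroutines wait until some coroutine releases it.

For example, with a BoundedSemaphore initialized to a count of 2, only two worker coroutines can hold the semaphore at any one time.

A semaphore with a count of 1 is equivalent to a mutual-exclusion lock.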
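The limit can be checked with a sketch that tracks how many workers hold the semaphore at once. It assumes the gevent.lock module as the home of BoundedSemaphore; the stats dictionary and the worker body are illustrative.

```python
import gevent
from gevent.lock import BoundedSemaphore

sem = BoundedSemaphore(2)
stats = {"active": 0, "peak": 0}

def worker(n):
    sem.acquire()
    try:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        gevent.sleep(0.01)   # simulated work while holding the semaphore
        stats["active"] -= 1
    finally:
        sem.release()

gevent.joinall([gevent.spawn(worker, i) for i in range(6)])
print("peak concurrency:", stats["peak"])
```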

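Coroutine-local variables

Like threads, coroutines have local variables, which are accessible only from within the coroutine that set them.

Storing a variable on a local object confines it to the current coroutine; another coroutine that tries to read it gets an exception (AttributeError). Different coroutines can use the same local variable names without affecting each other, because each coroutine's values are kept in a private namespace keyed by the return value of greenlet.getcurrent().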
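A short sketch of this isolation, with illustrative function names f1 and f2 and a results dictionary added for inspection:

```python
import gevent
from gevent.local import local

data = local()
results = {}

def f1():
    data.x = 1
    gevent.sleep(0)            # let f2 run while x is set here
    results["f1"] = data.x

def f2():
    data.y = 2
    try:
        results["f2"] = data.x   # x was set in f1, not in this coroutine
    except AttributeError:
        results["f2"] = "x is not set in this coroutine"

gevent.joinall([gevent.spawn(f1), gevent.spawn(f2)])
print(results)
```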

In practice

For a real-world use of gevent, a simple chat room program built on Flask is a good reference.

References


An Introduction to gevent, a Coroutine-Based Python Networking Library

Converting Python bytes to an integer or string

Goal: convert a byte string such as 'y\xcc\xa6\xbb' into an integer or a string.

Method 1: use the struct module
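A minimal sketch with struct.unpack, assuming the four bytes represent one unsigned 32-bit integer:

```python
import struct

data = b'y\xcc\xa6\xbb'

big = struct.unpack('>I', data)[0]     # '>' = big-endian, 'I' = 4-byte unsigned int
little = struct.unpack('<I', data)[0]  # '<' = little-endian

print(big)     # 2043455163
print(little)  # 3148270713
```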

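Method 2: int.from_bytes (Python 3.2 and later)

int.from_bytes() takes the byte order as an argument: pass 'big' to treat the byte string as big-endian, or 'little' to treat it as little-endian.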
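Both byte orders, applied to the same sample bytes:

```python
data = b'y\xcc\xa6\xbb'

big = int.from_bytes(data, byteorder='big')
little = int.from_bytes(data, byteorder='little')

print(big)     # 2043455163
print(little)  # 3148270713
```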

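Method 3: convert via hexadecimal

For big-endian, convert the bytes to a hex string and parse it as a base-16 integer. For little-endian, reverse the bytes first.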
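A sketch of the hexadecimal route in Python 3, assuming bytes.hex() (available from Python 3.5) in place of whatever helper the original code used:

```python
data = b'y\xcc\xa6\xbb'

big = int(data.hex(), 16)           # '79cca6bb' parsed as base 16
little = int(data[::-1].hex(), 16)  # reverse the bytes first

print(big)     # 2043455163
print(little)  # 3148270713
```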

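Method 4: use array

Here the type code I means an unsigned int, not a byte order: array always uses the machine's native byte order, and the item size depends on the platform and the Python version, so check both before relying on this method.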
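A sketch of the array approach. It assumes that the 'I' type code is 4 bytes wide on the machine running it, which is common but not guaranteed; byteswap() gives the value in the opposite byte order.

```python
import array
import sys

data = b'y\xcc\xa6\xbb'

# 'I' = unsigned int in the machine's native byte order
# (assumed to be 4 bytes wide here).
values = array.array('I', data)
native = values[0]

values.byteswap()   # flip to the other byte order in place
swapped = values[0]

print(sys.byteorder, native, swapped)
```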

Method 5: write your own function

For example:

Another example:
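One possible hand-written version is sketched here. It is not the original author's code; the function name bytes_to_int and its byteorder parameter are chosen for this illustration.

```python
def bytes_to_int(data, byteorder='big'):
    """Combine the bytes into one integer, most significant byte first."""
    if byteorder == 'little':
        data = data[::-1]
    result = 0
    for byte in data:   # iterating over bytes yields ints in Python 3
        result = (result << 8) | byte
    return result

print(bytes_to_int(b'y\xcc\xa6\xbb'))            # 2043455163
print(bytes_to_int(b'y\xcc\xa6\xbb', 'little'))  # 3148270713
```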

Converting a character array to a string



telnetsrv: a Telnet server for Python

Project description

Telnet server using gevent or threading.

Copied from http://pytelnetsrvlib.sourceforge.net/ and modified to support gevent, better input handling, clean asynchronous messages and much more. Licensed under the LGPL, as per the SourceForge notes.

This library allows you to easily create a Telnet server, powered by your Python code. The library negotiates with a Telnet client, parses commands, provides an automated help command, optionally provides login queries, then allows you to define your own commands.

You use the library to create your own handler, then pass that handler to a StreamServer or TCPServer to perform the actual connection tasks.

This library includes two flavors of the server handler, one uses separate threads, the other uses greenlets (green pseudo-threads) via gevent.

The threaded version uses a separate thread to process the input buffer and semaphores reading and writing. The provided test server only handles a single connection at a time.

The green version moves the input buffer processing into a greenlet to allow cooperative multi-processing. This results in significantly less memory usage and nearly no idle processing. The provided test server handles a large number of connections.

Install

telnetsrv is available through the Cheeseshop. You can use easy_install or pip to perform the installation:

easy_install telnetsrv

or

pip install telnetsrv

Note that there are no dependencies defined, but if you want to use the green version, you must also install gevent.

To Use

Import the TelnetHandler base class and command function decorator from either the green class or threaded class, then subclass TelnetHandler to add your own commands which are methods decorated with @command.

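Threaded: import TelnetHandler and command from telnetsrv.threaded.

Green: import TelnetHandler and command from telnetsrv.green.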

Adding Commands

Commands can be defined by using the command function decorator.

Old Style

Commands can also be defined by prefixing any method with “cmd”. For example, this also creates an echo command:

This method is less flexible and may not be supported in future versions.

Command Parameters

Any command parameters will be passed to this function automatically. The parameters are contained in a list. The user input is parsed similar to the way Bash parses text: space delimited, quoted parameters are kept together and default behavior can be modified with the \ character. If you need to access the raw text input, inspect the self.input.raw variable.
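The Bash-like splitting described above can be approximated with the standard library's shlex module. This is only an illustration of the parsing rules, not the library's own parser, and the input line is made up.

```python
import shlex

line = r'echo "hello world" one\ two three'

parts = shlex.split(line)   # space delimited, quotes and backslashes honoured
name, params = parts[0], parts[1:]

print(name)    # echo
print(params)  # ['hello world', 'one two', 'three']
```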

Command Help Text

The command’s docstring is used for generating the console help information, and must be formatted with at least 3 lines:

  • Line 0: Command parameter(s) if any. (Can be blank line)
  • Line 1: Short descriptive text. (Mandatory)
  • Line 2+: Long descriptive text. (Can be blank line)

If there is no line 2, line 1 will be used for the long description as well.
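The three-part docstring layout can be illustrated with a small splitter. The function split_help is hypothetical and is not part of telnetsrv; it only shows how the line 0, line 1 and line 2+ convention maps onto a docstring.

```python
def split_help(doc):
    """Split a command docstring into (parameters, short text, long text)."""
    lines = [line.strip() for line in doc.splitlines()]
    params = lines[0] if lines else ""
    short = lines[1] if len(lines) > 1 else ""
    long_text = "\n".join(lines[2:]).strip()
    # With no line 2, the short description doubles as the long one.
    return params, short, long_text or short

def command_echo(params):
    '''<text to echo>
    Echo text back to the console.
    This command simply echoes the provided text
    back to the console.
    '''

print(split_help(command_echo.__doc__))
```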

Command Aliases

To create an alias for the new command, pass the decorator a list of names instead of a single name:

The decorator may be stacked, which adds each list to the aliases:

Hidden Commands

To hide the command (and any alias for that command) from the help text output, pass in hidden=True to the decorator:

The command will not show when the user invokes help by itself, but the detailed help text will show if the user invokes help echo.

When stacking decorators, any one of the stack may define the hidden parameter to hide the command.
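How a decorator can collect aliases across a stack and honour a hidden flag is sketched below. This is a hypothetical re-implementation for illustration, not telnetsrv's actual code; the aliases and hidden attribute names and the Handler class are assumptions.

```python
def command(names, hidden=False):
    """Register the decorated method under one or more names (hypothetical)."""
    if isinstance(names, str):
        names = [names]

    def decorator(func):
        # Stacked decorators extend the alias list already on the function.
        func.aliases = list(names) + getattr(func, "aliases", [])
        # Any decorator in the stack may hide the command.
        func.hidden = hidden or getattr(func, "hidden", False)
        return func

    return decorator

class Handler:
    @command("echo")
    @command(["copy", "repeat"], hidden=True)
    def command_echo(self, params):
        return " ".join(params)

print(Handler.command_echo.aliases)  # ['echo', 'copy', 'repeat']
print(Handler.command_echo.hidden)   # True
```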

Console Information

These will be provided for inspection.

TERM - String ID describing the currently connected terminal.
username - Set after authentication succeeds, name of the logged in user. If no authentication was requested, will be None.
history - List containing the command history. This can be manipulated directly.

Console Communication

Send Text to the Client

Lower level functions:

self.writeline( TEXT )

self.write( TEXT )

Higher level functions:

self.writemessage( TEXT ) - for clean, asynchronous writing. Any interrupted input is rebuilt.

self.writeresponse( TEXT ) - to emit a line of expected output

self.writeerror( TEXT ) - to emit error messages

The writemessage method is intended to send messages to the console without interrupting any current input. If the user has entered text at the prompt, the prompt and text will be seamlessly regenerated following the message. It is ideal for asynchronous messages that aren’t generated from the direct user input.

Receive Text from the Client

self.readline( prompt=TEXT )

Setting the prompt is important to recreate the user input following a writemessage interruption.

When requesting sensitive information from the user (such as a password), the input should not be shown, nor should it be written to or read from the command history. readline accepts two optional parameters to control this, echo and use_history.

self.readline( prompt=TEXT, echo=False, use_history=False )

Handler Options

Override these class members to change the handler’s behavior.

logging
Default: pass
PROMPT
Default: "Telnet Server> "
CONTINUE_PROMPT
Default: "... "
WELCOME
Displayed after a successful connection, after the username/password is accepted, if configured.

Default: "You have connected to the telnet server."

session_start(self)
Called after the WELCOME text is displayed.

Default: pass

session_end(self)
Called after the console is disconnected.

Default: pass

authCallback(self, username, password)
Reference to authentication function. If this is not defined, no username or password is requested. Should raise an exception if authentication fails

Default: None

authNeedUser
Should a username be requested?

Default: False

authNeedPass
Should a password be requested?

Default: False

Handler Display Modification

If you want to change how the output is displayed, override one or all of the write methods. Make sure you call back to the base class when doing so. This is a good way to provide color to your console by using ANSI color commands. See http://en.wikipedia.org/wiki/ANSI_escape_code

  • writemessage( TEXT )
  • writeresponse( TEXT )
  • writeerror( TEXT )

Serving the Handler

Now you have a shiny new handler class, but it doesn’t serve itself - it must be called from an appropriate server. The server will create an instance of the TelnetHandler class for each new connection. The handler class will work with either a gevent StreamServer instance (for the green version) or with a SocketServer.TCPServer instance (for the threaded version).

Threaded

Green

The TelnetHandler class includes a streamserver_handle class method to translate the required fields from a StreamServer, allowing use with the gevent StreamServer (and possibly others).

Short Example

SSH

If the paramiko library is installed, the TelnetHandler can be used via an SSH server for significantly improved security. paramiko_ssh contains SSHHandler and getRsaKeyFile to make setting up the server trivial. Since the authentication is done prior to invoking the TelnetHandler, any authCallback defined in the TelnetHandler is ignored.

Green

If using the green version of the TelnetHandler, you must use Gevent’s monkey patch_all prior to importing from paramiko_ssh.

Operation Overview

The SocketServer/StreamServer sets up the socket then passes that to an SSHHandler class which authenticates then starts the SSH transport. Within the SSH transport, the client requests a PTY channel (and possibly other channel types, which are denied) and the SSHHandler sets up a TelnetHandler class as the PTY for the channel. If the client never requests a PTY channel, the transport will disconnect after a timeout.

SSH Host Key

To thwart man-in-the-middle attacks, every SSH server provides an RSA key as a unique fingerprint. This unique key should never change, and should be stored in a local file or a database. The getRsaKeyFile function makes this easy by reading the given key file if it exists, or creating the key if it does not. The result should be read once and set in the class definition.

Easy way:

host_key = getRsaKeyFile( FILENAME )
If the FILENAME can be read, the RSA key is read in and returned as an RSAKey object. If the file can’t be read, it generates a new RSA key and stores it in that file.

Long way:

SSH Authentication

Users can authenticate with just a username, a username/publickey or a username/password. Up to three callbacks can be defined, and if all three are defined, all three will be tried before denying the authentication attempt. An SSH client will always provide a username. If no authCallbackXX is defined, the SSH authentication will be set to “none” and any username will be able to log in.

authCallbackUsername(self, username)
Reference to username-only authentication function. Define this function to permit specific usernames to log in without any further authentication. Raise any exception to deny this authentication attempt.

If defined, this is always tried first.

Default: None

authCallbackKey(self, username, key)
Reference to username/key authentication function. If this is defined, users can log in the SSH client automatically with a key. Raise any exception to deny this authentication attempt.

Default: None

authCallback(self, username, password)
Reference to username/password authentication function. If this is defined, a password is requested. Raise any exception to deny this authentication attempt.

If defined, this is always tried last.

Default: None

SSHHandler uses Paramiko’s ServerInterface as one of its base classes. If you are familiar with Paramiko, feel free to instead override the authentication callbacks as needed.

Short SSH Example

Longer Example

See https://github.com/ianepperson/telnetsrvlib/blob/master/test.py

References


telnetsrv 0.4

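Fixing the error caused by Chinese comments in Python

In Python 2, a script fails as soon as its source contains a Chinese comment, with a SyntaxError that reports a non-ASCII character and no declared encoding.

Cause

If a file contains non-ASCII characters, an encoding declaration is required on the first or second line.

Solution

Add an encoding declaration such as "# -*- coding: utf-8 -*-" on the first or second line.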
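A file that starts with the declaration might look like the sketch below. It assumes the file itself is saved as UTF-8; the comment text and the message variable are only there for illustration.

```python
# -*- coding: utf-8 -*-
# 这是一个中文注释 (a Chinese comment)
message = "你好"
print(message)
```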

This resolves the error.

References


Fixing the error caused by Chinese comments in Python

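asyncore, Python's asynchronous communication module

Python's asyncore module provides the basic infrastructure for writing asynchronous socket service clients and servers.

The module mainly consists of:

asyncore.loop(…) - runs the loop that polls for network events. loop() checks a dictionary that holds the dispatcher instances.

asyncore.dispatcher class - a thin wrapper around a low-level socket object. It has a small number of event-handling methods that are called from the asynchronous loop.

The dispatcher methods readable() and writable() are called on every pass of the loop. Each returns a bool that decides whether the socket is watched for incoming data or for being writable, and therefore whether handle_read or handle_write is called.
asyncore.dispatcher_with_send class - a subclass of dispatcher that adds simple buffered output, useful for simple clients.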



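Installing multiple Python versions on Ubuntu 14.04.5 (2.7.6/3.4.3/3.6.4)

System environment: Ubuntu 14.04.5 LTS, whose default Python versions are 2.7.6 and 3.4.3.

In practice, some packages installed through pip need a minimum Python version to work. For example, ansible-2.4.2.0 explicitly supports only Python 3.5 and later, and does not run properly on older versions.

Upgrading the system Python directly can break the system, so we use pyenv to give each user their own Python version.

Installing pyenv

List the Python versions that can be installed with pyenv install --list.

Taking python-3.6.4 as the example, first install the build dependencies.

Installing Python

Running pyenv install 3.6.4 downloads the Python source code, unpacks it under /tmp, and compiles it there. If a dependency is missing, the build fails; install the missing packages and run the command again.

After the installation finishes, run pyenv rehash to update pyenv's database.

Listing the installed Python versions

Run pyenv versions. In the output, the asterisk marks the version currently in use, which at this point is the one that ships with the system.

Setting the global Python version

Run pyenv global 3.6.4. Listing the versions again shows that the current Python version is now 3.6.4. You can also use pyenv local or pyenv shell to change the Python version temporarily.

To restore the version that ships with the system, run pyenv global system.

Confirming the Python version

After the steps above, both the python and python3 commands resolve to python-3.6.4 for the current user. If you only want python3 to resolve to python-3.6.4, delete the python link.

Installing pip

After the installation finishes, run pyenv rehash to update pyenv's database.

Notes:
  • Typing python runs the new Python version.
  • System scripts call the old Python directly as /usr/bin/python, so they are not affected.
  • Third-party modules installed with pip go under ~/.pyenv/versions/3.6.4 and do not conflict with system modules.
  • After installing a module with pip, you may need to run pyenv rehash to update the database.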



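Terminating a threading-based multithreaded script with Ctrl+C in Python

When writing Python scripts, you will sooner or later need multiple threads.

Normally we stop a script from the shell with Ctrl+C, but in a multithreaded script Ctrl+C often fails to stop the program.

Python's threading module offers no interface for making a thread exit, so there is no way to stop a thread that is already running, especially one that is blocked in the kernel.

The solution is to put each thread in daemon mode after creating it (setDaemon, or the daemon attribute) and to avoid blocking the main thread in join(). When the main thread exits, the other threads exit with it.

In this approach the main thread starts the daemon threads and then waits in a loop of short sleeps, so it can still receive Ctrl+C.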
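A minimal sketch of the approach is given here. The worker body, the thread count, and the run_seconds parameter (which lets the loop end on its own for demonstration) are assumptions of this sketch; a real script would call main() with no arguments and rely on Ctrl+C.

```python
import threading
import time

def worker(name, log):
    # Stand-in for a long-running task that never returns on its own.
    while True:
        log.append(name)
        time.sleep(0.01)

def start_workers(count, log):
    threads = []
    for i in range(count):
        t = threading.Thread(target=worker, args=("worker-%d" % i, log))
        t.daemon = True   # daemon threads do not keep the interpreter alive
        t.start()
        threads.append(t)
    return threads

def main(run_seconds=None):
    log = []
    threads = start_workers(3, log)
    deadline = None if run_seconds is None else time.time() + run_seconds
    try:
        # Poll with short sleeps instead of join(), so the main thread
        # stays responsive to Ctrl+C.
        while deadline is None or time.time() < deadline:
            time.sleep(0.05)
    except KeyboardInterrupt:
        print("Ctrl+C received, exiting")
    return threads
```

Calling main() keeps the script running until Ctrl+C is pressed; when the main thread returns and the interpreter exits, the daemon threads are discarded with it.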

References


Solving the problem of terminating a multithreaded program with Ctrl+C in Python

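A simple WebSocket implementation in Python

The goal is a simple chat room program: the server completes the WebSocket handshake with each browser and relays the messages it receives to the connected clients.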
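The protocol pieces such a server needs can be sketched with the standard library alone. This is not the original chat room code: it covers only the handshake key and short text frames as defined in RFC 6455, and the function names are chosen for this illustration. Socket handling and broadcasting are left out.

```python
import base64
import hashlib

GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed value from RFC 6455

def accept_key(client_key):
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

def encode_text_frame(text):
    """Build an unmasked server-to-client text frame (payload under 126 bytes)."""
    payload = text.encode("utf-8")
    if len(payload) >= 126:
        raise ValueError("this sketch only handles short payloads")
    return bytes([0x81, len(payload)]) + payload

def decode_text_frame(frame):
    """Decode a short masked client-to-server text frame."""
    length = frame[1] & 0x7F          # low 7 bits hold the payload length
    mask = frame[2:6]                 # 4-byte masking key
    masked = frame[6:6 + length]
    return bytes(b ^ mask[i % 4] for i, b in enumerate(masked)).decode("utf-8")

print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
print(encode_text_frame("Hello"))
```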

Test page:

References


A simple WebSocket implementation in Python

Displaying YUV images with Python on Ubuntu 16.04

YUV420p to RGB & view

UYVY/YUV422 to RGB and view:
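The two conversions named in the headings above can be sketched in pure Python. The sketch assumes full-range BT.601 coefficients and even image dimensions, which may differ from the original code; it stops at producing a list of RGB tuples and leaves the display step to an imaging library.

```python
def clamp(value):
    """Round to the nearest integer and limit to the 0..255 range."""
    return max(0, min(255, int(round(value))))

def yuv_to_rgb(y, u, v):
    """Convert one full-range BT.601 YUV pixel to an (R, G, B) tuple."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp(r), clamp(g), clamp(b)

def yuv420p_to_rgb(data, width, height):
    """Planar 4:2:0: a full Y plane, then quarter-size U and V planes."""
    y_size = width * height
    c_width = width // 2
    u_plane = data[y_size:y_size + y_size // 4]
    v_plane = data[y_size + y_size // 4:y_size + y_size // 2]
    pixels = []
    for row in range(height):
        for col in range(width):
            c = (row // 2) * c_width + (col // 2)  # one chroma sample per 2x2 block
            pixels.append(yuv_to_rgb(data[row * width + col], u_plane[c], v_plane[c]))
    return pixels

def uyvy_to_rgb(data):
    """Packed 4:2:2: every 4 bytes (U, Y0, V, Y1) describe two pixels."""
    pixels = []
    for i in range(0, len(data) - 3, 4):
        u, y0, v, y1 = data[i:i + 4]
        pixels.append(yuv_to_rgb(y0, u, v))
        pixels.append(yuv_to_rgb(y1, u, v))
    return pixels
```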

Original link


YUV to RGB : Python Imaging Library