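Python WSGI Tutorial 4: GET Requests

Run the program from the previous tutorial, then open a URL such as http://localhost:8051/?a=10&b=w&b=r in a browser.

The environ dictionary holds the request information in REQUEST_METHOD and QUERY_STRING. Everything after the question mark is the value of the query string for this request. You can write your own function to parse it, or use the parse_qs function from the cgi module directly; it returns a dictionary whose values are lists.

Here is a simple example: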

#!/usr/bin/env python

from wsgiref.simple_server import make_server
from cgi import parse_qs, escape

html = """
<html>
<body>
<form method="get" action="">
<p>
Age: <input type="text" name="age">
</p>
<p>
Hobbies:
<input name="hobbies" type="checkbox" value="software"> Software
<input name="hobbies" type="checkbox" value="tuning"> Auto Tuning
</p>
<p>
<input type="submit" value="Submit">
</p>
</form>
<p>
Age: %s<br>
Hobbies: %s
</p>
</body>
</html>"""

def application(environ, start_response):

    # Returns a dictionary containing lists as values.
    d = parse_qs(environ['QUERY_STRING'])

    # In this idiom you must issue a list containing a default value.
    age = d.get('age', [''])[0] # Returns the first age value.
    hobbies = d.get('hobbies', []) # Returns a list of hobbies.

    # Always escape user input to avoid script injection
    age = escape(age)
    hobbies = [escape(hobby) for hobby in hobbies]

    response_body = html % (age or 'Empty',
                            ', '.join(hobbies or ['No Hobbies']))

    status = '200 OK'

    # Now content type is text/html
    response_headers = [('Content-Type', 'text/html'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]

httpd = make_server('localhost', 8051, application)
# Now it is serve_forever() instead of handle_request().
# In Windows you can kill it in the Task Manager (python.exe).
# In Linux a Ctrl-C will do it.
httpd.serve_forever()
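
To see what parse_qs returns for the URL mentioned at the start of this tutorial, here is a small sketch. It assumes Python 3, where the function lives in urllib.parse rather than in the cgi module used by the tutorial code; the variable names are illustrative only.

```python
from urllib.parse import parse_qs

# Query string taken from http://localhost:8051/?a=10&b=w&b=r
query_string = "a=10&b=w&b=r"

# Every value is a list, because a field name may appear more than once.
params = parse_qs(query_string)

first_a = params.get("a", [""])[0]   # first value of "a", or "" if missing
all_b = params.get("b", [])          # every value of "b"
missing = params.get("c", [""])[0]   # an absent field falls back to the default

print(params)
print(first_a, all_b, repr(missing))
```

The list-with-default idiom is the same one the application above uses for age and hobbies.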

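Python WSGI Tutorial 3: The Response

Take the return statement from the previous example:

return [response_body]

and change it to:

return response_body

Run it again and you will notice that it is slower. This is because the server iterates over the returned string one byte at a time, so it is best to wrap the returned string in an iterable such as a list.

If the returned iterable yields more than one string, the length of the body is the sum of the lengths of all those strings.

Here is an example: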

#! /usr/bin/env python

from wsgiref.simple_server import make_server

def application(environ, start_response):

    response_body = ['%s: %s' % (key, value)
                     for key, value in sorted(environ.items())]
    response_body = '\n'.join(response_body)

    # response_body now holds more than one string
    response_body = ['The Beginning\n',
                     '*' * 30 + '\n',
                     response_body,
                     '\n' + '*' * 30,
                     '\nThe End']

    # So the Content-Length is the sum of all the strings' lengths
    content_length = 0
    for s in response_body:
        content_length += len(s)

    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(content_length))]
    start_response(status, response_headers)

    return response_body

httpd = make_server('localhost', 8051, application)
httpd.handle_request()
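
The two points made in this tutorial (wrap the body in an iterable, and sum the chunk lengths for Content-Length) can be checked without a server. The following is a minimal sketch assuming Python 3, where a WSGI body is made of bytes objects; the chunk values are made up for illustration.

```python
# Iterating over a bare string yields one character at a time,
# which is why an unwrapped body is sent piece by piece.
body = "Hello"
pieces_unwrapped = list(body)      # five one-character strings
pieces_wrapped = list([body])      # a single chunk

# With several chunks, Content-Length is the sum of their lengths.
chunks = [b"The Beginning\n", b"*" * 30 + b"\n", b"The End"]
content_length = sum(len(chunk) for chunk in chunks)

print(pieces_unwrapped, pieces_wrapped, content_length)
```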

Python WSGI Tutorial 2: The Environment Dictionary

The environ dictionary contains CGI-like variables. The server fills it in for every request.

#! /usr/bin/env python

# Our tutorial's WSGI server
from wsgiref.simple_server import make_server

def application(environ, start_response):

    # Sorting and stringifying the environment key, value pairs
    response_body = ['%s: %s' % (key, value)
                     for key, value in sorted(environ.items())]
    response_body = '\n'.join(response_body)

    status = '200 OK'
    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]
    start_response(status, response_headers)

    return [response_body]

# Instantiate the WSGI server.
# It will receive the request, pass it to the application
# and send the application's response to the client
httpd = make_server(
    'localhost', # The host name.
    8051, # A port number where to wait for the request.
    application # Our application object name, in this case a function.
    )

# Wait for a single request, serve it and quit.
httpd.handle_request()

Run the script, then open http://localhost:8051/ in a browser to see the result.

This example prints every key and value in the environ dictionary.
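
To inspect a typical environ dictionary without starting a server, the standard library's wsgiref.util.setup_testing_defaults can fill one in with placeholder values. This is a sketch for experimentation, assuming Python 3; the values it produces are test defaults, not those of a real request.

```python
from wsgiref.util import setup_testing_defaults

# Start from an empty dict and let wsgiref add the keys a server would supply.
environ = {}
setup_testing_defaults(environ)

# Print the key/value pairs sorted by key, as the tutorial script does.
lines = ["%s: %s" % (key, value) for key, value in sorted(environ.items())]
print("\n".join(lines))
```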

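Python WSGI Tutorial 1: Introduction

WSGI (Web Server Gateway Interface) is not a server but a specification. It was originally defined for Python, and similar interfaces now exist for other languages. The details are in PEP 3333: http://www.python.org/dev/peps/pep-3333/

A WSGI application is a callable object. It must accept two positional arguments: a dictionary containing CGI-like variables, and a callable used to send the HTTP status code and response headers.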

# This is our application object. It could have any name,
# except when using mod_wsgi where it must be "application"
def application( # It accepts two arguments:
        # environ points to a dictionary containing CGI like environment variables
        # which is filled by the server for each received request from the client
        environ,
        # start_response is a callback function supplied by the server
        # which will be used to send the HTTP status and headers to the server
        start_response):

    # build the response body possibly using the environ dictionary
    response_body = 'The request method was %s' % environ['REQUEST_METHOD']

    # HTTP response code and message
    status = '200 OK'

    # These are HTTP headers expected by the client.
    # They must be wrapped as a list of tupled pairs:
    # [(Header name, Header value)].
    response_headers = [('Content-Type', 'text/plain'),
                        ('Content-Length', str(len(response_body)))]

    # Send them to the server using the supplied function
    start_response(status, response_headers)

    # Return the response body.
    # Notice it is wrapped in a list although it could be any iterable.
    return [response_body]

The above is the basic skeleton of an application. Because there is no server yet, this code cannot serve requests on its own.
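
Even without a server, the calling convention can be exercised by hand. The sketch below assumes Python 3 (so the body is encoded to bytes) and uses a hypothetical fake_start_response helper and a hand-built environ; neither is part of WSGI itself.

```python
def application(environ, start_response):
    # Same skeleton as the tutorial code, with the body encoded for Python 3.
    text = "The request method was %s" % environ["REQUEST_METHOD"]
    response_body = text.encode("utf-8")
    status = "200 OK"
    response_headers = [("Content-Type", "text/plain"),
                        ("Content-Length", str(len(response_body)))]
    start_response(status, response_headers)
    return [response_body]

captured = {}

def fake_start_response(status, headers):
    # Hypothetical stand-in for the callback a real server would supply.
    captured["status"] = status
    captured["headers"] = headers

# A hand-built environ containing only the key this application reads.
body = b"".join(application({"REQUEST_METHOD": "GET"}, fake_start_response))
print(captured["status"], body)
```

A real server does the same two things: it passes in a callback to receive the status and headers, then iterates over the returned body.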

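Python CGI Tutorial 6: Sessions

A session is the server-side counterpart of a cookie. It is stored in a file or a database on the server. Each session is identified by a session id (SID).

Cookie-based SID

A cookie can keep the SID for a long time, until the cookie expires. This approach is faster and more secure, but it only works if the client's browser supports cookies.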

#!/usr/bin/env python

import sha, time, Cookie, os

cookie = Cookie.SimpleCookie()
string_cookie = os.environ.get('HTTP_COOKIE')

# If new session
if not string_cookie:
    # The sid will be a hash of the server time
    sid = sha.new(repr(time.time())).hexdigest()
    # Set the sid in the cookie
    cookie['sid'] = sid
    # Will expire in a year
    cookie['sid']['expires'] = 12 * 30 * 24 * 60 * 60
# If already existent session
else:
    cookie.load(string_cookie)
    sid = cookie['sid'].value

print cookie
print 'Content-Type: text/html\n'
print '<html><body>'

if string_cookie:
    print '<p>Already existent session</p>'
else:
    print '<p>New session</p>'

print '<p>SID =', sid, '</p>'
print '</body></html>'

Here the server time is hashed to generate a unique session ID.
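
The same hashing step can be written for Python 3, where the sha module no longer exists and hashlib takes its place. This is a sketch under that assumption, and the function names are illustrative. Because a timestamp is guessable, the sketch also shows secrets.token_hex from the standard library as an alternative source of identifiers.

```python
import hashlib
import secrets
import time

def sid_from_time():
    # Hash the server time, as in the tutorial; hashlib needs bytes input.
    return hashlib.sha1(repr(time.time()).encode("ascii")).hexdigest()

def sid_from_random():
    # Alternative: 20 random bytes rendered as 40 hexadecimal characters.
    return secrets.token_hex(20)

print(sid_from_time())
print(sid_from_random())
```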


Query-string-based SID



#!/usr/bin/env python

import sha, time, cgi, os

sid = cgi.FieldStorage().getfirst('sid')

if sid: # If session exists
    message = 'Already existent session'
else: # New session
    # The sid will be a hash of the server time
    sid = sha.new(repr(time.time())).hexdigest()
    message = 'New session'

qs = 'sid=' + sid

# The reload link points back to this script with the sid in the query string
print """\
Content-Type: text/html\n
<html><body>
<p>%s</p>
<p>SID = %s</p>
<p><a href="?sid=%s">reload</a></p>
</body></html>
""" % (message, sid, sid)

This passes the sid directly in the URL.

Using a hidden field



#!/usr/bin/env python

import sha, time, cgi, os

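sid = cgi.FieldStorage().getfirst('sid')

if sid: # If session exists
    message = 'Already existent session'
else: # New session
    # The sid will be a hash of the server time
    sid = sha.new(repr(time.time())).hexdigest()
    message = 'New session'

qs = 'sid=' + sid

# The form posts back to this script, carrying the sid in a hidden field
print """\
Content-Type: text/html\n
<html><body>
<p>%s</p>
<p>SID = %s</p>
<form method="post">
<input type="hidden" name="sid" value="%s">
<input type="submit" value="Submit">
</form>
</body></html>
""" % (message, sid, sid)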

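This submits the sid as a hidden form field.

The shelve module

A session id alone is not enough; the session contents must also be saved to a file or a database. The shelve module can be used to save them to a file.

session = shelve.open('/tmp/.session/sess_' + sid, writeback=True)

This opens the file and returns a dictionary-like object.

session['lastvisit'] = repr(time.time())

This sets a value in the session.

lastvisit = session.get('lastvisit')

This reads back the value that was just set.

session.close()

Finally, remember to close the file when you are done.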
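
The four steps above can be put together in one short script. This is a sketch assuming Python 3; the session id is a made-up value, and a temporary directory stands in for /tmp/.session so the script runs anywhere.

```python
import os
import shelve
import tempfile
import time

# Hypothetical session id and a temporary directory standing in for /tmp/.session.
sid = "example-sid"
session_dir = tempfile.mkdtemp()
path = os.path.join(session_dir, "sess_" + sid)

# First request: store a value and close the shelf so it is written to disk.
session = shelve.open(path, writeback=True)
session["lastvisit"] = repr(time.time())
session.close()

# Later request: reopen the same shelf and read the value back.
session = shelve.open(path, writeback=True)
lastvisit = session.get("lastvisit")
session.close()

print(lastvisit)
```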


Cookie and shelve

The next example shows Cookie and shelve used together.

#!/usr/bin/env python
import sha, time, Cookie, os, shelve

cookie = Cookie.SimpleCookie()
string_cookie = os.environ.get('HTTP_COOKIE')

if not string_cookie:
   sid = sha.new(repr(time.time())).hexdigest()
   cookie['sid'] = sid
   message = 'New session'
else:
   cookie.load(string_cookie)
   sid = cookie['sid'].value
   # Default message, replaced below if a previous visit is recorded
   message = 'Already existent session'
cookie['sid']['expires'] = 12 * 30 * 24 * 60 * 60

# The shelve module will persist the session data
# and expose it as a dictionary
session_dir = os.environ['DOCUMENT_ROOT'] + '/tmp/.session'
session = shelve.open(session_dir + '/sess_' + sid, writeback=True)

# Retrieve last visit time from the session
lastvisit = session.get('lastvisit')
if lastvisit:
   message = 'Welcome back. Your last visit was at ' + \
      time.asctime(time.gmtime(float(lastvisit)))
# Save the current time in the session
session['lastvisit'] = repr(time.time())
# Close the shelf so the data is written to disk
session.close()

print """\
%s
Content-Type: text/html\n
<html><body>
<p>%s</p>
<p>SID = %s</p>
</body></html>
""" % (cookie, message, sid)

Python CGI Tutorial 5: Cookies

Setting a cookie

There are two cookie-related operations: setting a cookie and reading a cookie.

The following example shows how to set a cookie.

#!/usr/bin/env python
import time

# This is the message that contains the cookie
# and will be sent in the HTTP header to the client
print 'Set-Cookie: lastvisit=' + str(time.time())

# To save one line of code
# we replaced the print command with a '\n'
print 'Content-Type: text/html\n'
# End of HTTP header

print '<html><body>'
print 'Server time is', time.asctime(time.localtime())
print '</body></html>'

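This is done with a Set-Cookie line in the HTTP header.

Retrieving a cookie

The cookie sent back by the browser is stored in the os.environ dictionary under the key 'HTTP_COOKIE'. Here is an example: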

#!/usr/bin/env python
import Cookie, os, time

cookie = Cookie.SimpleCookie()
cookie['lastvisit'] = str(time.time())

print cookie
print 'Content-Type: text/html\n'

print '<html><body>'
print '<p>Server time is', time.asctime(time.localtime()), '</p>'

# The returned cookie is available in the os.environ dictionary
cookie_string = os.environ.get('HTTP_COOKIE')

# The first time the page is run there will be no cookies
if not cookie_string:
    print '<p>First visit or cookies disabled</p>'

else: # Run the page twice to retrieve the cookie
    print '<p>The returned cookie string was "' + cookie_string + '"</p>'

    # load() parses the cookie string
    cookie.load(cookie_string)
    # Use the value attribute of the cookie to get it
    lastvisit = float(cookie['lastvisit'].value)

    print '<p>Your last visit was at',
    print time.asctime(time.localtime(lastvisit)), '</p>'

print '</body></html>'

The load() method of the SimpleCookie object parses the cookie string.
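
The set-then-load round trip can be tried outside a CGI environment. The sketch below assumes Python 3, where the Cookie module is named http.cookies; the timestamp is a made-up value standing in for what a browser would send back in HTTP_COOKIE.

```python
from http.cookies import SimpleCookie

# Build the cookie the server would send.
outgoing = SimpleCookie()
outgoing["lastvisit"] = "1234567890.5"
header_line = outgoing.output()   # a Set-Cookie header line

# Parse the string a browser would send back in HTTP_COOKIE.
incoming = SimpleCookie()
incoming.load("lastvisit=1234567890.5")
lastvisit = float(incoming["lastvisit"].value)

print(header_line)
print(lastvisit)
```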

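Python CGI Tutorial 4: Executing Shell Commands

A CGI script can run shell commands with subprocess.Popen or os.popen4 (os.popen4 has been deprecated since Python 2.6 in favor of subprocess).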

#!/usr/bin/python
import cgitb; cgitb.enable()

# The subprocess module is new in 2.4
import os, urllib, subprocess as sub

# Retrieve the command from the query string
# and unencode the escaped %xx chars
str_command = urllib.unquote(os.environ['QUERY_STRING'])

p = sub.Popen(['/bin/bash', '-c', str_command],
              stdout=sub.PIPE, stderr=sub.STDOUT)
output = urllib.unquote(p.stdout.read())

print """\
Content-Type: text/html\n
<html><body>
<pre>
$ %s
%s
</pre>
</body></html>
""" % (str_command, output)

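Note: this is only an example. Using it this way in production is very insecure, because anyone can run arbitrary commands.

Cookies and sessions can be used to authenticate users and improve security.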
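
One way to reduce the risk, independent of authentication, is to accept only a command name and look it up in a fixed table rather than passing the query string to a shell. The following is a sketch of that idea assuming Python 3; ALLOWED_COMMANDS and run_allowed are hypothetical names, and the commands use the current Python interpreter only so the example runs on any system.

```python
import subprocess
import sys

# Hypothetical allow-list: maps a short name to a fixed argument list.
ALLOWED_COMMANDS = {
    "version": [sys.executable, "--version"],
    "hello": [sys.executable, "-c", "print('hello')"],
}

def run_allowed(name):
    """Run one allow-listed command and return its combined output as text."""
    argv = ALLOWED_COMMANDS.get(name)
    if argv is None:
        return "command not allowed: %s" % name
    # No shell is involved: the argument list is passed to the program as is.
    result = subprocess.run(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout

print(run_allowed("hello"))
print(run_allowed("rm -rf /"))
```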

Python CGI Tutorial 3: Forms

The cgi module provides a FieldStorage class for processing forms.

Single field name

Consider the following HTML form:

<html><body>
<form method="get" action="/cgi-bin/form1.py">
Name: <input type="text" name="name">
<input type="submit" value="Submit">
</form>
</body></html>

form1.py contains:

#!/usr/bin/env python
import cgi
form = cgi.FieldStorage() # instantiate only once!
name = form.getfirst('name', 'empty')

# Avoid script injection escaping the user input
name = cgi.escape(name)

print """\
Content-Type: text/html\n
<html><body>
<p>The submitted name was "%s"</p>
</body></html>
""" % name

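The getfirst method returns the first value of the given field. If the field does not exist, it returns the supplied default (here 'empty'), or None when no default is given. It works the same way if the form method is changed to post.

To guard against dangerous input from users, use cgi.escape() to escape the content.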
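
To see what escaping does to a malicious value, here is a small sketch. It assumes Python 3, where html.escape is the standard-library replacement for cgi.escape; the input string is a made-up example.

```python
import html

user_input = '<script>alert("x")</script>'

# Replace &, < and > (and, by default, quotes) with HTML entities.
safe = html.escape(user_input)

print(safe)
```

The browser then displays the text literally and does not run it as a script.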


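Multiple fields with the same name

When several fields share the same name, use the getlist() method, which returns a list containing all of their values.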

<html><body>
<form method="post" action="/cgi-bin/form2.py">
Red<input type="checkbox" name="color" value="red">
Green<input type="checkbox" name="color" value="green">
<input type="submit" value="Submit">
</form>
</body></html>

form2.py contains:

#!/usr/bin/env python
import cgi
form = cgi.FieldStorage()

# getlist() returns a list containing the
# values of the fields with the given name
colors = form.getlist('color')

print "Content-Type: text/html\n"
print '<html><body>'
print 'The colors list:', colors
for color in colors:
   print '<p>', cgi.escape(color), '</p>'
print '</body></html>'

File upload

<html><body>
<form enctype="multipart/form-data" action="/cgi-bin/form3.py" method="post">
<p>File: <input type="file" name="file"></p>
<p><input type="submit" value="Upload"></p>
</form>
</body></html>

getfirst() and getlist() only return the contents of the file. To get the file name, access the nested FieldStorage instance for the field.

form3.py contains:

#!/usr/bin/env python
import cgi, os
import cgitb; cgitb.enable()

try: # Windows needs stdio set for binary mode.
    import msvcrt
    msvcrt.setmode (0, os.O_BINARY) # stdin  = 0
    msvcrt.setmode (1, os.O_BINARY) # stdout = 1
except ImportError:
    pass

form = cgi.FieldStorage()

# A nested FieldStorage instance holds the file
fileitem = form['file']

# Test if the file was uploaded
if fileitem.filename:

   # strip leading path from file name to avoid directory traversal attacks
   fn = os.path.basename(fileitem.filename)
   open('files/' + fn, 'wb').write(fileitem.file.read())
   message = 'The file "' + fn + '" was uploaded successfully'

else:
   message = 'No file was uploaded'

print """\
Content-Type: text/html\n
<html><body>
<p>%s</p>
</body></html>
""" % (message,)

Large file upload

When handling a large file with limited memory, a generator can be used to split the file into small chunks.
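
The chunking idea can be tried on its own before wiring it into a CGI script. This sketch assumes Python 3 and uses in-memory io.BytesIO objects as stand-ins for the uploaded file and the destination file; the 25000-byte payload is made up.

```python
import io

def fbuffer(f, chunk_size=10000):
    # Yield successive chunks until read() returns an empty bytes object.
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        yield chunk

# In-memory stand-ins for the uploaded file and the destination file.
source = io.BytesIO(b"x" * 25000)
destination = io.BytesIO()

chunk_sizes = []
for chunk in fbuffer(source):
    chunk_sizes.append(len(chunk))
    destination.write(chunk)

print(chunk_sizes)
```

Only one chunk is held in memory at a time, whatever the size of the file.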

The previous script can be rewritten as follows:

#!/usr/bin/env python
import cgi, os
import cgitb; cgitb.enable()

try: # Windows needs stdio set for binary mode.
    import msvcrt
    msvcrt.setmode (0, os.O_BINARY) # stdin  = 0
    msvcrt.setmode (1, os.O_BINARY) # stdout = 1
except ImportError:
    pass

form = cgi.FieldStorage()

# Generator to buffer file chunks
def fbuffer(f, chunk_size=10000):
   while True:
      chunk = f.read(chunk_size)
      if not chunk: break
      yield chunk

# A nested FieldStorage instance holds the file
fileitem = form['file']

# Test if the file was uploaded
if fileitem.filename:

   # strip leading path from file name to avoid directory traversal attacks
   fn = os.path.basename(fileitem.filename)
   f = open('files/' + fn, 'wb', 10000)

   # Read the file in chunks
   for chunk in fbuffer(fileitem.file):
      f.write(chunk)
   f.close()
   message = 'The file "' + fn + '" was uploaded successfully'

else:
   message = 'No file was uploaded'

print """\
Content-Type: text/html\n
<html><body>
<p>%s</p>
</body></html>
""" % (message,)

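Python CGI Tutorial 2: Debugging

Errors in the HTTP header can be hard to track down unless you have access to the server logs.

Fortunately, Python has the cgitb module, which puts the exception traceback into the response body as HTML.

Here is a simple example: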

#!/usr/bin/env python
print "Content-Type: text/html\n"
import cgitb; cgitb.enable()
print 1/0

You can also call the handler() function inside an except block to report an exception you have caught.

#!/usr/bin/env python
print "Content-Type: text/html"
print
import cgitb
try:
    f = open('non-existent-file.txt', 'r')
except:
    cgitb.handler()

A more direct approach is to set the content type to "text/plain" and redirect standard error to standard output.

print "Content-Type: text/plain"
print
import sys
sys.stderr = sys.stdout
f = open('non-existent-file.txt', 'r')

Note: these techniques are for development only. Disable them in production so that attackers cannot make use of the exception details.
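
The plain-text approach can also be written explicitly with the traceback module, which makes it easy to see what ends up in the response. This is a sketch assuming Python 3; the render_error helper is hypothetical and, like the techniques above, meant for development only.

```python
import traceback

def render_error():
    # Build a plain-text CGI response containing the current traceback.
    return "Content-Type: text/plain\n\n" + traceback.format_exc()

try:
    f = open("non-existent-file.txt", "r")
except IOError:
    response = render_error()
    print(response)
```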

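Python CGI Tutorial 1: Hello World

Introduction

CGI stands for Common Gateway Interface. It is a standard for passing data between a web server and the programs it runs on behalf of a client.

A CGI program can be written in any language. It is usually placed in the cgi-bin directory of the web server (such as Apache).

Example

Let's look at a simple example.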

#!/usr/bin/env python
print "Content-Type: text/html"
print
print """\
<html>
<body>
<h2>Hello World!</h2>
</body>
</html>
"""

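The first line of the script specifies the path to the Python interpreter. On your system it may instead be one of:

#!/usr/bin/python
#!/usr/bin/python2
#!c:\Python26\python.exe
#!c:\Python27\python.exe

print "Content-Type: text/html"
print

The script must output an HTTP header made up of one or more header lines, followed by a blank line. The blank line is required: it marks the end of the header.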

Here we want the output to be interpreted as HTML, so Content-Type is set to text/html.

This can also be written as:

print "Content-Type: text/html\n"

Save the script and give it execute permission. Then request it from a browser; you should see the words "Hello World".
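
The header, blank line, body layout described in this tutorial can be made explicit by building the response as a single string. This is a sketch assuming Python 3 (print is a function there); build_response is an illustrative helper, not part of any CGI library.

```python
def build_response(body_html):
    # Header lines first, then a blank line, then the body.
    header = "Content-Type: text/html\n"
    return header + "\n" + body_html

page = build_response("<html>\n<body>\n<h2>Hello World!</h2>\n</body>\n</html>\n")
print(page, end="")
```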