Published 2023-10-05 · Updated 2023-10-05 · Notes

Several simple ways to deploy a Flask app with Docker

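Once a Flask app is written, the next step is deploying it, and deployment obviously shouldn't rely on Flask's built-in development server. I recently tried a few common deployment setups; all of them are simple to operate and easy to package as Docker images, so I'm writing them down here.

There are plenty of choices for the server that hosts Flask: you can use a standalone Python WSGI server, or deploy the Flask app on a general-purpose HTTP server through an adapter layer (a WSGI module or FastCGI).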
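All three setups below end up doing the same thing: calling a WSGI callable. As a rough illustration using only the standard library, here is what that interface looks like. The names `application` and `call_wsgi` are made up for this sketch; a Flask app object is itself such a callable, so gunicorn, mod_wsgi and flipflop can all drive it the same way.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable: takes the request environ, returns body chunks."""
    body = f"hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


def call_wsgi(app, path="/"):
    """Invoke a WSGI app roughly the way a server would; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required CGI/WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_wsgi(application, "/get_random"))
```

The servers differ mainly in how they receive the HTTP request and hand it to this callable, which is what the rest of the post compares.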

gunicorn

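gunicorn is a standalone WSGI server for Python that can serve a Flask app by itself, and it runs directly under pypy3. The runtime environment is built by installing the required Python libraries on top of a python3 or pypy3 image.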

pip install Flask gunicorn

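Server settings can be specified in a gunicorn config file. The port has to match the one the application is expected to be served on (8000 throughout this post).

# gunicorn_config.py

bind = "192.168.1.1:8000" # address and port to listen on; a specific IP like this only works if the container owns that address (e.g. host networking), otherwise use "0.0.0.0:8000"
workers = 3 # number of worker processes
accesslog = '-' # write the access log to stdout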
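Since the gunicorn config file is ordinary Python, the worker count can be computed instead of hard-coded. The sketch below follows the starting point suggested in gunicorn's documentation, (2 x cores) + 1; the helper name `suggested_workers` is made up here, and the value of 3 used in this post was simply picked by hand.

```python
import multiprocessing


def suggested_workers(cores=None):
    """Starting point from the gunicorn docs: (2 x number of cores) + 1."""
    if cores is None:
        cores = multiprocessing.cpu_count()
    return cores * 2 + 1


# In gunicorn_config.py this could replace the fixed value:
workers = suggested_workers()
```

This is only a starting value; for a database-bound app like the one tested later, more workers also means more database connections.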

Write it up as a Dockerfile that installs the environment and copies in the config and the app.

# Dockerfile

FROM pypy:3-7.3-slim

WORKDIR /app

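COPY requirements.txt .
# Only Flask and gunicorn are installed here; switch to "-r requirements.txt" if the app needs more packages
RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --no-cache-dir Flask gunicorn

ENV FLASK_DEBUG=False
ENV FLASK_APP=appv1

# COPY copies the contents of ./appv1 into /app, so the appv1 module must live inside that directory
COPY ./appv1 .
COPY ./gunicorn_config.py .

EXPOSE 8000
CMD ["gunicorn", "-c", "gunicorn_config.py", "appv1:create_app()"]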

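This approach is convenient: it runs the Flask app with a single command, much like Flask's development server, and it can use the faster pypy3 directly. However, it offers far fewer HTTP-level configuration options than a full HTTP server, and it is not well suited to serving static assets.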

apache2+mod_wsgi

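apache2 is a well-known HTTP server, and Flask can of course be deployed on it. Besides CGI, apache2 has the mod_wsgi module, which can host Flask directly.

To build this environment, start from an apache2 image, install mod_wsgi and python3 with the package manager, then install the Python libraries with pip. Alpine's mod_wsgi package depends on python3, so using pypy3 would mean compiling mod_wsgi yourself.

Commands needed to install the environment: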

apk add apache2-mod-wsgi
python3 -m ensurepip # install pip first
python3 -m pip install --no-cache-dir flask

To start the app, first create a .wsgi file.

# app.wsgi

import sys
sys.path.insert(0, '/app') # directory that contains the Flask app
from appv1 import create_app
application = create_app()

Then add the WSGI directives to httpd.conf to enable mod_wsgi.

# httpd.conf

...

LoadModule wsgi_module /usr/lib/apache2/mod_wsgi.so
WSGIDaemonProcess app user=www-data group=www-data threads=5
WSGIScriptAlias / /app/app.wsgi
<Directory "/app">
    WSGIProcessGroup app
    WSGIApplicationGroup %{GLOBAL}
    Require all granted
</Directory>

Write it up as a Dockerfile that installs the environment and puts the app, app.wsgi, and the modified httpd.conf into the image.

# Dockerfile

FROM httpd:alpine3.18

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apk/repositories && \
    apk add --no-cache apache2-mod-wsgi && \
    python3 -m ensurepip && \
    python3 -m pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --no-cache-dir flask && \
    python3 -m pip uninstall pip -y

WORKDIR /app

ENV FLASK_DEBUG=False
ENV FLASK_APP=appv1
COPY ./appv1 .
COPY ./app.wsgi .
COPY ./httpd.conf /usr/local/apache2/conf/httpd.conf

EXPOSE 8000
CMD ["httpd-foreground"]

This approach deploys the Flask app on the widely used apache2 server, whose HTTP configuration options and features are very complete.

lighttpd+fastcgi

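For HTTP servers other than apache2, the usual route is plain FastCGI or CGI. The HTTP server I use here is the lightweight lighttpd, together with the Python library flipflop (a simplified fork of flup) for FastCGI support.

This environment can be built on a python3 Alpine image by installing lighttpd and the Python libraries. Install commands: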

apk add lighttpd
python3 -m pip install flask flipflop

FastCGI needs a .fcgi file. Note that the file must name python3 as its interpreter in the shebang line.

# app.fcgi

#!/usr/local/bin/python3
import sys
sys.path.insert(0, '/app') # directory that contains the Flask app
from appv1 import create_app
app = create_app()
from flipflop import WSGIServer
if __name__ == '__main__':
    WSGIServer(app).run()

Then add the relevant settings to the lighttpd configuration file.

# lighttpd.conf

...

server.modules += ( "mod_fastcgi" )

fastcgi.server = ("/app" =>
    ((
        "socket" => "/tmp/app-fcgi.sock",
        "bin-path" => "/app/app.fcgi",
        "check-local" => "disable",
        "max-procs" => 1
    ))
)

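When lighttpd starts, it spawns the FastCGI process (the script given in bin-path), which listens on the socket. To handle a request, lighttpd talks to the Python application through that socket.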
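To make the socket hand-off a little more concrete, here is a toy sketch using only the standard library. It is not the FastCGI protocol: it just shows one side listening on a unix socket (the role the Python app plays) and the other side connecting to it and sending a request (the role lighttpd plays). All function names and the payload are made up for illustration.

```python
import os
import socket
import tempfile
import threading


def serve_once(sock_path, ready):
    """Play the app's role: listen on a unix socket and answer one request."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen(1)
        ready.set()  # signal that the socket is accepting connections
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"handled: " + data)


def send_request(sock_path, payload):
    """Play the web server's role: connect to the socket and forward a request."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall(payload)
        return client.recv(1024)


def demo():
    """Run both sides against a temporary socket file and return the reply."""
    with tempfile.TemporaryDirectory() as tmp:
        sock_path = os.path.join(tmp, "app-fcgi.sock")
        ready = threading.Event()
        worker = threading.Thread(target=serve_once, args=(sock_path, ready))
        worker.start()
        ready.wait()
        reply = send_request(sock_path, b"GET /get_random")
        worker.join()
        return reply


if __name__ == "__main__":
    print(demo())
```

If the listening side never comes up, the connecting side fails at `connect`, which is the kind of symptom to look for when the web server reports that it cannot reach the backend socket.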

Write it up as a Dockerfile that installs the environment and puts the app, app.fcgi, and the modified lighttpd.conf into the image.

# Dockerfile

FROM python:3.11-alpine3.18

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apk/repositories && \
    apk add --no-cache lighttpd && \
    python3 -m pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --no-cache-dir flask flipflop && \
    python3 -m pip uninstall pip -y

WORKDIR /app

ENV FLASK_DEBUG=False
ENV FLASK_APP=appv1
COPY ./appv1 .
COPY ./app.fcgi .
RUN chmod +x ./app.fcgi
COPY ./lighttpd.conf /etc/lighttpd/lighttpd.conf

EXPOSE 8000
CMD ["lighttpd", "-D", "-f", "/etc/lighttpd/lighttpd.conf"]

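I also tried pypy3 as the Python interpreter, but pypy3 seems to be incompatible with flipflop: when handling requests, lighttpd could not connect to the application's unix socket, and I did not find a fix.

This approach uses the lightweight lighttpd. It produces the smallest container image and uses the least memory.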

(Not very rigorous) performance comparison

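I ran a simple performance test with ApacheBench. The Flask app is one I wrote for practice; for each request, the tested route runs one query against a mysql database through SQLAlchemy and also returns a static image resource. The mysql service and the app container run on the same host. The SQLAlchemy connection pool size was left at its default.

I don't really know what real-world load would look like, so the parameters for ab and the servers were picked arbitrarily: a concurrency of 200 and 10000 requests, run against each of the three deployments. Each used 3 CPU cores; many other factors were not controlled, so treat the numbers as a rough reference only.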

ab -n 10000 -c 200 http://192.168.1.1:8000/get_random

gunicorn(+pypy3):

Document Path: /get_random
Document Length: 1052 bytes

Concurrency Level: 200
Time taken for tests: 38.380 seconds
Complete requests: 10000
Failed requests: 9832
(Connect: 0, Receive: 0, Length: 9832, Exceptions: 0)
Total transferred: 11995771 bytes
HTML transferred: 10307332 bytes
Requests per second: 260.55 [#/sec] (mean)
Time per request: 767.603 [ms] (mean)
Time per request: 3.838 [ms] (mean, across all concurrent requests)
Transfer rate: 305.23 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   3.0      3      85
Processing:   208  752 176.7    722    1701
Waiting:       30  591 237.4    598    1701
Total:        212  755 176.9    726    1705

Percentage of the requests served within a certain time (ms)
50% 726
66% 761
75% 783
80% 806
90% 885
95% 968
98% 1431
99% 1658
100% 1705 (longest request)

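Here gunicorn used the sync worker class with 3 workers, one thread each. The number of mysql connections stayed at 3 throughout the run, one per worker. The "Failed requests" are all Length mismatches: ab counts a response as failed when its size differs from the first response, presumably because this route returns random content, so they are not real errors.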
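For anyone reading ab output for the first time, the three headline figures follow directly from the request count, the concurrency and the total time. A small sketch (the function name `ab_summary` is made up) that recomputes them from the gunicorn run above:

```python
def ab_summary(total_requests, concurrency, seconds):
    """Recompute ab's headline figures from its raw inputs.

    Returns (requests per second,
             mean time per request in ms,
             mean time per request across all concurrent requests in ms).
    """
    requests_per_second = total_requests / seconds
    per_request_ms = concurrency * seconds / total_requests * 1000
    per_request_across_ms = seconds / total_requests * 1000
    return requests_per_second, per_request_ms, per_request_across_ms


if __name__ == "__main__":
    # Figures from the gunicorn run: 10000 requests, concurrency 200, 38.380 s
    rps, mean_ms, across_ms = ab_summary(10000, 200, 38.380)
    print(f"{rps:.2f} req/s, {mean_ms:.1f} ms, {across_ms:.3f} ms")
```

The same arithmetic applies to the other two runs, so the requests-per-second column in the summary table is just 10000 divided by each run's total time.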

apache2+mod_wsgi(+python3.11):

Document Path: /app/get_random
Document Length: 993 bytes

Concurrency Level: 200
Time taken for tests: 40.452 seconds
Complete requests: 10000
Failed requests: 9817
(Connect: 0, Receive: 0, Length: 9817, Exceptions: 0)
Total transferred: 12534432 bytes
HTML transferred: 10316043 bytes
Requests per second: 247.21 [#/sec] (mean)
Time per request: 809.034 [ms] (mean)
Time per request: 4.045 [ms] (mean, across all concurrent requests)
Transfer rate: 302.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    4   4.4      3     129
Processing:   350  789 568.6    667   12404
Waiting:       15  697 581.4    554   12404
Total:        354  793 568.6    670   12410

Percentage of the requests served within a certain time (ms)
50% 670
66% 779
75% 865
80% 927
90% 1120
95% 1469
98% 2114
99% 3429
100% 12410 (longest request)

Here the WSGI daemon process was configured with 5 threads. The number of mysql connections stayed at 5 throughout the run, presumably one per thread.

lighttpd+fastcgi(+python3.11):

Concurrency Level: 200
Time taken for tests: 91.466 seconds
Complete requests: 10000
Failed requests: 9760
(Connect: 0, Receive: 0, Length: 9760, Exceptions: 0)
Total transferred: 11995957 bytes
HTML transferred: 10307582 bytes
Requests per second: 109.33 [#/sec] (mean)
Time per request: 1829.317 [ms] (mean)
Time per request: 9.147 [ms] (mean, across all concurrent requests)
Transfer rate: 128.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    4   4.0      3     110
Processing:    55 1800 735.6   1885    6402
Waiting:       13 1792 740.8   1881    6402
Total:         56 1804 735.5   1889    6405

Percentage of the requests served within a certain time (ms)
50% 1889
66% 2033
75% 2098
80% 2134
90% 2266
95% 3098
98% 3682
99% 3948
100% 6405 (longest request)

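Here FastCGI used its default configuration. Yet the app held as many as 15 mysql connections during the run, all of them constantly in the exec state. The lighttpd and WSGI processes used only about 0.5 of a core; most of the load was generated by the mysql service.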

Server	Throughput	Memory usage
gunicorn	260.55 req/s	250.3 MB
apache2	247.21 req/s	79.2 MB
lighttpd	109.33 req/s	59.1 MB

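These results show gunicorn, the standalone WSGI server, performing best, about 5% ahead of apache2 in this run. apache2 with the WSGI module was slightly ahead of gunicorn under low load (those runs are not shown here) but fell behind under high load. lighttpd with FastCGI did much worse. The bottleneck here was the database service, though, probably because the app opened too many database connections; I don't know why it opened 15 of them, and my own configuration may be at fault. On the other hand, lighttpd had the smallest image and the lowest memory use, which makes it the friendliest option where memory is concerned.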
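One possible explanation for the connection counts, offered as a guess rather than a diagnosis: SQLAlchemy's default QueuePool allows 5 pooled connections plus 10 overflow connections, so 15 per process. If each process only opens as many connections as it has request threads, capped by that limit, the three observed numbers fall out. The sketch below assumes those defaults, assumes one engine per process, and assumes flipflop serves requests in many threads inside a single process; the function name is made up.

```python
def expected_connections(processes, threads_per_process, pool_size=5, max_overflow=10):
    """Estimate open DB connections under load.

    Assumes one SQLAlchemy engine per process with the default QueuePool
    limits (pool_size=5, max_overflow=10) and one connection per busy thread.
    """
    per_process_cap = pool_size + max_overflow
    return processes * min(threads_per_process, per_process_cap)


if __name__ == "__main__":
    # gunicorn: 3 sync workers, 1 thread each
    print("gunicorn:", expected_connections(3, 1))
    # mod_wsgi: 1 daemon process, threads=5
    print("apache2:", expected_connections(1, 5))
    # flipflop: 1 process, assumed to use a thread per in-flight request (200 here)
    print("lighttpd:", expected_connections(1, 200))
```

If this guess is right, passing smaller `pool_size` and `max_overflow` values to SQLAlchemy's `create_engine` in the FastCGI setup would be the first thing to try before concluding that lighttpd itself is slower.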

#Python Docker
