Building a Stable Diffusion 1.5 WebUI with Docker

Blind Holmes
v0.0.1


Setup log

Pull the official NVIDIA CUDA image

https://hub.docker.com/r/nvidia/cuda/tags

docker pull nvidia/cuda:12.2.0-devel-ubuntu20.04

Create and start a container

docker run -id --name sd nvidia/cuda:12.2.0-devel-ubuntu20.04 /bin/bash

Attach to the container

docker exec -it sd /bin/bash

Install the required packages

apt update; apt install wget git python3 python3-venv python3-pip sudo libgoogle-perftools4 libtcmalloc-minimal4 vim net-tools

Create a user

adduser user
adduser user sudo

Attach to the container as the new user

docker exec -it --user 1000 sd /bin/bash

Create the SD directory

sudo mkdir /opt/stable_diffusion_webui;
sudo chown user:user /opt/stable_diffusion_webui;

Install the SD WebUI

Following: https://github.com/AUTOMATIC1111/stable-diffusion-webui

cd /opt/stable_diffusion_webui;
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh);

Running it hit an error

RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

So I ran the Python launcher directly

python3 launch.py

After it installed some dependencies, the same error came back

RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

Following the hint, I added the flag and ran again

python3 launch.py --skip-torch-cuda-test

It then went on downloading and installing the required dependencies

My guess at the cause: this is a laptop with an AMD R9 7945HX (integrated graphics) plus an RTX 4070, an iGPU + dGPU combo. Even with dGPU-direct mode enabled, device detection may still go wrong, e.g. the AMD iGPU gets picked up and the CUDA test fails. Whether Torch itself is at fault is unclear at this point.
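
With hindsight, a quicker sanity check would have been to ask whether the container sees an NVIDIA device at all (had I run it here it would have failed, since this container was started without GPU access):

nvidia-smi   # inside the container; "command not found" or no devices listed means the GPU was never passed through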

After running for a while, another error appeared

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

Tried installing the system dependencies:

sudo apt install ffmpeg libsm6 libxext6

The install pulls in tzdata, which interactively prompts for a time zone. Needing manual input here is a problem for unattended setups!
Forcing a non-interactive install takes care of it

DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends tzdata

launch.py then continued downloading and installing normally...
until the next error

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

It still looks like a GPU detection problem. For now, add more flags, plus listening on the network so the UI can be reached from outside

python3 launch.py --skip-torch-cuda-test --precision full --no-half --listen --enable-insecure-extension-access --theme dark --gradio-queue

It finally came up, reachable from outside, and extensions install fine. But image generation runs on the CPU and is agonizingly slow. Next, the CUDA problem needs solving...
Trying a run with --xformers added

python3 launch.py --skip-torch-cuda-test --precision full --no-half --listen --enable-insecure-extension-access --theme dark --gradio-queue --xformers

No luck: still running on the CPU, and generation now errors out:

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float32)
     key         : shape=(1, 4096, 1, 512) (torch.float32)
     value       : shape=(1, 4096, 1, 512) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    device=cpu (supported: {'cuda'})
`flshattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    max(query.shape[-1] != value.shape[-1]) > 128
`tritonflashattF` is not supported because:
    device=cpu (supported: {'cuda'})
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    unsupported embed per head: 512
Time taken: 1m 1.21s

Most likely still CUDA problems all around. The real goal is to be able to drop flags like --skip-torch-cuda-test --precision full --no-half altogether

I created a script test.py inside the container

#!/usr/bin/env python3
# coding=utf-8

import torch
print(torch.cuda.device_count())   # number of CUDA devices torch can see
print(torch.cuda.is_available())   # whether CUDA is usable at all
print(torch.version.cuda)          # CUDA version torch was built against
print(torch.cuda.current_device()) # raises if no GPU is available
print(torch.cuda.is_available())

Running it prints

/home/user/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
0
False
11.8
Traceback (most recent call last):
  File "./test.py", line 10, in <module>
    print(torch.cuda.current_device())
  File "/home/user/.local/lib/python3.8/site-packages/torch/cuda/__init__.py", line 674, in current_device
    _lazy_init()
  File "/home/user/.local/lib/python3.8/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

Clearly, CUDA cannot be used at all inside the container.
So I exited Docker, installed torch on the host, and ran the same script there. It prints:

1
True
11.7
0
True

So the host's CUDA driver and dependencies are fine; the problem lies in the stock environment of the Docker image.
Time to think this over and look for another approach.

I found this:
https://stackoverflow.com/questions/54264338/why-does-pytorch-not-find-my-nvdia-drivers-for-cuda-support


Pull a ready-made NVIDIA PyTorch image from
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags
and a simple docker compose file (see the sketch below) is all it takes.

So NVIDIA ships Docker images purpose-built for this scenario; everything above was a detour...
Usage notes:
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
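
For the record, a minimal docker-compose.yml along those lines might look like this (an untested sketch on my part; the service and container names are just placeholders, and I ended up using plain docker run below):

cat > docker-compose.yml <<'EOF'
services:
  sd:
    image: nvcr.io/nvidia/pytorch:23.06-py3
    container_name: sd
    ipc: host
    tty: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
docker compose up -d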

Sigh... starting over from scratch...

docker pull nvcr.io/nvidia/pytorch:23.06-py3

The image is 8.58 GB; while it downloads, a good moment to consolidate the earlier install commands

Set up the system environment

apt update && \
apt upgrade -y && \
DEBIAN_FRONTEND=noninteractive \
apt install -y --no-install-recommends \
wget git python3 python3-venv python3-pip sudo libgoogle-perftools4 libtcmalloc-minimal4 ffmpeg libsm6 libxext6 libpng-dev libjpeg-dev vim net-tools && \
adduser --disabled-password --gecos '' user && \
echo -e "000000\n000000" | passwd user && \
adduser user sudo && \
mkdir /opt/sd && \
chown user:user /opt/sd && \
runuser -l user -c 'cd /opt/sd && bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh)'

Commands sorted, image downloaded.
First, rename the old container

docker rename sd sd_bak

Then create the new container

docker run --gpus all -id --name sd nvcr.io/nvidia/pytorch:23.06-py3

Which fails right out of the gate, ha

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

A quick search shows the host needs nvidia-container-toolkit installed first:
https://github.com/NVIDIA/nvidia-container-toolkit
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
https://www.server-world.info/en/note?os=Ubuntu_22.04&p=nvidia&f=2
Mind your distro version, and run these as root on the host:

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -;
curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu22.04/nvidia-docker.list > /etc/apt/sources.list.d/nvidia-docker.list;
apt update;
apt install -y nvidia-container-toolkit;
service docker restart;

Now create the container again

docker run --gpus all -id --name sd nvcr.io/nvidia/pytorch:23.06-py3

Success! First things first, verify torch. Log into the container

docker exec -it sd /bin/bash

Check straight from the command line

root@a26289624dd6:/workspace# python
Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.cuda.device_count())
1
>>> print(torch.cuda.is_available())
True
>>> print(torch.version.cuda)
12.1
>>> print(torch.cuda.current_device())
0
>>> print(torch.cuda.is_available())
True
>>>

Excellent, there it is! Running the consolidated command set from earlier just works.
Launch the environment

python3 launch.py --listen --enable-insecure-extension-access --theme dark --gradio-queue --xformers

It does start, but with a hiccup: xformers complains

Launching Web UI with arguments: --listen --enable-insecure-extension-access --theme dark --gradio-queue --xformers
NOTE! Installing ujson may make loading annotations faster.
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.1.0a0+4136153)
    Python  3.10.11 (you have 3.10.6)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
*** Error setting up CodeFormer
    Traceback (most recent call last):
      File "/opt/sd/stable-diffusion-webui/modules/codeformer_model.py", line 33, in setup_model
        from facelib.utils.face_restoration_helper import FaceRestoreHelper
      File "/opt/sd/stable-diffusion-webui/repositories/CodeFormer/facelib/utils/face_restoration_helper.py", line 7, in <module>
        from facelib.detection import init_detection_model
      File "/opt/sd/stable-diffusion-webui/repositories/CodeFormer/facelib/detection/__init__.py", line 11, in <module>
        from .yolov5face.face_detector import YoloDetector
      File "/opt/sd/stable-diffusion-webui/repositories/CodeFormer/facelib/detection/yolov5face/face_detector.py", line 20, in <module>
        IS_HIGH_VERSION = tuple(map(int, torch.__version__.split('+')[0].split('.'))) >= (1, 9, 0)
    ValueError: invalid literal for int() with base 10: '0a0'

---

Possibly a problem with the order the dependencies were installed in, so try reinstalling the pip packages

/usr/bin/python3 -m pip install --upgrade pip;
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1+cu118 torchtext==0.15.1 torchdata==0.6.0 --extra-index-url https://download.pytorch.org/whl/cu118 -U;
pip install xformers==0.0.19 triton==2.0.0 -U

Er... after that reinstall, the project won't start at all

Launching Web UI with arguments: --listen --enable-insecure-extension-access --theme dark --gradio-queue --xformers
/home/user/.local/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/user/.local/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKSs'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
Traceback (most recent call last):
  File "/opt/sd/stable-diffusion-webui/launch.py", line 38, in <module>
    main()
  File "/opt/sd/stable-diffusion-webui/launch.py", line 34, in main
    start()
  File "/opt/sd/stable-diffusion-webui/modules/launch_utils.py", line 340, in start
    import webui
  File "/opt/sd/stable-diffusion-webui/webui.py", line 28, in <module>
    import pytorch_lightning   # noqa: F401 # pytorch_lightning should be imported after torch, but it re-enables warnings on import so import once to disable them
  File "/home/user/.local/lib/python3.10/site-packages/pytorch_lightning/__init__.py", line 35, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/home/user/.local/lib/python3.10/site-packages/pytorch_lightning/callbacks/__init__.py", line 14, in <module>
    from pytorch_lightning.callbacks.batch_size_finder import BatchSizeFinder
  File "/home/user/.local/lib/python3.10/site-packages/pytorch_lightning/callbacks/batch_size_finder.py", line 24, in <module>
    from pytorch_lightning.callbacks.callback import Callback
  File "/home/user/.local/lib/python3.10/site-packages/pytorch_lightning/callbacks/callback.py", line 25, in <module>
    from pytorch_lightning.utilities.types import STEP_OUTPUT
  File "/home/user/.local/lib/python3.10/site-packages/pytorch_lightning/utilities/types.py", line 27, in <module>
    from torchmetrics import Metric
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/__init__.py", line 14, in <module>
    from torchmetrics import functional  # noqa: E402
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/functional/__init__.py", line 14, in <module>
    from torchmetrics.functional.audio._deprecated import _permutation_invariant_training as permutation_invariant_training
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/functional/audio/__init__.py", line 14, in <module>
    from torchmetrics.functional.audio.pit import permutation_invariant_training, pit_permutate
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/functional/audio/pit.py", line 23, in <module>
    from torchmetrics.utilities import rank_zero_warn
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/utilities/__init__.py", line 14, in <module>
    from torchmetrics.utilities.checks import check_forward_full_state_property
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/utilities/checks.py", line 25, in <module>
    from torchmetrics.metric import Metric
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/metric.py", line 30, in <module>
    from torchmetrics.utilities.data import (
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/utilities/data.py", line 22, in <module>
    from torchmetrics.utilities.imports import _TORCH_GREATER_EQUAL_1_12, _XLA_AVAILABLE
  File "/home/user/.local/lib/python3.10/site-packages/torchmetrics/utilities/imports.py", line 48, in <module>
    _TORCHAUDIO_GREATER_EQUAL_0_10: Optional[bool] = compare_version("torchaudio", operator.ge, "0.10.0")
  File "/home/user/.local/lib/python3.10/site-packages/lightning_utilities/core/imports.py", line 73, in compare_version
    pkg = importlib.import_module(package)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/user/.local/lib/python3.10/site-packages/torchaudio/__init__.py", line 1, in <module>
    from torchaudio import (  # noqa: F401
  File "/home/user/.local/lib/python3.10/site-packages/torchaudio/_extension/__init__.py", line 43, in <module>
    _load_lib("libtorchaudio")
  File "/home/user/.local/lib/python3.10/site-packages/torchaudio/_extension/utils.py", line 61, in _load_lib
    torch.ops.load_library(path)
  File "/home/user/.local/lib/python3.10/site-packages/torch/_ops.py", line 643, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.11.0: cannot open shared object file: No such file or directory

Deleted the venv directory inside the project and re-ran webui.sh.
One full reinstall later, the problem persists. Debugging shows the real cause is that torchvision cannot be imported, so activate the venv and reinstall torchvision. The reinstall itself also complains

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
xformers 0.0.20 requires pyre-extensions==0.0.29, which is not installed.
numba 0.57.1 requires numpy<1.25,>=1.21, but you have numpy 1.25.1 which is incompatible.
google-auth 2.22.0 requires urllib3<2.0, but you have urllib3 2.0.3 which is incompatible.
blendmodes 2022 requires Pillow<10,>=9.0.0, but you have pillow 10.0.0 which is incompatible.

Still, the project does run now; next up is a VRAM allocation problem

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.73 GiB total capacity; 3.42 GiB already allocated; 42.25 MiB free; 3.47 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
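
As an aside, the message itself points at a tunable: the allocator's max_split_size_mb (a sketch I noted down but did not end up needing):

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128   # set in the shell before launch.py to reduce fragmentation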

After some reading, I suspected instead that the container's IPC limits were to blame, added the --ipc host flag, and started over

docker run --gpus all --ipc host -id --name sd nvcr.io/nvidia/pytorch:23.06-py3

Still no good. After several more rounds of reinstalling and assorted errors, the upshot: reinstall xformers

pip3 install -U xformers

Strangely, after that install the venv had vanished...
Another reinstall later, back to square one. Try force-reinstalling torchvision

pip install --force-reinstall torchvision

Done! It finally runs properly! Test generations work, and throughput and CPU usage both look normal. Sorted at last.
The whole install flow deserves a proper write-up!

Setting up a remote Git repository the simplest way possible

Prerequisites

  • A remote server reachable over SSH, with key-based (passwordless) login allowed
  • Running Linux
  • Git installed

Steps

In the user's home directory, create an empty folder and initialize it as a bare repository

cd ~
mkdir code.git
git init --bare code.git

Next, append the client machine's public key to the end of the ~/.ssh/authorized_keys file on the server.
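
From the client, the easiest way to do that is ssh-copy-id (assuming OpenSSH and an existing key pair; adjust the key path to whatever you actually use):

ssh-copy-id user@IP
# or, by hand:
cat ~/.ssh/id_rsa.pub | ssh user@IP 'cat >> ~/.ssh/authorized_keys'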

After that, the client can clone the project with

git clone user@IP:code.git
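
To publish an existing local project to this bare repository instead (a common variant of the same setup):

cd /path/to/project
git remote add origin user@IP:code.git
git push -u origin master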

That's all.

Cleaning up a Git repository once a project has shipped and settled

Important: this is a destructive operation; proceed with great care

Background

Late in a project's life, once it has launched, passed acceptance, and settled down, we may judge that it will no longer change. By then the Git repository has usually accumulated plenty of redundant commits. To save disk space on the remote Git server, we keep only the files of the latest commit and delete all past history. That is what the following steps do.

Steps

Using the master branch as the example, first switch to it

# switch to the branch
git checkout master
# bring it up to date
git pull

Next, put the current code onto a new branch that carries no history

git checkout --orphan tempBranch
git add --all
git commit -m 'save'

We now have a clean new branch. Here comes the dangerous part: force-push it over the remote master branch

git push origin HEAD:master -f

At this point the remote branch's history is gone. Gone for good! Delete the old local master branch and rename the current branch to master

git branch -D master
git branch -m master

Now git log shows the branch's past commits are all gone. Finally, run Git's garbage collection to physically delete the files nobody wants anymore

git gc --aggressive --prune=now
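
One caveat worth knowing: running git gc in a local clone does not shrink the repository on the server. To reclaim space there too, something along these lines would be needed in the server-side bare repository (a sketch, assuming shell access, e.g. the ~/code.git from the previous section):

cd ~/code.git
git reflog expire --expire=now --all   # drop reflog entries that may still pin old commits
git gc --aggressive --prune=now        # physically delete the now-unreachable objects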

That's all.

How WordPress users in China can auto-publish posts to other social media

The Great Firewall sometimes blocks sites with little regard for friend or foe. The official WordPress plugin Jetpack, with its tight integration, completeness, and ease of use, ought to be the obvious choice for anyone running a WordPress site; but thanks to the GFW, loading Jetpack's resources drags down page loads. While tinkering with my blog recently, I looked into which plugins could stand in for which Jetpack features. This post covers a free way to automatically publish a post's link and summary to other social media once it goes live.

https://ifttt.com/

IFTTT specializes in wiring web applications together, and it is what we'll use for WordPress auto-crossposting here.

Visiting the site, you're asked to register or log in; email signup works, as does signing in with a Google or Apple account. Once logged in you're prompted to choose Applets; ignore that and go straight to https://ifttt.com/wordpress. Under "Try something new..." you'll see several tiles:

  • Facebook: Automatically share new posts to a Facebook Page
  • Twitter: Automatically tweet your new blog posts
  • Tumblr: Automatically cross-post from WordPress to Tumblr
  • Reddit: Automatically submit new WordPress posts to Reddit

Clicking Connect starts a series of authorization flows. One caveat: forwarding to Facebook can only post to a Facebook Page, so you must create the Page first or the authorization will fail.

Once authorized, a new post gets cross-posted within about five minutes.

In actual testing, though, the Twitter forward sometimes appeared to post nothing, while posts to the Facebook Page always went through; not that anyone sees them when you have zero followers. Everything else still has to be cross-posted by hand, which makes the whole thing feel rather half-baked... Next time I'll share a convenient way to do manual cross-posting.

Latest finding: the Twitter post that seemed to be missing did get forwarded, just much later... The Reddit one never was.