{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "LxWWPVb-On3_" }, "source": [ "
\n", " \n", "
 
\n", "
\n", " MMYOLO official repository\n", " \n", " \n", " ⭐Star\n", " \n", " \n", "     \n", " MMYOLO development roadmap\n", " \n", " \n", " 💬Comments welcome\n", " \n", " \n", "     \n", " MMYOLO video tutorial series\n", " \n", " \n", " 👍Like, coin, and favorite\n", " \n", " \n", "
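\n", " MMYOLO is positioned as an industrial-grade core algorithm library for the YOLO series. It provides a unified, comprehensive evaluation pipeline, easily customizable modular components, and an efficient training and deployment workflow that supports multiple tasks.\n", "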
\n", "
 
\n", "\n", "\"Open\n", "\n", "[![PyPI](https://img.shields.io/pypi/v/mmyolo)](https://pypi.org/project/mmyolo)\n", "[![docs](https://img.shields.io/badge/docs-latest-blue)](https://mmyolo.readthedocs.io/en/latest/)\n", "[![deploy](https://github.com/open-mmlab/mmyolo/workflows/deploy/badge.svg)](https://github.com/open-mmlab/mmyolo/actions)\n", "[![codecov](https://codecov.io/gh/open-mmlab/mmyolo/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmyolo)\n", "[![license](https://img.shields.io/github/license/open-mmlab/mmyolo.svg)](https://github.com/open-mmlab/mmyolo/blob/master/LICENSE)\n", "[![open issues](https://isitmaintained.com/badge/open/open-mmlab/mmyolo.svg)](https://github.com/open-mmlab/mmyolo/issues)\n", "[![issue resolution](https://isitmaintained.com/badge/resolution/open-mmlab/mmyolo.svg)](https://github.com/open-mmlab/mmyolo/issues)\n", "\n", "[📘Documentation](https://mmyolo.readthedocs.io/zh_CN/latest/) |\n", "[🛠️Installation](https://mmyolo.readthedocs.io/zh_CN/latest/install.html) |\n", "[👀Model Zoo](https://mmyolo.readthedocs.io/zh_CN/latest/model_zoo.html) |\n", "[🆕Changelog](https://mmyolo.readthedocs.io/zh_CN/latest/notes/changelog.html) |\n", "[🤔Report Issues](https://github.com/open-mmlab/mmyolo/issues/new/choose)" ] }, { "cell_type": "markdown", "metadata": { "id": "yIK1y9AeWC0w" }, "source": [ "![Cover](https://user-images.githubusercontent.com/27466624/203250285-38ae8deb-c54a-4d55-b2e7-5918de40ee98.png)\n", "# Swap in any backbone network in 10 minutes\n", "\n", "- MMYOLO only ships backbones from the YOLO series. Want to try the ones in MMDet?\n", "- Want to use SwinTransformer when entering a competition with YOLO?\n", "- Self-supervised pretraining is popular. Want to try it in YOLO as well?\n", "\n", "\n", "🔥This tutorial covers them all🔥\n", "\n", "**Notes**:\n", "1. This tutorial was finalized on 2022.11.20. MMYOLO [![GitHub release](https://img.shields.io/github/release/open-mmlab/mmyolo.svg)](https://GitHub.com/open-mmlab/mmyolo/) and the other OpenMMLab repositories are updated continuously. If running this notebook directly fails in the future, update to the latest code or pin the versions.\n", "\n", "2. When using a different backbone, make sure the backbone's output channels match the Neck's input channels.\n", "3. The config files given below only guarantee that training runs correctly. Trained as-is, performance may not be optimal, because some backbones need matching hyperparameters such as a specific learning rate and optimizer. A later “training tricks” tutorial will cover training tuning.\n", "4. The terms `主干网络` (main network), `骨干网络` (backbone network), and `backbone` all refer to the same module." ] }, { "cell_type": "markdown", "metadata": { "id": "gkT3NdKDOt1X" }, "source": [ "## 0 Environment setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "8_mohlpPjfKR", "outputId": "dc91b21f-9a38-4203-fca2-f2943f0d09d2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python 3.7.15\n", "nvcc: NVIDIA (R) Cuda compiler driver\n", "Copyright (c) 2005-2021 NVIDIA Corporation\n", "Built on Sun_Feb_14_21:12:58_PST_2021\n", "Cuda compilation tools, release 11.2, V11.2.152\n", "Build cuda_11.2.r11.2/compiler.29618528_0\n", "gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0\n", "Copyright (C) 2017 Free Software Foundation, Inc.\n", "This is free software; see the source for copying conditions. There is NO\n", "warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n", "\n", "1.12.1+cu113\n", "True\n" ] } ], "source": [ "#@title ### 0.1 Check environment versions\n", "# Check the Python version\n", "!python -V\n", "\n", "# Check the nvcc version\n", "!nvcc -V\n", "\n", "# Check the GCC version\n", "!gcc --version\n", "\n", "# Check the PyTorch version and whether CUDA is available\n", "import torch, torchvision\n", "print(torch.__version__)\n", "print(torch.cuda.is_available())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "9pf6qCA-OteY", "outputId": "ee8a7672-502a-484d-b40f-faa12445141b" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Collecting openmim\n", " Downloading openmim-0.3.3-py2.py3-none-any.whl (50 kB)\n", "\u001b[K |████████████████████████████████| 50 kB 5.6 MB/s \n", "\u001b[?25hRequirement already satisfied: pip>=19.3 in /usr/local/lib/python3.7/dist-packages (from openmim) (21.1.3)\n", "Collecting model-index\n", " Downloading model_index-0.1.11-py3-none-any.whl (34 kB)\n", "Requirement already satisfied: tabulate 
in /usr/local/lib/python3.7/dist-packages (from openmim) (0.8.10)\n", "Collecting colorama\n", " Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)\n", "Collecting rich\n", " Downloading rich-12.6.0-py3-none-any.whl (237 kB)\n", "\u001b[K |████████████████████████████████| 237 kB 58.7 MB/s \n", "\u001b[?25hRequirement already satisfied: Click in /usr/local/lib/python3.7/dist-packages (from openmim) (7.1.2)\n", "Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from openmim) (1.3.5)\n", "Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from openmim) (2.23.0)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from model-index->openmim) (6.0)\n", "Requirement already satisfied: markdown in /usr/local/lib/python3.7/dist-packages (from model-index->openmim) (3.4.1)\n", "Collecting ordered-set\n", " Downloading ordered_set-4.1.0-py3-none-any.whl (7.6 kB)\n", "Requirement already satisfied: importlib-metadata>=4.4 in /usr/local/lib/python3.7/dist-packages (from markdown->model-index->openmim) (4.13.0)\n", "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata>=4.4->markdown->model-index->openmim) (3.10.0)\n", "Requirement already satisfied: typing-extensions>=3.6.4 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata>=4.4->markdown->model-index->openmim) (4.1.1)\n", "Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.7/dist-packages (from pandas->openmim) (1.21.6)\n", "Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas->openmim) (2022.6)\n", "Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas->openmim) (2.8.2)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas->openmim) (1.15.0)\n", "Requirement already 
satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->openmim) (2.10)\n", "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->openmim) (3.0.4)\n", "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->openmim) (1.24.3)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->openmim) (2022.9.24)\n", "Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich->openmim) (2.6.1)\n", "Collecting commonmark<0.10.0,>=0.9.0\n", " Downloading commonmark-0.9.1-py2.py3-none-any.whl (51 kB)\n", "\u001b[K |████████████████████████████████| 51 kB 7.3 MB/s \n", "\u001b[?25hInstalling collected packages: ordered-set, commonmark, rich, model-index, colorama, openmim\n", "Successfully installed colorama-0.4.6 commonmark-0.9.1 model-index-0.1.11 openmim-0.3.3 ordered-set-4.1.0 rich-12.6.0\n", "Name: openmim\n", "Version: 0.3.3\n", "Summary: MIM Installs OpenMMLab packages\n", "Home-page: https://github.com/open-mmlab/mim\n", "Author: MIM Authors\n", "Author-email: openmmlab@gmail.com\n", "License: UNKNOWN\n", "Location: /usr/local/lib/python3.7/dist-packages\n", "Requires: requests, pandas, pip, model-index, tabulate, colorama, Click, rich\n", "Required-by: \n", "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Looking in links: https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html\n", "Collecting mmengine==0.3.1\n", " Downloading mmengine-0.3.1-py3-none-any.whl (305 kB)\n", "\u001b[K |████████████████████████████████| 305 kB 24.2 MB/s \n", "\u001b[?25hCollecting yapf\n", " Downloading yapf-0.32.0-py2.py3-none-any.whl (190 kB)\n", "\u001b[K |████████████████████████████████| 190 kB 66.2 MB/s \n", "\u001b[?25hRequirement already satisfied: termcolor in 
/usr/local/lib/python3.7/dist-packages (from mmengine==0.3.1) (2.1.0)\n", "Collecting addict\n", " Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from mmengine==0.3.1) (6.0)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from mmengine==0.3.1) (1.21.6)\n", "Requirement already satisfied: opencv-python>=3 in /usr/local/lib/python3.7/dist-packages (from mmengine==0.3.1) (4.6.0.66)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from mmengine==0.3.1) (3.2.2)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine==0.3.1) (0.11.0)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine==0.3.1) (2.8.2)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine==0.3.1) (3.0.9)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine==0.3.1) (1.4.4)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->mmengine==0.3.1) (4.1.1)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->mmengine==0.3.1) (1.15.0)\n", "Installing collected packages: yapf, addict, mmengine\n", "Successfully installed addict-2.4.0 mmengine-0.3.1 yapf-0.32.0\n", "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Looking in links: https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html\n", "Collecting mmcv>=2.0.0rc2\n", " Downloading https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/mmcv-2.0.0rc2-cp37-cp37m-manylinux1_x86_64.whl 
(43.2 MB)\n", "\u001b[K |████████████████████████████████| 43.2 MB 1.4 MB/s \n", "\u001b[?25hRequirement already satisfied: opencv-python>=3 in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (4.6.0.66)\n", "Requirement already satisfied: yapf in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (0.32.0)\n", "Requirement already satisfied: mmengine in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (0.3.1)\n", "Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (7.1.2)\n", "Requirement already satisfied: addict in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (2.4.0)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (6.0)\n", "Requirement already satisfied: packaging in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (21.3)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from mmcv>=2.0.0rc2) (1.21.6)\n", "Requirement already satisfied: termcolor in /usr/local/lib/python3.7/dist-packages (from mmengine->mmcv>=2.0.0rc2) (2.1.0)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from mmengine->mmcv>=2.0.0rc2) (3.2.2)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine->mmcv>=2.0.0rc2) (2.8.2)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine->mmcv>=2.0.0rc2) (3.0.9)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine->mmcv>=2.0.0rc2) (1.4.4)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmengine->mmcv>=2.0.0rc2) (0.11.0)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from 
kiwisolver>=1.0.1->matplotlib->mmengine->mmcv>=2.0.0rc2) (4.1.1)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->mmengine->mmcv>=2.0.0rc2) (1.15.0)\n", "Installing collected packages: mmcv\n", "Successfully installed mmcv-2.0.0rc2\n", "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Looking in links: https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html\n", "Collecting mmdet>=3.0.0rc2\n", " Downloading mmdet-3.0.0rc3-py3-none-any.whl (1.6 MB)\n", "\u001b[K |████████████████████████████████| 1.6 MB 31.3 MB/s \n", "\u001b[?25hRequirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (1.7.3)\n", "Collecting terminaltables\n", " Downloading terminaltables-3.1.10-py2.py3-none-any.whl (15 kB)\n", "Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (1.15.0)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (1.21.6)\n", "Requirement already satisfied: pycocotools in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (2.0.6)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (3.2.2)\n", "Requirement already satisfied: mmengine<1.0.0,>=0.1.0 in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (0.3.1)\n", "Requirement already satisfied: mmcv<2.1.0,>=2.0.0rc1 in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc2) (2.0.0rc2)\n", "Requirement already satisfied: yapf in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmdet>=3.0.0rc2) (0.32.0)\n", "Requirement already satisfied: opencv-python>=3 in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmdet>=3.0.0rc2) (4.6.0.66)\n", "Requirement already satisfied: pyyaml in 
/usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmdet>=3.0.0rc2) (6.0)\n", "Requirement already satisfied: packaging in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmdet>=3.0.0rc2) (21.3)\n", "Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmdet>=3.0.0rc2) (7.1.2)\n", "Requirement already satisfied: addict in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmdet>=3.0.0rc2) (2.4.0)\n", "Requirement already satisfied: termcolor in /usr/local/lib/python3.7/dist-packages (from mmengine<1.0.0,>=0.1.0->mmdet>=3.0.0rc2) (2.1.0)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc2) (1.4.4)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc2) (3.0.9)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc2) (0.11.0)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc2) (2.8.2)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->mmdet>=3.0.0rc2) (4.1.1)\n", "Installing collected packages: terminaltables, mmdet\n", "Successfully installed mmdet-3.0.0rc3 terminaltables-3.1.10\n", "Cloning into 'mmyolo'...\n", "remote: Enumerating objects: 2074, done.\u001b[K\n", "remote: Counting objects: 100% (102/102), done.\u001b[K\n", "remote: Compressing objects: 100% (88/88), done.\u001b[K\n", "remote: Total 2074 (delta 39), reused 35 (delta 13), pack-reused 1972\u001b[K\n", "Receiving objects: 100% (2074/2074), 1.99 MiB | 7.98 MiB/s, done.\n", "Resolving deltas: 100% (1167/1167), done.\n", "/content/mmyolo\n", "Using pip 21.1.3 from 
/usr/local/lib/python3.7/dist-packages/pip (python 3.7)\n", "Value for scheme.platlib does not match. Please report this to \n", "distutils: /usr/local/lib/python3.7/dist-packages\n", "sysconfig: /usr/lib/python3.7/site-packages\n", "Value for scheme.purelib does not match. Please report this to \n", "distutils: /usr/local/lib/python3.7/dist-packages\n", "sysconfig: /usr/lib/python3.7/site-packages\n", "Value for scheme.headers does not match. Please report this to \n", "distutils: /usr/local/include/python3.7/UNKNOWN\n", "sysconfig: /usr/include/python3.7m/UNKNOWN\n", "Value for scheme.scripts does not match. Please report this to \n", "distutils: /usr/local/bin\n", "sysconfig: /usr/bin\n", "Value for scheme.data does not match. Please report this to \n", "distutils: /usr/local\n", "sysconfig: /usr\n", "Additional context:\n", "user = False\n", "home = None\n", "root = None\n", "prefix = None\n", "Non-user install because site-packages writeable\n", "Created temporary directory: /tmp/pip-ephem-wheel-cache-2uhdg314\n", "Created temporary directory: /tmp/pip-req-tracker-jzlnf0le\n", "Initialized build tracking at /tmp/pip-req-tracker-jzlnf0le\n", "Created build tracker: /tmp/pip-req-tracker-jzlnf0le\n", "Entered build tracker: /tmp/pip-req-tracker-jzlnf0le\n", "Created temporary directory: /tmp/pip-install-16xdzblz\n", "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Looking in links: https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html\n", "Obtaining file:///content/mmyolo\n", " Added file:///content/mmyolo to build tracker '/tmp/pip-req-tracker-jzlnf0le'\n", " Running setup.py (path:/content/mmyolo/setup.py) egg_info for package from file:///content/mmyolo\n", " Created temporary directory: /tmp/pip-pip-egg-info-er3qcg59\n", " Running command python setup.py egg_info\n", " running egg_info\n", " creating /tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info\n", " writing 
/tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info/PKG-INFO\n", " writing dependency_links to /tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info/dependency_links.txt\n", " writing requirements to /tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info/requires.txt\n", " writing top-level names to /tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info/top_level.txt\n", " writing manifest file '/tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info/SOURCES.txt'\n", " reading manifest template 'MANIFEST.in'\n", " warning: no files found matching 'mmyolo/VERSION'\n", " warning: no files found matching 'mmyolo/.mim/model-index.yml'\n", " warning: no files found matching 'mmyolo/.mim/demo/*/*'\n", " warning: no files found matching '*.py' under directory 'mmyolo/.mim/configs'\n", " warning: no files found matching '*.yml' under directory 'mmyolo/.mim/configs'\n", " warning: no files found matching '*.sh' under directory 'mmyolo/.mim/tools'\n", " warning: no files found matching '*.py' under directory 'mmyolo/.mim/tools'\n", " adding license file 'LICENSE'\n", " writing manifest file '/tmp/pip-pip-egg-info-er3qcg59/mmyolo.egg-info/SOURCES.txt'\n", " Source in /content/mmyolo has version 0.1.3, which satisfies requirement mmyolo==0.1.3 from file:///content/mmyolo\n", " Removed mmyolo==0.1.3 from file:///content/mmyolo from build tracker '/tmp/pip-req-tracker-jzlnf0le'\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from mmyolo==0.1.3) (1.21.6)\n", "Requirement already satisfied: mmcv<2.1.0,>=2.0.0rc1 in /usr/local/lib/python3.7/dist-packages (from mmyolo==0.1.3) (2.0.0rc2)\n", "Requirement already satisfied: mmdet>=3.0.0rc3 in /usr/local/lib/python3.7/dist-packages (from mmyolo==0.1.3) (3.0.0rc3)\n", "Requirement already satisfied: mmengine>=0.3.1 in /usr/local/lib/python3.7/dist-packages (from mmyolo==0.1.3) (0.3.1)\n", "Requirement already satisfied: packaging in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmyolo==0.1.3) (21.3)\n", 
"Requirement already satisfied: opencv-python>=3 in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmyolo==0.1.3) (4.6.0.66)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmyolo==0.1.3) (6.0)\n", "Requirement already satisfied: yapf in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmyolo==0.1.3) (0.32.0)\n", "Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmyolo==0.1.3) (7.1.2)\n", "Requirement already satisfied: addict in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmyolo==0.1.3) (2.4.0)\n", "Requirement already satisfied: terminaltables in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc3->mmyolo==0.1.3) (3.1.10)\n", "Requirement already satisfied: pycocotools in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc3->mmyolo==0.1.3) (2.0.6)\n", "Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc3->mmyolo==0.1.3) (1.15.0)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc3->mmyolo==0.1.3) (3.2.2)\n", "Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from mmdet>=3.0.0rc3->mmyolo==0.1.3) (1.7.3)\n", "Requirement already satisfied: termcolor in /usr/local/lib/python3.7/dist-packages (from mmengine>=0.3.1->mmyolo==0.1.3) (2.1.0)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc3->mmyolo==0.1.3) (2.8.2)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc3->mmyolo==0.1.3) (1.4.4)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc3->mmyolo==0.1.3) (3.0.9)\n", "Requirement 
already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmdet>=3.0.0rc3->mmyolo==0.1.3) (0.11.0)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->mmdet>=3.0.0rc3->mmyolo==0.1.3) (4.1.1)\n", "Created temporary directory: /tmp/pip-unpack-vmw9uhnr\n", "Installing collected packages: mmyolo\n", " Value for scheme.platlib does not match. Please report this to \n", " distutils: /usr/local/lib/python3.7/dist-packages\n", " sysconfig: /usr/lib/python3.7/site-packages\n", " Value for scheme.purelib does not match. Please report this to \n", " distutils: /usr/local/lib/python3.7/dist-packages\n", " sysconfig: /usr/lib/python3.7/site-packages\n", " Value for scheme.headers does not match. Please report this to \n", " distutils: /usr/local/include/python3.7/mmyolo\n", " sysconfig: /usr/include/python3.7m/mmyolo\n", " Value for scheme.scripts does not match. Please report this to \n", " distutils: /usr/local/bin\n", " sysconfig: /usr/bin\n", " Value for scheme.data does not match. 
Please report this to \n", " distutils: /usr/local\n", " sysconfig: /usr\n", " Additional context:\n", " user = False\n", " home = None\n", " root = None\n", " prefix = None\n", " Running setup.py develop for mmyolo\n", " Running command /usr/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '\"'\"'/content/mmyolo/setup.py'\"'\"'; __file__='\"'\"'/content/mmyolo/setup.py'\"'\"';f = getattr(tokenize, '\"'\"'open'\"'\"', open)(__file__) if os.path.exists(__file__) else io.StringIO('\"'\"'from setuptools import setup; setup()'\"'\"');code = f.read().replace('\"'\"'\\r\\n'\"'\"', '\"'\"'\\n'\"'\"');f.close();exec(compile(code, __file__, '\"'\"'exec'\"'\"'))' develop --no-deps\n", " running develop\n", " running egg_info\n", " creating mmyolo.egg-info\n", " writing mmyolo.egg-info/PKG-INFO\n", " writing dependency_links to mmyolo.egg-info/dependency_links.txt\n", " writing requirements to mmyolo.egg-info/requires.txt\n", " writing top-level names to mmyolo.egg-info/top_level.txt\n", " writing manifest file 'mmyolo.egg-info/SOURCES.txt'\n", " reading manifest template 'MANIFEST.in'\n", " warning: no files found matching 'mmyolo/VERSION'\n", " warning: no files found matching 'mmyolo/.mim/demo/*/*'\n", " adding license file 'LICENSE'\n", " writing manifest file 'mmyolo.egg-info/SOURCES.txt'\n", " running build_ext\n", " Creating /usr/local/lib/python3.7/dist-packages/mmyolo.egg-link (link to .)\n", " Adding mmyolo 0.1.3 to easy-install.pth file\n", "\n", " Installed /content/mmyolo\n", " /usr/local/lib/python3.7/dist-packages/torch/utils/cpp_extension.py:411: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.\n", " warnings.warn(msg.format('we could not find ninja.'))\n", "Value for scheme.platlib does not match. 
Please report this to \n", "distutils: /usr/local/lib/python3.7/dist-packages\n", "sysconfig: /usr/lib/python3.7/site-packages\n", "Value for scheme.purelib does not match. Please report this to \n", "distutils: /usr/local/lib/python3.7/dist-packages\n", "sysconfig: /usr/lib/python3.7/site-packages\n", "Value for scheme.headers does not match. Please report this to \n", "distutils: /usr/local/include/python3.7/UNKNOWN\n", "sysconfig: /usr/include/python3.7m/UNKNOWN\n", "Value for scheme.scripts does not match. Please report this to \n", "distutils: /usr/local/bin\n", "sysconfig: /usr/bin\n", "Value for scheme.data does not match. Please report this to \n", "distutils: /usr/local\n", "sysconfig: /usr\n", "Additional context:\n", "user = False\n", "home = None\n", "root = None\n", "prefix = None\n", "Successfully installed mmyolo-0.1.3\n", "Removed build tracker: '/tmp/pip-req-tracker-jzlnf0le'\n" ] } ], "source": [ "#@title ### 0.2 Install MMYOLO\n", "# Install the OpenMMLab dependency packages with mim\n", "%pip install -U openmim\n", "%pip show openmim\n", "!mim install \"mmengine==0.3.1\"\n", "!mim install \"mmcv>=2.0.0rc2\"\n", "!mim install \"mmdet>=3.0.0rc2\"\n", "# Clone the MMYOLO repository (dev branch)\n", "!git clone https://github.com/open-mmlab/mmyolo.git -b dev\n", "%cd mmyolo \n", "# Install MMYOLO in editable mode with mim\n", "!mim install -v -e ." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Vtym1aNokLaz", "outputId": "e290fc79-7f50-4084-8747-9bf536f2c1b6" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Package Version Source\n", "--------- --------- -----------------------------------------\n", "mmcv 2.0.0rc2 https://github.com/open-mmlab/mmcv\n", "mmdet 3.0.0rc3 https://github.com/open-mmlab/mmdetection\n", "mmengine 0.3.1 https://github.com/open-mmlab/mmengine\n", "mmyolo 0.1.3 /content/mmyolo\n" ] } ], "source": [ "# List the installed OpenMMLab packages to verify the installation\n", "!mim list" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Rd2rIjHikQXC", "outputId": "3dfba3eb-b38e-4fa2-b02e-be13e3ccc10f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2022-11-20 07:55:27-- https://extremevision-js-userfile.oss-cn-hangzhou.aliyuncs.com/user-25117-files/ff6429ed-0f22-4759-aa1e-f3d3f8dc4b40/coco.zip\n", "Resolving extremevision-js-userfile.oss-cn-hangzhou.aliyuncs.com (extremevision-js-userfile.oss-cn-hangzhou.aliyuncs.com)... 47.110.23.49\n", "Connecting to extremevision-js-userfile.oss-cn-hangzhou.aliyuncs.com (extremevision-js-userfile.oss-cn-hangzhou.aliyuncs.com)|47.110.23.49|:443... connected.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 13938757 (13M) [application/zip]\n", "Saving to: ‘data/coco.zip’\n", "\n", "coco.zip 100%[===================>] 13.29M 7.96MB/s in 1.7s \n", "\n", "2022-11-20 07:55:30 (7.96 MB/s) - ‘data/coco.zip’ saved [13938757/13938757]\n", "\n", "Archive: data/coco.zip\n", " creating: data/coco/\n", " inflating: data/coco/LICENSE \n", " creating: data/coco/annotations/\n", " inflating: data/coco/annotations/instances_train2017.json \n", " inflating: data/coco/annotations/instances_val2017.json \n", " creating: data/coco/val2017/\n", " inflating: data/coco/val2017/000000000263.jpg \n", " inflating: data/coco/val2017/000000000438.jpg \n", " inflating: data/coco/val2017/000000000030.jpg \n", " inflating: data/coco/val2017/000000000581.jpg \n", " inflating: data/coco/val2017/000000000626.jpg \n", " inflating: data/coco/val2017/000000000315.jpg \n", " inflating: data/coco/val2017/000000000605.jpg \n", " inflating: data/coco/val2017/000000000294.jpg \n", " inflating: data/coco/val2017/000000000641.jpg \n", " inflating: data/coco/val2017/000000000620.jpg \n", " inflating: data/coco/val2017/000000000387.jpg \n", " inflating: data/coco/val2017/000000000025.jpg \n", " inflating: data/coco/val2017/000000000201.jpg \n", " inflating: data/coco/val2017/000000000562.jpg \n", " inflating: data/coco/val2017/000000000009.jpg \n", " inflating: data/coco/val2017/000000000419.jpg \n", " inflating: data/coco/val2017/000000000368.jpg \n", " inflating: data/coco/val2017/000000000634.jpg \n", " inflating: data/coco/val2017/000000000560.jpg \n", " inflating: data/coco/val2017/000000000471.jpg \n", " inflating: data/coco/val2017/000000000431.jpg \n", " inflating: data/coco/val2017/000000000542.jpg \n", " inflating: data/coco/val2017/000000000490.jpg \n", " inflating: data/coco/val2017/000000000089.jpg \n", " inflating: data/coco/val2017/000000000544.jpg \n", " inflating: data/coco/val2017/000000000540.jpg \n", " inflating: data/coco/val2017/000000000472.jpg \n", " inflating: 
data/coco/val2017/000000000415.jpg \n", " inflating: data/coco/val2017/000000000143.jpg \n", " inflating: data/coco/val2017/000000000307.jpg \n", " inflating: data/coco/val2017/000000000077.jpg \n", " inflating: data/coco/val2017/000000000394.jpg \n", " inflating: data/coco/val2017/000000000508.jpg \n", " inflating: data/coco/val2017/000000000133.jpg \n", " inflating: data/coco/val2017/000000000595.jpg \n", " inflating: data/coco/val2017/000000000532.jpg \n", " inflating: data/coco/val2017/000000000165.jpg \n", " inflating: data/coco/val2017/000000000384.jpg \n", " inflating: data/coco/val2017/000000000247.jpg \n", " inflating: data/coco/val2017/000000000486.jpg \n", " inflating: data/coco/val2017/000000000078.jpg \n", " inflating: data/coco/val2017/000000000064.jpg \n", " inflating: data/coco/val2017/000000000514.jpg \n", " inflating: data/coco/val2017/000000000321.jpg \n", " inflating: data/coco/val2017/000000000349.jpg \n", " inflating: data/coco/val2017/000000000208.jpg \n", " inflating: data/coco/val2017/000000000360.jpg \n", " inflating: data/coco/val2017/000000000151.jpg \n", " inflating: data/coco/val2017/000000000564.jpg \n", " inflating: data/coco/val2017/000000000072.jpg \n", " inflating: data/coco/val2017/000000000370.jpg \n", " inflating: data/coco/val2017/000000000590.jpg \n", " inflating: data/coco/val2017/000000000328.jpg \n", " inflating: data/coco/val2017/000000000446.jpg \n", " inflating: data/coco/val2017/000000000241.jpg \n", " inflating: data/coco/val2017/000000000436.jpg \n", " inflating: data/coco/val2017/000000000074.jpg \n", " inflating: data/coco/val2017/000000000629.jpg \n", " inflating: data/coco/val2017/000000000250.jpg \n", " inflating: data/coco/val2017/000000000283.jpg \n", " inflating: data/coco/val2017/000000000636.jpg \n", " inflating: data/coco/val2017/000000000149.jpg \n", " inflating: data/coco/val2017/000000000625.jpg \n", " inflating: data/coco/val2017/000000000192.jpg \n", " inflating: data/coco/val2017/000000000531.jpg 
\n", " inflating: data/coco/val2017/000000000322.jpg \n", " inflating: data/coco/val2017/000000000196.jpg \n", " inflating: data/coco/val2017/000000000312.jpg \n", " inflating: data/coco/val2017/000000000650.jpg \n", " inflating: data/coco/val2017/000000000400.jpg \n", " inflating: data/coco/val2017/000000000575.jpg \n", " inflating: data/coco/val2017/000000000397.jpg \n", " inflating: data/coco/val2017/000000000520.jpg \n", " inflating: data/coco/val2017/000000000138.jpg \n", " inflating: data/coco/val2017/000000000326.jpg \n", " inflating: data/coco/val2017/000000000034.jpg \n", " inflating: data/coco/val2017/000000000194.jpg \n", " inflating: data/coco/val2017/000000000643.jpg \n", " inflating: data/coco/val2017/000000000623.jpg \n", " inflating: data/coco/val2017/000000000404.jpg \n", " inflating: data/coco/val2017/000000000164.jpg \n", " inflating: data/coco/val2017/000000000357.jpg \n", " inflating: data/coco/val2017/000000000459.jpg \n", " inflating: data/coco/val2017/000000000113.jpg \n", " inflating: data/coco/val2017/000000000109.jpg \n", " inflating: data/coco/val2017/000000000061.jpg \n", " inflating: data/coco/val2017/000000000569.jpg \n", " inflating: data/coco/val2017/000000000144.jpg \n", " inflating: data/coco/val2017/000000000154.jpg \n", " inflating: data/coco/val2017/000000000071.jpg \n", " inflating: data/coco/val2017/000000000359.jpg \n", " inflating: data/coco/val2017/000000000309.jpg \n", " inflating: data/coco/val2017/000000000042.jpg \n", " inflating: data/coco/val2017/000000000597.jpg \n", " inflating: data/coco/val2017/000000000110.jpg \n", " inflating: data/coco/val2017/000000000257.jpg \n", " inflating: data/coco/val2017/000000000589.jpg \n", " inflating: data/coco/val2017/000000000389.jpg \n", " inflating: data/coco/val2017/000000000332.jpg \n", " inflating: data/coco/val2017/000000000536.jpg \n", " inflating: data/coco/val2017/000000000612.jpg \n", " inflating: data/coco/val2017/000000000092.jpg \n", " inflating: 
data/coco/val2017/000000000382.jpg \n", " inflating: data/coco/val2017/000000000049.jpg \n", " inflating: data/coco/val2017/000000000395.jpg \n", " inflating: data/coco/val2017/000000000428.jpg \n", " inflating: data/coco/val2017/000000000308.jpg \n", " inflating: data/coco/val2017/000000000572.jpg \n", " inflating: data/coco/val2017/000000000510.jpg \n", " inflating: data/coco/val2017/000000000443.jpg \n", " inflating: data/coco/val2017/000000000136.jpg \n", " inflating: data/coco/val2017/000000000584.jpg \n", " inflating: data/coco/val2017/000000000086.jpg \n", " inflating: data/coco/val2017/000000000474.jpg \n", " inflating: data/coco/val2017/000000000073.jpg \n", " inflating: data/coco/val2017/000000000491.jpg \n", " inflating: data/coco/val2017/000000000081.jpg \n", " inflating: data/coco/val2017/000000000094.jpg \n", " inflating: data/coco/val2017/000000000502.jpg \n", " inflating: data/coco/val2017/000000000599.jpg \n", " inflating: data/coco/val2017/000000000450.jpg \n", " inflating: data/coco/val2017/000000000260.jpg \n", " inflating: data/coco/val2017/000000000142.jpg \n", " inflating: data/coco/val2017/000000000529.jpg \n", " inflating: data/coco/val2017/000000000488.jpg \n", " inflating: data/coco/val2017/000000000127.jpg \n", " inflating: data/coco/val2017/000000000036.jpg \n", " inflating: data/coco/val2017/000000000338.jpg \n", " inflating: data/coco/README.txt \n", " creating: data/coco/train2017/\n", " inflating: data/coco/train2017/000000000263.jpg \n", " inflating: data/coco/train2017/000000000438.jpg \n", " inflating: data/coco/train2017/000000000030.jpg \n", " inflating: data/coco/train2017/000000000581.jpg \n", " inflating: data/coco/train2017/000000000626.jpg \n", " inflating: data/coco/train2017/000000000315.jpg \n", " inflating: data/coco/train2017/000000000605.jpg \n", " inflating: data/coco/train2017/000000000294.jpg \n", " inflating: data/coco/train2017/000000000641.jpg \n", " inflating: data/coco/train2017/000000000620.jpg \n", " 
inflating: data/coco/train2017/000000000387.jpg \n", " inflating: data/coco/train2017/000000000025.jpg \n", " inflating: data/coco/train2017/000000000201.jpg \n", " inflating: data/coco/train2017/000000000562.jpg \n", " inflating: data/coco/train2017/000000000009.jpg \n", " inflating: data/coco/train2017/000000000419.jpg \n", " inflating: data/coco/train2017/000000000368.jpg \n", " inflating: data/coco/train2017/000000000634.jpg \n", " inflating: data/coco/train2017/000000000560.jpg \n", " inflating: data/coco/train2017/000000000471.jpg \n", " inflating: data/coco/train2017/000000000431.jpg \n", " inflating: data/coco/train2017/000000000542.jpg \n", " inflating: data/coco/train2017/000000000490.jpg \n", " inflating: data/coco/train2017/000000000089.jpg \n", " inflating: data/coco/train2017/000000000544.jpg \n", " inflating: data/coco/train2017/000000000540.jpg \n", " inflating: data/coco/train2017/000000000472.jpg \n", " inflating: data/coco/train2017/000000000415.jpg \n", " inflating: data/coco/train2017/000000000143.jpg \n", " inflating: data/coco/train2017/000000000307.jpg \n", " inflating: data/coco/train2017/000000000077.jpg \n", " inflating: data/coco/train2017/000000000394.jpg \n", " inflating: data/coco/train2017/000000000508.jpg \n", " inflating: data/coco/train2017/000000000133.jpg \n", " inflating: data/coco/train2017/000000000595.jpg \n", " inflating: data/coco/train2017/000000000532.jpg \n", " inflating: data/coco/train2017/000000000165.jpg \n", " inflating: data/coco/train2017/000000000384.jpg \n", " inflating: data/coco/train2017/000000000247.jpg \n", " inflating: data/coco/train2017/000000000486.jpg \n", " inflating: data/coco/train2017/000000000078.jpg \n", " inflating: data/coco/train2017/000000000064.jpg \n", " inflating: data/coco/train2017/000000000514.jpg \n", " inflating: data/coco/train2017/000000000321.jpg \n", " inflating: data/coco/train2017/000000000349.jpg \n", " inflating: data/coco/train2017/000000000208.jpg \n", " inflating: 
data/coco/train2017/000000000360.jpg \n", " inflating: data/coco/train2017/000000000151.jpg \n", " inflating: data/coco/train2017/000000000564.jpg \n", " inflating: data/coco/train2017/000000000072.jpg \n", " inflating: data/coco/train2017/000000000370.jpg \n", " inflating: data/coco/train2017/000000000590.jpg \n", " inflating: data/coco/train2017/000000000328.jpg \n", " inflating: data/coco/train2017/000000000446.jpg \n", " inflating: data/coco/train2017/000000000241.jpg \n", " inflating: data/coco/train2017/000000000436.jpg \n", " inflating: data/coco/train2017/000000000074.jpg \n", " inflating: data/coco/train2017/000000000629.jpg \n", " inflating: data/coco/train2017/000000000250.jpg \n", " inflating: data/coco/train2017/000000000283.jpg \n", " inflating: data/coco/train2017/000000000636.jpg \n", " inflating: data/coco/train2017/000000000149.jpg \n", " inflating: data/coco/train2017/000000000625.jpg \n", " inflating: data/coco/train2017/000000000192.jpg \n", " inflating: data/coco/train2017/000000000531.jpg \n", " inflating: data/coco/train2017/000000000322.jpg \n", " inflating: data/coco/train2017/000000000196.jpg \n", " inflating: data/coco/train2017/000000000312.jpg \n", " inflating: data/coco/train2017/000000000650.jpg \n", " inflating: data/coco/train2017/000000000400.jpg \n", " inflating: data/coco/train2017/000000000575.jpg \n", " inflating: data/coco/train2017/000000000397.jpg \n", " inflating: data/coco/train2017/000000000520.jpg \n", " inflating: data/coco/train2017/000000000138.jpg \n", " inflating: data/coco/train2017/000000000326.jpg \n", " inflating: data/coco/train2017/000000000034.jpg \n", " inflating: data/coco/train2017/000000000194.jpg \n", " inflating: data/coco/train2017/000000000643.jpg \n", " inflating: data/coco/train2017/000000000623.jpg \n", " inflating: data/coco/train2017/000000000404.jpg \n", " inflating: data/coco/train2017/000000000164.jpg \n", " inflating: data/coco/train2017/000000000357.jpg \n", " inflating: 
data/coco/train2017/000000000459.jpg \n", " inflating: data/coco/train2017/000000000113.jpg \n", " inflating: data/coco/train2017/000000000109.jpg \n", " inflating: data/coco/train2017/000000000061.jpg \n", " inflating: data/coco/train2017/000000000569.jpg \n", " inflating: data/coco/train2017/000000000144.jpg \n", " inflating: data/coco/train2017/000000000154.jpg \n", " inflating: data/coco/train2017/000000000071.jpg \n", " inflating: data/coco/train2017/000000000359.jpg \n", " inflating: data/coco/train2017/000000000309.jpg \n", " inflating: data/coco/train2017/000000000042.jpg \n", " inflating: data/coco/train2017/000000000597.jpg \n", " inflating: data/coco/train2017/000000000110.jpg \n", " inflating: data/coco/train2017/000000000257.jpg \n", " inflating: data/coco/train2017/000000000589.jpg \n", " inflating: data/coco/train2017/000000000389.jpg \n", " inflating: data/coco/train2017/000000000332.jpg \n", " inflating: data/coco/train2017/000000000536.jpg \n", " inflating: data/coco/train2017/000000000612.jpg \n", " inflating: data/coco/train2017/000000000092.jpg \n", " inflating: data/coco/train2017/000000000382.jpg \n", " inflating: data/coco/train2017/000000000049.jpg \n", " inflating: data/coco/train2017/000000000395.jpg \n", " inflating: data/coco/train2017/000000000428.jpg \n", " inflating: data/coco/train2017/000000000308.jpg \n", " inflating: data/coco/train2017/000000000572.jpg \n", " inflating: data/coco/train2017/000000000510.jpg \n", " inflating: data/coco/train2017/000000000443.jpg \n", " inflating: data/coco/train2017/000000000136.jpg \n", " inflating: data/coco/train2017/000000000584.jpg \n", " inflating: data/coco/train2017/000000000086.jpg \n", " inflating: data/coco/train2017/000000000474.jpg \n", " inflating: data/coco/train2017/000000000073.jpg \n", " inflating: data/coco/train2017/000000000491.jpg \n", " inflating: data/coco/train2017/000000000081.jpg \n", " inflating: data/coco/train2017/000000000094.jpg \n", " inflating: 
data/coco/train2017/000000000502.jpg \n", " inflating: data/coco/train2017/000000000599.jpg \n", " inflating: data/coco/train2017/000000000450.jpg \n", " inflating: data/coco/train2017/000000000260.jpg \n", " inflating: data/coco/train2017/000000000142.jpg \n", " inflating: data/coco/train2017/000000000529.jpg \n", " inflating: data/coco/train2017/000000000488.jpg \n", " inflating: data/coco/train2017/000000000127.jpg \n", " inflating: data/coco/train2017/000000000036.jpg \n", " inflating: data/coco/train2017/000000000338.jpg \n" ] } ], "source": [ "#@title ### 0.3 Download the COCO128 dataset as an example\n", "# COCO128 consists of the first 128 images of the COCO2017 object detection training set\n", "# The original data (YOLO annotation format) comes from https://www.kaggle.com/datasets/ultralytics/coco128\n", "# Script used to convert it to the COCO annotation format: https://github.com/open-mmlab/mmyolo/blob/main/tools/dataset_converters/yolo2coco.py\n", "!mkdir -p data\n", "!wget -P data https://extremevision-js-userfile.oss-cn-hangzhou.aliyuncs.com/user-25117-files/ff6429ed-0f22-4759-aa1e-f3d3f8dc4b40/coco.zip\n", "!unzip data/coco.zip -d data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "CaQ_SvXEmiYN" }, "outputs": [], "source": [ "#@title ### 0.4 Prepare the base config file for the demo\n", "train_batch_size_per_gpu = 2 #@param {type:\"integer\"}\n", "train_num_workers = 2 #@param {type:\"integer\"}\n", "max_epochs = 1 #@param {type:\"integer\"}\n", "#@markdown >persistent_workers must be False if num_workers is 0.\n", "persistent_workers = True #@param {type:\"boolean\"}\n", "\n", "\n", "config_base=f\"\"\"\n", "_base_ = './yolov5_s-v61_syncbn_8xb16-300e_coco.py'\n", "\n", "train_batch_size_per_gpu = {train_batch_size_per_gpu}\n", "train_num_workers = {train_num_workers}\n", "persistent_workers = {persistent_workers}\n", "max_epochs = {max_epochs} # (max_epochs is set to 1 for demonstration only)\n", "\n", "train_dataloader = dict(\n", " batch_size=train_batch_size_per_gpu,\n", " num_workers=train_num_workers,\n", " persistent_workers=persistent_workers)\n", "\n", "optim_wrapper = dict(\n", " optimizer=dict(\n", " batch_size_per_gpu=train_batch_size_per_gpu))\n", "\n", "default_hooks = dict(\n", " 
param_scheduler=dict(\n", " type='YOLOv5ParamSchedulerHook',\n", " max_epochs=max_epochs))\n", "\n", "train_cfg = dict(\n", " type='EpochBasedTrainLoop',\n", " max_epochs=max_epochs)\n", "\"\"\"\n", "\n", "# Write the base config to a file\n", "with open('./configs/yolov5/yolov5_s-v61_1xb2-1e_coco128.py', 'w') as f:\n", " f.write(config_base)" ] }, { "cell_type": "markdown", "metadata": { "id": "BthrwomDvUbD" }, "source": [ "## 1 Use a backbone registered in MMYOLO\n", "\n", "
\n", "Supported algorithms\n", "\n", "- [x] [YOLOv5](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolov5)\n", "- [x] [YOLOv6](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolov6)\n", "- [x] [YOLOv7](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolov7)\n", "- [x] [YOLOX](https://github.com/open-mmlab/mmyolo/tree/main/configs/yolox)\n", "- [x] [PPYOLO-E](https://github.com/open-mmlab/mmyolo/tree/main/configs/ppyoloe)\n", "- [x] [RTMDet](https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet)\n", "\n", "
\n", "\n", "
\n", "[Figure: EfficientRep]\n", "
\n", "\n", "In YOLOv6, Meituan designed an efficient backbone based on Rep operators. Compared with the CSP backbone used in YOLOv5, this backbone makes efficient use of hardware (such as GPU) compute while retaining strong representational capacity[1](https://tech.meituan.com/2022/06/23/yolov6-a-fast-and-accurate-target-detection-framework-is-opening-source.html).\n", "\n", "To use `YOLOv6EfficientRep` as the backbone of `YOLOv5`, the config file is as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hHqwED0aHfzW" }, "outputs": [], "source": [ "_base_ = './yolov5_s-v61_1xb2-1e_coco128.py'\n", "\n", "# Only the backbone fields below are overridden; everything else is inherited from _base_\n", "model = dict(\n", " backbone=dict(\n", " type='YOLOv6EfficientRep',\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='ReLU', inplace=True))\n", ")\n", "\n", "config_YOLOv6EfficientRep = f\"\"\"\n", "_base_=\\'{_base_}\\'\n", "model={model}\n", "\"\"\"\n", "\n", "# Write the config to a file\n", "with open('./configs/yolov5/yolov5_s_efficientrep-v61_1xb2-1e_coco128.py', 'w') as f:\n", " f.write(config_YOLOv6EfficientRep)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "jEZj85YNoGg4", "outputId": "65d42d74-a09e-40f6-f4a6-ceb89149424c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "11/20 08:02:42 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - Failed to search registry with scope \"mmyolo\" in the \"log_processor\" registry tree. As a workaround, the current \"log_processor\" registry in \"mmengine\" is used to build instance. This may cause unexpected failure when running the built modules. 
Please check whether \"mmyolo\" is a correct scope, or whether the registry is initialized.\n", "11/20 08:02:42 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - \n", "------------------------------------------------------------\n", "System environment:\n", " sys.platform: linux\n", " Python: 3.7.15 (default, Oct 12 2022, 19:14:55) [GCC 7.5.0]\n", " CUDA available: True\n", " numpy_random_seed: 648320955\n", " GPU 0: Tesla T4\n", " CUDA_HOME: /usr/local/cuda\n", " NVCC: Cuda compilation tools, release 11.2, V11.2.152\n", " GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0\n", " PyTorch: 1.12.1+cu113\n", " PyTorch compiling details: PyTorch built with:\n", " - GCC 9.3\n", " - C++ Version: 201402\n", " - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n", " - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n", " - OpenMP 201511 (a.k.a. OpenMP 4.5)\n", " - LAPACK is enabled (usually provided by MKL)\n", " - NNPACK is enabled\n", " - CPU capability usage: AVX2\n", " - CUDA Runtime 11.3\n", " - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n", " - CuDNN 8.3.2 (built against CUDA 11.5)\n", " - Magma 2.5.2\n", " - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas 
-Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n", "\n", " TorchVision: 0.13.1+cu113\n", " OpenCV: 4.6.0\n", " MMEngine: 0.3.1\n", "\n", "Runtime environment:\n", " cudnn_benchmark: True\n", " mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}\n", " dist_cfg: {'backend': 'nccl'}\n", " seed: None\n", " Distributed launcher: none\n", " Distributed training: False\n", " GPU number: 1\n", "------------------------------------------------------------\n", "\n", "11/20 08:02:43 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Config:\n", "default_scope = 'mmyolo'\n", "default_hooks = dict(\n", " timer=dict(type='IterTimerHook'),\n", " logger=dict(type='LoggerHook', interval=50),\n", " param_scheduler=dict(\n", " type='YOLOv5ParamSchedulerHook',\n", " scheduler_type='linear',\n", " lr_factor=0.01,\n", " max_epochs=1),\n", " checkpoint=dict(\n", " type='CheckpointHook', interval=10, save_best='auto',\n", " max_keep_ckpts=3),\n", " sampler_seed=dict(type='DistSamplerSeedHook'),\n", " visualization=dict(type='mmdet.DetVisualizationHook'))\n", "env_cfg = dict(\n", " cudnn_benchmark=True,\n", " mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),\n", " dist_cfg=dict(backend='nccl'))\n", "vis_backends = [dict(type='LocalVisBackend')]\n", "visualizer = dict(\n", " 
type='mmdet.DetLocalVisualizer',\n", " vis_backends=[dict(type='LocalVisBackend')],\n", " name='visualizer')\n", "log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)\n", "log_level = 'INFO'\n", "load_from = None\n", "resume = False\n", "file_client_args = dict(backend='disk')\n", "data_root = 'data/coco/'\n", "dataset_type = 'YOLOv5CocoDataset'\n", "num_classes = 80\n", "img_scale = (640, 640)\n", "deepen_factor = 0.33\n", "widen_factor = 0.5\n", "max_epochs = 1\n", "save_epoch_intervals = 10\n", "train_batch_size_per_gpu = 2\n", "train_num_workers = 2\n", "val_batch_size_per_gpu = 1\n", "val_num_workers = 2\n", "persistent_workers = True\n", "batch_shapes_cfg = dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)\n", "anchors = [[(10, 13), (16, 30), (33, 23)], [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]]\n", "strides = [8, 16, 32]\n", "num_det_layers = 3\n", "model = dict(\n", " type='YOLODetector',\n", " data_preprocessor=dict(\n", " type='mmdet.DetDataPreprocessor',\n", " mean=[0.0, 0.0, 0.0],\n", " std=[255.0, 255.0, 255.0],\n", " bgr_to_rgb=True),\n", " backbone=dict(\n", " type='YOLOv6EfficientRep',\n", " deepen_factor=0.33,\n", " widen_factor=0.5,\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='ReLU', inplace=True)),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " deepen_factor=0.33,\n", " widen_factor=0.5,\n", " in_channels=[256, 512, 1024],\n", " out_channels=[256, 512, 1024],\n", " num_csp_blocks=3,\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='SiLU', inplace=True)),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " num_classes=80,\n", " in_channels=[256, 512, 1024],\n", " widen_factor=0.5,\n", " featmap_strides=[8, 16, 32],\n", " num_base_priors=3),\n", " prior_generator=dict(\n", " 
type='mmdet.YOLOAnchorGenerator',\n", " base_sizes=[[(10, 13), (16, 30), (33, 23)],\n", " [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]],\n", " strides=[8, 16, 32]),\n", " loss_cls=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=0.5),\n", " loss_bbox=dict(\n", " type='IoULoss',\n", " iou_mode='ciou',\n", " bbox_format='xywh',\n", " eps=1e-07,\n", " reduction='mean',\n", " loss_weight=0.05,\n", " return_iou=True),\n", " loss_obj=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=1.0),\n", " prior_match_thr=4.0,\n", " obj_level_weights=[4.0, 1.0, 0.4]),\n", " test_cfg=dict(\n", " multi_label=True,\n", " nms_pre=30000,\n", " score_thr=0.001,\n", " nms=dict(type='nms', iou_threshold=0.65),\n", " max_per_img=300))\n", "albu_train_transforms = [\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", "]\n", "pre_transform = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", "]\n", "train_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " 
dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',\n", " 'flip_direction'))\n", "]\n", "train_dataloader = dict(\n", " batch_size=2,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " sampler=dict(type='DefaultSampler', shuffle=True),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " ann_file='annotations/instances_train2017.json',\n", " data_prefix=dict(img='train2017/'),\n", " filter_cfg=dict(filter_empty_gt=False, min_size=32),\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " 
type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'flip', 'flip_direction'))\n", " ]))\n", "test_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", "]\n", "val_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "test_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " 
test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "param_scheduler = None\n", "optim_wrapper = dict(\n", " type='OptimWrapper',\n", " optimizer=dict(\n", " type='SGD',\n", " lr=0.01,\n", " momentum=0.937,\n", " weight_decay=0.0005,\n", " nesterov=True,\n", " batch_size_per_gpu=2),\n", " constructor='YOLOv5OptimizerConstructor')\n", "custom_hooks = [\n", " dict(\n", " type='EMAHook',\n", " ema_type='ExpMomentumEMA',\n", " momentum=0.0001,\n", " update_buffers=True,\n", " strict_load=False,\n", " priority=49)\n", "]\n", "val_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "test_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1, val_interval=10)\n", "val_cfg = dict(type='ValLoop')\n", "test_cfg = dict(type='TestLoop')\n", "launcher = 'none'\n", "work_dir = './work_dirs/yolov5_s_efficientrep-v61_1xb2-1e_coco128'\n", "\n", "11/20 08:02:43 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Result has been saved to 
/content/mmyolo/work_dirs/yolov5_s_efficientrep-v61_1xb2-1e_coco128/modules_statistic_results.json\n", "11/20 08:02:48 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.\n", "loading annotations into memory...\n", "Done (t=0.00s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.00s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.00s)\n", "creating index...\n", "index created!\n", "11/20 08:02:50 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Checkpoints will be saved to /content/mmyolo/work_dirs/yolov5_s_efficientrep-v61_1xb2-1e_coco128.\n", "11/20 08:03:06 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Epoch(train) [1][50/63] lr: 4.9000e-04 eta: 0:00:04 time: 0.3141 data_time: 0.0075 memory: 4902 loss: 0.5655 loss_cls: 0.2104 loss_obj: 0.1342 loss_bbox: 0.2209\n", "11/20 08:03:08 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Exp name: yolov5_s_efficientrep-v61_1xb2-1e_coco128_20221120_080242\n", "11/20 08:03:08 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Saving checkpoint at 1 epochs\n", "11/20 08:03:09 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - `save_param_scheduler` is True but `self.param_schedulers` is None, so skip saving parameter schedulers\n" ] } ], "source": [ "# Start training\n", "!python tools/train.py configs/yolov5/yolov5_s_efficientrep-v61_1xb2-1e_coco128.py" ] }, { "cell_type": "markdown", "metadata": { "id": "OgEWPiW2qXsC" }, "source": [ "## 2 Use backbones across libraries\n", "In the OpenMMLab 2.0 ecosystem, the model registries of MMYOLO, MMDetection, MMClassification, and MMSelfsup all inherit from the root registry in MMEngine, which lets these OpenMMLab libraries use each other's modules directly. Users can therefore use backbones from MMDetection, MMClassification, and MMSelfsup in MMYOLO without reimplementing them, by prefixing the module `type` with the library scope (for example `mmdet.SwinTransformer`)." ] }, { "cell_type": "markdown", "metadata": { "id": "lqhhOC74t3IA" }, "source": [ "### 2.1 Use a backbone implemented in MMDetection (SwinTransformer)\n", 
"\n", "MMDetection [![GitHub release](https://img.shields.io/github/release/open-mmlab/mmdetection.svg)](https://GitHub.com/open-mmlab/mmdetection/) currently supports 16 backbones:\n", "
\n", "Supported backbones\n", "\n", "
- VGG (ICLR'2015)\n", "
- ResNet (CVPR'2016)\n", "
- ResNeXt (CVPR'2017)\n", "
- MobileNetV2 (CVPR'2018)\n", "
- HRNet (CVPR'2019)\n", "
- Generalized Attention (ICCV'2019)\n", "
- GCNet (ICCVW'2019)\n", "
- Res2Net (TPAMI'2020)\n", "
- RegNet (CVPR'2020)\n", "
- ResNeSt (ArXiv'2020)\n", "
- PVT (ICCV'2021)\n", "
- Swin (ICCV'2021)\n", "
- PVTv2 (ArXiv'2021)\n", "
- ResNet strikes back (ArXiv'2021)\n", "
- EfficientNet (ICML'2019)\n", "
- ConvNeXt (CVPR'2022)\n", "\n", "
Swin Transformer not only won the ICCV 2021 best paper award, it also shows up in all kinds of computer vision competitions.\n", "\n", "To use `SwinTransformer-Tiny` as the backbone of `YOLOv5`, the config file is as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "yLAi4JYArBI-" }, "outputs": [], "source": [ "_base_ = './yolov5_s-v61_1xb2-1e_coco128.py'\n", "\n", "widen_factor = 1.0\n", "channels = [192, 384, 768]\n", "checkpoint_file = 'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth' # noqa\n", "\n", "model = dict(\n", " backbone=dict(\n", " _delete_=True, # Delete the backbone fields inherited from _base_\n", " type='mmdet.SwinTransformer', # Use the SwinTransformer implemented in mmdet\n", " embed_dims=96,\n", " depths=[2, 2, 6, 2],\n", " num_heads=[3, 6, 12, 24],\n", " window_size=7,\n", " mlp_ratio=4,\n", " qkv_bias=True,\n", " qk_scale=None,\n", " drop_rate=0.,\n", " attn_drop_rate=0.,\n", " drop_path_rate=0.2,\n", " patch_norm=True,\n", " out_indices=(1, 2, 3),\n", " with_cp=False,\n", " convert_weights=True,\n", " init_cfg=dict(type='Pretrained', checkpoint=checkpoint_file)),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " widen_factor=widen_factor,\n", " in_channels=channels, # Note: SwinTransformer-Tiny outputs three feature maps with [192, 384, 768] channels, which do not match the original yolov5-s neck, so this must be changed\n", " out_channels=channels),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " in_channels=channels, # The head input channels must be changed accordingly\n", " widen_factor=widen_factor))\n", ")\n", "\n", "config_swin_t = f\"\"\"\n", "_base_=\\'{_base_}\\'\n", "widen_factor={widen_factor}\n", "checkpoint_file=\\'{checkpoint_file}\\'\n", "model={model}\n", "\"\"\"\n", "\n", "# Write the config to a file\n", "with open('./configs/yolov5/yolov5_s_swin_t-v61_1xb2-1e_coco128.py', 'w') as f:\n", " f.write(config_swin_t)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "og5JqsH2rdw8", "outputId": "658b61cb-6eff-40e5-fb35-a9ac820cb181" }, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "11/19 11:26:33 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - Failed to search registry with scope \"mmyolo\" in the \"log_processor\" registry tree. As a workaround, the current \"log_processor\" registry in \"mmengine\" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether \"mmyolo\" is a correct scope, or whether the registry is initialized.\n", "11/19 11:26:33 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - \n", "------------------------------------------------------------\n", "System environment:\n", " sys.platform: linux\n", " Python: 3.7.15 (default, Oct 12 2022, 19:14:55) [GCC 7.5.0]\n", " CUDA available: True\n", " numpy_random_seed: 666285573\n", " GPU 0: Tesla T4\n", " CUDA_HOME: /usr/local/cuda\n", " NVCC: Cuda compilation tools, release 11.2, V11.2.152\n", " GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0\n", " PyTorch: 1.12.1+cu113\n", " PyTorch compiling details: PyTorch built with:\n", " - GCC 9.3\n", " - C++ Version: 201402\n", " - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n", " - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n", " - OpenMP 201511 (a.k.a. 
OpenMP 4.5)\n", " - LAPACK is enabled (usually provided by MKL)\n", " - NNPACK is enabled\n", " - CPU capability usage: AVX2\n", " - CUDA Runtime 11.3\n", " - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n", " - CuDNN 8.3.2 (built against CUDA 11.5)\n", " - Magma 2.5.2\n", " - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n", "\n", " TorchVision: 0.13.1+cu113\n", " OpenCV: 4.6.0\n", " MMEngine: 0.3.1\n", "\n", "Runtime environment:\n", " cudnn_benchmark: True\n", " mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}\n", " 
dist_cfg: {'backend': 'nccl'}\n", " seed: None\n", " Distributed launcher: none\n", " Distributed training: False\n", " GPU number: 1\n", "------------------------------------------------------------\n", "\n", "11/19 11:26:34 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Config:\n", "default_scope = 'mmyolo'\n", "default_hooks = dict(\n", " timer=dict(type='IterTimerHook'),\n", " logger=dict(type='LoggerHook', interval=50),\n", " param_scheduler=dict(\n", " type='YOLOv5ParamSchedulerHook',\n", " scheduler_type='linear',\n", " lr_factor=0.01,\n", " max_epochs=1),\n", " checkpoint=dict(\n", " type='CheckpointHook', interval=10, save_best='auto',\n", " max_keep_ckpts=3),\n", " sampler_seed=dict(type='DistSamplerSeedHook'),\n", " visualization=dict(type='mmdet.DetVisualizationHook'))\n", "env_cfg = dict(\n", " cudnn_benchmark=True,\n", " mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),\n", " dist_cfg=dict(backend='nccl'))\n", "vis_backends = [dict(type='LocalVisBackend')]\n", "visualizer = dict(\n", " type='mmdet.DetLocalVisualizer',\n", " vis_backends=[dict(type='LocalVisBackend')],\n", " name='visualizer')\n", "log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)\n", "log_level = 'INFO'\n", "load_from = None\n", "resume = False\n", "file_client_args = dict(backend='disk')\n", "data_root = 'data/coco/'\n", "dataset_type = 'YOLOv5CocoDataset'\n", "num_classes = 80\n", "img_scale = (640, 640)\n", "deepen_factor = 0.33\n", "widen_factor = 1.0\n", "max_epochs = 1\n", "save_epoch_intervals = 10\n", "train_batch_size_per_gpu = 2\n", "train_num_workers = 2\n", "val_batch_size_per_gpu = 1\n", "val_num_workers = 2\n", "persistent_workers = True\n", "batch_shapes_cfg = dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)\n", "anchors = [[(10, 13), (16, 30), (33, 23)], [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]]\n", "strides = [8, 16, 32]\n", 
"num_det_layers = 3\n", "model = dict(\n", " type='YOLODetector',\n", " data_preprocessor=dict(\n", " type='mmdet.DetDataPreprocessor',\n", " mean=[0.0, 0.0, 0.0],\n", " std=[255.0, 255.0, 255.0],\n", " bgr_to_rgb=True),\n", " backbone=dict(\n", " type='mmdet.SwinTransformer',\n", " embed_dims=96,\n", " depths=[2, 2, 6, 2],\n", " num_heads=[3, 6, 12, 24],\n", " window_size=7,\n", " mlp_ratio=4,\n", " qkv_bias=True,\n", " qk_scale=None,\n", " drop_rate=0.0,\n", " attn_drop_rate=0.0,\n", " drop_path_rate=0.2,\n", " patch_norm=True,\n", " out_indices=(1, 2, 3),\n", " with_cp=False,\n", " convert_weights=True,\n", " init_cfg=dict(\n", " type='Pretrained',\n", " checkpoint=\n", " 'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth'\n", " )),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " deepen_factor=0.33,\n", " widen_factor=1.0,\n", " in_channels=[192, 384, 768],\n", " out_channels=[192, 384, 768],\n", " num_csp_blocks=3,\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='SiLU', inplace=True)),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " num_classes=80,\n", " in_channels=[192, 384, 768],\n", " widen_factor=1.0,\n", " featmap_strides=[8, 16, 32],\n", " num_base_priors=3),\n", " prior_generator=dict(\n", " type='mmdet.YOLOAnchorGenerator',\n", " base_sizes=[[(10, 13), (16, 30), (33, 23)],\n", " [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]],\n", " strides=[8, 16, 32]),\n", " loss_cls=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=0.5),\n", " loss_bbox=dict(\n", " type='IoULoss',\n", " iou_mode='ciou',\n", " bbox_format='xywh',\n", " eps=1e-07,\n", " reduction='mean',\n", " loss_weight=0.05,\n", " return_iou=True),\n", " loss_obj=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " 
loss_weight=1.0),\n", " prior_match_thr=4.0,\n", " obj_level_weights=[4.0, 1.0, 0.4]),\n", " test_cfg=dict(\n", " multi_label=True,\n", " nms_pre=30000,\n", " score_thr=0.001,\n", " nms=dict(type='nms', iou_threshold=0.65),\n", " max_per_img=300))\n", "albu_train_transforms = [\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", "]\n", "pre_transform = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", "]\n", "train_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',\n", " 'flip_direction'))\n", "]\n", "train_dataloader = dict(\n", " batch_size=2,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " sampler=dict(type='DefaultSampler', 
shuffle=True),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " ann_file='annotations/instances_train2017.json',\n", " data_prefix=dict(img='train2017/'),\n", " filter_cfg=dict(filter_empty_gt=False, min_size=32),\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'flip', 'flip_direction'))\n", " ]))\n", "test_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 
'pad_param'))\n", "]\n", "val_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "test_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " 
],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "param_scheduler = None\n", "optim_wrapper = dict(\n", " type='OptimWrapper',\n", " optimizer=dict(\n", " type='SGD',\n", " lr=0.01,\n", " momentum=0.937,\n", " weight_decay=0.0005,\n", " nesterov=True,\n", " batch_size_per_gpu=2),\n", " constructor='YOLOv5OptimizerConstructor')\n", "custom_hooks = [\n", " dict(\n", " type='EMAHook',\n", " ema_type='ExpMomentumEMA',\n", " momentum=0.0001,\n", " update_buffers=True,\n", " strict_load=False,\n", " priority=49)\n", "]\n", "val_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "test_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1, val_interval=10)\n", "val_cfg = dict(type='ValLoop')\n", "test_cfg = dict(type='TestLoop')\n", "checkpoint_file = 'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth'\n", "launcher = 'none'\n", "work_dir = './work_dirs/yolov5_s_swin_t-v61_1xb2-1e_coco128'\n", "\n", "11/19 11:26:34 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Result has been saved to /content/mmyolo/work_dirs/yolov5_s_swin_t-v61_1xb2-1e_coco128/modules_statistic_results.json\n", "11/19 11:26:38 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "loading annotations into 
memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "11/19 11:26:45 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - http loads checkpoint from path: https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth\n", "Downloading: \"https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth\" to /root/.cache/torch/hub/checkpoints/swin_tiny_patch4_window7_224.pth\n", "100% 109M/109M [00:13<00:00, 8.74MB/s]\n", "11/19 11:27:00 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Checkpoints will be saved to /content/mmyolo/work_dirs/yolov5_s_swin_t-v61_1xb2-1e_coco128.\n", "11/19 11:27:19 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Epoch(train) [1][50/63] lr: 4.9000e-04 eta: 0:00:04 time: 0.3812 data_time: 0.0055 memory: 8502 loss: 0.5635 loss_cls: 0.2091 loss_obj: 0.1338 loss_bbox: 0.2205\n", "11/19 11:27:23 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Exp name: yolov5_s_swin_t-v61_1xb2-1e_coco128_20221119_112633\n", "11/19 11:27:23 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Saving checkpoint at 1 epochs\n", "11/19 11:27:25 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - `save_param_scheduler` is True but `self.param_schedulers` is None, so skip saving parameter schedulers\n" ] } ], "source": [ "# Start training\n", "!python tools/train.py configs/yolov5/yolov5_s_swin_t-v61_1xb2-1e_coco128.py" ] }, { "cell_type": "markdown", "metadata": { "id": "Q3MIUWpCKWM4" }, "source": [ "### 2.2 Using a backbone implemented in MMClassification (MobileNetV3)\n", "Think the SwinTransformer model is too large?\n", "\n", "MMClassification [![GitHub release](https://img.shields.io/github/release/open-mmlab/mmclassification.svg)](https://GitHub.com/open-mmlab/mmclassification/) currently supports 30 backbones, many of them lightweight:\n", "
    \n", "Supported backbones\n", "\n", "- [x] [VGG](https://github.com/open-mmlab/mmclassification/tree/master/configs/vgg)\n", "- [x] [ResNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnet)\n", "- [x] [ResNeXt](https://github.com/open-mmlab/mmclassification/tree/master/configs/resnext)\n", "- [x] [SE-ResNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet)\n", "- [x] [SE-ResNeXt](https://github.com/open-mmlab/mmclassification/tree/master/configs/seresnet)\n", "- [x] [RegNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/regnet)\n", "- [x] [ShuffleNetV1](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v1)\n", "- [x] [ShuffleNetV2](https://github.com/open-mmlab/mmclassification/tree/master/configs/shufflenet_v2)\n", "- [x] [MobileNetV2](https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v2)\n", "- [x] [MobileNetV3](https://github.com/open-mmlab/mmclassification/tree/master/configs/mobilenet_v3)\n", "- [x] [Swin-Transformer](https://github.com/open-mmlab/mmclassification/tree/master/configs/swin_transformer)\n", "- [x] [RepVGG](https://github.com/open-mmlab/mmclassification/tree/master/configs/repvgg)\n", "- [x] [Vision-Transformer](https://github.com/open-mmlab/mmclassification/tree/master/configs/vision_transformer)\n", "- [x] [Transformer-in-Transformer](https://github.com/open-mmlab/mmclassification/tree/master/configs/tnt)\n", "- [x] [Res2Net](https://github.com/open-mmlab/mmclassification/tree/master/configs/res2net)\n", "- [x] [MLP-Mixer](https://github.com/open-mmlab/mmclassification/tree/master/configs/mlp_mixer)\n", "- [x] [DeiT](https://github.com/open-mmlab/mmclassification/tree/master/configs/deit)\n", "- [x] [Conformer](https://github.com/open-mmlab/mmclassification/tree/master/configs/conformer)\n", "- [x] [T2T-ViT](https://github.com/open-mmlab/mmclassification/tree/master/configs/t2t_vit)\n", "- [x] 
[Twins](https://github.com/open-mmlab/mmclassification/tree/master/configs/twins)\n", "- [x] [EfficientNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/efficientnet)\n", "- [x] [ConvNeXt](https://github.com/open-mmlab/mmclassification/tree/master/configs/convnext)\n", "- [x] [HRNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/hrnet)\n", "- [x] [VAN](https://github.com/open-mmlab/mmclassification/tree/master/configs/van)\n", "- [x] [ConvMixer](https://github.com/open-mmlab/mmclassification/tree/master/configs/convmixer)\n", "- [x] [CSPNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/cspnet)\n", "- [x] [PoolFormer](https://github.com/open-mmlab/mmclassification/tree/master/configs/poolformer)\n", "- [x] [MViT](https://github.com/open-mmlab/mmclassification/tree/master/configs/mvit)\n", "- [x] [EfficientFormer](https://github.com/open-mmlab/mmclassification/tree/master/configs/efficientformer)\n", "- [x] [HorNet](https://github.com/open-mmlab/mmclassification/tree/master/configs/hornet)\n", "\n", "
    \n", "\n", "To use a backbone implemented in MMClassification, install mmcls first." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "nhdBMPTUrmYv", "outputId": "b26f4276-5501-4fce-86ea-ba638acbfeb2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Looking in links: https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html\n", "Collecting mmcls>=1.0.0rc2\n", " Downloading mmcls-1.0.0rc2-py2.py3-none-any.whl (642 kB)\n", "\u001b[K |████████████████████████████████| 642 kB 26.2 MB/s \n", "\u001b[?25hRequirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc2) (1.21.6)\n", "Requirement already satisfied: packaging in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc2) (21.3)\n", "Requirement already satisfied: rich in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc2) (12.6.0)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc2) (3.2.2)\n", "Requirement already satisfied: mmengine<1.0.0,>=0.2.0 in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc2) (0.3.1)\n", "Requirement already satisfied: mmcv<=2.0.0,>=2.0.0rc1 in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc2) (2.0.0rc2)\n", "Requirement already satisfied: opencv-python>=3 in /usr/local/lib/python3.7/dist-packages (from mmcv<=2.0.0,>=2.0.0rc1->mmcls>=1.0.0rc2) (4.6.0.66)\n", "Requirement already satisfied: yapf in /usr/local/lib/python3.7/dist-packages (from mmcv<=2.0.0,>=2.0.0rc1->mmcls>=1.0.0rc2) (0.32.0)\n", "Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from mmcv<=2.0.0,>=2.0.0rc1->mmcls>=1.0.0rc2) (7.1.2)\n", "Requirement already satisfied: addict in /usr/local/lib/python3.7/dist-packages (from mmcv<=2.0.0,>=2.0.0rc1->mmcls>=1.0.0rc2) 
(2.4.0)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from mmcv<=2.0.0,>=2.0.0rc1->mmcls>=1.0.0rc2) (6.0)\n", "Requirement already satisfied: termcolor in /usr/local/lib/python3.7/dist-packages (from mmengine<1.0.0,>=0.2.0->mmcls>=1.0.0rc2) (2.1.0)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmcls>=1.0.0rc2) (0.11.0)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmcls>=1.0.0rc2) (1.4.4)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmcls>=1.0.0rc2) (2.8.2)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmcls>=1.0.0rc2) (3.0.9)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->mmcls>=1.0.0rc2) (4.1.1)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->mmcls>=1.0.0rc2) (1.15.0)\n", "Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich->mmcls>=1.0.0rc2) (2.6.1)\n", "Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in /usr/local/lib/python3.7/dist-packages (from rich->mmcls>=1.0.0rc2) (0.9.1)\n", "Installing collected packages: mmcls\n", "Successfully installed mmcls-1.0.0rc2\n" ] } ], "source": [ "# Install mmcls\n", "!mim install \"mmcls>=1.0.0rc2\"" ] }, { "cell_type": "markdown", "metadata": { "id": "RQFq4lu-rnCz" }, "source": [ "To use `MobileNetV3-small` as the backbone of `YOLOv5`, the config file looks like this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rQmnfdUprnuM" }, "outputs": [], "source": [ "_base_ = './yolov5_s-v61_1xb2-1e_coco128.py'\n", "\n", "# Import mmcls.models so that the modules registered in mmcls can be used\n", "custom_imports = 
dict(imports=['mmcls.models'], allow_failed_imports=False)\n", "checkpoint_file = 'https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth' # noqa\n", "widen_factor = 1.0\n", "channels = [24, 48, 96]\n", "\n", "model = dict(\n", " backbone=dict(\n", " _delete_=True, # drop the backbone fields inherited from _base_\n", " type='mmcls.MobileNetV3', # use the MobileNetV3 from mmcls\n", " arch='small',\n", " out_indices=(3, 8, 11), # modified out_indices\n", " init_cfg=dict(\n", " type='Pretrained',\n", " checkpoint=checkpoint_file,\n", " prefix='backbone.')), # Backbone weights in MMCls checkpoints carry the prefix 'backbone.'; setting prefix='backbone.' strips it so the weights load correctly.\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " widen_factor=widen_factor,\n", " in_channels=channels, # Note: MobileNetV3-small outputs three feature maps with [24, 48, 96] channels, which do not match the original yolov5-s neck, so this must be changed\n", " out_channels=channels),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " in_channels=channels, # the input channels of the head must be changed accordingly\n", " widen_factor=widen_factor))\n", ")\n", "\n", "# custom_imports is written out as well, so the generated config imports mmcls.models explicitly\n", "config_mobilenetv3_s = f\"\"\"\n", "_base_=\\'{_base_}\\'\n", "custom_imports={custom_imports}\n", "widen_factor={widen_factor}\n", "checkpoint_file=\\'{checkpoint_file}\\'\n", "model={model}\n", "\"\"\"\n", "\n", "with open('./configs/yolov5/yolov5_s_mobilenetv3_s-v61_1xb2-1e_coco128.py', 'w') as f:\n", " f.write(config_mobilenetv3_s)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "fZC2BXgyr8br", "outputId": "51bd676d-4e63-4e38-befd-d7c49e2438c9" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "11/19 11:27:50 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - Failed to search registry with scope \"mmyolo\" in the \"log_processor\" registry tree. As a workaround, the current \"log_processor\" registry in \"mmengine\" is used to build instance. This may cause unexpected failure when running the built modules. 
Please check whether \"mmyolo\" is a correct scope, or whether the registry is initialized.\n", "11/19 11:27:50 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - \n", "------------------------------------------------------------\n", "System environment:\n", " sys.platform: linux\n", " Python: 3.7.15 (default, Oct 12 2022, 19:14:55) [GCC 7.5.0]\n", " CUDA available: True\n", " numpy_random_seed: 896685858\n", " GPU 0: Tesla T4\n", " CUDA_HOME: /usr/local/cuda\n", " NVCC: Cuda compilation tools, release 11.2, V11.2.152\n", " GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0\n", " PyTorch: 1.12.1+cu113\n", " PyTorch compiling details: PyTorch built with:\n", " - GCC 9.3\n", " - C++ Version: 201402\n", " - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n", " - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n", " - OpenMP 201511 (a.k.a. OpenMP 4.5)\n", " - LAPACK is enabled (usually provided by MKL)\n", " - NNPACK is enabled\n", " - CPU capability usage: AVX2\n", " - CUDA Runtime 11.3\n", " - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n", " - CuDNN 8.3.2 (built against CUDA 11.5)\n", " - Magma 2.5.2\n", " - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas 
-Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n", "\n", " TorchVision: 0.13.1+cu113\n", " OpenCV: 4.6.0\n", " MMEngine: 0.3.1\n", "\n", "Runtime environment:\n", " cudnn_benchmark: True\n", " mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}\n", " dist_cfg: {'backend': 'nccl'}\n", " seed: None\n", " Distributed launcher: none\n", " Distributed training: False\n", " GPU number: 1\n", "------------------------------------------------------------\n", "\n", "11/19 11:27:51 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Config:\n", "default_scope = 'mmyolo'\n", "default_hooks = dict(\n", " timer=dict(type='IterTimerHook'),\n", " logger=dict(type='LoggerHook', interval=50),\n", " param_scheduler=dict(\n", " type='YOLOv5ParamSchedulerHook',\n", " scheduler_type='linear',\n", " lr_factor=0.01,\n", " max_epochs=1),\n", " checkpoint=dict(\n", " type='CheckpointHook', interval=10, save_best='auto',\n", " max_keep_ckpts=3),\n", " sampler_seed=dict(type='DistSamplerSeedHook'),\n", " visualization=dict(type='mmdet.DetVisualizationHook'))\n", "env_cfg = dict(\n", " cudnn_benchmark=True,\n", " mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),\n", " dist_cfg=dict(backend='nccl'))\n", "vis_backends = [dict(type='LocalVisBackend')]\n", "visualizer = dict(\n", " 
type='mmdet.DetLocalVisualizer',\n", " vis_backends=[dict(type='LocalVisBackend')],\n", " name='visualizer')\n", "log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)\n", "log_level = 'INFO'\n", "load_from = None\n", "resume = False\n", "file_client_args = dict(backend='disk')\n", "data_root = 'data/coco/'\n", "dataset_type = 'YOLOv5CocoDataset'\n", "num_classes = 80\n", "img_scale = (640, 640)\n", "deepen_factor = 0.33\n", "widen_factor = 1.0\n", "max_epochs = 1\n", "save_epoch_intervals = 10\n", "train_batch_size_per_gpu = 2\n", "train_num_workers = 2\n", "val_batch_size_per_gpu = 1\n", "val_num_workers = 2\n", "persistent_workers = True\n", "batch_shapes_cfg = dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)\n", "anchors = [[(10, 13), (16, 30), (33, 23)], [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]]\n", "strides = [8, 16, 32]\n", "num_det_layers = 3\n", "model = dict(\n", " type='YOLODetector',\n", " data_preprocessor=dict(\n", " type='mmdet.DetDataPreprocessor',\n", " mean=[0.0, 0.0, 0.0],\n", " std=[255.0, 255.0, 255.0],\n", " bgr_to_rgb=True),\n", " backbone=dict(\n", " type='mmcls.MobileNetV3',\n", " arch='small',\n", " out_indices=(3, 8, 11),\n", " init_cfg=dict(\n", " type='Pretrained',\n", " checkpoint=\n", " 'https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth',\n", " prefix='backbone.')),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " deepen_factor=0.33,\n", " widen_factor=1.0,\n", " in_channels=[24, 48, 96],\n", " out_channels=[24, 48, 96],\n", " num_csp_blocks=3,\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='SiLU', inplace=True)),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " num_classes=80,\n", " in_channels=[24, 48, 96],\n", " widen_factor=1.0,\n", " featmap_strides=[8, 
16, 32],\n", " num_base_priors=3),\n", " prior_generator=dict(\n", " type='mmdet.YOLOAnchorGenerator',\n", " base_sizes=[[(10, 13), (16, 30), (33, 23)],\n", " [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]],\n", " strides=[8, 16, 32]),\n", " loss_cls=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=0.5),\n", " loss_bbox=dict(\n", " type='IoULoss',\n", " iou_mode='ciou',\n", " bbox_format='xywh',\n", " eps=1e-07,\n", " reduction='mean',\n", " loss_weight=0.05,\n", " return_iou=True),\n", " loss_obj=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=1.0),\n", " prior_match_thr=4.0,\n", " obj_level_weights=[4.0, 1.0, 0.4]),\n", " test_cfg=dict(\n", " multi_label=True,\n", " nms_pre=30000,\n", " score_thr=0.001,\n", " nms=dict(type='nms', iou_threshold=0.65),\n", " max_per_img=300))\n", "albu_train_transforms = [\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", "]\n", "pre_transform = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", "]\n", "train_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " 
dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',\n", " 'flip_direction'))\n", "]\n", "train_dataloader = dict(\n", " batch_size=2,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " sampler=dict(type='DefaultSampler', shuffle=True),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " ann_file='annotations/instances_train2017.json',\n", " data_prefix=dict(img='train2017/'),\n", " filter_cfg=dict(filter_empty_gt=False, min_size=32),\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " 
dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'flip', 'flip_direction'))\n", " ]))\n", "test_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", "]\n", "val_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "test_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', 
shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "param_scheduler = None\n", "optim_wrapper = dict(\n", " type='OptimWrapper',\n", " optimizer=dict(\n", " type='SGD',\n", " lr=0.01,\n", " momentum=0.937,\n", " weight_decay=0.0005,\n", " nesterov=True,\n", " batch_size_per_gpu=2),\n", " constructor='YOLOv5OptimizerConstructor')\n", "custom_hooks = [\n", " dict(\n", " type='EMAHook',\n", " ema_type='ExpMomentumEMA',\n", " momentum=0.0001,\n", " update_buffers=True,\n", " strict_load=False,\n", " priority=49)\n", "]\n", "val_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "test_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1, val_interval=10)\n", "val_cfg = dict(type='ValLoop')\n", "test_cfg = dict(type='TestLoop')\n", "checkpoint_file = 'https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth'\n", 
"launcher = 'none'\n", "work_dir = './work_dirs/yolov5_s_mobilenetv3_s-v61_1xb2-1e_coco128'\n", "\n", "11/19 11:27:51 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Result has been saved to /content/mmyolo/work_dirs/yolov5_s_mobilenetv3_s-v61_1xb2-1e_coco128/modules_statistic_results.json\n", "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/bricks/hsigmoid.py:36: UserWarning: In MMCV v1.4.4, we modified the default value of args to align with PyTorch official. Previous Implementation: Hsigmoid(x) = min(max((x + 1) / 2, 0), 1). Current Implementation: Hsigmoid(x) = min(max((x + 3) / 6, 0), 1).\n", " 'In MMCV v1.4.4, we modified the default value of args to align '\n", "11/19 11:27:55 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "11/19 11:28:00 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - load backbone. 
in model from: https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth\n", "http loads checkpoint from path: https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth\n", "Downloading: \"https://download.openmmlab.com/mmclassification/v0/mobilenet_v3/convert/mobilenet_v3_small-8427ecf0.pth\" to /root/.cache/torch/hub/checkpoints/mobilenet_v3_small-8427ecf0.pth\n", "100% 9.82M/9.82M [00:00<00:00, 24.2MB/s]\n", "11/19 11:28:02 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Checkpoints will be saved to /content/mmyolo/work_dirs/yolov5_s_mobilenetv3_s-v61_1xb2-1e_coco128.\n", "11/19 11:28:12 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Epoch(train) [1][50/63] lr: 4.9000e-04 eta: 0:00:02 time: 0.1847 data_time: 0.0106 memory: 1145 loss: 0.5764 loss_cls: 0.2093 loss_obj: 0.1373 loss_bbox: 0.2298\n", "11/19 11:28:13 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Exp name: yolov5_s_mobilenetv3_s-v61_1xb2-1e_coco128_20221119_112750\n", "11/19 11:28:13 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Saving checkpoint at 1 epochs\n", "11/19 11:28:14 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - `save_param_scheduler` is True but `self.param_schedulers` is None, so skip saving parameter schedulers\n" ] } ], "source": [ "# Start training\n", "!python tools/train.py configs/yolov5/yolov5_s_mobilenetv3_s-v61_1xb2-1e_coco128.py" ] }, { "cell_type": "markdown", "metadata": { "id": "fz43offklfzF" }, "source": [ "### 2.3 Using backbones implemented in timm through MMClassification (MobileViT V2)\n", "Not enough backbones in MMClassification?\n", "\n", "Try loading the backbones implemented in timm [![GitHub release](https://img.shields.io/github/release/rwightman/pytorch-image-models.svg)](https://GitHub.com/rwightman/pytorch-image-models/) through MMClassification:\n", "
    \n", "Supported backbones\n", "\n", "* Aggregating Nested Transformers - https://arxiv.org/abs/2105.12723\n", "* BEiT - https://arxiv.org/abs/2106.08254\n", "* Big Transfer ResNetV2 (BiT) - https://arxiv.org/abs/1912.11370\n", "* Bottleneck Transformers - https://arxiv.org/abs/2101.11605\n", "* CaiT (Class-Attention in Image Transformers) - https://arxiv.org/abs/2103.17239\n", "* CoaT (Co-Scale Conv-Attentional Image Transformers) - https://arxiv.org/abs/2104.06399\n", "* CoAtNet (Convolution and Attention) - https://arxiv.org/abs/2106.04803\n", "* ConvNeXt - https://arxiv.org/abs/2201.03545\n", "* ConViT (Soft Convolutional Inductive Biases Vision Transformers)- https://arxiv.org/abs/2103.10697\n", "* CspNet (Cross-Stage Partial Networks) - https://arxiv.org/abs/1911.11929\n", "* DeiT - https://arxiv.org/abs/2012.12877\n", "* DeiT-III - https://arxiv.org/pdf/2204.07118.pdf\n", "* DenseNet - https://arxiv.org/abs/1608.06993\n", "* DLA - https://arxiv.org/abs/1707.06484\n", "* DPN (Dual-Path Network) - https://arxiv.org/abs/1707.01629\n", "* EdgeNeXt - https://arxiv.org/abs/2206.10589\n", "* EfficientFormer - https://arxiv.org/abs/2206.01191\n", "* EfficientNet (MBConvNet Family)\n", " * EfficientNet NoisyStudent (B0-B7, L2) - https://arxiv.org/abs/1911.04252\n", " * EfficientNet AdvProp (B0-B8) - https://arxiv.org/abs/1911.09665\n", " * EfficientNet (B0-B7) - https://arxiv.org/abs/1905.11946\n", " * EfficientNet-EdgeTPU (S, M, L) - https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html\n", " * EfficientNet V2 - https://arxiv.org/abs/2104.00298\n", " * FBNet-C - https://arxiv.org/abs/1812.03443\n", " * MixNet - https://arxiv.org/abs/1907.09595\n", " * MNASNet B1, A1 (Squeeze-Excite), and Small - https://arxiv.org/abs/1807.11626\n", " * MobileNet-V2 - https://arxiv.org/abs/1801.04381\n", " * Single-Path NAS - https://arxiv.org/abs/1904.02877\n", " * TinyNet - https://arxiv.org/abs/2010.14819\n", "* GCViT (Global Context Vision Transformer) - 
https://arxiv.org/abs/2206.09959\n", "* GhostNet - https://arxiv.org/abs/1911.11907\n", "* gMLP - https://arxiv.org/abs/2105.08050\n", "* GPU-Efficient Networks - https://arxiv.org/abs/2006.14090\n", "* Halo Nets - https://arxiv.org/abs/2103.12731\n", "* HRNet - https://arxiv.org/abs/1908.07919\n", "* Inception-V3 - https://arxiv.org/abs/1512.00567\n", "* Inception-ResNet-V2 and Inception-V4 - https://arxiv.org/abs/1602.07261\n", "* Lambda Networks - https://arxiv.org/abs/2102.08602\n", "* LeViT (Vision Transformer in ConvNet's Clothing) - https://arxiv.org/abs/2104.01136\n", "* MaxViT (Multi-Axis Vision Transformer) - https://arxiv.org/abs/2204.01697\n", "* MLP-Mixer - https://arxiv.org/abs/2105.01601\n", "* MobileNet-V3 (MBConvNet w/ Efficient Head) - https://arxiv.org/abs/1905.02244\n", " * FBNet-V3 - https://arxiv.org/abs/2006.02049\n", " * HardCoRe-NAS - https://arxiv.org/abs/2102.11646\n", " * LCNet - https://arxiv.org/abs/2109.15099\n", "* MobileViT - https://arxiv.org/abs/2110.02178\n", "* MobileViT-V2 - https://arxiv.org/abs/2206.02680\n", "* MViT-V2 (Improved Multiscale Vision Transformer) - https://arxiv.org/abs/2112.01526\n", "* NASNet-A - https://arxiv.org/abs/1707.07012\n", "* NesT - https://arxiv.org/abs/2105.12723\n", "* NFNet-F - https://arxiv.org/abs/2102.06171\n", "* NF-RegNet / NF-ResNet - https://arxiv.org/abs/2101.08692\n", "* PNasNet - https://arxiv.org/abs/1712.00559\n", "* PoolFormer (MetaFormer) - https://arxiv.org/abs/2111.11418\n", "* Pooling-based Vision Transformer (PiT) - https://arxiv.org/abs/2103.16302\n", "* PVT-V2 (Improved Pyramid Vision Transformer) - https://arxiv.org/abs/2106.13797\n", "* RegNet - https://arxiv.org/abs/2003.13678\n", "* RegNetZ - https://arxiv.org/abs/2103.06877\n", "* RepVGG - https://arxiv.org/abs/2101.03697\n", "* ResMLP - https://arxiv.org/abs/2105.03404\n", "* ResNet/ResNeXt\n", " * ResNet (v1b/v1.5) - https://arxiv.org/abs/1512.03385\n", " * ResNeXt - https://arxiv.org/abs/1611.05431\n", " * 'Bag of 
Tricks' / Gluon C, D, E, S variations - https://arxiv.org/abs/1812.01187\n", " * Weakly-supervised (WSL) Instagram pretrained / ImageNet tuned ResNeXt101 - https://arxiv.org/abs/1805.00932\n", " * Semi-supervised (SSL) / Semi-weakly Supervised (SWSL) ResNet/ResNeXts - https://arxiv.org/abs/1905.00546\n", " * ECA-Net (ECAResNet) - https://arxiv.org/abs/1910.03151v4\n", " * Squeeze-and-Excitation Networks (SEResNet) - https://arxiv.org/abs/1709.01507\n", " * ResNet-RS - https://arxiv.org/abs/2103.07579\n", "* Res2Net - https://arxiv.org/abs/1904.01169\n", "* ResNeSt - https://arxiv.org/abs/2004.08955\n", "* ReXNet - https://arxiv.org/abs/2007.00992\n", "* SelecSLS - https://arxiv.org/abs/1907.00837\n", "* Selective Kernel Networks - https://arxiv.org/abs/1903.06586\n", "* Sequencer2D - https://arxiv.org/abs/2205.01972\n", "* Swin S3 (AutoFormerV2) - https://arxiv.org/abs/2111.14725\n", "* Swin Transformer - https://arxiv.org/abs/2103.14030\n", "* Swin Transformer V2 - https://arxiv.org/abs/2111.09883\n", "* Transformer-iN-Transformer (TNT) - https://arxiv.org/abs/2103.00112\n", "* TResNet - https://arxiv.org/abs/2003.13630\n", "* Twins (Spatial Attention in Vision Transformers) - https://arxiv.org/pdf/2104.13840.pdf\n", "* Visformer - https://arxiv.org/abs/2104.12533\n", "* Vision Transformer - https://arxiv.org/abs/2010.11929\n", "* VOLO (Vision Outlooker) - https://arxiv.org/abs/2106.13112\n", "* VovNet V2 and V1 - https://arxiv.org/abs/1911.06667\n", "* Xception - https://arxiv.org/abs/1610.02357\n", "* Xception (Modified Aligned, Gluon) - https://arxiv.org/abs/1802.02611\n", "* Xception (Modified Aligned, TF) - https://arxiv.org/abs/1802.02611\n", "* XCiT (Cross-Covariance Image Transformers) - https://arxiv.org/abs/2106.09681\n", "\n", "
    \n", "\n", "To use a backbone implemented in timm, install timm first." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7vyNPZsFmluO", "outputId": "491f4809-6468-4ad3-be77-06cad71df3e8" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Collecting timm\n", " Downloading timm-0.6.11-py3-none-any.whl (548 kB)\n", "\u001b[K |████████████████████████████████| 548 kB 35.1 MB/s \n", "\u001b[?25hRequirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from timm) (6.0)\n", "Collecting huggingface-hub\n", " Downloading huggingface_hub-0.11.0-py3-none-any.whl (182 kB)\n", "\u001b[K |████████████████████████████████| 182 kB 37.7 MB/s \n", "\u001b[?25hRequirement already satisfied: torch>=1.7 in /usr/local/lib/python3.7/dist-packages (from timm) (1.12.1+cu113)\n", "Requirement already satisfied: torchvision in /usr/local/lib/python3.7/dist-packages (from timm) (0.13.1+cu113)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch>=1.7->timm) (4.1.1)\n", "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.7/dist-packages (from huggingface-hub->timm) (21.3)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from huggingface-hub->timm) (3.8.0)\n", "Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from huggingface-hub->timm) (2.23.0)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from huggingface-hub->timm) (4.64.1)\n", "Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from huggingface-hub->timm) (4.13.0)\n", "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=20.9->huggingface-hub->timm) 
(3.0.9)\n", "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->huggingface-hub->timm) (3.10.0)\n", "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub->timm) (3.0.4)\n", "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub->timm) (2.10)\n", "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub->timm) (1.24.3)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->huggingface-hub->timm) (2022.9.24)\n", "Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/local/lib/python3.7/dist-packages (from torchvision->timm) (7.1.2)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torchvision->timm) (1.21.6)\n", "Installing collected packages: huggingface-hub, timm\n", "Successfully installed huggingface-hub-0.11.0 timm-0.6.11\n" ] } ], "source": [ "# Install timm\n", "%pip install timm" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "-jlBsDU8m7nm", "outputId": "05b84405-4c1a-4b35-bbff-a11e72afd8e8" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "num of models: 765\n", "['adv_inception_v3', 'bat_resnext26ts', 'beit_base_patch16_224', 'beit_base_patch16_224_in22k', 'beit_base_patch16_384', 'beit_large_patch16_224', 'beit_large_patch16_224_in22k', 'beit_large_patch16_384', 'beit_large_patch16_512', 'beitv2_base_patch16_224', 'beitv2_base_patch16_224_in22k', 'beitv2_large_patch16_224', 'beitv2_large_patch16_224_in22k', 'botnet26t_256', 'cait_m36_384', 'cait_m48_448', 'cait_s24_224', 'cait_s24_384', 'cait_s36_384', 'cait_xs24_384', 'cait_xxs24_224', 'cait_xxs24_384', 'cait_xxs36_224', 
'cait_xxs36_384', 'coat_lite_mini', 'coat_lite_small', 'coat_lite_tiny', 'coat_mini', 'coat_tiny', 'coatnet_0_rw_224', 'coatnet_1_rw_224', 'coatnet_bn_0_rw_224', 'coatnet_nano_rw_224', 'coatnet_rmlp_1_rw_224', 'coatnet_rmlp_nano_rw_224', 'convit_base', 'convit_small', 'convit_tiny', 'convmixer_768_32', 'convmixer_1024_20_ks9_p14', 'convmixer_1536_20', 'convnext_atto', 'convnext_atto_ols', 'convnext_base', 'convnext_base_384_in22ft1k', 'convnext_base_in22ft1k', 'convnext_base_in22k', 'convnext_femto', 'convnext_femto_ols', 'convnext_large', 'convnext_large_384_in22ft1k', 'convnext_large_in22ft1k', 'convnext_large_in22k', 'convnext_nano', 'convnext_nano_ols', 'convnext_pico', 'convnext_pico_ols', 'convnext_small', 'convnext_small_384_in22ft1k', 'convnext_small_in22ft1k', 'convnext_small_in22k', 'convnext_tiny', 'convnext_tiny_384_in22ft1k', 'convnext_tiny_hnf', 'convnext_tiny_in22ft1k', 'convnext_tiny_in22k', 'convnext_xlarge_384_in22ft1k', 'convnext_xlarge_in22ft1k', 'convnext_xlarge_in22k', 'crossvit_9_240', 'crossvit_9_dagger_240', 'crossvit_15_240', 'crossvit_15_dagger_240', 'crossvit_15_dagger_408', 'crossvit_18_240', 'crossvit_18_dagger_240', 'crossvit_18_dagger_408', 'crossvit_base_240', 'crossvit_small_240', 'crossvit_tiny_240', 'cs3darknet_focus_l', 'cs3darknet_focus_m', 'cs3darknet_l', 'cs3darknet_m', 'cs3darknet_x', 'cs3edgenet_x', 'cs3se_edgenet_x', 'cs3sedarknet_l', 'cs3sedarknet_x', 'cspdarknet53', 'cspresnet50', 'cspresnext50', 'darknet53', 'darknetaa53', 'deit3_base_patch16_224', 'deit3_base_patch16_224_in21ft1k', 'deit3_base_patch16_384', 'deit3_base_patch16_384_in21ft1k', 'deit3_huge_patch14_224', 'deit3_huge_patch14_224_in21ft1k', 'deit3_large_patch16_224', 'deit3_large_patch16_224_in21ft1k', 'deit3_large_patch16_384', 'deit3_large_patch16_384_in21ft1k', 'deit3_medium_patch16_224', 'deit3_medium_patch16_224_in21ft1k', 'deit3_small_patch16_224', 'deit3_small_patch16_224_in21ft1k', 'deit3_small_patch16_384', 'deit3_small_patch16_384_in21ft1k', 
'deit_base_distilled_patch16_224', 'deit_base_distilled_patch16_384', 'deit_base_patch16_224', 'deit_base_patch16_384', 'deit_small_distilled_patch16_224', 'deit_small_patch16_224', 'deit_tiny_distilled_patch16_224', 'deit_tiny_patch16_224', 'densenet121', 'densenet161', 'densenet169', 'densenet201', 'densenetblur121d', 'dla34', 'dla46_c', 'dla46x_c', 'dla60', 'dla60_res2net', 'dla60_res2next', 'dla60x', 'dla60x_c', 'dla102', 'dla102x', 'dla102x2', 'dla169', 'dm_nfnet_f0', 'dm_nfnet_f1', 'dm_nfnet_f2', 'dm_nfnet_f3', 'dm_nfnet_f4', 'dm_nfnet_f5', 'dm_nfnet_f6', 'dpn68', 'dpn68b', 'dpn92', 'dpn98', 'dpn107', 'dpn131', 'eca_botnext26ts_256', 'eca_halonext26ts', 'eca_nfnet_l0', 'eca_nfnet_l1', 'eca_nfnet_l2', 'eca_resnet33ts', 'eca_resnext26ts', 'ecaresnet26t', 'ecaresnet50d', 'ecaresnet50d_pruned', 'ecaresnet50t', 'ecaresnet101d', 'ecaresnet101d_pruned', 'ecaresnet269d', 'ecaresnetlight', 'edgenext_base', 'edgenext_small', 'edgenext_small_rw', 'edgenext_x_small', 'edgenext_xx_small', 'efficientformer_l1', 'efficientformer_l3', 'efficientformer_l7', 'efficientnet_b0', 'efficientnet_b1', 'efficientnet_b1_pruned', 'efficientnet_b2', 'efficientnet_b2_pruned', 'efficientnet_b3', 'efficientnet_b3_pruned', 'efficientnet_b4', 'efficientnet_el', 'efficientnet_el_pruned', 'efficientnet_em', 'efficientnet_es', 'efficientnet_es_pruned', 'efficientnet_lite0', 'efficientnetv2_rw_m', 'efficientnetv2_rw_s', 'efficientnetv2_rw_t', 'ens_adv_inception_resnet_v2', 'ese_vovnet19b_dw', 'ese_vovnet39b', 'fbnetc_100', 'fbnetv3_b', 'fbnetv3_d', 'fbnetv3_g', 'gc_efficientnetv2_rw_t', 'gcresnet33ts', 'gcresnet50t', 'gcresnext26ts', 'gcresnext50ts', 'gcvit_base', 'gcvit_small', 'gcvit_tiny', 'gcvit_xtiny', 'gcvit_xxtiny', 'gernet_l', 'gernet_m', 'gernet_s', 'ghostnet_100', 'gluon_inception_v3', 'gluon_resnet18_v1b', 'gluon_resnet34_v1b', 'gluon_resnet50_v1b', 'gluon_resnet50_v1c', 'gluon_resnet50_v1d', 'gluon_resnet50_v1s', 'gluon_resnet101_v1b', 'gluon_resnet101_v1c', 'gluon_resnet101_v1d', 
'gluon_resnet101_v1s', 'gluon_resnet152_v1b', 'gluon_resnet152_v1c', 'gluon_resnet152_v1d', 'gluon_resnet152_v1s', 'gluon_resnext50_32x4d', 'gluon_resnext101_32x4d', 'gluon_resnext101_64x4d', 'gluon_senet154', 'gluon_seresnext50_32x4d', 'gluon_seresnext101_32x4d', 'gluon_seresnext101_64x4d', 'gluon_xception65', 'gmixer_24_224', 'gmlp_s16_224', 'halo2botnet50ts_256', 'halonet26t', 'halonet50ts', 'haloregnetz_b', 'hardcorenas_a', 'hardcorenas_b', 'hardcorenas_c', 'hardcorenas_d', 'hardcorenas_e', 'hardcorenas_f', 'hrnet_w18', 'hrnet_w18_small', 'hrnet_w18_small_v2', 'hrnet_w30', 'hrnet_w32', 'hrnet_w40', 'hrnet_w44', 'hrnet_w48', 'hrnet_w64', 'ig_resnext101_32x8d', 'ig_resnext101_32x16d', 'ig_resnext101_32x32d', 'ig_resnext101_32x48d', 'inception_resnet_v2', 'inception_v3', 'inception_v4', 'jx_nest_base', 'jx_nest_small', 'jx_nest_tiny', 'lambda_resnet26rpt_256', 'lambda_resnet26t', 'lambda_resnet50ts', 'lamhalobotnet50ts_256', 'lcnet_050', 'lcnet_075', 'lcnet_100', 'legacy_senet154', 'legacy_seresnet18', 'legacy_seresnet34', 'legacy_seresnet50', 'legacy_seresnet101', 'legacy_seresnet152', 'legacy_seresnext26_32x4d', 'legacy_seresnext50_32x4d', 'legacy_seresnext101_32x4d', 'levit_128', 'levit_128s', 'levit_192', 'levit_256', 'levit_384', 'maxvit_nano_rw_256', 'maxvit_rmlp_nano_rw_256', 'maxvit_rmlp_pico_rw_256', 'maxvit_rmlp_tiny_rw_256', 'maxvit_tiny_rw_224', 'mixer_b16_224', 'mixer_b16_224_in21k', 'mixer_b16_224_miil', 'mixer_b16_224_miil_in21k', 'mixer_l16_224', 'mixer_l16_224_in21k', 'mixnet_l', 'mixnet_m', 'mixnet_s', 'mixnet_xl', 'mnasnet_100', 'mnasnet_small', 'mobilenetv2_050', 'mobilenetv2_100', 'mobilenetv2_110d', 'mobilenetv2_120d', 'mobilenetv2_140', 'mobilenetv3_large_100', 'mobilenetv3_large_100_miil', 'mobilenetv3_large_100_miil_in21k', 'mobilenetv3_rw', 'mobilenetv3_small_050', 'mobilenetv3_small_075', 'mobilenetv3_small_100', 'mobilevit_s', 'mobilevit_xs', 'mobilevit_xxs', 'mobilevitv2_050', 'mobilevitv2_075', 'mobilevitv2_100', 'mobilevitv2_125', 
'mobilevitv2_150', 'mobilevitv2_150_384_in22ft1k', 'mobilevitv2_150_in22ft1k', 'mobilevitv2_175', 'mobilevitv2_175_384_in22ft1k', 'mobilevitv2_175_in22ft1k', 'mobilevitv2_200', 'mobilevitv2_200_384_in22ft1k', 'mobilevitv2_200_in22ft1k', 'mvitv2_base', 'mvitv2_large', 'mvitv2_small', 'mvitv2_tiny', 'nasnetalarge', 'nf_regnet_b1', 'nf_resnet50', 'nfnet_l0', 'pit_b_224', 'pit_b_distilled_224', 'pit_s_224', 'pit_s_distilled_224', 'pit_ti_224', 'pit_ti_distilled_224', 'pit_xs_224', 'pit_xs_distilled_224', 'pnasnet5large', 'poolformer_m36', 'poolformer_m48', 'poolformer_s12', 'poolformer_s24', 'poolformer_s36', 'pvt_v2_b0', 'pvt_v2_b1', 'pvt_v2_b2', 'pvt_v2_b2_li', 'pvt_v2_b3', 'pvt_v2_b4', 'pvt_v2_b5', 'regnetv_040', 'regnetv_064', 'regnetx_002', 'regnetx_004', 'regnetx_006', 'regnetx_008', 'regnetx_016', 'regnetx_032', 'regnetx_040', 'regnetx_064', 'regnetx_080', 'regnetx_120', 'regnetx_160', 'regnetx_320', 'regnety_002', 'regnety_004', 'regnety_006', 'regnety_008', 'regnety_016', 'regnety_032', 'regnety_040', 'regnety_064', 'regnety_080', 'regnety_120', 'regnety_160', 'regnety_320', 'regnetz_040', 'regnetz_040h', 'regnetz_b16', 'regnetz_c16', 'regnetz_c16_evos', 'regnetz_d8', 'regnetz_d8_evos', 'regnetz_d32', 'regnetz_e8', 'repvgg_a2', 'repvgg_b0', 'repvgg_b1', 'repvgg_b1g4', 'repvgg_b2', 'repvgg_b2g4', 'repvgg_b3', 'repvgg_b3g4', 'res2net50_14w_8s', 'res2net50_26w_4s', 'res2net50_26w_6s', 'res2net50_26w_8s', 'res2net50_48w_2s', 'res2net101_26w_4s', 'res2next50', 'resmlp_12_224', 'resmlp_12_224_dino', 'resmlp_12_distilled_224', 'resmlp_24_224', 'resmlp_24_224_dino', 'resmlp_24_distilled_224', 'resmlp_36_224', 'resmlp_36_distilled_224', 'resmlp_big_24_224', 'resmlp_big_24_224_in22ft1k', 'resmlp_big_24_distilled_224', 'resnest14d', 'resnest26d', 'resnest50d', 'resnest50d_1s4x24d', 'resnest50d_4s2x40d', 'resnest101e', 'resnest200e', 'resnest269e', 'resnet10t', 'resnet14t', 'resnet18', 'resnet18d', 'resnet26', 'resnet26d', 'resnet26t', 'resnet32ts', 'resnet33ts', 
'resnet34', 'resnet34d', 'resnet50', 'resnet50_gn', 'resnet50d', 'resnet51q', 'resnet61q', 'resnet101', 'resnet101d', 'resnet152', 'resnet152d', 'resnet200d', 'resnetaa50', 'resnetblur50', 'resnetrs50', 'resnetrs101', 'resnetrs152', 'resnetrs200', 'resnetrs270', 'resnetrs350', 'resnetrs420', 'resnetv2_50', 'resnetv2_50d_evos', 'resnetv2_50d_gn', 'resnetv2_50x1_bit_distilled', 'resnetv2_50x1_bitm', 'resnetv2_50x1_bitm_in21k', 'resnetv2_50x3_bitm', 'resnetv2_50x3_bitm_in21k', 'resnetv2_101', 'resnetv2_101x1_bitm', 'resnetv2_101x1_bitm_in21k', 'resnetv2_101x3_bitm', 'resnetv2_101x3_bitm_in21k', 'resnetv2_152x2_bit_teacher', 'resnetv2_152x2_bit_teacher_384', 'resnetv2_152x2_bitm', 'resnetv2_152x2_bitm_in21k', 'resnetv2_152x4_bitm', 'resnetv2_152x4_bitm_in21k', 'resnext26ts', 'resnext50_32x4d', 'resnext50d_32x4d', 'resnext101_32x8d', 'resnext101_64x4d', 'rexnet_100', 'rexnet_130', 'rexnet_150', 'rexnet_200', 'sebotnet33ts_256', 'sehalonet33ts', 'selecsls42b', 'selecsls60', 'selecsls60b', 'semnasnet_075', 'semnasnet_100', 'sequencer2d_l', 'sequencer2d_m', 'sequencer2d_s', 'seresnet33ts', 'seresnet50', 'seresnet152d', 'seresnext26d_32x4d', 'seresnext26t_32x4d', 'seresnext26ts', 'seresnext50_32x4d', 'seresnext101_32x8d', 'seresnext101d_32x8d', 'seresnextaa101d_32x8d', 'skresnet18', 'skresnet34', 'skresnext50_32x4d', 'spnasnet_100', 'ssl_resnet18', 'ssl_resnet50', 'ssl_resnext50_32x4d', 'ssl_resnext101_32x4d', 'ssl_resnext101_32x8d', 'ssl_resnext101_32x16d', 'swin_base_patch4_window7_224', 'swin_base_patch4_window7_224_in22k', 'swin_base_patch4_window12_384', 'swin_base_patch4_window12_384_in22k', 'swin_large_patch4_window7_224', 'swin_large_patch4_window7_224_in22k', 'swin_large_patch4_window12_384', 'swin_large_patch4_window12_384_in22k', 'swin_s3_base_224', 'swin_s3_small_224', 'swin_s3_tiny_224', 'swin_small_patch4_window7_224', 'swin_tiny_patch4_window7_224', 'swinv2_base_window8_256', 'swinv2_base_window12_192_22k', 'swinv2_base_window12to16_192to256_22kft1k', 
'swinv2_base_window12to24_192to384_22kft1k', 'swinv2_base_window16_256', 'swinv2_cr_small_224', 'swinv2_cr_small_ns_224', 'swinv2_cr_tiny_ns_224', 'swinv2_large_window12_192_22k', 'swinv2_large_window12to16_192to256_22kft1k', 'swinv2_large_window12to24_192to384_22kft1k', 'swinv2_small_window8_256', 'swinv2_small_window16_256', 'swinv2_tiny_window8_256', 'swinv2_tiny_window16_256', 'swsl_resnet18', 'swsl_resnet50', 'swsl_resnext50_32x4d', 'swsl_resnext101_32x4d', 'swsl_resnext101_32x8d', 'swsl_resnext101_32x16d', 'tf_efficientnet_b0', 'tf_efficientnet_b0_ap', 'tf_efficientnet_b0_ns', 'tf_efficientnet_b1', 'tf_efficientnet_b1_ap', 'tf_efficientnet_b1_ns', 'tf_efficientnet_b2', 'tf_efficientnet_b2_ap', 'tf_efficientnet_b2_ns', 'tf_efficientnet_b3', 'tf_efficientnet_b3_ap', 'tf_efficientnet_b3_ns', 'tf_efficientnet_b4', 'tf_efficientnet_b4_ap', 'tf_efficientnet_b4_ns', 'tf_efficientnet_b5', 'tf_efficientnet_b5_ap', 'tf_efficientnet_b5_ns', 'tf_efficientnet_b6', 'tf_efficientnet_b6_ap', 'tf_efficientnet_b6_ns', 'tf_efficientnet_b7', 'tf_efficientnet_b7_ap', 'tf_efficientnet_b7_ns', 'tf_efficientnet_b8', 'tf_efficientnet_b8_ap', 'tf_efficientnet_cc_b0_4e', 'tf_efficientnet_cc_b0_8e', 'tf_efficientnet_cc_b1_8e', 'tf_efficientnet_el', 'tf_efficientnet_em', 'tf_efficientnet_es', 'tf_efficientnet_l2_ns', 'tf_efficientnet_l2_ns_475', 'tf_efficientnet_lite0', 'tf_efficientnet_lite1', 'tf_efficientnet_lite2', 'tf_efficientnet_lite3', 'tf_efficientnet_lite4', 'tf_efficientnetv2_b0', 'tf_efficientnetv2_b1', 'tf_efficientnetv2_b2', 'tf_efficientnetv2_b3', 'tf_efficientnetv2_l', 'tf_efficientnetv2_l_in21ft1k', 'tf_efficientnetv2_l_in21k', 'tf_efficientnetv2_m', 'tf_efficientnetv2_m_in21ft1k', 'tf_efficientnetv2_m_in21k', 'tf_efficientnetv2_s', 'tf_efficientnetv2_s_in21ft1k', 'tf_efficientnetv2_s_in21k', 'tf_efficientnetv2_xl_in21ft1k', 'tf_efficientnetv2_xl_in21k', 'tf_inception_v3', 'tf_mixnet_l', 'tf_mixnet_m', 'tf_mixnet_s', 'tf_mobilenetv3_large_075', 
'tf_mobilenetv3_large_100', 'tf_mobilenetv3_large_minimal_100', 'tf_mobilenetv3_small_075', 'tf_mobilenetv3_small_100', 'tf_mobilenetv3_small_minimal_100', 'tinynet_a', 'tinynet_b', 'tinynet_c', 'tinynet_d', 'tinynet_e', 'tnt_s_patch16_224', 'tresnet_l', 'tresnet_l_448', 'tresnet_m', 'tresnet_m_448', 'tresnet_m_miil_in21k', 'tresnet_v2_l', 'tresnet_xl', 'tresnet_xl_448', 'tv_densenet121', 'tv_resnet34', 'tv_resnet50', 'tv_resnet101', 'tv_resnet152', 'tv_resnext50_32x4d', 'twins_pcpvt_base', 'twins_pcpvt_large', 'twins_pcpvt_small', 'twins_svt_base', 'twins_svt_large', 'twins_svt_small', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn', 'vgg19', 'vgg19_bn', 'visformer_small', 'vit_base_patch8_224', 'vit_base_patch8_224_dino', 'vit_base_patch8_224_in21k', 'vit_base_patch16_224', 'vit_base_patch16_224_dino', 'vit_base_patch16_224_in21k', 'vit_base_patch16_224_miil', 'vit_base_patch16_224_miil_in21k', 'vit_base_patch16_224_sam', 'vit_base_patch16_384', 'vit_base_patch16_rpn_224', 'vit_base_patch32_224', 'vit_base_patch32_224_clip_laion2b', 'vit_base_patch32_224_in21k', 'vit_base_patch32_224_sam', 'vit_base_patch32_384', 'vit_base_r50_s16_224_in21k', 'vit_base_r50_s16_384', 'vit_giant_patch14_224_clip_laion2b', 'vit_huge_patch14_224_clip_laion2b', 'vit_huge_patch14_224_in21k', 'vit_large_patch14_224_clip_laion2b', 'vit_large_patch16_224', 'vit_large_patch16_224_in21k', 'vit_large_patch16_384', 'vit_large_patch32_224_in21k', 'vit_large_patch32_384', 'vit_large_r50_s32_224', 'vit_large_r50_s32_224_in21k', 'vit_large_r50_s32_384', 'vit_relpos_base_patch16_224', 'vit_relpos_base_patch16_clsgap_224', 'vit_relpos_base_patch32_plus_rpn_256', 'vit_relpos_medium_patch16_224', 'vit_relpos_medium_patch16_cls_224', 'vit_relpos_medium_patch16_rpn_224', 'vit_relpos_small_patch16_224', 'vit_small_patch8_224_dino', 'vit_small_patch16_224', 'vit_small_patch16_224_dino', 'vit_small_patch16_224_in21k', 'vit_small_patch16_384', 'vit_small_patch32_224', 
'vit_small_patch32_224_in21k', 'vit_small_patch32_384', 'vit_small_r26_s32_224', 'vit_small_r26_s32_224_in21k', 'vit_small_r26_s32_384', 'vit_srelpos_medium_patch16_224', 'vit_srelpos_small_patch16_224', 'vit_tiny_patch16_224', 'vit_tiny_patch16_224_in21k', 'vit_tiny_patch16_384', 'vit_tiny_r_s16_p8_224', 'vit_tiny_r_s16_p8_224_in21k', 'vit_tiny_r_s16_p8_384', 'volo_d1_224', 'volo_d1_384', 'volo_d2_224', 'volo_d2_384', 'volo_d3_224', 'volo_d3_448', 'volo_d4_224', 'volo_d4_448', 'volo_d5_224', 'volo_d5_448', 'volo_d5_512', 'wide_resnet50_2', 'wide_resnet101_2', 'xception', 'xception41', 'xception41p', 'xception65', 'xception65p', 'xception71', 'xcit_large_24_p8_224', 'xcit_large_24_p8_224_dist', 'xcit_large_24_p8_384_dist', 'xcit_large_24_p16_224', 'xcit_large_24_p16_224_dist', 'xcit_large_24_p16_384_dist', 'xcit_medium_24_p8_224', 'xcit_medium_24_p8_224_dist', 'xcit_medium_24_p8_384_dist', 'xcit_medium_24_p16_224', 'xcit_medium_24_p16_224_dist', 'xcit_medium_24_p16_384_dist', 'xcit_nano_12_p8_224', 'xcit_nano_12_p8_224_dist', 'xcit_nano_12_p8_384_dist', 'xcit_nano_12_p16_224', 'xcit_nano_12_p16_224_dist', 'xcit_nano_12_p16_384_dist', 'xcit_small_12_p8_224', 'xcit_small_12_p8_224_dist', 'xcit_small_12_p8_384_dist', 'xcit_small_12_p16_224', 'xcit_small_12_p16_224_dist', 'xcit_small_12_p16_384_dist', 'xcit_small_24_p8_224', 'xcit_small_24_p8_224_dist', 'xcit_small_24_p8_384_dist', 'xcit_small_24_p16_224', 'xcit_small_24_p16_224_dist', 'xcit_small_24_p16_384_dist', 'xcit_tiny_12_p8_224', 'xcit_tiny_12_p8_224_dist', 'xcit_tiny_12_p8_384_dist', 'xcit_tiny_12_p16_224', 'xcit_tiny_12_p16_224_dist', 'xcit_tiny_12_p16_384_dist', 'xcit_tiny_24_p8_224', 'xcit_tiny_24_p8_224_dist', 'xcit_tiny_24_p8_384_dist', 'xcit_tiny_24_p16_224', 'xcit_tiny_24_p16_224_dist', 'xcit_tiny_24_p16_384_dist']\n" ] } ], "source": [ "# List the models available in timm (those with pretrained weights)\n", "import timm\n", "model_names = timm.list_models(pretrained=True)\n", "print(f'num of models: {len(model_names)}')\n", "print(model_names)" ] }, { 
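"cell_type": "markdown", "metadata": {}, "source": [ "A note on why the configs in this tutorial set `widen_factor = 1.0`. As far as we understand, the YOLOv5 neck and head do not build their layers with the configured channel numbers directly: they first scale each number by `widen_factor` and round it up to a multiple of 8. The stock YOLOv5-s config uses `widen_factor = 0.5`, so keeping that value would halve the channels declared for a third-party backbone and they would no longer match its real outputs. The sketch below re-implements that rounding rule in plain Python for illustration only; the helper names and the divisor of 8 are assumptions of this example, not imports from MMYOLO.

```python
import math


def make_divisible(x, widen_factor=1.0, divisor=8):
    # Scale x by widen_factor, then round up to the next multiple of divisor.
    # Assumption: this mirrors the rounding rule MMYOLO applies to channels.
    return math.ceil(x * widen_factor / divisor) * divisor


def effective_channels(channels, widen_factor):
    # Channels a layer would actually be built with for the given config values.
    return [make_divisible(c, widen_factor) for c in channels]


# Stock YOLOv5-s: configured [256, 512, 1024] with widen_factor 0.5.
print(effective_channels([256, 512, 1024], 0.5))  # [128, 256, 512]

# mobilevitv2_050 outputs [128, 192, 256]; widen_factor 1.0 keeps them as-is.
print(effective_channels([128, 192, 256], 1.0))   # [128, 192, 256]

# Keeping widen_factor 0.5 would build the neck for [64, 96, 128] instead,
# which no longer matches the backbone outputs.
print(effective_channels([128, 192, 256], 0.5))   # [64, 96, 128]
```

When swapping in another backbone, replace the channel list with that backbone's real output channels and keep `widen_factor` at 1.0 in both the neck and the head." ] }, { 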
"cell_type": "markdown", "metadata": { "id": "BbT36avNnppt" }, "source": [ "如果想将 timm 中 `mobilevitv2_050` 作为 `YOLOv5` 的主干网络,则配置文件如下:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "KjYbb-LTmqH1" }, "outputs": [], "source": [ "_base_ = './yolov5_s-v61_1xb2-1e_coco128.py'\n", "\n", "# 导入 mmcls.models 使得可以调用 mmcls 中注册的模块\n", "custom_imports = dict(imports=['mmcls.models'], allow_failed_imports=False)\n", "\n", "widen_factor = 1.0\n", "channels = [128, 192, 256]\n", "\n", "model = dict(\n", " backbone=dict(\n", " _delete_=True, # 将 _base_ 中关于 backbone 的字段删除\n", " type='mmcls.TIMMBackbone', # 使用 mmcls 中的 timm 主干网络\n", " model_name='mobilevitv2_050', # 使用 TIMM 中的 mobilevitv2_050\n", " features_only=True,\n", " pretrained=True,\n", " out_indices=(2, 3, 4)),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " widen_factor=widen_factor,\n", " in_channels=channels, # 注意:mobilevitv2_050 输出的3个通道是 [128, 192, 256],和原先的 yolov5-s neck 不匹配,需要更改\n", " out_channels=channels),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " in_channels=channels, # head 部分输入通道也要做相应更改\n", " widen_factor=widen_factor))\n", ")\n", "\n", "config_mobilevitv2_050 = f\"\"\"\n", "_base_=\\'{_base_}\\'\n", "widen_factor={widen_factor}\n", "model={model}\n", "\"\"\"\n", "\n", "with open('./configs/yolov5/yolov5_s_mobilevitv2_050-v61_1xb2-1e_coco128.py', 'w') as f:\n", " f.write(config_mobilevitv2_050)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "RLIcdPKNoaHM", "outputId": "33724fe3-7d80-4732-c1df-c89294e55f09" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "11/19 11:32:01 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - Failed to search registry with scope \"mmyolo\" in the \"log_processor\" registry tree. As a workaround, the current \"log_processor\" registry in \"mmengine\" is used to build instance. 
This may cause unexpected failure when running the built modules. Please check whether \"mmyolo\" is a correct scope, or whether the registry is initialized.\n", "11/19 11:32:01 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - \n", "------------------------------------------------------------\n", "System environment:\n", " sys.platform: linux\n", " Python: 3.7.15 (default, Oct 12 2022, 19:14:55) [GCC 7.5.0]\n", " CUDA available: True\n", " numpy_random_seed: 31316808\n", " GPU 0: Tesla T4\n", " CUDA_HOME: /usr/local/cuda\n", " NVCC: Cuda compilation tools, release 11.2, V11.2.152\n", " GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0\n", " PyTorch: 1.12.1+cu113\n", " PyTorch compiling details: PyTorch built with:\n", " - GCC 9.3\n", " - C++ Version: 201402\n", " - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n", " - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n", " - OpenMP 201511 (a.k.a. 
OpenMP 4.5)\n", " - LAPACK is enabled (usually provided by MKL)\n", " - NNPACK is enabled\n", " - CPU capability usage: AVX2\n", " - CUDA Runtime 11.3\n", " - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n", " - CuDNN 8.3.2 (built against CUDA 11.5)\n", " - Magma 2.5.2\n", " - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n", "\n", " TorchVision: 0.13.1+cu113\n", " OpenCV: 4.6.0\n", " MMEngine: 0.3.1\n", "\n", "Runtime environment:\n", " cudnn_benchmark: True\n", " mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}\n", " 
dist_cfg: {'backend': 'nccl'}\n", " seed: None\n", " Distributed launcher: none\n", " Distributed training: False\n", " GPU number: 1\n", "------------------------------------------------------------\n", "\n", "11/19 11:32:01 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Config:\n", "default_scope = 'mmyolo'\n", "default_hooks = dict(\n", " timer=dict(type='IterTimerHook'),\n", " logger=dict(type='LoggerHook', interval=50),\n", " param_scheduler=dict(\n", " type='YOLOv5ParamSchedulerHook',\n", " scheduler_type='linear',\n", " lr_factor=0.01,\n", " max_epochs=1),\n", " checkpoint=dict(\n", " type='CheckpointHook', interval=10, save_best='auto',\n", " max_keep_ckpts=3),\n", " sampler_seed=dict(type='DistSamplerSeedHook'),\n", " visualization=dict(type='mmdet.DetVisualizationHook'))\n", "env_cfg = dict(\n", " cudnn_benchmark=True,\n", " mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),\n", " dist_cfg=dict(backend='nccl'))\n", "vis_backends = [dict(type='LocalVisBackend')]\n", "visualizer = dict(\n", " type='mmdet.DetLocalVisualizer',\n", " vis_backends=[dict(type='LocalVisBackend')],\n", " name='visualizer')\n", "log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)\n", "log_level = 'INFO'\n", "load_from = None\n", "resume = False\n", "file_client_args = dict(backend='disk')\n", "data_root = 'data/coco/'\n", "dataset_type = 'YOLOv5CocoDataset'\n", "num_classes = 80\n", "img_scale = (640, 640)\n", "deepen_factor = 0.33\n", "widen_factor = 1.0\n", "max_epochs = 1\n", "save_epoch_intervals = 10\n", "train_batch_size_per_gpu = 2\n", "train_num_workers = 2\n", "val_batch_size_per_gpu = 1\n", "val_num_workers = 2\n", "persistent_workers = True\n", "batch_shapes_cfg = dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)\n", "anchors = [[(10, 13), (16, 30), (33, 23)], [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]]\n", "strides = [8, 16, 32]\n", 
"num_det_layers = 3\n", "model = dict(\n", " type='YOLODetector',\n", " data_preprocessor=dict(\n", " type='mmdet.DetDataPreprocessor',\n", " mean=[0.0, 0.0, 0.0],\n", " std=[255.0, 255.0, 255.0],\n", " bgr_to_rgb=True),\n", " backbone=dict(\n", " type='mmcls.TIMMBackbone',\n", " model_name='mobilevitv2_050',\n", " features_only=True,\n", " pretrained=True,\n", " out_indices=(2, 3, 4)),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " deepen_factor=0.33,\n", " widen_factor=1.0,\n", " in_channels=[128, 192, 256],\n", " out_channels=[128, 192, 256],\n", " num_csp_blocks=3,\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='SiLU', inplace=True)),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " num_classes=80,\n", " in_channels=[128, 192, 256],\n", " widen_factor=1.0,\n", " featmap_strides=[8, 16, 32],\n", " num_base_priors=3),\n", " prior_generator=dict(\n", " type='mmdet.YOLOAnchorGenerator',\n", " base_sizes=[[(10, 13), (16, 30), (33, 23)],\n", " [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]],\n", " strides=[8, 16, 32]),\n", " loss_cls=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=0.5),\n", " loss_bbox=dict(\n", " type='IoULoss',\n", " iou_mode='ciou',\n", " bbox_format='xywh',\n", " eps=1e-07,\n", " reduction='mean',\n", " loss_weight=0.05,\n", " return_iou=True),\n", " loss_obj=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=1.0),\n", " prior_match_thr=4.0,\n", " obj_level_weights=[4.0, 1.0, 0.4]),\n", " test_cfg=dict(\n", " multi_label=True,\n", " nms_pre=30000,\n", " score_thr=0.001,\n", " nms=dict(type='nms', iou_threshold=0.65),\n", " max_per_img=300))\n", "albu_train_transforms = [\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', 
p=0.01)\n", "]\n", "pre_transform = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", "]\n", "train_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',\n", " 'flip_direction'))\n", "]\n", "train_dataloader = dict(\n", " batch_size=2,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " sampler=dict(type='DefaultSampler', shuffle=True),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " ann_file='annotations/instances_train2017.json',\n", " data_prefix=dict(img='train2017/'),\n", " filter_cfg=dict(filter_empty_gt=False, min_size=32),\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " 
dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'flip', 'flip_direction'))\n", " ]))\n", "test_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", "]\n", "val_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " 
pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "test_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "param_scheduler = None\n", "optim_wrapper = dict(\n", " type='OptimWrapper',\n", " optimizer=dict(\n", " type='SGD',\n", " lr=0.01,\n", " momentum=0.937,\n", " weight_decay=0.0005,\n", " nesterov=True,\n", " batch_size_per_gpu=2),\n", " 
constructor='YOLOv5OptimizerConstructor')\n", "custom_hooks = [\n", " dict(\n", " type='EMAHook',\n", " ema_type='ExpMomentumEMA',\n", " momentum=0.0001,\n", " update_buffers=True,\n", " strict_load=False,\n", " priority=49)\n", "]\n", "val_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "test_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1, val_interval=10)\n", "val_cfg = dict(type='ValLoop')\n", "test_cfg = dict(type='TestLoop')\n", "launcher = 'none'\n", "work_dir = './work_dirs/yolov5_s_mobilevitv2_050-v61_1xb2-1e_coco128'\n", "\n", "11/19 11:32:01 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Result has been saved to /content/mmyolo/work_dirs/yolov5_s_mobilevitv2_050-v61_1xb2-1e_coco128/modules_statistic_results.json\n", "11/19 11:32:03 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - backbone out_indices: (2, 3, 4)\n", "11/19 11:32:03 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - backbone out_channels: [128, 192, 256]\n", "11/19 11:32:03 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - backbone out_strides: [8, 16, 32]\n", "11/19 11:32:05 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.\n", "loading annotations into memory...\n", "Done (t=0.00s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.00s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.00s)\n", "creating index...\n", "index created!\n", "/usr/local/lib/python3.7/dist-packages/mmengine/model/base_module.py:124: UserWarning: init_weights of TIMMBackbone 
has been called more than once.\n", " warnings.warn(f'init_weights of {self.__class__.__name__} has '\n", "11/19 11:32:07 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Checkpoints will be saved to /content/mmyolo/work_dirs/yolov5_s_mobilevitv2_050-v61_1xb2-1e_coco128.\n", "11/19 11:32:18 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Epoch(train) [1][50/63] lr: 4.9000e-04 eta: 0:00:02 time: 0.2264 data_time: 0.0062 memory: 5409 loss: 0.5606 loss_cls: 0.2027 loss_obj: 0.1421 loss_bbox: 0.2158\n", "11/19 11:32:20 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Exp name: yolov5_s_mobilevitv2_050-v61_1xb2-1e_coco128_20221119_113201\n", "11/19 11:32:20 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Saving checkpoint at 1 epochs\n", "11/19 11:32:21 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - `save_param_scheduler` is True but `self.param_schedulers` is None, so skip saving parameter schedulers\n" ] } ], "source": [ "# Start training\n", "!python tools/train.py configs/yolov5/yolov5_s_mobilevitv2_050-v61_1xb2-1e_coco128.py" ] }, { "cell_type": "markdown", "metadata": { "id": "B2_3IshhsDj_" }, "source": [ "### 2.4 Using a backbone implemented in MMSelfSup (MoCo v3-ResNet)\n", "MMSelfSup [![GitHub release](https://img.shields.io/github/release/open-mmlab/mmselfsup.svg)](https://GitHub.com/open-mmlab/mmselfsup/) currently supports 18 self-supervised algorithms:\n", "\n", "
    \n", "Supported algorithms\n", "\n", "- [x] [Relative Location (ICCV'2015)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/relative_loc)\n", "- [x] [Rotation Prediction (ICLR'2018)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/rotation_pred)\n", "- [x] [DeepCluster (ECCV'2018)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/deepcluster)\n", "- [x] [NPID (CVPR'2018)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/npid)\n", "- [x] [ODC (CVPR'2020)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/odc)\n", "- [x] [MoCo v1 (CVPR'2020)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/mocov1)\n", "- [x] [SimCLR (ICML'2020)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/simclr)\n", "- [x] [MoCo v2 (ArXiv'2020)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/mocov2)\n", "- [x] [BYOL (NeurIPS'2020)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/byol)\n", "- [x] [SwAV (NeurIPS'2020)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/swav)\n", "- [x] [DenseCL (CVPR'2021)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/densecl)\n", "- [x] [SimSiam (CVPR'2021)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/simsiam)\n", "- [x] [Barlow Twins (ICML'2021)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/barlowtwins)\n", "- [x] [MoCo v3 (ICCV'2021)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/mocov3)\n", "- [x] [MAE (CVPR'2022)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/mae)\n", "- [x] [SimMIM (CVPR'2022)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/simmim)\n", "- [x] [MaskFeat (CVPR'2022)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/maskfeat)\n", "- [x] [CAE
(ArXiv'2022)](https://github.com/open-mmlab/mmselfsup/tree/master/configs/selfsup/cae)\n", "\n", "
    \n", "\n", "To use a backbone implemented in MMSelfSup, first install mmselfsup." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "p8HNMjaHsBK0", "outputId": "7b3b9942-ed60-4c58-fc7d-ae501a9b3766" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Looking in links: https://download.openmmlab.com/mmcv/dist/cu113/torch1.12.0/index.html\n", "Collecting mmselfsup>=1.0.0rc3\n", " Downloading mmselfsup-1.0.0rc3-py3-none-any.whl (348 kB)\n", "\u001b[K |████████████████████████████████| 348 kB 23.8 MB/s \n", "\u001b[?25hRequirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (1.0.2)\n", "Requirement already satisfied: attrs in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (22.1.0)\n", "Requirement already satisfied: packaging in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (21.3)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (1.21.6)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (3.2.2)\n", "Requirement already satisfied: mmcls>=1.0.0rc0 in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (1.0.0rc2)\n", "Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (1.15.0)\n", "Requirement already satisfied: tensorboard in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (2.9.1)\n", "Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (0.16.0)\n", "Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (1.7.3)\n", "Requirement already satisfied: tqdm in 
/usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (4.64.1)\n", "Requirement already satisfied: mmdet<3.1.0,>=3.0.0rc0 in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (3.0.0rc3)\n", "Collecting mmsegmentation<1.1.0,>=1.0.0rc0\n", " Downloading mmsegmentation-1.0.0rc1-py3-none-any.whl (806 kB)\n", "\u001b[K |████████████████████████████████| 806 kB 62.1 MB/s \n", "\u001b[?25hRequirement already satisfied: mmcv<2.1.0,>=2.0.0rc1 in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (2.0.0rc2)\n", "Requirement already satisfied: mmengine<1.0.0,>=0.1.0 in /usr/local/lib/python3.7/dist-packages (from mmselfsup>=1.0.0rc3) (0.3.1)\n", "Requirement already satisfied: rich in /usr/local/lib/python3.7/dist-packages (from mmcls>=1.0.0rc0->mmselfsup>=1.0.0rc3) (12.6.0)\n", "Requirement already satisfied: opencv-python>=3 in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmselfsup>=1.0.0rc3) (4.6.0.66)\n", "Requirement already satisfied: yapf in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmselfsup>=1.0.0rc3) (0.32.0)\n", "Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmselfsup>=1.0.0rc3) (7.1.2)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmselfsup>=1.0.0rc3) (6.0)\n", "Requirement already satisfied: addict in /usr/local/lib/python3.7/dist-packages (from mmcv<2.1.0,>=2.0.0rc1->mmselfsup>=1.0.0rc3) (2.4.0)\n", "Requirement already satisfied: pycocotools in /usr/local/lib/python3.7/dist-packages (from mmdet<3.1.0,>=3.0.0rc0->mmselfsup>=1.0.0rc3) (2.0.6)\n", "Requirement already satisfied: terminaltables in /usr/local/lib/python3.7/dist-packages (from mmdet<3.1.0,>=3.0.0rc0->mmselfsup>=1.0.0rc3) (3.1.10)\n", "Requirement already satisfied: termcolor in /usr/local/lib/python3.7/dist-packages (from mmengine<1.0.0,>=0.1.0->mmselfsup>=1.0.0rc3) (2.1.0)\n", 
"Requirement already satisfied: prettytable in /usr/local/lib/python3.7/dist-packages (from mmsegmentation<1.1.0,>=1.0.0rc0->mmselfsup>=1.0.0rc3) (3.5.0)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmselfsup>=1.0.0rc3) (0.11.0)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmselfsup>=1.0.0rc3) (3.0.9)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmselfsup>=1.0.0rc3) (2.8.2)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->mmselfsup>=1.0.0rc3) (1.4.4)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib->mmselfsup>=1.0.0rc3) (4.1.1)\n", "Requirement already satisfied: wcwidth in /usr/local/lib/python3.7/dist-packages (from prettytable->mmsegmentation<1.1.0,>=1.0.0rc0->mmselfsup>=1.0.0rc3) (0.2.5)\n", "Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from prettytable->mmsegmentation<1.1.0,>=1.0.0rc0->mmselfsup>=1.0.0rc3) (4.13.0)\n", "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->prettytable->mmsegmentation<1.1.0,>=1.0.0rc0->mmselfsup>=1.0.0rc3) (3.10.0)\n", "Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in /usr/local/lib/python3.7/dist-packages (from rich->mmcls>=1.0.0rc0->mmselfsup>=1.0.0rc3) (0.9.1)\n", "Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich->mmcls>=1.0.0rc0->mmselfsup>=1.0.0rc3) (2.6.1)\n", "Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn->mmselfsup>=1.0.0rc3) (1.2.0)\n", "Requirement already satisfied: threadpoolctl>=2.0.0 in 
/usr/local/lib/python3.7/dist-packages (from scikit-learn->mmselfsup>=1.0.0rc3) (3.1.0)\n", "Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (2.23.0)\n", "Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (1.50.0)\n", "Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (0.38.3)\n", "Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (0.6.1)\n", "Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (1.0.1)\n", "Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (1.3.0)\n", "Requirement already satisfied: protobuf<3.20,>=3.9.2 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (3.19.6)\n", "Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (2.14.1)\n", "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (3.4.1)\n", "Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (0.4.6)\n", "Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (1.8.1)\n", "Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->mmselfsup>=1.0.0rc3) (57.4.0)\n", "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from 
google-auth<3,>=1.6.3->tensorboard->mmselfsup>=1.0.0rc3) (5.2.0)\n", "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->mmselfsup>=1.0.0rc3) (0.2.8)\n", "Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->mmselfsup>=1.0.0rc3) (4.9)\n", "Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard->mmselfsup>=1.0.0rc3) (1.3.1)\n", "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard->mmselfsup>=1.0.0rc3) (0.4.8)\n", "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->mmselfsup>=1.0.0rc3) (1.24.3)\n", "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->mmselfsup>=1.0.0rc3) (2.10)\n", "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->mmselfsup>=1.0.0rc3) (3.0.4)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->mmselfsup>=1.0.0rc3) (2022.9.24)\n", "Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard->mmselfsup>=1.0.0rc3) (3.2.2)\n", "Installing collected packages: mmsegmentation, mmselfsup\n", "Successfully installed mmsegmentation-1.0.0rc1 mmselfsup-1.0.0rc3\n" ] } ], "source": [ "# Install mmselfsup\n", "!mim install \"mmselfsup>=1.0.0rc3\"" ] }, { "cell_type": "markdown", "metadata": { "id": "U0qxhim1sN_N" }, "source": [ "To use the `ResNet-50` pretrained with 
`MoCo v3` self-supervision in MMSelfSup as the `YOLOv5` backbone, the config file is as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dTm6fVN0sLpH" }, "outputs": [], "source": [ "_base_ = './yolov5_s-v61_1xb2-1e_coco128.py'\n", "\n", "# Import mmselfsup.models so that modules registered in mmselfsup can be used\n", "custom_imports = dict(imports=['mmselfsup.models'], allow_failed_imports=False)\n", "checkpoint_file = 'https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-800e_in1k/mocov3_resnet50_8xb512-amp-coslr-800e_in1k_20220927-e043f51a.pth' # noqa\n", "widen_factor = 1.0\n", "channels = [512, 1024, 2048]\n", "\n", "model = dict(\n", " backbone=dict(\n", " _delete_=True, # Delete the backbone fields inherited from _base_\n", " type='mmselfsup.ResNet',\n", " depth=50,\n", " num_stages=4,\n", " out_indices=(2, 3, 4), # Note: out_indices of ResNet in MMSelfSup are larger by 1 than those in MMDet and MMCls\n", " frozen_stages=1,\n", " norm_cfg=dict(type='BN', requires_grad=True),\n", " norm_eval=True,\n", " style='pytorch',\n", " init_cfg=dict(type='Pretrained', checkpoint=checkpoint_file)),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " widen_factor=widen_factor,\n", " in_channels=channels, # Note: ResNet-50 outputs three feature maps with channels [512, 1024, 2048], which do not match the original yolov5-s neck, so in_channels must be changed\n", " out_channels=channels),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " in_channels=channels, # The input channels of the head must be changed accordingly\n", " widen_factor=widen_factor))\n", ")\n", "\n", "config_res50_mocov3 = f\"\"\"\n", "_base_=\\'{_base_}\\'\n", "widen_factor={widen_factor}\n", "checkpoint_file=\\'{checkpoint_file}\\'\n", "model={model}\n", "\"\"\"\n", "with open('./configs/yolov5/yolov5_s_res50_mocov3-v61_1xb2-1e_coco128.py', 'w') as f:\n", " f.write(config_res50_mocov3)" ] }, { 
"cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "3dLIVIzOsZtc", "outputId": "b90d6c77-138f-4f78-95b1-d77051df040d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": 
[ "11/19 11:33:24 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - Failed to search registry with scope \"mmyolo\" in the \"log_processor\" registry tree. As a workaround, the current \"log_processor\" registry in \"mmengine\" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether \"mmyolo\" is a correct scope, or whether the registry is initialized.\n", "11/19 11:33:24 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - \n", "------------------------------------------------------------\n", "System environment:\n", " sys.platform: linux\n", " Python: 3.7.15 (default, Oct 12 2022, 19:14:55) [GCC 7.5.0]\n", " CUDA available: True\n", " numpy_random_seed: 918106889\n", " GPU 0: Tesla T4\n", " CUDA_HOME: /usr/local/cuda\n", " NVCC: Cuda compilation tools, release 11.2, V11.2.152\n", " GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0\n", " PyTorch: 1.12.1+cu113\n", " PyTorch compiling details: PyTorch built with:\n", " - GCC 9.3\n", " - C++ Version: 201402\n", " - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n", " - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n", " - OpenMP 201511 (a.k.a. 
OpenMP 4.5)\n", " - LAPACK is enabled (usually provided by MKL)\n", " - NNPACK is enabled\n", " - CPU capability usage: AVX2\n", " - CUDA Runtime 11.3\n", " - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n", " - CuDNN 8.3.2 (built against CUDA 11.5)\n", " - Magma 2.5.2\n", " - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n", "\n", " TorchVision: 0.13.1+cu113\n", " OpenCV: 4.6.0\n", " MMEngine: 0.3.1\n", "\n", "Runtime environment:\n", " cudnn_benchmark: True\n", " mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}\n", " 
dist_cfg: {'backend': 'nccl'}\n", " seed: None\n", " Distributed launcher: none\n", " Distributed training: False\n", " GPU number: 1\n", "------------------------------------------------------------\n", "\n", "11/19 11:33:25 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Config:\n", "default_scope = 'mmyolo'\n", "default_hooks = dict(\n", " timer=dict(type='IterTimerHook'),\n", " logger=dict(type='LoggerHook', interval=50),\n", " param_scheduler=dict(\n", " type='YOLOv5ParamSchedulerHook',\n", " scheduler_type='linear',\n", " lr_factor=0.01,\n", " max_epochs=1),\n", " checkpoint=dict(\n", " type='CheckpointHook', interval=10, save_best='auto',\n", " max_keep_ckpts=3),\n", " sampler_seed=dict(type='DistSamplerSeedHook'),\n", " visualization=dict(type='mmdet.DetVisualizationHook'))\n", "env_cfg = dict(\n", " cudnn_benchmark=True,\n", " mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),\n", " dist_cfg=dict(backend='nccl'))\n", "vis_backends = [dict(type='LocalVisBackend')]\n", "visualizer = dict(\n", " type='mmdet.DetLocalVisualizer',\n", " vis_backends=[dict(type='LocalVisBackend')],\n", " name='visualizer')\n", "log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)\n", "log_level = 'INFO'\n", "load_from = None\n", "resume = False\n", "file_client_args = dict(backend='disk')\n", "data_root = 'data/coco/'\n", "dataset_type = 'YOLOv5CocoDataset'\n", "num_classes = 80\n", "img_scale = (640, 640)\n", "deepen_factor = 0.33\n", "widen_factor = 1.0\n", "max_epochs = 1\n", "save_epoch_intervals = 10\n", "train_batch_size_per_gpu = 2\n", "train_num_workers = 2\n", "val_batch_size_per_gpu = 1\n", "val_num_workers = 2\n", "persistent_workers = True\n", "batch_shapes_cfg = dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)\n", "anchors = [[(10, 13), (16, 30), (33, 23)], [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]]\n", "strides = [8, 16, 32]\n", 
"num_det_layers = 3\n", "model = dict(\n", " type='YOLODetector',\n", " data_preprocessor=dict(\n", " type='mmdet.DetDataPreprocessor',\n", " mean=[0.0, 0.0, 0.0],\n", " std=[255.0, 255.0, 255.0],\n", " bgr_to_rgb=True),\n", " backbone=dict(\n", " type='mmselfsup.ResNet',\n", " depth=50,\n", " num_stages=4,\n", " out_indices=(2, 3, 4),\n", " frozen_stages=1,\n", " norm_cfg=dict(type='BN', requires_grad=True),\n", " norm_eval=True,\n", " style='pytorch',\n", " init_cfg=dict(\n", " type='Pretrained',\n", " checkpoint=\n", " 'https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-800e_in1k/mocov3_resnet50_8xb512-amp-coslr-800e_in1k_20220927-e043f51a.pth'\n", " )),\n", " neck=dict(\n", " type='YOLOv5PAFPN',\n", " deepen_factor=0.33,\n", " widen_factor=1.0,\n", " in_channels=[512, 1024, 2048],\n", " out_channels=[512, 1024, 2048],\n", " num_csp_blocks=3,\n", " norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),\n", " act_cfg=dict(type='SiLU', inplace=True)),\n", " bbox_head=dict(\n", " type='YOLOv5Head',\n", " head_module=dict(\n", " type='YOLOv5HeadModule',\n", " num_classes=80,\n", " in_channels=[512, 1024, 2048],\n", " widen_factor=1.0,\n", " featmap_strides=[8, 16, 32],\n", " num_base_priors=3),\n", " prior_generator=dict(\n", " type='mmdet.YOLOAnchorGenerator',\n", " base_sizes=[[(10, 13), (16, 30), (33, 23)],\n", " [(30, 61), (62, 45), (59, 119)],\n", " [(116, 90), (156, 198), (373, 326)]],\n", " strides=[8, 16, 32]),\n", " loss_cls=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=0.5),\n", " loss_bbox=dict(\n", " type='IoULoss',\n", " iou_mode='ciou',\n", " bbox_format='xywh',\n", " eps=1e-07,\n", " reduction='mean',\n", " loss_weight=0.05,\n", " return_iou=True),\n", " loss_obj=dict(\n", " type='mmdet.CrossEntropyLoss',\n", " use_sigmoid=True,\n", " reduction='mean',\n", " loss_weight=1.0),\n", " prior_match_thr=4.0,\n", " obj_level_weights=[4.0, 1.0, 0.4]),\n", " 
test_cfg=dict(\n", " multi_label=True,\n", " nms_pre=30000,\n", " score_thr=0.001,\n", " nms=dict(type='nms', iou_threshold=0.65),\n", " max_per_img=300))\n", "albu_train_transforms = [\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", "]\n", "pre_transform = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", "]\n", "train_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',\n", " 'flip_direction'))\n", "]\n", "train_dataloader = dict(\n", " batch_size=2,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " sampler=dict(type='DefaultSampler', shuffle=True),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " 
data_root='data/coco/',\n", " ann_file='annotations/instances_train2017.json',\n", " data_prefix=dict(img='train2017/'),\n", " filter_cfg=dict(filter_empty_gt=False, min_size=32),\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True),\n", " dict(\n", " type='Mosaic',\n", " img_scale=(640, 640),\n", " pad_val=114.0,\n", " pre_transform=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='LoadAnnotations', with_bbox=True)\n", " ]),\n", " dict(\n", " type='YOLOv5RandomAffine',\n", " max_rotate_degree=0.0,\n", " max_shear_degree=0.0,\n", " scaling_ratio_range=(0.5, 1.5),\n", " border=(-320, -320),\n", " border_val=(114, 114, 114)),\n", " dict(\n", " type='mmdet.Albu',\n", " transforms=[\n", " dict(type='Blur', p=0.01),\n", " dict(type='MedianBlur', p=0.01),\n", " dict(type='ToGray', p=0.01),\n", " dict(type='CLAHE', p=0.01)\n", " ],\n", " bbox_params=dict(\n", " type='BboxParams',\n", " format='pascal_voc',\n", " label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),\n", " keymap=dict(img='image', gt_bboxes='bboxes')),\n", " dict(type='YOLOv5HSVRandomAug'),\n", " dict(type='mmdet.RandomFlip', prob=0.5),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'flip', 'flip_direction'))\n", " ]))\n", "test_pipeline = [\n", " dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", "]\n", "val_dataloader = dict(\n", " batch_size=1,\n", " 
num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "test_dataloader = dict(\n", " batch_size=1,\n", " num_workers=2,\n", " persistent_workers=True,\n", " pin_memory=True,\n", " drop_last=False,\n", " sampler=dict(type='DefaultSampler', shuffle=False),\n", " dataset=dict(\n", " type='YOLOv5CocoDataset',\n", " data_root='data/coco/',\n", " test_mode=True,\n", " data_prefix=dict(img='val2017/'),\n", " ann_file='annotations/instances_val2017.json',\n", " pipeline=[\n", " dict(\n", " type='LoadImageFromFile',\n", " file_client_args=dict(backend='disk')),\n", " dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),\n", " dict(\n", " type='LetterResize',\n", " scale=(640, 640),\n", " allow_scale_up=False,\n", " pad_val=dict(img=114)),\n", " dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),\n", " dict(\n", " type='mmdet.PackDetInputs',\n", " meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',\n", " 'scale_factor', 'pad_param'))\n", " ],\n", " batch_shapes_cfg=dict(\n", " type='BatchShapePolicy',\n", " 
batch_size=1,\n", " img_size=640,\n", " size_divisor=32,\n", " extra_pad_ratio=0.5)))\n", "param_scheduler = None\n", "optim_wrapper = dict(\n", " type='OptimWrapper',\n", " optimizer=dict(\n", " type='SGD',\n", " lr=0.01,\n", " momentum=0.937,\n", " weight_decay=0.0005,\n", " nesterov=True,\n", " batch_size_per_gpu=2),\n", " constructor='YOLOv5OptimizerConstructor')\n", "custom_hooks = [\n", " dict(\n", " type='EMAHook',\n", " ema_type='ExpMomentumEMA',\n", " momentum=0.0001,\n", " update_buffers=True,\n", " strict_load=False,\n", " priority=49)\n", "]\n", "val_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "test_evaluator = dict(\n", " type='mmdet.CocoMetric',\n", " proposal_nums=(100, 1, 10),\n", " ann_file='data/coco/annotations/instances_val2017.json',\n", " metric='bbox')\n", "train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1, val_interval=10)\n", "val_cfg = dict(type='ValLoop')\n", "test_cfg = dict(type='TestLoop')\n", "checkpoint_file = 'https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-800e_in1k/mocov3_resnet50_8xb512-amp-coslr-800e_in1k_20220927-e043f51a.pth'\n", "launcher = 'none'\n", "work_dir = './work_dirs/yolov5_s_res50_mocov3-v61_1xb2-1e_coco128'\n", "\n", "11/19 11:33:25 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Result has been saved to /content/mmyolo/work_dirs/yolov5_s_res50_mocov3-v61_1xb2-1e_coco128/modules_statistic_results.json\n", "11/19 11:33:30 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "loading annotations into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "loading annotations 
into memory...\n", "Done (t=0.01s)\n", "creating index...\n", "index created!\n", "11/19 11:33:34 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - load model from: https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-800e_in1k/mocov3_resnet50_8xb512-amp-coslr-800e_in1k_20220927-e043f51a.pth\n", "11/19 11:33:34 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - http loads checkpoint from path: https://download.openmmlab.com/mmselfsup/1.x/mocov3/mocov3_resnet50_8xb512-amp-coslr-800e_in1k/mocov3_resnet50_8xb512-amp-coslr-800e_in1k_20220927-e043f51a.pth\n", "11/19 11:33:34 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - The model and loaded state dict do not match exactly\n", "\n", "unexpected key in source state_dict: data_preprocessor.mean, data_preprocessor.std, backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.bn1.num_batches_tracked, backbone.layer1.0.conv1.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.bn1.num_batches_tracked, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.bn2.num_batches_tracked, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.bn3.num_batches_tracked, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.0.downsample.1.num_batches_tracked, backbone.layer1.1.conv1.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, 
backbone.layer1.1.bn1.num_batches_tracked, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.bn2.num_batches_tracked, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.1.bn3.num_batches_tracked, backbone.layer1.2.conv1.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.bn1.num_batches_tracked, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, backbone.layer1.2.bn2.running_mean, backbone.layer1.2.bn2.running_var, backbone.layer1.2.bn2.num_batches_tracked, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer1.2.bn3.num_batches_tracked, backbone.layer2.0.conv1.weight, backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.bn1.num_batches_tracked, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.bn2.num_batches_tracked, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.bn3.num_batches_tracked, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.0.downsample.1.num_batches_tracked, backbone.layer2.1.conv1.weight, backbone.layer2.1.bn1.weight, 
backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.bn1.num_batches_tracked, backbone.layer2.1.conv2.weight, backbone.layer2.1.bn2.weight, backbone.layer2.1.bn2.bias, backbone.layer2.1.bn2.running_mean, backbone.layer2.1.bn2.running_var, backbone.layer2.1.bn2.num_batches_tracked, backbone.layer2.1.conv3.weight, backbone.layer2.1.bn3.weight, backbone.layer2.1.bn3.bias, backbone.layer2.1.bn3.running_mean, backbone.layer2.1.bn3.running_var, backbone.layer2.1.bn3.num_batches_tracked, backbone.layer2.2.conv1.weight, backbone.layer2.2.bn1.weight, backbone.layer2.2.bn1.bias, backbone.layer2.2.bn1.running_mean, backbone.layer2.2.bn1.running_var, backbone.layer2.2.bn1.num_batches_tracked, backbone.layer2.2.conv2.weight, backbone.layer2.2.bn2.weight, backbone.layer2.2.bn2.bias, backbone.layer2.2.bn2.running_mean, backbone.layer2.2.bn2.running_var, backbone.layer2.2.bn2.num_batches_tracked, backbone.layer2.2.conv3.weight, backbone.layer2.2.bn3.weight, backbone.layer2.2.bn3.bias, backbone.layer2.2.bn3.running_mean, backbone.layer2.2.bn3.running_var, backbone.layer2.2.bn3.num_batches_tracked, backbone.layer2.3.conv1.weight, backbone.layer2.3.bn1.weight, backbone.layer2.3.bn1.bias, backbone.layer2.3.bn1.running_mean, backbone.layer2.3.bn1.running_var, backbone.layer2.3.bn1.num_batches_tracked, backbone.layer2.3.conv2.weight, backbone.layer2.3.bn2.weight, backbone.layer2.3.bn2.bias, backbone.layer2.3.bn2.running_mean, backbone.layer2.3.bn2.running_var, backbone.layer2.3.bn2.num_batches_tracked, backbone.layer2.3.conv3.weight, backbone.layer2.3.bn3.weight, backbone.layer2.3.bn3.bias, backbone.layer2.3.bn3.running_mean, backbone.layer2.3.bn3.running_var, backbone.layer2.3.bn3.num_batches_tracked, backbone.layer3.0.conv1.weight, backbone.layer3.0.bn1.weight, backbone.layer3.0.bn1.bias, backbone.layer3.0.bn1.running_mean, backbone.layer3.0.bn1.running_var, backbone.layer3.0.bn1.num_batches_tracked, 
backbone.layer3.0.conv2.weight, backbone.layer3.0.bn2.weight, backbone.layer3.0.bn2.bias, backbone.layer3.0.bn2.running_mean, backbone.layer3.0.bn2.running_var, backbone.layer3.0.bn2.num_batches_tracked, backbone.layer3.0.conv3.weight, backbone.layer3.0.bn3.weight, backbone.layer3.0.bn3.bias, backbone.layer3.0.bn3.running_mean, backbone.layer3.0.bn3.running_var, backbone.layer3.0.bn3.num_batches_tracked, backbone.layer3.0.downsample.0.weight, backbone.layer3.0.downsample.1.weight, backbone.layer3.0.downsample.1.bias, backbone.layer3.0.downsample.1.running_mean, backbone.layer3.0.downsample.1.running_var, backbone.layer3.0.downsample.1.num_batches_tracked, backbone.layer3.1.conv1.weight, backbone.layer3.1.bn1.weight, backbone.layer3.1.bn1.bias, backbone.layer3.1.bn1.running_mean, backbone.layer3.1.bn1.running_var, backbone.layer3.1.bn1.num_batches_tracked, backbone.layer3.1.conv2.weight, backbone.layer3.1.bn2.weight, backbone.layer3.1.bn2.bias, backbone.layer3.1.bn2.running_mean, backbone.layer3.1.bn2.running_var, backbone.layer3.1.bn2.num_batches_tracked, backbone.layer3.1.conv3.weight, backbone.layer3.1.bn3.weight, backbone.layer3.1.bn3.bias, backbone.layer3.1.bn3.running_mean, backbone.layer3.1.bn3.running_var, backbone.layer3.1.bn3.num_batches_tracked, backbone.layer3.2.conv1.weight, backbone.layer3.2.bn1.weight, backbone.layer3.2.bn1.bias, backbone.layer3.2.bn1.running_mean, backbone.layer3.2.bn1.running_var, backbone.layer3.2.bn1.num_batches_tracked, backbone.layer3.2.conv2.weight, backbone.layer3.2.bn2.weight, backbone.layer3.2.bn2.bias, backbone.layer3.2.bn2.running_mean, backbone.layer3.2.bn2.running_var, backbone.layer3.2.bn2.num_batches_tracked, backbone.layer3.2.conv3.weight, backbone.layer3.2.bn3.weight, backbone.layer3.2.bn3.bias, backbone.layer3.2.bn3.running_mean, backbone.layer3.2.bn3.running_var, backbone.layer3.2.bn3.num_batches_tracked, backbone.layer3.3.conv1.weight, backbone.layer3.3.bn1.weight, backbone.layer3.3.bn1.bias, 
backbone.layer3.3.bn1.running_mean, backbone.layer3.3.bn1.running_var, backbone.layer3.3.bn1.num_batches_tracked, backbone.layer3.3.conv2.weight, backbone.layer3.3.bn2.weight, backbone.layer3.3.bn2.bias, backbone.layer3.3.bn2.running_mean, backbone.layer3.3.bn2.running_var, backbone.layer3.3.bn2.num_batches_tracked, backbone.layer3.3.conv3.weight, backbone.layer3.3.bn3.weight, backbone.layer3.3.bn3.bias, backbone.layer3.3.bn3.running_mean, backbone.layer3.3.bn3.running_var, backbone.layer3.3.bn3.num_batches_tracked, backbone.layer3.4.conv1.weight, backbone.layer3.4.bn1.weight, backbone.layer3.4.bn1.bias, backbone.layer3.4.bn1.running_mean, backbone.layer3.4.bn1.running_var, backbone.layer3.4.bn1.num_batches_tracked, backbone.layer3.4.conv2.weight, backbone.layer3.4.bn2.weight, backbone.layer3.4.bn2.bias, backbone.layer3.4.bn2.running_mean, backbone.layer3.4.bn2.running_var, backbone.layer3.4.bn2.num_batches_tracked, backbone.layer3.4.conv3.weight, backbone.layer3.4.bn3.weight, backbone.layer3.4.bn3.bias, backbone.layer3.4.bn3.running_mean, backbone.layer3.4.bn3.running_var, backbone.layer3.4.bn3.num_batches_tracked, backbone.layer3.5.conv1.weight, backbone.layer3.5.bn1.weight, backbone.layer3.5.bn1.bias, backbone.layer3.5.bn1.running_mean, backbone.layer3.5.bn1.running_var, backbone.layer3.5.bn1.num_batches_tracked, backbone.layer3.5.conv2.weight, backbone.layer3.5.bn2.weight, backbone.layer3.5.bn2.bias, backbone.layer3.5.bn2.running_mean, backbone.layer3.5.bn2.running_var, backbone.layer3.5.bn2.num_batches_tracked, backbone.layer3.5.conv3.weight, backbone.layer3.5.bn3.weight, backbone.layer3.5.bn3.bias, backbone.layer3.5.bn3.running_mean, backbone.layer3.5.bn3.running_var, backbone.layer3.5.bn3.num_batches_tracked, backbone.layer4.0.conv1.weight, backbone.layer4.0.bn1.weight, backbone.layer4.0.bn1.bias, backbone.layer4.0.bn1.running_mean, backbone.layer4.0.bn1.running_var, backbone.layer4.0.bn1.num_batches_tracked, backbone.layer4.0.conv2.weight, 
backbone.layer4.0.bn2.weight, backbone.layer4.0.bn2.bias, backbone.layer4.0.bn2.running_mean, backbone.layer4.0.bn2.running_var, backbone.layer4.0.bn2.num_batches_tracked, backbone.layer4.0.conv3.weight, backbone.layer4.0.bn3.weight, backbone.layer4.0.bn3.bias, backbone.layer4.0.bn3.running_mean, backbone.layer4.0.bn3.running_var, backbone.layer4.0.bn3.num_batches_tracked, backbone.layer4.0.downsample.0.weight, backbone.layer4.0.downsample.1.weight, backbone.layer4.0.downsample.1.bias, backbone.layer4.0.downsample.1.running_mean, backbone.layer4.0.downsample.1.running_var, backbone.layer4.0.downsample.1.num_batches_tracked, backbone.layer4.1.conv1.weight, backbone.layer4.1.bn1.weight, backbone.layer4.1.bn1.bias, backbone.layer4.1.bn1.running_mean, backbone.layer4.1.bn1.running_var, backbone.layer4.1.bn1.num_batches_tracked, backbone.layer4.1.conv2.weight, backbone.layer4.1.bn2.weight, backbone.layer4.1.bn2.bias, backbone.layer4.1.bn2.running_mean, backbone.layer4.1.bn2.running_var, backbone.layer4.1.bn2.num_batches_tracked, backbone.layer4.1.conv3.weight, backbone.layer4.1.bn3.weight, backbone.layer4.1.bn3.bias, backbone.layer4.1.bn3.running_mean, backbone.layer4.1.bn3.running_var, backbone.layer4.1.bn3.num_batches_tracked, backbone.layer4.2.conv1.weight, backbone.layer4.2.bn1.weight, backbone.layer4.2.bn1.bias, backbone.layer4.2.bn1.running_mean, backbone.layer4.2.bn1.running_var, backbone.layer4.2.bn1.num_batches_tracked, backbone.layer4.2.conv2.weight, backbone.layer4.2.bn2.weight, backbone.layer4.2.bn2.bias, backbone.layer4.2.bn2.running_mean, backbone.layer4.2.bn2.running_var, backbone.layer4.2.bn2.num_batches_tracked, backbone.layer4.2.conv3.weight, backbone.layer4.2.bn3.weight, backbone.layer4.2.bn3.bias, backbone.layer4.2.bn3.running_mean, backbone.layer4.2.bn3.running_var, backbone.layer4.2.bn3.num_batches_tracked, neck.fc0.weight, neck.bn0.weight, neck.bn0.bias, neck.bn0.running_mean, neck.bn0.running_var, neck.bn0.num_batches_tracked, neck.fc1.weight, 
neck.bn1.running_mean, neck.bn1.running_var, neck.bn1.num_batches_tracked, head.predictor.fc0.weight, head.predictor.bn0.weight, head.predictor.bn0.bias, head.predictor.bn0.running_mean, head.predictor.bn0.running_var, head.predictor.bn0.num_batches_tracked, head.predictor.fc1.weight, momentum_encoder.steps, momentum_encoder.module.0.conv1.weight, momentum_encoder.module.0.bn1.weight, momentum_encoder.module.0.bn1.bias, momentum_encoder.module.0.bn1.running_mean, momentum_encoder.module.0.bn1.running_var, momentum_encoder.module.0.bn1.num_batches_tracked, momentum_encoder.module.0.layer1.0.conv1.weight, momentum_encoder.module.0.layer1.0.bn1.weight, momentum_encoder.module.0.layer1.0.bn1.bias, momentum_encoder.module.0.layer1.0.bn1.running_mean, momentum_encoder.module.0.layer1.0.bn1.running_var, momentum_encoder.module.0.layer1.0.bn1.num_batches_tracked, momentum_encoder.module.0.layer1.0.conv2.weight, momentum_encoder.module.0.layer1.0.bn2.weight, momentum_encoder.module.0.layer1.0.bn2.bias, momentum_encoder.module.0.layer1.0.bn2.running_mean, momentum_encoder.module.0.layer1.0.bn2.running_var, momentum_encoder.module.0.layer1.0.bn2.num_batches_tracked, momentum_encoder.module.0.layer1.0.conv3.weight, momentum_encoder.module.0.layer1.0.bn3.weight, momentum_encoder.module.0.layer1.0.bn3.bias, momentum_encoder.module.0.layer1.0.bn3.running_mean, momentum_encoder.module.0.layer1.0.bn3.running_var, momentum_encoder.module.0.layer1.0.bn3.num_batches_tracked, momentum_encoder.module.0.layer1.0.downsample.0.weight, momentum_encoder.module.0.layer1.0.downsample.1.weight, momentum_encoder.module.0.layer1.0.downsample.1.bias, momentum_encoder.module.0.layer1.0.downsample.1.running_mean, momentum_encoder.module.0.layer1.0.downsample.1.running_var, momentum_encoder.module.0.layer1.0.downsample.1.num_batches_tracked, momentum_encoder.module.0.layer1.1.conv1.weight, momentum_encoder.module.0.layer1.1.bn1.weight, momentum_encoder.module.0.layer1.1.bn1.bias, 
momentum_encoder.module.0.layer1.1.bn1.running_mean, momentum_encoder.module.0.layer1.1.bn1.running_var, momentum_encoder.module.0.layer1.1.bn1.num_batches_tracked, momentum_encoder.module.0.layer1.1.conv2.weight, momentum_encoder.module.0.layer1.1.bn2.weight, momentum_encoder.module.0.layer1.1.bn2.bias, momentum_encoder.module.0.layer1.1.bn2.running_mean, momentum_encoder.module.0.layer1.1.bn2.running_var, momentum_encoder.module.0.layer1.1.bn2.num_batches_tracked, momentum_encoder.module.0.layer1.1.conv3.weight, momentum_encoder.module.0.layer1.1.bn3.weight, momentum_encoder.module.0.layer1.1.bn3.bias, momentum_encoder.module.0.layer1.1.bn3.running_mean, momentum_encoder.module.0.layer1.1.bn3.running_var, momentum_encoder.module.0.layer1.1.bn3.num_batches_tracked, momentum_encoder.module.0.layer1.2.conv1.weight, momentum_encoder.module.0.layer1.2.bn1.weight, momentum_encoder.module.0.layer1.2.bn1.bias, momentum_encoder.module.0.layer1.2.bn1.running_mean, momentum_encoder.module.0.layer1.2.bn1.running_var, momentum_encoder.module.0.layer1.2.bn1.num_batches_tracked, momentum_encoder.module.0.layer1.2.conv2.weight, momentum_encoder.module.0.layer1.2.bn2.weight, momentum_encoder.module.0.layer1.2.bn2.bias, momentum_encoder.module.0.layer1.2.bn2.running_mean, momentum_encoder.module.0.layer1.2.bn2.running_var, momentum_encoder.module.0.layer1.2.bn2.num_batches_tracked, momentum_encoder.module.0.layer1.2.conv3.weight, momentum_encoder.module.0.layer1.2.bn3.weight, momentum_encoder.module.0.layer1.2.bn3.bias, momentum_encoder.module.0.layer1.2.bn3.running_mean, momentum_encoder.module.0.layer1.2.bn3.running_var, momentum_encoder.module.0.layer1.2.bn3.num_batches_tracked, momentum_encoder.module.0.layer2.0.conv1.weight, momentum_encoder.module.0.layer2.0.bn1.weight, momentum_encoder.module.0.layer2.0.bn1.bias, momentum_encoder.module.0.layer2.0.bn1.running_mean, momentum_encoder.module.0.layer2.0.bn1.running_var, 
momentum_encoder.module.0.layer2.0.bn1.num_batches_tracked, momentum_encoder.module.0.layer2.0.conv2.weight, momentum_encoder.module.0.layer2.0.bn2.weight, momentum_encoder.module.0.layer2.0.bn2.bias, momentum_encoder.module.0.layer2.0.bn2.running_mean, momentum_encoder.module.0.layer2.0.bn2.running_var, momentum_encoder.module.0.layer2.0.bn2.num_batches_tracked, momentum_encoder.module.0.layer2.0.conv3.weight, momentum_encoder.module.0.layer2.0.bn3.weight, momentum_encoder.module.0.layer2.0.bn3.bias, momentum_encoder.module.0.layer2.0.bn3.running_mean, momentum_encoder.module.0.layer2.0.bn3.running_var, momentum_encoder.module.0.layer2.0.bn3.num_batches_tracked, momentum_encoder.module.0.layer2.0.downsample.0.weight, momentum_encoder.module.0.layer2.0.downsample.1.weight, momentum_encoder.module.0.layer2.0.downsample.1.bias, momentum_encoder.module.0.layer2.0.downsample.1.running_mean, momentum_encoder.module.0.layer2.0.downsample.1.running_var, momentum_encoder.module.0.layer2.0.downsample.1.num_batches_tracked, momentum_encoder.module.0.layer2.1.conv1.weight, momentum_encoder.module.0.layer2.1.bn1.weight, momentum_encoder.module.0.layer2.1.bn1.bias, momentum_encoder.module.0.layer2.1.bn1.running_mean, momentum_encoder.module.0.layer2.1.bn1.running_var, momentum_encoder.module.0.layer2.1.bn1.num_batches_tracked, momentum_encoder.module.0.layer2.1.conv2.weight, momentum_encoder.module.0.layer2.1.bn2.weight, momentum_encoder.module.0.layer2.1.bn2.bias, momentum_encoder.module.0.layer2.1.bn2.running_mean, momentum_encoder.module.0.layer2.1.bn2.running_var, momentum_encoder.module.0.layer2.1.bn2.num_batches_tracked, momentum_encoder.module.0.layer2.1.conv3.weight, momentum_encoder.module.0.layer2.1.bn3.weight, momentum_encoder.module.0.layer2.1.bn3.bias, momentum_encoder.module.0.layer2.1.bn3.running_mean, momentum_encoder.module.0.layer2.1.bn3.running_var, momentum_encoder.module.0.layer2.1.bn3.num_batches_tracked, momentum_encoder.module.0.layer2.2.conv1.weight, 
momentum_encoder.module.0.layer2.2.bn1.weight, momentum_encoder.module.0.layer2.2.bn1.bias, momentum_encoder.module.0.layer2.2.bn1.running_mean, momentum_encoder.module.0.layer2.2.bn1.running_var, momentum_encoder.module.0.layer2.2.bn1.num_batches_tracked, momentum_encoder.module.0.layer2.2.conv2.weight, momentum_encoder.module.0.layer2.2.bn2.weight, momentum_encoder.module.0.layer2.2.bn2.bias, momentum_encoder.module.0.layer2.2.bn2.running_mean, momentum_encoder.module.0.layer2.2.bn2.running_var, momentum_encoder.module.0.layer2.2.bn2.num_batches_tracked, momentum_encoder.module.0.layer2.2.conv3.weight, momentum_encoder.module.0.layer2.2.bn3.weight, momentum_encoder.module.0.layer2.2.bn3.bias, momentum_encoder.module.0.layer2.2.bn3.running_mean, momentum_encoder.module.0.layer2.2.bn3.running_var, momentum_encoder.module.0.layer2.2.bn3.num_batches_tracked, momentum_encoder.module.0.layer2.3.conv1.weight, momentum_encoder.module.0.layer2.3.bn1.weight, momentum_encoder.module.0.layer2.3.bn1.bias, momentum_encoder.module.0.layer2.3.bn1.running_mean, momentum_encoder.module.0.layer2.3.bn1.running_var, momentum_encoder.module.0.layer2.3.bn1.num_batches_tracked, momentum_encoder.module.0.layer2.3.conv2.weight, momentum_encoder.module.0.layer2.3.bn2.weight, momentum_encoder.module.0.layer2.3.bn2.bias, momentum_encoder.module.0.layer2.3.bn2.running_mean, momentum_encoder.module.0.layer2.3.bn2.running_var, momentum_encoder.module.0.layer2.3.bn2.num_batches_tracked, momentum_encoder.module.0.layer2.3.conv3.weight, momentum_encoder.module.0.layer2.3.bn3.weight, momentum_encoder.module.0.layer2.3.bn3.bias, momentum_encoder.module.0.layer2.3.bn3.running_mean, momentum_encoder.module.0.layer2.3.bn3.running_var, momentum_encoder.module.0.layer2.3.bn3.num_batches_tracked, momentum_encoder.module.0.layer3.0.conv1.weight, momentum_encoder.module.0.layer3.0.bn1.weight, momentum_encoder.module.0.layer3.0.bn1.bias, momentum_encoder.module.0.layer3.0.bn1.running_mean, 
momentum_encoder.module.0.layer3.0.bn1.running_var, momentum_encoder.module.0.layer3.0.bn1.num_batches_tracked, momentum_encoder.module.0.layer3.0.conv2.weight, momentum_encoder.module.0.layer3.0.bn2.weight, momentum_encoder.module.0.layer3.0.bn2.bias, momentum_encoder.module.0.layer3.0.bn2.running_mean, momentum_encoder.module.0.layer3.0.bn2.running_var, momentum_encoder.module.0.layer3.0.bn2.num_batches_tracked, momentum_encoder.module.0.layer3.0.conv3.weight, momentum_encoder.module.0.layer3.0.bn3.weight, momentum_encoder.module.0.layer3.0.bn3.bias, momentum_encoder.module.0.layer3.0.bn3.running_mean, momentum_encoder.module.0.layer3.0.bn3.running_var, momentum_encoder.module.0.layer3.0.bn3.num_batches_tracked, momentum_encoder.module.0.layer3.0.downsample.0.weight, momentum_encoder.module.0.layer3.0.downsample.1.weight, momentum_encoder.module.0.layer3.0.downsample.1.bias, momentum_encoder.module.0.layer3.0.downsample.1.running_mean, momentum_encoder.module.0.layer3.0.downsample.1.running_var, momentum_encoder.module.0.layer3.0.downsample.1.num_batches_tracked, momentum_encoder.module.0.layer3.1.conv1.weight, momentum_encoder.module.0.layer3.1.bn1.weight, momentum_encoder.module.0.layer3.1.bn1.bias, momentum_encoder.module.0.layer3.1.bn1.running_mean, momentum_encoder.module.0.layer3.1.bn1.running_var, momentum_encoder.module.0.layer3.1.bn1.num_batches_tracked, momentum_encoder.module.0.layer3.1.conv2.weight, momentum_encoder.module.0.layer3.1.bn2.weight, momentum_encoder.module.0.layer3.1.bn2.bias, momentum_encoder.module.0.layer3.1.bn2.running_mean, momentum_encoder.module.0.layer3.1.bn2.running_var, momentum_encoder.module.0.layer3.1.bn2.num_batches_tracked, momentum_encoder.module.0.layer3.1.conv3.weight, momentum_encoder.module.0.layer3.1.bn3.weight, momentum_encoder.module.0.layer3.1.bn3.bias, momentum_encoder.module.0.layer3.1.bn3.running_mean, momentum_encoder.module.0.layer3.1.bn3.running_var, momentum_encoder.module.0.layer3.1.bn3.num_batches_tracked, 
momentum_encoder.module.0.layer3.2.conv1.weight, momentum_encoder.module.0.layer3.2.bn1.weight, momentum_encoder.module.0.layer3.2.bn1.bias, momentum_encoder.module.0.layer3.2.bn1.running_mean, momentum_encoder.module.0.layer3.2.bn1.running_var, momentum_encoder.module.0.layer3.2.bn1.num_batches_tracked, momentum_encoder.module.0.layer3.2.conv2.weight, momentum_encoder.module.0.layer3.2.bn2.weight, momentum_encoder.module.0.layer3.2.bn2.bias, momentum_encoder.module.0.layer3.2.bn2.running_mean, momentum_encoder.module.0.layer3.2.bn2.running_var, momentum_encoder.module.0.layer3.2.bn2.num_batches_tracked, momentum_encoder.module.0.layer3.2.conv3.weight, momentum_encoder.module.0.layer3.2.bn3.weight, momentum_encoder.module.0.layer3.2.bn3.bias, momentum_encoder.module.0.layer3.2.bn3.running_mean, momentum_encoder.module.0.layer3.2.bn3.running_var, momentum_encoder.module.0.layer3.2.bn3.num_batches_tracked, momentum_encoder.module.0.layer3.3.conv1.weight, momentum_encoder.module.0.layer3.3.bn1.weight, momentum_encoder.module.0.layer3.3.bn1.bias, momentum_encoder.module.0.layer3.3.bn1.running_mean, momentum_encoder.module.0.layer3.3.bn1.running_var, momentum_encoder.module.0.layer3.3.bn1.num_batches_tracked, momentum_encoder.module.0.layer3.3.conv2.weight, momentum_encoder.module.0.layer3.3.bn2.weight, momentum_encoder.module.0.layer3.3.bn2.bias, momentum_encoder.module.0.layer3.3.bn2.running_mean, momentum_encoder.module.0.layer3.3.bn2.running_var, momentum_encoder.module.0.layer3.3.bn2.num_batches_tracked, momentum_encoder.module.0.layer3.3.conv3.weight, momentum_encoder.module.0.layer3.3.bn3.weight, momentum_encoder.module.0.layer3.3.bn3.bias, momentum_encoder.module.0.layer3.3.bn3.running_mean, momentum_encoder.module.0.layer3.3.bn3.running_var, momentum_encoder.module.0.layer3.3.bn3.num_batches_tracked, momentum_encoder.module.0.layer3.4.conv1.weight, momentum_encoder.module.0.layer3.4.bn1.weight, momentum_encoder.module.0.layer3.4.bn1.bias, 
momentum_encoder.module.0.layer3.4.bn1.running_mean, momentum_encoder.module.0.layer3.4.bn1.running_var, momentum_encoder.module.0.layer3.4.bn1.num_batches_tracked, momentum_encoder.module.0.layer3.4.conv2.weight, momentum_encoder.module.0.layer3.4.bn2.weight, momentum_encoder.module.0.layer3.4.bn2.bias, momentum_encoder.module.0.layer3.4.bn2.running_mean, momentum_encoder.module.0.layer3.4.bn2.running_var, momentum_encoder.module.0.layer3.4.bn2.num_batches_tracked, momentum_encoder.module.0.layer3.4.conv3.weight, momentum_encoder.module.0.layer3.4.bn3.weight, momentum_encoder.module.0.layer3.4.bn3.bias, momentum_encoder.module.0.layer3.4.bn3.running_mean, momentum_encoder.module.0.layer3.4.bn3.running_var, momentum_encoder.module.0.layer3.4.bn3.num_batches_tracked, momentum_encoder.module.0.layer3.5.conv1.weight, momentum_encoder.module.0.layer3.5.bn1.weight, momentum_encoder.module.0.layer3.5.bn1.bias, momentum_encoder.module.0.layer3.5.bn1.running_mean, momentum_encoder.module.0.layer3.5.bn1.running_var, momentum_encoder.module.0.layer3.5.bn1.num_batches_tracked, momentum_encoder.module.0.layer3.5.conv2.weight, momentum_encoder.module.0.layer3.5.bn2.weight, momentum_encoder.module.0.layer3.5.bn2.bias, momentum_encoder.module.0.layer3.5.bn2.running_mean, momentum_encoder.module.0.layer3.5.bn2.running_var, momentum_encoder.module.0.layer3.5.bn2.num_batches_tracked, momentum_encoder.module.0.layer3.5.conv3.weight, momentum_encoder.module.0.layer3.5.bn3.weight, momentum_encoder.module.0.layer3.5.bn3.bias, momentum_encoder.module.0.layer3.5.bn3.running_mean, momentum_encoder.module.0.layer3.5.bn3.running_var, momentum_encoder.module.0.layer3.5.bn3.num_batches_tracked, momentum_encoder.module.0.layer4.0.conv1.weight, momentum_encoder.module.0.layer4.0.bn1.weight, momentum_encoder.module.0.layer4.0.bn1.bias, momentum_encoder.module.0.layer4.0.bn1.running_mean, momentum_encoder.module.0.layer4.0.bn1.running_var, 
momentum_encoder.module.0.layer4.0.bn1.num_batches_tracked, momentum_encoder.module.0.layer4.0.conv2.weight, momentum_encoder.module.0.layer4.0.bn2.weight, momentum_encoder.module.0.layer4.0.bn2.bias, momentum_encoder.module.0.layer4.0.bn2.running_mean, momentum_encoder.module.0.layer4.0.bn2.running_var, momentum_encoder.module.0.layer4.0.bn2.num_batches_tracked, momentum_encoder.module.0.layer4.0.conv3.weight, momentum_encoder.module.0.layer4.0.bn3.weight, momentum_encoder.module.0.layer4.0.bn3.bias, momentum_encoder.module.0.layer4.0.bn3.running_mean, momentum_encoder.module.0.layer4.0.bn3.running_var, momentum_encoder.module.0.layer4.0.bn3.num_batches_tracked, momentum_encoder.module.0.layer4.0.downsample.0.weight, momentum_encoder.module.0.layer4.0.downsample.1.weight, momentum_encoder.module.0.layer4.0.downsample.1.bias, momentum_encoder.module.0.layer4.0.downsample.1.running_mean, momentum_encoder.module.0.layer4.0.downsample.1.running_var, momentum_encoder.module.0.layer4.0.downsample.1.num_batches_tracked, momentum_encoder.module.0.layer4.1.conv1.weight, momentum_encoder.module.0.layer4.1.bn1.weight, momentum_encoder.module.0.layer4.1.bn1.bias, momentum_encoder.module.0.layer4.1.bn1.running_mean, momentum_encoder.module.0.layer4.1.bn1.running_var, momentum_encoder.module.0.layer4.1.bn1.num_batches_tracked, momentum_encoder.module.0.layer4.1.conv2.weight, momentum_encoder.module.0.layer4.1.bn2.weight, momentum_encoder.module.0.layer4.1.bn2.bias, momentum_encoder.module.0.layer4.1.bn2.running_mean, momentum_encoder.module.0.layer4.1.bn2.running_var, momentum_encoder.module.0.layer4.1.bn2.num_batches_tracked, momentum_encoder.module.0.layer4.1.conv3.weight, momentum_encoder.module.0.layer4.1.bn3.weight, momentum_encoder.module.0.layer4.1.bn3.bias, momentum_encoder.module.0.layer4.1.bn3.running_mean, momentum_encoder.module.0.layer4.1.bn3.running_var, momentum_encoder.module.0.layer4.1.bn3.num_batches_tracked, momentum_encoder.module.0.layer4.2.conv1.weight, 
momentum_encoder.module.0.layer4.2.bn1.weight, momentum_encoder.module.0.layer4.2.bn1.bias, momentum_encoder.module.0.layer4.2.bn1.running_mean, momentum_encoder.module.0.layer4.2.bn1.running_var, momentum_encoder.module.0.layer4.2.bn1.num_batches_tracked, momentum_encoder.module.0.layer4.2.conv2.weight, momentum_encoder.module.0.layer4.2.bn2.weight, momentum_encoder.module.0.layer4.2.bn2.bias, momentum_encoder.module.0.layer4.2.bn2.running_mean, momentum_encoder.module.0.layer4.2.bn2.running_var, momentum_encoder.module.0.layer4.2.bn2.num_batches_tracked, momentum_encoder.module.0.layer4.2.conv3.weight, momentum_encoder.module.0.layer4.2.bn3.weight, momentum_encoder.module.0.layer4.2.bn3.bias, momentum_encoder.module.0.layer4.2.bn3.running_mean, momentum_encoder.module.0.layer4.2.bn3.running_var, momentum_encoder.module.0.layer4.2.bn3.num_batches_tracked, momentum_encoder.module.1.fc0.weight, momentum_encoder.module.1.bn0.weight, momentum_encoder.module.1.bn0.bias, momentum_encoder.module.1.bn0.running_mean, momentum_encoder.module.1.bn0.running_var, momentum_encoder.module.1.bn0.num_batches_tracked, momentum_encoder.module.1.fc1.weight, momentum_encoder.module.1.bn1.running_mean, momentum_encoder.module.1.bn1.running_var, momentum_encoder.module.1.bn1.num_batches_tracked\n", "\n", "missing keys in source state_dict: conv1.weight, bn1.weight, bn1.bias, bn1.running_mean, bn1.running_var, layer1.0.conv1.weight, layer1.0.bn1.weight, layer1.0.bn1.bias, layer1.0.bn1.running_mean, layer1.0.bn1.running_var, layer1.0.conv2.weight, layer1.0.bn2.weight, layer1.0.bn2.bias, layer1.0.bn2.running_mean, layer1.0.bn2.running_var, layer1.0.conv3.weight, layer1.0.bn3.weight, layer1.0.bn3.bias, layer1.0.bn3.running_mean, layer1.0.bn3.running_var, layer1.0.downsample.0.weight, layer1.0.downsample.1.weight, layer1.0.downsample.1.bias, layer1.0.downsample.1.running_mean, layer1.0.downsample.1.running_var, layer1.1.conv1.weight, layer1.1.bn1.weight, layer1.1.bn1.bias, 
layer1.1.bn1.running_mean, layer1.1.bn1.running_var, layer1.1.conv2.weight, layer1.1.bn2.weight, layer1.1.bn2.bias, layer1.1.bn2.running_mean, layer1.1.bn2.running_var, layer1.1.conv3.weight, layer1.1.bn3.weight, layer1.1.bn3.bias, layer1.1.bn3.running_mean, layer1.1.bn3.running_var, layer1.2.conv1.weight, layer1.2.bn1.weight, layer1.2.bn1.bias, layer1.2.bn1.running_mean, layer1.2.bn1.running_var, layer1.2.conv2.weight, layer1.2.bn2.weight, layer1.2.bn2.bias, layer1.2.bn2.running_mean, layer1.2.bn2.running_var, layer1.2.conv3.weight, layer1.2.bn3.weight, layer1.2.bn3.bias, layer1.2.bn3.running_mean, layer1.2.bn3.running_var, layer2.0.conv1.weight, layer2.0.bn1.weight, layer2.0.bn1.bias, layer2.0.bn1.running_mean, layer2.0.bn1.running_var, layer2.0.conv2.weight, layer2.0.bn2.weight, layer2.0.bn2.bias, layer2.0.bn2.running_mean, layer2.0.bn2.running_var, layer2.0.conv3.weight, layer2.0.bn3.weight, layer2.0.bn3.bias, layer2.0.bn3.running_mean, layer2.0.bn3.running_var, layer2.0.downsample.0.weight, layer2.0.downsample.1.weight, layer2.0.downsample.1.bias, layer2.0.downsample.1.running_mean, layer2.0.downsample.1.running_var, layer2.1.conv1.weight, layer2.1.bn1.weight, layer2.1.bn1.bias, layer2.1.bn1.running_mean, layer2.1.bn1.running_var, layer2.1.conv2.weight, layer2.1.bn2.weight, layer2.1.bn2.bias, layer2.1.bn2.running_mean, layer2.1.bn2.running_var, layer2.1.conv3.weight, layer2.1.bn3.weight, layer2.1.bn3.bias, layer2.1.bn3.running_mean, layer2.1.bn3.running_var, layer2.2.conv1.weight, layer2.2.bn1.weight, layer2.2.bn1.bias, layer2.2.bn1.running_mean, layer2.2.bn1.running_var, layer2.2.conv2.weight, layer2.2.bn2.weight, layer2.2.bn2.bias, layer2.2.bn2.running_mean, layer2.2.bn2.running_var, layer2.2.conv3.weight, layer2.2.bn3.weight, layer2.2.bn3.bias, layer2.2.bn3.running_mean, layer2.2.bn3.running_var, layer2.3.conv1.weight, layer2.3.bn1.weight, layer2.3.bn1.bias, layer2.3.bn1.running_mean, layer2.3.bn1.running_var, layer2.3.conv2.weight, layer2.3.bn2.weight, 
layer2.3.bn2.bias, layer2.3.bn2.running_mean, layer2.3.bn2.running_var, layer2.3.conv3.weight, layer2.3.bn3.weight, layer2.3.bn3.bias, layer2.3.bn3.running_mean, layer2.3.bn3.running_var, layer3.0.conv1.weight, layer3.0.bn1.weight, layer3.0.bn1.bias, layer3.0.bn1.running_mean, layer3.0.bn1.running_var, layer3.0.conv2.weight, layer3.0.bn2.weight, layer3.0.bn2.bias, layer3.0.bn2.running_mean, layer3.0.bn2.running_var, layer3.0.conv3.weight, layer3.0.bn3.weight, layer3.0.bn3.bias, layer3.0.bn3.running_mean, layer3.0.bn3.running_var, layer3.0.downsample.0.weight, layer3.0.downsample.1.weight, layer3.0.downsample.1.bias, layer3.0.downsample.1.running_mean, layer3.0.downsample.1.running_var, layer3.1.conv1.weight, layer3.1.bn1.weight, layer3.1.bn1.bias, layer3.1.bn1.running_mean, layer3.1.bn1.running_var, layer3.1.conv2.weight, layer3.1.bn2.weight, layer3.1.bn2.bias, layer3.1.bn2.running_mean, layer3.1.bn2.running_var, layer3.1.conv3.weight, layer3.1.bn3.weight, layer3.1.bn3.bias, layer3.1.bn3.running_mean, layer3.1.bn3.running_var, layer3.2.conv1.weight, layer3.2.bn1.weight, layer3.2.bn1.bias, layer3.2.bn1.running_mean, layer3.2.bn1.running_var, layer3.2.conv2.weight, layer3.2.bn2.weight, layer3.2.bn2.bias, layer3.2.bn2.running_mean, layer3.2.bn2.running_var, layer3.2.conv3.weight, layer3.2.bn3.weight, layer3.2.bn3.bias, layer3.2.bn3.running_mean, layer3.2.bn3.running_var, layer3.3.conv1.weight, layer3.3.bn1.weight, layer3.3.bn1.bias, layer3.3.bn1.running_mean, layer3.3.bn1.running_var, layer3.3.conv2.weight, layer3.3.bn2.weight, layer3.3.bn2.bias, layer3.3.bn2.running_mean, layer3.3.bn2.running_var, layer3.3.conv3.weight, layer3.3.bn3.weight, layer3.3.bn3.bias, layer3.3.bn3.running_mean, layer3.3.bn3.running_var, layer3.4.conv1.weight, layer3.4.bn1.weight, layer3.4.bn1.bias, layer3.4.bn1.running_mean, layer3.4.bn1.running_var, layer3.4.conv2.weight, layer3.4.bn2.weight, layer3.4.bn2.bias, layer3.4.bn2.running_mean, layer3.4.bn2.running_var, layer3.4.conv3.weight, 
layer3.4.bn3.weight, layer3.4.bn3.bias, layer3.4.bn3.running_mean, layer3.4.bn3.running_var, layer3.5.conv1.weight, layer3.5.bn1.weight, layer3.5.bn1.bias, layer3.5.bn1.running_mean, layer3.5.bn1.running_var, layer3.5.conv2.weight, layer3.5.bn2.weight, layer3.5.bn2.bias, layer3.5.bn2.running_mean, layer3.5.bn2.running_var, layer3.5.conv3.weight, layer3.5.bn3.weight, layer3.5.bn3.bias, layer3.5.bn3.running_mean, layer3.5.bn3.running_var, layer4.0.conv1.weight, layer4.0.bn1.weight, layer4.0.bn1.bias, layer4.0.bn1.running_mean, layer4.0.bn1.running_var, layer4.0.conv2.weight, layer4.0.bn2.weight, layer4.0.bn2.bias, layer4.0.bn2.running_mean, layer4.0.bn2.running_var, layer4.0.conv3.weight, layer4.0.bn3.weight, layer4.0.bn3.bias, layer4.0.bn3.running_mean, layer4.0.bn3.running_var, layer4.0.downsample.0.weight, layer4.0.downsample.1.weight, layer4.0.downsample.1.bias, layer4.0.downsample.1.running_mean, layer4.0.downsample.1.running_var, layer4.1.conv1.weight, layer4.1.bn1.weight, layer4.1.bn1.bias, layer4.1.bn1.running_mean, layer4.1.bn1.running_var, layer4.1.conv2.weight, layer4.1.bn2.weight, layer4.1.bn2.bias, layer4.1.bn2.running_mean, layer4.1.bn2.running_var, layer4.1.conv3.weight, layer4.1.bn3.weight, layer4.1.bn3.bias, layer4.1.bn3.running_mean, layer4.1.bn3.running_var, layer4.2.conv1.weight, layer4.2.bn1.weight, layer4.2.bn1.bias, layer4.2.bn1.running_mean, layer4.2.bn1.running_var, layer4.2.conv2.weight, layer4.2.bn2.weight, layer4.2.bn2.bias, layer4.2.bn2.running_mean, layer4.2.bn2.running_var, layer4.2.conv3.weight, layer4.2.bn3.weight, layer4.2.bn3.bias, layer4.2.bn3.running_mean, layer4.2.bn3.running_var\n", "\n", "11/19 11:33:34 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Checkpoints will be saved to /content/mmyolo/work_dirs/yolov5_s_res50_mocov3-v61_1xb2-1e_coco128.\n", "11/19 11:34:02 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Epoch(train) [1][50/63] lr: 4.9000e-04 eta: 0:00:07 time: 0.5597 data_time: 0.0066 memory: 10724 loss: 0.5685 
loss_cls: 0.2051 loss_obj: 0.1493 loss_bbox: 0.2142\n", "11/19 11:34:07 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Exp name: yolov5_s_res50_mocov3-v61_1xb2-1e_coco128_20221119_113324\n", "11/19 11:34:07 - mmengine - \u001b[4m\u001b[37mINFO\u001b[0m - Saving checkpoint at 1 epochs\n", "11/19 11:34:08 - mmengine - \u001b[5m\u001b[4m\u001b[33mWARNING\u001b[0m - `save_param_scheduler` is True but `self.param_schedulers` is None, so skip saving parameter schedulers\n", "tcmalloc: large alloc 1085169664 bytes == 0xb7884000 @ 0x7fca5fe3f615 0x58ead6 0x4f355e 0x58f8af 0x58fb26 0x7fca39d8a07f 0x7fca39d5a974 0x7fca13a84ef5 0x7fca13a7f441 0x7fca13a86549 0x7fca39d5af3b 0x7fca399e4f61 0x58f6e4 0x590691 0x510946 0x5b575e 0x58ff2e 0x50c4fc 0x5b4ee6 0x58ff2e 0x510325 0x5b4ee6 0x58ff2e 0x50c4fc 0x5b4ee6 0x4bad0a 0x50e18c 0x5b575e 0x4bad0a 0x4d3249 0x591e56\n" ] } ], "source": [ "# Start training\n", "!python tools/train.py configs/yolov5/yolov5_s_res50_mocov3-v61_1xb2-1e_coco128.py" ] }, { "cell_type": "markdown", "metadata": { "id": "s4Pz4W9juPNt" }, "source": [ "## 3 Bonus: how to determine the backbone's output channels\n", "The largest PPYOLO-E model, x, uses a `widen_factor` of 1.25. Suppose we want to build an even larger network by setting `widen_factor` to 1.5. How many output channels would its backbone, `PPYOLOECSPResNet`, have?\n", "\n", "Let's find out." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "5bMQ5T9gseWL", "outputId": "ca64d6bf-5169-4895-acd6-93e72bc8786c" }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python3.7/dist-packages/mmcv/cnn/bricks/hsigmoid.py:36: UserWarning: In MMCV v1.4.4, we modified the default value of args to align with PyTorch official. Previous Implementation: Hsigmoid(x) = min(max((x + 1) / 2, 0), 1). 
Current Implementation: Hsigmoid(x) = min(max((x + 3) / 6, 0), 1).\n", " 'In MMCV v1.4.4, we modified the default value of args to align '\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[torch.Size([1, 384, 80, 80]), torch.Size([1, 768, 40, 40]), torch.Size([1, 1536, 20, 20])]\n" ] } ], "source": [ "import torch\n", "from mmyolo.models import PPYOLOECSPResNet\n", "from mmyolo.utils import register_all_modules\n", "\n", "# Register all modules\n", "register_all_modules()\n", "\n", "imgs = torch.randn(1, 3, 640, 640)\n", "out_indices = (2, 3, 4)\n", "model = PPYOLOECSPResNet(arch='P5', widen_factor=1.5, out_indices=out_indices)\n", "out = model(imgs)\n", "out_shapes = [out[i].shape for i in range(len(out_indices))]\n", "print(out_shapes)" ] }, { "cell_type": "markdown", "metadata": { "id": "eiAxPHXSr1O_" }, "source": [ "**Quick question:**\n", "\n", "Why do all the config files in Section 2 set `widen_factor` to 1.0? Is that required?" ] }, { "cell_type": "markdown", "metadata": { "id": "ZMu-ZotLNsPo" }, "source": [ "## 4 Summary\n", "- Using YOLOv5 as an example, this tutorial briefly showed how to swap in a variety of backbone networks. For more tutorials and examples, see the MMYOLO [documentation](https://mmyolo.readthedocs.io/zh_CN/latest/).\n", "\n", "- If you have any requests or suggestions, please leave a comment on the MMYOLO [roadmap](https://github.com/open-mmlab/mmyolo/issues/136). You are also welcome to add the assistant on WeChat, **OpenMMLabwx**, to be invited to the MMYOLO WeChat group, where helpful experts answer questions every day.\n", "\n", "- If you find MMYOLO useful, please give it a [![GitHub stars](https://img.shields.io/github/stars/open-mmlab/mmyolo.svg?style=social&label=Star&maxAge=2592000)](https://github.com/open-mmlab/mmyolo). Your support is what keeps us going." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ByJUYUy4vVCU" }, "outputs": [], "source": [] } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [], "toc_visible": true }, "gpuClass": "standard", "kernelspec": { "display_name": "Python 3.8.5 ('base')", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.8.5" }, "vscode": { "interpreter": { "hash": "b09ec625f77bf4fd762565a912b97636504ad6ec901eb2d0f4cf5a7de23e1ee5" } } }, "nbformat": 4, "nbformat_minor": 0 }