yolov5剪枝 实战

article/2025/10/26 1:09:17

(1)步骤

剪枝的一般步骤只是在正常训练的后面加上了稀疏化训练和剪枝的步骤。

(2)稀疏化训练

在这里插入图片描述

主要区别

稀疏化训练的代码和正常训练的代码的差别主要体现在
①反向传播 ②优化器 ③parse_opt代码

接下来从代码执行训练简单分析
(下面代码均为稀疏化训练的代码)

(1)parse_opt代码

加入了这两行!!!

    parser.add_argument('--st', action='store_true',default=True, help='train with L1 sparsity normalization')parser.add_argument('--sr', type=float, default=0.0001, help='L1 normal sparse rate')

sr:平衡因子lamda(就是论文里面的这个红圈里面的东西)
在这里插入图片描述

(2)反向传播

train.py

# Backwardscaler.scale(loss).backward()     

train_sparsity.py

            loss.backward()

③优化器

     # Optimizeif ni - last_opt_step >= accumulate:scaler.step(optimizer)  # optimizer.stepscaler.update()optimizer.zero_grad()if ema:ema.update(model)last_opt_step = ni
            # # ============================= sparsity training ========================== #srtmp = opt.sr*(1 - 0.9*epoch/epochs)if opt.st:ignore_bn_list = []for k, m in model.named_modules():if isinstance(m, Bottleneck):if m.add:ignore_bn_list.append(k.rsplit(".", 2)[0] + ".cv1.bn")ignore_bn_list.append(k + '.cv1.bn')ignore_bn_list.append(k + '.cv2.bn')if isinstance(m, nn.BatchNorm2d) and (k not in ignore_bn_list):m.weight.grad.data.add_(srtmp * torch.sign(m.weight.data))  # L1m.bias.grad.data.add_(opt.sr*10 * torch.sign(m.bias.data))  # L1# # ============================= sparsity training ========================== #

1)for k, m in model.named_modules() 此处使用k,m两个变量是因为model.named_modules()在网络中的所有模块上返回一个迭代器,该迭代器不仅包含模块名称,还包含模块本身。因此在这段代码中k应该是模块名称,m应该是模块本身

2)不进行稀疏化的内容
图片来自知乎的博主
在这里插入图片描述
3)调节权重和bias

                        m.weight.grad.data.add_(srtmp * torch.sign(m.weight.data))  # L1m.bias.grad.data.add_(opt.sr*10 * torch.sign(m.bias.data))  # L1

直接给出稀疏化训练的完整代码:

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
"""
Train a YOLOv5 model on a custom dataset.Models and datasets download automatically from the latest YOLOv5 release.
Models: https://github.com/ultralytics/yolov5/tree/master/models
Datasets: https://github.com/ultralytics/yolov5/tree/master/data
Tutorial: https://github.com/ultralytics/yolov5/wiki/Train-Custom-DataUsage:$ python path/to/train.py --data coco128.yaml --weights yolov5s.pt --img 640  # from pretrained (RECOMMENDED)$ python path/to/train.py --data coco128.yaml --weights '' --cfg yolov5s.yaml --img 640  # from scratch
"""import argparse
import math
import os
import random
import sys
import time
from copy import deepcopy
from datetime import datetime
from pathlib import Pathimport numpy as np
import torch
import torch.distributed as dist
import torch.nn as nn
import yaml
from torch.cuda import amp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.optim import SGD, Adam, AdamW, lr_scheduler
from tqdm import tqdmFILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relativeimport val  # for end-of-epoch mAP
from models.experimental import attempt_load
from models.yolo import Model
from utils.autoanchor import check_anchors
from utils.autobatch import check_train_batch_size
from utils.callbacks import Callbacks
from utils.datasets import create_dataloader
from utils.downloads import attempt_download
from utils.general import (LOGGER, check_dataset, check_file, check_git_status, check_img_size, check_requirements,check_suffix, check_yaml, colorstr, get_latest_run, increment_path, init_seeds,intersect_dicts, labels_to_class_weights, labels_to_image_weights, methods, one_cycle,print_args, print_mutation, strip_optimizer)
from utils.loggers import Loggers
from utils.loggers.wandb.wandb_utils import check_wandb_resume
from utils.loss import ComputeLoss
from utils.metrics import fitness
from utils.plots import plot_evolve, plot_labels
from utils.torch_utils import EarlyStopping, ModelEMA, de_parallel, select_device, torch_distributed_zero_first
from models.common import BottleneckLOCAL_RANK = int(os.getenv('LOCAL_RANK', -1))  # https://pytorch.org/docs/stable/elastic/run.html
RANK = int(os.getenv('RANK', -1))
WORLD_SIZE = int(os.getenv('WORLD_SIZE', 1))def train(hyp,  # path/to/hyp.yaml or hyp dictionaryopt,device,callbacks):save_dir, epochs, batch_size, weights, single_cls, evolve, data, cfg, resume, noval, nosave, workers, freeze = \Path(opt.save_dir), opt.epochs, opt.batch_size, opt.weights, opt.single_cls, opt.evolve, opt.data, opt.cfg, \opt.resume, opt.noval, opt.nosave, opt.workers, opt.freeze# Directoriesw = save_dir / 'weights'  # weights dir(w.parent if evolve else w).mkdir(parents=True, exist_ok=True)  # make dirlast, best = w / 'last.pt', w / 'best.pt'# Hyperparametersif isinstance(hyp, str):with open(hyp, errors='ignore') as f:hyp = yaml.safe_load(f)  # load hyps dictLOGGER.info(colorstr('hyperparameters: ') + ', '.join(f'{k}={v}' for k, v in hyp.items()))# Save run settingsif not evolve:with open(save_dir / 'hyp.yaml', 'w') as f:yaml.safe_dump(hyp, f, sort_keys=False)with open(save_dir / 'opt.yaml', 'w') as f:yaml.safe_dump(vars(opt), f, sort_keys=False)# Loggersdata_dict = Noneif RANK in [-1, 0]:loggers = Loggers(save_dir, weights, opt, hyp, LOGGER)  # loggers instanceif loggers.wandb:data_dict = loggers.wandb.data_dictif resume:weights, epochs, hyp, batch_size = opt.weights, opt.epochs, opt.hyp, opt.batch_size# Register actionsfor k in methods(loggers):callbacks.register_action(k, callback=getattr(loggers, k))# Configplots = not evolve  # create plotscuda = device.type != 'cpu'init_seeds(1 + RANK)with torch_distributed_zero_first(LOCAL_RANK):data_dict = data_dict or check_dataset(data)  # check if Nonetrain_path, val_path = data_dict['train'], data_dict['val']nc = 1 if single_cls else int(data_dict['nc'])  # number of classesnames = ['item'] if single_cls and len(data_dict['names']) != 1 else data_dict['names']  # class namesassert len(names) == nc, f'{len(names)} names found for nc={nc} dataset in {data}'  # checkis_coco = isinstance(val_path, str) and val_path.endswith('coco/val2017.txt')  # COCO dataset# Modelcheck_suffix(weights, '.pt')  # check weightspretrained = weights.endswith('.pt')if pretrained:with torch_distributed_zero_first(LOCAL_RANK):weights = attempt_download(weights)  # download if not found locallyckpt = torch.load(weights, map_location='cpu')  # load checkpoint to CPU to avoid CUDA memory leakmodel = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # createexclude = ['anchor'] if (cfg or hyp.get('anchors')) and not resume else []  # exclude keyscsd = ckpt['model'].float().state_dict()  # checkpoint state_dict as FP32csd = intersect_dicts(csd, model.state_dict(), exclude=exclude)  # intersectmodel.load_state_dict(csd, strict=False)  # loadLOGGER.info(f'Transferred {len(csd)}/{len(model.state_dict())} items from {weights}')  # reportelse:model = Model(cfg, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create# Freezefreeze = [f'model.{x}.' for x in (freeze if len(freeze) > 1 else range(freeze[0]))]  # layers to freezefor k, v in model.named_parameters():v.requires_grad = True  # train all layersif any(x in k for x in freeze):LOGGER.info(f'freezing {k}')v.requires_grad = False# Image sizegs = max(int(model.stride.max()), 32)  # grid size (max stride)imgsz = check_img_size(opt.imgsz, gs, floor=gs * 2)  # verify imgsz is gs-multiple# Batch sizeif RANK == -1 and batch_size == -1:  # single-GPU only, estimate best batch sizebatch_size = check_train_batch_size(model, imgsz)loggers.on_params_update({"batch_size": batch_size})# Optimizernbs = 64  # nominal batch sizeaccumulate = max(round(nbs / batch_size), 1)  # accumulate loss before optimizinghyp['weight_decay'] *= batch_size * accumulate / nbs  # scale weight_decayLOGGER.info(f"Scaled weight_decay = {hyp['weight_decay']}")g0, g1, g2 = [], [], []  # optimizer parameter groupsfor v in model.modules():if hasattr(v, 'bias') and isinstance(v.bias, nn.Parameter):  # biasg2.append(v.bias)if isinstance(v, nn.BatchNorm2d):  # weight (no decay)g0.append(v.weight)elif hasattr(v, 'weight') and isinstance(v.weight, nn.Parameter):  # weight (with decay)g1.append(v.weight)if opt.optimizer == 'Adam':optimizer = Adam(g0, lr=hyp['lr0'], betas=(hyp['momentum'], 0.999))  # adjust beta1 to momentumelif opt.optimizer == 'AdamW':optimizer = AdamW(g0, lr=hyp['lr0'], betas=(hyp['momentum'], 0.999))  # adjust beta1 to momentumelse:optimizer = SGD(g0, lr=hyp['lr0'], momentum=hyp['momentum'], nesterov=True)optimizer.add_param_group({'params': g1, 'weight_decay': hyp['weight_decay']})  # add g1 with weight_decayoptimizer.add_param_group({'params': g2})  # add g2 (biases)LOGGER.info(f"{colorstr('optimizer:')} {type(optimizer).__name__} with parameter groups "f"{len(g0)} weight (no decay), {len(g1)} weight, {len(g2)} bias")del g0, g1, g2# Schedulerif opt.cos_lr:lf = one_cycle(1, hyp['lrf'], epochs)  # cosine 1->hyp['lrf']else:lf = lambda x: (1 - x / epochs) * (1.0 - hyp['lrf']) + hyp['lrf']  # linearscheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)  # plot_lr_scheduler(optimizer, scheduler, epochs)# EMAema = ModelEMA(model) if RANK in [-1, 0] else None# Resumestart_epoch, best_fitness = 0, 0.0if pretrained:# Optimizerif ckpt['optimizer'] is not None:optimizer.load_state_dict(ckpt['optimizer'])best_fitness = ckpt['best_fitness']# EMAif ema and ckpt.get('ema'):ema.ema.load_state_dict(ckpt['ema'].float().state_dict())ema.updates = ckpt['updates']# Epochsstart_epoch = ckpt['epoch'] + 1if resume:assert start_epoch > 0, f'{weights} training to {epochs} epochs is finished, nothing to resume.'if epochs < start_epoch:LOGGER.info(f"{weights} has been trained for {ckpt['epoch']} epochs. Fine-tuning for {epochs} more epochs.")epochs += ckpt['epoch']  # finetune additional epochsdel ckpt, csd# DP modeif cuda and RANK == -1 and torch.cuda.device_count() > 1:LOGGER.warning('WARNING: DP not recommended, use torch.distributed.run for best DDP Multi-GPU results.\n''See Multi-GPU Tutorial at https://github.com/ultralytics/yolov5/issues/475 to get started.')model = torch.nn.DataParallel(model)# SyncBatchNormif opt.sync_bn and cuda and RANK != -1:model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model).to(device)LOGGER.info('Using SyncBatchNorm()')print(model)# Trainloadertrain_loader, dataset = create_dataloader(train_path, imgsz, batch_size // WORLD_SIZE, gs, single_cls,hyp=hyp, augment=True, cache=None if opt.cache == 'val' else opt.cache,rect=opt.rect, rank=LOCAL_RANK, workers=workers,image_weights=opt.image_weights, quad=opt.quad,prefix=colorstr('train: '), shuffle=True)mlc = int(np.concatenate(dataset.labels, 0)[:, 0].max())  # max label classnb = len(train_loader)  # number of batchesassert mlc < nc, f'Label class {mlc} exceeds nc={nc} in {data}. Possible class labels are 0-{nc - 1}'# Process 0if RANK in [-1, 0]:val_loader = create_dataloader(val_path, imgsz, batch_size // WORLD_SIZE * 2, gs, single_cls,hyp=hyp, cache=None if noval else opt.cache,rect=True, rank=-1, workers=workers * 2, pad=0.5,prefix=colorstr('val: '))[0]if not resume:labels = np.concatenate(dataset.labels, 0)# c = torch.tensor(labels[:, 0])  # classes# cf = torch.bincount(c.long(), minlength=nc) + 1.  # frequency# model._initialize_biases(cf.to(device))if plots:plot_labels(labels, names, save_dir)# Anchorsif not opt.noautoanchor:check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz)model.half().float()  # pre-reduce anchor precisioncallbacks.run('on_pretrain_routine_end')# DDP modeif cuda and RANK != -1:model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK)# Model attributesnl = de_parallel(model).model[-1].nl  # number of detection layers (to scale hyps)hyp['box'] *= 3 / nl  # scale to layershyp['cls'] *= nc / 80 * 3 / nl  # scale to classes and layershyp['obj'] *= (imgsz / 640) ** 2 * 3 / nl  # scale to image size and layershyp['label_smoothing'] = opt.label_smoothingmodel.nc = nc  # attach number of classes to modelmodel.hyp = hyp  # attach hyperparameters to modelmodel.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) * nc  # attach class weightsmodel.names = names# Start trainingt0 = time.time()nw = max(round(hyp['warmup_epochs'] * nb), 1000)  # number of warmup iterations, max(3 epochs, 1k iterations)# nw = min(nw, (epochs - start_epoch) / 2 * nb)  # limit warmup to < 1/2 of traininglast_opt_step = -1maps = np.zeros(nc)  # mAP per classresults = (0, 0, 0, 0, 0, 0, 0)  # P, R, mAP@.5, mAP@.5-.95, val_loss(box, obj, cls)scheduler.last_epoch = start_epoch - 1  # do not movescaler = amp.GradScaler(enabled=cuda)stopper = EarlyStopping(patience=opt.patience)compute_loss = ComputeLoss(model)  # init loss classLOGGER.info(f'Image sizes {imgsz} train, {imgsz} val\n'f'Using {train_loader.num_workers * WORLD_SIZE} dataloader workers\n'f"Logging results to {colorstr('bold', save_dir)}\n"f'Starting training for {epochs} epochs...')for epoch in range(start_epoch, epochs):  # epoch ------------------------------------------------------------------model.train()# Update image weights (optional, single-GPU only)if opt.image_weights:cw = model.class_weights.cpu().numpy() * (1 - maps) ** 2 / nc  # class weightsiw = labels_to_image_weights(dataset.labels, nc=nc, class_weights=cw)  # image weightsdataset.indices = random.choices(range(dataset.n), weights=iw, k=dataset.n)  # rand weighted idx# Update mosaic border (optional)# b = int(random.uniform(0.25 * imgsz, 0.75 * imgsz + gs) // gs * gs)# dataset.mosaic_border = [b - imgsz, -b]  # height, width bordersmloss = torch.zeros(3, device=device)  # mean lossesif RANK != -1:train_loader.sampler.set_epoch(epoch)pbar = enumerate(train_loader)LOGGER.info(('\n' + '%10s' * 7) % ('Epoch', 'gpu_mem', 'box', 'obj', 'cls', 'labels', 'img_size'))if RANK in [-1, 0]:pbar = tqdm(pbar, total=nb, bar_format='{l_bar}{bar:10}{r_bar}{bar:-10b}')  # progress baroptimizer.zero_grad()for i, (imgs, targets, paths, _) in pbar:  # batch -------------------------------------------------------------ni = i + nb * epoch  # number integrated batches (since train start)imgs = imgs.to(device, non_blocking=True).float() / 255  # uint8 to float32, 0-255 to 0.0-1.0# Warmupif ni <= nw:xi = [0, nw]  # x interp# compute_loss.gr = np.interp(ni, xi, [0.0, 1.0])  # iou loss ratio (obj_loss = 1.0 or iou)accumulate = max(1, np.interp(ni, xi, [1, nbs / batch_size]).round())for j, x in enumerate(optimizer.param_groups):# bias lr falls from 0.1 to lr0, all other lrs rise from 0.0 to lr0x['lr'] = np.interp(ni, xi, [hyp['warmup_bias_lr'] if j == 2 else 0.0, x['initial_lr'] * lf(epoch)])if 'momentum' in x:x['momentum'] = np.interp(ni, xi, [hyp['warmup_momentum'], hyp['momentum']])# Multi-scaleif opt.multi_scale:sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs  # sizesf = sz / max(imgs.shape[2:])  # scale factorif sf != 1:ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]]  # new shape (stretched to gs-multiple)imgs = nn.functional.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)# Forwardwith amp.autocast(enabled=cuda):pred = model(imgs)  # forwardloss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_sizeif RANK != -1:loss *= WORLD_SIZE  # gradient averaged between devices in DDP modeif opt.quad:loss *= 4.# Backward# scaler.scale(loss).backward()loss.backward()# # ============================= sparsity training ========================== #srtmp = opt.sr*(1 - 0.9*epoch/epochs)if opt.st:ignore_bn_list = []for k, m in model.named_modules():if isinstance(m, Bottleneck):if m.add:ignore_bn_list.append(k.rsplit(".", 2)[0] + ".cv1.bn")ignore_bn_list.append(k + '.cv1.bn')ignore_bn_list.append(k + '.cv2.bn')if isinstance(m, nn.BatchNorm2d) and (k not in ignore_bn_list):m.weight.grad.data.add_(srtmp * torch.sign(m.weight.data))  # L1m.bias.grad.data.add_(opt.sr*10 * torch.sign(m.bias.data))  # L1# # ============================= sparsity training ========================== ## Optimize# if ni - last_opt_step >= accumulate:optimizer.step()# scaler.step(optimizer)  # optimizer.step# scaler.update()optimizer.zero_grad()if ema:ema.update(model)# last_opt_step = ni# Logif RANK in [-1, 0]:mloss = (mloss * i + loss_items) / (i + 1)  # update mean lossesmem = f'{torch.cuda.memory_reserved() / 1E9 if torch.cuda.is_available() else 0:.3g}G'  # (GB)pbar.set_description(('%10s' * 2 + '%10.4g' * 5) % (f'{epoch}/{epochs - 1}', mem, *mloss, targets.shape[0], imgs.shape[-1]))callbacks.run('on_train_batch_end', ni, model, imgs, targets, paths, plots, opt.sync_bn)if callbacks.stop_training:return# end batch ------------------------------------------------------------------------------------------------# Schedulerlr = [x['lr'] for x in optimizer.param_groups]  # for loggersscheduler.step()# =============== show bn weights ===================== #module_list = []for i, layer in model.named_modules():if isinstance(layer, nn.BatchNorm2d) and i not in ignore_bn_list:bnw = layer.state_dict()['weight']bnb = layer.state_dict()['bias']module_list.append(bnw)size_list = [idx.data.shape[0] for idx in module_list]bn_weights = torch.zeros(sum(size_list))bnb_weights = torch.zeros(sum(size_list))index = 0for idx, size in enumerate(size_list):bn_weights[index:(index + size)] = module_list[idx].data.abs().clone()            index += sizeif RANK in [-1, 0]:# mAPcallbacks.run('on_train_epoch_end', epoch=epoch)ema.update_attr(model, include=['yaml', 'nc', 'hyp', 'names', 'stride', 'class_weights'])final_epoch = (epoch + 1 == epochs) or stopper.possible_stopif not noval or final_epoch:  # Calculate mAPresults, maps, _ = val.run(data_dict,batch_size=batch_size // WORLD_SIZE * 2,imgsz=imgsz,model=ema.ema,single_cls=single_cls,dataloader=val_loader,save_dir=save_dir,plots=False,callbacks=callbacks,compute_loss=compute_loss)# Update best mAPfi = fitness(np.array(results).reshape(1, -1))  # weighted combination of [P, R, mAP@.5, mAP@.5-.95]if fi > best_fitness:best_fitness = fi#log_vals = list(mloss) + list(results) + lr + [srtmp]log_vals = list(mloss) + list(results) + lrcallbacks.run('on_fit_epoch_end', log_vals, epoch, best_fitness, fi)callbacks.run('on_fit_epoch_end_prune', bn_weights.numpy(), epoch)# Save modelif (not nosave) or (final_epoch and not evolve):  # if saveckpt = {'epoch': epoch,'best_fitness': best_fitness,'model': deepcopy(de_parallel(model)).half(),'ema': deepcopy(ema.ema).half(),'updates': ema.updates,'optimizer': optimizer.state_dict(),'wandb_id': loggers.wandb.wandb_run.id if loggers.wandb else None,'date': datetime.now().isoformat()}# Save last, best and deletetorch.save(ckpt, last)if best_fitness == fi:torch.save(ckpt, best)if (epoch > 0) and (opt.save_period > 0) and (epoch % opt.save_period == 0):torch.save(ckpt, w / f'epoch{epoch}.pt')del ckptcallbacks.run('on_model_save', last, epoch, final_epoch, best_fitness, fi)# Stop Single-GPUif RANK == -1 and stopper(epoch=epoch, fitness=fi):break# Stop DDP TODO: known issues shttps://github.com/ultralytics/yolov5/pull/4576# stop = stopper(epoch=epoch, fitness=fi)# if RANK == 0:#    dist.broadcast_object_list([stop], 0)  # broadcast 'stop' to all ranks# Stop DPP# with torch_distributed_zero_first(RANK):# if stop:#    break  # must break all DDP ranks# end epoch ----------------------------------------------------------------------------------------------------# end training -----------------------------------------------------------------------------------------------------if RANK in [-1, 0]:LOGGER.info(f'\n{epoch - start_epoch + 1} epochs completed in {(time.time() - t0) / 3600:.3f} hours.')for f in last, best:if f.exists():strip_optimizer(f)  # strip optimizersif f is best:LOGGER.info(f'\nValidating {f}...')results, _, _ = val.run(data_dict,batch_size=batch_size // WORLD_SIZE * 2,imgsz=imgsz,model=attempt_load(f, device).half(),iou_thres=0.65 if is_coco else 0.60,  # best pycocotools results at 0.65single_cls=single_cls,dataloader=val_loader,save_dir=save_dir,save_json=is_coco,verbose=True,plots=True,callbacks=callbacks,compute_loss=compute_loss)  # val best model with plotsif is_coco:callbacks.run('on_fit_epoch_end', list(mloss) + list(results) + lr, epoch, best_fitness, fi)callbacks.run('on_train_end', last, best, plots, epoch, results)LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}")torch.cuda.empty_cache()return resultsdef parse_opt(known=False):parser = argparse.ArgumentParser()parser.add_argument('--st', action='store_true',default=True, help='train with L1 sparsity normalization')parser.add_argument('--sr', type=float, default=0.0001, help='L1 normal sparse rate')parser.add_argument('--weights', type=str, default=ROOT / 'yolov5l.pt', help='initial weights path')parser.add_argument('--cfg', type=str, default='', help='model.yaml path')parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='dataset.yaml path')parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-low.yaml', help='hyperparameters path')parser.add_argument('--epochs', type=int, default=300)parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs, -1 for autobatch')parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='train, val image size (pixels)')parser.add_argument('--rect', action='store_true', help='rectangular training')parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')parser.add_argument('--noval', action='store_true', help='only validate final epoch')parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')parser.add_argument('--evolve', type=int, nargs='?', const=300, help='evolve hyperparameters for x generations')parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')parser.add_argument('--cache', type=str, nargs='?', const='ram', help='--cache images in "ram" (default) or "disk"')parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')parser.add_argument('--optimizer', type=str, choices=['SGD', 'Adam', 'AdamW'], default='SGD', help='optimizer')parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')parser.add_argument('--project', default=ROOT / 'runs/train', help='save to project/name')parser.add_argument('--name', default='exp', help='save to project/name')parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')parser.add_argument('--quad', action='store_true', help='quad dataloader')parser.add_argument('--cos-lr', action='store_true', help='cosine LR scheduler')parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')parser.add_argument('--patience', type=int, default=100, help='EarlyStopping patience (epochs without improvement)')parser.add_argument('--freeze', nargs='+', type=int, default=[0], help='Freeze layers: backbone=10, first3=0 1 2')parser.add_argument('--save-period', type=int, default=-1, help='Save checkpoint every x epochs (disabled if < 1)')parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')# Weights & Biases argumentsparser.add_argument('--entity', default=None, help='W&B: Entity')parser.add_argument('--upload_dataset', nargs='?', const=True, default=False, help='W&B: Upload data, "val" option')parser.add_argument('--bbox_interval', type=int, default=-1, help='W&B: Set bounding-box image logging interval')parser.add_argument('--artifact_alias', type=str, default='latest', help='W&B: Version of dataset artifact to use')opt = parser.parse_known_args()[0] if known else parser.parse_args()return optdef main(opt, callbacks=Callbacks()):# Checksif RANK in [-1, 0]:print_args(FILE.stem, opt)check_git_status()check_requirements(exclude=['thop'])# Resumeif opt.resume and not check_wandb_resume(opt) and not opt.evolve:  # resume an interrupted runckpt = opt.resume if isinstance(opt.resume, str) else get_latest_run()  # specified or most recent pathassert os.path.isfile(ckpt), 'ERROR: --resume checkpoint does not exist'with open(Path(ckpt).parent.parent / 'opt.yaml', errors='ignore') as f:opt = argparse.Namespace(**yaml.safe_load(f))  # replaceopt.cfg, opt.weights, opt.resume = '', ckpt, True  # reinstateLOGGER.info(f'Resuming training from {ckpt}')else:opt.data, opt.cfg, opt.hyp, opt.weights, opt.project = \check_file(opt.data), check_yaml(opt.cfg), check_yaml(opt.hyp), str(opt.weights), str(opt.project)  # checksassert len(opt.cfg) or len(opt.weights), 'either --cfg or --weights must be specified'if opt.evolve:if opt.project == str(ROOT / 'runs/train'):  # if default project name, rename to runs/evolveopt.project = str(ROOT / 'runs/evolve')opt.exist_ok, opt.resume = opt.resume, False  # pass resume to exist_ok and disable resumeopt.save_dir = str(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok))# DDP modedevice = select_device(opt.device, batch_size=opt.batch_size)if LOCAL_RANK != -1:msg = 'is not compatible with YOLOv5 Multi-GPU DDP training'assert not opt.image_weights, f'--image-weights {msg}'assert not opt.evolve, f'--evolve {msg}'assert opt.batch_size != -1, f'AutoBatch with --batch-size -1 {msg}, please pass a valid --batch-size'assert opt.batch_size % WORLD_SIZE == 0, f'--batch-size {opt.batch_size} must be multiple of WORLD_SIZE'assert torch.cuda.device_count() > LOCAL_RANK, 'insufficient CUDA devices for DDP command'torch.cuda.set_device(LOCAL_RANK)device = torch.device('cuda', LOCAL_RANK)dist.init_process_group(backend="nccl" if dist.is_nccl_available() else "gloo")# Trainif not opt.evolve:train(opt.hyp, opt, device, callbacks)if WORLD_SIZE > 1 and RANK == 0:LOGGER.info('Destroying process group... ')dist.destroy_process_group()# Evolve hyperparameters (optional)else:# Hyperparameter evolution metadata (mutation scale 0-1, lower_limit, upper_limit)meta = {'lr0': (1, 1e-5, 1e-1),  # initial learning rate (SGD=1E-2, Adam=1E-3)'lrf': (1, 0.01, 1.0),  # final OneCycleLR learning rate (lr0 * lrf)'momentum': (0.3, 0.6, 0.98),  # SGD momentum/Adam beta1'weight_decay': (1, 0.0, 0.001),  # optimizer weight decay'warmup_epochs': (1, 0.0, 5.0),  # warmup epochs (fractions ok)'warmup_momentum': (1, 0.0, 0.95),  # warmup initial momentum'warmup_bias_lr': (1, 0.0, 0.2),  # warmup initial bias lr'box': (1, 0.02, 0.2),  # box loss gain'cls': (1, 0.2, 4.0),  # cls loss gain'cls_pw': (1, 0.5, 2.0),  # cls BCELoss positive_weight'obj': (1, 0.2, 4.0),  # obj loss gain (scale with pixels)'obj_pw': (1, 0.5, 2.0),  # obj BCELoss positive_weight'iou_t': (0, 0.1, 0.7),  # IoU training threshold'anchor_t': (1, 2.0, 8.0),  # anchor-multiple threshold'anchors': (2, 2.0, 10.0),  # anchors per output grid (0 to ignore)'fl_gamma': (0, 0.0, 2.0),  # focal loss gamma (efficientDet default gamma=1.5)'hsv_h': (1, 0.0, 0.1),  # image HSV-Hue augmentation (fraction)'hsv_s': (1, 0.0, 0.9),  # image HSV-Saturation augmentation (fraction)'hsv_v': (1, 0.0, 0.9),  # image HSV-Value augmentation (fraction)'degrees': (1, 0.0, 45.0),  # image rotation (+/- deg)'translate': (1, 0.0, 0.9),  # image translation (+/- fraction)'scale': (1, 0.0, 0.9),  # image scale (+/- gain)'shear': (1, 0.0, 10.0),  # image shear (+/- deg)'perspective': (0, 0.0, 0.001),  # image perspective (+/- fraction), range 0-0.001'flipud': (1, 0.0, 1.0),  # image flip up-down (probability)'fliplr': (0, 0.0, 1.0),  # image flip left-right (probability)'mosaic': (1, 0.0, 1.0),  # image mixup (probability)'mixup': (1, 0.0, 1.0),  # image mixup (probability)'copy_paste': (1, 0.0, 1.0)}  # segment copy-paste (probability)with open(opt.hyp, errors='ignore') as f:hyp = yaml.safe_load(f)  # load hyps dictif 'anchors' not in hyp:  # anchors commented in hyp.yamlhyp['anchors'] = 3opt.noval, opt.nosave, save_dir = True, True, Path(opt.save_dir)  # only val/save final epoch# ei = [isinstance(x, (int, float)) for x in hyp.values()]  # evolvable indicesevolve_yaml, evolve_csv = save_dir / 'hyp_evolve.yaml', save_dir / 'evolve.csv'if opt.bucket:os.system(f'gsutil cp gs://{opt.bucket}/evolve.csv {evolve_csv}')  # download evolve.csv if existsfor _ in range(opt.evolve):  # generations to evolveif evolve_csv.exists():  # if evolve.csv exists: select best hyps and mutate# Select parent(s)parent = 'single'  # parent selection method: 'single' or 'weighted'x = np.loadtxt(evolve_csv, ndmin=2, delimiter=',', skiprows=1)n = min(5, len(x))  # number of previous results to considerx = x[np.argsort(-fitness(x))][:n]  # top n mutationsw = fitness(x) - fitness(x).min() + 1E-6  # weights (sum > 0)if parent == 'single' or len(x) == 1:# x = x[random.randint(0, n - 1)]  # random selectionx = x[random.choices(range(n), weights=w)[0]]  # weighted selectionelif parent == 'weighted':x = (x * w.reshape(n, 1)).sum(0) / w.sum()  # weighted combination# Mutatemp, s = 0.8, 0.2  # mutation probability, sigmanpr = np.randomnpr.seed(int(time.time()))g = np.array([meta[k][0] for k in hyp.keys()])  # gains 0-1ng = len(meta)v = np.ones(ng)while all(v == 1):  # mutate until a change occurs (prevent duplicates)v = (g * (npr.random(ng) < mp) * npr.randn(ng) * npr.random() * s + 1).clip(0.3, 3.0)for i, k in enumerate(hyp.keys()):  # plt.hist(v.ravel(), 300)hyp[k] = float(x[i + 7] * v[i])  # mutate# Constrain to limitsfor k, v in meta.items():hyp[k] = max(hyp[k], v[1])  # lower limithyp[k] = min(hyp[k], v[2])  # upper limithyp[k] = round(hyp[k], 5)  # significant digits# Train mutationresults = train(hyp.copy(), opt, device, callbacks)callbacks = Callbacks()# Write mutation resultsprint_mutation(results, hyp.copy(), save_dir, opt.bucket)# Plot resultsplot_evolve(evolve_csv)LOGGER.info(f'Hyperparameter evolution finished {opt.evolve} generations\n'f"Results saved to {colorstr('bold', save_dir)}\n"f'Usage example: $ python train.py --hyp {evolve_yaml}')def run(**kwargs):# Usage: import train; train.run(data='coco128.yaml', imgsz=320, weights='yolov5m.pt')opt = parse_opt(True)for k, v in kwargs.items():setattr(opt, k, v)main(opt)return optif __name__ == "__main__":opt = parse_opt()main(opt)

直接把完整代码给到你们
链接:https://pan.baidu.com/s/1MRSjUoe9pWKWsOEpFRKTMw?pwd=adzm
提取码:adzm


http://chatgpt.dhexx.cn/article/juEyjVI3.shtml

相关文章

Spark 推测执行(speculative)

一 speculative简介 在spark作业运行中&#xff0c;一个spark作业会构成一个DAG调度图&#xff0c;一个DAG又切分成多个stage&#xff0c;一个stage由多个Task组成&#xff0c;一个stage里面的不同task的执行时间可能不一样&#xff0c;有的task很快就执行完成了&#xff0c;…

INT8 中的稀疏性:加速的训练工作流程和NVIDIA TensorRT 最佳实践

INT8 中的稀疏性&#xff1a;加速的训练工作流程和NVIDIA TensorRT 最佳实践 文章目录 INT8 中的稀疏性&#xff1a;加速的训练工作流程和NVIDIA TensorRT 最佳实践结构稀疏量化在 TensorRT 中部署稀疏量化模型的工作流程案例研究&#xff1a;ResNet-34要求第 1 步&#xff1a;…

非引导方法深度补全系列——1—— 《Sparsity invariant cnns》文章细读

提示&#xff1a;文章写完后&#xff0c;目录可以自动生成&#xff0c;如何生成可参考右边的帮助文档 目录 创新点 论文概述 方法详解 网络结构&#xff1a; 杂七杂八 总结 参考 创新点 1.提出了一种稀疏卷积层&#xff0c;在卷积过程中使用二进制有效性掩码来指示缺失…

模型剪枝三:Learning Structured Sparsity in Deep Neural Networks

论文&#xff1a;https://arxiv.org/abs/1608.03665 代码&#xff1a;https://github.com/wenwei202/caffe/tree/scnn 1 核心思想 前面两篇文章https://blog.csdn.net/cdknight_happy/article/details/110953977和https://blog.csdn.net/cdknight_happy/article/details/1110…

Exploring Sparsity in Image Super-Resolution for Efficient Inference

目录 原文翻译 Abstract 1. Introduction 2. Related Work 4. Our SMSR Network 4.1. Sparse Mask Generation 4.2. Sparse Mask Convolution 5. Experiments 5.1. Implementation Details 5.2. Model Analysis 5.3. Comparison with State-of-the-art Methods Conclusion …

稀疏大模型简述:从MoE、Sparse Attention到GLaM

文 | 唐工源 | 知乎 Sparsity, ..., is another important algorithmic advance that can greatly improve efficiency. 稀疏性&#xff0c;是&#xff08;神经架构搜索&#xff09;之外另一个重要的算法进步&#xff0c;可以大大提高效率。 The use of sparsity in models is…

Sparse Learning

Sparse Learning 本文的内容主要来自余凯老师在CVPR2012上给的Tutorial。前面在总结ScSPM和LLC的时候&#xff0c;引用了很多Tutorial上的图片。其实这个Tutorial感觉写的挺好的&#xff0c;所以这次把它大致用自己的语言描述一下。不过稀疏编码是前两年比较火的东西&#xff0…

理解sparse coding

稀疏编码系列&#xff1a; &#xff08;一&#xff09;----Spatial Pyramid 小结&#xff08;二&#xff09;----图像的稀疏表示——ScSPM和LLC的总结&#xff08;三&#xff09;----理解sparse coding&#xff08;四&#xff09;----稀疏模型与结构性稀疏模型 -------------…

干货!SparCL,将Sparsity用在连续学习任务,相同表现下训练效率提升23倍!

点击蓝字 关注我们 AI TIME欢迎每一位AI爱好者的加入&#xff01; 作者简介 王梓枫 美国东北大学博士生&#xff0c;主要研究方向持续学习&#xff08;Continual Learning&#xff09;。 个人主页&#xff1a;https://kingspencer.github.io/ 报告题目 SparCL: 边缘设备上的稀疏…

MORE CONVNETS IN THE 2020S: SCALING UP KER- NELS BEYOND 51 × 51 USING SPARSITY

论文链接: https://arxiv.org/pdf/2207.03620.pdf code: https://github.com/VITA-Group/SLaKlink MORE CONVNETS IN THE 2020S: SCALING UP KER- NELS BEYOND 51 51 USING SPARSITY 一、引言&#xff08;二&#xff09;、大内核注意力&#xff08;二&#xff09;、卷积中的大…

Self-supervised Learning for Label Sparsity in Computational Drug Repositioning

论文地址&#xff1a;Self-supervised Learning for Label Sparsity in Computational Drug Repositioning 1. Introduction 药物重定位旨在根据已知的药物-疾病关联性揭示上市药物的新用途。其背后的逻辑是&#xff1a;目前市场上的小分子药物具有多靶点特性&#xff0c;这意…

xgboost:分割Sparsity-aware Split Finding

Sparsity-aware Split Finding1 在许多现实问题中&#xff0c;输入 x x x是稀疏的是很常见的。造成稀疏性的可能原因有很多: 1)数据中存在缺失值&#xff1b; 2)统计中频繁出现零项&#xff1b; 3)特征工程的处理结果&#xff0c;如独热编码。 重要的是使算法意识到数据中…

Sparsity constraint稀疏约束详解

Sparsity constraint稀疏约束详解 引子&#xff1a; 线性模型是我们经常使用的一种模型&#xff0c;比如&#xff1a; 文本分类中&#xff0c;bag-of-words 有p 20 K 个特征&#xff0c; 共有 N 5K 个文本样例&#xff1b; 在图像去模糊化&#xff0c;图像分类中&#xff0c;…

[机器学习速成课程] 稀疏性正则化 (Regularization for Sparsity)-学习笔记

稀疏性和 L1 正则化 学习目标&#xff1a; 计算模型大小通过应用 L1 正则化来增加稀疏性&#xff0c;以减小模型大小 降低复杂性的一种方法是使用正则化函数&#xff0c;它会使权重正好为零。对于线性模型&#xff08;例如线性回归&#xff09;&#xff0c;权重为零就相当于完…

机器学习14:稀疏性-Sparsity

现实世界中&#xff0c;问题的特征的数量往往是很大的&#xff0c;而其中起决定性作用的往往是很小的一部分&#xff0c;稀疏规则化算子的引入会学习去掉这些没有信息的特征&#xff0c;也就是把这些特征对应的权重置为 0。 1.稀疏性正则化&#xff1a;L₁ 正则化 稀疏向量通常…

稀疏(sparsity)矩阵的行压缩存储

压缩矩阵行或列来存储矩阵的格式是很普遍的&#xff0c;它们不会存储不必要的元素&#xff08;即空值&#xff09;。但是它们也不是非常有效的&#xff0c;当在一个矩阵-向量积或预解决的每个简单标量中需要间接寻址。行压缩存储方式会把一个稀疏矩阵行的非零数值放在连续的存储…

redis删除锁

redis删除锁 参考&#xff1a;百度安全验证 前言 在分布式系统中&#xff0c;由于redis分布式锁相对于更简单和高效&#xff0c;成为了分布式锁的首先&#xff0c;被我们用到了很多实际业务场景当中。 但不是说用了redis分布式锁&#xff0c;就可以高枕无忧了&#xff0c;如…

Redis进阶: 锁的使用

Redis进阶: 锁的使用 1. 概念1. 原子性2. 事务 2. 使用Redis构建全局并发锁3. Redlock&#xff08;redis分布式锁&#xff09;总结 相关Blog 1. 概念 1. 原子性 原子性 原子性是数据库的事务中的特性。在数据库事务的情景下&#xff0c;原子性指的是&#xff1a;一个事务&…

redis锁的几种实现

1. redis加锁分类 redis能用的的加锁命令分表是INCR、SETNX、SET 2. 第一种锁命令INCR 这种加锁的思路是&#xff0c; key 不存在&#xff0c;那么 key 的值会先被初始化为 0 &#xff0c;然后再执行 INCR 操作进行加一。 然后其它用户在执行 INCR 操作进行加一时&#xff0…

java redis锁_Java中Redis锁的实现

由于具体业务场景的需求,需要保证数据在分布式环境下的正确更新,所以研究了一下Java中分布式锁的实现。 Java分布式锁的实现方式主要有以下三种: 数据库实现的乐观锁 Redis实现的分布式锁 Zookeeper实现的分布式锁 其中,较常用的是前两种方式,但是数据库实现方式需要较多的…