    Learning Rate Decay


The learning rate plays a crucial role in how well the whole model is optimized.

In the leftmost figure the learning rate is set too small, so it may take a very long time to reach the global minimum; the middle figure shows a learning rate that is set just right, so the global minimum is found quickly; the rightmost figure shows a learning rate that is set too large, which can make the loss jump up and down and never settle at the global minimum.

From this we can see that choosing a suitable learning rate takes some care. As shown in the figure below, using a learning rate that decays automatically during training can speed up optimization to some extent; a rough sketch of the idea follows.
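
As a minimal illustration of what "decay" means here (made-up values, not the PyTorch scheduler used below): start from a base learning rate and multiply it by a fixed factor every few epochs.

    # Step decay sketch: every `step_size` epochs the learning rate is
    # multiplied by `gamma`. All values are illustrative.
    base_lr, gamma, step_size = 0.1, 0.5, 10

    def decayed_lr(epoch):
        return base_lr * gamma ** (epoch // step_size)

    for epoch in (0, 10, 20, 30):
        print(epoch, decayed_lr(epoch))   # 0.1, 0.05, 0.025, 0.0125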

PyTorch provides a scheduler class that implements this kind of learning rate decay for us, driven by the monitored loss:

    class torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10,
        verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-8)

    # patience=10 means the scheduler waits for 10 epochs in which the monitored
    # loss shows no improvement before it reduces the learning rate (by `factor`).
    from torch.optim.lr_scheduler import ReduceLROnPlateau

    optimizer = torch.optim.SGD(model.parameters(),
                                args.lr,
                                momentum=args.momentum,
                                weight_decay=args.weight_decay)
    scheduler = ReduceLROnPlateau(optimizer, 'min')

    for epoch in range(args.start_epoch, args.epochs):
        train(train_loader, model, criterion, optimizer, epoch)
        result_avg, loss_val = validate(val_loader, model, criterion, epoch)
        # the scheduler monitors the validation loss passed to step()
        scheduler.step(loss_val)
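
To confirm that the decay actually triggers, one option (not part of the original post, just a suggestion) is to print the learning rate the optimizer is currently using at the end of each epoch; ReduceLROnPlateau rewrites this value in place when it fires.

    # Inside the training loop above: read the learning rate currently stored
    # in the optimizer; ReduceLROnPlateau updates it in place after a plateau.
    current_lr = optimizer.param_groups[0]['lr']
    print('epoch {}: lr = {:.6f}'.format(epoch, current_lr))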
      