    Image Captioning [1]

    Speech Recognition and Semantic Processing
    • 188****7632

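      I noticed a group that has recently published a run of image captioning papers: two at AAAI 2021 and two at CVPR 2021.
      I have already read one of the AAAI 2021 papers, so this post covers the remaining one.
      Overall, the quality at AAAI really is quite a bit lower than at CVPR.

      This paper does not have much to it. It mostly comes down to good storytelling and packaging, and as an AAAI 2021 paper it does not even compare against CVPR 2020 work.
      Improving Image Captioning by Leveraging Intra- and Inter-layer Global
      Representation in Transformer Network (AAAI 2021)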

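      The model architecture is as follows:

      As I understand it, it simply adds a global representation on top of the standard Transformer architecture.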
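
To make that concrete, here is a minimal sketch of adding a global representation to a Transformer encoder layer. It assumes the global vector starts as the mean of the region features and is prepended as one extra token before self-attention; the paper's exact formulation may differ. The names `self_attention` and `add_global_token` are hypothetical, and the learned projections are left out to keep it short.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def add_global_token(regions):
    """Prepend a global vector (assumption: the mean of the region features)."""
    n, d = len(regions), len(regions[0])
    global_vec = [sum(r[j] for r in regions) / n for j in range(d)]
    return [global_vec] + regions

# Three toy region features of dimension 2.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded = self_attention(add_global_token(regions))
# The first output is the layer's global representation, the rest are regions.
global_out, region_out = encoded[0], encoded[1:]
```

Because the global token attends to every region (and every region attends to it), the first output row acts as a summary of the whole image for that layer.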


      After the encoder runs, there are several global representations, and an LSTM is used to combine them into the final representation.
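
A rough sketch of that aggregation step, assuming one global vector per encoder layer is fed to an LSTM in layer order and the last hidden state is taken as the final global representation. `TinyLSTMCell` and `aggregate_globals` are hypothetical names, and the weights are random placeholders rather than trained values.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class TinyLSTMCell:
    """Plain LSTM cell; weights are random placeholders, not trained values."""

    GATES = ("input", "forget", "cand", "output")

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix and one bias vector per gate.
        self.W = {name: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                         for _ in range(hidden_size)] for name in self.GATES}
        self.b = {name: [0.0] * hidden_size for name in self.GATES}

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        pre = {name: [a + b for a, b in zip(matvec(self.W[name], z), self.b[name])]
               for name in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        g = [math.tanh(v) for v in pre["cand"]]
        o = [sigmoid(v) for v in pre["output"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

def aggregate_globals(layer_globals, cell):
    """Run the LSTM over per-layer global vectors; return the last hidden state."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for g_vec in layer_globals:  # shallowest layer first (assumed order)
        h, c = cell.step(g_vec, h, c)
    return h

# Toy global vectors from three encoder layers.
layer_globals = [[0.2, 0.1], [0.4, 0.3], [0.5, 0.6]]
final_global = aggregate_globals(layer_globals, TinyLSTMCell(2, 2))
```

Treating the layers as a sequence lets the LSTM gates decide how much of each layer's summary to keep, instead of using only the top layer's output.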

      Then, in the decoder, the global representation is concatenated in, and self-attention plus cross-attention does the rest.
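
One way to read the decoder step, sketched under the assumption that the final global vector is concatenated to the region features along the sequence axis, so cross-attention sees it as one more memory slot. The paper may fuse it differently. `cross_attention` is a hypothetical helper, and `word_state` stands in for a decoder state that has already been through masked self-attention.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query, memory):
    """Scaled dot-product attention of one query over the memory slots."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in memory]
    weights = softmax(scores)
    context = [sum(w * m[j] for w, m in zip(weights, memory)) for j in range(d)]
    return context, weights

region_feats = [[1.0, 0.0], [0.0, 1.0]]   # encoder outputs for two regions
final_global = [0.5, 0.5]                 # aggregated global representation
memory = region_feats + [final_global]    # assumed: concatenate along the sequence axis

word_state = [1.0, 0.0]  # stand-in for a decoder state after masked self-attention
context, weights = cross_attention(word_state, memory)
```

Here `weights` has one entry per region plus one for the global vector, so each generated word can draw on both local and whole-image information.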

      There is no particularly valuable insight here.

      Reference:
      https://arxiv.org/abs/2012.07061
