• 16: Infini-Attention: Google's Solution for Infinite Memory in LLMs

  • May 22, 2024
  • Length: 23 mins
  • Podcast

  • Summary

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques welcome Leticia Fernandes, a Senior Data Scientist and Generative AI Ambassador at Deeper Insights. Together, they explore Google's "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" paper, which tackles the challenge of giving large language models an effectively unbounded context window by introducing the Infini-attention method. The trio discusses how the approach works, including how it combines linear attention with a compressive memory that stores key-value pairs, enabling models to attend over very long contexts (a simplified sketch of the mechanism appears at the end of this summary).

    We also extend a special thank you to the research team at Google for developing this month's paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2404.07143.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at thepaperclub@deeperinsights.com.
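
    For readers curious about the mechanics mentioned above, here is a minimal NumPy sketch of the two ideas the episode centres on: a linear-attention read from a compressive memory, and an additive update that folds each segment's key-value pairs back into that memory, gated against ordinary local attention. It is an illustrative simplification based on the paper's description, not Google's implementation; the function names, the scalar beta gate, and the small epsilon guard are assumptions made for this sketch.

    import numpy as np

    def elu_plus_one(x):
        # sigma(x) = ELU(x) + 1 keeps activations positive for the linear-attention kernel
        return np.where(x > 0, x + 1.0, np.exp(x))

    def infini_attention_segment(Q, K, V, M, z, beta):
        # Q, K, V: (seq_len, d) projections for the current segment
        # M: (d, d) compressive memory accumulated over previous segments
        # z: (d,) normalisation term accumulated over previous segments
        # beta: scalar gate (stand-in for a learned parameter) mixing memory and local attention
        d = Q.shape[-1]
        sQ, sK = elu_plus_one(Q), elu_plus_one(K)

        # Read from the compressive memory (linear-attention retrieval);
        # the small epsilon is an assumption to avoid dividing by zero while the memory is empty
        A_mem = (sQ @ M) / ((sQ @ z)[:, None] + 1e-6)

        # Standard causal dot-product attention within the current segment
        scores = (Q @ K.T) / np.sqrt(d)
        mask = np.tril(np.ones_like(scores, dtype=bool))
        scores = np.where(mask, scores, -np.inf)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        A_local = weights @ V

        # Gate the long-term (memory) and local contributions
        g = 1.0 / (1.0 + np.exp(-beta))
        A = g * A_mem + (1.0 - g) * A_local

        # Compress this segment's key-value pairs into the memory for later segments
        M_new = M + sK.T @ V
        z_new = z + sK.sum(axis=0)
        return A, M_new, z_new

    # Usage sketch: process a long sequence segment by segment, carrying (M, z) forward
    d, seg_len = 64, 128
    M, z = np.zeros((d, d)), np.zeros(d)
    for _ in range(4):  # four segments of a longer stream
        Q, K, V = (np.random.randn(seg_len, d) for _ in range(3))
        out, M, z = infini_attention_segment(Q, K, V, M, z, beta=0.0)

    Because the memory M stays a fixed d-by-d matrix no matter how many segments have been processed, the cost of carrying long-range context remains constant, which is the efficiency argument discussed in the episode.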

