Media Summary: This has been my favorite video so far to make! I think interpretability is so important both in terms of ensuring safe AI and also ... One of the core roadblocks to understanding the computation inside a transformer is the fact that individual neurons do not seem ... Warning: This is an ad-libbed talk, and I'm sure I got some facts wrong. This is a talk I gave to my MATS 9.0 training program on ...
Sparse Autoencoders Unlearn Knowledge in LLMs: A Paper-Based Walkthrough - Detailed Analysis & Overview
In this video, we dive deep into the world of ... A visual explanation of how transformers piece concepts together, told in the style of 3Blue1Brown. Introducing SAEs. What truly ...
This presentation provides a comprehensive survey of "SAeUron: Interpretable Concept ... In this AI Research Roundup episode, Alex discusses the ...