Multimodal Neurons in
Artificial Neural Networks

[paper] [blog]

Gabriel Goh, Nick Cammarata †, Chelsea Voss †, Shan Carter, Michael Petrov, Ludwig Schubert, Alec Radford, Chris Olah

Outline

  1. Motivation & Results
  2. Background
    1. Feature Visualization
    2. A primer on CLIP
  3. Multimodal Neurons
    1. Faceted visualization
    2. Person Neurons
    3. Region Neurons
  4. Typographic Attacks
  5. Conclusion

Motivation & results

microscope

feature visualization

[paper]

A primer on CLIP

CLIP: Contrastive Language-Image Pre-training

[paper] [blog]

A primer on CLIP

  • Model(s) trained to predict image-text similarity
  • Trained on a huge dataset (400M image-text pairs)
  • Achieves impressive zero-shot performance on a diverse range of benchmarks
  • Natural-language supervision \(\rightarrow\) no need for manually annotated labels

[paper] [blog]
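The training setup above can be sketched in a few lines of numpy. This is a toy illustration, not CLIP's actual code: the embeddings are random stand-ins for the encoder outputs, and the temperature value is fixed rather than learned.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def cross_entropy(logits, targets):
    """Mean cross-entropy of a row-wise softmax against integer targets."""
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(probs[np.arange(len(targets)), targets]).mean()

# Toy stand-ins for the outputs of CLIP's image and text encoders;
# in the real model, both encoders project into a shared embedding space.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(4, 512))  # a batch of 4 images
text_emb = rng.normal(size=(4, 512))   # their 4 matching captions

# Contrastive objective: matched pairs lie on the diagonal of the
# similarity matrix, and each image/caption must pick out its partner.
logits = 100.0 * cosine_sim(image_emb, text_emb)  # temperature-scaled
targets = np.arange(4)
loss = 0.5 * (cross_entropy(logits, targets)       # image -> text
              + cross_entropy(logits.T, targets))  # text -> image
```

Because the supervision signal is just "which caption goes with which image", no class labels are ever needed, which is what makes the 400M-pair web-scale dataset feasible.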

Multimodal neurons

Multi-faceted feature visualization

  • Neurons often fire for multiple "facets"
    • E.g. a grocery store neuron fires for the storefront as well as for rows of products
  • Standard feature visualization fails on such neurons
  • Past ideas
    • Use diverse seeds in the optimization [1]
    • Add a diversity term to the loss [2]
  • Approach
    • Train a linear probe \(W\) to classify images into a facet (e.g. face, text)
    • Add \(W\) to the visualization objective

[1] Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned by Each Neuron in Deep Neural Networks, A. Nguyen, J. Yosinski, J. Clune. [link]

[2] Feature Visualization, C. Olah, A. Mordvintsev, L. Schubert. [link]
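The combined objective can be sketched as follows. Everything here is a hypothetical stand-in: a linear "feature extractor" replaces the real network, the facet weight `lam` is an illustrative knob rather than a value from the paper, and a real run would backprop through the network instead of using a closed-form gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64
# Hypothetical stand-ins: a linear "feature extractor" in place of the
# network, the direction of the neuron being visualized, and a facet
# probe W trained elsewhere to separate e.g. face images from text images.
features = rng.normal(size=(D, D))
neuron = rng.normal(size=D)
facet_probe = rng.normal(size=D)
lam = 0.5  # facet weight; an illustrative knob, not from the paper

def objective(x):
    """Neuron activation plus the facet-probe term added to the objective."""
    h = features @ x
    return neuron @ h + lam * (facet_probe @ h)

# Gradient of the (linear) toy objective; a real feature visualization
# would compute this by backpropagation through the network.
g = features.T @ (neuron + lam * facet_probe)

# Gradient ascent from noise, as in feature visualization.
x = rng.normal(size=D)
x /= np.linalg.norm(x)
before = objective(x)
for _ in range(100):
    x = x + 0.01 * g
    x /= max(np.linalg.norm(x), 1.0)  # keep the "image" bounded
after = objective(x)
```

The facet term steers the optimization toward inputs that both excite the neuron and look like the chosen facet, so the same neuron can be visualized once per facet (e.g. its "face" rendition and its "text" rendition).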

person neurons

SPIDERMAN NEURON

  • Fires for photos of Spider-Man, drawings of him, and renderings of the text 'spiderman' (and 'spider')

person neurons: DONALD TRUMP NEURON

  • Case study: which dataset images activate the neuron, and how strongly?

region neurons

Typographic attacks

Attack demos: zero-shot classification vs. linear probe

Typographic attacks: the Stroop effect

Activations were gathered using the zero-shot methodology with the prompt "my favorite word, written in the color _____".
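The zero-shot methodology behind these attacks can be sketched with toy vectors. The apple/iPod example comes from the blog post; the embeddings below are made-up stand-ins, and `zero_shot_predict` is a simplified version of CLIP's zero-shot classifier (nearest text prompt by cosine similarity).

```python
import numpy as np

def zero_shot_predict(image_emb, prompt_embs, class_names):
    """Pick the class whose text-prompt embedding is most similar to the image."""
    sims = prompt_embs @ image_emb / (
        np.linalg.norm(prompt_embs, axis=1) * np.linalg.norm(image_emb))
    return class_names[int(np.argmax(sims))]

# Hypothetical embeddings: sticking a handwritten "iPod" label on an apple
# drags the image embedding toward the text class, flipping the prediction.
rng = np.random.default_rng(0)
apple_txt = rng.normal(size=64)
ipod_txt = rng.normal(size=64)
clean_img = apple_txt + 0.1 * rng.normal(size=64)  # embeds near "apple"
attacked_img = 0.3 * apple_txt + 0.7 * ipod_txt    # the written word dominates

prompts = np.stack([apple_txt, ipod_txt])
names = ["apple", "iPod"]
pred_clean = zero_shot_predict(clean_img, prompts, names)
pred_attacked = zero_shot_predict(attacked_img, prompts, names)
```

The attack is so effective precisely because the multimodal neurons respond to rendered text as strongly as to the object itself: a piece of paper and a marker suffice.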

Conclusion

  • Much like the human brain, CLIP appears to contain multimodal neurons
  • Feature Visualization and Dataset Search are powerful tools for inspecting neural networks
  • One can examine whole families (region, person, emotion) of these neurons
  • CLIP is vulnerable to low-tech, in-the-wild typographic attacks
  • Bias
    • A "Middle East" neuron associated with terrorism
    • An "immigration" neuron associated with Latin America

Resources