How do two deep neural networks differ in how they arrive at a decision? Measuring the similarity of deep networks has been a long-standing open question. Most existing methods provide a single number to measure the similarity of two networks at a given layer, but give no insight into what makes them similar or dissimilar. We introduce an interpretable representational similarity method (RSVC) to compare two networks. We use RSVC to discover shared and unique visual concepts between two models. We show that some aspects of model differences can be attributed to unique concepts discovered by one model that are not well represented in the other. Finally, we conduct an extensive evaluation across different vision model architectures and training protocols to demonstrate RSVC's effectiveness.
Concept-based explanation methods provide insight into model behavior by revealing the visual concepts a model has discovered during training. Consider two different models trained on the same dataset: we would like to understand how concepts differ between the two models and whether these conceptual differences can explain differences in performance.
RSVC tackles this question by (1) extracting concepts for
Model 1, (2) asking Model 2 to predict Model 1's concepts,
and (3) measuring the quality of the prediction.
In this example, we use non-negative matrix factorization to
extract concepts for Model 1.
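The concept-extraction step can be sketched as follows. This is a minimal illustration assuming Model 1's (non-negative) activations have already been collected into a matrix with one row per image patch; the array shapes and the number of concepts are placeholder choices, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import NMF

# Placeholder for Model 1's non-negative activations, shape (n_patches, n_channels),
# e.g. post-ReLU features flattened over spatial positions (illustrative values only).
A1 = np.random.rand(1024, 512)

n_concepts = 10  # number of concepts to extract (a hyperparameter)
nmf = NMF(n_components=n_concepts, init="nndsvda", max_iter=500)

U1 = nmf.fit_transform(A1)  # concept coefficients per patch, shape (n_patches, n_concepts)
W1 = nmf.components_        # concept directions in feature space, shape (n_concepts, n_channels)
```

Each row of W1 is a candidate concept; each column of U1 indicates how strongly that concept is present in a given patch.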
We find that Model 1 has discovered concepts for a blue jay's tail and for a sky background.
We collect activations from Model 2 over the same images and fit a regression model that predicts Model 1's concept coefficients from them. We then measure how the predicted coefficients differ from the original coefficients using Pearson correlation. Larger correlation values indicate that Model 2 shares the concept with Model 1. In this case, Model 2 does not strongly predict the blue jay tail concept, but it does share the sky background concept.
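A sketch of the prediction and scoring steps, assuming A2 holds Model 2's activations over the same patches and U1 is the coefficient matrix from the NMF step above. The ridge regressor and the absence of a train/test split here are simplifications for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

# Placeholders: Model 2's activations over the same patches, and Model 1's
# concept coefficients (in practice, U1 comes from the NMF step above).
A2 = np.random.rand(1024, 384)
U1 = np.random.rand(1024, 10)

# Fit a regression that predicts Model 1's concept coefficients from Model 2's features.
reg = Ridge(alpha=1.0).fit(A2, U1)
U1_hat = reg.predict(A2)

# Per-concept similarity: Pearson correlation between original and predicted coefficients.
similarity = [pearsonr(U1[:, k], U1_hat[:, k])[0] for k in range(U1.shape[1])]
```

A concept with a high correlation is one that Model 2's features can reconstruct, i.e. a shared concept; a low correlation suggests a concept unique to Model 1.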
Can RSVC recover known conceptual differences? We train Model 1 to associate a pink
square with the Common Eider class and Model 2 to be invariant to the pink square. If RSVC
works as expected, then it should discover that the pink square concept is unique to Model 1.
We show that RSVC can detect this known difference. In the green box, we visualize an image collage of the patches with the largest coefficients for the extracted concept. The common feature across all of these patches is the pink square, so we identify this as the pink square concept that Model 1 was trained on.
We see that the predicted coefficients (from Model 2) are very different from Model 1's
coefficients for the pink square concept. We also visualize image collages that contain image patches
with over-predicted coefficients and under-predicted coefficients. We find that the regression model
is unable to disentangle images of water without the pink square from images of water with the
pink square.
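The collages above can be assembled by ranking patches by their concept coefficients and by the regression's prediction error. A minimal sketch, with placeholder arrays standing in for U1 and U1_hat from the previous sketches and a hypothetical top_k helper:

```python
import numpy as np

# Placeholders: Model 1's concept coefficients and the coefficients predicted from Model 2.
U1 = np.random.rand(1024, 10)
U1_hat = np.random.rand(1024, 10)

def top_k(values, k=16):
    """Indices of the k largest entries, in descending order."""
    return np.argsort(values)[::-1][:k]

c = 0  # index of the concept being visualized (e.g. the pink-square concept)

highest_activating = top_k(U1[:, c])  # collage of patches where Model 1's concept is strongest
residual = U1_hat[:, c] - U1[:, c]    # per-patch prediction error of the regression
over_predicted = top_k(residual)      # collage of the most over-predicted patches
under_predicted = top_k(-residual)    # collage of the most under-predicted patches
```

The selected indices pick out the image patches shown in the corresponding collages.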
Interpreting Low-Similarity Concepts. In this example, we find an RN50 concept for the barbell class that the ViT-S is not able to predict. (Green): The RN50 concept reacts to images of hands lifting barbells. Additionally, many images contain the vertical supports of a squat rack. We train a regression model on the ViT-S activations to predict the RN50 concept coefficients. (Blue): The ViT-S regression model under-reacts to images containing hands, people, and squat racks. (Orange): It over-reacts to images that focus more heavily on weight plates. These results suggest that the specific concept of hands lifting barbells is not represented in the ViT-S. In the paper, we use an LLVM to analyze the image collages (IC1 and IC2) and find that it detects similar differences in the visualizations.
Do models learn important and unique concepts? Models are trained on ImageNet; we use ResNets and ViTs. We find that models learn concepts that have (1) low similarity and low importance, (2) high similarity and high importance, and (3) low similarity and high importance. The last category is particularly interesting, since it indicates that one model has discovered an important concept that the other has not learned. However, we find that the bulk of model differences can be attributed to medium-similarity, medium-importance concepts.
Layerwise Concept Similarity. We ask how concepts across different layers of two networks relate. We find that concept similarity is higher in earlier layers, decreases in the middle, and rises slightly again towards the end of the network, suggesting that the classification task biases models to organize information in a more similar way towards the end of the network. Interestingly, various aspects of this result have been corroborated by related work in representational similarity and interpretability.
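One way to compute such a layerwise profile is to repeat the extract, predict, and score steps at matched layers of the two models and average the per-concept correlations. A self-contained sketch under those assumptions (the layer names, shapes, and the simple averaging are illustrative choices):

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

def rsvc_layer_similarity(acts1, acts2, n_concepts=10):
    """Mean per-concept similarity between two layers: extract NMF concepts from acts1,
    predict their coefficients from acts2 with ridge regression, score with Pearson r."""
    U1 = NMF(n_components=n_concepts, init="nndsvda", max_iter=500).fit_transform(acts1)
    U1_hat = Ridge(alpha=1.0).fit(acts2, U1).predict(acts2)
    return float(np.mean([pearsonr(U1[:, k], U1_hat[:, k])[0] for k in range(n_concepts)]))

# Hypothetical per-layer activations for both models over the same images.
layers = ["layer1", "layer2", "layer3", "layer4"]
acts_m1 = {l: np.random.rand(512, 256) for l in layers}
acts_m2 = {l: np.random.rand(512, 256) for l in layers}

profile = {l: rsvc_layer_similarity(acts_m1[l], acts_m2[l]) for l in layers}
```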
@inproceedings{kondapaneni2025representational,
title={Representational Similarity via Interpretable Visual Concepts},
author={Kondapaneni, Neehar and Mac Aodha, Oisin and Perona, Pietro},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025}
}