Activation magnitude
auto_circuit.prune_algos.activation_magnitude
Functions
activation_magnitude_prune_scores
activation_magnitude_prune_scores(model: PatchableModel, dataloader: PromptDataLoader, official_edges: Optional[Set[Edge]]) -> PruneScores
A simple baseline circuit discovery algorithm. Each edge's prune score is its mean activation magnitude.
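As a rough illustration of the idea only, and not the library's implementation, the sketch below averages the absolute activation values recorded for each edge across batches. The `mean_activation_magnitude` helper, the string edge names, and the plain-list activations are hypothetical stand-ins for the library's `Edge` and tensor types. It also assumes the mean is taken over every recorded value rather than over per-batch means.

```python
from typing import Dict, List


def mean_activation_magnitude(
    activations: Dict[str, List[List[float]]],
) -> Dict[str, float]:
    """Return the mean absolute activation for each edge.

    `activations` maps a hypothetical edge name to a list of batches,
    where each batch is a flat list of activation values.
    """
    scores: Dict[str, float] = {}
    for edge, batches in activations.items():
        total = 0.0
        count = 0
        for batch in batches:
            # Accumulate magnitudes, so sign does not cancel out.
            total += sum(abs(value) for value in batch)
            count += len(batch)
        # Edges with no recorded activations get a score of 0.0.
        scores[edge] = total / count if count else 0.0
    return scores


# Toy data: two edges, two batches each (hypothetical values).
toy_activations = {
    "A->B": [[1.0, -3.0], [2.0, -2.0]],
    "A->C": [[0.5, -0.5], [0.0, 1.0]],
}
toy_scores = mean_activation_magnitude(toy_activations)
```

Taking the absolute value before averaging matters here: an edge whose activations are large but alternate in sign would otherwise score close to zero.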
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | PatchableModel | The model to find the circuit for. | required |
dataloader | PromptDataLoader | The dataloader to use for input. | required |
official_edges | Optional[Set[Edge]] | Not used. | required |
Returns:
Type | Description |
---|---|
PruneScores | An ordering of the edges by importance to the task. Importance is equal to the absolute value of the score assigned to the edge. |
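To make the "importance equals absolute value" convention concrete, here is a minimal sketch of how such scores could be turned into a ranking. It assumes scores are held in a plain dictionary keyed by hypothetical edge names; the actual `PruneScores` type in the library may be structured differently, and `rank_edges_by_importance` is an illustrative helper rather than part of the API.

```python
from typing import Dict, List


def rank_edges_by_importance(scores: Dict[str, float]) -> List[str]:
    """Sort edge names from most to least important, using |score|."""
    return sorted(scores, key=lambda edge: abs(scores[edge]), reverse=True)


# Hypothetical scores; the sign is ignored when ranking.
example_scores = {"A->B": -2.0, "A->C": 0.5, "B->C": 1.25}
ranking = rank_edges_by_importance(example_scores)

# Keep only the two most important edges as a candidate circuit.
top_two = ranking[:2]
```

Under this convention an edge with score -2.0 outranks one with score 1.25, since only the magnitude is compared.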