AutoCircuit
A library for efficient patching and automatic circuit discovery.
Installation
Easy and Efficient Edge Patching
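# Edges to patch, each named "source->destination".
patch_edges = [
    "Resid Start->MLP 2",
    "MLP 2->A2.4.Q",
    "A2.4->Resid End",
]
# Inside this context, the listed edges are patched using `ablations`.
with patch_mode(model, ablations, patch_edges):
    patched_out = model(tokens)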
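To make the idea of patching a single edge concrete, here is a minimal toy sketch in plain Python. It is not AutoCircuit's implementation: the three-node graph, the `forward` function, and the node names `Start`, `MLP`, and `End` are hypothetical and exist only for illustration. The point it shows is that patching an edge replaces the value flowing along that one edge, while other edges leaving the same source still carry the clean value.

```python
# Toy illustration of edge patching (hypothetical graph, not AutoCircuit's API).
# Graph: Start -> MLP, Start -> End, MLP -> End.

def forward(x, ablations=None, patch_edges=()):
    """Run the toy graph, replacing values on the patched edges.

    ablations:   source node name -> value to send along a patched edge
    patch_edges: iterable of "source->destination" edge names
    """
    ablations = ablations or {}
    patched = set(patch_edges)

    def edge_value(src, dest, clean_value):
        # A patched edge carries the ablated value; any other edge
        # carries whatever the source produced in this forward pass.
        if f"{src}->{dest}" in patched:
            return ablations[src]
        return clean_value

    start = x
    mlp = 2.0 * edge_value("Start", "MLP", start)
    end = edge_value("Start", "End", start) + edge_value("MLP", "End", mlp)
    return end


zero_ablations = {"Start": 0.0, "MLP": 0.0}

clean_out = forward(3.0)
patched_mlp_input = forward(3.0, zero_ablations, ["Start->MLP"])
patched_skip = forward(3.0, zero_ablations, ["Start->End"])
```

In this sketch, patching `Start->MLP` removes the contribution that reaches `End` through `MLP` but leaves the direct `Start->End` path untouched, which is the distinction between patching an edge and patching a whole node.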
Different Ablation Methods
Automatic Circuit Discovery
attribution_scores: PruneScores = mask_gradient_prune_scores(
    model=model,
    dataloader=train_loader,
    official_edges=None,
    grad_function="logit",
    answer_function="avg_diff",
    mask_val=0.0,
)
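As a rough intuition for what a mask-gradient score could look like, the sketch below works through a toy example in plain Python. It assumes that each edge has a mask value that interpolates between its clean activation (mask 0) and its ablated activation (mask 1), and that an edge's score is the gradient of the output metric with respect to that mask, evaluated at `mask_val`. This reading is an assumption based on the function and parameter names above, and `toy_metric` and `toy_mask_gradient_scores` are hypothetical helpers, not part of the library.

```python
# Toy sketch of mask-gradient attribution (hypothetical helpers, not the library's code).

def toy_metric(mask, clean, ablated, weights):
    """Linear toy metric: a weighted sum of the interpolated edge activations."""
    total = 0.0
    for edge in clean:
        value = (1.0 - mask[edge]) * clean[edge] + mask[edge] * ablated[edge]
        total += weights[edge] * value
    return total


def toy_mask_gradient_scores(clean, ablated, weights, mask_val=0.0, eps=1e-6):
    """Estimate d(metric)/d(mask) per edge at mask_val by central differences."""
    scores = {}
    for edge in clean:
        mask_hi = {e: mask_val for e in clean}
        mask_lo = dict(mask_hi)
        mask_hi[edge] += eps
        mask_lo[edge] -= eps
        diff = toy_metric(mask_hi, clean, ablated, weights) - toy_metric(
            mask_lo, clean, ablated, weights
        )
        scores[edge] = diff / (2.0 * eps)
    return scores


clean = {"Resid Start->MLP 2": 1.0, "MLP 2->A2.4.Q": 2.0}
ablated = {"Resid Start->MLP 2": 0.0, "MLP 2->A2.4.Q": 0.0}
weights = {"Resid Start->MLP 2": 3.0, "MLP 2->A2.4.Q": -0.5}

scores = toy_mask_gradient_scores(clean, ablated, weights, mask_val=0.0)
# Rank edges by the magnitude of their score, largest first.
ranked_edges = sorted(scores, key=lambda e: abs(scores[e]), reverse=True)
```

For this linear toy metric the gradient for each edge is `weights[edge] * (ablated[edge] - clean[edge])`, so the finite-difference estimate can be checked by hand. A real model is nonlinear, in which case a gradient taken at a single mask value is only a first-order approximation of the effect of fully ablating the edge.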