AutoCircuit

A library for efficient patching and automatic circuit discovery.

Installation

pip install auto-circuit

Easy and Efficient Edge Patching

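# Each edge is named "source->destination".
patch_edges = [
    "Resid Start->MLP 2",
    "MLP 2->A2.4.Q",
    "A2.4->Resid End",
]
# Inside the context, the listed edges are patched with `ablations`
# (see Different Ablation Methods for how `ablations` is computed).
with patch_mode(model, ablations, patch_edges):
    patched_out = model(tokens)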
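
To illustrate what patching an edge means, here is a minimal toy sketch in plain Python. It is not AutoCircuit's implementation. It assumes a destination node's input is the sum of its source nodes' outputs, and that patching an edge swaps that one source's contribution for an ablation value while the other edges stay clean. The helper `dest_input` and the example values are hypothetical.

```python
def dest_input(clean_outs, ablation_outs, patched_srcs):
    """Sum the contributions arriving at one destination node.

    Hypothetical helper: sources listed in `patched_srcs` contribute their
    ablation value; every other source contributes its clean output.
    """
    return sum(
        ablation_outs[src] if src in patched_srcs else clean
        for src, clean in clean_outs.items()
    )


# Toy values, chosen only for illustration.
clean_outs = {"Resid Start": 1.0, "MLP 2": 2.0}
ablation_outs = {"Resid Start": 0.5, "MLP 2": 10.0}

unpatched = dest_input(clean_outs, ablation_outs, patched_srcs=set())
patched = dest_input(clean_outs, ablation_outs, patched_srcs={"MLP 2"})
```

Only the patched source changes its contribution, which is why a patch can be attributed to a single edge rather than to a whole node.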

Different Ablation Methods

ablations = src_ablations(model, test_loader, AblationType.TOKENWISE_MEAN_CORRUPT)
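
As a rough intuition for the name `TOKENWISE_MEAN_CORRUPT`, the sketch below assumes the ablation value is the mean activation at each token position, averaged over the corrupt examples. That reading is an assumption based on the name, not a statement of the library's behavior, and `tokenwise_mean` is a hypothetical helper that works on plain lists.

```python
def tokenwise_mean(batch):
    """Average a batch of equal-length sequences position by position."""
    n = len(batch)
    # zip(*batch) groups the values found at the same token position.
    return [sum(column) / n for column in zip(*batch)]


# Two toy "corrupt" examples with three token positions each.
corrupt_acts = [
    [1.0, 2.0, 3.0],
    [3.0, 6.0, 5.0],
]
mean_acts = tokenwise_mean(corrupt_acts)
```

Under this assumption each token position keeps its own mean, instead of all positions being collapsed into a single average.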

Automatic Circuit Discovery

attribution_scores: PruneScores = mask_gradient_prune_scores(
    model=model,
    dataloader=train_loader,
    official_edges=None,
    grad_function="logit",
    answer_function="avg_diff",
    mask_val=0.0,
)

Visualization

fig = draw_seq_graph(model, attribution_scores)