Sports players official
auto_circuit.metrics.official_circuits.circuits.sports_players_official
Functions
sports_players_probe_true_edges
sports_players_probe_true_edges(model: PatchableModel, token_positions: bool = False, word_idxs: Dict[str, int] = {}, seq_start_idx: int = 0) -> Set[Edge]
Wrapper for `sports_players_true_edges` that does not include the `extract_sport` section of the circuit. Instead, we extend the `lookup` MLP stack to the final layer of the model.
This is included to make it easier to reproduce the probing results from the post, which probe only the MLP stack and ignore the `extract_sport` section.
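Because both functions return a `Set[Edge]`, the two circuit variants can be compared with ordinary set operations. The sketch below is a minimal, library-free illustration of that idea: `ToyEdge` is a hypothetical stand-in for the library's `Edge` type, and the node names are invented placeholders, not the real nodes of the Sports Players circuit.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ToyEdge:
    """Hypothetical stand-in for an Edge (frozen, so it is hashable)."""

    src: str
    dest: str


# Invented placeholder edges. In practice these sets would come from
# sports_players_true_edges and sports_players_probe_true_edges.
full_circuit: Set[ToyEdge] = {
    ToyEdge("lookup_mlp_0", "lookup_mlp_1"),
    ToyEdge("lookup_mlp_1", "extract_sport_head"),
    ToyEdge("extract_sport_head", "output"),
}
probe_circuit: Set[ToyEdge] = {
    ToyEdge("lookup_mlp_0", "lookup_mlp_1"),
    ToyEdge("lookup_mlp_1", "lookup_mlp_2"),  # MLP stack extended further
    ToyEdge("lookup_mlp_2", "output"),
}

shared = full_circuit & probe_circuit         # edges in both variants
only_in_full = full_circuit - probe_circuit   # e.g. the extract_sport part
only_in_probe = probe_circuit - full_circuit  # e.g. the extended MLP stack

print(len(shared), len(only_in_full), len(only_in_probe))
```

With real edge sets, the same three expressions would show which edges the probe variant drops and which it adds.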
Source code in auto_circuit/metrics/official_circuits/circuits/sports_players_official.py
sports_players_true_edges
sports_players_true_edges(model: PatchableModel, token_positions: bool = False, word_idxs: Dict[str, int] = {}, seq_start_idx: int = 0) -> Set[Edge]
The full Sports Players circuit from input to output, as discovered by Rajamanoharan et al. (2023).
Read the source code comments for precise details on our interpretation of the circuit. The focus of the paper was on probing for sports features, so the exact set of edges that constitute the circuit is somewhat ambiguous.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`model` | `PatchableModel` | A patchable TransformerLens Pythia 2.8B model. | required |
`token_positions` | `bool` | Whether to distinguish between token positions when returning the set of circuit edges. If … | `False` |
`word_idxs` | `Dict[str, int]` | A dictionary defining the index of specific named tokens in the circuit definition. For this circuit, the required token positions are: … | `{}` |
`seq_start_idx` | `int` | Offset to add to all of the token positions in … | `0` |
Returns:
Type | Description |
---|---|
`Set[Edge]` | The set of edges in the circuit. |
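As a rough illustration of how `word_idxs` and `seq_start_idx` could interact, the sketch below adds the offset to each named token index. This is an assumption based on the parameter descriptions, not the library's code: `absolute_positions` is a hypothetical helper, and the token names are placeholders rather than the names this circuit actually requires.

```python
from typing import Dict


def absolute_positions(
    word_idxs: Dict[str, int], seq_start_idx: int = 0
) -> Dict[str, int]:
    """Hypothetical helper: shift every named token index by seq_start_idx."""
    return {name: idx + seq_start_idx for name, idx in word_idxs.items()}


# Placeholder token names, invented for this example only.
example_word_idxs = {"token_a": 0, "token_b": 2, "end": 5}

# Assumed scenario: every prompt is prefixed with one extra token,
# so each named position moves one step to the right.
shifted = absolute_positions(example_word_idxs, seq_start_idx=1)
print(shifted)
```

Keeping the offset separate from the named indices means the same `word_idxs` dictionary can be reused when a prefix is added to or removed from the prompts.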
Source code in auto_circuit/metrics/official_circuits/circuits/sports_players_official.py