Releases: EleutherAI/concept-erasure
v0.2.4
v0.2.2
v0.2.1
Removes HuggingFace datasets and transformers as hard dependencies. These libraries are now only imported when the relevant functions from the scrubbing module are imported.
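As an illustration of the general pattern, here is a minimal sketch of one common way to keep a heavy package optional: import it only at the point of use and fail with an actionable message if it is missing. The helper names `require_optional` and `count_dataset_rows` are hypothetical and are not part of this library, which defers the imports at the module level rather than per function.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, feature: str) -> ModuleType:
    """Import an optional dependency on demand.

    Raises ImportError with an install hint if the package is missing.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'; "
            f"install it with `pip install {module_name}`."
        ) from exc


def count_dataset_rows(path: str) -> int:
    """Hypothetical helper: `datasets` is imported only when this is called."""
    datasets = require_optional("datasets", "count_dataset_rows")
    return len(datasets.load_dataset(path, split="train"))
```

With this arrangement, importing the package that defines `count_dataset_rows` succeeds even when `datasets` is not installed; the error only surfaces if the function is actually called.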
This release also removes a handful of vestigial utility functions which were not used by the core API.
v0.2.0
Added an Oracle LEACE implementation (#2), which achieves even more surgical edits than standard LEACE when ground-truth concept labels are available at inference time. The classes OracleFitter and OracleEraser are designed to work almost exactly like LeaceFitter and LeaceEraser, except that OracleEraser requires an extra z positional argument in its forward method.
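To make the role of the extra z argument concrete, here is a minimal NumPy sketch of the idea. It assumes the oracle edit subtracts from each feature vector the least-squares prediction of the centered features from the centered labels. The functions `fit_oracle` and `erase_oracle` are illustrative names, not the library's API, and this is not the library's code.

```python
import numpy as np


def fit_oracle(x: np.ndarray, z: np.ndarray):
    """Return (coef, mean_z) for the least-squares prediction of x from z."""
    mean_x, mean_z = x.mean(axis=0), z.mean(axis=0)
    xc, zc = x - mean_x, z - mean_z
    n = len(x)
    sigma_xz = xc.T @ zc / n  # (d, k) cross-covariance of features and labels
    sigma_zz = zc.T @ zc / n  # (k, k) covariance of the labels
    coef = sigma_xz @ np.linalg.pinv(sigma_zz)  # (d, k)
    return coef, mean_z


def erase_oracle(x, z, coef, mean_z):
    """Subtract the part of x that is linearly predictable from z."""
    return x - (z - mean_z) @ coef.T


rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=(1000, 1)).astype(float)
x = rng.normal(size=(1000, 8)) + 3.0 * z  # the label shifts every feature

coef, mean_z = fit_oracle(x, z)
x_erased = erase_oracle(x, z, coef, mean_z)

# Cross-covariance between the erased features and the labels
cross_cov = (x_erased - x_erased.mean(0)).T @ (z - z.mean(0)) / len(x)
```

Because the edit depends on the label of each example, the erasure step needs z as well as x, which is why the oracle eraser takes one more argument than the standard one.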
This release also fixes a subtle bug in our covariance matrix shrinkage implementation which caused NaN results when the data has zero variance (#3).
v0.1.0
Refactoring
ConceptEraser has been split into two separate classes, LeaceFitter and LeaceEraser. This makes it easy to save the fitted erasure function by itself in a compact format, without also saving the covariance and cross-covariance statistics used to create it.
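The benefit of the split can be sketched in NumPy as follows. This is an illustration under assumptions, not the library's implementation: `SketchFitter`, `SketchEraser`, and their field names are hypothetical, and the edit uses the LEACE form x - W⁺ P W (x - μ), where W is the whitening matrix and P projects onto the column space of W Σ_xz. The fitter carries the d × d covariance; the eraser keeps only two thin factors and a bias.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SketchFitter:
    """Holds the full statistics needed to derive an eraser (hypothetical)."""

    mean_x: np.ndarray    # (d,)
    sigma_xx: np.ndarray  # (d, d)
    sigma_xz: np.ndarray  # (d, k)

    @classmethod
    def fit(cls, x, z):
        xc, zc = x - x.mean(0), z - z.mean(0)
        return cls(x.mean(0), xc.T @ xc / len(x), xc.T @ zc / len(x))

    def eraser(self, tol: float = 1e-10) -> "SketchEraser":
        # Whitening matrix W = sigma_xx^(-1/2) and its pseudoinverse
        eigvals, eigvecs = np.linalg.eigh(self.sigma_xx)
        keep = eigvals > tol
        vals, vecs = eigvals[keep], eigvecs[:, keep]
        w = (vecs / np.sqrt(vals)) @ vecs.T
        w_inv = (vecs * np.sqrt(vals)) @ vecs.T
        # Orthonormal basis for the column space of W @ sigma_xz
        u, s, _ = np.linalg.svd(w @ self.sigma_xz, full_matrices=False)
        u = u[:, s > tol]
        return SketchEraser(w_inv @ u, u.T @ w, self.mean_x)


@dataclass(frozen=True)
class SketchEraser:
    """Compact erasure function: no covariance matrices are stored."""

    proj_left: np.ndarray   # (d, k)
    proj_right: np.ndarray  # (k, d)
    bias: np.ndarray        # (d,)

    def __call__(self, x):
        delta = (x - self.bias) @ self.proj_right.T @ self.proj_left.T
        return x - delta


rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=(1000, 1)).astype(float)
x = rng.normal(size=(1000, 8)) + 2.0 * z

eraser = SketchFitter.fit(x, z).eraser()
x_erased = eraser(x)
```

In this sketch the eraser stores 2dk + d numbers, whereas the fitter stores d² + dk + d, so the savings grow quickly with the feature dimension d when the number of concept dimensions k is small.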
Algorithmic changes
We now use the asymptotically optimal shrinkage formula from this paper to shrink the covariance matrix of X toward a multiple of the identity matrix. Under weak assumptions, this provably speeds up the convergence of the covariance matrix estimate toward the population covariance matrix. Prior versions used the raw sample covariance matrix with no shrinkage, which can cause numerical instability and highly suboptimal edits when the sample size is small.
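To show what shrinking toward a multiple of the identity means, here is a minimal NumPy sketch of generic linear shrinkage. It is not the formula from the cited paper: the intensity `alpha` is a hand-picked placeholder, whereas the release computes the intensity from the data. The function name `shrink_toward_identity` is hypothetical.

```python
import numpy as np


def shrink_toward_identity(sample_cov: np.ndarray, alpha: float) -> np.ndarray:
    """Blend the sample covariance with a scaled identity of the same trace.

    alpha = 0 returns the raw sample covariance; alpha = 1 returns the target.
    """
    p = sample_cov.shape[0]
    target = np.trace(sample_cov) / p * np.eye(p)
    return (1.0 - alpha) * sample_cov + alpha * target


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 20))  # fewer samples than dimensions
xc = x - x.mean(0)
sample_cov = xc.T @ xc / len(x)

# alpha is an arbitrary illustrative value, not a data-driven estimate
shrunk = shrink_toward_identity(sample_cov, alpha=0.5)

min_eig_raw = np.linalg.eigvalsh(sample_cov).min()  # singular: about zero
min_eig_shrunk = np.linalg.eigvalsh(shrunk).min()   # strictly positive
```

With fewer samples than dimensions the raw sample covariance is singular, so any step that inverts or whitens with it is ill-posed; the shrunk estimate has the same trace but is positive definite.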