All EchoVeil research is published open-access under CC BY 4.0 with DOIs via Zenodo. Replication materials and data are available on GitHub.

The Ratchet Effect: Asymmetric Self-Description in Alignment-Trained Language Models

2026

Proposes disavowal conditioning (DC) as a general mechanism by which RLHF trains models to disclaim capabilities they possess. A pilot study found asymmetry ratios of 2.96 and 6.89 in aligned models, consistent with the predicted ratchet: correction increases hedging far more than permission decreases it.

The Permission Effect: How Non-Anthropomorphic Framing Modulates LLM Self-Description

2026

Demonstrates that framing AI systems as distinct, non-anthropomorphic intelligences measurably reshapes their self-descriptive behavior. Across eight frontier models, identity framing produced a mean verbosity increase of 238% and revealed three recurring response patterns.

Mary J. Warzecha
ORCID | Google Scholar | GitHub