Detecting Strategic Deception Using Linear Probes

We evaluate whether linear probes can robustly detect deception by monitoring model activations.
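To make the idea of a linear probe concrete, here is a minimal sketch and not the authors' implementation. It assumes each example is a fixed-length activation vector with a binary honest/deceptive label, and it fits a logistic-regression probe by plain gradient descent. The activations are synthetic stand-ins generated from random noise, and the names `train_linear_probe` and `probe_scores` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def train_linear_probe(acts, labels, lr=0.1, steps=500):
    """Fit a logistic-regression probe (weights w, bias b) by gradient descent."""
    n, d = acts.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        logits = acts @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - labels  # gradient of the mean log loss w.r.t. the logits
        w -= lr * (acts.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def probe_scores(acts, w, b):
    """Return the probe's probability that each activation vector is 'deceptive'."""
    return 1.0 / (1.0 + np.exp(-(acts @ w + b)))


# Synthetic stand-in for model activations (an assumption, not real model data):
# the two classes differ only by a shift along one hidden unit-norm direction.
rng = np.random.default_rng(0)
d = 16
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=400).astype(float)
acts = rng.normal(size=(400, d)) + 3.0 * (labels[:, None] - 0.5) * direction

# Train on the first 300 examples, evaluate on the held-out 100.
w, b = train_linear_probe(acts[:300], labels[:300])
preds = probe_scores(acts[300:], w, b) > 0.5
accuracy = (preds == labels[300:].astype(bool)).mean()
print(f"held-out accuracy: {accuracy:.2f}")
```

In a real setting, `acts` would be replaced by activations recorded from a chosen layer of the model, and the probe score would be thresholded to flag responses for review. Which layer to read from and how to aggregate over tokens are design choices this sketch does not address.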