A few thoughts on this (very interesting) mechanistic interpretability research:
LLM concepts gain meaning from what they’re linked with. “Consciousness” is a central node that links ethics & cognition, connecting to concepts like moral worthiness, dignity, and agency. If LLMs are x.com/plinz/status/1…