Tag: interpretability

All the articles with the tag "interpretability".

Goodfire's $150M Breakthrough: AI Models Already Know When They Hallucinate

18 Feb, 2026
• 1 min read

Turns out LLMs often know when they're hallucinating – this startup uses that insight to slash errors on the scale of a GPT-4-to-GPT-5 jump.

New Research Lights Up Hidden Racial Bias in Healthcare LLMs – And How to Zap It

21 Jan, 2026
• 1 min read

Sparse autoencoders just exposed how LLMs sneak race into medical advice – a dev must-fix before regulators notice.
