As new large language models, or LLMs, are rapidly developed and deployed, existing methods for evaluating their safety and discovering potential vulnerabilities quickly become outdated. To identify safety issues before they impact critical applications, Johns Hopkins researchers have developed a renewable and sustainable framework for evaluating LLMs that simplifies different types of attacks into high-quality, easily updatable safety tests—all while requiring minimal human effort to run.
New ‘renewable’ benchmark streamlines LLM jailbreak safety tests with minimal human effort
Tech News
-
HighlightsFree Dark Web Monitoring Stamps the $17 Million Credentials Markets
-
HighlightsSmart buildings: What happens to our free will when tech makes choices for us?
-
AppsScreenshots have generated new forms of storytelling, from Twitter fan fiction to desktop film
-
HighlightsDarknet markets generate millions in revenue selling stolen personal data, supply chain study finds
-
SecurityPrivacy violations undermine the trustworthiness of the Tim Hortons brand
-
Featured HeadlinesWhy Tesla’s Autopilot crashes spurred the feds to investigate driver-assist technologies – and what that means for the future of self-driving cars

