
a16z crypto published a post on leading AI security solutions and AI security in the era of vibe coding. @peachmint shared it with me, so I read it. The AI security solutions it covers fall into three broad categories:

1. Solutions from AIxCC (DARPA's AI Cyber Challenge)
• These focus on automating fuzzing; most integrate existing tools with AI rather than introducing entirely new approaches. Fine-tuned models are used to patch vulnerabilities after detection.

2. Google's Big Sleep
• An agent that mimics the workflow of a human security researcher. It primarily discovers memory-safety vulnerabilities in C code and proves them with AddressSanitizer.
• It can only detect vulnerabilities, not patch them. Google's in-development CodeMender project is expected to address this.

3. OpenAI's Aardvark
• Rather than focusing on bug detection, it is positioned more as a reasoning-based assistant for human researchers.

The post concludes that in the current era of vibe coding, programs have inconsistent code and varied security practices, making it hard to apply existing security systems uniformly. AI security systems in particular frequently hallucinate when identifying and patching vulnerabilities. Even so, it argues that AI will be the tool that solves this problem, and that special-purpose models and agent systems will mature over time.

The discussion was more theoretical than I expected, which was a bit disappointing. Since AIxCC came up, I'd like to go over the approaches and recent developments of its finalists. Read on if you're bored.
1st place: Team Atlanta's Atlantis
• A joint team from Georgia Tech, Samsung Research, KAIST, and POSTECH
• Fuzzing + symbolic execution + a fine-tuned proprietary model
• Uses agents with different strategies per language and per stage
• github.com/Team-Atlanta/aixcc-...

2nd place: Trail of Bits' Buttercup
• Traditional fuzzing tools (e.g., libFuzzer) + non-reasoning LLMs => highly cost-effective
• github.com/trailofbits/butterc...

3rd place: Theori's RoboDuck
• Relies on modern LLM code analysis rather than traditional binary-analysis techniques; the traditional techniques serve as a fallback
• Reproduces the workflow of a human security researcher
• Theori is developing a commercial security product, Xint Code, from it
• github.com/theori-io/aixcc-afc...

4th place: All You Need Is A Fuzzing Brain's Fuzzing Brain
• Like RoboDuck, it leans on LLMs, with fuzzing as a fallback
• Runs 23 different LLM strategies in parallel
• github.com/o2lab/afc-crs-all-y...

5th place: Shellphish's Artiphishell
• A joint team from UC Santa Barbara, Arizona State University, and Purdue
• Developed GrammarGuy, specialized for fuzzing complex input formats, which evolves its generated grammars based on coverage feedback to the LLM
• A pipeline connecting static analysis, dynamic analysis, triage, and patching

6th place: 42-b3yond-6ug's BugBuster
• A joint team from Northwestern University and others
• Fuzzing-focused vulnerability detection
• Ranked second in number of vulnerabilities found, but a low patch success rate dragged down its final ranking

7th place: Lacrosse, from US defense contractor SIFT
• Modernizes a 10-year-old legacy system
• Fuzzing + symbolic reasoning
• github.com/siftech/afc-crs-lac...

Original post: a16zcrypto.com/posts/article/a...
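Fuzzing Brain's run-many-strategies-in-parallel idea can be sketched minimally. The strategy functions below are hypothetical stand-ins (a real system would send different prompts to different LLMs; here each "strategy" just greps for one risky pattern) — only the fan-out-and-merge shape is the point:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def make_strategy(name: str, pattern: str):
    """Hypothetical strategy factory: a real one would wrap an LLM
    prompt; this toy version flags code containing one risky pattern."""
    def run(code: str) -> list[tuple[str, str]]:
        return [(name, pattern)] if pattern in code else []
    return run

def run_strategies(code: str, strategies) -> list[tuple[str, str]]:
    """Fan all strategies out in parallel and merge their findings."""
    findings: list[tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=max(1, len(strategies))) as pool:
        futures = [pool.submit(s, code) for s in strategies]
        for fut in as_completed(futures):
            findings.extend(fut.result())
    return sorted(findings)  # deterministic order for downstream triage

strategies = [
    make_strategy("s1-unsafe-copy", "strcpy"),
    make_strategy("s2-unsafe-read", "gets("),
    make_strategy("s3-format-string", "printf(user"),
]
code = "gets(buf); strcpy(dst, src);"
print(run_strategies(code, strategies))
# → [('s1-unsafe-copy', 'strcpy'), ('s2-unsafe-read', 'gets(')]
```

The appeal of the pattern is that strategies are independent, so weak ones cost little while any single hit is still collected; with 23 LLM strategies instead of 3 greps, the merge-and-triage step carries most of the engineering weight.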
