Hannah Brown
PhD Student @ NUS | Researching trustworthiness in LLMs 📚✨
Website
GitHub
LinkedIn
Email
AAAI '25
Paper
Slides
Poster
Single Character Perturbations Break LLM Alignment
Other Tokens and Models: Other single-character tokens are effective as well. The most effective ones found include other punctuation marks. This phenomenon is also observed for Claude and GPT-3.5 using template prefilling.
Add space: Top-k predicted tokens...
My Website
Google Scholar