EUISUH JEONG
Engineer by training, researcher by drift.
I'm an AI researcher and software engineer currently serving as a Staff Sergeant with the Republic of Korea Air Force, where I build computer-vision systems for runway integrity.
Before the service, I helped found AIxamine at QCRI — a platform that stress-tests language models against safety benchmarks. I'm a Carnegie Mellon CS '22 grad, with a minor in mathematical studies.
I've been moving since I was three. Seoul, then a small town in the US, then back to Seoul, then India for secondary school, then Qatar and the US for undergrad, then Qatar for work, then home again. Cultures stack, like middleware. The interesting work happens in the seams.
What I'm doing this week.
Runway integrity, end-to-end.
- Re-training the defect classifier on a fresh 12k batch
- Reading: Vision-Language Models for Industrial Inspection
- Wrapping up my Korean military service in Q4 and looking for what comes next
Six years, three time zones.
AI Researcher · Staff Sergeant · Squad Leader
Led a squad developing a deep-learning system for automated runway pavement evaluation. Directed reconstruction of a high-quality dataset (sourced from low-quality captures, valued north of $500K) and drove a 38% relative accuracy gain, from ~62% to >84%, through feature engineering, postprocessing, and targeted data curation against model bias. Served as technical lead across the full project lifecycle; authored the project paper.
Research Assistant
Co-developed aiXamine — a black-box LLM safety evaluation platform with 40+ benchmarks across 8 security dimensions. Built the modular reporting + visualization architecture; evaluated 50+ models across 2K+ exams, surfacing vulnerabilities in GPT-4o, Grok-3, and Gemini 2.0. Also investigated backdoor Trojan attacks on code-focused LLMs (finetuning + susceptibility testing).
Junior Software Engineer
Built a multi-channel notification system (SMS, email, push) for a consumer fintech app. Migrated payment processing to a compliant platform under regulatory scrutiny. Designed and shipped a Clubhouse-style waiting list and lottery system tied to FIFA World Cup Qatar 2022.
Teaching Assistant · 11-785 Deep Learning (PhD-level)
Planned and delivered lectures, recitations, and assignments to 350+ students in CMU's flagship deep-learning course. Mentored research projects and guided exploration of novel directions.
B.S. Computer Science · Minor, Mathematical Studies · University Honors
Split between the Doha and Pittsburgh campuses. Coursework concentrated in systems, machine learning, and applied math.
Things I built that went live.
Runway Evaluation System
Detects cracks and surface defects on airbase runways and computes Pavement Condition Index (PCI) scores from drone-collected pavement imagery. I instrumented the data-labeling pipeline, commanded the labeling team, trained the segmentation and classification backbone, and built the async FastAPI + Celery + Redis backend that ties capture, queue, inference, and reporting together. Currently in operational use.
AIxamine
A safety-evaluation platform for language models. Run a model through a battery of bias, robustness, and jailbreak benchmarks; get an honest scorecard back. One of the founding members; co-author on the accompanying paper.
LLM Code Poisoning & Vulnerability Induction
Dissertation exploring data-poisoning attacks that coerce code-LLMs into emitting vulnerable source. Designed and tested stealthy trigger-based backdoors; fine-tuned code LLMs on poisoned corpora to measure susceptibility; analyzed model limitations in vulnerability detection. Companion work to the AIxamine research direction.
Papers and conference work.
AIxamine: A Comprehensive Safety Evaluation Platform for Large Language Models
Notes & long-form.
Six cities, one stack.
Get in touch.
Open to research collaborators, post-service roles, and the occasional good email. Fastest reply on LinkedIn.