[Remote] AI Safety Research Intern (PhD)

Remote, USA Full-time

Note: This is a remote position open to candidates in the USA.

Centific is focused on advancing AI safety and responsible AI development. As a Ph.D. Research Intern, you will conduct high-impact experiments and contribute to the security guarantees of AI systems through innovative research and practical implementations.

Responsibilities

- Advance AI Safety: Design, implement, and evaluate attack and defense strategies for LLM jailbreaks (prompt injection, obfuscation, narrative red teaming).
- Evaluate AI Behavior: Analyze and simulate human-AI interaction patterns to uncover behavioral vulnerabilities, social engineering risks, and over-defensive vs. permissive response tradeoffs.
- Agentic AI Security: Prototype workflows for multi-agent safety (e.g., agent self-checks, regulatory compliance, defense chains) that span perception, reasoning, and action.
- Benchmark & Harden LLMs: Create reproducible evaluation protocols/KPIs for safety, over-defensiveness, adversarial resilience, and defense effectiveness across diverse models (including latest benchmarks and real-world exploit scenarios).
- Deploy and Monitor: Package research into robust, monitorable AI services using modern stacks (Kubernetes, Docker, Ray, FastAPI); integrate safety telemetry, anomaly detection, and continuous red-teaming.
- Jailbreaking Analysis: Systematically red-team advanced LLMs (GPT-4o, GPT-5, LLaMA, Mistral, Gemma, etc.), uncovering novel exploits and defense gaps.
- Multi-turn Obfuscation Defense: Implement context-aware, multi-turn attack detection and guardrail mechanisms, including countermeasures for obfuscated prompts (e.g., StringJoin, narrative exploits).
- Agent Self-Regulation: Develop agentic architectures for autonomous self-check and self-correct, minimizing risk in complex, multi-agent environments.
- Human-Centered Safety: Study human behavior models in adversarial contexts: how users probe, trick, or manipulate LLMs, and how defenses can adapt without excessive over-defensiveness.

Skills

- Ph.D. student in CS/EE/ML/Security (or related); actively publishing in AI Safety, NLP robustness, or adversarial ML (ACL, NeurIPS, BlackHat, IEEE S&P, etc.).
- Strong Python and PyTorch/JAX skills; comfort with toolkits for language models, benchmarking, and simulation.
- Demonstrated research in at least one of: LLM jailbreak attacks/defense, agentic AI safety, human-AI interaction vulnerabilities.
- Proven ability to go from concept → code → experiment → result, with rigorous tracking and ablation studies.
- Experience in adversarial prompt engineering and jailbreak detection (narrative, obfuscated, sequential attacks).
- Prior work on multi-agent architectures or robust defense strategies for LLMs.
- Familiarity with red-teaming, synthetic behavioral data, and regulatory safety standards.
- Scalable training and deployment: Ray, distributed evaluation, CI/telemetry for defense protocols.
- Public code artifacts (GitHub) and first-author publications or strong open-source impact.

Benefits

- Comprehensive healthcare, dental, and vision coverage
- 401k plan
- Paid time off (PTO)
- And more!

Company Overview

Zero distance innovation for GenAI creators and industries. Expertly engineering platforms and curating multimodal, multilingual data, we empower the ‘Magnificent Seven’ and enterprise clients with safe, scalable AI deployment. We are a team of over 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. Centific was founded in 2020 and is headquartered in Redmond, Washington, USA, with a workforce of 5001-10000 employees.

H1B Sponsorship

Centific has a track record of offering H1B sponsorships, with 10 in 2025, 22 in 2024, and 14 in 2023. Please note that this does not guarantee sponsorship for this specific role.

Similar Jobs

Applications Engineer I

Canada Immigration Law Clerk - Associate - Vancouver

[Remote] GenAI PhD Applied Scientist Intern - Oracle Cloud Infrastructure (OCI)

[Remote] Medicare Sales Field Agent - Lake Charles, LA

Nursing Informatics Summer Clinical Intern

[Remote] Financial Analyst (Remote)

[Remote] 2026 Summer Internship Program: Pharmacovigilance (PV) Operations Intern

Master's Machine Learning Internship Summer Term 2026 (Toronto)

Clinical Pharmacology and Quantitative Science Intern (Programming/Computer Science)

2026 Summer Internship Program: Oncology Computational Biology Intern

Sales Manager - SMB Outbound Sales

Experienced Remote Chat Support Agent – Delivering Exceptional Customer Service and Earning $20/hr with arenaflex

Experienced Customer Service Representative – Work from Home Opportunity with Arsenault

Experienced Customer Service Representative – Retail Sales and Store Operations Support at blithequark

Remote Pilot Operator at ZMA

Experienced Tier 1 Support Specialist (Remote - Customer Service) – Clinical Communications and Scheduling Expert

Experienced Part-Time Data Entry Specialist – Remote Work Opportunities with Flexible Hours and Competitive Pay Rates

Experienced Remote Customer Service Representative – Deliver Exceptional Blithequark Customer Experience

Experienced Part-Time Airbnb Host Property Assistant for Luxury Vacation Rentals in Tybee Island, GA – Excellent Customer Service and Property Management Skills Required

Experienced Overnight Customer Service Representative – Remote 3rd Shift Loan Approval and Customer Support
