AI Researcher/ Engineer (LLM)

Pantheon Lab Limited
Data Science
Central and Western District, Hong Kong
Posted 7 days ago
Full-time
On-site
Technology, Information and Media
Job Description
About The Role

We are seeking an experienced AI Researcher/ Engineer (LLM) to manage the design, deployment, and optimization of production-grade language model systems. This role involves building applications using both commercial LLM APIs and self-hosted open-source models, implementing RAG pipelines, and creating end-to-end LLM workflows. The ideal candidate combines practical experience integrating LLM APIs with technical expertise in deploying and optimizing local models.

Key Responsibilities
• Design and implement high-throughput, low-latency serving architectures for LLM applications
• Build and maintain RAG pipelines and end-to-end LLM workflows
• Integrate and optimize commercial LLM APIs (OpenAI, Anthropic, Google, etc.) into production systems
• Develop prompt engineering techniques and prompt management systems
• Deploy and serve local open-source language models for specific use cases
• Optimize local model inference performance through efficient serving frameworks
• Fine-tune models to improve performance on domain-specific tasks (commercial or
• Monitor and troubleshoot production LLM systems to ensure reliability
• Research and experiment with emerging models and techniques to improve system capabilities
• Document architectures, best practices, and technical decisions
• Collaborate with engineering teams to integrate LLM capabilities into products
• Communicate technical terms and recommendations to stakeholders
• Build evaluation frameworks to measure model quality, latency, cost, and user satisfaction
• Design intelligent routing and fallback strategies across multiple LLM providers
• Scale LLM services to handle production workloads efficiently
• Implement caching, batching, and request optimization strategies for both APIs and local models

Experience

Required Qualifications
• 2+ years of software engineering experience, with at least 2 of those years focused on LLM applications
• Proven track record of building production LLM applications
• Experience integrating and optimizing commercial LLM APIs
• Hands-on experience deploying local models in production environments

Technical Skills
• Strong Python programming with emphasis on async/await patterns and production-quality code
• Deep understanding of transformer architectures and LLM fundamentals
• Experience with LLM APIs (OpenAI, Anthropic Claude, Google Gemini, or similar)
• Familiarity with local open-source models (Qwen, Llama, Mistral, or similar)
• Experience with RAG implementation using LlamaIndex or similar frameworks
• Proficiency with FastAPI for building high-performance APIs
• Experience with vector databases (Pinecone, Weaviate, Chroma, Milvus, or similar)
• Working knowledge of MongoDB or other NoSQL databases
• Experience with Docker containerization and deployment
• Nice to have: hands-on fine-tuning experience (LoRA, QLoRA, full fine-tuning)
• Familiarity with local model serving frameworks (vLLM, TGI, or similar)
• Familiarity with Hugging Face ecosystem and transformer libraries
• Experience with cloud platforms (AWS, GCP, or Azure)
• Proficiency with Git/GitHub and version control workflows

Domain Knowledge
• Understanding of prompt engineering and optimization techniques
• Knowledge of LLM evaluation metrics and benchmarking methodologies
• Experience with cost optimization for LLM applications
• Familiarity with distributed computing and scaling strategies
• Understanding of LLM inference optimization (quantization, batching, caching)

Preferred Qualifications
• Understanding of digital human technologies and multimodal applications
• Knowledge of MLOps practices and CI/CD for ML systems
• Experience with Kubernetes for container orchestration
• Experience with streaming inference and real-time applications
• Background in function calling and tool use with LLMs
• Familiarity with RLHF (Reinforcement Learning from Human Feedback)
• Experience with model distillation and knowledge compression
• Understanding of distributed training and GPU optimization
• Experience with multi-agent systems and LLM orchestration

Soft Skills & Communication
• Excellent English communication skills (written and verbal)
• Ability to explain complex technical concepts to both technical and non-technical audiences
• Strong problem-solving and analytical thinking capabilities
• Self-motivated with ability to work independently and drive projects to completion
• Collaborative team player who thrives in fast-paced environments
• Passion for staying current with rapidly evolving LLM technologies
• Ability to balance research experimentation with production reliability requirements