At Emphie, we’re not just building AI—we’re engineering intelligence that performs at the speed of thought. From real-time AI applications to ultra-optimized LLM pipelines, we help global clients push the boundaries of what's possible with high-performance, low-latency machine learning.
We're expanding our team and looking for a highly specialized AI Engineer to join our elite group of engineers. If you're driven by microsecond optimizations, multi-agent systems, and real-world AI deployment at scale—this one's for you.
- Design, train, and optimize deep learning models (e.g., CNNs, Transformers) for ultra-low-latency applications (<10 ms response time).
- Optimize large language models and foundation models (e.g., LLaMA, served with engines such as vLLM) for edge or high-throughput environments using quantization and pruning, and build Retrieval-Augmented Generation (RAG) pipelines around them.
- Build multi-agent AI systems using orchestration frameworks such as LangChain and LangGraph with optimized runtime execution.
- Develop and manage real-time embedding and vector search pipelines using FAISS, ScaNN, or Milvus.
- Deploy models at scale using platforms like Azure ML, Databricks, or Kubernetes-based environments.
- Programming: Expert-level Python, with experience in performance-critical programming using C++ and/or CUDA. Deep knowledge of pandas, NumPy, and scikit-learn.
- Deep Learning Frameworks: Strong experience with PyTorch or TensorFlow.
- Optimization Techniques: Mastery of quantization (e.g., INT8/FP16), pruning, knowledge distillation, and model graph optimization.
- Latency Profiling: Advanced skills in profiling and optimizing inference runtimes; experience with NVIDIA Nsight or similar tools.
- NLP & LLMs: Proven experience deploying LLMs such as LLaMA, working with model formats and serving engines such as GGUF and vLLM, and integrating Transformer acceleration techniques.
- Vector Databases: Familiarity with FAISS, ScaNN, or Milvus for low-latency vector search.
- Deployment & Infrastructure: Hands-on experience with CI/CD for ML, containerization (Docker), and GPU monitoring using Triton Inference Server metrics, Prometheus, or equivalent.
- The opportunity to work on some of the most technically demanding AI challenges in production environments.
- Access to cutting-edge infrastructure and tools to help you deliver your best work.
- Long-term professional growth path with opportunities to specialize or lead advanced AI initiatives.
- Diverse projects with international clients in domains such as autonomous systems, real-time personalization, and industrial AI.
- 100% remote work with flexible hours.
Are you ready to build AI systems that deliver intelligence in real time? Join us and redefine what’s possible with Emphie Solutions.