AI Engineer – Real-Time & High-Performance Systems

Emphie Solutions

6 384 - 9 310 USD/month

Net per month - B2B

Type of work

Full-time

Experience

Senior

Employment Type

B2B

Operating mode

Remote

Tech stack

Polish

English

LLM

advanced

AI

advanced

Databases

advanced

Deep Learning

advanced

Cloud

regular

Generative AI

regular

Job description

Online interview

Join the frontlines of AI performance at Emphie Solutions!

At Emphie, we’re not just building AI—we’re engineering intelligence that performs at the speed of thought. From real-time AI applications to ultra-optimized LLM pipelines, we help global clients push the boundaries of what's possible with high-performance, low-latency machine learning.

We're expanding our team and looking for a highly specialized AI Engineer to join our elite group of engineers. If you're driven by microsecond optimizations, multi-agent systems, and real-world AI deployment at scale—this one's for you.

What You’ll Do

Design, train, and optimize deep learning models (e.g., CNNs, Transformers) for ultra-low-latency applications (<10 ms response time).
Optimize large language models and foundation models (e.g., LLaMA served with vLLM) for edge or high-throughput environments using quantization and pruning, and build Retrieval-Augmented Generation (RAG) pipelines around them.
Build multi-agent AI systems using orchestration frameworks such as LangChain and LangGraph with optimized runtime execution.
Develop and manage real-time embedding and vector search pipelines using FAISS, ScaNN, or Milvus.
Deploy models at scale using platforms like Azure ML, Databricks, or Kubernetes-based environments.

What We Expect

Programming: Expert-level Python, with experience in performance-critical programming using C++ and/or CUDA. Deep knowledge of pandas, NumPy, and scikit-learn.
Deep Learning Frameworks: Strong experience with PyTorch or TensorFlow.
Optimization Techniques: Mastery of quantization (e.g., INT8/FP16), pruning, knowledge distillation, and model graph optimization.
Latency Profiling: Advanced skills in profiling and optimizing inference runtimes; experience with NVIDIA Nsight or similar tools.
NLP & LLMs: Proven experience deploying LLMs such as LLaMA, working with model formats and serving engines such as GGUF and vLLM, and integrating Transformer acceleration techniques.
Vector Databases: Familiarity with FAISS, ScaNN, or Milvus for low-latency vector search.
Deployment & Infrastructure: Hands-on experience with CI/CD for ML, containerization (Docker), and GPU monitoring using Triton Inference Server metrics, Prometheus, or equivalent.

What We Offer

The opportunity to work on some of the most technically demanding AI challenges in production environments.
Access to cutting-edge infrastructure and tools to help you deliver your best work.
Long-term professional growth path with opportunities to specialize or lead advanced AI initiatives.
Diverse projects with international clients in domains such as autonomous systems, real-time personalization, and industrial AI.
100% remote work with flexible hours.

Are you ready to build AI systems that deliver intelligence in real time? Join us and redefine what’s possible with Emphie Solutions.
