    AI Engineer – Real-Time & High-Performance Systems

    6 384 - 9 310 USD/month, net per month (B2B)
    Type of work: Full-time
    Experience: Senior
    Employment type: B2B
    Operating mode: Remote

    Tech stack

      Polish: C2
      English: B2
      LLM: advanced
      AI: advanced
      Databases: advanced
      Deep Learning: advanced
      Cloud: regular
      Generative AI: regular

    Job description

    Online interview

    Join the frontlines of AI performance at Emphie Solutions!


    At Emphie, we’re not just building AI—we’re engineering intelligence that performs at the speed of thought. From real-time AI applications to ultra-optimized LLM pipelines, we help global clients push the boundaries of what's possible with high-performance, low-latency machine learning.


    We're expanding our team and looking for a highly specialized AI Engineer to join our elite group of engineers. If you're driven by microsecond optimizations, multi-agent systems, and real-world AI deployment at scale—this one's for you.


    What You’ll Do


    • Design, train, and optimize deep learning models (e.g., CNNs, Transformers) for ultra-low-latency applications (<10 ms response time).
    • Optimize large language models and foundation models (e.g., LLaMA, served with engines such as vLLM) for edge or high-throughput environments using quantization and pruning, and build Retrieval-Augmented Generation (RAG) pipelines around them.
    • Build multi-agent AI systems using orchestration frameworks such as LangChain and LangGraph with optimized runtime execution.
    • Develop and manage real-time embedding and vector search pipelines using FAISS, ScaNN, or Milvus.
    • Deploy models at scale using platforms like Azure ML, Databricks, or Kubernetes-based environments.


    What We Expect


    • Programming: Expert-level Python, with experience in performance-critical programming using C++ and/or CUDA. Deep knowledge of pandas, NumPy, and scikit-learn.
    • Deep Learning Frameworks: Strong experience with PyTorch or TensorFlow.
    • Optimization Techniques: Mastery of quantization (e.g., INT8/FP16), pruning, knowledge distillation, and model graph optimization.
    • Latency Profiling: Advanced skills in profiling and optimizing inference runtimes; experience with NVIDIA Nsight or similar tools.
    • NLP & LLMs: Proven experience deploying LLMs such as LLaMA, working with model formats and serving engines such as GGUF and vLLM, and integrating Transformer acceleration techniques.
    • Vector Databases: Familiarity with FAISS, ScaNN, or Milvus for low-latency vector search.
    • Deployment & Infrastructure: Hands-on experience with CI/CD for ML, containerization (Docker), and GPU monitoring using Triton Metrics, Prometheus, or equivalent.


    What We Offer


    • The opportunity to work on some of the most technically demanding AI challenges in production environments.
    • Access to cutting-edge infrastructure and tools to help you deliver your best work.
    • Long-term professional growth path with opportunities to specialize or lead advanced AI initiatives.
    • Diverse projects with international clients in domains such as autonomous systems, real-time personalization, and industrial AI.
    • 100% remote work with flexible hours.


    Are you ready to build AI systems that deliver intelligence in real-time? Join us and redefine what’s possible with Emphie Solutions.

    Please note that the data controller is Emphie Solutions sp. z o.o., with its registered office in Gliwice, ul. Dolnych Wałów 13/3LU...
