ML Quantization Engineer

Permanent position, full-time · Dresden

About the Role
We’re SEMRON, a venture-backed startup focused on redefining AI hardware for edge devices. If you’re deep into quantization and enjoy working at the intersection of machine learning and hardware, we’d like to hear from you. In this role, you will be responsible for building a highly scalable inference framework for our future chip generations. You will take part in fundamental architectural decisions and have the opportunity to contribute to upstream open-source projects.
What you will do:
  • Develop and maintain an inference framework that’s tightly tuned for SEMRON hardware.
  • Collaborate directly with ML, compiler, and hardware teams to refine and adapt quantization algorithms for our specific needs.
  • Apply and extend the latest quantization methods, such as AdaRound, BRECQ, GPTQ, and QuaRot, bringing fresh ideas to SEMRON’s approach.
What you should bring:
  • Solid skills in PyTorch and experience with torch.fx, plus the know-how to write efficient custom CUDA kernels.
  • A solid understanding of current quantization research and hands-on experience with techniques that improve inference performance.
Helpful but not required:
  • Experience with state-of-the-art neural network compression methods such as AdaRound, QDrop, QuIP, or GPTQ.
  • Experience with tools commonly used in ML environments, such as Hugging Face Transformers or DeepSpeed.
We look forward to hearing from you!