Machine Learning / AI Engineer (RL)

Offer summary

(Summary generated by AI based on the full job description)

The project focuses on developing advanced RL environments and evaluation systems for AI agents, with the goal of enabling safe AGI. Required skills include LangChain, LangGraph, and mcp-server, along with experience in RL environment engineering and machine learning. Responsibilities include designing RL environments, building task pipelines, developing reward models, and collaborating with infrastructure teams on scalability and telemetry.

you can start ASAP

Company: ACAISOFT POLAND Sp. z o.o.

from: 18 May 2026

to: 17 June 2026

160-240 zł net (+ VAT) / hr., B2B contract (full-time)

Salary details

basic salary

Offer parameters

level: mid • senior

working mode: remote

Warszawa, Mokotów, Aleja Niepodległości 18

Requirements

Expected technologies

Python

Optional technologies

Reinforcement Learning

Operating system

macOS

Linux

Our requirements

5+ years of overall experience in the IT industry.
Minimum 3 years in Machine Learning, Environment Engineering, or Data Science roles.
Practical knowledge of AI frameworks (LangChain, LangGraph, mcp-server).
Extensive practical experience in working with AI, including prompt engineering and vibe coding.
Experience in working with business requirements (analysis, summarizing, responding to changes).
Expertise in planning your own work or that of a small team.
Availability to work 2 p.m. - 10 p.m.

Optional

Knowledge of Codex or Claude Code.
Experience in integrating AI with a system would be an advantage.
Understanding of RL concepts - reward modeling, environment dynamics, verifiability, evaluation, and agent interaction loops.
Familiarity with instrumentation, metrics, and data pipelines for RL evaluation.

Your responsibilities

Design and implement RL environments that support large-scale agent evaluation and reinforcement learning experiments.
Build task generation pipelines, dynamic datasets, and scripted environments with controlled complexity and stochasticity.
Develop verifiers and reward models to automatically score trajectories and evaluate model reasoning.
Collaborate with infrastructure and systems engineers to ensure environments are scalable, reproducible, and instrumented for detailed telemetry.
Design APIs and orchestration frameworks for running, resetting, and evaluating agents across environments.
Optimize environment performance, logging, and reward reproducibility across distributed setups.
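
To give a rough sense of what these responsibilities can look like in code, here is a minimal Python sketch of a seeded environment, a verifier, and an evaluation loop. It is an illustration only, not the client's actual stack: `ArithmeticEnv`, `verify`, `oracle_agent`, and `evaluate` are hypothetical names invented for this example, and the task (adding two integers) is a toy stand-in for real agent tasks.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str
    reward: float
    done: bool


class ArithmeticEnv:
    """Hypothetical toy environment: the agent must answer a sum."""

    def __init__(self, max_operand: int = 9):
        self.max_operand = max_operand
        self._answer = None

    def reset(self, seed: int) -> str:
        # A private, seeded RNG keeps task generation reproducible.
        rng = random.Random(seed)
        a = rng.randint(0, self.max_operand)
        b = rng.randint(0, self.max_operand)
        self._answer = a + b
        return f"{a} + {b}"

    def step(self, action: str) -> StepResult:
        # Single-step episode: score the answer and finish.
        reward = verify(action, self._answer)
        return StepResult(observation="", reward=reward, done=True)


def verify(action: str, expected: int) -> float:
    """Return 1.0 for an exactly correct integer answer, else 0.0."""
    try:
        return 1.0 if int(action.strip()) == expected else 0.0
    except ValueError:
        return 0.0


def oracle_agent(observation: str) -> str:
    """Stand-in agent that parses the task and answers correctly."""
    left, right = observation.split(" + ")
    return str(int(left) + int(right))


def evaluate(env, agent, seeds) -> float:
    """Run one episode per seed and return the mean reward."""
    rewards = []
    for seed in seeds:
        obs = env.reset(seed)
        result = env.step(agent(obs))
        rewards.append(result.reward)
    return sum(rewards) / len(rewards)
```

Calling `evaluate(ArithmeticEnv(), oracle_agent, range(5))` runs five seeded episodes and averages the verifier scores. Because each episode is fully determined by its seed, the same seeds produce the same tasks and rewards on any worker, which is the property that reward reproducibility across distributed setups depends on.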

About the project

You will be cooperating with a leading provider of AI evaluation and optimization solutions, trusted by multinational companies to optimize AI agents and detect performance issues in large language models.

In this role, you’ll help develop advanced reinforcement learning (RL) environments and scalable evaluation systems that guide and shape the behavior of cutting-edge AI models. The company’s mission is to enable safe, verifiable, and aligned AGI through rigorous, real-world agent evaluation.

Due to the client’s time zone, we would appreciate a candidate who can work 2 p.m. - 10 p.m.

Join us and make a real impact!

If you’re ready to broaden your horizons and work with an innovative company at the forefront of AI, we’d love to hear from you. You’ll help build the environments that shape how future AI systems are trained, evaluated, and aligned - and collaborate with world-class engineers and researchers on one of the most important technical challenges of our time.

This is how we organize our work

This is how we work

at the client's site
you focus on a single project at a time
you can change the project
you focus on product development
agile

This is how we work on a project

documentation
issue tracking tools
testing environments

Benefits

sharing the costs of sports activities
private medical care
remote work opportunities
flexible working time
integration events
extra social benefits
baby layette
school layette
employee referral program
charity initiatives
Gift vouchers for kids (birthdays, Christmas, Child's Day)

Recruitment stages

1. HR call (max 15 min.)
2. Technical skills assessment via discussion of a case study
3. Technical interview with our client (max 30 min.)*

ACAISOFT POLAND Sp. z o.o.

At Acaisoft we specialize in cloud-native application development and transformations from legacy to cloud-native environments.

We provide end-to-end software solutions, from business analysis, through project evaluation, to UI/UX, frontend, and backend design and implementation. We apply best practices in manual and automated QA to make sure that the final product is top-notch.

Our customers range from startups to large enterprises based in the US, mainly Silicon Valley, and Western Europe.

Since technology is constantly being developed at such a fast pace, we always strive to be one step ahead of the market and keep up with the latest solutions.
