Software Engineer - Data Infrastructure & Acquisition

Confidential company · Berlin · Posted May 20, 2026

Public summary

Join a leading AI-driven tech company focused on text-to-speech technology to help millions read and learn better. This fully remote role in Berlin involves building and operating large-scale audio data ingestion pipelines on Google Cloud, collaborating closely with AI scientists and leadership to power next-generation AI models and products. Ideal candidates have 5+ years of software engineering experience, proficiency with Linux, Python, Docker, and cloud infrastructure, and a passion for building impactful AI data solutions.

Location and work setup

Location: Berlin
Remote status: Unknown
German requirement signal: No German Required Detected
Detected job language: English

Responsibilities

Develop and operate cloud-based data ingestion pipelines for large-scale audio datasets; seek and integrate new audio data sources; manage infrastructure using Terraform on GCP; collaborate with AI scientists to optimize cost, throughput, and quality of datasets; contribute to the AI team's dataset roadmap to support consumer and enterprise product development.

Qualifications

Bachelor’s, Master’s, or PhD in Computer Science or a related field; 5+ years of experience in software development; strong skills in Python and bash scripting on Linux; experience with Docker, Infrastructure-as-Code, and Google Cloud Platform; knowledge of web crawlers and large-scale data workflows is advantageous; excellent communication skills; adaptable and able to manage multiple priorities.

Skills

software development, Python, bash scripting, Linux, Docker, Infrastructure as Code, cloud infrastructure, Google Cloud Platform (GCP), data ingestion pipelines, web crawlers, large-scale data processing, collaboration, communication skills