Public summary
Join a leading AI-driven tech company focused on text-to-speech technology to help millions read and learn better. This fully remote role in Berlin involves building and operating large-scale audio data ingestion pipelines on Google Cloud, collaborating closely with AI scientists and leadership to power next-generation AI models and products. Ideal candidates have 5+ years of software engineering experience; proficiency with Linux, Python, Docker, and cloud infrastructure; and a passion for building impactful AI data solutions.
Location and work setup
- Location: Berlin
- Remote status: Unknown
- German requirement signal: No German Required Detected
- Detected job language: English
Responsibilities
Develop and operate cloud-based data ingestion pipelines for large-scale audio datasets; identify and integrate new audio data sources; manage infrastructure using Terraform on GCP; collaborate with AI scientists to optimize the cost, throughput, and quality of datasets; contribute to the AI team's dataset roadmap to support consumer and enterprise product development.
Qualifications
Bachelor’s, Master’s, or PhD in Computer Science or a related field; 5+ years of experience in software development; strong skills in Python and Bash scripting on Linux; experience with Docker, Infrastructure-as-Code, and Google Cloud Platform; knowledge of web crawlers and large-scale data workflows is advantageous; excellent communication skills; adaptable and able to manage multiple priorities.