Remote Software Engineer, Data Infrastructure & Acquisition
The mission of Speechify is to make sure that reading is never a barrier to learning.
Over 50 million people use Speechify's text-to-speech products to turn whatever they're reading (PDFs, books, Google Docs, news articles, websites) into audio, so they can read faster, read more, and remember more. Speechify's text-to-speech reading products include its iOS app, Android App, Mac App, Chrome Extension, and Web App. Google recently named Speechify the Chrome Extension of the Year and Apple named Speechify its 2025 Design Award winner for Inclusivity.
Today, nearly 200 people around the globe work on Speechify in a 100% distributed setting; Speechify has no office. These include frontend and backend engineers, AI research scientists, and others from Amazon, Microsoft, and Google, leading PhD programs like Stanford, high-growth startups like Stripe, Vercel, Bolt, and many founders of their own companies.
Overview
We're hiring for the Data side of our AI team at Speechify. This role is responsible for all aspects of data collection to support our model training operations. We are able to build high-quality datasets at petabyte scale and low cost through a tight integration of infrastructure, engineering, and research work. We are looking for a skilled Software Engineer to join us.
What You'll Do
Be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline.
Operate and extend the cloud infrastructure for our ingestion pipeline, currently running on GCP and managed with Terraform.
Collaborate closely with our Scientists to shift the cost/throughput/quality frontier, delivering richer data at bigger scale and lower cost to power our next-generation models.
Collaborate with others on the AI Team and Speechify Leadership to craft the AI Team's dataset roadmap to power Speechify's next-generation consumer and enterprise products.
An Ideal Candidate Should Have
BS/MS/PhD in Computer Science or a related field.
5+ years of industry experience in software development.
Proficiency with bash/Python scripting in Linux environments.
Proficiency in Docker and Infrastructure-as-Code concepts, plus professional experience with at least one major cloud provider (we use GCP).
Experience with web crawlers and large-scale data processing workflows is a plus.
Ability to handle multiple tasks and adapt to changing priorities.
Strong communication skills, both written and verbal.