We are looking for a data engineer, preferably with some audio processing experience, to join our ML team. You will transform raw data into formats that machine learning models can consume. The role involves building infrastructure to capture the data streams that flow into our servers and collate them. You'll also work with our data scientists to explore statistical methods. In short, you'll own the data collation platform.
To succeed in this data engineering position, you should be able to build a platform and automate services that collate and organize data from different sources. You are a strong programmer with attention to detail and good analytical skills. Knowledge of audio processing would be a huge bonus.
About Viva
Viva Translate is seeking passionate individuals to join our team of ML researchers & engineers from top institutions such as Google and Stanford. We are creating a world where language is no longer a barrier to work and opportunity, and we are starting across Latin America.
Viva is building an AI tool that helps people read, write, and speak better across English, Spanish, and Portuguese. If you share our vision for a borderless future and are a motivated builder, idealist, and explorer, we would love to hear from you.
What you'll be doing
Analyze and organize raw audio
Build data systems and pipelines
Prepare data for predictive modeling
Explore ways to enhance data quality, reliability, and security
Develop analytical tools
Collaborate with data scientists & architects, and human transcribers & translators
Our ML tech stack includes Python/Django, AWS, CI/CD, Terraform, BERT, and spaCy
Must-have skills 💪
1+ years of experience as a data engineer or in a similar role
Knowledge of programming languages (e.g., Python)
Hands-on experience with SQL database design (e.g. Postgres)
Effective communication with team members of diverse technical backgrounds
Nice to haves 🍒
Degree in Computer Science, IT, or a similar field
Experience using project management tools (e.g., GitHub, Asana)
Experience in handling audio streaming data
Prior experience building data platforms
Experience with cloud providers (e.g., AWS, GCP, Azure)
Experience with distributed/streaming data-processing technologies and frameworks (e.g. Scala, Apache Spark, Databricks, Apache Kafka, Redpanda, CockroachDB)
Fluent in Spanish and/or Portuguese
Our values 💛
Continuous growth - we take pride in setting new standards and taking ownership of our work
Data-driven - we make decisions together based on data and logic
Open integrity - we promote a low ego environment that treasures transparency, empathy, and feedback
Play - we champion diversity and creativity, and want everyone to be unafraid to fail & to fail quickly
What we offer ✨
Fully remote team 🌎
3+ in-person retreats annually (past locations include Mexico, Colombia & Ecuador)
Join an early-stage startup (12 people & growing 🚀)
Home office stipend
Health & fitness benefits
Learning stipend - we are here to support your personal & professional development journey