Staff / Principal Machine Learning Engineer, Speech - USA
Why Join Inworld
Inworld is the best-funded startup in AI and gaming with a $500 million valuation and backing from top-tier investors like Intel, Microsoft, Lightspeed, Bitkraft, Founders Fund, Kleiner Perkins, and more. Inworld was named to the CB Insights list of the 100 most promising AI companies in the world. We’ve also been nominated alongside Anthropic, DeepMind, OpenAI, and Nvidia for Generative AI Innovator of the Year at the VentureBeat Awards 2023, and were named a Gartner Cool Vendor in 2023.
Inworld is the leading character engine for creating AI NPCs in games and immersive entertainment. Inworld powers NPCs in experiences built by Niantic, NetEase Games, LG, Alpine Electronics, the Disney Accelerator, and more. We go beyond large language models (LLMs) to add multimodal orchestration of personality and contextual awareness that renders NPCs within the lore and logic of their worlds.
Inworld is well positioned to take a long-term view: supporting the developer community today while staying ahead of the curve in the ever-evolving landscape of generative AI. By joining us now, you'll step into a role where your ideas and efforts directly influence our path forward, making this an extraordinary moment to become a key player in shaping the future of AI and gaming.
We are seeking Staff and Principal Machine Learning Speech Engineers with extensive experience in R&D of text-to-speech (TTS) and speech-to-text (STT) technologies. You will be at the forefront of building a generative AI stack to power next-generation AI characters.
Minimum Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a similar technical field.
- 6+ years of experience with software development in one or more programming languages, and with machine learning algorithms and tools (e.g., PyTorch), artificial intelligence, deep learning, and/or natural language processing.
Preferred Qualifications
- Master's degree or PhD in speech synthesis/recognition or adjacent fields.
- 5+ years of experience designing, architecting, testing, and launching software products.
- 1 year of experience sourcing and curating speech datasets.
- 1 year of experience in a technical leadership role leading project teams and setting technical direction.
Responsibilities
- Research and experiment with cutting-edge ML techniques for TTS and STT applications.
- Develop and test production-grade training and inference pipelines for TTS and STT applications.
- Understand optimization problems in the area of speech, signals, and natural language processing.
In-office location: Mountain View, CA, United States.
Remote location: United States.
The US base salary range for this full-time position is $220,000 - $350,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.