Staff/Principal Machine Learning Engineer, Speech - USA

Inworld AI

Software Engineering
Mountain View, CA, USA
Posted on Tuesday, July 9, 2024

Why Join Inworld

Inworld is the best-funded startup in AI and games, with a $500 million valuation and backing from top-tier investors such as Intel, Microsoft, Lightspeed, Bitkraft, Founders Fund, Kleiner Perkins, and more. Inworld was recognized by CB Insights as one of the 100 most promising AI companies in the world and was nominated alongside Anthropic, DeepMind, OpenAI, and Nvidia for Generative AI Innovator of the Year at the VentureBeat Awards 2023.

Inworld is the leading AI engine for games, enabling developers to build groundbreaking game mechanics, dynamic NPCs and worlds that evolve with each action. Inworld powers experiences built by Ubisoft, NVIDIA, Niantic, NetEase Games and LG, among others, and has partnerships with key industry players such as Microsoft/Xbox, Epic Games, and Unity.

We are seeking Staff and Principal Machine Learning Speech Engineers with extensive R&D experience in text-to-speech (TTS) and speech-to-text (STT) technologies. In this role, you will be at the forefront of building the generative AI stack that powers next-generation AI characters.

Minimum Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a similar technical field.
  • 6+ years of experience with software development in one or more programming languages, machine learning algorithms and tools (e.g., PyTorch), artificial intelligence, deep learning and/or natural language processing.
  • Excellent problem-solving skills and the ability to work both independently and as part of a team.

Preferred Qualifications

  • Master's degree or PhD in speech synthesis/recognition or adjacent fields.
  • 5+ years of experience with design and architecture, and testing/launching software products.
  • 1+ years of experience sourcing and curating speech datasets.
  • 1+ years of experience in a technical leadership role leading project teams and setting technical direction.
  • 1+ years of experience in building end-to-end speech processing systems and real-time applications.

Responsibilities

  • Research and experiment with cutting-edge ML techniques for TTS and STT applications.
  • Develop and test production-grade training and inference pipelines for TTS and STT applications.
  • Understand optimization problems in the area of speech, signals, and natural language processing.
  • Collaborate with cross-functional teams to integrate speech technologies into products.

In-office location: Mountain View, CA, United States.

Remote location: United States.

The US base salary range for this full-time position is $240,000 - $385,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.
