Presentation
Performance Modeling Engineer · Groq · Remote, USA
Session
Job Postings
Description
Mission:
Model the performance of Groq systems on state-of-the-art AI/ML workloads, identify bottlenecks early, and guide future hardware development of the most advanced AI accelerator on the market.
Responsibilities & opportunities in this role:
Develop and maintain performance models for multiple generations of Groq hardware on the latest AI/ML workloads (LLMs, CNNs, LSTMs, etc.)
Analyze AI/ML algorithms to understand their compute, networking and memory requirements, and map them effectively onto the underlying hardware architecture
Lead a matrixed team to enable SW/HW co-optimization across chip, system and software teams
Identify performance bottlenecks and help drive next-generation chip architecture through a solid understanding of Groq's software and hardware
Work with silicon and system integration engineers to evaluate the costs & benefits of new technologies on Groq systems
Provide what-if scenarios / continuous guidance directly to CEO & senior leadership
Develop the Design Space Exploration (DSE) tool for performance analysis and exploration of both chip and system across various workloads
Define custom hardware solutions for high-profile customers
Ideal candidates have/are:
Computer science, mathematics, ECE or equivalent background and/or experience in this domain
Strong fundamentals in computer architecture; deep knowledge of, and experience working on, domain-specific AI architectures is highly preferred
In-depth understanding of latest AI/ML algorithms and their hardware implications
Ability to analyze and simplify complex hardware designs into simple abstracted timing models
Past experience modeling AI/ML workloads and creating the necessary tools for performance optimization. Experience modeling LLM performance is beneficial but not required
Proficient in programming languages such as C/C++ and Python
Experience with cycle-accurate simulators for benchmarking analysis
Experience developing ASIC microarchitecture designs is a plus
Experience understanding and simulating RTL (SystemVerilog) designs is a plus
Attributes of a Groqster:
Humility - Egos are checked at the door
Collaborative & Team Savvy - We make up the smartest person in the room, together
Growth & Giver Mindset - Learn it all versus know it all, we share knowledge generously
Curious & Innovative - Take a creative approach to projects, problems, and design
Passion, Grit, & Boldness - No-limit thinking, fueling informed risk-taking
Company Description
Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. From our Bay Area roots to our growing global presence, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.

Event Type
Job Posting
Time
Monday, 17 November 2025
4:54pm - 4:55pm CST
Location
Hall 6
United States of America
Groq
Remote
Full Time
Permanent
