Reward Functions in AI PCB Design

Hardware Rich Development

Published

Written by

Workbench

Cody Stetzel

Marketing Consultant

Read the Full Series

This article is one part of a walkthrough detailing how we recreated an NXP i.MX 8M Mini–based computer using Quilter’s physics-driven layout automation.

Overview: Project Speedrun

Part 1: Preparing the Design

Part 2: Compiling the Design

Part 3: Cleaning Up the Design

Part 4: Testing & Validation

Part 5: Building Firmware for an AI-Laid-out Computer: From Boot to Google Meet

Most AI products begin with language. Quilter begins with physical constraint. Printed circuit boards are not paragraphs waiting to be generated; they are dense physical systems where geometry, current, impedance, thermal behavior, routing legality, fabrication limits, and component relationships all press against one another. Good layout is not judged by plausibility. Good layout is judged by whether a board can survive contact with physics.

AI PCB design becomes serious when it moves beyond surface-level automation. Layout tools need to search through possible configurations, score trade-offs, avoid obstacles, respect geometric constraints, and improve candidate generation over time. Reward functions sit at the center of that process because they define what the system should value. Across conversations with Fariz Rahman, Osman Romero, and Katie Scholl, a shared Quilter worldview emerges: difficult engineering problems are not solved by hiding complexity; they are solved by making complexity executable, inspectable, and measurable.

Code Becomes Ground Truth When Physics Reviews the Work

“For me, code is the ground truth… if I read a research paper and I just tell myself that okay, I understood it, you know, I have no proof that I understood it. But if I can like, you know, fire up my code IDE and… prototype it, you know, that would usually expose any holes in my understanding.”

Engineering knowledge becomes real when an idea survives implementation. Research papers, whiteboard diagrams, and architectural debates can all create the feeling of comprehension, but executable code exposes hidden assumptions. In AI PCB design, hidden assumptions are especially dangerous because software abstractions eventually become copper, dielectric, vias, copper pours, and traces. A model can seem elegant until geometry, manufacturing tolerances, and electrical behavior begin pushing back.

“In the real world you have real numbers and in the computer world we have like a fixed number of pixels… But for [layout] we had to do that… But then you had to actually go back to the real world and print the board. So you have a two way transformation.”

PCB automation depends on translation between representations: schematic intent, board geometry, component footprints, computational grids, routed paths, manufacturable coordinates, and physical behavior. Each translation can distort information. A coordinate transform can introduce edge cases. A discretization scheme can simplify reality in ways that matter later. Fariz’s point is simple but profound: only implementation reveals whether the abstraction held its shape.
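The two-way transformation in the quote can be made concrete with a small sketch. It assumes a uniform routing grid; the grid pitch, function names, and pad coordinates are invented for illustration and are not taken from Quilter's code. The point is only that snapping to a grid and mapping back does not return the original coordinates.

```python
# Hypothetical sketch: snap real-world board coordinates (mm) onto a uniform
# routing grid and map them back. GRID_PITCH_MM and all names are illustrative.
GRID_PITCH_MM = 0.1


def to_grid(x_mm: float, y_mm: float, pitch: float = GRID_PITCH_MM) -> tuple[int, int]:
    """Real-world millimetres -> nearest integer grid point."""
    return round(x_mm / pitch), round(y_mm / pitch)


def to_board(col: int, row: int, pitch: float = GRID_PITCH_MM) -> tuple[float, float]:
    """Integer grid point -> real-world millimetres."""
    return col * pitch, row * pitch


def round_trip_error(x_mm: float, y_mm: float, pitch: float = GRID_PITCH_MM) -> float:
    """Distance between the original point and its grid-snapped version."""
    gx, gy = to_board(*to_grid(x_mm, y_mm, pitch), pitch)
    return ((gx - x_mm) ** 2 + (gy - y_mm) ** 2) ** 0.5


pad = (12.34, 7.86)              # an off-grid pad centre, made up for the example
cell = to_grid(*pad)             # where the router "sees" the pad
error = round_trip_error(*pad)   # how far the pad moved by going there and back
```

On a square grid the snapping error is bounded by the pitch divided by the square root of two. A margin smaller than that error can look satisfied on the grid and still be violated in real coordinates, which is the kind of hole that only shows up once the transform is actually implemented.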

Reward Functions Turn Engineering Judgment Into System Behavior

“They’re not on the AI hype because everyone is writing an LLM wrapper… In this case, old school reinforcement learning to solve a very hard problem.”

Osman Romero draws a sharp line between AI hype and engineering utility. PCB layout does not need a system that merely sounds like an engineer. PCB layout needs a system that can explore physical design possibilities, evaluate trade-offs, learn from failure, and improve its next attempt. Reinforcement learning gives that process a technical frame because the agent acts, the environment responds, the outcome is scored, and future behavior changes.
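The act, respond, score, update loop can be shown in its smallest form with an epsilon-greedy bandit, which is a much simpler technique than a full reinforcement learning system. Everything in this sketch is an assumption made for illustration: the three strategy names, their success rates, and the 0/1 reward do not describe how Quilter's agent works.

```python
import random

# Hypothetical toy environment: three routing "strategies" with fixed, made-up
# probabilities of completing a route.
SUCCESS_RATE = {"hug_edge": 0.2, "shortest_first": 0.5, "congestion_aware": 0.8}


def run_episode(strategy: str, rng: random.Random) -> float:
    """Environment responds: reward 1.0 if the route completes, else 0.0."""
    return 1.0 if rng.random() < SUCCESS_RATE[strategy] else 0.0


def train(steps: int = 5000, epsilon: float = 0.1, seed: int = 0) -> dict[str, float]:
    rng = random.Random(seed)
    value = {s: 0.0 for s in SUCCESS_RATE}   # running reward estimate per strategy
    count = {s: 0 for s in SUCCESS_RATE}
    for _ in range(steps):
        # Agent acts: usually exploit the best-known strategy, sometimes explore.
        if rng.random() < epsilon:
            action = rng.choice(list(SUCCESS_RATE))
        else:
            action = max(value, key=value.get)
        reward = run_episode(action, rng)     # the outcome is scored
        count[action] += 1
        # Future behavior changes: incremental-mean update of the estimate.
        value[action] += (reward - value[action]) / count[action]
    return value


estimates = train()
best = max(estimates, key=estimates.get)
```

The agent is never told which strategy is good. It only sees scored outcomes, and its preference shifts toward whatever the environment rewards, which is why the definition of the reward matters so much.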

A reward function can encourage completed routes, shorter paths, cleaner topology, improved obstacle avoidance, better manufacturability, or stronger adherence to geometric constraints. Poorly designed reward functions can create shallow success, such as layouts that connect nets but produce fragile routing or downstream congestion. Stronger reward functions encode a richer definition of quality. In robust optimization, good does not mean the candidate scored well on one isolated metric; good means the candidate remains viable when multiple constraints collide.
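A minimal sketch of the difference between a shallow and a richer reward follows. The metric names, weights, and the two example layouts are hypothetical; choosing real terms and weights is exactly the engineering judgment the paragraph describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutMetrics:
    """Hypothetical summary of one candidate layout."""
    nets_total: int
    nets_routed: int
    trace_length_mm: float
    via_count: int
    clearance_violations: int


# Illustrative weights only; they are assumptions, not tuned values.
W_COMPLETION = 10.0
W_LENGTH = 0.01
W_VIA = 0.05
W_VIOLATION = 5.0


def completion_only(m: LayoutMetrics) -> float:
    """Shallow reward: fraction of nets connected, nothing else."""
    return m.nets_routed / m.nets_total


def reward(m: LayoutMetrics) -> float:
    """Richer reward: completion minus penalties for length, vias, violations."""
    return (
        W_COMPLETION * m.nets_routed / m.nets_total
        - W_LENGTH * m.trace_length_mm
        - W_VIA * m.via_count
        - W_VIOLATION * m.clearance_violations
    )


# Two made-up candidates: one connects everything badly, one is nearly complete
# but clean.
sloppy = LayoutMetrics(nets_total=40, nets_routed=40, trace_length_mm=900.0,
                       via_count=60, clearance_violations=3)
careful = LayoutMetrics(nets_total=40, nets_routed=39, trace_length_mm=520.0,
                        via_count=22, clearance_violations=0)
```

Under `completion_only` the sloppy layout wins because it connects every net. Under `reward` the careful layout wins because the penalties outweigh the one unrouted net. Neither ranking is correct in the abstract; the weights decide what the system is taught to value.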

“Having it being like the more reinforcement learning type approach… I was just like, oh, this is very grounded. This is stuff that I can really sink my teeth into.”

Grounded is the key word. Reinforcement learning matters here because PCB layout has measurable consequences. Routes either satisfy constraints or violate them. Components either fit into physical space or create collisions. Traces either preserve the design’s electrical intent or introduce risk. Reward functions give an AI PCB design system a way to learn from those consequences instead of generating board-shaped guesswork.

Candidate Generation Makes Hardware Trade-Offs Visible

Candidate generation changes the tempo of PCB design. Traditional layout often advances through expensive, sequential iteration: place components, route the board, review the result, catch problems, revise, simulate, fabricate, test, and learn. Each loop costs time. Each design path also hides alternatives that might have worked better under a different set of constraints.

AI-assisted candidate generation creates a broader field of comparison. One candidate may route cleanly but consume more area. Another may satisfy dense placement goals while increasing congestion. A third may preserve cleaner signal paths while forcing compromise in power distribution or component placement. Engineers already reason through these trade-offs, but automation can make more of them visible at once.
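One common way to make such trade-offs visible is a Pareto filter, which keeps every candidate that no other candidate beats on all objectives at once. The sketch below is an assumption about how that comparison could be expressed, not a description of Quilter's method; the objectives and numbers are invented.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    area_mm2: float     # lower is better
    congestion: float   # lower is better (hypothetical 0..1 score)
    si_risk: float      # lower is better (hypothetical 0..1 score)


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if a is no worse on every objective and better on one."""
    pairs = list(zip(a[1:], b[1:]))   # skip the name field
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


candidates = [
    Candidate("clean_but_large", 2400.0, 0.20, 0.30),
    Candidate("dense", 1800.0, 0.65, 0.40),
    Candidate("clean_signals", 2100.0, 0.35, 0.15),
    Candidate("worse_everywhere", 2500.0, 0.70, 0.45),
]
front = [c.name for c in pareto_front(candidates)]
```

Only the candidate that loses on every axis is discarded. The remaining three represent genuinely different compromises, and choosing among them is left to the engineer.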

Reward functions and candidate generation belong together. Candidate generation expands the search space. Reward functions guide the search toward better outcomes. Physics checks, design rules, and human review help determine whether apparent progress is real. In strong AI PCB design workflows, the machine does not replace judgment; it gives judgment more serious options to evaluate.
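The division of labor described above (generate broadly, reject what is illegal, rank what remains, hand a shortlist to review) can be sketched in a few lines. This is a toy under stated assumptions: random placement stands in for a real generator, a body-overlap test stands in for design-rule checks, and Manhattan distance between part centers stands in for a real reward. The board size, parts, and nets are made up.

```python
import random
from itertools import combinations

BOARD_W, BOARD_H = 50.0, 40.0                                      # mm, illustrative
SIZES = {"U1": (10.0, 10.0), "C1": (2.0, 1.0), "J1": (8.0, 4.0)}   # width, height
NETS = [("U1", "C1"), ("U1", "J1")]                                # connected pairs

Placement = dict[str, tuple[float, float]]   # part -> lower-left corner (mm)


def generate(rng: random.Random) -> Placement:
    """Candidate generation: drop each part at a random in-bounds position."""
    return {part: (rng.uniform(0, BOARD_W - w), rng.uniform(0, BOARD_H - h))
            for part, (w, h) in SIZES.items()}


def is_legal(p: Placement) -> bool:
    """Hard check: no two component bodies overlap."""
    for a, b in combinations(p, 2):
        (ax, ay), (aw, ah) = p[a], SIZES[a]
        (bx, by), (bw, bh) = p[b], SIZES[b]
        if ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah:
            return False
    return True


def centre(p: Placement, part: str) -> tuple[float, float]:
    (x, y), (w, h) = p[part], SIZES[part]
    return x + w / 2, y + h / 2


def score(p: Placement) -> float:
    """Soft reward: negative total Manhattan distance between connected parts."""
    total = 0.0
    for a, b in NETS:
        (ax, ay), (bx, by) = centre(p, a), centre(p, b)
        total += abs(ax - bx) + abs(ay - by)
    return -total


rng = random.Random(42)
pool = [generate(rng) for _ in range(200)]      # expand the search space
legal = [p for p in pool if is_legal(p)]        # physics and rules gate progress
shortlist = sorted(legal, key=score, reverse=True)[:3]   # passed on to human review
```

The legality check is a gate rather than a penalty term, so no score can buy back an overlap. The shortlist is plural on purpose: the output is a set of options for a reviewer, not a single answer.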

Geometric Constraints Are Never Just Geometry

Geometric constraints may sound like a spatial problem, but PCB geometry carries electrical meaning. Pads, vias, keepouts, mounting holes, copper pours, traces, board edges, component bodies, and differential pairs all exist as shapes, but their placement affects performance. A trace is not just a line. A via is not just a hole. A clearance rule is not just a measurement.
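Even the measurement side of a clearance rule takes some care to state precisely. The sketch below checks the copper-to-copper gap between a round via pad and a straight trace, treating the trace as a segment with rounded ends. The function names, dimensions, and the 0.15 mm minimum are assumptions chosen for the example, not values from any real rule set.

```python
import math

Point = tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:                      # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def clearance_ok(via_centre: Point, via_diameter: float,
                 seg_a: Point, seg_b: Point, trace_width: float,
                 min_clearance: float) -> bool:
    """True if the gap between via copper and trace copper meets the minimum."""
    centre_distance = point_segment_distance(via_centre, seg_a, seg_b)
    gap = centre_distance - via_diameter / 2 - trace_width / 2
    return gap >= min_clearance


# A 0.2 mm wide trace along the x axis and a 0.6 mm via placed beside it.
trace = ((0.0, 0.0), (10.0, 0.0))
far_enough = clearance_ok((5.0, 0.6), 0.6, *trace, trace_width=0.2, min_clearance=0.15)
too_close = clearance_ok((5.0, 0.5), 0.6, *trace, trace_width=0.2, min_clearance=0.15)
```

The check returns a yes or no, but it says nothing about why the minimum exists or what happens electrically as the gap approaches it, which is the sense in which the rule is more than a measurement.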

Obstacle avoidance in PCB design is therefore more complex than steering around a wall. A route may avoid one object while creating routing congestion somewhere else. A placement may satisfy clearance while damaging downstream routability. A compact layout may look efficient while making thermal behavior, return paths, or manufacturability worse. Physical space on a board is not empty territory; it is electrically and mechanically loaded terrain.
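The "steering around a wall" baseline is worth seeing, because it shows what a purely geometric router does and does not consider. The sketch uses breadth-first search on a grid, a classic maze-routing approach that is named here as an illustration rather than as Quilter's method. The grid and its blocked cells are made up.

```python
from collections import deque

# '#' marks a blocked cell (keepout, pad, existing copper); 'S' and 'T' are the
# two pins to connect. The layout is invented for the example.
GRID = [
    "S..#....",
    "...#.##.",
    "...#..#.",
    ".....#..",
    "..##...T",
]


def find(ch: str) -> tuple[int, int]:
    """Locate the first cell containing ch as (row, col)."""
    for r, row in enumerate(GRID):
        c = row.find(ch)
        if c != -1:
            return r, c
    raise ValueError(f"{ch!r} not on grid")


def route(start: tuple[int, int], goal: tuple[int, int]):
    """Breadth-first search: a shortest 4-connected path, or None if blocked."""
    prev = {start: None}          # visited cells mapped to their predecessor
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:   # walk predecessors back to the start
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            in_bounds = 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0])
            if in_bounds and GRID[nr][nc] != "#" and (nr, nc) not in prev:
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


path = route(find("S"), find("T"))
```

The search finds a shortest legal path for this one net and stops. It has no notion of the channel it just consumed, the nets that still need to cross it, or the return path of the signal, which is why obstacle avoidance alone cannot stand in for layout quality.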

Reward functions need to account for that terrain. A system that rewards only connection completion can learn to produce legal but undesirable layouts. A system that rewards compactness without enough awareness of routing feasibility can teach itself to create bottlenecks. Stronger optimization requires multiple signals: legality, connectivity, clearance, manufacturability, electrical quality, and resilience under design change. Power supply layout illustrates the point: current paths, thermal behavior, and placement choices cannot be treated as separate problems.

The Art of PCB Layout Is Tacit Knowledge Under Constraint

“All of them said… this is practically impossible… there’s a lot of manual work that goes into it. Science, it’s an art.”

Experienced engineers often use the word “art” to describe knowledge that has not been fully formalized. Some layout requirements are explicit: clearance rules, design-rule checks, net classes, board boundaries, and fabrication limits. Other requirements live in judgment: where sensitive components belong, when symmetry matters, how routing cleanliness affects future changes, and which legal choices will become painful two revisions later. Calling that work an art is not an argument against automation; it is a warning against shallow automation.

Quilter’s opportunity is to translate more of that tacit knowledge into systems that can search, score, and learn. Reward functions are one part of that translation. Candidate generation is another. Geometric constraints, obstacle avoidance, and physics-aware evaluation give the system a stronger relationship to real board behavior. Human expertise remains essential because someone still has to define what better means.

Diagrams, Code, and Shared Context Make Complexity Usable

“I made a diagram for the way some of our logic is structured and some of our data structures and some of the processes… the process of getting it laid out, it just helped me a whole bunch figuring it out how I could… fit in.”

Diagrams perform a similar function to code. Both turn private understanding into shared structure. In complex engineering systems, clarity often comes from externalizing the model: drawing it, coding it, testing it, reviewing it, and revising it. Mental models become more useful when other people can inspect them.

AI PCB design requires that kind of shared inspection. Algorithm designers need visibility into physical constraints. Product engineers need visibility into user workflows. Engineers working on candidate generation need feedback from people who understand why one board candidate is merely legal while another is genuinely useful. Reward functions improve only when the team can discuss what the system is actually learning.

The Mountain Range Model of Engineering Culture

“Instead of creating a silo… I think what is being made here are more like mountain ranges where there might be six or seven peaks… you can see each other peak to peak and you can… yell back and forth.”

Mountain ranges are a better metaphor than flat collaboration. Serious engineering teams need peaks of expertise. One person may understand reinforcement learning, another may understand routing infrastructure, another may understand PCB manufacturing, and another may understand the user’s design workflow. Excellence requires height, but coordination requires visibility between peaks.

Quilter’s Hardware Rich Development work depends on that kind of visibility. Reward functions cannot be separated from domain judgment. Candidate generation cannot be separated from layout evaluation. Obstacle avoidance cannot be separated from electrical and mechanical consequences. A strong AI PCB design culture needs specialists who can see far enough across the range to understand how their decisions shape the whole system.

“As people’s changes are coming through, I’m also going on little rabbit holes to figure out the surrounding context… going spelunking into our code base.”

Spelunking is more than onboarding. It is how engineers map hidden terrain. Codebases contain product decisions, technical compromises, modeling assumptions, abandoned paths, and hard-won lessons. In a company building AI for physical design, understanding that terrain is part of the engineering work.

Iteration Requires Discipline More Than Personality

“I would write code on paper and… by the weekend I have code that I can actually, you know, run on my computer.”

Scarcity forced precision. Each attempt had weight because feedback was delayed. Modern development environments make iteration faster, but speed alone does not create better engineering. Useful iteration requires attention to what failed, why it failed, and what the next attempt should learn. Rapid iteration in hardware becomes valuable when every loop increases the team’s understanding of the system.

“The way you do one thing is how you do everything.”

Such discipline matters for reward functions, candidate generation, and robust optimization. Engineers need to notice when a metric is teaching the wrong lesson. Teams need to inspect whether generated candidates are improving in meaningful ways or simply exploiting weak scoring. Review culture becomes part of the optimization loop because humans refine the system’s definition of success. Katie captures this human version of iteration when she describes taking change sets as an opportunity to learn about a particular part of the code.

Why Reward Functions Matter for the Future of AI PCB Design

Reward functions are not just a machine learning mechanism. In AI PCB design, reward functions become a formal expression of engineering priorities. A board layout system must learn which trade-offs matter, which violations are unacceptable, which candidates deserve attention, and which forms of apparent progress are actually fragile. Good reward functions help the system distinguish between layout completion and layout quality.

Candidate generation expands the number of possible futures a hardware team can consider. Geometric constraints keep those futures attached to physical reality. Obstacle avoidance prevents naïve routes from becoming unusable routes. Robust optimization helps the system prefer candidates that can survive messy engineering trade-offs rather than candidates that win a narrow scoring game. Together, these capabilities point toward a more serious form of autonomous PCB design.

Quilter’s strongest claim is not that AI makes hardware easy. A more credible claim is that hard hardware problems can be searched more intelligently, evaluated more rigorously, and iterated more quickly. Fariz’s implementation-first mindset protects the work from abstraction drift. Osman’s reinforcement learning framing protects it from AI hype. Katie’s mountain range metaphor protects it from shallow collaboration language.

PCB layout has always been a negotiation with constraint. Copper records decisions. Signals reveal assumptions. Heat exposes shortcuts. Manufacturing punishes ambiguity. Reward functions matter because they help an AI system learn from those realities instead of pretending they do not exist.
