In mid-July, Oath crossed a major development milestone: design lock. This means that all of the core systems are no longer subject to any change. I still have some work on the solo game ahead of me and want to take the cards through another round of editing, but the game is getting very close to being fully done.
At this point, the game has been in full-time development for almost a year. For reference, Root was in full-time development for about six months. In addition, Oath has involved many more staff members than Root. Purely as a measure of labor, Oath has received perhaps three or four times the attention that we directed to any other project the studio has ever undertaken. In the course of this process, especially in these final months, I've found my own position has often felt more like a director than a designer. That is, I spend most of my days coordinating several simultaneous efforts and making sure each individual element harmonizes with the larger project. This is a far cry from the early days of the project where I spent most days just locked in my office typing away.
I don't want to pretend that this is entirely a good thing. The change in my responsibilities and the broader scope of this project has altered the kinds of threats that Oath faces. Just because a game receives a lot of attention (and money) doesn't mean that it's going to be any better for it. In fact, if triple-A video games teach us anything, it's that very often these things lead to a game becoming overwrought and disingenuous. Over the past year I've done my best to navigate around those obstacles and lean into the strengths of our process. For one, I've tried to forget the success of Root and attempt to just make something that rang true for myself and the team. I don't have any sales metrics I want the game to hit or any particular accolades that I'm gunning for. It's enough to just try to realize this project on its own terms. Back when I was working on the second edition of Pax Pamir, I kept a similar mindset. I kept reminding myself not to be bothered by the folks who wanted more things like Root or the folks who wanted another version of the first edition. I'm immensely happy with how that project turned out, and anyway, it's really the only way I know how to make games.
My second strategy for dealing with the size and scope of Oath is simply to be completely transparent wherever possible. Just about as soon as I have something to share about the game, I try to put it out in the open. Presenting the design-in-progress has always helped me understand my own work, but the exposure also allows me to see if what I'm valuing in the design is resonating and what parts are ringing false. I realize this sounds a little contradictory, and perhaps it is. But, while making something for yourself, I think it's important to get the occasional gut check and make sure you're not undertaking some stupid vanity project. If writers and designers can be their best readers/players, they can also sometimes be their worst.
I haven't written much about Oath since the quarantine began. There has simply been too much happening in the wider world, and I didn't want to clutter people's feeds with a long piece on game design as there were many more important things to talk about. I do think games are important, but everything has its place. I also haven't had a spare moment to step back and consider the tremendous work of the past few months. At this point, I don't think I would have any trouble assembling a whole book that chronicles the long history of Oath's design and development. I have been completely overwhelmed by the care and attention of the staff here at Leder Games and of our testers, for whom our little Discord became a place of stability and encouragement during these difficult months. I am especially grateful for the heroic efforts of my development staff Nick Brachmann and Josh Yearsley, the graphics work of Pati Hyun, and the support of Kyle Ferrin, whose illustrations continuously challenge me to do better work.
I could spend a long time enumerating the contributions of each of these people and even longer discussing the many fine suggestions and productive arguments I've had with the game's other testers. But, I imagine such a long list would get kind of boring to you all, so instead I want to talk to you about a single system in the game and some of the attention it received over the past month or two. My hope is that it will give you all a good sense of how we work as a team while also explaining the story behind a critical late-stage change to the core of the design.
Part 1: Shifting Timetables
Oath is a big project. Just in terms of assets, I would guess it is perhaps six times the size of Root. Even though our team is larger, we have plenty of work to do. Over the course of this project, I have regularly checked in with various team members to get a sense of what parts of the project are encountering snags and then made adjustments to the game's larger schedule.
In general, I pride myself on keeping big projects running on schedule. Both Root and Underworld were finished on time or early in terms of the design and the submission of assets to the factory. That's likewise the case for every project I've ever done with Phil and for the second edition of Pax Pamir that I built with my brother. All of those projects presented challenges and had moments of both good luck and misfortune. In every case there was always room to move things around and make adjustments so that we could keep to the schedule.
But, when the pandemic came to the States and forced us to shut down our in-person testing network and central offices, I had no idea how it would impact Oath's timeline. We moved all of our testing to Tabletop Simulator, a clunky but sorta lovable platform for playing games online. Once we got the kinks ironed out, our schedule looked like it wouldn't be impacted at all. At least, it seemed that way at first. As the weeks wore on, I found myself devoting a lot more time to staff coordination and management to compensate for the fact that we weren't sharing the same office any more. In addition, with three kids at home and lots of remote learning lesson plans to get through, I took small teacher-shifts each day to help my partner with the kids. Mostly I could make up this time by working in the evenings, but I was finding myself routinely putting in 60 hard hours a week to accomplish what might have been done in 40 previously. My case was hardly unique. Many staff members were experiencing similar delays.
When we built the production timeline around Oath, we added a two to three month grace period so that we could, for whatever reason, absorb a delay or several small delays without impacting the promised delivery date. In April/May we decided to use this period and push our file submission back to July. There was no reason to announce anything since the project was still on schedule. But, as May wore into June it became clear that we were going to need more time. The delays had started cascading into one another in a pattern that will be familiar to project managers everywhere. Little thing A stops bigger thing B which ultimately delays a much bigger C from getting to the right person at the right time. Oath had a large staff by our own standards to begin with, but the additional support staff demanded by COVID-19 meant that these chains were longer and more susceptible to cascades than ever before.
Strangely, the design itself had been unaffected by these cascades. We were more-or-less on schedule for having all of our files done on time and ready for usability testing and the final stages of internal production. This could easily not have been the case; we just happened to get lucky. This advantage actually presented something of a problem. Should the design team finish the game on time and move on to other projects, should we slow down the last stage, or should we take the game through another round of development? At Leder there is no shortage of future projects that we could apply ourselves to. I for one want to get working on the next wave of Root material! But, I also realized that this delay presented a huge opportunity. As anyone who has ever acted or helped in the production of a stage play can tell you, working on a single project for a long period of time creates a tremendous amount of fellowship and intimacy both with the other members of the team and the work itself. Even if Oath was a blockbuster and demanded an immediate expansion, I was never going to get this close to understanding the game or have such a strong team for building content for the game again. If I wanted to tackle anything about the design, now was the time to do it.
So, with an extra six weeks tacked on the schedule, I decided to spend some time looking closely at the design for any element that I didn't like, and, if I found anything, I would devote what remained to building a replacement.
Part 2: The Audit and the Survey
I started this process by teaching Oath a few times to players who hadn't played the game. Sometimes I hopped into a random board game discord and gathered people for a game. Other times, I asked friends who hadn't yet played. In these teaches, I looked in particular for moments where I found myself a little embarrassed or apologetic. I would watch for phrases like “okay this next part is a little complicated” or “don't worry, this will make sense later.” I'll be the first to admit that the teaches for any of my games will be full of both phrases, but I was particularly interested in moments when I would find myself saying these phrases more than normal, and, critically, when I was using them to avoid teaching a particular element because I dreaded teaching it.
For instance, when I teach new players how to play the game, I almost never teach them how the Darkest Secret and the People's Favor work. It's just too much to put in anyone's head right at the start of the game and usually I can introduce them on later turns when they become more relevant. That's likewise true of citizenship. However, though I hesitate to teach those parts of the game right at the start, I love teaching players about them when they are ready. I think that they are all clever, interesting elements of the design that present players with a lot of interesting tools.
I couldn't say the same for the game's combat system. Those of you who have been following the development of this game know that the combat system in Oath has a long, complicated genealogy which I detailed here:
But, as I audited the design, I became increasingly dissatisfied in three respects. First, the system was hard to teach and think about. It asked players to do addition, subtraction, multiplication and division all in a single phase. Moreover, the chances of victory and expected losses were often unintuitive and too often players had to take into account the entire gamestate to figure out exactly what advantages and liabilities they enjoyed. Second, for all of that, there were still moments where the system broke down. As forces got larger in the late game, attackers could very nearly stack the odds in ways that seemed completely wrong. Basically about 10% of the engagement archetypes were a little goofy, but those were the most common types of engagements in the late game when they mattered most. (A note to future designers out there: this didn't become apparent until after we rounded our 100th internal game!) Finally, and perhaps most critically, the combat system didn't hit the right emotional or narrative notes. One of the most important elements of the design for me was that a victory should feel like a victory and a defeat should feel like a defeat. Oath is a game of amazing turnarounds, accidents of fate, and clever schemes that take years to realize. It's common for even the quiet players to get carried away when they play the game, to worry that their plan is about to backfire or to leap up when a lucky break happens at exactly the right time (how the wider market of game players will react to this thing is anyone's guess). Combat was almost never part of these moments. It was out of step with the rest of the design and belonged much more in a wargame about force projection in the 20th century than something set in Oath's blend of Jim Henson style low fantasy.
These problems sound fatal now, but they were all pretty minor in their own way. The game was moving along and folks were liking it quite a bit. If I were on an assembly line, I wouldn't have pulled the red cord and shut production down. But, if I had a little time, I think it made sense to try to wrestle with this system and see if I could build something better.
The basic plan looked something like this. First, I went about taking a full accounting of the existing combat system and how it performed in about 10 different force configurations over 6 different situations each. I also took a survey of every combat ability in the game to get a sense of what kinds of work those plans were doing both from a mechanical and narrative standpoint. Then, I would attempt to generate a new system that fit into those parameters as cleanly as possible. Ideally, there would be some room to improve the system both in terms of those problematic late-game scenarios I mentioned earlier and to make the card list even more expressive. And I wanted to do all of this while making it simpler and easier to teach.
In short, I was attempting something like the game-design equivalent of an organ transplant. The new organ needed to be stronger and more efficient in nearly every respect. But I didn't have time to rebuild the entire game around the new system. It needed to fit into basically the same cavity that the old one was lifted out of. With that in mind, I got to iterating.
Part 3: Iteration
The first step was to measure the old system. I took the probability table that I had built the old system out of and created a simple table that showed a player's chances of victory at various force configurations and with certain allowed losses. Mapping out the system like this helped me understand exactly why those late-game battles were proving so troublesome. The problem came from the way the sacrifice rule worked with the force ratios. In order to win a battle, the attacker needed to achieve a certain amount of “advantage.” The base value for any battle was a function of the force size ratio (e.g. did you have twice the number of warbands as your enemy? Three times?). From there the number would be modified depending on the scope of the campaign and the special powers that you and the defender had. In the final calculus attackers could also sacrifice a number of warbands to gain additional advantage. Now, in most combats this price could be steep. Let's say I have 4 warbands fighting 4 warbands. That's a ratio of 1:1. This is a starting advantage of -2. That means you need to sacrifice 2 warbands just to cross the finish line with all else being equal. This makes sense from a thematic perspective and it wasn't hard to come up with examples of 1:1 engagements having a 50% attrition rate or higher for the attacking force. But at higher numbers the attrition rate for similar engagements began to go down. This is totally wrong. Gigantic pitched battles with equal sized forces should be incredibly chaotic and have high average attrition rates.
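To make that arithmetic concrete, here is a minimal sketch in Python. It only covers the one case described above and rests on two assumptions that are mine, not the real table's: that every 1:1 engagement starts at -2 advantage no matter how large the forces are, and that each sacrificed warband is worth +1 advantage toward a threshold of 0. The function name is made up for illustration.

```python
from fractions import Fraction

# Assumption for illustration: a 1:1 force ratio always starts at -2 advantage,
# and each sacrificed warband adds +1 advantage (the attacker needs to reach 0).
STARTING_ADVANTAGE_AT_PARITY = -2


def attacker_attrition_at_parity(warbands: int) -> Fraction:
    """Fraction of the attacking force sacrificed to win an even fight."""
    sacrifices_needed = -STARTING_ADVANTAGE_AT_PARITY
    return Fraction(sacrifices_needed, warbands)


for size in (4, 8, 12):
    rate = attacker_attrition_at_parity(size)
    print(f"{size} vs {size}: sacrifice 2, attrition {float(rate):.0%}")
```

Under these assumptions the 4 vs 4 fight costs the attacker half of their force, while a 12 vs 12 fight costs only about a sixth. The flat sacrifice price is what lets attrition shrink as the armies grow.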
After I had mapped out the parameters of the old system, it became immediately apparent that any iteration was going to require an easily adjustable model. Ideally, I wanted to come up with an idea, and adjust a digital model that would simulate many thousands of battles at all of the various force configurations so that I could quickly compare one with another. If the model had trouble taking the right shape, it didn't matter how good it felt or how much I wanted it to work. The new organ had to fit in the old cavity. So I dusted off my bad programming skills and booted up Python to build just such a model.
Once I had the script working, I decided to start my iterative process by attempting to cut the thorniest parts of the old system. The first one on the chopping block was the force ratio system. As much as I wish Oath could introduce a swath of players to the elegance of a wargamer's CRT, Oath was already making severe demands on its players, and I should cut where I could. My first instinct was to apply a version of the combat system I had built for an unpublished game of mine called Heaven's Mandate. Combat in that game was basically a wager where each side bid a number of armies. After the wager, each side would roll dice equal to the armies and the higher total would win. Critically however, some dice faces (often the higher numbers) would feature losses, so that often the victorious player had to take a higher loss ratio.
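For readers curious what a model like this can look like, here is a rough sketch of a "roll higher" wager simulated over a few force configurations. It is not my actual script, and the dice are invented: it assumes a plain six-sided die where a 5 or 6 also costs the roller one army, and it assumes ties go to the defender. The names `roll_side` and `simulate` are hypothetical.

```python
import random

# Hypothetical rules, not the unpublished game's actual dice:
# a plain d6 where a 5 or 6 also costs the roller one army; ties go to the defender.
LOSS_FACES = {5, 6}


def roll_side(armies, rng):
    """Roll one d6 per army and return (total rolled, armies lost)."""
    rolls = [rng.randint(1, 6) for _ in range(armies)]
    return sum(rolls), sum(1 for r in rolls if r in LOSS_FACES)


def simulate(attackers, defenders, trials=20_000, seed=0):
    """Estimate the attacker's win rate and average attacker losses."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = 0
    total_losses = 0
    for _ in range(trials):
        atk_total, atk_losses = roll_side(attackers, rng)
        def_total, _ = roll_side(defenders, rng)
        wins += atk_total > def_total
        total_losses += atk_losses
    return wins / trials, total_losses / trials


for attackers, defenders in [(2, 2), (4, 4), (4, 2), (6, 4)]:
    win_rate, avg_losses = simulate(attackers, defenders)
    print(f"{attackers} vs {defenders}: win {win_rate:.2f}, avg losses {avg_losses:.2f}")
```

The point of a harness like this is that changing the faces in one place immediately shows how the odds and losses move at every force configuration. Even with these made-up faces, the larger force wins by a wide margin.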
I liked a lot of things about this system. It reduced combat to a simple, “who rolls higher” model that enfranchised the defender. I also liked that it allowed me to keep the old battle plan system of +/- advantage. It also allowed me some new design options. One of the most compelling options was to swap the warband pieces with colored dice matching the six suits. This way players would generate liability based on where they mustered, not which cards they happened to rule. To keep costs down, mustering would go from generating two warband meeples to a single warband die. This also allowed the countering system to be made a lot more visceral. If you had Order dice that were good against Discord dice, you could just auto roll them to sixes. I loved this idea. On the surface it seemed to attack my most serious problems with a solution that was both simpler and more visceral.
There were two big problems though. First, there were the numbers:
The biggest problem here was that the larger force had a heavy advantage and that this heavy advantage was able to overcome the early parts of the exponential curve of the misfortune dice. (This system still used misfortune dice, which were added to the defender's roll.) That by itself was not fatal. Using my script I iterated a half-dozen variations that adjusted the dice faces to control for variance and try to shape the numbers appropriately.
However, when I got this version to the table for a live test, another problem became immediately apparent. Because Oath was already a little over-budget in terms of its production costs, it was clear I was going to have to swap the wooden warband pieces for colored dice. Thankfully I own a couple copies of Tenzi so we have plenty of colored dice on hand and it was easy to prototype this option. Once we saw it all on the table, it was immediately clear that this was a bad path to go down. Not only was the board extremely hard to parse as colored dice far outnumbered territory control markers, it also changed the character of the game's presentation. It made the game look and even feel like a dice drafting game! Even as the numbers were starting to get closer to my goals, I knew this was a dead end almost immediately and went back to the drawing board.
As I iterated, I met with Josh, our staff editor and a key member of Oath's development team. He and I had an informal meeting Friday night that sparked an impromptu four-hour game jam the following morning. He made several helpful suggestions, the most important of which was going back to a simpler dice system where the attacker would roll the dice just for their force, a single die per warband. Then, rather than have the defender roll, he suggested that I just add the number of defending warbands to the misfortune dice, which would be restyled as defense dice. This new system was a little more complicated than some of my initial efforts, but still had a little more than half as many rules as the old system. It was also more expensive than any of the options I was looking at because it added about 8 dice to the game's component ask. It also didn't offer any cute adjustment to the battle plan system on the face of it.
But, for all of that, it allowed us to keep the warband counts and granularity that we had built the rest of the design around and keep the game's rough board presence the same as it had ever been. Trusting that this was the right basic approach, I went into my script and over the next week generated about 15 different versions, working with Nick Brachmann to help me come up with some different approaches. Though the initial suggestion had several problems, they could be addressed by fine-tuning the different dice faces and the value of attacker sacrifices. By the end of the week I had a system I liked quite a bit:
It was time to test in person. After 3 weeks of work, I was ready for this system to be the one. This is a very dangerous frame of mind for a designer to be in! The test went fairly well, but it was clear something was off. Though the attacker still had a strong advantage, the players were acting more gun shy. Exhausted and a little foggy, I couldn't quite understand what was causing the problem, so I decided to take a short break and went for a walk. When I got back we played again and the problem was immediately obvious.
While the new system had precisely the odds and average losses that I had been targeting, the variance was dramatically too high. It was at this point I realized, embarrassingly, that I had neglected to measure the variance in average losses of the original system. I quickly ran some numbers for the old system to get a rough sense of what the variance looked like and realized that the new system had standard deviations of over twice that of the old system. It was clear that's where the problem was.
To fix this, I went to the attack dice, removing the blank faces and replacing them with half hits and rebalancing the entire die so that it had more-or-less the same average as the old die. This nicely tightened the variance. At this point the system was feeling quite close.
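The effect of that change is easy to see in a small sketch. The faces below are hypothetical, picked only so that both dice average one hit per die, and the sketch measures the spread of hits rolled rather than the losses I was actually tracking. It relies on the fact that the variance of a sum of independent dice is the sum of their variances.

```python
import math
import statistics

# Hypothetical faces, chosen only so both dice average one hit per die.
OLD_DIE = [0, 0, 1, 1, 2, 2]      # includes blank faces
NEW_DIE = [0.5, 0.5, 1, 1, 1, 2]  # blanks swapped for half hits, then rebalanced


def pool_stats(faces, dice):
    """Mean and standard deviation of total hits when rolling `dice` independent dice."""
    mean = statistics.mean(faces) * dice
    # Variances of independent dice add, so the standard deviation scales with sqrt(dice).
    stdev = statistics.pstdev(faces) * math.sqrt(dice)
    return mean, stdev


for name, faces in [("old", OLD_DIE), ("new", NEW_DIE)]:
    mean, stdev = pool_stats(faces, dice=6)
    print(f"{name} die, 6 warbands: mean {mean:.2f} hits, stdev {stdev:.2f}")
```

Both dice produce the same average for a six-warband attack, but the die without blanks has a noticeably smaller standard deviation. That is the kind of tightening I was after: the same expected result with fewer wild swings.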
There was only one significant problem remaining. Basically the attacker wasn't losing enough forces. Once they got to 3:2 or 2:1 advantage in force size, their losses would plummet. This was a problem. I needed the attacker to have to pay some attrition rate in most every combat except the ones with the most favorable odds. Drawing again on Heaven's Mandate, the idea of the self-sacrificing die came to the rescue and the double hit face was restyled as the 1 hit + 1 forced sacrifice. This meant that attacking forces would generally suffer a 1/6 attrition rate on any combat. To stop this from getting silly I put in a rule that didn't require a player to roll for every warband if they wanted to hold a reserve and minimize losses.
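Here is a small sketch of that 1/6 figure and of what holding a reserve buys you. It assumes exactly one of the six faces carries the forced sacrifice, which is what a 1/6 attrition rate implies. The function names are made up for illustration.

```python
from fractions import Fraction

# Assumption: exactly one of the six faces carries the forced sacrifice,
# which is what a 1/6 attrition rate implies.
SACRIFICE_CHANCE = Fraction(1, 6)


def expected_sacrifices(dice_rolled: int) -> Fraction:
    """Average number of attacking warbands lost to the sacrifice face."""
    return dice_rolled * SACRIFICE_CHANCE


def chance_of_no_sacrifice(dice_rolled: int) -> Fraction:
    """Probability that none of the rolled dice show the sacrifice face."""
    return (1 - SACRIFICE_CHANCE) ** dice_rolled


force = 6
for rolled in (6, 4, 2):
    print(
        f"roll {rolled} of {force}: expect {float(expected_sacrifices(rolled)):.2f} lost, "
        f"{float(chance_of_no_sacrifice(rolled)):.0%} chance of losing none"
    )
```

Rolling for all six warbands costs one warband on average. Holding some back lowers that expected loss, at the price of rolling fewer attack dice.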
Part 4: Marching through the Card List
With a strong core system in hand, I turned my attention to the game's many combat special powers. The old battle plan system had worked, but had always felt a little flat. This was a little disheartening because I had worked hard on their general framework. I had wanted to build a rock/paper/scissors style countering system into the core of the design with certain battle plans being good against particular suits.
However, in practice the advantages players gained were boring (+1, +2, etc) and the liabilities were maddeningly difficult to track. Most often players would stumble into a liability, which doesn't allow anyone to feel particularly clever.
Thinking through this problem, I realized that my core problem was that I had been modeling my countering system off something like Pokemon rather than Starcraft.
In Pokemon games, counters are handled by a chart of relationships that spells out various strengths and weaknesses.
Though Starcraft uses a chart like this too for establishing weakness to certain damage types, it's a very small part of the overall picture. For instance, zerglings in Starcraft are strong against stalkers because the stalkers have a large unit size with lots of surface area. A zergling can use its speed to surround the stalkers, limiting their mobility and bringing a lot of force to bear. That relationship relies on how units in the game handle their physical position and movement. From a narrative standpoint, I find this a much more compelling relationship because it leans into the core system rather than a table of exchanges that floats above those systems.
I like R/P/S countering systems like the ones used in Pokemon or Fire Emblem, but thinking through this problem made me realize that these charts can also be understood as system failures. That is, in Pokemon there's no way that a bug type Pokemon expresses itself in the game outside of this chart, so the full measure of its impact can only be felt through the chart. Imagine, for instance, if a water type Pokemon created a moat that land-based Pokemon couldn't move through! Obviously this sort of stuff would make Pokemon a mess. The countering chart works well because it's so simple. But, Oath's core gameplay is extremely expressive, and it made sense to try to fold combat into those systems rather than just having a big chart of counters.
Let's look at some examples:
The old Wolves card used the special beast icon (bottom left). This icon means “the number of beast cards at sites.” So, the card would get powered up the more beast cards were in play. This is a cool ability and we built it into a few other beast cards, but it was not good as a common icon because it just demanded players to pay attention to too many things at once. In revising the card, I started by asking myself, what kind of threat do wolves really present? I thought about all of the fantasy I had read and it was immediately obvious. Wolves hunted. They lurked and they threatened you in a menacing, almost omnipresent way. The ability followed quickly: spend a secret and then kill a cohort that is on a player's board. Critically this only really affects players who are traveling with armies. Those with large kingdoms can protect their warbands by keeping them garrisoned on sites.
In addition, I should note that though this card is simple, it can be used in a wide variety of contexts. A chancellor can use it to menace exiles in the early game. An exile can make it a cornerstone of their guerrilla operation against the Chancellor. It can be played either as an adviser or a site card. And, it's not a battle plan so it has a very flexible timing window. In one game you could have a loyal pack of wolves and in the next you could find yourself visiting the leader of the wolf pack out in the hinterland in hopes that you can use them to turn against their masters.
The other special icon used in battles is the nomad icon which means “the total number of nomad cards you rule.” This icon was meant to give the nomad cards a tribal quality that rewarded players for doubling down on a single suit. Like the beast icon, I built this power into some other cards but was at a loss on how to fold it into the battle plan system. Here I found myself working through the same sort of thematic logic as the wolves. What did the game's core system allow and how would a nomad operate within it? One of my favorite elements of Oath is the nearly closed card economy. It lends itself perfectly to what the nomad suit represents. It follows then that the nomad cards could be made stronger and then force themselves to be discarded.
One of the biggest losses in the new card system was the loss of the old liability system. I've talked previously about the problems with the liability system in terms of game state parsability. But, for all of that headache, it did create a set of vulnerabilities for every position. In revising the combat cards, we did our best to build liabilities directly into each battle plan in ways that resonated with both the game's design and its narrative logic. Along those lines, Josh had an excellent suggestion of building the arcane cards around the Darkest Secret. Many of their most powerful spells required them to hold onto the Darkest Secret which could be fought over throughout the game.
All of these changes necessitated some revisions to old cards such as Mercenaries. The old power, which I had always liked, was now a feature of the nomad suit so a new liability had to be made. We made them a little more expensive and then tagged on a new discard condition. Coupled with Kyle's new art, it wasn't hard to build an entire narrative around a single card.
Part 5: The Last Testing Battery
Once we had some battle plans built, I introduced the system to the testing Discord. Though quarantine had wreaked havoc on the game's schedule, at moments like this I found myself deeply grateful for the huge community of players that had gathered around the game. Within about an hour of announcing that I had a new combat system, I had 10 or so testers watching me live stream a game as I explained how it worked. Over the next week almost a dozen games were played, and many of their suggestions were incorporated into the wording of the rules and the precise design of some of the battle plans.
I also had a chance to watch a lot of office games and took every opportunity to teach new players. The response was extremely positive. In addition to fixing some of the balance problems, allowing new battle plan cards with much stronger thematic reasoning, and cutting down the rules, the new system also proved genuinely exciting. Though the odds of the new system were more difficult to calculate precisely, they were much easier to intuit. Previously the mechanical and emotional climaxes of the combat system were misaligned. Often the play of a card would shift the odds and when the battle actually happened it was simply a matter of the price that needed to be paid. But now they were largely in sync. Though a critical card play could shift the odds dramatically, the exact fate of a big battle wasn't known until both players had cast their dice.
This fit in beautifully with the game's other systems. As a general design, Oath is naturalistic, leaning hard into its thematic underpinnings to generate its challenges and stories. It's a political game told from the bottom up, with players spending their time traveling and visiting and sharing secrets to gain support. The revised combat system goes a long way in bringing that design ethos to an element of the game that was always more utilitarian than dramatic.
In retrospect, it's strange now that I very nearly didn't undertake this audit. In that way I think this process also revealed to me a critical challenge to working on large projects. With a small project, it's easy to stay nimble, to toss part of it in the trash and pivot to a new approach. But with a large project there is a tremendous amount of momentum that accumulates as it gets closer to completion. Here my shift in position, from more of a designer to more of a project manager, can present a serious threat to the quality of the game. As a designer, it's my job to be dissatisfied. As a project manager, it's my job to get the thing done on time. I know that Oath was a very good game even before this system was revised, but I also know that it benefited greatly from this added development time. The key, it seems to me, is to keep one's priorities straight and let the responsibilities of the designer check those of the manager when possible. When it comes to this revision, I think I'm guilty of letting those latter responsibilities trump the first. Thank goodness for those other delays! Sometimes a little misfortune leads to a lucky break.
---
This will probably be the last designer diary I write for a while. Over the next few weeks, we're finishing our final internal pass on everything and taking the solo mode through its external review. Basically all of the game elements have entered the layout stage or will enter it shortly. Sometime this week, we'll be updating the public TTS module with the new combat system as well as lots of little quality of life updates (including fixing some bugs in the mod). I'll also be releasing a patch kit for folks who built the physical game but want to use the new combat system. It's a pretty easy patch, basically just swapping out some cards across the full decklist. I should have that ready sometime next week.
- Cole Wehrle
You can find the original post and related discussion on Board Game Geek.