Last updated June 1, 2024
I’ve always loved doing the sudoku.
It started in the summer of 2015. I was a senior in high school and had just started the college application process. Like many teenagers, I signed up for the SAT. Like many teenagers, I compiled a list of safety and target schools. And, like many teenagers, I was beginning to panic about acceptances.
One particular afternoon, I found myself down a rabbit hole of self-doubt and insecurity. I felt like I wasn’t smart enough, that my grades would pull down my application, or that my essays wouldn’t stand out.
I needed to challenge this voice in my head. I knew I was smart and capable (right???). I looked around for a way to prove it and saw the Sunday newspaper on the coffee table in front of me, opened to the puzzles page. I told myself if I could solve that day’s sudoku, I had what it took to get acceptances. I solved that sudoku and haven’t stopped doing it since.
I got so into the puzzle that, in high school, I completed 100 sudokus in 100 consecutive days. I’d take our newspaper’s puzzles page to school and solve it during class. A classmate once said while most students had their phones out under their desks, I had the sudoku. When I moved to the US for college, I bought a book full of puzzles. I was crazy about it.
I loved that I could take things one puzzle at a time, and every puzzle, one cell at a time. The puzzle served as a constant amid the ups and downs of everyday life.
And then I stopped doing it just as quickly as I had started.
In October 2020, a friend texted me: “Do u wanna have a sudoku race 👉🥺👈”
We raced to solve that day’s New York Times easy puzzle, and we’ve raced through hundreds of sudokus since then to see who could finish first.
We even started a spreadsheet to track our times. Our competitive natures pushed us to challenge each other almost daily. Like in high school, I was constantly thinking about how to get faster.
In January 2023, I made a breakthrough. I started tracking which grids I fill out and how I fill them out[1]. And in the weeks after, I started to analyze my performance to look for patterns and ways I could speed up. As a throwback to my high school self, I decided to analyze 100 easy sudokus to see what I could learn about my performance...
When I first see a sudoku grid, my instinct is to look for naked singles. Naked singles are cells where there can be only one possible value. They're the slam dunks. No-brainers. Easy pickings. (They’re also called sole candidates, but why would you call them that when you could say naked singles? I digress.)
Using the process of elimination, I identify naked singles and solve those cells. I only ever fill in a value when it’s a naked single, i.e., I never guess. So when I fill out the grid, I almost never redo cells once I fill them. This is a tradition I’ve digitized — high school me would fill out grids with a pen.
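As a rough illustration, the process of elimination can be sketched in a few lines of Python. This is not the code behind the project; the names `candidates` and `naked_singles`, the grid-as-strings format, and the use of 0 for an empty cell are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

Grid = List[List[int]]  # 9x9 grid; 0 marks an empty cell (an assumption of this sketch)


def candidates(grid: Grid, row: int, col: int) -> Set[int]:
    """Values still possible for a cell after eliminating its row, column, and section."""
    used = set(grid[row])                                  # values in the same row
    used |= {grid[r][col] for r in range(9)}               # values in the same column
    top, left = 3 * (row // 3), 3 * (col // 3)             # top-left corner of the 3x3 section
    used |= {grid[r][c] for r in range(top, top + 3) for c in range(left, left + 3)}
    return set(range(1, 10)) - used


def naked_singles(grid: Grid) -> Dict[Tuple[int, int], int]:
    """Map every empty cell that has exactly one candidate to that value."""
    found = {}
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                options = candidates(grid, r, c)
                if len(options) == 1:
                    found[(r, c)] = next(iter(options))
    return found


# A nearly finished grid with two empty cells, at (0, 0) and (4, 4).
rows = [
    "034678912",
    "672195348",
    "198342567",
    "859761423",
    "426803791",
    "713924856",
    "961537284",
    "287419635",
    "345286179",
]
grid = [[int(ch) for ch in line] for line in rows]

print(naked_singles(grid))  # {(0, 0): 5, (4, 4): 5}
```

Both empty cells have every value except 5 ruled out by their row alone, so each one is a naked single.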
As you solve each cell in the sudoku, you eliminate possible values throughout the grid. For example, putting a 5 in one cell means you can’t play a 5 anywhere else in that row, column, or section. In the process of filling out cells, you create more naked singles.
Essentially, by getting the numbers of these naked singles, you're introduced to their other naked single friends.
And tracking when I fill in each naked single is a great way to analyze my performance.
Based on the analysis above, I quickly solve cells that are towards the top, then take longer to fill out cells that are in the middle and bottom.
Why do I tend to take longer in these areas? I think it has something to do with how the cells come prefilled. Let's talk about that.
Like Jeopardy! and its daily doubles, the Times’ sudoku grids tend to have the same cells prefilled. For the 500 sudoku grids I have prefilled data for, the easy sudoku has always come with exactly 38 values prefilled.
From the grid above, we see that the typical NYT easy sudoku has more prefilled cells towards the top. This makes it easier to start from the top. In fact, of the 100 sudokus I've collected data for, my first move is in the top three rows in 73 of them.
The higher density of prefilled cells also helps with my next strategy: slicing and dicing. Slicing and dicing is a common maneuver where, once you find a blank cell, you quickly scan its row and its column to see what numbers that cell can't be. In fact, in the primer on naked singles above, you could solve the empty cell by slicing and dicing, rather than cycling through all nine numbers.
Taken together, slicing and dicing and looking for naked singles are my primary strategies when solving the sudoku. I also look for something called hidden singles, but those are much harder to analyze. If you’re curious about them, read up here.
We've talked about my strategy and how the grids tend to come prefilled. When you add those two together, how do I tend to move through the grid? Let's take a look at the way I step through the puzzle.
Once I identify a naked single, how long do I take before solving that cell? Let’s visualize every cell again, this time coloring each one based on how long it took me to fill it out.
In the graphic below, each column is one puzzle I’ve solved, and each square within that column represents a cell I’ve filled out for that puzzle. Read from top to bottom, each square is a cell I filled out from start to finish. Some puzzles have more cells filled out because I made mistakes.
The chart above reinforces the earlier insight: I tend to slow down towards the middle.
But wait. The chart also shows more nuance. The lower half represents the final steps of solving a puzzle, and just before I finish each puzzle, I seem to speed through the last few cells. This makes sense, since there are fewer candidates left towards the end.
In a future article, it could be interesting to segment and cluster puzzles based on my speed throughout them. For example, group together puzzles where I’m slow at the start, fast towards the middle, and slow at the end. This could reveal similarities between puzzles and their difficulties.
Even though every number is built from the same 10 digits, those digits don’t always show up equally often. Benford’s Law, for example, says that in many real-world datasets, the leading digit of a number is more likely to be small.
I start with the number “1” far more often than chance would suggest. While the distribution of my first plays doesn't closely follow Benford's Law, it's still skewed towards smaller numbers. In fact, in the 84 sudokus I have data for[3], I started with the number “1” in 32 of them.
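To put those 32 out of 84 puzzles in context, here is a small Python sketch that compares the observed share of opening 1s with two reference points: Benford's Law, which gives the probability of a leading digit d as log10(1 + 1/d), and a uniform split across the nine digits. The function name `benford_probability` is just an illustrative choice, and Benford's Law describes leading digits in datasets rather than sudoku moves, so this is a loose comparison only.

```python
import math


def benford_probability(digit: int) -> float:
    """Benford's Law: probability that a number's leading digit is `digit` (1 through 9)."""
    return math.log10(1 + 1 / digit)


first_play_ones = 32  # from the article: puzzles where the first value played was a 1
puzzles = 84          # from the article: puzzles with first-play data

observed = first_play_ones / puzzles
uniform = 1 / 9       # share expected if every digit were equally likely

print(f"Observed share of opening 1s: {observed:.1%}")                # 38.1%
print(f"Benford's Law share for 1:    {benford_probability(1):.1%}")  # 30.1%
print(f"Uniform share:                {uniform:.1%}")                 # 11.1%
```

The observed share sits well above the uniform 11.1%, and even above the 30.1% that Benford's Law assigns to a leading 1.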
Is it because the first three rows typically have the number 1 prefilled? Not really: 1s are no more likely to be prefilled than 8s. Or 3s, for that matter. Then why do I start with the number 1 so often?
Upon reflection, I feel like smaller numbers are easier to remember. When I solve the grid, sometimes I think “This cell can be a 5 or 6. Let me come back to this later.” I’ve noticed that when the numbers are closer to nine, it’s harder for me to remember them. I can’t explain why this is, but it's so oddly specific, I’d be surprised if other people felt this way too.
This could be an interesting way for me to get faster. For future puzzles, I can write down potential candidates in the cell. The Times website allows players to play in Candidate mode — maybe I should use this feature more.
It might even be worth diversifying my opening move. I overwhelmingly start solving the grid by playing the number 1 in the first three rows.
Mistakes happen. Everybody makes them. But you don't want to repeat your mistakes and not learn from them. Is there a pattern with when I make mistakes? Let’s highlight all my mistakes, i.e., cells that I filled out more than once.
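The definition above (a mistake is a cell filled out more than once) is simple to express in code. The sketch below assumes a hypothetical move log of `(row, col, value)` tuples in the order they were played; the real extension's data format isn't described here, so treat the structure and the name `refilled_cells` as placeholders.

```python
from collections import Counter
from typing import Dict, List, Tuple

Move = Tuple[int, int, int]  # (row, col, value): a hypothetical log format


def refilled_cells(moves: List[Move]) -> Dict[Tuple[int, int], int]:
    """Return cells that were filled more than once, with how many times each was filled."""
    counts = Counter((row, col) for row, col, _value in moves)
    return {cell: n for cell, n in counts.items() if n > 1}


# Made-up example: cell (0, 1) is first filled with a 3, then corrected to a 4.
moves = [(0, 0, 5), (0, 1, 3), (0, 1, 4), (2, 2, 7)]

print(refilled_cells(moves))  # {(0, 1): 2}
```

A puzzle counts as having a mistake whenever this function returns a non-empty result for its move log.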
I’m more likely to make mistakes when I’m almost done solving a puzzle. This is most likely because I get impatient. Sometimes I also guess because it’s faster than slicing and dicing.
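There are also two kinds of mistakes: user entry errors, where I type the wrong value, and actual errors, where I solve the cell incorrectly.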
I made a mistake in 65% of the puzzles I solved. It’s not only important to make fewer mistakes, but also to reduce the amount of time it takes to correct them. When I make a mistake, it costs me a median of 5.1 seconds.
In the graphic below, each dark square represents a cell in the sudoku that I filled out again.
Since roughly half of my mistakes take about five seconds or less to correct, hopefully I'm mostly making user entry errors, rather than actual errors.
When slicing and dicing, I first read across the cell's row, and then its column. I've noticed that I always take longer to scan its column because it's harder to track numbers when reading from top to bottom, as opposed to left to right. To overcome this, I'll probably have to develop quicker mental models of scanning.
And because I rely so heavily on scanning a cell's row and column, I sometimes forget to scan its section. Remembering to do this could save me a few seconds, especially when I move towards the middle and bottom of the grid.
Quicker scanning leads to fewer mistakes, and that leads to a faster finish time!
It may sound unbelievable, but I’m not as into the sudoku now as I was in my teenage years. While I still try to solve the puzzle once a day, teenage me would always have a grid on his person.
While it’s cool to track and analyze your data as you solve the sudoku, it’s also important to be present as you solve the puzzle. And something the data doesn’t show is that, since my first puzzle, the main message has remained the same: whatever life throws at you, take it one puzzle at a time, and one cell at a time.
I wrote a JavaScript browser extension to track how I filled out the Times’ easy sudoku grid. I filled out 100 sudokus and had my data sent to a server running on my Raspberry Pi. I used Jupyter notebooks to analyze my performance and React + D3 to visualize the data.
Check out my work on this project here or my portfolio here. Feel free to use the extension to analyze your performance — it only works on the NYT website, but can be edited to work on other websites too.
Lastly, please reach out if you have tips on getting faster!
Notes on data collection
[1] Desktop: To standardize data collection, I only filled out the sudoku from my laptop and not on my phone. Hence, this sample of data may not be wholly representative of my sudoku solving abilities.
[2] Speed: Throughout this project, I reference the time it took me to solve a cell. To calculate this, I measured how long it took me to enter a value in that cell since the last filled cell. A limitation of this approach: if I spend time looking at one cell but end up solving another, that time is attributed to the cell I actually filled in.
[3] Puzzle data: Because of the occasional tech glitch, I don’t have grid data for every puzzle I filled out. Therefore, the chart showing my first play is only for 84 puzzles, instead of 100.
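For readers curious about the speed calculation in the note above, here is a minimal Python sketch of "time since the last filled cell". It assumes a hypothetical log of `(timestamp_seconds, row, col)` entries and that the first cell is timed from the moment the puzzle starts; neither detail is confirmed by the project itself.

```python
from typing import List, Tuple

Fill = Tuple[float, int, int]  # (timestamp in seconds, row, col): a hypothetical log format


def seconds_per_fill(start: float, fills: List[Fill]) -> List[Tuple[Tuple[int, int], float]]:
    """Attribute to each filled cell the time elapsed since the previous fill."""
    times = []
    previous = start  # assumption: the first cell is timed from the puzzle's start
    for timestamp, row, col in fills:
        times.append(((row, col), timestamp - previous))
        previous = timestamp
    return times


# Made-up example: three fills at 2.5 s, 4.0 s, and 9.5 s after the puzzle starts.
fills = [(2.5, 0, 0), (4.0, 0, 1), (9.5, 5, 5)]

print(seconds_per_fill(0.0, fills))
# [((0, 0), 2.5), ((0, 1), 1.5), ((5, 5), 5.5)]
```

The limitation mentioned in the note shows up here too: the 5.5 seconds before the last fill are all credited to cell (5, 5), even if part of that time went to studying a different cell.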