Here's to the next 100 sudokus

Based on the analysis above, I quickly solve cells that are towards the top, then take longer to fill out cells that are in the middle and bottom.

Why do I tend to take longer in these areas? I think it has something to do with how the cells come prefilled. Let's talk about that.

Patterns

Like Jeopardy! and its daily doubles, the Times’ sudoku grids tend to have the same cells prefilled. For the 500 sudoku grids I have prefilled data for, the easy sudoku has always come with exactly 38 values prefilled.

Data above is from 500 Times sudokus. Hover over each cell above to see what percent of grids it came prefilled for.

From the grid above, we see that the typical NYT easy sudoku has more prefilled cells towards the top. This makes it easier to start from the top. In fact, of the 100 sudokus I've collected data for, my first move is in the top three rows in 73 of them.
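A heatmap like the one above could be computed along these lines. This is only a minimal sketch: the grid encoding (a 9x9 list of lists with 0 for an empty cell) and the function names `prefilled_percent` and `row_density` are assumptions for illustration, not the code behind the chart.

```python
from typing import List

# Assumed encoding: a grid is 9 rows of 9 ints, where 0 marks an empty cell.
Grid = List[List[int]]


def prefilled_percent(grids: List[Grid]) -> List[List[float]]:
    """Return a 9x9 table with the percent of grids in which each cell came prefilled."""
    if not grids:
        raise ValueError("need at least one grid")
    counts = [[0] * 9 for _ in range(9)]
    for grid in grids:
        for r in range(9):
            for c in range(9):
                if grid[r][c] != 0:
                    counts[r][c] += 1
    total = len(grids)
    return [[100.0 * counts[r][c] / total for c in range(9)] for r in range(9)]


def row_density(percent: List[List[float]]) -> List[float]:
    """Average prefilled percent for each row, listed from top to bottom."""
    return [sum(row) / 9 for row in percent]
```

Comparing the first three values of `row_density` with the rest is one way to check whether the top of the grid really is denser.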

The higher density of prefilled cells also makes my next strategy easier: slicing and dicing. Slicing and dicing is a common maneuver where, once you find a blank cell, you quickly scan its row and its column to see what numbers that cell can't be. In fact, in the primer on naked singles above, you could solve the empty cell by slicing and dicing, rather than cycling through all nine numbers.

A screenshot of a sudoku grid with one cell highlighted. The row and column of this cell are in focus, while the rest of the grid is blurred out, to emphasize slicing and dicing.

When slicing and dicing, you look at a cell's column and row to determine what numbers that cell can't be.
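In code, slicing and dicing amounts to a set difference. The sketch below assumes a grid stored as a 9x9 list of lists with 0 for a blank cell; the name `slice_and_dice` is made up for this example.

```python
def slice_and_dice(grid, row, col):
    """Return the digits a blank cell could still be after scanning its row and column."""
    in_row = set(grid[row])
    in_col = {grid[r][col] for r in range(9)}
    # Zeros (blank cells) end up in the scanned sets, but they are harmless
    # because we only subtract from the digits 1 through 9.
    return set(range(1, 10)) - in_row - in_col


# Example: the row already holds 1-4 and the column already holds 5-8.
grid = [[0] * 9 for _ in range(9)]
grid[0][1:5] = [1, 2, 3, 4]
for r, digit in zip(range(1, 5), [5, 6, 7, 8]):
    grid[r][0] = digit

print(slice_and_dice(grid, 0, 0))  # {9}
```

When only one digit is left, as in this example, the row and column scan alone is enough to solve the cell.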

Taken together, slicing and dicing and looking for naked singles are my primary strategies when solving the sudoku. I also look for something called hidden singles, but those are much harder to analyze. If you’re curious about them, read up here.

We've talked about my strategy and how the grids tend to come prefilled. When you add those two together, how do I tend to move through the grid? Let's take a look at the way I step through the puzzle.

Speed within a puzzle²

Once I identify a naked single, how long do I take before solving that cell? Let’s visualize every cell again, this time coloring based on how long it took me to fill them out.

In the graphic below, each column is one puzzle I’ve solved, and each square within that column represents a cell I’ve filled out for that puzzle. Read from top to bottom, the squares follow the order in which I filled out the cells, from start to finish. Some puzzles have more cells filled out because I made mistakes.

The chart above reinforces the earlier insight: I tend to slow down towards the middle.

But wait. According to the chart above, there's more nuance. The lower half represents the final steps of solving a puzzle. And just before I finish each puzzle, it looks like I speed through the last few cells. This would make sense since there are fewer candidates towards the end.
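The per-cell times behind a chart like this can be derived from fill timestamps, following the definition in the note on speed: the time for a cell is measured from the previous fill. The sketch below is an assumption-laden illustration, not the project's code. It supposes a known puzzle start time plus a list of fill times in seconds, and the helper names `fill_durations` and `mean_by_third` are hypothetical.

```python
from statistics import mean


def fill_durations(start_time, fill_times):
    """Seconds spent on each fill, measured from the previous fill (or the start)."""
    durations = []
    previous = start_time
    for t in fill_times:
        durations.append(t - previous)
        previous = t
    return durations


def mean_by_third(durations):
    """Average duration in the first, middle, and last third of a solve."""
    n = len(durations)
    if n < 3:
        raise ValueError("need at least three fills")
    bounds = [0, n // 3, 2 * n // 3, n]
    return [mean(durations[a:b]) for a, b in zip(bounds, bounds[1:])]
```

A middle value that is larger than the other two in `mean_by_third` would match the pattern of slowing down mid-puzzle and speeding up at the end.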

In a future article, it could be interesting to segment and cluster puzzles based on my speed throughout them. For example, group together puzzles where I’m slow at the start, fast towards the middle, and slow at the end. This could reveal similarities between puzzles and their difficulties.

Do I stick to some numbers more than others?

Even though every number is built from the same 10 basic digits, those digits don’t always appear equally often. Benford’s Law, for example, says that in many real-world datasets the leading digit of a number is more likely to be small.

I tend to start with the number “1”. While my opening numbers don't follow Benford's Law closely, they're still skewed towards smaller numbers. In fact, in the 84 sudokus I have data for³, I started with the number “1” in 32 of them, or about 38%.
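For reference, Benford's Law gives a leading digit d a probability of log10(1 + 1/d). The sketch below compares that figure for the digit 1 with the 32-out-of-84 share reported here, and with a uniform 1-in-9 baseline. The function name `benford_probability` is made up for this example, and the comparison is only illustrative, since opening moves in a sudoku are not leading digits of a dataset.

```python
import math


def benford_probability(digit):
    """Probability that a leading digit equals `digit` (1-9) under Benford's Law."""
    if digit not in range(1, 10):
        raise ValueError("digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


observed_share_of_ones = 32 / 84  # first move was a 1 in 32 of 84 puzzles
uniform_share = 1 / 9             # what equal preference for all digits would give

print(f"Benford, digit 1: {benford_probability(1):.3f}")  # 0.301
print(f"Observed share:   {observed_share_of_ones:.3f}")  # 0.381
print(f"Uniform baseline: {uniform_share:.3f}")           # 0.111
```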

Is it because the first three rows typically have the number 1 prefilled? Not really: 1’s are no less likely to be prefilled than 8’s. Or 3’s, for that matter. Then why do I start with the number 1 so often?

Upon reflection, I feel like smaller numbers are easier to remember. When I solve the grid, sometimes I think “This cell can be a 5 or 6. Let me come back to this later.” I’ve noticed that when the numbers are closer to nine, it’s harder for me to remember them. I can’t explain why this is, but it's so oddly specific, I’d be surprised if other people felt this way too.

This could be an interesting way for me to get faster. For future puzzles, I can write down potential candidates in the cell. The Times website allows players to play in Candidate mode — maybe I should use this feature more.

It might even be worth diversifying my opening move. I tend to start solving the grid by playing the number 1 in the first three rows.

An image of the Duolingo owl saying “Let's review your mistakes!”

$#*@ happens

Mistakes happen. Everybody makes them. But you don't want to repeat your mistakes and not learn from them. Is there a pattern with when I make mistakes? Let’s highlight all my mistakes, i.e., cells that I filled out more than once.

I’m more likely to make mistakes when I’m almost done solving a puzzle. This is most likely because I get impatient. Sometimes I also guess because it’s faster than slicing and dicing.

There are also two kinds of mistakes:

ones where I mistype a number despite wanting to put the right number (e.g., hitting a 3 instead of a 4), or
ones where I put in the wrong value and actually think it's right

I made a mistake in 65% of the puzzles I solved. It’s not only important to make fewer mistakes, but also to reduce the amount of time it takes to correct them. When I make a mistake, it costs me a median of 5.1 seconds.
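Using the definition above (a mistake is a cell filled out more than once), the share of puzzles with a mistake could be computed roughly as follows. This sketch assumes each puzzle is logged as a list of `(row, col, value)` moves; that format and the names `refilled_cells` and `share_with_mistakes` are hypothetical.

```python
from collections import Counter


def refilled_cells(moves):
    """Return the set of (row, col) cells that were filled more than once."""
    counts = Counter((row, col) for row, col, _value in moves)
    return {cell for cell, times in counts.items() if times > 1}


def share_with_mistakes(puzzles):
    """Fraction of puzzles in which at least one cell was filled more than once."""
    if not puzzles:
        raise ValueError("need at least one puzzle")
    flagged = sum(1 for moves in puzzles if refilled_cells(moves))
    return flagged / len(puzzles)


# Example: the first puzzle corrects cell (0, 0); the second has no repeats.
puzzles = [
    [(0, 0, 3), (0, 0, 4), (1, 1, 5)],
    [(2, 2, 9)],
]
print(share_with_mistakes(puzzles))  # 0.5
```

This counts any repeated fill as a mistake, so it cannot tell a typo from a genuinely wrong value. Separating the two would need extra information, such as how long the correction took.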

In the graphic below, each dark square represents a cell in the sudoku that I filled out again.

Since roughly half of my mistakes take about five seconds or less to correct, hopefully most of them are typing errors rather than actual errors.

Other observations

When slicing and dicing, I first read across the cell's row, and then its column. I've noticed that I always take longer to scan its column because it's harder to track numbers when reading from top to bottom, as opposed to left to right. To overcome this, I'll probably have to develop quicker mental models of scanning.

And because I rely so heavily on scanning a cell's row and column, I sometimes forget to scan its section. Remembering to do this could save me a few seconds, especially when I move towards the middle and bottom of the grid.
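The section scan is easy to express in code. The sketch below assumes "section" means the 3x3 box containing the cell, and that the grid is a 9x9 list of lists with 0 for a blank; `box_values` is a name invented for this example.

```python
def box_values(grid, row, col):
    """Return the digits already placed in the 3x3 box that contains (row, col)."""
    top = 3 * (row // 3)    # first row of the box
    left = 3 * (col // 3)   # first column of the box
    found = {
        grid[r][c]
        for r in range(top, top + 3)
        for c in range(left, left + 3)
    }
    return found - {0}  # drop the blank marker
```

Any digit returned here can be ruled out for the cell, on top of whatever the row and column scans already excluded.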

Quicker scanning leads to fewer mistakes, and that leads to a faster finish time!

It may sound unbelievable, but I’m not as into the sudoku now as I was in my teenage years. While I still try to solve the puzzle once a day, teenage me would always have a grid on his person.

While it’s cool to track and analyze your data as you solve the sudoku, it’s also important to be present as you solve the puzzle. And something the data doesn’t show is that, since my first puzzle, the main message has remained the same: whatever life throws at you, take it one puzzle at a time, and one cell at a time.

Methodology

I wrote a JavaScript browser extension to track how I filled out the Times’ easy sudoku grid. I filled out 100 sudokus and had my data sent to a server running on my Raspberry Pi. I used Jupyter notebooks to analyze my performance and React + D3 to visualize the data.

Check out my work on this project here or my portfolio here. Feel free to use the extension to analyze your performance — it only works on the NYT website, but can be edited to work on other websites too.

Lastly, please reach out if you have tips on getting faster!

Notes on data collection

¹Desktop: To standardize data collection, I only filled out the sudoku from my laptop and not on my phone. Hence, this sample of data may not be wholly representative of my sudoku solving abilities.

²Speed: Throughout this project, I reference the time it took me to solve a cell. To calculate this, I measured the time between filling the previous cell and entering a value in that cell. A limitation of this approach is that it misattributes time when I start out looking at one cell but end up solving another.

³Puzzle data: Because of the occasional tech glitch, I don’t have grid data for every puzzle I filled out. Therefore, the chart showing my first play is only for 84 puzzles, instead of 100.