Representation in Crosswords: A Fresh Look

We live in a data-driven world these days. Everything is quantified, analyzed, charted, and graphed. Your social media use alone is an absolute treasure trove of data that tells businesses all sorts of information about your activities, spending habits, and more.

So it should come as no surprise to you that the world of crosswords is no different. In recent years, we have been able to analyze decades of crosswords like never before, drawing important conclusions and uncovering trends both intriguing and shocking.

Back in 2016, the data analysis of programmer Saul Pwanson and constructor Ben Tausig uncovered a pattern of unlikely repeated entries in the USA Today and Universal crosswords, both of which were then edited by Timothy Parker. Eventually, more than 65 puzzles were determined to feature “suspicious instances of repetition” with previously published puzzles in the New York Times and other outlets, with hundreds more showing some level of repetition.

This led to Parker’s removal from both the USA Today and Universal crosswords.

But the impact of data analysis in crosswords doesn’t stop there. In 2018, Erik Agard compiled stats on how often the work of female constructors appeared in the major crossword outlets across the first four months of that year. It was an eye-opening piece about gender disparity among published constructors, backed up by smart research.

And there has been a greater push for inclusion on the construction side of crosswords. Back in March, at the urging of constructor Rebecca Falcon, several outlets participated in Women’s March, a concentrated effort in the puzzle community to support, foster, and cultivate more minority voices in crosswords.

(It comes as no surprise that two of the voices encouraging female puzzle creators are Erik Agard and David Steinberg, both of whom stepped up massively in the wake of the Timothy Parker scandal and have been advocates for greater inclusiveness in crosswords.)

[The list of all of the female constructors involved in Universal’s Women’s March project.]

This does raise the question, however, of inclusiveness when it comes to cluing and crossword entries.

And that question has been tackled quite brilliantly by Michelle McGhee in an article for The Pudding.

Striving to “better understand who is being referenced in crossword puzzles,” McGhee made a strong point about the influence crosswords have as a reflection on society:

Crosswords tell us something about what we think is worth knowing. A puzzle that subtly promotes the idea that white men are the standard, the people everyone should know about, is a problem for all of us (yes, even the white men).

A less homogenous puzzle would be an opportunity for many solvers to expand their worldviews. But more importantly, if you’re a solver like me, it’s meaningful to see yourself and your experiences in the puzzle, especially if they are often unseen or underappreciated. When I see black women engineers, or powerful athletes, or queer couples centered in a puzzle, it makes me feel seen and significant. It’s a reminder that I can be the standard, not just the deviant.

And she put the data to work to prove her point.

Sampling tens of thousands of crosswords from Saul Pwanson’s puzzle database, she and her fellow researchers sorted people mentioned in crossword clues and used as crossword answers by race and gender according to US Census categories.

And their conclusion, sadly, was hardly unexpected:

We recognize that this is an imperfect method, but it does not change our finding: crossword puzzles are dominated by men of European descent, reserving little space for everyone else.

Not only did they chart the percentages of representation, but they also created charts illustrating the most commonly referenced people in crossword answers in the New York Times puzzle.

The goal? They wanted to quantify the concept of “common knowledge” in crosswords in the hopes of redefining it in a way that better reflects a true common knowledge, one that represents everyone.

I’m only scratching the surface of this article, which is a fascinating exploration of the history of crosswords, what they say about society, and what they COULD say about society. I encourage you wholeheartedly to read McGhee’s full piece here.

It’s the sort of journalism, commentary, and data analysis that helps push a problematic aspect of crosswords into the spotlight and keep it there. Yes, there have been great steps forward for representation in crosswords, both within the puzzles and in the realm of constructors, but we can do better. We must do better.

And work by folks like Michelle McGhee and her graph-savvy data miners is a valuable part of the process.


Thanks for visiting PuzzleNation Blog today! Be sure to sign up for our newsletter to stay up-to-date on everything PuzzleNation!

You can also share your pictures with us on Instagram, friend us on Facebook, check us out on Twitter, Pinterest, and Tumblr, and explore the always-expanding library of PuzzleNation apps and games on our website!

Better Gaming With Math and Statistics!

[Image courtesy of ThreeSixtyOne.gr.]

Statistical analysis is changing the world. The wealth of available data on the Internet these days, combined with our ever-increasing ability to comb through that data efficiently using computers, has spawned something of a golden age in data mining.

You don’t need to look any further than the discovery of Timothy Parker’s plagiaristic shenanigans for USA Today and Universal Uclick to see how impactful solid analysis can be.

But it’s also having an impact on how we play games. Statistical analysis is taking some of the mystery out of games you’d never expect, making players more efficient and capable than ever.

We discussed this previously with the game Monopoly — specifically how some spaces are far more likely to be landed on than others — and today, we’re looking at two more examples: Guess Who? and Hangman.

Guess Who? gives you a field of 24 possible characters, and you have to figure out which character your opponent has before she figures out the identity of your character. Usually, if you end up with a woman or someone with glasses, your odds of winning are low, because those traits are rarer among the characters, so a single well-chosen question can eliminate most of the field.

But is there an optimal way to pare down the options? Absolutely.

Mathematician Rafael Prieto Curiel has devised a strategy for playing Guess Who?, based on an analysis of the notable features of each character, breaking the game down into 22 possible questions to ask your opponent.

Based on this data, he has even created a flowchart of questions to ask to maximize your chances of victory. The first question? “Does your person have a big mouth?”

Yes, not exactly a great first-date question, but one that yields the best possible starting point for you to narrow down your opponent’s character.
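For the curious, here is a minimal Python sketch of one common way to pick a good question: choose the trait that splits the remaining characters as close to half-and-half as possible. This is an assumption on my part about how such a strategy might work, not necessarily the method Curiel used, and the characters and traits below are invented for illustration rather than taken from the real game.

```python
# Hypothetical characters and traits, invented for illustration only.
# They are NOT the real Guess Who? roster or Curiel's data.
CHARACTERS = {
    "Ana": {"glasses", "hat"},
    "Bo": {"big mouth"},
    "Cy": {"big mouth", "hat"},
    "Dee": {"glasses", "big mouth"},
    "Eli": {"beard"},
    "Fay": {"beard"},
}


def best_question(candidates):
    """Return the trait whose yes/no answer splits the candidates most evenly."""
    traits = sorted(set().union(*candidates.values()))
    half = len(candidates) / 2

    def imbalance(trait):
        # How far the "yes" group is from an even split.
        yes = sum(1 for owned in candidates.values() if trait in owned)
        return abs(yes - half)

    return min(traits, key=imbalance)


def narrow(candidates, trait, answer):
    """Keep only the characters consistent with the opponent's answer."""
    return {
        name: owned
        for name, owned in candidates.items()
        if (trait in owned) == answer
    }


question = best_question(CHARACTERS)
remaining = narrow(CHARACTERS, question, True)
```

With this made-up roster, the even-split rule picks "big mouth" because exactly half of the six characters have one. Whatever the answer, half the field is gone after one question.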

It’s certainly better than my first instinct, which is always to ask, “Does your person look like a total goon?”

Now, when it comes to Hangman, the name of the game is letter frequency. Just like a round of Wheel of Fortune, you’re playing the odds at first to find some anchor letters to help you spell out the entire answer.

But, as it turns out, letter frequency is not the same across all word lengths. For instance, E is the most common letter in the English language, but it is NOT the most common letter in five-letter words. That honor belongs to the letter S.

In four-letter words, the most common letter is A, not E. And it can change, depending on the presence — or lack thereof — of other letters.
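If you want to check this sort of claim against your own word list, a rough Python sketch might look like the following. The word list here is tiny and made up, so it only demonstrates the counting, not the real statistics. I am also assuming that the useful measure for Hangman is how many words contain a letter at least once, rather than how many times the letter appears in total.

```python
from collections import Counter, defaultdict

# Tiny made-up word list for illustration; a real analysis would use a full dictionary.
WORDS = ["jazz", "area", "data", "cats", "seats", "stops", "sales", "shops"]


def letter_counts_by_length(words):
    """Map word length -> Counter of how many words of that length contain each letter."""
    counts = defaultdict(Counter)
    for word in words:
        # set() so a letter is counted once per word, however often it repeats.
        counts[len(word)].update(set(word.lower()))
    return counts


def most_common_letter(words, length):
    """Return the letter found in the most words of the given length, or None."""
    counts = letter_counts_by_length(words)[length]
    return counts.most_common(1)[0][0] if counts else None


four = most_common_letter(WORDS, 4)
five = most_common_letter(WORDS, 5)
```

On this toy list, the four-letter winner is A and the five-letter winner is S, which happens to echo the pattern described above. Swap in a real dictionary file to see how the numbers actually fall.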

From How to Win Games and Beat People by Tom Whipple:

“E might be the most common letter in six-letter words, and S the second most common, but what if you guess E and E is not in it?” In six-letter words without an E, S is no longer the next best letter to try. It is A.
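The same idea can be sketched in Python: after a miss, throw out every word containing the missed letter and recount what is left. The word list and the function name `next_best_letter` are my own hypothetical choices for illustration, not anything taken from Whipple's book.

```python
from collections import Counter

# Made-up six-letter words, chosen only to illustrate the recount.
WORDS = [
    "nested", "sender", "seeker", "better", "temper",
    "banana", "mammal", "casual", "ballad",
]


def next_best_letter(words, length, missed):
    """Most common letter among words of `length` that contain no missed letter."""
    missed = set(missed)
    pool = [w for w in words if len(w) == length and not missed & set(w)]
    # Count each letter once per word it appears in.
    counts = Counter(ch for w in pool for ch in set(w))
    return counts.most_common(1)[0][0] if counts else None


first_guess = next_best_letter(WORDS, 6, "")
after_missing_e = next_best_letter(WORDS, 6, "e")
```

In this toy list, E is the best opening guess, with S among the runners-up. Once E is ruled out, though, the surviving words all contain an A, so A jumps to the top, mirroring the situation in the quote.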

In fact, Facebook data scientist Nick Berry has created a chart with an optimal calling order based on the length of the blank word.

For one-letter words through four-letter words, start with A. For five-letter words, start with S. For six-letter words through twelve-letter words, start with E. And for words thirteen letters and above, start with I.

Of course, if you’re the one posing the word to be guessed, “jazz” is statistically the word least likely to be guessed, according to this data. And your opponent will surely hate you for choosing it.

