Representation in Crosswords: A Fresh Look

We live in a data-driven world these days. Everything is quantified, analyzed, charted, and graphed. Your social media use alone is an absolute treasure trove of data that tells businesses all sorts of information about your activities, spending habits, and more.

So it should come as no surprise to you that the world of crosswords is no different. In recent years, we have been able to analyze decades of crosswords like never before, drawing important conclusions and uncovering trends both intriguing and shocking.

Back in 2016, the data analysis of programmer Saul Pwanson and constructor Ben Tausig uncovered a pattern of unlikely repeated entries in the USA Today and Universal crosswords, both of which were then edited by Timothy Parker. Eventually, more than 65 puzzles were determined to feature “suspicious instances of repetition” with previously published puzzles in the New York Times and other outlets, with hundreds more showing some level of repetition.

This led to Parker’s removal from both the USA Today and Universal crosswords.

But the impact of data analysis in crosswords doesn’t stop there. In 2018, Erik Agard compiled stats on how often the work of female constructors appeared in the major crossword outlets across the first four months of that year. It was an eye-opening piece about gender disparity among published constructors, backed up by smart research.

And there has been a greater push for inclusion on the construction side of crosswords. Back in March, at the urging of constructor Rebecca Falcon, several outlets participated in Women’s March, a concentrated effort in the puzzle community to support, foster, and cultivate more minority voices in crosswords.

(It comes as no surprise that two of the voices encouraging female puzzle creators are Erik Agard and David Steinberg, both of whom stepped up massively in the wake of the Timothy Parker scandal and have been advocates for greater inclusiveness in crosswords.)

[The list of all of the female constructors involved in Universal’s Women’s March project.]

This does raise the question, however, of inclusiveness when it comes to cluing and crossword entries.

And that question has been tackled quite brilliantly by Michelle McGhee in an article for The Pudding.

Striving to “better understand who is being referenced in crossword puzzles,” McGhee made a strong point about the influence crosswords have as a reflection on society:

Crosswords tell us something about what we think is worth knowing. A puzzle that subtly promotes the idea that white men are the standard, the people everyone should know about, is a problem for all of us (yes, even the white men).

A less homogenous puzzle would be an opportunity for many solvers to expand their worldviews. But more importantly, if you’re a solver like me, it’s meaningful to see yourself and your experiences in the puzzle, especially if they are often unseen or underappreciated. When I see black women engineers, or powerful athletes, or queer couples centered in a puzzle, it makes me feel seen and significant. It’s a reminder that I can be the standard, not just the deviant.

And she put the data to work to prove her point.

Sampling tens of thousands of crosswords from Saul Pwanson’s puzzle database, she and her fellow researchers sorted people mentioned in crossword clues and used as crossword answers by race and gender according to US Census categories.

And their conclusion, sadly, was hardly unexpected:

We recognize that this is an imperfect method, but it does not change our finding: crossword puzzles are dominated by men of European descent, reserving little space for everyone else.

Not only did they chart the percentages of representation, but they also created charts illustrating the most commonly referenced people in crossword answers in the New York Times puzzle.

The goal? They wanted to quantify the concept of “common knowledge” in crosswords in the hopes of redefining it in a way that better reflects a true common knowledge, one that represents everyone.

I’m only scratching the surface of this article, which is a fascinating exploration of the history of crosswords, what they say about society, and what they COULD say about society. I encourage you wholeheartedly to read McGhee’s full piece at The Pudding.

It’s the sort of journalism, commentary, and data analysis that helps push a problematic aspect of crosswords into the spotlight and keep it there. Yes, there have been great steps forward for representation in crosswords, both within the puzzles and in the realm of constructors, but we can do better. We must do better.

And work by folks like Michelle McGhee and her graph-savvy data miners is a valuable part of the process.


Thanks for visiting PuzzleNation Blog today! Be sure to sign up for our newsletter to stay up-to-date on everything PuzzleNation!

You can also share your pictures with us on Instagram, friend us on Facebook, check us out on Twitter, Pinterest, and Tumblr, and explore the always-expanding library of PuzzleNation apps and games on our website!

Puzzle History: Codebreaking and the NSA, part 3

[Image courtesy of NSA’s official Twitter account.]

At the end of part 2 in our series, we left off during the early days of the NSA, as American cryptographers continued to labor under the shadow of the Black Friday change in Russian codes.

You may have noticed that part 2 got a little farther from puzzly topics than part 1, and there’s a reason for that. As the NSA evolved and grew, codebreaking was downplayed in favor of data acquisition. The reasons for this were twofold:

1. Context. You need to understand why given encrypted information is important in order to put it toward the best possible use. As Budiansky stated in part 1, “The top translators at Bletchley were intelligence officers first, who sifted myriad pieces to assemble an insightful whole.”

2. Russian surveillance and bugging continued to grow more clever and sophisticated, pushing attention away from codebreaking. After all, what good is breaking codes or developing new ones if they can just steal unencrypted intel firsthand by monitoring agents in the field?

Moving forward, the NSA would continue to pursue all manner of data mining, eventually leaving behind much of the codebreaking and analysis that originally formed the backbone of the organization. But that was in years to come. Cryptography was still a major player in NSA operations from the ’50s and onward.

[The progression of “secret” and “top secret” code words.
Image courtesy of NSA’s official Twitter account.]

In May 1956, NSA cryptanalytic veterans pushed a proposal titled “Recommendations for a Full-Scale Attack on the Russian High-Level Systems,” believing that specially designed computers from IBM could provide the key for cracking the impenetrable Russian cryptography wall. Some cryptographers believed that ever-increasing processor speeds would eventually outpace even sophisticated codes.

By 1960, the NSA had spent $100 million on computers and analytical tools.

The problem? The NSA was collecting so much information that its shrinking team of cryptanalysts couldn’t dream of processing even a tiny portion of it.

But the quest for data access would only grow more ambitious.

In the wake of Sputnik’s launch in October of 1957, US signals intelligence would go where no man had gone before. The satellite GRAB, launched alongside Transit II-A in June of 1960, was supposedly meant to study cosmic radiation. (GRAB stood for Galactic Radiation and Background.)

[Image courtesy of NSA’s official Twitter account.]

But it was actually intended to collect radar signals from two Soviet air-defense systems. This was the next step of ELINT, electronic intelligence work. (The younger brother of SIGINT.)

The NSA would later find a huge supporter in President Lyndon Johnson, as the president was heavily invested in SIGINT, ELINT, and any other INTs he could access. This did little to quell the intelligence-gathering rivalry growing between the CIA and NSA.

Of course, that’s not to say that the NSA ceased to do any worthwhile work in codebreaking. Far from it, actually.

During the Vietnam War, NSA analysts pored over North Vietnamese signals, trying to uncover how enemy pilots managed to scramble and respond so quickly to many of the US’s airstrikes conducted during Operation Rolling Thunder.

Careful analysis revealed an aberrant character (in Morse code) in messages that appeared in North Vietnamese transmissions before 90 percent of the Rolling Thunder airstrikes. By identifying when the enemy used that aberrant character, the analysts were able to warn US pilots whether they were heading toward a prepared enemy or an unsuspecting one during a given sortie.

Other NSA teams worked to protect US communications by playing the role of an enemy analyst. They would try to break US message encryptions and see how much they could learn from intercepted US signals. Identifying flaws in their own procedures — as well as members of the military who were cutting corners when it came to secured communications — helped to make US communications more secure.

[Image courtesy of NSA.gov.]

In 1979, Jack Gurin, the NSA’s Chief of Language Research, wrote an article in the NSA’s in-house publication Cryptolog, entitled “Let’s Not Forget Our Cryptologic Mission.” He believed much of the work done at the agency, and many of the people hired, had strayed from the organization’s core mission.

The continued push for data acquisition over codebreaking analysis in the NSA led to other organizations picking up the slack. The FBI used (and continues to use) codebreakers and forensic accountants when dealing with encrypted logs from criminal organizations covering up money laundering, embezzlement, and other illegal activities.

And groups outside the government also made impressive gains in the field of encryption, among them IBM’s Thomas J. Watson Research Center, the Center for International Security and Arms Control, and even graduate student programs at universities like MIT and Stanford.

For instance, cryptographer Whitfield Diffie developed the concept of the asymmetric cipher. Joichi Ito explains it well in Whiplash:

Unlike any previously known code, asymmetric ciphers do not require the sender and receiver to have the same key. Instead, the sender (Alice) gives her public key to Bob, and Bob uses it to encrypt a message to Alice. She decrypts it using her private key. It no longer matters if Eve (who’s eavesdropping on their conversation) also has Alice’s public key, because the only thing she’ll be able to do with it is encrypt a message that only Alice can read.

This would lead to a team at MIT developing RSA, a technique that implemented Diffie’s asymmetric cipher concept. (It’s worth noting that RSA encryption is still used to this day.)
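
To make the public/private split concrete, here is a toy sketch in Python of the textbook RSA arithmetic. The tiny primes, the exponent, and the function names are all assumptions chosen for readability. Real RSA relies on keys thousands of bits long plus padding schemes, so nothing below is secure.

```python
# Toy RSA with tiny primes: illustrative only, NOT secure.
# All names and numbers are assumptions picked for readability.

def make_keys(p: int, q: int, e: int):
    """Build a (public, private) key pair from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (requires Python 3.8+)
    return (e, n), (d, n)

def encrypt(message: int, public_key) -> int:
    """Anyone holding the public key can encrypt."""
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext: int, private_key) -> int:
    """Only the holder of the private key can decrypt."""
    d, n = private_key
    return pow(ciphertext, d, n)

# Alice publishes her public key and keeps her private key to herself.
alice_public, alice_private = make_keys(p=61, q=53, e=17)

# Bob encrypts the number 65 using only Alice's public key.
ciphertext = encrypt(65, alice_public)

# Alice recovers the message with her private key.
recovered = decrypt(ciphertext, alice_private)
print(ciphertext, recovered)
```

Eve can see both the public key and the ciphertext, but without the private exponent she has no shortcut for undoing the encryption, which is the point of the quoted passage.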

[Image courtesy of Campus Safety Magazine.com.]

The last big sea change in encryption came when the government and military realized they no longer had a monopoly on codebreaking technology. Increased reliance on computer programming and awareness of its importance, greater access to computers with impressive processing power, and a groundswell of support for privacy from prying government eyes led to dual arms races: encryption and acquisition.

And this brings us to the modern day. Edward Snowden’s leak of NSA information revealed the incredible depth of government data mining and acquisition, leading some pundits to claim that the NSA is “the only part of government that actually listens.”

Whatever your feelings on Snowden’s actions or government surveillance, there is no doubt that the National Security Agency has grown and changed a great deal since the days of cracking the ENIGMA code or working with the crew at Bletchley Park.

Where will American codebreaking go next? Who knows? Perhaps quantum computing will bring codes so complicated they’ll be impenetrable.

All I know is… it’s part of puzzle history.


I hope you enjoyed this multi-part series on the history of 20th-century codebreaking in America. If you’d like to learn more, you can check out some of the valuable sources I consulted while working on these posts:

Code Warriors: NSA’s Codebreakers and the Secret Intelligence War Against the Soviet Union by Stephen Budiansky

Whiplash: How to Survive Our Faster Future by Joichi Ito

The Secret Lives of Codebreakers by Sinclair McKay


Puzzle History: Codebreaking and the NSA, part 2

[Image courtesy of NSA’s official Twitter account.]

At the end of part 1 of our look at the history of the NSA and American codebreaking, we left off with the pivotal Black Friday event.

On November 1, 1948, all intel coming from monitored Soviet signals went quiet. All traffic on military, naval, and police radio links was replaced with dummy messages. It was such an unprecedented and alarming event that London and Washington briefly considered that it might’ve been the first indication of preparations for war.

According to Code Warriors author Stephen Budiansky:

The full extent of the disaster only became apparent the following spring when real traffic started reappearing on the radio nets, now employing greatly improved — and completely unbreakable — technical and security procedures. The keying errors or other mistakes that had allowed most of the Soviets’ machine-enciphered military traffic to be routinely read by US and British codebreakers for the last several years had been corrected, and the much more disciplined systems that now replaced them slammed the cryptanalytic door shut.

Even the one-time pads that had offered some hope to attentive American codebreakers were updated, eliminating the ability to sort messages by which organization they originated from.

Codemakers had suddenly outpaced codebreakers.

[The Kryptos sculpture outside CIA Headquarters. The NSA cracked
several of its codes before the CIA did. Image courtesy of Slate.com.]

The Office of Naval Intelligence wanted to take over from Signals Intelligence (SIGINT), demanding to see “everything” so they could do the job. They claimed SIGINT should limit their work to message translation, leaving interpretation to “the real experts.” This sort of territorial gamesmanship would continue to hamper government organizations for decades to come.

And that demand to see everything? That probably sounds familiar, in light of what Edward Snowden’s leaks revealed about government data collection and the PRISM program.

Black Friday was the start of all that, a shift from codecracking to the massive data collection and sifting operation that characterized the NSA for decades to come.

More amazingly, there was SO MUCH information collected during World War II that SIGINT was still poring over it all in 1949, decrypting what they could to reveal Soviet agents in the U.S. and England.

The fact that a high-ranking member of British Intelligence at the time, Kim Philby, was actually a Soviet double agent complicated things. After a decade under suspicion, Philby would flee to the Soviet Union in 1963, stunning many friends and colleagues who had believed in his innocence.

[The spy and defector, honored with a Soviet stamp.
Image courtesy of Britannica.com.]

Although the Russians had flummoxed SIGINT, other countries weren’t so lucky. The East German police continued to use ENIGMA codes as late as 1956. Many of the early successes in the Korean War were tied to important decryption and analysis work by SIGINT. Those successes slowed in July of 1951, when North Korea began mimicking Russia’s radio procedures, making it much harder to gain access to North Korean intel.

Finally, the chaotic scramble for control over signal-based data gathering and codebreaking between the government and the military resulted in the birth of the National Security Agency on November 4, 1952, by order of President Truman.

One of the first things the NSA did? Reclassify all sorts of material involving historical codebreaking, including books and papers dating back to the Civil War and even the American Revolution.

[The actual report that recommended the creation of the NSA.
Image courtesy of NSA’s official Twitter account.]

The creation of the NSA had finally, for a time at least, settled the issue of who was running the codebreaking and signals intelligence operation for the United States. And they were doing fine work refining the art of encryption, thanks to the work of minds like mathematician and cryptographer Claude Shannon.

One of Shannon’s insights was the inherent redundancy that is built into written language. Think of the rules of spelling, of syntax, of logical sentence progression. Those rules define the ways that letters are combined to form words (and those words form sentences, and those sentences form paragraphs, and so on).

The result? Well, if you know the end goal of the encoded string of characters is a functioning sentence in a given language, that helps narrow down the amount of possible information contained in that string. For instance, a short run of characters can’t be ANYTHING, because letter combinations like TD, ED, LY, OU, and ING are common, while combos like XR, QA, and BG are rare or impossible.

By programming codecracking computers to recognize some of these rules, analysts were developing the next generation of codebreakers.
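
As a rough sketch of how a program might lean on that redundancy, the Python below scores candidate decryptions by the share of adjacent letter pairs that are common in English. The short list of common pairs is a hand-picked assumption for illustration, not something drawn from Shannon’s work or NSA practice, and the function names are hypothetical.

```python
# Hypothetical illustration: rank candidate plaintexts by how
# "English-like" their adjacent letter pairs are.
# COMMON_PAIRS is a small hand-picked sample, not a real frequency table.
COMMON_PAIRS = {
    "TH", "HE", "IN", "ER", "AN", "RE", "ED", "ON", "ES", "ST",
    "EN", "AT", "TO", "NT", "HA", "ND", "OU", "EA", "NG", "LY",
}

def pair_score(text: str) -> float:
    """Fraction of adjacent letter pairs found in COMMON_PAIRS.

    Spaces and punctuation are dropped first, so pairs may span word breaks.
    """
    letters = [c for c in text.upper() if c.isalpha()]
    pairs = [a + b for a, b in zip(letters, letters[1:])]
    if not pairs:
        return 0.0
    return sum(p in COMMON_PAIRS for p in pairs) / len(pairs)

def most_plausible(candidates):
    """Return the candidate with the highest pair score."""
    return max(candidates, key=pair_score)

candidates = ["XRQA BGXR QABG", "THE ENEMY IS NEAR"]
best = most_plausible(candidates)
print(best)
```

A real system would use full frequency tables and longer letter runs, but even this crude filter separates gibberish from a plausible sentence.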

Unfortunately, the Russian line was holding. The NSA’s failure to read much, if any, Soviet encrypted traffic since Black Friday was obviously becoming more than just a temporary setback.

Something fundamental had changed in the nature of the Russian cryptographic systems, and in the eyes of some scientific experts called in to assess the situation, the NSA had failed to keep up with the times.


I hope you’re enjoying this look at the early days of America’s 20th-century codebreaking efforts. Part 3 will continue next week, with the sea change from active codebreaking to data mining, plus Vietnam, the space race, and more!


Better Gaming With Math and Statistics!

[Image courtesy of ThreeSixtyOne.gr.]

Statistical analysis is changing the world. The wealth of available data on the Internet these days, combined with our ever-increasing ability to comb through that data efficiently using computers, has spawned something of a golden age in data mining.

You don’t need to look any further than the discovery of Timothy Parker’s plagiaristic shenanigans for USA Today and Universal Uclick to see how impactful solid analysis can be.

But it’s also having an impact on how we play games. Statistical analysis is taking some of the mystery out of games you’d never expect, making players more efficient and capable than ever.

We discussed this previously with the game Monopoly — specifically how some spaces are far more likely to be landed on than others — and today, we’re looking at two more examples: Guess Who? and Hangman.

Guess Who? gives you a field of 24 possible characters, and you have to figure out which character your opponent has before she figures out the identity of your character. Usually, if you end up with a woman or someone with glasses, your odds of winning are low, because some aspects are simply less common than others.

But is there an optimal way to pare down the options? Absolutely.

Mathematician Rafael Prieto Curiel has devised a strategy for playing Guess Who?, based on an analysis of the notable features of each character, breaking it down into 22 possible questions to ask your opponent.

Based on this data, he has even created a flowchart of questions to ask to maximize your chances of victory. The first question? “Does your person have a big mouth?”

Yes, not exactly a great first-date question, but one that yields the best possible starting point for you to narrow down your opponent’s character.
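
One common way to think about a “best” question is that it splits the remaining characters as evenly as possible, so either answer eliminates about half the field. The Python below is a minimal sketch of that heuristic. The character names and traits are invented for illustration and are not Curiel’s data, and the halving rule is an assumption rather than a description of his exact method.

```python
# Hypothetical data: each character maps to the set of traits they have.
CHARACTERS = {
    "Ana": {"glasses", "hat"},
    "Ben": {"beard"},
    "Cleo": {"glasses"},
    "Dev": {"hat", "beard"},
    "Eli": set(),
    "Fay": {"glasses", "beard"},
}

def best_question(remaining):
    """Return the trait whose 'yes' group is closest to half of remaining.

    Ties go to the alphabetically first trait; returns None if no traits exist.
    """
    traits = sorted(set().union(*(CHARACTERS[name] for name in remaining)))
    if not traits:
        return None
    half = len(remaining) / 2

    def imbalance(trait):
        yes_count = sum(trait in CHARACTERS[name] for name in remaining)
        return abs(yes_count - half)

    return min(traits, key=imbalance)

def apply_answer(remaining, trait, answer_is_yes):
    """Keep only the characters consistent with the opponent's answer."""
    return [n for n in remaining if (trait in CHARACTERS[n]) == answer_is_yes]

everyone = list(CHARACTERS)
question = best_question(everyone)
still_possible = apply_answer(everyone, question, answer_is_yes=False)
print(question, still_possible)
```

Repeating `best_question` on whatever survives each answer produces a flowchart of the kind described above.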

It’s certainly better than my first instinct, which is always to ask, “Does your person look like a total goon?”

Now, when it comes to Hangman, the name of the game is letter frequency. Just like a round of Wheel of Fortune, you’re playing the odds at first to find some anchor letters to help you spell out the entire answer.

But, as it turns out, letter frequency is not the same across all word lengths. For instance, E is the most common letter in the English language, but it is NOT the most common letter in five-letter words. That honor belongs to the letter S.

In four-letter words, the most common letter is A, not E. And it can change, depending on the presence — or lack thereof — of other letters.

From How to Win Games and Beat People by Tom Whipple:

“E might be the most common letter in six-letter words, and S the second most common, but what if you guess E and E is not in it?” In six-letter words without an E, S is no longer the next best letter to try. It is A.
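
Claims like this can be checked by counting how many words of a given length contain each letter, then recounting after throwing out the words that contain a letter you’ve already ruled out. Here is a minimal Python sketch of that idea. The eight-word list is a made-up placeholder, deliberately chosen so the toy output echoes the quote. Meaningful numbers would need a full dictionary, and the function name is an assumption.

```python
from collections import Counter

# Placeholder word list (assumption); swap in a real dictionary for real results.
WORDS = ["tender", "bucket", "sender", "needle",
         "castle", "animal", "spiral", "radish"]

def letter_ranking(words, length, excluded=""):
    """Rank letters by how many words of the given length contain them.

    Words containing any letter in `excluded` are skipped, which models a
    wrong guess. Ties are broken alphabetically so the output is stable.
    """
    counts = Counter()
    for word in words:
        word = word.lower()
        if len(word) != length or any(ch in word for ch in excluded):
            continue
        counts.update(set(word))  # count each letter once per word
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

overall = letter_ranking(WORDS, 6)
after_missing_e = letter_ranking(WORDS, 6, excluded="e")
print(overall[0], after_missing_e[0])
```

The ranking after a miss can look quite different from the overall ranking, which is why the next-best guess changes.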

In fact, Facebook data scientist Nick Berry has created a chart with an optimal calling order based on the length of the blank word.

For one-letter words through four-letter words, start with A. For five-letter words, start with S. For six-letter words through twelve-letter words, use E. And for words thirteen letters and above, start with I.
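
The rule of thumb in the previous paragraph is simple enough to encode directly. This Python sketch covers only the opening guess as summarized here, not Berry’s full calling order, and the function name is an assumption.

```python
def first_guess(word_length: int) -> str:
    """Opening Hangman guess by word length, per the summary above."""
    if word_length < 1:
        raise ValueError("word_length must be at least 1")
    if word_length <= 4:
        return "A"
    if word_length == 5:
        return "S"
    if word_length <= 12:
        return "E"
    return "I"

# Opening guesses for a few sample lengths.
for length in (3, 5, 8, 14):
    print(length, first_guess(length))
```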

Of course, if you’re the one posing the word to be guessed, “jazz” is statistically the least likely word to be guessed using this data. And your opponent will surely hate you for choosing it.

