Saturday, June 24, 2017

Some Statistical Analyses of Hear the Screams of the Butterfly

I am currently reading Nabokov's Favorite Word Is Mauve by Ben Blatt, which analyzes literature using statistics. It's actually a pretty fun book, pointing out the linguistic DNA of individual writers and the linguistic differences between men and women, among other things.

I thought, for fun, I would do a few analyses on my own book, Hear the Screams of the Butterfly, and see what happens.

A great example is "he" vs. "she." In Blatt's analysis, men use "he" much more often than "she," while women balance the two, with a slight bias toward "she." How, then, do I stack up?

He = 611
She = 273
That makes "he" 69% of the two combined, a clear imbalance in favor of "he."

However, when I do "him/himself" vs. "her/herself":

Him/Himself = 75
Her/Herself = 306
That makes "him/himself" about 20% of the total (75 out of 381), a striking imbalance in favor of "her/herself."

Since Blatt only analyzes "he" vs. "she," this might suggest either a flaw in his methods or an idiosyncrasy in my own writing. If we combine the two analyses above, we get:

He/Him/Himself = 686
She/Her/Herself = 579
That brings us to a 54% "he/him/himself," meaning an almost perfect balance between the two.
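For anyone who wants to check the arithmetic, here is a minimal Python sketch of the percentages above. The counts are the ones reported in this post. The helper names share and count_words are hypothetical, and the simple tokenizer is an assumption, not the method Blatt or I actually used.

```python
import re
from collections import Counter


def count_words(text, words):
    """Count how often any of the given lowercase words appears in text."""
    # Simple tokenizer (an assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return sum(counts[w] for w in words)


def share(a, b):
    """Return a as a percentage of a + b."""
    return 100 * a / (a + b)


# Counts reported in this post.
he, she = 611, 273
him, her = 75, 306  # him/himself and her/herself

he_share = share(he, she)
him_share = share(him, her)
combined_share = share(he + him, she + her)

print(f"he vs. she: {he_share:.1f}%")
print(f"him/himself vs. her/herself: {him_share:.1f}%")
print(f"combined: {combined_share:.1f}%")
```

To count from the actual manuscript, something like count_words(text, {"he"}) would do, with text being the novella loaded as a string. One caveat with any plain word count: "her" also covers the possessive (as in "her book"), while "him" does not include "his," which may account for some of the him/her gap.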

In case you're wondering, 54% puts me on par with The Wonderful Wizard of Oz and O Pioneers!, if we accept that my he/him/himself is identical to Blatt's he-only listing. If not, my he-only percentage of 69 makes me more on par with The Great Gatsby, One Flew Over the Cuckoo's Nest, and A Passage to India.

Another thing Blatt analyzed was the use of -ly adverbs. I discovered that I used a total of 225 -ly adverbs in my novella of about 27,000 words. That gives us about 83 -ly adverbs per 10,000 words (the ratio Blatt uses), meaning I'm on par with Mark Twain (81) and Hemingway (80), and I fall between Hemingway's For Whom the Bell Tolls (75) and The Old Man and the Sea (92). Still, all three of us get our butts kicked by Faulkner at his best: As I Lay Dying (31) and The Sound and the Fury (42). All in all, though, I think I'm in pretty good company when it comes to adverb use.
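The per-10,000-words rate is easy to reproduce. Below is a rough Python sketch: per_10k does the scaling, and count_ly_words is a naive counter that treats any word ending in "ly" as an adverb unless it is in a small exclusion set. Both function names and the exclusion set are hypothetical, and the matching rule is only an assumption about how one might count; I don't know exactly how Blatt identifies adverbs.

```python
import re

# Words ending in "ly" that are not adverbs (illustrative, far from complete).
NOT_ADVERBS = frozenset({"family", "fly", "reply", "supply"})


def per_10k(count, total_words):
    """Scale a raw count to a rate per 10,000 words."""
    return count / total_words * 10_000


def count_ly_words(text, exclude=NOT_ADVERBS):
    """Naively count words ending in 'ly', skipping the excluded ones."""
    candidates = re.findall(r"\b[a-z]+ly\b", text.lower())
    return sum(1 for word in candidates if word not in exclude)


# Figures reported in this post: 225 -ly adverbs in about 27,000 words.
rate = per_10k(225, 27_000)
print(f"{rate:.0f} -ly adverbs per 10,000 words")
```

A real count would need a longer exclusion list, or a part-of-speech tagger, since plenty of "ly" words are nouns or adjectives.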

Many of the other analyses are too complex for me to do on my novella. Blatt also has an interesting set of words that men and women use differently depending on whether they are describing women or men, but I ended up using those words only once or twice, or not at all. In a novella ostensibly about love, the word "kissed," for example, comes up exactly once. But then, my novella is about unrequited love, so that may in fact preclude much kissing.

So from the perspective of gendered pronoun use, my novella is actually close to gender-neutral. And the low rate of -ly adverbs suggests I did a pretty good job of going over and over and over the novella before it was published. As I read more of Blatt's book, I'll do further analyses of my novella. Should be fun.

