Breaking Down the Myth: Why Headline Analyzers Don’t Work
Does our data science reveal the real purpose behind these free analyzers online?
Here’s a fascinating guest post by one of my favorite Substack/Medium writers. Finn Tropy knows a lot about writing and even more about data. He is the master of FinnSights on Substack. This story is a data-driven deep dive into whether online headline analyzers actually work. Finn’s conclusion will make you happy.
I've used headline analyzers to check my headline candidates and select the highest-scoring ones for my stories. These analyzers offer convincing-looking numbers, fancy graphics, and guidelines that make intuitive sense, like “use power words” or “keep it under 60 characters.”
Some of these analyzers even provide lists of Power Words, Emotion Words, and Uncommon Words so that writers can “tweak and refine the headline based on suggestions until they find the best one for increased engagement.”
Given all these excellent marketing claims, you would expect a higher analyzer score to correlate with higher engagement for the story. That assumption turned out to be a huge mistake. Please don't waste your time like I did.
I decided to test a headline analyzer using 18,758 headlines from the top 30 Medium.com writers with known engagement metrics. What I found changed my perspective on these analyzers.
Science-based testing method
We formulate a hypothesis and use basic statistical testing to find the truth.
The hypothesis
I categorized the stories by the Z-score of log(claps) to examine whether the average clap counts are influenced by the analyzer scores. A Z-score tells me how far a number is from the average of a group of numbers, measured in standard deviations. To be rigorous, I framed this as a null hypothesis (the analyzer score has no relationship with claps) and an alternative hypothesis (higher scores go together with more claps).
Let’s review the data
I plotted two histograms and determined that most stories have less than 1,000 claps and the distribution of log(claps) resembles a bell curve.
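A minimal version of this step might look like the sketch below. It assumes the stories sit in a CSV file with a claps column; the file and column names are placeholders, not the original code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumes one row per story and a numeric `claps` column (hypothetical file name).
df = pd.read_csv("medium_stories.csv")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Raw claps: heavily right-skewed, most stories under 1,000 claps.
ax1.hist(df["claps"], bins=50)
ax1.set_xlabel("claps")
ax1.set_ylabel("number of stories")

# log(claps): roughly bell-shaped, which is why the Z-score works on this scale.
ax2.hist(np.log(df["claps"] + 1), bins=50)  # +1 guards against log(0)
ax2.set_xlabel("log(claps)")

plt.tight_layout()
plt.show()
```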
Then I ran all 18,758 headlines through the headline analyzer using a simple Python script and saved the analyzer scores. The run took several hours, but it gave me a score for every story, so I could compare the actual claps against what the analyzer's rating of headline quality would predict.
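The script itself isn't shown here, but a batch run could look roughly like this sketch. The endpoint URL, request payload, response field, and file names are all placeholders, not the real analyzer's API.

```python
import time
import requests
import pandas as pd

ANALYZER_URL = "https://example.com/api/analyze"  # hypothetical endpoint

def score_headline(headline: str) -> float:
    """Send one headline to the analyzer and return its score."""
    response = requests.post(ANALYZER_URL, json={"headline": headline}, timeout=30)
    response.raise_for_status()
    return response.json()["score"]  # assumed response field

df = pd.read_csv("medium_stories.csv")  # hypothetical file with a `headline` column
scores = []
for headline in df["headline"]:
    scores.append(score_headline(headline))
    time.sleep(1)  # be polite; ~18,758 requests take hours at this pace

df["analyzer_score"] = scores
df.to_csv("stories_with_scores.csv", index=False)
```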
Next, we organize the data into buckets based on each story's claps value (on Medium.com, claps are akin to likes): each story is assigned to a bucket by rounding the Z-score of its log(claps) to the closest integer, and we count the stories in each bucket.
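In code, the bucketing step might look like this, again assuming the hypothetical file and column names from the earlier sketches:

```python
import numpy as np
import pandas as pd

df = pd.read_csv("stories_with_scores.csv")  # hypothetical file from the previous step

# Z-score of log(claps): how many standard deviations a story is from the average.
log_claps = np.log(df["claps"] + 1)
df["z_score"] = (log_claps - log_claps.mean()) / log_claps.std()

# Bucket each story by rounding its Z-score to the nearest integer.
df["z_bucket"] = df["z_score"].round().astype(int)

print(df["z_bucket"].value_counts().sort_index())  # number of stories per bucket
```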
Perform the test
Let’s group the stories into the same Z-score buckets and compute the mean (average) claps and mean analyzer score for each bucket. We can already see the big problem: the analyzer score stays surprisingly close to its overall mean of 62.97, regardless of how many claps the stories in a bucket have received. There is no significant relationship between mean claps and mean scores.
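A per-bucket summary like this can be produced with a simple groupby; the file and column names are the same assumptions as before:

```python
import numpy as np
import pandas as pd

df = pd.read_csv("stories_with_scores.csv")  # hypothetical file
log_claps = np.log(df["claps"] + 1)
df["z_bucket"] = ((log_claps - log_claps.mean()) / log_claps.std()).round().astype(int)

summary = df.groupby("z_bucket").agg(
    mean_claps=("claps", "mean"),
    mean_score=("analyzer_score", "mean"),
    stories=("claps", "size"),
)
# The mean analyzer score hovers near its overall average in every bucket,
# whether the bucket averages a handful of claps or tens of thousands.
print(summary)
```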
See the "Analyzer Score vs Claps - Scatter Plot by Z-score" graph below. That has all 18,758 data points plotted - vertical axis has the analyzer score and horizontal axis has the claps from each story.
If there would be some correlation, we could see a clear pattern of 45-degree line from bottom left to top right corner. Instead, we see a randomly distributed dots everywhere.
The colors represent how many standard deviations the respective claps value is from the mean (0.0 is purple color).
So, the hypothesis, “The mean values of claps depend on the mean analyzer scores across different Z-score groups,” is FALSE.
If you want more detail, I created a scatter plot with the analyzer score on the vertical axis and the claps on the logarithmic horizontal axis. To make different groups more visible, I color-coded the claps by Z-score. On the left, the blue dots indicate fewer than four claps; on the right, the grey dots indicate more than 24,000 claps. If the analyzer score correlates with engagement (claps), we should see high score values (over 80) only in the top right corner and low score values (less than 30) in the bottom left corner. However, the analyzer scores are distributed randomly across all clap values.
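A scatter plot along these lines can be drawn with matplotlib; the file and column names are the same placeholders used above:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("stories_with_scores.csv")  # hypothetical file
df = df[df["claps"] > 0]                     # drop zero-clap stories so the log scale behaves
log_claps = np.log(df["claps"])
df["z_bucket"] = ((log_claps - log_claps.mean()) / log_claps.std()).round().astype(int)

plt.figure(figsize=(8, 5))
scatter = plt.scatter(df["claps"], df["analyzer_score"],
                      c=df["z_bucket"], cmap="viridis", s=5, alpha=0.5)
plt.xscale("log")  # claps span several orders of magnitude
plt.xlabel("claps (log scale)")
plt.ylabel("analyzer score")
plt.colorbar(scatter, label="Z-score bucket")
plt.title("Analyzer Score vs Claps - Scatter Plot by Z-score")
plt.show()
```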
The "shocking" results
At this point, I had spent many evenings that week trying to prove myself wrong, because I firmly believed that these headline analyzers worked as advertised. I plotted correlations, ran ANOVA and Kruskal-Wallis tests (see the sketch after this list), and looked into the other data that the headline analyzer provided, such as
common word percentage
power word percentage
emotional word percentage
sentiment
length in characters
word count
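For readers who want to run the same kind of check, here is a rough sketch of how ANOVA and Kruskal-Wallis tests could be applied with scipy. It assumes the hypothetical stories_with_scores.csv file and column names used earlier, not the exact code behind the analysis.

```python
import numpy as np
import pandas as pd
from scipy import stats

df = pd.read_csv("stories_with_scores.csv")  # hypothetical file
log_claps = np.log(df["claps"] + 1)
df["z_bucket"] = ((log_claps - log_claps.mean()) / log_claps.std()).round().astype(int)

# Collect the analyzer scores for each Z-score bucket.
groups = [g["analyzer_score"].values for _, g in df.groupby("z_bucket")]

# One-way ANOVA: do the bucket means differ?
f_stat, p_anova = stats.f_oneway(*groups)

# Kruskal-Wallis: the rank-based analogue, with no normality assumption.
h_stat, p_kw = stats.kruskal(*groups)

print(f"ANOVA:          F = {f_stat:.2f}, p = {p_anova:.3f}")
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.3f}")
```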
The best I could coax out were weak correlations between claps and some of the other data, such as the word count and the number of characters in the title. I went back to the original dataset of 18,758 stories I had extracted from Medium to check whether I was missing something fundamental. Finally, I ditched the analyzer and tried random numbers as scores. Guess what? The random numbers outperformed the analyzer scores with a slightly stronger correlation!
Yep, you’d be better off rolling dice than wasting time on this headline analyzer.
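The random-baseline comparison can be sketched like this, again with the assumed file and column names. The point is simply that a uniformly random "score" does no worse than the analyzer.

```python
import numpy as np
import pandas as pd
from scipy import stats

df = pd.read_csv("stories_with_scores.csv")  # hypothetical file

rng = np.random.default_rng(42)
random_scores = rng.uniform(0, 100, size=len(df))  # the "rolling dice" baseline

rho_analyzer, _ = stats.spearmanr(df["analyzer_score"], df["claps"])
rho_random, _ = stats.spearmanr(random_scores, df["claps"])

print(f"Spearman correlation with claps - analyzer score: {rho_analyzer:.3f}")
print(f"Spearman correlation with claps - random score:   {rho_random:.3f}")
```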
After all this testing, I am convinced that the analyzer score has nothing to do with story success and engagement. These marketing slogans are misleading.
What is the real purpose of these free headline analyzers?
So why would web developers go to the trouble of building this kind of headline analyzer? I did a bit of research and found that the “headline analyzer” search term surfaced around 2013, and interest in it has grown significantly worldwide ever since.
It sure looks to me like the headline analyzer is a clever marketing ploy to drive traffic to a site at minimal cost. Building a beautiful infographics page around a random number generator shouldn’t take longer than a day.
I’m pretty sure that a small fraction of users will convert to paying customers, or the website owner will monetize the traffic through advertising or affiliate links. That is why you can find so many of these free headline analyzers. You are the product, and they were created to monetize your attention.
Conclusion
Headlines are essential, and you must create a title that will draw readers’ attention to your story. These free headline analyzers were created to monetize your attention. You may find some useful “power words” or “emotional words”, but don’t trust the score.
You are better off throwing a pair of dice or using a random number generator and calling the result your score. It will be just as meaningful as the analyzer’s.
Finn Tropy is a storyteller, note-taker, and engineer passionate about weaving narratives and sharing data-driven insights. Blending creativity and technical expertise, he inspires growth through practical wisdom and actionable advice drawn from decades of leadership and mentoring. After retiring twice, Finn now enjoys a serene life in New England with his wife, daughter, and dog, Luke, while crafting stories, analyzing data, and building digital tools that empower others.
For a definitive guide on writing titles, read Writing Titles that Demand Attention.
WOW! "A clever trick to Monetize your attention." This was mind blowing
Wow, thank you. I have never used the analyzers having felt suspicious about their value all along, but your research allows me to put the niggling feeling I should be using them to rest.