One of your early analyses was about popularity, word count, and ratings on AO3. What got you interested in this topic? What questions were you trying to answer?
I was curious what the proportions of the different ratings were. I actually expected, because I was reading more works in the mature and explicit category, that there were more explicit stories than non-explicit ones. So the very first thing I did was look at the actual numbers of fanworks with each rating on AO3.
I was playing around a lot with different types of visualizations, so I did this both as a bar graph, showing the number of fanworks, and as a pie chart, showing the percentages of the whole.
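The counting step described here (number of fanworks per rating for the bar graph, share of the whole for the pie chart) can be sketched roughly as follows. This is only an illustration: the `ratings` list is invented toy data, not AO3 figures, and `rating_breakdown` is a hypothetical helper name rather than anything used in the original analysis.

```python
from collections import Counter

# Invented toy data: one rating label per fanwork (NOT real AO3 numbers).
ratings = [
    "General", "Teen", "General", "Explicit",
    "Mature", "Teen", "General", "Not Rated",
]


def rating_breakdown(ratings):
    """Return {rating: (count, percent_of_total)} for a list of rating labels."""
    counts = Counter(ratings)
    total = sum(counts.values())
    return {rating: (n, 100.0 * n / total) for rating, n in counts.items()}


breakdown = rating_breakdown(ratings)

# Counts would feed a bar graph; percentages would feed a pie chart.
for rating, (count, pct) in sorted(breakdown.items(), key=lambda kv: -kv[1][0]):
    print(f"{rating:10s} {count:3d} {pct:5.1f}%")
```

The same dictionary serves both views, so the bar graph and the pie chart cannot drift out of sync with each other.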
As I looked at the results, I realized that the basic assumptions I started out with were incorrect. Gen and teen were a lot more common than the more explicit ratings.
I had a sense that maybe explicit works were not common, but maybe they were more popular, which may have been part of why I saw more of them. If I sorted by kudos or by hits in Sherlock, I saw more of the explicit and mature works at the top of the list. So I thought maybe the amount of attention that a fanwork gets is correlated with its rating. I was just starting to figure out how to find the typical number of kudos, hits, and other popularity metrics for a set of works, so I obtained the median number of hits and kudos for each of the ratings (the median is just the value in the middle if you sort the works by, e.g., kudos), and I looked at how popular each rating was.
As it turns out, hits and kudos generally follow a very similar pattern, but I didn’t know that at the time, so I looked at both.
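As a rough sketch of the median-per-rating comparison described above, the works can be grouped by rating and the middle value taken for each group. The records below are invented for illustration, and the tuple layout `(rating, kudos, hits)` and the helper `median_by_rating` are assumptions, not the format or code used in the original analysis.

```python
from collections import defaultdict
from statistics import median

# Invented toy records: (rating, kudos, hits). NOT real AO3 data.
works = [
    ("General", 10, 200),
    ("General", 30, 500),
    ("General", 20, 300),
    ("Explicit", 50, 900),
    ("Explicit", 70, 1500),
]

KUDOS, HITS = 1, 2  # positions of each metric inside a record


def median_by_rating(works, field_index):
    """Group works by rating and return the median of one metric per rating."""
    groups = defaultdict(list)
    for record in works:
        groups[record[0]].append(record[field_index])
    return {rating: median(values) for rating, values in groups.items()}


median_kudos = median_by_rating(works, KUDOS)
median_hits = median_by_rating(works, HITS)
print(median_kudos)
print(median_hits)
```

A median is used rather than a mean because a handful of extremely popular works would otherwise pull the figure for a whole rating upward.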
In doing so, I saw that higher ratings did indeed correspond to more kudos for the fanworks. So it looked like my hypothesis might be correct. But I also had another possible explanation. I remembered I had recently looked at word count and popularity and had found that longer fics were more popular. So it occurred to me that word count might be responsible here, too: maybe higher-rated fics are also usually longer, and that is what’s responsible for them being more popular.
I looked at the average word count for each rating, and sure enough, there was the same upward trend. That is not what I was expecting, actually, because I was thinking of PWP and shorter explicit fic; I had assumed there was a lot of that. There are certainly some of those, but it also turns out that there are a whole lot of drabbles and very short fanworks that are rated gen and teen (brief vignettes that are mostly about mood or a single moment), and that had never occurred to me.
So I saw this new graph, and I thought, how do I know whether the higher popularity that I’m seeing for the explicit fic is due to a preference for explicit fic or for long fic? So I tried to think of ways to separate those two factors. I realized that if I could look just at fanworks where the word count is approximately the same, I could remove the word count factor and see whether there was any remaining preference for explicit works. So I compared ratings for a few sets of short, medium, and long fic, each with about the same number of words.
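One way to sketch this "hold word count roughly constant" comparison is to keep only works that fall inside a few narrow word-count bands and then compare median kudos by rating within each band. The band boundaries, the records, and the names `BANDS`, `band_for`, and `median_kudos_by_band_and_rating` are all assumptions made up for this illustration; the original analysis may have chosen its sets differently.

```python
from collections import defaultdict
from statistics import median

# Hypothetical narrow word-count bands: (name, low inclusive, high exclusive).
BANDS = [
    ("short", 900, 1100),
    ("medium", 4500, 5500),
    ("long", 18000, 22000),
]

# Invented toy records: (rating, word_count, kudos). NOT real AO3 data.
works = [
    ("General", 1000, 12),
    ("General", 950, 8),
    ("Explicit", 1050, 20),
    ("Explicit", 5000, 40),
    ("General", 5200, 25),
    ("Teen", 30000, 99),     # outside every band, so it is ignored
    ("Explicit", 20000, 80),
]


def band_for(word_count):
    """Return the name of the band containing word_count, or None."""
    for name, low, high in BANDS:
        if low <= word_count < high:
            return name
    return None


def median_kudos_by_band_and_rating(works):
    """Median kudos for each (band, rating) pair, skipping works outside all bands."""
    groups = defaultdict(list)
    for rating, word_count, kudos in works:
        band = band_for(word_count)
        if band is not None:
            groups[(band, rating)].append(kudos)
    return {key: median(values) for key, values in groups.items()}


result = median_kudos_by_band_and_rating(works)
for (band, rating), value in sorted(result.items()):
    print(band, rating, value)
```

Within a single band the works are all about the same length, so any remaining gap between ratings is unlikely to be explained by word count alone.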
As you can see, most of the difference went away, especially for short fic. There’s no longer much of a difference between gen, and teen, and mature, but explicit is still substantially more popular. (There’s also Not Rated, but that’s a smaller bucket with a lot less data, and it’s harder to conclude anything about the content of that bucket.) So in fact explicit fics do get rewarded more!
It’s an interesting finding. I guess it boils down to this: fanfiction is not all porn, but people really want to read the porn.
Yeah, I think that’s the right way to put it. And I think it counters some assumptions by the mainstream media about what most fanfic is like.
I kept looking into variants of this question in part because there are a lot of presumptions about fanfic being mostly porn — and there were some academic writers and professional writers discussing fanfiction who wanted to cite this finding to show otherwise. (I think for the Sherlock Seattle presentation I finally had nicely boiled all of this down to two slides.) It took me a while to iterate through this and figure out how to explain it — the set of graphs we just went through was a very early attempt. I’m really proud of this post as an investigation, but I don’t think I presented it in the most intuitive way.
Nowadays, I try to make my posts have a little more narrative flow — here’s my question, here are the first graphs, and here’s what you can tell from this. I walk people through the steps — a bit like we just did here — and that seems to work better for explaining my thinking and my findings. So part of my process is learning over time what helps people understand what I’m researching. Over time I can see what makes sense to the reader and what doesn’t.