I don’t like rating games, video or board. A good review should encapsulate how you feel about a game without stamping a score at the bottom. Numeric ratings draw attention away from the writing, and have neither the subtlety nor the nuance to express wider ideas about the value of a game beyond its play, or the reviewer’s tilt.
But I don’t always have the pleasure of writing just as I’d like to, and many of the editors I’ve worked for want scores. Out of five, ten or, worst of all, a hundred. So I dutifully assign a number and try to move on. But I remain haunted by past scores. Is game X really two stars better than game Y? Was I really right to give game Z that score out of a sense of quality, even though I, personally, disliked it?
Because people set store by those numbers, and therefore so must I. In spite of my misgivings I try to get it right, and end up in ludicrous situations like giving a game a higher rating than I feel it entirely deserves, just because it’s marginally better than another game I previously scored slightly generously.
The most corrosive effect of scores, though, is what we’ve come to know as the weak seven effect. You play a game, enjoy it, but aren’t blown away by its quality. Perhaps you frequent one of the many online communities which allow you to rate your collection, and decide to give it a score. What do you go for? Seven.
It’s a weaselly number, is seven. The people who wrote the disquieting thriller Se7en clearly knew it, as did the antediluvians who came up with the seven deadly sins that inspired it. When it comes to rating games, eight looks like a solid score for a quality game, while six feels fairly negative. Seven sits in the middle like a spare part, uselessly indicating that the game is fun, but not quite fun enough.
It’s particularly pernicious when those ratings are compiled into an average. Averaging out forces down the score, because you’ll never get an average of ten and even nines are extremely rare. But because it’s still a rating out of ten, you don’t see that; you just compare the compiled rating with the maximum, ten, to get a sense of the game’s quality. Then all those sevens, which aren’t actually that high as individual ratings, suddenly add up to a much more appealing average.
And average is the right word. Because that’s really what we’re saying with those sevens: fun, but not quite fun enough. That’s average, really. But of course, out of ten, seven isn’t the average. The average should actually be five or six. And so we end up with over-inflated ratings, punters end up spending good money on games they rarely, if ever, play, and the world economy keeps on going round.
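The arithmetic behind that inflation is easy to sketch. Assuming a hypothetical pool of “weak seven” ratings (the numbers below are invented for illustration), a few lines of Python show how the compiled average lands well above the scale’s real midpoint, even though every rater meant “merely okay”:

```python
# Invented ratings for a "fun, but not quite fun enough" game:
# most raters reach for the weak seven, a few stray either side.
ratings = [7, 7, 7, 6, 7, 8, 7, 6, 7, 7]

average = sum(ratings) / len(ratings)  # the compiled score readers see
midpoint = (1 + 10) / 2                # the true middle of a 1-10 scale

print(f"compiled average: {average:.1f}/10")              # 6.9/10
print(f"scale midpoint:   {midpoint:.1f}/10")             # 5.5/10
print(f"inflation over midpoint: {average - midpoint:+.1f}")  # +1.4
```

Readers compare that 6.9 against the maximum of ten and read “pretty good”, when the raters collectively meant “dead average”, a point and a half below.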
The problem is that games are supposed to be fun. They’re supposed to be exciting, to give pleasure to the gamer. So you end up feeling overly positive about a game that’s actually fairly run of the mill, because it’s partially fulfilled its purpose. Think about it. Finding a game I actually dislike is pretty rare. Finding one I really hate is almost unheard of. I don’t think I’ve ever played one in which I can’t see a single redeeming feature that might appeal to another gamer.
As the design industry has got better, this problem has got worse. Designers learned from the awful train wrecks of the board gaming 70s and video gaming 80s not to repeat those mistakes. It’s pretty unusual nowadays to see something as dreadful as SimCity or Colonial Marines – indeed this column was partially inspired by the rarity of having two bad big-studio games back to back.
But although the quality of the majority of games has increased overall, to the point where unfun games have become a thankfully endangered breed, the top notes haven’t changed at all. In any given year there are still as many stellar titles, games that thrill you, demand repeat plays, stay in your head for months afterwards, as there were in the 70s and 80s.
Those are the games you want to pick out. Ideally as soon as they come out. But to get the proper perspective on their quality takes long hindsight, by which time you may have wasted your money on those horrible, bland, average sevens that have been pushing the baseline up and making themselves look better than they really are.
This is my fault. And I’m sorry.
Not mine alone, of course, but I must shoulder the blame. Along with every other critic who forgot that seven isn’t really the average out of ten. Along with every other reviewer who neglected to realise that good is the new average. Along with all the other journalists who needlessly pushed up a game’s score just for the sake of consistency.
The solution is relatively simple. If I have to rate games, I like a five-point scale best. It gives you the chance to differentiate the good from the average, and to make sure your readers properly understand what you thought about a game, in case you botched your review a little and failed to make that subtle distinction.
When I first started using the social media book-tracking site Goodreads, I was therefore pleased to see it used a five-point scale. But I was momentarily puzzled by the names. Three, the middle point, was “liked it”, which seems distinctly above average. Two, which looks below par, is “it was ok”. Confusing, until you remember that books are like games: generally only decent books make it through the publishing process, and really bad books are rare. Therefore, the median is actually good.
So from now on when I rate games, that’s the scale I’m going to use. I’ll just double it or quintuple it for higher ceilings. May I humbly suggest you try and do the same. Good is the new average. Long may it remain so.
This sounds a great deal like the scale Tom Chick uses for Quarter to Three. Based on how he described it to Ed Del Castillo on his Games podcast, a 3/5 means he liked it, 4/5 means he really liked it, and 5/5 means STACK_OVERFLOW_ERROR::game.like.Error.
There are equal problems in the opposite direction, where a 2/5 means he didn’t like a game very much, and a 1/5 variously means he was quite bored by a technically sound game (Halo 4), or that he was literally unable to even play the game and realised that, even if he could, none of the mechanics would function as advertised.
I enjoy reading your articles on No High Scores quite a lot and frequently wish you were able to participate in the podcast. This specific idea, however, is rated “parsnip” out of a total “zest.”
That’s harsh. Don’t I even get a “Fricassée”?
Here’s the thing: I like five-point scoring systems the best because they offer a clear and intuitive distinction between terrible, poor, mediocre, good, and excellent. That’s a pretty thorough range of ways a reviewer can relay his or her personal experience with a game, as Tom recommends, without getting bogged down in flowcharts explaining the differences between a 3.25 and 3.38. Simple!
No one needs to rejigger this scale to ensure an equal distribution of scores. Off the top of my head, SimCity, Assassin’s Creed III, The Cave, XCOM, and Bioshock Infinite would cover it neatly. Bad-to-middling games are more than common enough to preserve a Gaussian distribution without artificially deflating scores. Tom Chick only gets away with it because he was bitten on the hand by a radioactive standard deviation.
I’ll even put my money where my mouth is. I’ve got copies of Infinite Undiscovery, F.3.A.R, and Star Ocean: The First Departure right here. You can play them while listening to Ace of Base and sipping Zima with January Jones, and if that isn’t the most three-out-of-five Tuesday afternoon you’ve ever had, I’ll buy you a box of Corn Flakes.
I hate having review scores on games. I think we’d be better off without them. In fact, when I become supreme ruler of Earch, my first act as your benevolent dictator will be to ban review scores for all media. It’s for your own good, trust me.
And my second act as supreme ruler will be to hire an editor to proofread my forum posts for spelling errors…
Or change the planet’s name to Earch. You know, whatever you have time for.
I feel like 7/10 being average comes from the same vein as school grading scales, where a C (70-79%) is considered an average grade. While, yes, 50% might technically be the middle of those numbers, 70% or so is where you actually see the average, because that’s the general level of work being put out. Not great, but also not bad.
Same with video games. It’s not necessarily “fair” to give an average game a 5/10, because an average game is actually a decent one, as you’ve pointed out. It gets more right than it gets wrong, and the score should reflect that.
Of course, in a perfect world, none of this would matter and reviews and games would be considered more for their content, not their scores.
The issue you bring up about scores and trying to base them on past scores you’ve given is an issue I’ve had many times when grading assignments as a TA. Every time a student does something different wrong I spend the next twenty minutes determining how much should be taken off for that mistake compared to how much I took off for other mistakes earlier. Is this mistake worse? No? Can they be compared at all? Well, that does not matter because the assignment needs a score. I just want to tell them what they did wrong and move on…
I wish we lived in a world where I could just let the assignment stand on its own. They do not need a score, the score was not the point of the assignment, unfortunately that’s not the system we’ve created. Students will work for a certain score, rather than work to understand the material.
I feel like that is a huge issue in games. They work for a score, not to make a game that they find most enjoyable.
Actually, statistically the most common choices when picking a number between 1 and 10 are 3 and 7. This is believed to happen because we see these numbers as the middle ground, which is also why 7 is used as the “lucky” number. I also don’t think it’s as misleading as you seem to feel: when most people see a 7, I think they assume there’s a good chance it could fall on either side of the line, that it could be good or bad, and they tend to look more into it than if it were a higher or lower number.
I do agree, though, that scores tend to pull readers away from the writing and that they shouldn’t be used, but I think reviewers being paid off by companies to give higher scores is more of an issue than the number creep this causes.
Review scores are highly political: not just for publishers and game industry folks who use them very, very foolishly as a metric of quality and/or success, but also for readers who want some kind of effing diplomatic, non-committal, and “objective” review. So an “OK” game will get a 7 (when it really should be a 5), a good game will get an 80, and a GOTY candidate will get 90+. Anything below 80 is pretty much “bad” for some reason.
The ONLY review scores should be 0 and 100. Thumbs up, thumbs down. Either a game is good or it isn’t, and the text explains the degree.
These days though, folks are more likely to quote a Metacritic score than a review they’ve read. “Oh, this game got a 95, it’s good”. Not “I hear this game is really good because X, Y, Z”. The critical, analytical narrative has decayed, and the “games journos” that peddle this shit and are complicit with the industry’s destructive trends are writing to the score because they know 8 out of 10 readers scroll straight down to the score. Or, if it’s at the top, they stop right there.
I think my “scores mean nothing” moment came when I was writing for a British tabletop magazine…I gave Memoir ’44 8 out of 10 stars (so an 80). The editor changed it to 10 out of 10, telling me that it should be higher. I actually felt that I had given it one or even two stars too many. So WTF was this game’s “real” review score?
‘The ONLY review scores should be 0 and 100. Thumbs up, thumbs down. Either a game is good or it isn’t, and the text explains the degree.’
Sniff. It’s always a proud moment when you see ’em grow up. This was like listening to Cat’s in the Cradle.
You are absolutely right. This is why I don’t get reviews from places that aren’t NHS and QT3.