Applying Assessment Strategies to Popular Games

Posted: December 4th, 2017 under Filament Games, Opinion.

In a previous article I talked about assessment strategies in games. Throughout that article I discussed games in a very abstract sense using examples of Filament Games’ work. In this article I’m going to assign assessment strategies to some of the most popular games ever made.

Classifying Games
First, let’s take a moment to talk about how we classify games. Other media, like books and movies, are classified by the genre of their story: drama, comedy, horror, romance, etc. In video games, story is just one component; the differentiator is the interaction. Video games have their own genres based on perspective, number of players, and style of interaction: first/third person shooter, real time strategy, sports, racing, fighting, casual, puzzle, turn based, survival, role playing game, simulations, massively multiplayer role playing game, and multiplayer online battle arena. However, when we talk about games we typically talk about verbs. Commonly referred to as the mechanics of the game, verbs are the actions the player takes. The user having control over the experience is what makes games more interesting and compelling than other media.

The mechanics of the game are the most important factor in assessment. Story is still an important component in crafting games. The game story engages users, gives the world consistency, and makes the experience compelling. However, in terms of mechanics, and therefore assessment, it does not necessarily matter if a game is science fiction or medieval fantasy. The Legend of Zelda is a coming-of-age epic, comparable to Star Wars Episode IV (the first one made). Despite the fact that Star Wars is science fiction and Zelda is medieval fantasy, the actions the player would take in a game are very similar. For example, lightsabers are comparable to Link’s Master Sword, while the terms magic and “the Force” could be used interchangeably. Both of these tools are employed to restore peace and order to the land/universe. They are mechanically similar despite having different aesthetics. Story isn’t even a component of some games. The board game Chess does not have a theme, story, or characters. In terms of classifying games and understanding assessment, the focus is on the actions the player takes, the specific obstacles they encounter, and how they overcome those obstacles on their journey through the game.

I do not know of any complete list of game mechanics, most likely because the possibilities are effectively infinite. There are also many examples of games that break all preconceived conventions. Just remember the mechanic is the verb: the action that takes place. Popular examples of mechanics include rolling dice, playing cards, allocating resources, navigating an environment, building structures, shooting, or any other verb imaginable.

Assessment Challenges
When we think about assessment, it’s important to keep in mind that a videogame can only measure actions, not thoughts. This is why it is critical to think about verbs when considering assessment.

When I looked through lists of best selling games, I immediately realized it would be impossible to assign an assessment strategy to some of them. Take the game Minecraft for example. Minecraft is an open-ended experience where each player decides how to give the game meaning. The player determines how they want to play and defines their unique personal goals. It is impossible to write an assessment if we do not know the player’s objective. With self-defined goals there isn’t anything to measure against because too much of the game space exists in the player’s brain. Being unable to construct assessment is not the same as saying there is nothing to learn or that you cannot use that technology to teach. It simply means that it is hard for us to measure something that is intrinsically linked to the player’s individual experience. Minecraft is still a worthwhile teaching tool, but in order to be successful educators need to create specific content and provide external structure to the gameplay.

In addition to open-ended experiences, I struggled to find assessment strategies that worked well with fighting games. Any member of this category can serve as an example. Let’s take a look at the game Street Fighter. Street Fighter has one of the longest running competitive gaming followings and was instrumental in the growth of the eSports movement. The top players are well known, have written books and have been the subject of multiple documentaries. At the highest levels, these games revolve around predicting your opponent, bluffing, feints, baiting, positioning, eliminating your opponent’s options, and trying to design an unpredictable fighting strategy. It is not always best to make the optimal move if it becomes predictable. David “Low Strong” Sirlin was notorious for repeating the same move, or even doing nothing, frustrating players into acting first or preying upon the phenomenon of gambler’s fallacy. It’s difficult to assign an assessment strategy to fighting games because there’s no way for a line of code to determine a player’s intent. As we mentioned before, we can only assess what actually happens in the game. It is impractical to attempt an assessment of the breadth of human psychology.

Completion Assessment
Super Mario Bros, Mega Man, Sonic the Hedgehog, and Star Fox are all fundamentally the same in terms of assessment. Your primary goal is to reach the end of the level while avoiding obstacles. Occasionally there is a boss that you have to defeat and destroying minions along the way might give you a higher score, but it isn’t required. Even Star Fox has several areas that just test your flying skills. Since they are very linear experiences the best assessment mechanism is levels completed. You might also want to track score, obstacles avoided/hit, and resources gathered. I would avoid the temptation to track individual mechanics because mastery of these mechanics (jumping, powered-up shots) is required to complete the level. More important metrics to track would be attempts versus success and the point in a given level where players fail. Knowing this information helps better determine where players are struggling.
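To make the attempts-versus-success idea concrete, here is a minimal sketch in Python. The class name, method names, and the checkpoint labels are all hypothetical; it only illustrates the kind of bookkeeping described above, not how any of these games actually record progress.

```python
from collections import Counter, defaultdict


class CompletionTracker:
    """Hypothetical tracker: attempts, completions, and failure points per level."""

    def __init__(self):
        self.attempts = Counter()
        self.completions = Counter()
        # level -> Counter of the places where players failed
        self.failure_points = defaultdict(Counter)

    def record_attempt(self, level, completed, failed_at=None):
        """Log one play-through of a level and, on failure, where it ended."""
        self.attempts[level] += 1
        if completed:
            self.completions[level] += 1
        elif failed_at is not None:
            self.failure_points[level][failed_at] += 1

    def success_rate(self, level):
        """Completions divided by attempts (0.0 if the level was never tried)."""
        tries = self.attempts[level]
        return self.completions[level] / tries if tries else 0.0

    def hardest_spot(self, level):
        """The most common failure point in a level, or None if there is none."""
        spots = self.failure_points[level]
        return spots.most_common(1)[0][0] if spots else None
```

A low success rate tells you a level is hard; the most common failure point tells you where to look.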

Difficulty Assessment
The one style of assessment I struggled to find among popular titles was adaptive difficulty. While not popular, perhaps the best example is Sin Episodes. The game continually monitors your performance across more than a dozen metrics and adapts in several dimensions accordingly. The degree of difficulty presented by the game is directly correlated to the player’s skill level. As a game developer it is challenging to make a game anyone can play, continually hold players’ interest, and offer replayability. That is the value of adaptive systems commercially. From an educational perspective, adaptive difficulty offers a way to directly measure a player’s skill over time. Again, the hard work of measuring the player against the game mechanic is handled by the engineering and design of the game.
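As a rough illustration of how an adaptive system doubles as a measurement, here is a deliberately simple Python sketch. It is not how Sin Episodes works; it watches a single metric (recent success rate) and every name and default value in it is an assumption made for the example.

```python
from collections import deque


class AdaptiveDifficulty:
    """Hypothetical controller: nudges difficulty toward a target success rate."""

    def __init__(self, target=0.7, window=5, start=5, lowest=1, highest=10):
        self.target = target                 # desired share of successes
        self.recent = deque(maxlen=window)   # rolling record of recent outcomes
        self.level = start
        self.lowest = lowest
        self.highest = highest
        self.history = [start]               # difficulty over time

    def record(self, succeeded):
        """Log one challenge outcome and adjust the difficulty level."""
        self.recent.append(1 if succeeded else 0)
        rate = sum(self.recent) / len(self.recent)
        if rate > self.target:
            self.level = min(self.highest, self.level + 1)
        elif rate < self.target:
            self.level = max(self.lowest, self.level - 1)
        self.history.append(self.level)
        return self.level
```

The `history` list is the assessment: if the difficulty a player can sustain climbs over time, their skill is improving.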

Scaffolding Assessment
Among the popular games, the simplest form of assessment can be found in Guitar Hero. Guitar Hero measures how accurately you press the buttons associated with a particular song. It really is just a skill-based test. Assessment in this game could go beyond just your percentage score. This could become a great example of scaffolding. If we were designing Guitar Hero as a learning game, we could add features to break down particularly difficult riffs and create customized practice sessions based on player performance.
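Here is one way the customized practice idea could be sketched in Python. The function name, the section labels, and the 80% threshold are hypothetical; the point is only that per-section accuracy, rather than one overall percentage, tells us which riffs to scaffold.

```python
from collections import defaultdict


def practice_sections(notes, threshold=0.8):
    """Return song sections whose accuracy is below the threshold, worst first.

    notes is an iterable of (section_name, was_hit) pairs, one per note.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for section, was_hit in notes:
        totals[section] += 1
        if was_hit:
            hits[section] += 1
    accuracy = {section: hits[section] / totals[section] for section in totals}
    weak = [section for section in accuracy if accuracy[section] < threshold]
    return sorted(weak, key=lambda section: accuracy[section])
```

A learning version of the game could then slow those sections down or loop them until the player's accuracy rises above the threshold.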

Big Data/External Assessment
Let’s look at two of my favorite games, StarCraft II and Resident Evil 4. Like many games, these two are about resource management; however, they take it to an extreme. Just look at the data Blizzard provides at the end of a StarCraft match. There is a complete breakdown of your economy, build order, and unit counts. Individual units also keep kill counts that you can see in game and during replays. It is a system where the game provides you with every tool you could want to analyze your play. It literally records every action the player takes. Resident Evil is a linear third person shooter, so the efficiency calculation comes down to how effectively you process enemies. In a game with severely limited resources it’s important to not only be accurate, but use the best resource in your arsenal for the task at hand. Resident Evil doesn’t provide you with every metric, but it does give you some important ones. While these are very different kinds of games, they utilize the same evaluation process as trading pieces in Chess. In StarCraft and Resident Evil players must execute “moves”, though in the big picture you are trading resources for enemy units (units can be thought of as a proxy for resources). The best assessment metric in these games is analyzing how expensive it was to remove enemy units from the game.
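The trade metric can be sketched in a few lines of Python. The function name and the idea of putting both sides of a trade on a single resource scale are my assumptions for illustration; neither game exposes its data in this form.

```python
def trade_efficiency(exchanges):
    """Enemy value removed per unit of resource spent.

    exchanges is a list of (resources_spent, enemy_value_removed) pairs,
    with both numbers expressed on the same hypothetical resource scale.
    Returns None when nothing was spent, since the ratio is undefined.
    """
    spent = sum(cost for cost, _ in exchanges)
    removed = sum(value for _, value in exchanges)
    if spent == 0:
        return None
    return removed / spent
```

A result above 1.0 means the player came out ahead on the trades; below 1.0 means removing those enemies cost more than they were worth.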

Many times assessment and artificial intelligence go hand-in-hand. If at any moment of gameplay we can rank the value of verbs at the player’s disposal, we can write AI or assessment for that game. Historically, one of the biggest artificial intelligence challenges was Deep Blue. In an effort to beat a Chess champion player, IBM had to consult with Chess masters to understand how pieces work together and the value each piece holds. This was a challenge because the possible number of moves grows astronomically as you take turns. This is similar to the game Go and (while on a smaller scale) the video game Tetris. Writing AI for Tetris is a formidable challenge. Calculating the permutations and placement of pieces is significant work. If you watch masters play the game you see that it isn’t about clearing individual rows, but clearing four rows at once. In order to go for a high score you make an open column (known as a “gap” or “channel”) and use the long piece to acquire as many “four line clears” aka a “Tetris” as possible. The fact remains that if we can write AI for a game, we can compare an individual’s performance against optimal performance. After that we just need to compare the value of moves made and develop a score.
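That last step, comparing the value of the moves made against the best moves available, might look something like this in Python. It assumes we already have some evaluator (an AI, or a designer's table) that assigns a non-negative value to every available move each turn; the function name and data shape are hypothetical.

```python
def performance_score(turns):
    """Share of the best available value that the player actually captured.

    turns is a list of (move_values, chosen_move) pairs, where move_values
    maps every available move to a non-negative value from the evaluator.
    Returns a number from 0.0 to 1.0; 1.0 means every choice was optimal.
    """
    earned = 0.0
    best_possible = 0.0
    for move_values, chosen_move in turns:
        earned += move_values[chosen_move]
        best_possible += max(move_values.values())
    return earned / best_possible if best_possible else 0.0
```

The hard part is the evaluator, not the scoring. Once moves can be ranked, the score is simple arithmetic.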

Conclusion
There is a large divide between commercial games and educational games because verbs and goals tend to be very different. What is interesting is that most commercial games do include scaffolded material and significant tutorials at the beginning of their games. While learning is an intrinsic part of any successful or competitive game, educational games strive to be concise, targeted experiences with specific learning outcomes. When you think about your favorite games, how might you assess the “learning” you might do in those spaces? What is the measure of success, and how might that be applied to learning games of a similar nature? Sound off in the comments below!
