Two Systems at War
9:57 AM
Posted by Reygahnci on Tuesday, June 10, 2008
So, guntir responded to my previous posts concerning the sad state of affairs in the rating system of WoW, and I started writing a rather lengthy reply, but I have since decided that the reply contains enough nuggets of information to stand on its own. So, I am simply making a post out of the discussion.
Essentially, the conclusion drawn from guntir's post is that the Elo system employed by Blizz for rating arena teams in WoW is flawed, and while neither of us could come up with a better system, we still think that improvements could be made. Strive for perfection with full understanding that such is impossible. Reach for the sun and all that.
I pose the situation in which a 1500 team (a true 1500 team, in blues and some budget pvp gear) beats a 1700 team (a true 1700 team in mostly purples and last season's shoulders/weapon). The Elo system states that the 1500 team is better than 1500 and should be compensated for their achievement, and not only that, but they also defeated a team 200 rating higher than them, so their change in rating should be larger than if they were to beat a comparably geared 1500 team. On the other side of the coin, the 1700 team should lose a lot because they were bested by a team of lesser standing. In this scenario, the Elo rating system works perfectly well, except that the 1700 team may be a bit angry and blame the loss on a counter-comp, or luck, or whatever. However, there are cases where the Elo system breaks down.
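To put rough numbers on the scenario above, here is a minimal sketch of the standard Elo update. The K-factor of 32 is an assumption on my part (I do not know what value Blizzard actually uses), and the function names are made up for illustration.

```python
K = 32  # assumed K-factor; not a confirmed Blizzard value


def expected_score(rating, opponent):
    """Standard Elo expected score (win probability) for `rating` vs `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def rating_change(rating, opponent, won, k=K):
    """Points gained (positive) or lost (negative) from a single match."""
    score = 1.0 if won else 0.0
    return k * (score - expected_score(rating, opponent))


# The 1500 team beats the 1700 team (the upset described above).
upset_gain = rating_change(1500, 1700, won=True)
# The same 1500 team beats another 1500 team instead.
even_gain = rating_change(1500, 1500, won=True)
# What the 1700 team loses in the upset.
favorite_loss = rating_change(1700, 1500, won=False)

print(round(upset_gain, 1), round(even_gain, 1), round(favorite_loss, 1))
```

Under these assumptions the upset is worth about 24 points against 16 for an even match, and the 1700 team loses exactly what the 1500 team gains.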
Imagine the same two teams, except the 1500 team is a rerolled 2k team with full main season gear. This is where the Elo system has trouble. It doesn't fail, but it certainly is being gamed. In my opinion, there has to be a separation between the match-making system and the rating exchange system. The Elo system relies on (rather, was constructed around) the assumption that a 1500 rating MEANS 1500 skill level (and gear level... in WoW... chess doesn't have gear obviously). Essentially, the system relies on wins and losses to determine a given team's skill level, so putting a 1500 label on a team that is quite obviously geared for the 1900-2000 range is a flaw in the system. If the rerolled team were treated as the 2000+ team they probably are, the 1700 team would instead face an opponent more suited to their own gear/skill level. In truth, some of the blame can be placed on the teams that reroll just to screw with people (or to sell points or rating). However, most, if not all, of the blame resides with the developers for not foreseeing such a situation when they originally implemented the system.
Okay, okay. That is quite enough slandering developers. I am a developer, and I do not like my faults brought into such a public light. So, let us discuss possible ways to improve the system such that we close off all our loose threads, shall we? There are a few flaws that need to be addressed before the system is passable in terms of fairness:
- Rerolled teams should start appropriate to their gear/skill level, rather than 1500.
- Point selling needs to be done away with.
- Rating selling needs to be done away with (in the situation where gaming the system is involved).
In essence, there are two main parts to the first problem: a 2v2 team is reset to 1500 after getting to 2k and buying all their main-season gear, and a 5v5 team is 2000+ and two members decide to start a 2v2 team at 1500. Of course, these are just two examples of all the ways to game the system, but the point is clear: there are a number of ways to get a 1500 rated team when it really should be rated higher from the get-go. To address this, we would need to assume that a player's skill usually has more to do with the player than with their team makeup and teammates, and therefore assign each player a single personal rating of their own.
Rather than having a personal rating for each type of team (2v2, 3v3, 5v5), we would state that if you have a 2200 5s team, and that is all you play at the moment, then your personal rating overall should be about 2200. So, when such a player forms a new 2s team, his personal rating is 2200 but his team rating is 1500, which triggers the gap clause that Blizzard put into place at the time of match making. Instead of facing a 2s team that is 1500, he would face (if his teammate was from his 5s team as well and comparably rated) a 2200 rated 2s team. If he won, his personal rating would change a little bit, but his team rating would change a lot. The idea here is that the player is the defining factor, not the team. While the team may play a heavy role in whether or not a player can win against another team (in the case of counter-comps), one would have to assume that a 2200 5s pally would do better than 1500 on a 2s team when teamed with almost any other class. Perhaps he wouldn't play at a 2200 level on that 2s team, but his results against other 2200 2s teams will determine that. In this way, we give players "the benefit of the doubt" that they will inevitably be a high rated team because of gear from another team, but past a certain point they still have to prove that they can win in order to maintain that stature. A loss in this system would shift the personal rating by a modest amount (about 15 points), while the team rating, still far below its opponents, would barely change. In this way, we would expect the two vastly different ratings (personal and team) to converge if this player only played the 2s team from then on; the idea being that they would converge to the player's actual 2s skill level, rather than the team dramatically out-gearing its first many opponents and getting to that point with relatively few losses.
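A rough sketch of how the two ratings might move, assuming both are updated with plain Elo against the opponent's rating and the same K-factor of 32. The K value, the `Ratings` class and the averaging of personal ratings for match making are all assumptions of this sketch, not anything Blizzard has published.

```python
from dataclasses import dataclass

K = 32  # assumed K-factor, shared by personal and team updates


def expected_score(rating, opponent):
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def matchmaking_rating(personal_ratings):
    """Assumed rule: match a team by the average of its members' personal ratings."""
    return sum(personal_ratings) / len(personal_ratings)


@dataclass
class Ratings:
    personal: float  # one rating per player, shared across brackets
    team: float      # rating of the freshly formed 2s team


def update(r, opponent, won):
    """Apply one match result to both ratings independently."""
    score = 1.0 if won else 0.0
    return Ratings(
        personal=r.personal + K * (score - expected_score(r.personal, opponent)),
        team=r.team + K * (score - expected_score(r.team, opponent)),
    )


# A 2200 5s player on a new 1500 2s team with an equally rated teammate.
player = Ratings(personal=2200, team=1500)
opponent = matchmaking_rating([2200, 2200])  # they are matched against 2200 teams

after_win = update(player, opponent, won=True)
after_loss = update(player, opponent, won=False)
print(after_win)
print(after_loss)
```

With these numbers a win moves the personal rating by 16 and the team rating by a little over 31, while a loss costs 16 personal points and less than one team point, which is the behaviour the paragraph describes.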
To do away with point selling, the system would be changed so that instead of getting points via the team's rating, one would get points via their personal rating, using the 2s points calculation. That is to say, it would not matter if a person was on a 2200 rated 5s team and a 1300 2s team; with an 1800 personal rating, they would earn 578 points (as the current system would pay that many points to a 2s team with an 1800 rating). This will, undoubtedly, upset many of the higher rated 5s teams as they are usually the first to fill their entire gear-set with the main season's rewards, but it will essentially level the playing field so that 2s teams are rewarded as well as 5s teams for good play, rather than only good play in the 5s bracket being rewarded. Under this system, a player gains points based on their ability to play well and earn a good personal rating, which makes selling spots on teams that much harder, since a player has to win a lot of games to get their personal rating up. It would also be harder for a 5s team to win after putting a green player into their midst, as they would still be facing teams at or around 2000 (if they were trying to sell points for 2000). On the other hand, if we wanted to keep the "5s gets more points than 2s" idea but keep point selling out of reach, then we could employ a system that awards points based on the player's personal rating using the calculation from their most played team. If they play their 5s team the most, then they will earn their personal rating's equivalent of the 5s team calculation under the current setting. Again, this will keep people from simply putting someone on their 5s team and trying to win with an anchor, considering that their own personal ratings will fall if they lose matches, and in the proposed system personal ratings matter more than in the current system, so it would "cost" them much more to lose.
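Both variants can be sketched in a few lines. The only number taken from this post is 578 points for an 1800 rating in 2s; the formula and bracket multipliers below are the approximation commonly quoted by players, and I am treating them as an assumption rather than as Blizzard's official calculation. The function names are hypothetical.

```python
import math

# Assumed bracket multipliers (community-quoted values, not from this post).
BRACKET_SCALE = {2: 0.76, 3: 0.88, 5: 1.00}


def base_points(rating):
    """Assumed weekly points curve before bracket scaling."""
    if rating <= 1500:
        return 0.22 * rating + 14
    return 1511.26 / (1 + 1639.28 * math.exp(-0.00412 * rating))


def points_from_personal(personal_rating, bracket=2):
    """Proposal 1: pay on personal rating, using the 2s calculation by default."""
    return base_points(personal_rating) * BRACKET_SCALE[bracket]


def most_played_bracket(games_by_bracket):
    """Proposal 2 helper: the bracket with the most games played."""
    return max(games_by_bracket, key=games_by_bracket.get)


# Player on a 2200 5s team and a 1300 2s team, with an 1800 personal rating.
flat_2s_payout = points_from_personal(1800)

# Variant that keeps "5s gets more points than 2s": use the most played bracket.
games = {2: 10, 5: 40}  # hypothetical games played this week
bracket = most_played_bracket(games)
scaled_payout = points_from_personal(1800, bracket)

print(round(flat_2s_payout), bracket, round(scaled_payout))
```

Under these assumed coefficients the 1800 personal rating pays about 578 points on the 2s calculation, matching the figure above, and more if the player's most played team is their 5s team.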
The last part of the problem is the rating sellers. Currently, if a rogue can get to 2000 and buy their gear, they can be teamed with almost any other player and win in the 2s bracket. This does not mean, necessarily, that rogues are too powerful, but it does mean that it is easy for a given player to sell a rating plateau to another player. A balance druid could team with a 2200 rogue in full main season gear and easily win matches all the way to 1850, where the balance druid will buy their weapon and either go on their way, or make a run for 2k to get the shoulders (obviously these numbers change in season 4, but the idea is still the same). In the system I described to combat problem one, the team would be formed at 1500, but the rogue would still have his 2200 personal rating, which would cause the balance druid (let's go extreme and say that the druid already has a 1700 personal rating) and the rogue to be matched as if they were a 1950 team. Likely, this team has played few (if any) matches together because the rogue is simply trying to turn a profit, so their cohesion and communication will probably be poor, which will likely lead to losses against teams of this caliber. As a result, the balance druid (who is just trying to get to 1850) is likely worse off going with the rogue than with another teammate around his skill level or a bit lower, and going with that teammate (the one around his skill level) is no big advantage, as that is what we expect the system to be: fair.
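The match-making side of the rogue and druid example is simple arithmetic. This sketch assumes the proposed system matches a team on the plain average of its members' personal ratings, which is my reading of "teamed as if they were 1950"; the function names are hypothetical.

```python
def opponents_under_current_system(team_rating):
    """Current behaviour as described above: a new team is matched on its team rating."""
    return team_rating


def opponents_under_proposed_system(personal_ratings):
    """Assumed proposal: match on the average of the members' personal ratings."""
    return sum(personal_ratings) / len(personal_ratings)


new_team_rating = 1500
rogue_personal = 2200
druid_personal = 1700

current = opponents_under_current_system(new_team_rating)
proposed = opponents_under_proposed_system([rogue_personal, druid_personal])

print(current, proposed)
```

So the carried druid would meet opposition around 1950 from the first game, instead of farming 1500 teams on the way up.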
There are a few ways to game this system, but they require so much from a player that it is hard to imagine anyone bothering. For instance, once a player has full main season gear, they could try to force their personal rating to plummet. If they actually got their personal rating to an absurdly low point, say 900, then they could team with someone at 1500 and get matched against teams weaker than they are. This would effectively still be rating selling, in that a rogue could completely destroy his personal rating in order to easily pump someone else to 1850 (or whatever) for their various gear. However, this would be so easily detected that we could declare it a violation of the ToS and a bannable offense, which would probably keep people from doing it. Additionally, this system still does not combat win-traders, but Blizzard is already finding that win-trading is easily detectable, and it is therefore a bannable offense as well. With those caveats, these changes would make the Elo system more accurately describe the skill level of the teams playing.
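As a sketch of why the tanking exploit should be easy to spot, assume (my assumption, not a known Blizzard mechanism) that the game keeps each player's peak personal rating and flags anyone who falls far below it. The 600 point threshold is an arbitrary illustrative value.

```python
def matchmaking_rating(personal_ratings):
    """Assumed rule: match a team on the average of its members' personal ratings."""
    return sum(personal_ratings) / len(personal_ratings)


def looks_like_tanking(peak_personal, current_personal, max_drop=600):
    """Flag a player whose personal rating has fallen suspiciously far from its peak.

    `max_drop` is a hypothetical threshold chosen only for this example.
    """
    return peak_personal - current_personal >= max_drop


# A fully geared rogue who peaked at 2200 and deliberately dropped to 900
# teams with a 1500 player.
matched_at = matchmaking_rating([900, 1500])
flagged = looks_like_tanking(peak_personal=2200, current_personal=900)

print(matched_at, flagged)
```

The pair is matched at 1200, well below the rogue's real level, but a 1300 point fall from his peak is exactly the kind of signal that would stand out.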