Congress proposal 2021-033 Develop a Bespoke Rating System for ICCF – [Conflicts with proposal 2021-031]

Congress proposal 2021-033, Develop a Bespoke Rating System for ICCF, as posted on the ICCF website in the congress proposal area.

Proposed by Austin Lockwood, ICCF Services Director

Abstract

A plan to develop a new, bespoke, rating system, suitable for correspondence chess, is proposed.

The pattern of results in modern correspondence chess has changed significantly since the current Elo rating system was introduced to ICCF. The rise of powerful engines has produced a strong tendency towards drawn games, particularly at the elite level of play.

The proposers of Proposal 2021-031 are quite correct to note that the current rating system is no longer fit for purpose. It no longer meets one of the fundamental assumptions of a probabilistic rating system (that the difference in ratings is a predictor of game outcome). The proposers are also correct to note that there has been some deflation at the extreme right-hand side of the distribution, which has made it increasingly difficult for strong players to reach GM standard.
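To illustrate the assumption in question, the following is a minimal Python sketch of the standard Elo logistic curve, in which the rating difference alone determines the expected score. The function name and the scale of 400 are assumptions for illustration; this is not presented as ICCF's exact formula.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score for player A (win = 1, draw = 0.5, loss = 0).

    Standard Elo logistic curve; the scale of 400 is the conventional
    value and is an assumption here, not a confirmed ICCF parameter.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# The two players' expected scores are complementary: they sum to one.
favourite = elo_expected_score(2500, 2300)
underdog = elo_expected_score(2300, 2500)
total = favourite + underdog
```

Under this curve a 200-point favourite is expected to score about 0.76 per game. If such pairings are in practice almost always drawn, the observed score sits much closer to 0.5, and the rating difference stops being a useful predictor.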

Unfortunately, Proposal 2021-031 was not subject to expert review until July 2021. We were able to recommend corrections to the more obvious errors (for example, the sum of winning expectancies not equalling one) and, to their credit, the proposers responded to this feedback and made some corrections. However, some serious methodological flaws remain in the proposal, and these would take considerable time to fix. Proposal 2021-031 also fails to address the general problem of rating inflation, which has been an issue for some years; indeed, were we to implement that proposal, inflation would accelerate and, if not corrected, would have a profound effect on the relative values of future ICCF titles.

We therefore propose an ambitious project to develop a bespoke rating system for correspondence chess based on both the requirements of our players and on sound professional consultation.

 

Proposal

ICCF will develop a new system which is suitable for generating correspondence chess ratings for the foreseeable future, considering future engine development and the strong possibility of the draw rate increasing further at the highest level.

It is envisaged that the features of this system will be:

  • The system will be tuned by optimising all parameters to “best fit” previous results so that the difference in ratings of two players best predicts the outcome of a game between those players.
  • The system will have a stable mean and interquartile range, suppressing inflation and deflation and ensuring that the relative strength of a player achieving a past GM norm is comparable to that of a player achieving a future norm. We do not wish to devalue our titles through excessive inflation.
  • A process for monitoring the changing pattern of game outcomes will be included, so future tuning in response to a future increase (or a decrease) in the draw rate will be baked into the system.
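The stability and monitoring features listed above could be checked along the following lines. This is a hedged sketch only: the function names are hypothetical, the rating lists are invented for illustration, and the choice of the "inclusive" quartile method is an assumption.

```python
from statistics import mean, quantiles


def summarise_rating_list(ratings):
    """Return the mean and interquartile range of one rating list."""
    q1, _median, q3 = quantiles(ratings, n=4, method="inclusive")
    return {"mean": mean(ratings), "iqr": q3 - q1}


def drift(earlier, later):
    """Shift in mean and IQR between two rating lists.

    A positive mean shift suggests inflation, a negative one deflation.
    """
    a, b = summarise_rating_list(earlier), summarise_rating_list(later)
    return {"mean_shift": b["mean"] - a["mean"], "iqr_shift": b["iqr"] - a["iqr"]}


def draw_rate(scores):
    """Fraction of games drawn, with scores given as 1, 0.5 or 0."""
    scores = list(scores)
    return sum(1 for s in scores if s == 0.5) / len(scores)


# Invented data, for illustration only.
list_2011 = [2200, 2300, 2400, 2500, 2600]
list_2021 = [2250, 2350, 2450, 2550, 2650]
change = drift(list_2011, list_2021)
```

In this invented example every rating rises by 50 points, so the mean shifts by 50 while the interquartile range is unchanged. A monitoring protocol might flag such a shift once it exceeds an agreed threshold.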

 

Phase One: Establishing Requirements

An important feature of a rating system is that it meets the needs and expectations of our players. In addition to the three requirements above, we will conduct an extensive consultation with the player base, in the form of an online “Delphi” study. Delphi methodology can be used to reach agreement between players about the most important features of a CC rating system. The new system will be designed with the expectations of players in mind.

Phase Two: Theoretical Structure

No prior assumptions are made by this proposal about the eventual theoretical structure of the new rating system; it will most likely be based on the current Elo system, or an advanced probabilistic system, for example Glicko, but most importantly it will be based on the requirements of our players and consultation with a professional expert in statistics on how the system can best meet those requirements.

Phase Three: Modelling and Tuning

As part of the Services Committee review of Proposal 2021-031, we built a digital “ratings test bench”; this allows us to investigate the effects of real game results on any rating system. Our methodology is to use results from 2006 to 2011 to “burn in” hypothetical ratings for all players, and then to use the system to generate rating lists for all periods from 2011 to the present day.
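The burn-in approach can be sketched roughly as follows. This is not the Services Committee's test bench: the game tuple layout, the starting rating of 2000, the K-factor of 10 and the per-game Elo update are all assumptions made for illustration.

```python
from collections import defaultdict


def expected(rating_a, rating_b, scale=400.0):
    """Standard Elo expected score for player A."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def replay(games, burn_in_until, k=10.0, start=2000.0):
    """Replay games in date order and return rating lists after burn-in.

    games: iterable of (year, player_a, player_b, score_a), score_a in {1, 0.5, 0}.
    Games up to and including burn_in_until only settle the ratings;
    later years each produce a rating list (hypothetical layout).
    """
    ratings = defaultdict(lambda: start)
    rating_lists = {}
    for year, a, b, score_a in sorted(games, key=lambda g: g[0]):
        delta = k * (score_a - expected(ratings[a], ratings[b]))
        ratings[a] += delta
        ratings[b] -= delta
        if year > burn_in_until:
            # Overwritten per game, so the final snapshot is the year-end list.
            rating_lists[year] = dict(ratings)
    return rating_lists


# Invented games, for illustration only.
games = [(2008, "A", "B", 1.0), (2012, "A", "B", 0.5)]
lists = replay(games, burn_in_until=2011)
```

Swapping the `expected` function or the update rule for another candidate system would let the same replay loop produce comparable rating lists for each design.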

Using this modelling, we can examine several features of any hypothetical system:

  • We can plot the overall distribution of player ratings at multiple time points, and check for inflation or deflation.
  • We can plot a single player’s rating development over time.
  • We can examine the extent to which a player’s rating predicts game outcomes.
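As one possible form of the third check, games can be grouped by rating difference and the predicted score compared with the observed score in each group. The sketch below assumes the standard Elo curve with a scale of 400; the function names, bucket width and sample games are hypothetical.

```python
from collections import defaultdict


def expected_score(diff, scale=400.0):
    """Standard Elo expected score for a player rated `diff` points higher."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def calibration_table(games, bucket_width=100):
    """Compare predicted and observed scores per rating-difference bucket.

    games: iterable of (rating_diff, score), both from the same player's
    point of view. Returns {bucket_lower_bound: (n_games, predicted, observed)}.
    """
    buckets = defaultdict(list)
    for diff, score in games:
        lower = int(diff // bucket_width) * bucket_width
        buckets[lower].append((diff, score))
    table = {}
    for lower, rows in sorted(buckets.items()):
        predicted = sum(expected_score(d) for d, _ in rows) / len(rows)
        observed = sum(s for _, s in rows) / len(rows)
        table[lower] = (len(rows), predicted, observed)
    return table


# Invented games, for illustration only.
sample = [(50, 0.5), (60, 0.5), (150, 0.5), (180, 1.0)]
table = calibration_table(sample)
```

A well-calibrated system would show predicted and observed values close together in every bucket; a persistent gap in the higher buckets would indicate that rating differences overstate the favourite's chances.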

Because we can generate ten years of rating data, based on over half a million games, almost instantly, we can easily adjust the parameters of the system to best fit the real-world data. This will allow us to create a bespoke rating system for ICCF which has been finely tuned to meet the requirements of correspondence chess players.
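Parameter tuning of this kind might, in its simplest form, look like the grid search below. It is a deliberately reduced sketch: it holds the ratings fixed and tunes only a single hypothetical parameter (the scale of the Elo curve) against invented results, whereas a real tuning run would regenerate the ratings for every candidate setting.

```python
def expected_score(diff, scale):
    """Elo-style expected score for a player rated `diff` points higher."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def mean_squared_error(games, scale):
    """Average squared gap between actual and expected scores."""
    return sum(
        (score - expected_score(diff, scale)) ** 2 for diff, score in games
    ) / len(games)


def best_scale(games, candidates):
    """Return the candidate scale that best fits the observed results."""
    return min(candidates, key=lambda s: mean_squared_error(games, s))


# Invented results: a 200-point favourite draws nine games and wins one.
hypothetical_games = [(200, 0.5)] * 9 + [(200, 1.0)]
chosen = best_scale(hypothetical_games, [200, 400, 800, 1600])
```

With these draw-heavy invented results the search selects the largest candidate, 1600, because a flatter curve predicts scores closer to the observed 0.55. Mean squared error is only one possible fit criterion, and its use here is an assumption.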

Phase Four: Implementation Planning and Proposal for 2022 Congress

Once the details of the new rating formulae have been finalised, we will work with members of the ICCF Services Committee and our software development consultant to prepare a full technical specification for server development. This specification will include estimates of implementation costs.

Once the technical specification has been completed, we will prepare a detailed proposal for delegates to consider at the 2022 ICCF Congress; this proposal will include a summary of all our work to date.

Phase Five: Implementation and Testing

If delegates accept the proposed new rating system, we will implement this on the ICCF test server to enable technical testing. We will use the output from the digital bench as a baseline for this testing.

Finally, we will make the new system live on the server, all being well in time for the 2023/1 or the 2023/2 rating list.

Ongoing Monitoring

The new system will include a protocol for monitoring changes in the rating distribution and a method for applying corrections; this will be ongoing.

 

Rationale

The changing nature of modern correspondence chess has resulted in a clear need for a new rating system for ICCF. We believe that Proposal 2021-031 would represent an ill-considered ‘sticking plaster’ with future damage such as hyperinflation and devaluation of future ICCF titles not fully considered by the proposers. A full review and systematic development cycle, informed by a leading expert, and with a forward-looking plan is the minimum standard required for such a project.

 

Assessment

A proposal to introduce the new rating system will be submitted to the 2022 ICCF Congress.

The new system will include built-in checks and tuning for stability, which will be applied at regular intervals going forward.

 

Effort

A budget not exceeding €13,000 is requested for statistical consulting. This may seem a high figure; however, it represents the cost of employing the world’s leading experts in this field. The flaws in Proposal 2021-031 exposed by expert review highlight the folly of leaving the development of a system which is central to ICCF to enthusiastic and well-meaning, but statistically naïve, volunteers. If we want the best possible rating system for ICCF, we must consult the most knowledgeable experts.

There will be some cost to update the server with the proposed new system in 2022/23; this will be estimated and presented in the 2022 Services Committee report.

 

Considerations

Professor Mark Glickman is Senior Lecturer on Statistics in the Department of Statistics at Harvard University. His position involves teaching, research, advising undergraduate and graduate students, and performing administrative duties within the university. He is also Senior Statistician at the Center for Healthcare Organization and Implementation Research, a Veterans Administration Center of Innovation. Professor Glickman received his B.A. in Statistics in 1986 from Princeton University (Summa Cum Laude), and his Ph.D. in Statistics from Harvard University in 1993. He has substantial experience in authorship, refereeing peer-reviewed papers, editorship, and leadership within the sports analytics community at both the local and international level. He has been a member of the US Chess ratings committee continuously since 1985, having served as chair of the committee from 1992 to 2019. Professor Glickman invented the Glicko and Glicko-2 rating systems, both of which are used in rating players in organized chess (e.g., chess.com and lichess.org) and for rating players in various online gaming systems involving head-to-head competition. He is also co-inventor of the Universal Rating System which has been adopted for rating players in the Grand Chess Tour.

It is fair to say that Professor Glickman is one of the world’s leading experts in the field of chess ratings.

This proposal conflicts with Proposal 2021-031; Clause 1.12 of the ICCF Voting Regulations will therefore apply.

 

Documentation

No changes to official documentation in 2021 or 2022. Section 1.4 and Appendix 1 of the 2023 rules will be updated to reflect the new procedure if accepted by the 2022 Congress.

 

Comments

15.07.2021 Gino Franco Figlio

I fully support this plan. It does not have all the details described in Dr. Glicko’s letter who apparently agrees to start with one of the ideas included in proposal 2021-031 and making necessary adjustments based on prospective observations. ICCF is lucky to have all these parties coming together to develop a customized rating system. Great job Austin!

 

Voting Summary

A vote of YES will mean that ICCF will develop a “best in class” rating system, which has been informed by one of the world’s leading experts.

A vote of NO will mean that either the current rating system will remain, or the system proposed by 2021-031 will be implemented.

A vote of ABSTAIN is not a vote but means the vote holder has no opinion and does not wish to represent the correspondence chess players of his or her federation in this matter.
