Elo-like ratings for ZakStunts: The Folyami Project

Started by Duplode, January 28, 2023, 07:24:25 AM

Duplode

Quote from: Argammon on August 31, 2023, 09:19:31 AM
It Would be nice if you could look into the issue more thoroughly then I did.  :)

Sure! Later I will rerun some diagnostics involving global properties of the ratings. These can be tricky to interpret given how the pool of pipsqueaks is always changing, but they might provide some signal on whether there's something unusual going on. Meanwhile, here's my preliminary take on the things you've noted above:

On the monthly sum of changes: The ranking shown on the site doesn't give the full picture, as active pipsqueaks who still haven't reached five races aren't included. Once they are accounted for, the sum of changes in ZCT265 goes down from +54 to +32. For the preceding couple of races, the effect is larger still: +81 to -214 in ZCT264 (the sum now dominated by first entries, which I'm regarding as rating changes from the 1500 default value), and +73 to -96 in ZCT263. This kind of fluctuation stems from the provisional factors: because of them, the changes involving pipsqueaks who have recently entered the ranking won't add up to zero, and they don't necessarily lead to meaningful inflation or deflation in the long run.
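To illustrate why provisional factors break the zero-sum property, here is a minimal sketch using the textbook two-player Elo update. It is not the actual Folyami computation, which deals with whole scoreboards; the function names, the K values and the doubled K for the newcomer are assumptions chosen purely for illustration.

```python
def expected_score(rating, opponent):
    """Textbook Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def pairwise_deltas(r_new, r_vet, score_new, k_new, k_vet):
    """Rating changes for a newcomer and a veteran after one head-to-head result.

    score_new is 1.0 if the newcomer finished ahead, 0.0 otherwise.
    k_new and k_vet are hypothetical K-factors, not Folyami's actual values.
    """
    surprise = score_new - expected_score(r_new, r_vet)
    return k_new * surprise, -k_vet * surprise


# Same K on both sides: the two changes cancel out.
d_new, d_vet = pairwise_deltas(1500, 2000, 0.0, k_new=16, k_vet=16)
equal_k_sum = d_new + d_vet

# Provisional newcomer with a larger K: the changes no longer cancel.
d_new, d_vet = pairwise_deltas(1500, 2000, 0.0, k_new=32, k_vet=16)
provisional_sum = d_new + d_vet
```

In this sketch the leftover is (k_new - k_vet) times the newcomer's surprise, so its sign depends on whether the newcomer beats or falls short of the expected result.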

On our relative positions: You had reached the top of the ranking in ZCT263 by a single point (2192 to 2191). Your win in ZCT264 opened a bit of distance (2213 to 2196), which I have now clawed back (2219 to 2220 -- a single point again, but now in my favour). After one win each, we're almost where we started (not exactly at the same place, as the gap opening up slightly in ZCT264 led to a slightly larger swing in the opposite direction in ZCT265).

On Alan: Given how the current rating tends to reflect recent form, looking a bit further back to Alan's personal best reached in 2022 (ZCT254, 2213) might provide a fuller picture. It's also worth noting that Alan lost more than 100 points in ZCT262 alone, even if, a few races later, the overall effect looks less dramatic: given that Alan's rating has remained broadly stable since, it appears likely that, assuming a similar form were to be maintained, his rating would have converged to ~2040 even without the ZCT262 outlier, though that would perhaps happen closer to the end of the season. (And of course, a win or two by Alan would have him narrowing the gap quite a lot.)

Cas

How far back can this be done?  Is it possible to have a condensed graph of the whole 21st century?  (That is, beginning with the opening of ZakStunts in 2001)
Earth is my country. Science is my religion.

Duplode

#32
Quote from: Cas on February 15, 2024, 10:05:30 PM
How far back can this be done?  Is it possible to have a condensed graph of the whole 21st century?  (That is, beginning with the opening of ZakStunts in 2001)

The data does go all the way back to 2001. That graph might become a little busy, but if eloratings.net can do it, why can't we? :)

Cas

Yep, it'd probably be better if it were interactive, so that you could enable or disable pipsqueaks or groups of pipsqueaks and zoom in on the time axis... Or it could just be a very high-resolution image that we can zoom in and out of offline.

Duplode

Quote from: Duplode on February 16, 2024, 12:25:41 AM
Quote from: Cas on February 15, 2024, 10:05:30 PM
How far back can this be done?  Is it possible to have a condensed graph of the whole 21st century?  (That is, beginning with the opening of ZakStunts in 2001)

The data does go all the way back to 2001. That graph might become a little busy, but if eloratings.net can do it, why can't we? :)

All right, here is the full history graph! I'll eventually tidy it up a bit and add to the site proper.

This uses the same chart library as the aforementioned eloratings.net, and so the controls are the same: hover to show the legend and highlight individual pipsqueaks, click to the left of the vertical axis to clear highlights, drag horizontally to zoom, double-click to reset zoom, drag the lower right corner to resize. All races up to ZCT274 are covered, with ZCTP1, ZCTP2 and ZCTP3 being displayed as 14.5, 15.5 and 16.5 for the sake of expediency.
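For anyone wanting to reproduce the horizontal axis, the labelling convention could be sketched roughly as below. The function and constant names are hypothetical, and plotting regular races at their own number is an assumption; only the three ZCTP positions come from the description above.

```python
# Positions reported above for the three ZCTP races.
SPECIAL_POSITIONS = {"ZCTP1": 14.5, "ZCTP2": 15.5, "ZCTP3": 16.5}


def race_position(label):
    """Horizontal-axis position for a race label such as "ZCT274"."""
    if label in SPECIAL_POSITIONS:
        return SPECIAL_POSITIONS[label]
    # Assumption: regular races are plotted at their race number.
    if label.startswith("ZCT") and label[3:].isdigit():
        return float(label[3:])
    raise ValueError(f"unrecognised race label: {label!r}")
```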

(Cc @Cas , @Argammon , @alanrotoi and everyone else curious about seeing more of the historical data  :))

2025-05-08 edit: The interactive graph has since been integrated with the site, with monthly updates. It is now available on this page.

Frieshansen

Great work! Very impressive to see the whole history in one big interactive chart.

Duplode

#36
Quote from: Argammon on August 31, 2023, 09:19:31 AM
I have the feeling there is rating inflation going on.

Quote from: Duplode on September 01, 2023, 04:32:15 AM
Quote from: Argammon on August 31, 2023, 09:19:31 AM
It Would be nice if you could look into the issue more thoroughly then I did.  :)

Sure! Later I will rerun some diagnostics involving global properties of the ratings. These can be tricky to interpret given how the pool of pipsqueaks is always changing, but they might provide some signal on whether there's something unusual going on.

I've finally managed to get back to this. Inflation can be a tricky thing to diagnose, as there are so many ways to slice the data, so here are a few different ways to look at the global evolution of the ratings.

The most intuitive thing to check, perhaps, is the mean rating at each race of everyone in the current ranking (which covers those pipsqueaks with an included race entry over the previous four rounds):



While closer analysis could be merited, at first glance there aren't obvious signs of systemic inflation or deflation. The long-term shifts that do exist can be attributed to the evolution of the competition as a whole:

  • Once the ratings stabilise in the first few seasons, the mean rating initially settles around the upper 1600s, and for the most part stays there up to 2007.
  • After that, somewhere around 2008 we enter what I like to call The Middle Ages, an era of few newcomers and relatively small scoreboards, mostly occupied by experienced pipsqueaks that under today's rules would have already graduated from the Am League. To pick just one example, ZCT133 in August 2012 featured twelve pipsqueaks, all of them already having reached the podium by that time. Under such circumstances, the mean rating went up to the 1700s and at times even beyond, mostly due to a shrinking pool of newbies and up-and-coming competitors.
  • A long transitional period follows in the late 2010s. Once we reach the current decade, things have settled down again, and the mean rating is back to the 1600s range.

Another way of looking at the evolution is focusing on a specific position in the current ranking. For instance, below is the rating of the sixth-placed pipsqueak (starting from 2003 so that the unrepresentative low values from the first few races don't mess up the chart):



This gives us a different perspective, one that relates more directly to the depth of the field. Still, there is no continued growth of the sixth-place rating, which typically oscillated around 1900 (up to the Middle Ages) or 1850 (in the current era).
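For reference, the two current-ranking diagnostics above could be computed along these lines. This is only a sketch under assumptions: the Entry record, the function names and the sample data are hypothetical, and the exact reading of "over the previous four rounds" (here: the current race and the three before it) is my guess rather than the site's actual rule.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Entry:
    name: str
    rating: float
    races: int      # number of included races so far
    last_race: int  # index of the most recent included race


def current_ranking(entries, race, min_races=5, window=4):
    """Entries eligible for the current ranking at `race`, best rating first."""
    active = [
        e for e in entries
        if e.races >= min_races and 0 <= race - e.last_race < window
    ]
    return sorted(active, key=lambda e: e.rating, reverse=True)


def mean_rating(ranking):
    """Mean rating across a (non-empty) ranking."""
    return mean(e.rating for e in ranking)


def nth_place_rating(ranking, n):
    """Rating of the n-th placed entry, or None if the ranking is shorter."""
    return ranking[n - 1].rating if len(ranking) >= n else None


# Made-up sample data, just to exercise the functions.
sample = [
    Entry("a", 2100, 40, 10),
    Entry("b", 1900, 20, 9),
    Entry("c", 1800, 6, 5),   # inactive for too long
    Entry("d", 1500, 3, 10),  # fewer than five included races
]
ranking = current_ranking(sample, race=10)
```

With the real data, calling mean_rating and nth_place_rating(ranking, 6) once per race would give the two series plotted above.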

Looking at the current ranking probably gives us a perspective better attuned to the action on the racetrack at any given moment. What about the overall pool of pipsqueaks, though? Below is the evolution of the mean rating of all pipsqueaks, including those that haven't made it to the ranking due to having fewer than five included races:



Rating changes in the Folyami system don't add up to zero because of provisional status, which makes ratings change faster for pipsqueaks with fewer than twelve races. The direction of the global change depends on what share of those newcomers succeed in getting strong results and improving their ratings. That being so, a slow decrease of the global mean is expected.

Lastly, for something of a compromise between looking at the current ranking mean and the global mean, we might try looking at the mean rating across pipsqueaks that have made it to the historical ranking (that is, those who have at least five included races and thus made it at least once to the current ranking):



That looks remarkably stable, with the mean having remained pretty much static throughout the Middle Ages, and seeing only a very mild downward shift in the last few seasons.
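The difference between the last two measures boils down to a filter on the number of included races. A minimal sketch, with a hypothetical data layout (name mapped to a rating and a count of included races) and made-up numbers:

```python
from statistics import mean


def global_mean(ratings):
    """Mean rating over every pipsqueak, ranked or not."""
    return mean(rating for rating, _ in ratings.values())


def historical_ranking_mean(ratings, min_races=5):
    """Mean rating over pipsqueaks with at least `min_races` included races."""
    return mean(rating for rating, races in ratings.values() if races >= min_races)


# Made-up sample: name -> (rating, included races).
sample = {
    "a": (2000, 30),
    "b": (1700, 12),
    "c": (1450, 2),
    "d": (1500, 1),
}
```

Here the global mean is pulled down by the two entries with few races, while the historical ranking mean ignores them.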

Duplode

A small but useful update to the Folyami graph: the legend is now sorted according to the current ranking. That helps with something @alanrotoi asked for a long time ago: a convenient way to check past rankings. Now you can hover the mouse over (or touch) any chosen race (to make that easier, use the handles under the chart to zoom into a smaller range), and the legend below will, in effect, display the ranking:


Duplode

#38
Some Forum bookkeeping: I have split the monthly updates of the ratings into a separate thread on the ZakStunts area of the Forum; the general discussion of the rating system and the site features remains here on the original topic. By the way, I plan to post the ZCT285 update tonight ;)

Edit: Of course I was going to do all that and then announce the split on the thread I didn't mean to  :D Now fixed.