Elo-like ratings for ZakStunts: The Folyami Project

Started by Duplode, January 28, 2023, 07:24:25 AM

alanrotoi

Cool! Where is last month's list? It would be cool to keep every month's list too, wouldn't it? It's the basis for more statistics   :)

Duplode

Quote from: Overdrijf on February 12, 2023, 04:28:05 PM
I'm not sure if it would be a bit silly here, but one of the best bits about elo systems is that little dopamine hit from winning a match and gaining 30 elo. Would that be something that could work here, like a column for last month or for how much you went up since last month?

While I'm being deliberate with adding comparison features to the page, I agree some way of quickly having a sense of recent progress is desirable, and the most basic way of adding that is through the rating change from the previous round. So yes, I'll add this one  :) (Rating changes rather than ranking position changes, as the latter might not be all that meaningful given how often people leave and rejoin the current ranking.)

Quote from: alanrotoi on February 12, 2023, 04:40:18 PM
Cool! Where is last month's list? It would be cool to keep every month's list too, wouldn't it? It's the basis for more statistics   :)

This might take a little longer, as I'll have to soup up the generation of pages for the mini-site. But I'll work on it as well! (Doing that will also help with Argammon's suggestion of pipsqueak-wise data/charts.)

Duplode

@Overdrijf The rating change from the previous round is now shown with the current ranking.

Overdrijf

Haha, I'm on my way up! I overtook a person! No 7th place against many talented drivers driving great laps can stop my advance!

... Yes, this is working for me. Thanks!

alanrotoi

I may sound harsh or angry, but it's just bad, sleepy English, so please don't take it the wrong way :-*

I want to discuss a case: juank_23 vs Eddie Brother.
They raced on the same track only twice, and both times Eddie was beaten.
Juank raced on 4 tracks and the positions were:
11/14
9/12
12/17
13/15

And Eddie raced in 5 tracks:
14/17
14/15
10/12
10/12
10/16

How the hell does Eddie have 1419 points and Juank_23 1415? I mean, I understand there are some calculations behind it, but in my gut I feel it is a bit unfair to Juank.

I repeat, I'm a big fan of this project so I want to see it as precise as possible.


And about the exponential distribution, do I understand correctly that when you win more than once in a row you get fewer and fewer points? You know how hard it is to win or keep a podium position for more than a few races. It should be rewarded. Maybe I've got it wrong. I see the numbers and the long explanation with calculations and graphs, but I just can't say how many points I'll gain or lose if I finish in position x on the next track. Maybe the page also needs a less technical explanation and a points simulator.

Overdrijf

In this type of ranking you get points for beating people who are well placed on the board compared to you. If you win a race you beat everyone, so you gain points and the average other pipsqueak loses points. So if you beat all the same people again the next time, you get fewer points. Let's say the first time Duplode was 100 points above you, CTG was 50 points above you and I was 200 points below you; then in the second race Duplode was only 75 points above you, CTG 25 points above you and I 210 points below you. That's a less impressive victory in an Elo-like rating.

It also makes sense because the main point of these ratings is not to reward you for winning, but to end up ranking you well compared to the others. As you close in on your rightful position the changes should get smaller and smaller, not bigger, or you'd just end up with a bazillion points and the ranking would be useless.
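
To put rough numbers on the example above, here is a minimal sketch. It assumes the textbook Elo expected-score formula (base 10, scale 400), not the modified Folyami formulas, and the function names are made up for illustration. In plain Elo the gain from a win against each opponent is proportional to one minus the expected score, so summing that over the field gives a sense of how "surprising" each victory was.

```python
# Sketch only: textbook Elo expected score, NOT the actual Folyami formulas.

def expected_score(diff):
    """Standard Elo win probability for someone `diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def surprise_of_win(diffs):
    """Sum of (1 - expected score) over every opponent beaten.

    In plain Elo the rating gain is this sum multiplied by a K factor.
    """
    return sum(1.0 - expected_score(d) for d in diffs)

# Rating differences from your point of view (negative = opponent is above you).
first_race = [-100, -50, 200]   # Duplode, CTG, Overdrijf
second_race = [-75, -25, 210]   # same opponents after the first win

first = surprise_of_win(first_race)
second = surprise_of_win(second_race)
print(f"first win:  {first:.3f}")
print(f"second win: {second:.3f}")
```

The second value comes out smaller than the first, which is the "less impressive victory" effect: the same result against the same people is worth a little less once the ratings have moved towards it.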



As for the case of Eddie and Juank, apparently Eddie had some impressive results compared to drivers with good point totals. Maybe there were some extra good drivers racing that month, driving the midfield closer to his position or something? Actually, I should study how this ranking works better before trying to answer this one. It is just one example though, out of lots of years of ZakStunts.

alanrotoi

Quote from: Overdrijf on March 08, 2023, 08:10:57 AM
In this type of ranking you get points for beating people who are well placed on the board compared to you. If you win a race you beat everyone, so you gain points and the average other pipsqueak loses points. So if you beat all the same people again the next time, you get fewer points. Let's say the first time Duplode was 100 points above you, CTG was 50 points above you and I was 200 points below you; then in the second race Duplode was only 75 points above you, CTG 25 points above you and I 210 points below you. That's a less impressive victory in an Elo-like rating.

It also makes sense because the main point of these ratings is not to reward you for winning, but to end up ranking you well compared to the others. As you close in on your rightful position the changes should get smaller and smaller, not bigger, or you'd just end up with a bazillion points and the ranking would be useless.


So maybe it is better to win a race, then come second, and then win again than to win 2 in a row? :o

Overdrijf

Quote from: alanrotoi on March 08, 2023, 12:37:49 PM
So maybe it is better to win a race, then come second, and then win again than to win 2 in a row? :o

Win-lose-win will probably net you a higher current rating than win-win-lose, yes, but lose-win-win might get you even higher, and a long record full of win streaks and loss streaks should give you the same average and a higher top than the same number of wins and losses without streaks.

(That's what I think happens anyway.)

Duplode

Those are good questions, @alanrotoi ! The comments by @Overdrijf make for a good outline of the issues (the basic ideas of the Elo system do apply, in spite of the substantial changes Folyami incorporates); I'll try to help with the details.

Let's begin with Eddie and Juank. Here are their full rating histories:

+-----++--------------------+
|     ||      Eddie Brother |
+=====++====================+
| C14 || 1429.7218603558094 |
+-----++--------------------+
|  P1 || 1417.3501392006976 |
+-----++--------------------+
| C15 || 1366.7000422242472 |
+-----++--------------------+
| C16 ||  1434.054239919096 |
+-----++--------------------+
|  P3 || 1419.3351433565706 |
+-----++--------------------+

+-----++--------------------+
|     ||           Juank_23 |
+=====++====================+
|  C9 || 1498.4609688295145 |
+-----++--------------------+
| C10 || 1447.4398384088088 |
+-----++--------------------+
| C11 || 1453.4838164837072 |
+-----++--------------------+
| C14 ||  1450.633155577332 |
+-----++--------------------+
| C15 || 1415.8621307766819 |
+-----++--------------------+

Both of them have exactly five counting races, the minimum needed to enter the rankings (before that, the ratings are calculated and accounted for, but not shown in the rankings as they are too volatile and unreliable). That being so, both show up on the historical ranking with their final rating, from their only entry in the ranking ever, which is a pretty unusual situation. Eddie is only ahead of Juank because Eddie had a strong result (relative to his own performances) at C16 (10/16, ahead of Usrin and Ben Snel), while Juank had a weak result (again, relative to his own performances) at C15 (14/16). Juank was still ahead at C15, by about 50 points, but wasn't around to bounce back in the following races (also note that the provisional factors make the ratings more volatile than usual over the first few races of a pipsqueak). In summary, Juank had really bad luck with the timing of what became his historical rating, and we'd probably see a different picture had the two of them raced a few rounds more than the bare minimum needed to enter the rankings.

Quote from: alanrotoi on March 08, 2023, 05:06:21 AM
do I understand correctly that when you win more than once in a row you get fewer and fewer points?

Basically yes. Repeated wins will make the rating difference between you and your opponents larger (more positive), so your expected score (i.e. the winning probability given by the model) will be larger (that's what the exponents in the formulas for EEloX and WX in the expected score section do), which in turn reduces the points gained from the matches. The reduction is pretty gradual, though. Marco's winning streak in late 2016 and early 2017 gives a decent illustration of the effect:

+------++--------------------+
| C181 || 1912.5228198049917 |
+------++--------------------+
| C182 || 1962.9674878144833 |
+------++--------------------+
| C183 ||  1998.219045817359 |
+------++--------------------+
| C184 || 2029.4221198139737 |
+------++--------------------+
| C185 || 2059.1942707075377 |
+------++--------------------+
| C187 ||  2082.898975327112 |
+------++--------------------+

Total gains per race were: +50 (C182), +36 (C183), +33 (C184), +30 (C185), +23 (C187). In summary, the idea is that surprising results should change the ratings faster than unsurprising ones, so that the ratings reflect how performances evolve, and repeated results get less and less surprising. It is also important to note, as Overdrijf points out, that a lot depends on what the field of opponents was like. For a somewhat extreme illustration, we can look at my ratings in the early 2018 races:

+------++--------------------+
| C197 || 2064.9343648958343 |
+------++--------------------+
| C198 || 2069.9753551755457 |
+------++--------------------+
| C199 ||   2073.89726641674 |
+------++--------------------+
| C200 ||  2080.930573062527 |
+------++--------------------+
| C201 || 2087.8455341750982 |
+------++--------------------+
| C202 || 2067.5222038971915 |
+------++--------------------+

C198 (+5) and C199 (+4) had some of the smallest fields in ZakStunts history, which helps explain how my wins there gave me fewer points than my second places in C200 (+7) and C201 (+7).
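
For anyone who wants to check per-race changes like these, a small sketch: it just subtracts consecutive entries of a rating history. The values are copied from the C197 to C202 table above; the helper name `rating_changes` is made up for this example and is not part of the Folyami code.

```python
# Ratings copied from the C197-C202 table in this post.
ratings = {
    "C197": 2064.9343648958343,
    "C198": 2069.9753551755457,
    "C199": 2073.89726641674,
    "C200": 2080.930573062527,
    "C201": 2087.8455341750982,
    "C202": 2067.5222038971915,
}

def rating_changes(history):
    """Return (race, change) pairs for consecutive entries of a rating history.

    Relies on the dict keeping insertion order, i.e. races listed chronologically.
    """
    races = list(history)
    return [(cur, history[cur] - history[prev])
            for prev, cur in zip(races, races[1:])]

changes = rating_changes(ratings)
for race, change in changes:
    print(f"{race}: {change:+.1f}")
```

The same helper works on any of the other rating tables in the thread, as long as the races are entered in chronological order.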

Quote from: alanrotoi on March 08, 2023, 12:37:49 PM
So maybe it is better to win a race, then come second, and then win again than to win 2 in a row? :o

Assuming there's a loss after the two consecutive wins, it will likely be a bit better. Here are rating changes for a few scenarios -- to simplify things, for these calculations I'll use a pure Elo system (without the extra Folyami modifiers) with K = 18 and just two pipsqueaks starting from equal ratings:

  • Win-Win-Lose: +7.6
  • Win-Lose-Win: +8.6
  • Lose-Win-Win: +9.4

It goes just as Overdrijf predicted. For individual races, there's somewhat of a "the higher they are, the harder they fall" effect, which is part of the reason why there are a few measures to somewhat limit the effect of unrepresentative bad results. Still, those differences are expected to even out over a longer series of races.
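
The three scenarios can be replayed with a short sketch. It assumes the textbook Elo update (base 10, scale 400) with K = 18, two pipsqueaks starting level, and both ratings being updated after every race; none of the Folyami modifiers are included, and the function names are invented for the example.

```python
K = 18  # K factor used for the scenarios in this post

def expected_score(diff):
    """Textbook Elo expected score for someone `diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def net_change(results, k=K):
    """Net rating change of pipsqueak A after a sequence of 1v1 races.

    `results` is a string of 'W' (A wins) and 'L' (A loses). Both start from
    the same rating and both are updated after each race (zero-sum).
    """
    a, b = 0.0, 0.0
    for result in results:
        score = 1.0 if result == "W" else 0.0
        delta = k * (score - expected_score(a - b))
        a += delta
        b -= delta
    return a

for scenario in ("WWL", "WLW", "LWW"):
    print(scenario, f"{net_change(scenario):+.1f}")
```

Under those assumptions the script should print the same +7.6, +8.6 and +9.4 as the list above. Changing `K` or the starting gap is an easy way to see how much the order of results matters in other situations.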

Quote from: alanrotoi on March 08, 2023, 05:06:21 AM
I see the numbers and the long explanation with calculations and graphs, but I just can't say how many points I'll gain or lose if I finish in position x on the next track. Maybe the page also needs a less technical explanation and a points simulator.

There's a little bit of that, at least as far as getting a sense of how large the values are, in the first table at the end of the article. For instance, let's suppose you have 2200 rating, and Renato 2300. If you win and Renato ends in fourth, you will gain 9.7 points from him (row "-100" for the rating difference, column "3" for the position difference). Keep in mind that the 9.7 is just for your match with Renato; the full picture requires considering the matches against everyone in the scoreboard.

I like the simulator idea, by the way! Later I'll look into adding at least a JavaScript calculator for 1v1 matches  :)

Duplode

The ratings have been updated for ZCT259! In this round, I move into the lead after a run of 13 races at the top by Alan; Argammon (for the third consecutive race) and Frieshansen improve their personal records; and Erik makes his first appearance in the rankings.

@alanrotoi and @Argammon : As a first step towards showing the evolution of ratings, below is a chart with the rating changes for ranked pipsqueaks over the past twelve races (a workbook with the values is attached to the post). If all goes to plan, I'll add a chart along these lines to the site in an update in the near future.


Overdrijf

It looks like the site itself hasn't actually updated yet...

Daniel3D

After reading a bit about the history of Elo, I realised that when I was a member of the chess club, the club ranking was also an Elo rating. I never noticed, because any ranking would have placed me in the same position: in the two years I was there I solved every chess problem that was presented, but I did not win any matches. So with 0 wins, I had a solid last place. Since I knew and expected to be last, I never looked twice at the scoreboard or point system.
Edison once said,
"I have not failed 10,000 times,
I've successfully found 10,000 ways that will not work."
---------
Currently running over 20 separate instances of Stunts
---------
Check out the STUNTS resources on my Mega (globe icon)

Duplode

@Overdrijf If it's still not loading the updated tables, try Ctrl + Shift + R to bypass the cache. (I wonder if I should adjust anything on the site with respect to that.)

@Daniel3D I'm now wondering what a chess problem leaderboard could be like... chess.com seems to have such a thing, but I have no idea what the success percentage is meant to be. (To my eyes, timed competitive problem solving doesn't look like it would be an enjoyable pastime, but I'm not actually a chess player, so who knows.)

Daniel3D

The chess problems were for learning, they had no leaderboard. But I had no problems with them. I just couldn't think ahead well enough to win a match. Teacher's syndrome they called it.

Cas

When I play chess, I've noticed I am much better if I've played many times against that person. I usually suck the first time I play against a certain player. It looks like my game intuition is heavily based on reacting to and anticipating the other player's style instead of pure strategy.

I remember I used to have an Elo rating a long time ago, when I played chess over Yahoo. I don't remember my numbers, though.
Earth is my country. Science is my religion.