Thoughts on ranking systems

Duplode · May 05, 2008, 05:21:17 AM

Well folks, I have been reflecting a bit on the various World Ranking schemes we have used lately. While it may not be a priority topic at the moment, not least because most potential managers are too busy right now, it might be useful to start discussing it already, so that when the time comes to revive the rankings (maybe in time for the next editions of USL, ISM or similar special events?) our ideas are clearer.

The main issue I want to raise concerns the different kinds of contests we have. It seems that by the time SWR was being consolidated you guys had a few debates on whether IRC or NoRH contests should be included in the ranking. Well, back at the start of 2007 there were six regular full-season contests: ZakStunts, USC and SDR-RH (monthly races, free rules), Kalpen (6-week races, free rules, Indy only), WSC (monthly races, IRC rules, Indy only) and SDR-NoRH (monthly races, NoRH). Eventually, it was decided that NoRH was different enough from RH racing to justify leaving SDR-NoRH out of SWR, while WSC was included despite the IRC restrictions. At that time, WRL included only the three free-rules monthly contests, deeming Kalpen too irregular and WSC not "strong" enough.

One year later, however, the situation has changed a lot: ZakStunts is the only active contest with regular rules. The other contests are WSC, FTT (3-week races, NoRH) and JACStunts (weekly races, ISA and NoRH). In addition, WSC has grown into a significantly "stronger" contest after all the hardships of 2007. All four contests differ very significantly, not only in terms of rules but also in the duration of the races. And in my view an organizer would have no option but to include at least WSC and FTT alongside ZakStunts in a world ranking circa 2008. So the question becomes how to manage the differences... solving issues such as:

Do we really need to make adjustments for "comparing" RH and NoRH within a ranking?
Should 3-week races be weighed down in relation to full-month races?
Is there a reasonable way to account for very short races, or even special event results?

My initial answers to those would be No, Yes (but not by the whole 25% though) and (probably) Yes. But those are not exactly simple decisions, so the more arguments we throw in, the better...

Krys TOFF · May 05, 2008, 09:55:39 AM

Regarding the world ranking, to be honest I had planned to follow in Mark's footsteps and update his file regularly, but I was running late and busy with FTT preparation, so I have dropped it for the moment.

Regarding the contests, right now we have 2 RH and 2 noRH, 2 with shortcuts and 2 without. And these 2 parameters are mixed across the 4 competitions.
Apart from competition length, I think mixing RH and noRH is not a good idea. Best would be 2 rankings (one for RH, one for noRH), plus an overall combination of them if you want.

Very short races like JACStunts are a problem too. For some pipsqueaks like me, who spend little time racing per competition per month, it's an advantage, because my level of racing will be the same whether the competition lasts 1 or 4 weeks. For pipsqueaks who always race a lot and/or need time to reach a good result progressively (like, I would say, Aburaf) it's a disadvantage. The same comment applies to 3-week competitions like FTT, even if the impact is smaller between 4 and 3 weeks than between 4 weeks and 1 week.

So, saying that FTT and JACStunts should be weighed down is logical for pipsqueaks like Aburaf, but not for pipsqueaks who need only a few runs to get a good result, like Mark or you, Duplode.
Also, would you increase the impact of the current Z84 race just because Zak will be away and so the race will last 10 days longer than usual?
No, I think each competition has its own rules and schedule. Some give you an advantage, some give you a disadvantage. Overall, I would say it balances out with the current mix of competitions.

One last point: the fact that WSC is an Indy-only competition also biases the results. Slow car / Countach specialists like Zak or Argammon are at a disadvantage.

To be honest, given the current mix of competitions, it's difficult to set up a proper world ranking combining all of them...

BonzaiJoe · May 05, 2008, 11:39:45 AM

Difficult, but you just gotta try anyway

My suggestion is to imitate other world ranking lists, like the tennis ranking list for example. I think shorter races do need to give fewer points, in proportion to their duration, and I think it's also a problem that some people (me!) have a hard time participating in FTT because of technical problems. But now that noRH is no longer trust-based, I think it's no problem to include it in a list like this. And Indy bias is okay - one can always say that an Indy competition is actually an "any car" competition like the Kalpen competition.

Duplode · May 06, 2008, 05:01:48 AM

Good, we have diverging opinions

That's good because we'll have to reflect carefully before reaching a consensus. Your posts cleared up my ideas enough that I can now follow up with a longish post...

In a general sense, I feel any global ranking must average over some reasonable spectrum of required skills. Keeping with BJ's tennis comparison, ATP rankings account for "all" top level tournaments, be they played on clay, hard or grass surfaces, and it is a fact that tennis players are usually more effective on their preferred surface. Maybe in our case the differences between, say, RH and NoRH are somewhat more pronounced (because they involve a whole change of "philosophy"), but I believe the argument still holds.

As for Indy vs. slow cars, most of the time it just evens out, since non-Indy-exclusive contests usually have more races with the slower cars than with IMSA/Indy. Any remaining bias can be managed pretty well by Mark's participation log-scale, whose vital characteristic is that it does not tax too heavily top pipsqueaks with very good results but below-average participation rates. Proof of that is how Dottore managed to hold positions #3 / #4 on SWR during most of 2007 while racing (and winning...) only at his favourite contest, USC. Even in a playing field with far more active (top) pipsqueaks than in 2007, I do not believe he would drop out of the top 10 either.

Weighing competitions by their lengths seems to be a more delicate issue. Let's start by picking the standard 30-day contest as the reference. Should we raise the importance of Z84 due to the "bonus" 10 days? Surely not, as the impact on pipsqueaks' racing will be minimal, as Krys pointed out. On the other hand, giving the weekly races at JACStunts the same weight as a full-month ZakStunts race does not feel right, because this time it is not only a question of different characteristics resulting in different advantages: there are more quantifiable parameters involved. Should a set of four victories in JACStunts, each probably taking two or three focused attempts within a week, have four times (or three, or even two) the weight of a presumably long-fought 30-day ZakStunts victory? That would look like a strong distortion in favour of JACStunts, caused precisely by the different character of each competition - exactly the situation we're trying to avoid.

Now, what about FTT and its novel 3-week deadlines? Here we head into a sort of gray zone, as the subtracted week induces a change in racing style that is less about effort/time spent and more about different habits and tactics. For that reason, I believe that, while there's little escape from putting 25% weights on individual weekly races, FTT certainly shouldn't get a 75% weight - maybe 90%, or even 100%, would be more appropriate. It seems the log-scale solution might also find some application to these issues, as additional or subtracted weeks are much more relevant to short-race contests than to long-race ones.
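To make the two weighting ideas above concrete, here is a minimal sketch comparing a strictly proportional weight with a log-scaled one. The formula `log(1 + weeks) / log(1 + 4)` and the names `linear_weight` and `log_weight` are assumptions made up for illustration; nothing in this thread fixes an actual formula.

```python
import math

REFERENCE_WEEKS = 4.0  # a full-month race is taken as the reference (weight 1.0)


def linear_weight(weeks: float) -> float:
    """Weight strictly proportional to race length (1 week -> 0.25, 3 weeks -> 0.75)."""
    return weeks / REFERENCE_WEEKS


def log_weight(weeks: float) -> float:
    """Hypothetical log-scaled weight: 1.0 at the reference length.

    An extra or missing week changes the weight much more for short
    races than for long ones.
    """
    return math.log(1 + weeks) / math.log(1 + REFERENCE_WEEKS)


if __name__ == "__main__":
    for label, weeks in [("weekly race", 1), ("3-week race", 3), ("monthly race", 4)]:
        print(f"{label:12s} linear={linear_weight(weeks):.2f} log={log_weight(weeks):.2f}")
```

Under this assumed formula a 3-week race lands between the 75% and 100% marks, while a weekly race gets noticeably more than 25%. Whether that is too generous for weekly races is exactly the kind of detail that would still need discussing.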

Summarizing, I believe it is possible to have a reasonably reliable ranking mixing very different sorts of competitions, even if we may have to ponder the fine details of the system at length, and that the success of SWR serves, at least partially, as an argument in favour of this theory.

Veterans, managers, newbies, keep voicing your opinions...

Chulk · May 06, 2008, 06:24:01 AM

Duplode's last post sums up all of my thoughts. The way Mark used those "Quality points" was excellent. If an average pipsqueak won a race against many top seeded (to keep the Tennis comparison) pipsqueaks, he got more points than for winning a race where none of the top seeded pipsqueaks took part. I think that would work perfectly.
About NoRH-RH, I have strong doubts about putting them together. It's like comparing a tennis player and a table tennis player in one ranking. They look alike, but they are completely different in my opinion.
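The "Quality points" idea described above could be sketched roughly as follows. This is not Mark's actual formula, which is not given in this thread; the function name `quality_points`, the `top_n` cut-off and the bonus scale are all hypothetical.

```python
def quality_points(base_points: float, beaten_ranks: list[int], top_n: int = 10) -> float:
    """Scale a result by the strength of the opponents left behind.

    beaten_ranks holds the current ranking positions of the beaten opponents.
    Only opponents inside the top_n count, and a higher-ranked opponent
    gives a larger bonus (hypothetical scale, for illustration only).
    """
    bonus = sum((top_n + 1 - rank) / top_n for rank in beaten_ranks if 1 <= rank <= top_n)
    return base_points * (1 + bonus / top_n)


if __name__ == "__main__":
    strong_field = quality_points(100, [1, 2, 3, 15])  # win over three top-seeded racers
    weak_field = quality_points(100, [25, 30, 41])     # win with no top-seeded racers present
    print(strong_field, weak_field)
```

The same win is worth more when top-ranked racers were in the field, which is the behaviour described above.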

Krys TOFF · May 06, 2008, 09:10:09 AM

One limitation of quality points is that 90% of JACStunts pipsqueaks don't race in any other competition, so they will never get well ranked, and so JACStunts will remain anecdotal compared to other competitions...

BonzaiJoe · May 06, 2008, 12:14:29 PM

That reminds me: what is JACStunts?

Krys TOFF · May 06, 2008, 12:58:01 PM

Follow the link from the portal, and see "english FAQ for foreign pipsqueaks" in competition menu.

Chulk · May 06, 2008, 10:08:22 PM

Quote from: Krys TOFF on May 06, 2008, 09:10:09 AM
One limitation of quality points is that 90% of JACStunts pipsqueaks don't race in any other competition, so they will never get well ranked, and so JACStunts will remain anecdotal compared to other competitions...

Exactly! If they only play in a competition where only 1 or 2 of the ranked drivers play, how can they be ranked compared with the others? They should be placed above those ranked pipsqueaks they beat at JAC. If those pipsqueaks are 1st and 2nd (for example), they will have lots of points.
Besides, if a tennis player only plays on clay and always wins, he will have a good rank, but definitely won't be number 1.

BonzaiJoe · May 06, 2008, 11:57:26 PM

Some kind of measure has to be taken regarding the number of pipsqueaks in a competition. When I made a Worldranklist really long ago, I simply rated the competitions like in tennis or cycling, based on participants, race frequency and replay strength, and used the ratings as multiplication factors.
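A minimal sketch of rating competitions and using the ratings as multiplication factors might look like this. The 0..1 scores, the plain average and the names `ContestRating` and `weighted_points` are assumptions for illustration; the old Worldranklist's actual numbers and formula are not given here.

```python
from dataclasses import dataclass


@dataclass
class ContestRating:
    """Hypothetical 0..1 scores a ranking manager would assign to a contest."""
    participants: float     # size of the field
    race_frequency: float   # how regularly races are held
    replay_strength: float  # overall quality of the submitted replays

    def factor(self) -> float:
        # Assumption: a plain average of the three scores.
        return (self.participants + self.race_frequency + self.replay_strength) / 3


def weighted_points(raw_points: float, rating: ContestRating) -> float:
    """Multiply the points earned in a race by the contest's rating factor."""
    return raw_points * rating.factor()


if __name__ == "__main__":
    big_contest = ContestRating(1.0, 1.0, 1.0)
    small_contest = ContestRating(0.5, 0.5, 0.5)
    print(weighted_points(100, big_contest), weighted_points(100, small_contest))
```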

Mark L. Rivers · May 07, 2008, 08:38:02 AM

My view is that the only contests that can be joined in a single global ranking are Zak's Stunts and WSC.
One of them has IRC rules and the other doesn't, but I see them as two varieties of a single specialty, namely RH mode. Even if IRC rules don't permit flying, drivers have to race in a similar way, trying and retrying every point of the track to get the desired paths.
NoRH racing is a whole other story, another kind of race, and I don't think it is comparable with RH racing. As I have said more than once, NoRH races require managing emotions and risk, steady nerves and the ability to drive almost perfectly from beginning to end, even for 90 seconds in a row.
And I think that FTT can't be joined with JACStunts in a single global ranking, because there is no way to check the reliability of JACStunts races. I don't want to hurt anyone by saying this, but on this point there is an abyss between FTT and JACStunts. If we assume JACStunts replays are as reliable as FTT replays, then Krys could drop the rule that requires drivers to send the video. But I think there is a good reason for that rule to stay in force. Not for all drivers, but surely for some, above all if there were a world ranking that could stimulate dreams of glory. That rule has helped rescue a wonderful forgotten specialty, and we need it to preserve that specialty, avoiding every kind of complaint that could damage the good atmosphere we are building.

Krys TOFF · May 07, 2008, 09:09:00 AM

There have been different noRH attempts before SDR (ISA, IRC, ZakStunts), but there were always issues or suspicions about whether some pipsqueaks (CTG, pArAnO, ...) were really racing noRH. I don't care whether this atmosphere of suspicion was due to real cheating or not; all I care about is the trust earned by the video recording, thanks to Mark for finding it.
I won't get rid of the video proofs for FTT, that's for sure.

And as I said before, I'm also not convinced that a ranking mixing RH and noRH racing is relevant.

Quote from: BonzaiJoe on May 06, 2008, 11:57:26 PM
Some kind of measure has to be taken regarding the number of pipsqueaks in a competition. When I made a Worldranklist really long ago, I simply rated the competitions like in tennis or cycling, based on participants, race frequency and replay strength, and used the ratings as multiplication factors.

Oh yes, I remember this website :

Sadly, the website has been dead for a long time now.

Duplode · May 07, 2008, 07:07:07 PM

Quote from: Mark L. Rivers on May 07, 2008, 08:38:02 AM
My view is that the only contests that can be joined in a single global ranking are Zak's Stunts and WSC.
One of them has IRC rules and the other doesn't, but I see them as two varieties of a single specialty, namely RH mode. Even if IRC rules don't permit flying, drivers have to race in a similar way, trying and retrying every point of the track to get the desired paths.
NoRH racing is a whole other story, another kind of race, and I don't think it is comparable with RH racing. As I have said more than once, NoRH races require managing emotions and risk, steady nerves and the ability to drive almost perfectly from beginning to end, even for 90 seconds in a row.

As implied before, I believe that, despite the fundamental differences between them, RH and NoRH may be used jointly in an overall ranking. It seems that I am, alongside BJ, in a minority on this issue, but I would nevertheless like to add a few more arguments. It's true that the racing itself is very different in the two modalities, requiring different skills from the pipsqueaks, but it's still the same game... the closest analogy I can draw (although not an accurate one, I reckon) is the different apparatus of Artistic Gymnastics (floor, vault, balance beam...). They have very different procedures and demand partially distinct skill sets, but are nevertheless related to each other by the demands of "power, grace, coordination and flexibility" (I quoted this from somewhere else), even if in varying proportions for each apparatus. And thus the competitions have medals both for individual apparatus and for overall performance... In our case, it seems reasonable that pipsqueaks who are very strong in both RH and NoRH have a better claim to overall #1 than those who are very strong in only one of the disciplines - even if an SWR-style ranking would probably allow all "masters" of a specific discipline to approach the top (i.e. the top five) on equal terms. Furthermore, in my opinion NoRH skills can be useful in RH in some circumstances - for instance, by allowing a pipsqueak to do the "easy" parts of a track more fluently and quickly, thus leaving more time for RH work on the "hard" parts.
Another issue that could be raised is that a World Ranking with only two contests might be a bit too thin... but since this issue is particular to our current, May 2008, circumstances, it is largely secondary.

Quote from: Mark L. Rivers on May 07, 2008, 08:38:02 AM
And I think that FTT can't be joined with JACStunts in a single global ranking, because there is no way to check the reliability of JACStunts races. I don't want to hurt anyone by saying this, but on this point there is an abyss between FTT and JACStunts. If we assume JACStunts replays are as reliable as FTT replays, then Krys could drop the rule that requires drivers to send the video. But I think there is a good reason for that rule to stay in force. Not for all drivers, but surely for some, above all if there were a world ranking that could stimulate dreams of glory. That rule has helped rescue a wonderful forgotten specialty, and we need it to preserve that specialty, avoiding every kind of complaint that could damage the good atmosphere we are building.

The video proof argument is a valid (and strong) one against the inclusion of JACStunts. I didn't mention it because my arguments in the other post aimed at a more general discussion, applicable to other short-length races we may see in the future. Anyway, there can't possibly be any correlation between the "necessity" of video proof in FTT and in JACStunts: video proof is obviously the ideal method (and its concept was surely a big step forward), and it is not employed in JACStunts mostly for pragmatic reasons, as there we tend to have longer tracks - about double the usual FTT/SDR length, thus meaning large files - and the competition aims at a rather broader audience, including some complete newbies even to Stunts. Even if we decided to include JACStunts in a ranking alongside FTT, it wouldn't change the value of video proof; similarly, packing RH and NoRH into one ranking would not devalue either speciality. Assuming otherwise would give way too much importance to the ranking in my opinion, as no objective criterion can assess the quality of a contest (or a pipsqueak, for that matter) with 100% certainty - what we do is build approximations with a certain goal.

Chulk · May 08, 2008, 01:34:44 AM

We could have an overall ranking list, and also separate RH and NoRH lists, as in Gymnastics. That way we know who's the best in each specialty and who's the best overall.
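A rough sketch of keeping per-discipline lists next to an overall list, in the spirit of the gymnastics comparison. The sample data, the racer names and the choice of simply adding points across disciplines are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical season totals: (racer, discipline, points)
RESULTS = [
    ("Racer A", "RH", 100), ("Racer A", "NoRH", 40),
    ("Racer B", "RH", 80), ("Racer B", "NoRH", 80),
    ("Racer C", "NoRH", 100),
]


def build_rankings(results):
    """Return (per-discipline rankings, overall ranking), each sorted by points."""
    per_discipline = defaultdict(lambda: defaultdict(float))
    overall = defaultdict(float)
    for racer, discipline, points in results:
        per_discipline[discipline][racer] += points
        overall[racer] += points  # assumption: overall is a plain sum of both disciplines

    def ranked(table):
        # Highest points first; ties broken alphabetically by name.
        return sorted(table.items(), key=lambda item: (-item[1], item[0]))

    return {d: ranked(t) for d, t in per_discipline.items()}, ranked(overall)


if __name__ == "__main__":
    by_discipline, overall = build_rankings(RESULTS)
    print("RH:     ", by_discipline["RH"])
    print("NoRH:   ", by_discipline["NoRH"])
    print("Overall:", overall)
```

With this sample data each specialist leads their own list, while the racer who is strong in both disciplines comes out on top overall.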
