[NSRCA-discussion] World F3A contest

rcmaster199 at aol.com rcmaster199 at aol.com
Mon Aug 19 08:45:34 AKDT 2013


Scott Smith recently reached out to Don Ramsey and me regarding TBL and its first-born, FPS (Fair Play System), which is used in IAC competitions. TBL essentially changes a given judge's scores for whole flights in toto, while FPS takes it down a step to the individual maneuver and changes a judge's scores maneuver by maneuver, as assessed by the statistical package. 

To me, it is fundamentally wrong to artificially change a judge's score based on statistics just to "normalize" the whole panel of judges and reduce the scoring variation, judge to judge and judge panel to judge panel. In the examples we discussed, pilots all placed the same with straight-up scores and with TBL applied, but I am not convinced that will always be true. In one example, in fact, not only did the average score each judge gave become constant, so did each judge's standard deviation. TBL is that powerful and that augmenting.....

I want to keep an open mind about statistical augmentation of the scores a judge gives. I am leaning more towards FPS, but still, it will be judging by statistics, so to speak. BTW---for this to work, Perceived Zeroes will require input from the Chief Judge in every case before the zeroes become Hard Zeroes for all judges. Part of the existing Judges' Rules will require revision....National comps are one thing, and local comps something else??.....

I think there should be lots more conversation before any changes are implemented. The main question to answer is "Is the present system broken?"

Regards

MattK



-----Original Message-----
From: Ryan Smith <smaragdz at comcast.net>
To: 'General pattern discussion' <nsrca-discussion at lists.nsrca.org>
Sent: Sun, Aug 18, 2013 7:51 pm
Subject: Re: [NSRCA-discussion] World F3A contest



TBL are initials standing for Tarasov, Bauer, Long, the people who came up with it. Below is an explanation scalped from a post/email that Derek Koopowitz wrote a while back.
 
The Tarasov-Bauer-Long (TBL) scoring method has been around since the 1970s.
It has been used in the full-size arena since 1978 and at every full-size IAC World Championship since 1980. The TBL method applies proven statistical probability theory to the judges' scores to resolve style differences and bias, and to avoid the inclusion of potentially faulty judgements in contest results.
 
Why we need TBL
To understand just why we need TBL, and how it works, is of considerable importance to us all. It is important to the pilots because it is there to reduce the prospect of unsatisfactory judgements affecting their results, and it is important for judges because it introduces a completely new dimension of scrutiny into the sequence totals; it will also discreetly engage the attention of the Chief Judge, or Contest Director, if a judge's conclusions differ sufficiently from those of all the other judges on the same panel.
 
When people get together to judge how well a pre-defined competitive task is being tackled, the range of opinions is often diverse. This is entirely natural among humans where the critique of any display of skill relies on the interpretation of rapidly changing visual cues. In order to minimize the prospect of any "way out opinions" having too much effect on the result, it is usual to average the accumulated scores to arrive at a final assessment, which takes everybody's opinion into account.
 
Unfortunately this averaging approach can achieve the opposite of what we really want, which is to identify and, where needed, remove those "way out opinions", because they are the ones most likely to be ill-judged and should therefore be discarded, leaving the rest to determine the more appropriate result. In aerobatics, the process of judging according to the rulebook normally leads to a series of generally similar personal views. However, one judge's downgrading may be harsher or more lenient than the next, their personal feelings toward each competitor or aircraft type may predispose them toward favor or dislike (bias), and they will almost certainly miss or see things that other judges do not.
 
How then can we "judge" the judges and so reach a conclusion that has a good probability of acceptance by all the concerned parties? The key word is probability: the concept of a perceived level of confidence in collectively viewed judgements enters the frame. What we really mean is that we must be confident that opinions pitched outside some pre-defined level of reasonable acceptability will be identified as such and will not be used. This sort of situation is the daily bread and butter of well-established probability theory which, when suitably applied, can produce a very clear-cut analysis of numerically expressed opinions, provided that the appropriate criteria have been carefully established beforehand.
 
What has been developed through several previous editions is some arithmetic which addresses the judges' raw scores in such a way that any which are probably unfair are discarded with an established level of confidence. To understand the process you need only accept some quite simple arithmetic procedures, which are central to what is called "statistical probability". The TBL scoring system in effect does the following:
* Communizes the judging styles
* Computes the TBL scores
* Publishes the results
 
Communizing the judging styles involves remodelling the scores to bring all the judging styles to a common format and removing any natural bias between panel members. Following some calculations, each judge's set of scores is squeezed or stretched and moved en bloc up or down so that the sets all show the same overall spread and have identical averages (bias). Within each set the pilot order and score progression must remain unaltered, but now valid score comparisons are possible between all the panel judges on behalf of each pilot.
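
The exact remodelling arithmetic isn't given here, but a minimal sketch in Python of the idea, assuming each judge's set is linearly rescaled to the panel-wide average and standard deviation (the function name and the choice of target values are illustrative assumptions, not the official TBL formulas), might look like this:

    from statistics import mean, stdev

    def communize(panel_scores):
        """Linearly remap each judge's scores so that every set shares the
        same spread and the same average (here: the panel-wide mean and
        standard deviation). panel_scores maps judge name -> list of scores,
        one per pilot, in the same pilot order for every judge."""
        all_scores = [s for scores in panel_scores.values() for s in scores]
        target_mean, target_sd = mean(all_scores), stdev(all_scores)

        communized = {}
        for judge, scores in panel_scores.items():
            m, sd = mean(scores), stdev(scores)
            if sd == 0:
                sd = target_sd  # degenerate case: a judge gave identical scores
            # A straight-line map squeezes/stretches the set to the common
            # spread and shifts it en bloc to the common average, so each
            # judge's pilot order and score progression are preserved.
            communized[judge] = [(s - m) / sd * target_sd + target_mean
                                 for s in scores]
        return communized

Because the map is monotonic, no judge's ranking of the pilots changes; only the scale and offset do, which is what makes the sets directly comparable.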
 
Computing the TBL score involves looking at the high and low scores in each pilot's set and throwing out any that are too "far out" to be fair. This is done by subtracting the set's average from each score and dividing the result by the "sample standard deviation"; if the result is greater than 1.645, then according to statistical probability theory we can be at least 90% confident that the score is unfair, so it is discarded.
 
This calculation, with its mathematically derived 1.645 criterion, is the key to the correctness of the TBL process, and is based on many years of experience by the full-size aerobatics organization with contest scores at all levels.
 
Discarding any scores of course changes the average and standard deviation of a pilot's remaining results, and so the whole process is repeated. After several cycles any "unfair" scores will have gone, and those that remain will all satisfy the essential 90% confidence criterion.
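
As a rough illustration of this discard-and-repeat loop (a sketch under the assumptions above, not the official TBL code; tbl_filter is an invented name), one pilot's set of communized scores could be filtered like this:

    from statistics import mean, stdev

    def tbl_filter(scores, threshold=1.645):
        """Repeatedly drop any score lying more than `threshold` sample
        standard deviations from the set's mean, until every remaining score
        passes the test (or too few scores remain to compute a deviation)."""
        kept = list(scores)
        while len(kept) > 2:
            m, sd = mean(kept), stdev(kept)
            if sd == 0:
                break  # all remaining scores identical: nothing more to discard
            survivors = [s for s in kept if abs(s - m) / sd <= threshold]
            if len(survivors) == len(kept):
                break  # stable: every score is inside the 90% confidence band
            kept = survivors
        return kept

For example, with communized scores of 7.5, 7.8, 8.0, 7.6 and 4.0 for one pilot, the 4.0 sits roughly 1.8 sample deviations below the mean and is discarded on the first pass; the remaining four scores all fall within the 1.645 band, so the loop stops.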
 
The published result is derived by averaging each pilot's remaining scores. The final TBL iteration therefore has any appropriate penalty/bonus values applied, and the results are then sorted in descending order of total score to rank the pilots from first to last.
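
Continuing the sketch (again illustrative only; the penalty handling and function names are assumptions of this example, not the published TBL procedure), the ranking step might be:

    from statistics import mean

    def rank_pilots(filtered_scores, penalties=None):
        """filtered_scores maps pilot -> list of scores that survived the
        TBL filtering; penalties optionally maps pilot -> points to subtract
        (a bonus can be expressed as a negative penalty).
        Returns (pilot, total) pairs sorted best-first."""
        penalties = penalties or {}
        totals = {pilot: mean(scores) - penalties.get(pilot, 0)
                  for pilot, scores in filtered_scores.items()}
        return sorted(totals.items(), key=lambda item: item[1], reverse=True)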
 
These final scores may, or may not, be normalized to 1000 points, depending on the setting for the selected class. Educating and improving the judges is a useful by-product of this process, in that it shows in detail how each judge has performed by comparison with the overall judging panel average and against the 90% confidence criterion.
 
The TBL system will produce an analysis showing each judge the percentage of scores accepted as "OK", and a comparison with the panel style (spread of scores) and bias (average). Unfortunately TBL, by definition, brings with it a 10% possibility of upsetting an honest judge's day. The trade-off is that we expect not only to achieve, every time, a set of results that are "fair" with at least 90% confidence, but that the system also provides us with a wonderful tool to address our judging standards. TBL will ensure that every judge's opinion has equal weight, and that each sequence score by each judge is accepted only if it lies within an acceptable margin of the panel average.
 
TBL, however, by necessity takes the dominant judging panel view as the "correct" one, and it can't make right scores out of wrong ones. If 6 out of 8 judges are distracted and make a mess of one pilot's efforts, then for TBL this becomes the controlling assessment of that pilot's performance, and the other 2 diligent judges who got it right will see their scores unceremoniously zapped. In practice this would be extremely unusual; from the judging line it is almost impossible to deliberately upset the final results without collusion between a majority of the judges, and if that starts to happen then someone is definitely on the wrong planet.
 

From: nsrca-discussion-bounces at lists.nsrca.org [mailto:nsrca-discussion-bounces at lists.nsrca.org] On Behalf Of Jeff and Claire
Sent: Sunday, August 18, 2013 6:47 PM
To: 'General pattern discussion'
Subject: Re: [NSRCA-discussion] World F3A contest

 
What does TBL stand for?
 

From: nsrca-discussion-bounces at lists.nsrca.org [mailto:nsrca-discussion-bounces at lists.nsrca.org] On Behalf Of Jon Lowe
Sent: Sunday, August 18, 2013 3:54 PM
To: nsrca-discussion at lists.nsrca.org
Subject: Re: [NSRCA-discussion] World F3A contest

 

After all pilots fly in front of each set of judges on day 4 of prelims, I think.  That would be part of the normalization process. At least that is how I remember it from previous WCs.

Jon

-----Original Message-----
From: John Fuqua <johnfuqua at embarqmail.com>
To: 'General pattern discussion' <nsrca-discussion at lists.nsrca.org>
Sent: Sun, Aug 18, 2013 4:40 pm
Subject: Re: [NSRCA-discussion] World F3A contest


When do they do the TBL?

 


From: nsrca-discussion-bounces at lists.nsrca.org [mailto:nsrca-discussion-bounces at lists.nsrca.org] On Behalf Of Jon Lowe
Sent: Sunday, August 18, 2013 3:33 PM
To: nsrca-discussion at lists.nsrca.org
Subject: Re: [NSRCA-discussion] World F3A contest


 


A click on the Team USA logo on the NSRCA home page takes you to the Team USA website.  That has a link to Cindy Wickhizer's page:

https://2013worldsteamusa.shutterfly.com/

which has been where Mark is sending info.  He and I both previously announced that on this list.

Jon


-----Original Message-----
From: Gordon Seeling <gseeling at q.com>
To: nsrca-discussion <nsrca-discussion at lists.nsrca.org>
Sent: Sun, Aug 18, 2013 2:35 pm
Subject: [NSRCA-discussion] World F3A contest

Is there a computer man in the NSRCA? If there is, please post the
results on the NSRCA web site, so the rank and file will know what is going on.
_______________________________________________
NSRCA-discussion mailing list
NSRCA-discussion at lists.nsrca.org
http://lists.nsrca.org/mailman/listinfo/nsrca-discussion





