Sometimes doing uni work can be practical

Discussion in 'Australian Motorcycles' started by John Littler, Nov 28, 2004.

  1. John Littler

    John Littler Guest

    Some of you may recall I said I had to do a stats assignment this
    semester for the master's, so I decided to do it on motogp (cos I was
    planning on changing jobs and hence didn't want to use work data and
    besides doing it on work stuff is boring)

    So for anyone who cares(1), the model which best predicts the
    championship rankings for MotoGP turned out to be a straightforward
    linear model (none of the fancy stuff added much value).

    The variables considered included:
    Prev Yr Championship pts
    2yrs Prev Championship pts
    No. of Motogp podiums
    No. of junior class (125/250) podiums
    No. of cylinders
    Vee vs Inline
    Dummy variables for:
    Honda, Yamaha and Ducati
    Rookie Year
    Low Performing team
    Non GP Championship winner (ie WSBK etc)

    These whittled down in the parsimonious model to:

    Final Championship Pts = 34.5707
        + 0.4723  (last yr's pts)
        + 0.1983  (yr before last's pts)
        + 28.0384 (dummy for rookie yr)
        + 57.3952 (dummy for Honda)
        - 41.7039 (dummy for low performer)

    For those not familiar with stats a dummy(2) is a binary yes/no variable
    - so if the rider is in their rookie year then it would be 1 for yes and
    0 if it's not their rookie year.
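As a worked illustration, the equation above can be written as a small function. This is only a sketch: the function and parameter names are made up for the example, the coefficients are copied from the post, and the sample rider's inputs are hypothetical rather than real results.

```python
def predict_points(last_year_pts, year_before_pts,
                   rookie=False, honda=False, low_performer=False):
    """Predicted final championship points from the posted linear model.

    The three flags are the dummy variables: True counts as 1, False as 0.
    """
    return (
        34.5707
        + 0.4723 * last_year_pts
        + 0.1983 * year_before_pts
        + 28.0384 * int(rookie)
        + 57.3952 * int(honda)
        - 41.7039 * int(low_performer)
    )


# Hypothetical rider (made-up inputs, not real data):
# 200 pts last year, 150 pts the year before, riding a Honda.
example = predict_points(200, 150, honda=True)
print(round(example, 2))  # 216.17
```

Switching a dummy on or off shifts the prediction by exactly its coefficient, so the same hypothetical rider on a low-performing team would be predicted about 41.7 points lower.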

    Or to put it another way the best predictor of how you'll go next year
    is how you went this year, but if you're on a Honda you have an
    advantage (yes I know that's intuitively obvious but now we have
    statistical "proof"(3)), and the interesting thing is a rookie is likely
    to go better in their first year (keenness factor) although I'm thinking
    that result may be a bit warped by Bayliss' bad trot in his second year
    with Ducati.

    I'll tot up the podium finishes for this year and add to the data set
    and predict the 05 outcome for laughs, it'll be interesting to see how
    good the model actually turns out to be.

    JL
    1 I've tried to translate this into layman's terms, for those serious
    maths heads out there I can send you the data and regression stats if
    you want
    2 I just KNOW Postman Pat or Clem are going to have a field day with
    that one !
    3 You PhD maths types bugger off, I know it's not a proof per se, it's
    statistically significant etc etc <yawn> :)
     
    John Littler, Nov 28, 2004
    #1

  2. John Littler

    TB Guest

    and people say I have too much time on my hands..

    TB
     
    TB, Nov 28, 2004
    #2

  3. John Littler

    John Littler Guest

    Maaaate... I don't have nearly enough time on my hands 'cos I'm doing
    stuff like this :)

    JL
    (it's a uni assignment, you don't often get a choice to at least do it
    on something mildly interesting)
     
    John Littler, Nov 28, 2004
    #3
  4. John Littler

    TB Guest

    I built a brake system for a Formula SAE car for my final year thesis...

    TB
     
    TB, Nov 28, 2004
    #4
  5. John Littler

    John Littler Guest

    Hmm the difference between doing engineering and business :)

    JL
     
    John Littler, Nov 28, 2004
    #5
  6. John Littler

    John Littler Guest

    Stuck the updated podium data into the model and re-ran the regression
    and got a slightly different outcome: 02 ranking got dropped and overall
    MotoGP podiums got put back in, so now there's a slightly controversial
    result - it's predicting that 05 will finally be Biaggi's year !!

    New model top 10

    Rider                Predicted 05
    BIAGGI Max           246.56
    ROSSI Valentino      243.98
    GIBERNAU Sete        225.79
    BARROS Alex          195.35
    TAMADA Makoto        157.93
    HAYDEN Nicky         145.64
    BAYLISS Troy         128.51
    MELANDRI Marco       127.45
    CAPIROSSI Loris      123.41
    CHECA Carlos         119.59

    Submitted model (the one I posted earlier) gives the more predictable
    result of a Rossi win (although interestingly it's suggesting that Checa
    will beat his new team mate)

    Rider                Pred 05 (submitted model)
    ROSSI Valentino      248.54
    BIAGGI Max           237.08
    GIBERNAU Sete        223.46
    BARROS Alex          210.34
    TAMADA Makoto        162.81
    HAYDEN Nicky         147.22
    MELANDRI Marco       127.38
    BAYLISS Troy         125.49
    CHECA Carlos         117.79
    CAPIROSSI Loris      111.44

    Don't worry too much about the actual numbers: the standard error is
    about 15 points per variable, so the outcome numbers could be
    significantly farther apart (plus or minus 40 or 50 points). It's meant
    to rank them rather than predict a specific number outcome.
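One rough way to see what that margin means for the ranking is to check which neighbouring riders are predicted closer together than the margin itself. The sketch below uses the submitted-model predictions from the table above; the helper name and the choice of 40 points as the cut-off are assumptions for illustration, not part of the original analysis.

```python
# Predicted 05 points from the submitted model, copied from the table above.
submitted = {
    "ROSSI Valentino": 248.54,
    "BIAGGI Max": 237.08,
    "GIBERNAU Sete": 223.46,
    "BARROS Alex": 210.34,
    "TAMADA Makoto": 162.81,
    "HAYDEN Nicky": 147.22,
    "MELANDRI Marco": 127.38,
    "BAYLISS Troy": 125.49,
    "CHECA Carlos": 117.79,
    "CAPIROSSI Loris": 111.44,
}


def close_neighbours(predictions, margin):
    """Return adjacent pairs in the ranking whose predicted gap is below `margin`."""
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    pairs = []
    for (name_a, pts_a), (name_b, pts_b) in zip(ranked, ranked[1:]):
        if pts_a - pts_b < margin:
            pairs.append((name_a, name_b))
    return pairs


# Assumed cut-off: the lower end of the "plus or minus 40 or 50 points" quoted above.
close = close_neighbours(submitted, margin=40)
print(len(close), "of", len(submitted) - 1, "adjacent pairs are within 40 points")
# 8 of 9 adjacent pairs are within 40 points
```

Only the Barros to Tamada gap (about 47.5 points) is wider than the cut-off, so on this reading the order within the top four and within the bottom six is far less certain than the split between the two groups.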

    Personally, intuitively, I think both rankings are wrong, I reckon
    Bayliss is going to come out higher up the rankings than predicted, and
    Checa is going to crash his arse out of contention. Capirossi will
    remain consistent and come in about 5th overall.

    JL
     
    John Littler, Nov 28, 2004
    #6
  7. John Littler

    Jules Guest

    That's cool.

    Did the variables: "No. of junior class podiums" and "Non GP
    Championship winner" exhibit a high degree of multicollinearity?

    How do you objectively define a low performing team?


    Jules


     
    Jules, Nov 28, 2004
    #7
  8. John Littler

    alx Guest

    You could have at least put SPOILER in the subject line ehehe.

    I haven't had a chance to watch the tape of next year's events yet so I'll look away now.
     
    alx, Nov 28, 2004
    #8
  9. John Littler

    John Littler Guest

    Could I have a copy please ? I'll just setup my online betting account
    so I can put some money down ;-)

    JL
     
    John Littler, Nov 29, 2004
    #9
  10. John Littler

    John Littler Guest

    No, none at all - because I didn't include winning the 250's or 125's as
    a non GP champ - it was BSB WSBK etc only. I figured the high quantity
    of jnr class podiums was a good enough descriptor, and I don't think it
    really matters too much in the scheme of things for GP success as to
    whether they won the 250s or came second.

    Well, that's the difference between a Pass and a HD isn't it ? :) When
    I get the marks back I'll tell you whether I succeeded in being
    objective <grin>

    My argument was around relative team budget size and gross turnover
    of the underlying factories - you can see that Yam, Duc, Suzuki and
    Honda have big numbers there, and Aprilia, Kwaka, Harris WCM, Proton etc
    don't. I also brought up the number of factory teams in other racing
    series (ie Ducati are somewhat small but have a strong racing heritage
    and lineage and a big presence in WSBK etc). Dunno if it'll fly :)

    If the lecturer knew more about bikes & GP he might catch me out on the
    relative size of Kawasaki Heavy Industries vs Kawasaki Motorcycles, and
    the Aprilia dominance in 125/250s but I think I'm safe there <grin>

    JL
     
    John Littler, Nov 29, 2004
    #10
  11. John Littler

    Jules Guest

    "the Aprilia dominance in 125/250s but I think I'm safe there <grin>"

    Until the professor Googles your name / topic looking for plagiarism ;-)
     
    Jules, Nov 29, 2004
    #11
  12. John Littler

    Moike Guest

    google is soooooo last year.

    We use www.turnitin.com these days.

    Don't even think about incorporating a little snippet you found on the web.


    Moike
     
    Moike, Nov 29, 2004
    #12
  13. John Littler

    John Littler Guest

    DOH ! :)

    Well there's no plagiarism at least, now THAT would be a drama ! If he
    thinks to google up this and finds me mentioning the weakness in my
    argument, well hey, good luck to him :)

    JL
     
    John Littler, Nov 29, 2004
    #13
  14. You can also select your own topics. I designed a racebike chassis
    around my IT490 engine for my thesis.
    It was a great excuse to learn all about bike handling and chassis
    design.

    Mark
     
    allgoodnamestaken, Dec 1, 2004
    #14
  15. How does it go predicting this year's results or the 2003 results?
    That might provide some validation without having to wait a year. It
    does seem that the usual suspects are up front though so it can't be
    too far off.

    Mark
     
    allgoodnamestaken, Dec 1, 2004
    #15
  16. John Littler

    TB Guest

    My third year dynamics thesis was on a suspension system based on the
    Hossack/Telelever system.

    TB
     
    TB, Dec 1, 2004
    #16
  17. John Littler

    Nev.. Guest

    If that is the measure of its validity then designing a formula is easy.

    last_season_result = next_season_result.

    Nev..
    '03 ZX12R
     
    Nev.., Dec 1, 2004
    #17
  18. John Littler

    John Littler Guest

    Not as easy as it sounds Nev. If you derive a descriptor of behaviour
    based on this year's results and then substitute last year's (and N lags
    before that if required) you aren't necessarily going to 100% predict
    this year's.

    FWIW, last season's = this season's IS a significant predictor of
    outcome, but if you think about it the rankings in 03 don't 100% match
    the 04 results - which is what he was suggesting.
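A common way to put a number on how closely two rankings match is Spearman's rank correlation (1.0 for identical orderings, -1.0 for exactly reversed ones). The 03 and 04 tables aren't in this thread, so the sketch below instead compares the two top-10 orderings posted earlier (new model vs submitted model); the function name is made up for the example.

```python
# Rider orderings copied from the two top-10 tables posted earlier in the thread.
new_model = [
    "BIAGGI Max", "ROSSI Valentino", "GIBERNAU Sete", "BARROS Alex",
    "TAMADA Makoto", "HAYDEN Nicky", "BAYLISS Troy", "MELANDRI Marco",
    "CAPIROSSI Loris", "CHECA Carlos",
]
submitted_model = [
    "ROSSI Valentino", "BIAGGI Max", "GIBERNAU Sete", "BARROS Alex",
    "TAMADA Makoto", "HAYDEN Nicky", "MELANDRI Marco", "BAYLISS Troy",
    "CHECA Carlos", "CAPIROSSI Loris",
]


def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two orderings of the same riders (no ties)."""
    if set(order_a) != set(order_b) or len(set(order_a)) != len(order_a):
        raise ValueError("both orderings must list the same riders exactly once")
    n = len(order_a)
    rank_b = {rider: position for position, rider in enumerate(order_b)}
    # Sum of squared differences between each rider's position in the two lists.
    d_squared = sum((position - rank_b[rider]) ** 2
                    for position, rider in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


rho = spearman_rho(new_model, submitted_model)
print(round(rho, 3))  # 0.964
```

The same function could be pointed at a predicted order and the actual end-of-season order to check how far "last season = this season" really gets you.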

    JL
     
    John Littler, Dec 1, 2004
    #18
  19. John Littler

    John Littler Guest

    Its usefulness to predict 04 wasn't too bad. You can't go much further
    back though - think about how often riders and teams come into and leave
    GP, or at least have done in the last few years - you end up with
    insufficient degrees of freedom to be able to draw stat sig conclusions.
    Given a full field is about 25 and there's been a turnover of at least 6
    or 7 a year, go back more than 2 years and 2/3rds of your field isn't
    there.

    Realistically it's a snapshot of a couple of years, there's too much
    volatility to expect a long term consistent model (IMHO, and not being a
    serious stats/maths head, it's just a subject in a course work Masters)

    JL
     
    John Littler, Dec 1, 2004
    #19
  20. John Littler

    Nev.. Guest

    But the general structure of the end of season points table is about the
    same. You wouldn't expect the end of season results to match perfectly
    because minor fluctuations in the results are inevitable, and while your
    complex formula would have taken propensity to crash into consideration (being
    a function of past years' championship points) it can't predict severity of
    crashes, which will lead to even greater fluctuations in results... etc

    Nev..
    '03 ZX12R
     
    Nev.., Dec 2, 2004
    #20
