Guest_Dr. Roffles_* |
![]()
Post
#1
|
Guests ![]() |
Hey, everybody! Remember how I said I'd eventually put that 13 page study I wrote on the internet?
Well, now I did!
Part I - Introduction, background, and a primer on the data.
Part II - Pretty hardcore data analysis up ins
Part III - Some discussion and my archived predictions
Aww yeah. |
|
|
![]()
Post
#2
|
|
Advanced Member ![]() ![]() ![]() Group: Emeritus Posts: 2,062 Joined: 25-March 18 Member No.: 1 ![]() |
So what were the odds of Burke coming out of nowhere like they did?
|
|
|
Guest_Widget!_* |
![]()
Post
#3
|
Guests ![]() |
|
|
|
Guest_AK_WDB_* |
![]()
Post
#4
|
Guests ![]() |
I'm a bit confused...you only talk about the top 10 teams for each given year. Wouldn't the school, and the scores from that school in the past, be an important determinant in predicting a national score?
|
|
|
![]()
Post
#5
|
|
Advanced Member ![]() ![]() ![]() Group: Nazgul Posts: 1,519 Joined: 25-March 18 Member No.: 16 ![]() |
Great job on this!
But do the "indicator variable columns for certain states" count as a factor for ANY school from the big 5 (TX, AZ, WI, IL, CA)? In 2008, that extra variable shouldn't have counted for Pearland. Going in, they had just as much prior experience as a lolstate team going to nationals for the first time. Same for any other team from the big 5 attending nationals for the first time.
This post has been edited by madcap: Aug 4 2009, 04:23 AM |
|
|
Guest_Dr. Roffles_* |
![]()
Post
#6
|
Guests ![]() |
> So what were the odds of Burke coming out of nowhere like they did?

Before the state competition this year? Rather low. After state? Extremely high, actually. On my pre-Nats stat review, I put Burke lower than 3rd, but in any straight translation models (that is, the models that seem to be more generally accurate than the ones that use indicator states and other little tics) they got 3rd. The tier models generally had them in 4th or 5th, but any non-tier-dependent model had them at 3rd, and there was good reason to think that the tier model was too weighted towards the beginning of the decade. On top of that, I was hoping to incorporate a time series analysis into it, which I missed out on due to time constraints.

> I'm a bit confused...you only talk about the top 10 teams for each given year. Wouldn't the school, and the scores from that school in the past, be an important determinant in predicting a national score?

Not really. The dataset for each individual school from state to nationals is, for most teams, extremely sparse. A three-score translation dataset for Pearland, for instance, is hardly enough for a statistically significant statement about Pearland's specific propensity for improvement from state to nationals. While it would be nice to do a school-by-school stratified measure of improvement, that just isn't an option when you're working with datasets of 80 or fewer complete items.

Also, yes, I only talked about the top ten for each year. The general statistics reason for that is that scores become a lot more random below the top ten, and we aren't really all that interested in a model that's going to predict with large intervals the scores of teams 1-20 -- I'd rather have a model that predicted with small intervals the scores of teams 1-10. Also, there was again the problem with data availability. A lot of years, we couldn't even find the scores for all the top ten. Extend the top ten, and you risk having a dataset with too many holes to be particularly viable in any real research... not to mention the added time it'd take to track it down.

> Great job on this! But do the "indicator variable columns for certain states" count as a factor for ANY school from the big 5 (TX, AZ, WI, IL, CA)? In 2008, that extra variable shouldn't have counted for Pearland. Going in, they had just as much prior experience as a lolstate team going to nationals for the first time. Same for any other team from the big 5 attending nationals for the first time.

I'm not entirely sure I worded that clearly. There were five columns, one for each of the big five. If the school was representing AZ, they'd have a 1 in the AZ column and a 0 in all the other columns. Same with TX, CA, WI, etc. The goal was to see if there were significant relationships the model could extract from certain states.

If you still think Pearland should not have gotten marked as a Texas school, I'd have to respectfully disagree. This sort of argument could be made all the time, but what those indicator variables are mainly looking for is the overarching state trends -- that is, big states that traditionally drop in score because their subjectives are inflated, or vice versa. I'm pretty sure the experiments involving the Texas variable all had pretty slim correlations or effects, so I tended to avoid using it -- the only two indicator variables with real value of those five were (predictably) WI and IL, both states with a one-team AcaDec system where year to year results tend to be similar in composition. The TX/CA/AZ variables were diluted by the fact that the teams representing the state changed throughout the decade, which lessened the home-state effect a bit.

As Will stated, this would be at least slight evidence that I should try accounting for schools... that is, if that was at all feasible. Unfortunately, it really isn't. Still, nice to dream... |
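For anyone following along, the five indicator columns described above could be built roughly like this. This is only a minimal Python sketch, not the code from the study: the names `BIG_FIVE`, `state_indicators`, and `indicator_row` are made up for illustration, and the column order is simply the order the states are listed in this thread.

```python
# Hypothetical sketch: one 0/1 indicator column per "big five" state.
# Column order is an assumption (taken from the order used in the thread).
BIG_FIVE = ("TX", "AZ", "WI", "IL", "CA")


def state_indicators(state):
    """Map each big-five state code to 1 if it matches `state`, else 0."""
    code = state.strip().upper()
    return {s: int(s == code) for s in BIG_FIVE}


def indicator_row(state):
    """Return the same indicators as a list in BIG_FIVE column order."""
    flags = state_indicators(state)
    return [flags[s] for s in BIG_FIVE]


if __name__ == "__main__":
    # A team representing AZ gets a 1 in the AZ column and 0 elsewhere.
    print(indicator_row("AZ"))
    # A team from any other state gets 0 in all five columns.
    print(indicator_row("NY"))
```

Note that the indicator is keyed on the state being represented, not on the school, so under this scheme a first-time team from Texas is still marked as a Texas team.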
|
|
![]()
Post
#7
|
|
Advanced Member ![]() ![]() ![]() Group: Emeritus Posts: 2,062 Joined: 25-March 18 Member No.: 1 ![]() |
You're hired.
![]() |
|
|
![]()
Post
#8
|
|
Advanced Member ![]() ![]() ![]() Group: Coach Class Posts: 406 Joined: 26-March 18 Member No.: 10 ![]() |
Way over my head. The last math class I took was Geometry in high school about one hundred years ago. But it seems very accurate. Great job!
![]() |
|
|
Guest_overly_critical_man_* |
![]()
Post
#9
|
Guests ![]() |
|
|
|
![]()
Post
#10
|
|
Advanced Member ![]() ![]() ![]() Group: Emeritus Posts: 2,062 Joined: 25-March 18 Member No.: 1 ![]() |
Zombies can reply to threads already started. We haven't had spammer problems lately, so it snuck through.
|
|
|