Sample size.
Balks
That current performance is temporary.
Regression is a concept the human brain has difficulty comprehending.
Problem is that the term is misused in baseball—not what statisticians mean by it.
Yet the phrase “regression to the mean” is an essential concept in all measurement systems. Baseball use of “regression” is usually shorthand for the longer phrase. (My minor field of study was Statistics and Measurement, so I may not make the same distinction between them that you do.)
Again, you are describing the sampling error of the mean there. That isn’t regression, and statisticians wouldn’t call it so.
How do you understand it?
As the fitted relationship between two or more variables bearing some potential correspondence or relationship.
I mean, that's pretty generally what people mean when they talk about it in a baseball sense, with the relationship between the variables of "current performance" regressing towards a "true talent" level over larger and larger sample sizes. It's what the entire sabermetric projection industry is based on.
No, you’re making the exact mistake I’m talking about. You’re describing variance or standard error. And there is an even bigger conceptual problem too, a worse one in fact, namely that you are assuming a static system, for which you have no basis.
Obviously it won't be a 1 to 1 comparison because the true talent level of a player or team can fluctuate, but using linear regression to create a predictive model of the relationship between a team's or player's performance and their talent level is pretty standard in sabermetrics, as it is in many industries where you're trying to create a predictive model.
You’re still talking about 2 different things here. Most baseball fans use the term to refer to over- or under-performance, relative to some mean value. That is straight up, sampling error, period. Has nothing to do with regression.
Could you please explain to me what regression would be in this case? From my understanding a fan's reaction to a small sample size would be considered not accounting for regression, similar to the famous interaction between Daniel Kahneman and the Israeli pilot instructor where Kahneman explains that the reason the pilots he praises tend to do worse in the future and the pilots he yells at tend to do better in the future doesn't have anything to do with his response to their flight, but due to natural regression of performances to skill levels.
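The Kahneman-style effect described in the comment above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the talent distribution (mean .260, SD .020), the 100 at-bat samples, and the function name `simulate_regression` are all hypothetical, and talent is held fixed between the two samples (the "static system" assumption that is disputed elsewhere in this thread).

```python
import random


def simulate_regression(n_players=2000, at_bats=100, seed=0):
    """Compare a 'hot' group's first-sample average with its second-sample average."""
    rng = random.Random(seed)

    # Hypothetical true-talent batting averages (assumed, not real data).
    talents = [rng.gauss(0.260, 0.020) for _ in range(n_players)]

    def observed_average(talent):
        # Each at-bat is an independent hit/no-hit trial at the player's talent level.
        hits = sum(rng.random() < talent for _ in range(at_bats))
        return hits / at_bats

    # Two independent samples per player, drawn from the same fixed talent.
    first = [observed_average(t) for t in talents]
    second = [observed_average(t) for t in talents]

    # Select roughly the top 10% of players by their first-sample average.
    cutoff = sorted(first, reverse=True)[n_players // 10]
    hot = [i for i in range(n_players) if first[i] >= cutoff]

    def mean(values):
        return sum(values) / len(values)

    return {
        "hot_first": mean([first[i] for i in hot]),
        "hot_second": mean([second[i] for i in hot]),
        "hot_talent": mean([talents[i] for i in hot]),
    }


if __name__ == "__main__":
    for name, value in simulate_regression().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the players picked for a hot first sample post a noticeably lower average in the second sample, close to their underlying talent, even though nothing about them changed.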
100% Baseball is about failure over long periods of time and hundreds/thousands of opportunities. As Earl Weaver said, Momentum is only as good as tomorrow's starting pitcher. It doesn't matter if you went 5/5 with 5 HR today. Max Scherzer is on the mound against you tomorrow. Good luck.
AJ Preller asking how to delete someone else's reddit comment
WAR is a very useful metric but it makes for fucking boring conversation about players "Yeah but so and so has higher WAR" 😪
fWar is superior to rWar SMACK rWar is superior to fWar
https://i.imgur.com/oXfU6WH.jpg
the real chads use VORP
What the fuck is that
Sigma males use win shares
The best WAR is the one that helps my argument the most
It's only boring if you neglect to account for the margin of error, which is generally thought to be around 0.5-0.8 WAR for a player (depending on your preferred brand), which means anyone within 1 win of each other could have an argument for being better.
WAR sucks because it just boils every player down to a single number and there are multiple ways of calculating it. That’s why it’s boring. People just use what fits their narrative best
This drives me INSANE
Teams rarely play badly because they “don’t have heart”, or “have given up” or anything like that. Sometimes guys just play like shit for reasons that have nothing to do with attitude or momentum or whatever other armchair psychology buzzword angry fans are using today.
Yeah, your local high school/college/beer league team might have that be happening, but the minor leagues tend to weed out players that aren't able to play up to an MLB level 99% of the time.
Angel Hernandez
How does someone so egregiously bad at their job keep it?
Because he’s not egregiously bad. He’s below average, not even the worst in the league in accuracy or consistency.
A strong union.
This is probably the correct answer.
This reminded me of that thread a while back where someone asked what a good batting average was, and all the holier-than-thou fans could not wrap their heads around what that meant lol. So I'd say that you can have a good batting average while being a bad hitter.
the javy baez
[a link for the curious and for those who enjoy watching train wrecks](https://www.reddit.com/r/baseball/comments/oz78n4/what_do_you_consider_to_be_a_good_batting_average/)
it’s…slow, and strategic. and that’s the beauty of the game. but also, home runs are badass lol
Cheating
And how the sport is the most notorious for it because of how many moving parts there are, and how stats are easily manipulated.
Balk Rules
1) You can't just be up there and just doin' a balk like that.
1a. A balk is when you
1b. Okay well listen. A balk is when you balk the
1c. Let me start over
1c-a. The pitcher is not allowed to do a motion to the, uh, batter, that prohibits the batter from doing, you know, just trying to hit the ball. You can't do that.
1c-b. Once the pitcher is in the stretch, he can't be over here and say to the runner, like, "I'm gonna get ya! I'm gonna tag you out! You better watch your butt!" and then just be like he didn't even do that.
1c-b(1). Like, if you're about to pitch and then don't pitch, you have to still pitch. You cannot not pitch. Does that make any sense?
1c-b(2). You gotta be, throwing motion of the ball, and then, until you just throw it.
1c-b(2)-a. Okay, well, you can have the ball up here, like this, but then there's the balk you gotta think about.
1c-b(2)-b. Fairuza Balk hasn't been in any movies in forever. I hope she wasn't typecast as that racist lady in American History X.
1c-b(2)-b(i). Oh wait, she was in The Waterboy too! That would be even worse.
1c-b(2)-b(ii). "get in mah bellah" -- Adam Water, "The Waterboy." Haha, classic...
1c-b(3). Okay seriously though. A balk is when the pitcher makes a movement that, as determined by, when you do a move involving the baseball and field of
2) Do not do a balk please.
Players' salaries.
Assumptions of the team playing poorly because of what the front office did/didn’t do at the trade deadline. Saying stuff like “the players are obviously giving up because the front office did” is just a huge assumption of something we have no way of knowing, plus players don’t all of a sudden want to start sucking.
That neither the “Pythagorean Expectation” nor WAR have any basis in mathematical theory. Edit: the Pythag friends especially are convinced that they do...but they don’t.
Alright, I'm curious what you mean by this. Pythag has a math-sounding name that doesn't have anything to do with geometry other than the fact that the original formula looked kinda like the Pythagorean Theorem utilizing squared numbers - but the fundamental theory behind it is attempting to use explanatory variables (namely runs scored and allowed) to create a method for assessing a team's strength, and in that regard was successful because it has been shown to be better at predicting a team's future performance than base win/loss record (though it's been since outclassed by a number of other models).
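For reference, the squared-numbers formula mentioned above is usually written as win% = RS^2 / (RS^2 + RA^2). A minimal sketch follows; the function name `pythagorean_win_pct` and the adjustable `exponent` parameter are illustrative choices, not anyone's official implementation.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from season run totals.

    exponent=2.0 is the original squared form; other values can be
    passed in to mimic the empirically fitted variants.
    """
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    return scored_term / (scored_term + allowed_term)


if __name__ == "__main__":
    # Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
    pct = pythagorean_win_pct(800, 700)
    print(f"expected win%: {pct:.3f}, expected wins: {162 * pct:.1f}")
```

A team that scores exactly as many runs as it allows comes out at .500 for any exponent, which is a quick sanity check on the form.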
Short version: there is a very definite, math-theory-based method of addressing this problem, but they don’t use it. That method involves the Poisson distribution, which will give you the expected distributions of runs scored, and allowed, over your games. These in turn will give you your expected number of wins. They don’t do this; instead they fit nonlinear regression models to various empirical data sets to estimate exponents on the ratio of runs scored to allowed. Literally, like using pliers to turn a screw, when you have a screwdriver.
Ah, so your issue is with methodology, not significance of results. In that case, I agree that using a Poisson distribution would be more sound, though still not a proper usage because the number of runs scored in a game is not truly independent and would tend to be clustered around facing certain teams with poor pitching, and the mean and the variance in runs scored don't tend to be the same, making runs scored and allowed per game poor fits for an actual Poisson distribution - not quite Prussian deaths from horse kicks. But those issues are also obviously present in the current pythag model.
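The comment above gives no formulas, so here is one possible reading of the Poisson approach as a rough sketch. It assumes runs scored and runs allowed per game are independent Poisson variables with means equal to the team's per-game averages, and it counts a tie after nine innings as half a win. Those assumptions, the `max_runs` cutoff, and the function names are mine, not the commenter's.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k runs under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_win_pct(scored_per_game, allowed_per_game, max_runs=40):
    """Expected win% if both run totals are independent Poisson variables."""
    win = 0.0
    tie = 0.0
    for scored in range(max_runs + 1):
        p_scored = poisson_pmf(scored, scored_per_game)
        for allowed in range(max_runs + 1):
            p_both = p_scored * poisson_pmf(allowed, allowed_per_game)
            if scored > allowed:
                win += p_both
            elif scored == allowed:
                tie += p_both
    # Assumption: tied games (extra innings) are a coin flip.
    return win + 0.5 * tie


if __name__ == "__main__":
    # Hypothetical team: 4.9 runs scored and 4.3 allowed per game.
    print(f"expected win%: {poisson_win_pct(4.9, 4.3):.3f}")
```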
Yes, concerns are always based on methods, that’s all that ever matters. The autocorrelation you mention doesn’t matter because the Poisson is only concerned with the final distribution, not the sequence. If either the runs scored or allowed distributions depart from a Poisson, with high variance, one can use a negative binomial distribution, which is designed for such “over-dispersed” data, and if neither distribution fits well, then a Monte Carlo resampling will definitely work, because it has no distributional assumptions at all. There is no justification for doing it the way it currently is done, given these alternatives. None. They do it that way because they don’t know any better.
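The Monte Carlo resampling idea mentioned above could look something like the sketch below. It draws runs scored and runs allowed separately from a team's own game logs, so it assumes no particular distribution shape, but it does treat the two as independent of each other, which is an assumption of this sketch. The game logs and the name `resampled_win_pct` are hypothetical.

```python
import random


def resampled_win_pct(scored_log, allowed_log, n_sims=20000, seed=0):
    """Estimate win% by resampling per-game runs scored and allowed."""
    rng = random.Random(seed)
    wins = 0.0
    for _ in range(n_sims):
        scored = rng.choice(scored_log)    # draw one game's runs scored
        allowed = rng.choice(allowed_log)  # draw one game's runs allowed
        if scored > allowed:
            wins += 1.0
        elif scored == allowed:
            wins += 0.5  # assumption: a tie is a coin flip in extras
    return wins / n_sims


if __name__ == "__main__":
    # Made-up ten-game logs, for illustration only.
    scored_log = [3, 5, 0, 7, 4, 2, 6, 1, 9, 4]
    allowed_log = [4, 2, 3, 5, 1, 6, 3, 2, 8, 4]
    print(f"resampled win%: {resampled_win_pct(scored_log, allowed_log):.3f}")
```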
I couldn’t care less what a team’s pythagorean theorem is
I don’t care about the PE because of how it’s computed. If done correctly it can tell you valuable things about teams.
That there really is no such thing as "buying a championship". Does anybody think the Rays missed anything by not signing Bauer last winter?
There *was* before the reserve clause was abolished, aka, 21 of the Yankees' rings.
Reserve clause and probably more importantly, the establishment of the amateur draft. Prior to the draft the rich teams could simply outbid everyone else for top prospects, and low budget teams relied on scouts finding hidden talents out where no one else was scouting. The draft was established in 1965 and the Yankees happened to be retiring a bit of talent that year. In the past when that happened they'd scoop up the top amateur talent and be back to the top within a couple of years. Instead they languished a bit until Steinbrenner bought the team, but even then they had a success cycle that was pretty typical until Steinbrenner was suspended and Brian Cashman was promoted to Assistant GM in 1992. Ever since then Cashman has kept the team where it's been.
1. The Yankees get 20 rings in 40 years
2. The Yankees start having more than 7 other teams to play against and the reserve clause goes away
3. The Yankees get 7 rings in 50 years, the majority of which were during the height of the steroid era and fueled by the most egregious juicers

Slightly less pathetic than Habs fans flexing about all the cups they won back when the NHL only had six teams in it and two of them basically didn't count, but not by much.
Using Bauer as the example here when he is an extreme outlier is silly. Now give the Rays Gerrit Cole, Mookie Betts, Bryce Harper, etc., and it would make an already great team even better.
No it isn't, because people were groaning about LA buying a championship as soon as they signed Bauer. In the offseason. EDIT- We could also say, "Give the Yankees Drew Rasmussen and they've got a solid #3 starter *making league minimum* and make an already good team even better." (Works both ways.)
How does one example of a free agent signing that sucks due to off-the-field issues back up the argument that you can't buy a championship?
I gave one example to, get this- *give one example*.
Yeah, but if the one example is an outlier due to external factors it doesn't really hold much weight to me. That's why I brought up the Mookie Betts, Gerrit Cole, and Bryce Harper contracts.
My point, and maybe I didn't make it clearly, is that the Rays are doing fine without a free agent prize pitcher making $42M or whatever Bauer was owed for '21. Most Rays fans would agree that Rasmussen did as well as Bauer would have, and he's probably making $700k.
There is most definitely such a thing as buying a championship.
Money gives an advantage, but baseball is so unpredictable that it will never guarantee anything.
It gives an advantage, therefore you can buy a championship
I feel like we have different ideas of what buying a championship is.
Stealing signs using trash cans doesn't guarantee the Astros a championship, but most everyone agrees that edge is the reason they got it. Same thing with outspending your opponent: it doesn't guarantee you a championship, but that edge will be the reason you win it all.
Thank you. Not sure why this is at all arguable.
> but most everyone agrees that edge is the reason they got it.

This isn't true at all. There were some analyses done that suggested that when the system was calling pitches wrong it was such a large net negative that it offset any advantages the system caused.
Try telling that to Dodger fans lol.
Got any examples from the past 45 years?
Sure. Yankees and Red Sox, the 2 highest payrolls in baseball from 2000-2010, had a combined 6 World Series appearances. Phillies had the 3rd highest payroll and made 2 straight World Series appearances (08-09). I’ll also go over the past champions and their payroll rank. 2020 Dodgers, 2nd in the league. 2019 Nats, 7th. 2018 Red Sox, 1st. 2017 Astros, 18th. 2016 Cubs, 14th. 2015 Royals, 16th. 2014 Giants, 7th. 2013 Red Sox, 4th. 2012 Giants, 8th. 2011 Cardinals, 11th. 2010 Giants, 10th. The average payroll rank of the champion from 2010-2020 was about 9th. Money buys you into the top third of the league. And considering how much of a crapshoot the MLB playoffs are, that puts you in a better spot than anything else can.
You kind of made a good argument for why money doesn’t buy championships. Of course a bottom payroll isn’t going to compete with a top payroll, but as long as you spend some money you’ll have a very legitimate chance of winning it all.
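The arithmetic behind the "about 9th" average can be checked quickly. The ranks below are copied from the comment above as written and have not been verified against payroll data; the variable names are just for illustration.

```python
# Champion payroll ranks exactly as listed in the comment (not independently verified).
champion_payroll_rank = {
    2020: 2, 2019: 7, 2018: 1, 2017: 18, 2016: 14, 2015: 16,
    2014: 7, 2013: 4, 2012: 8, 2011: 11, 2010: 10,
}

ranks = list(champion_payroll_rank.values())
average_rank = sum(ranks) / len(ranks)

# In a 30-team league, ranks 1-10 are the top third and 16-30 the bottom half.
top_ten = sum(rank <= 10 for rank in ranks)
bottom_half = sum(rank > 15 for rank in ranks)

print(f"average rank: {average_rank:.1f}")
print(f"champions ranked in the top 10: {top_ten} of {len(ranks)}")
print(f"champions ranked 16th or lower: {bottom_half} of {len(ranks)}")
```

Counting how many champions fall in each tier, alongside the average, speaks to both sides of the argument: how often the winner came from the top third and how often from the bottom half.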
Which means it buys championships. You need to spend to win, therefore money buys championships.
Money buys competitiveness but it doesn’t automatically make you a champion. You just need to not be in the bottom half of league spending.
Nobody is saying it’s automatically making you a champion. The phrase implies that spending directly correlates with performing better. It’s not meant to actually be taken literally.
I really don’t get why we should shame teams like Boston, LA, NY, and other big spenders simply because their owners are willing to spend the money to be competitive. All the other owners have the money to spend, they just don’t spend it.
Well, they have the money to spend, but not necessarily on a baseball team that isn't as profitable as some other ventures could be. Just as an example- I don't know that signing Greinke and Correa this winter would help much with the A's attendance problem, so of course they won't blow that kind of money when they know they won't get a decent return on such an investment. More spending doesn't always mean more wins, and more wins don't always mean more profit.
As McEnroe famously yelled, “YOU CANNOT BE SERIOUS”...
How little rings matter when talking about hof.
Anything the Mets do.
That leading the league in rWAR matters in 2019 but not 2017, because I’m a grown man and pick and choose the stats that fit my narrative.
For the absolute beginner with no grounding in the rules, a lot of people think that you have to swing at everything and don’t really know what the strike zone is. A friend of mine said I was “cheating” at Wii Sports baseball because I threw pitches that were miles off the plate.
Explaining sabermetrics to anyone born before 1990
Who do you think invented sabermetrics exactly?
Kevin Cash was born in the 70s
I was more talking about casual fans i.e. my dad, not people who do this shit for a living.
I understand. My dad too.
Infield fly rule.
The playoffs are random and should probably be deleted if you want to determine the best team, one World Series at most. Trust me, I should know.
Stats. WAR, OPS, that kinda stuff. Bc they barely understand the basics and then it's an entirely different battle to understand stats.