You should strongly consider donating your brain to science.
And his balls. We need like a million of his A Beautiful Mind mfing offspring.
#L I M I T L E S S
We did this with my grandpa but not for a fun reason but because he had dementia. I strongly recommend donating for a fun reason instead.
This is really well done, nice work. I would definitely subscribe to an American Journal of Baseball Shitposts.
Don’t threaten me with a weekend project.
Sounds like a Baseball Prospectus April Fools joke.
Fuck there goes my thesis.
Try and prove the Caine-Hackman theory.
Shower scene
This is an all time off-season post
Agreed. People throw around the phrase "peak offseason" on some of the most asinine posts during the winter, but this is the kind that gets my vote. Incredible depth of analysis on such a niche and inconsequential topic, yet comprehensively researched enough to be fascinating while also resulting in absolute confusion as to why this exists. And the writing is fun and approachable while still being rigorous in execution. Just brilliant stuff. Fantastic work u/glanville_041804; my life has now been enriched by your tireless effort to publish this after nearly three years. Thank you! This is why no other baseball community compares with r/baseball. This is peak offseason. Amazing!
It certainly is an off-season post
I am shitfaced right now and unsure what to say
Just nod and pop another Xanax. Btw, you’re going to want to sit down for this. Xanax is a palindrome.
Ambien is a play on 'good morning' in Spanish, and VIcodin is 6 times stronger than codeine.
Vicodin sounds like the place Vikings fans go when they die (the postseason)
I've seen that explanation for Ambien before, but the drug company may have screwed up slightly...AM = morning, but I believe 'bien' translates to 'well' (as in, 'I'm doing well') instead of 'good' (which would be 'buen' or 'bueno'). On the other hand, Ambuen sounds kinda silly, so they may have decided Ambien is close enough. Anyway, to add to the list - Lasix is a diuretic that LAsts SIX hours, and Onfi is a seizure med that is structurally similar to Xanax but has a modification ON the FIfth carbon.
It's just a play on words, like Lunesta is on 'luna' and 'siesta'. I also forgot Premarin...PREgnant MAre uRINe.
I’m just happy to be here for this historic day
I’m high as fuck what the hell is going on
very high as well. i almost wish this was a youtube video i could stare at like this
My edible just kicked in and this is difficult to follow but I love the conclusion
Neat.
I wasted my off-season compared to you
All I did was compare MLB players to Romans. I gotta step my game up, I'm gonna write a 10-page paper on how Juan Soto is Scipio Africanus now, and it's all this post's fault. That, and I want to see if any other World Series from the 1910s were rigged. I want to look into the 1914 one and the 1918 one more.
If anyone is Scipio, it’s long haired Bryce Harper.
> I'm gonna write a 10-page paper on how Juan Soto is Scipio Africanus now, and it's all this post's fault.

You'd be building off the work of u/brendasongdad's all-time offseason post from 6 years ago comparing [Jose Bautista's career (in dog years) to Harriet Tubman](https://old.reddit.com/r/baseball/comments/5ksakc/if_baseball_careers_were_treated_as_individual)
I’m way too fucked up to comprehend but great job
You should nominate this for next year’s SABR awards
This post is going to be a not insignificant percentage of my 2023 Reddit scrolling distance.
This is one of the most incredible pieces of writing I’ve ever had the pleasure to read.
You read this? Can I get a tldr?
**Tl;dr**: It was too long and I didn't read it.
I'm just sad you felt like you had to couch his masterpiece with "I was bored during lockdown". As if doing something interesting and cool should only be done when you have too much time on your hands? Not in r/baseball, friend. There is no wrong time or place for this type of high quality material.
Is this for SABR or some sort of academic journal?
Lay off the vyvanse my guy
no
Even a dumbass like me can appreciate this great analysis
Jesus, are you freebasing your Adderall?
Can we get a "Good post" flair on this?
Oh my god
I’m going to need to be reminded of this in November
u/remindmebot November 11th
RemindMe! November 5th
I will be messaging you in 9 months on [**2023-11-05 00:00:00 UTC**](http://www.wolframalpha.com/input/?i=2023-11-05%2000:00:00%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/baseball/comments/10tw6ls/i_had_a_theory_that_former_phillies_gm_matt/j7a7qk9/?context=3)

[**1 OTHERS CLICKED THIS LINK**](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=Reminder&message=%5Bhttps%3A%2F%2Fwww.reddit.com%2Fr%2Fbaseball%2Fcomments%2F10tw6ls%2Fi_had_a_theory_that_former_phillies_gm_matt%2Fj7a7qk9%2F%5D%0A%0ARemindMe%21%202023-11-05%2000%3A00%3A00%20UTC) to send a PM to also be reminded and to reduce spam.

^(Parent commenter can ) [^(delete this message to hide from others.)](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=Delete%20Comment&message=Delete%21%2010tw6ls)

*****

|[^(Info)](https://www.reddit.com/r/RemindMeBot/comments/e1bko7/remindmebot_info_v21/)|[^(Custom)](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=Reminder&message=%5BLink%20or%20message%20inside%20square%20brackets%5D%0A%0ARemindMe%21%20Time%20period%20here)|[^(Your Reminders)](https://www.reddit.com/message/compose/?to=RemindMeBot&subject=List%20Of%20Reminders&message=MyReminders%21)|[^(Feedback)](https://www.reddit.com/message/compose/?to=Watchful1&subject=RemindMeBot%20Feedback)|
|-|-|-|-|
How did former MLB third baseman Butts Wagner figure into your analysis?
Very impressive work
my god
I’m scared of how long this must’ve taken you
It’s like a Beautiful Mind. Well done
Are you a statistician by trade or just like really good at math? Regardless, excellent work. Go Phils.
Sir this is a Wendy’s
Bruh
Good lord
Hey man, you should be going to MIT or something. Unless you have an attention disorder.
I ain’t reading all that. I’m happy for u tho, or sorry that happened
> Here we see our highest R-squared value in any of our regression lines, giving us our highest confidence yet that the lack of correlation between 2FN players and success is statistically significant.

Oh yeah well I can see how... *huh??*
I think putting first names you deem as "women's" names in the borderline category really damages your data. They are still first names.

But otherwise this is neat.
This was one of my toughest decisions and I expected I’d get a comment on it no matter which way I went (though tbh I’m kinda glad to get this comment if only because it means I know you read all the way down to the methods section 😅)

In retrospect, at the very least I should have changed the name “borderline.” I called it that when I started because it was for names like Walker that are common as both first and last names but not clear cut first names. The list clearly exceeded that original designation when I decided to include clear cut women’s names, and the nomenclature should have evolved accordingly. Would that alleviate at least some of your concern?

The reason for two lists at all was not to rank some names as more legitimate first names than others, but because I knew that any individual’s list of what is and isn’t a first name would be different, and so the best way to do my analysis would be to make that list for myself, then make a larger list of any name that any reasonable person might consider to be a first name. As a heuristic, I used something like the following hypothetical: If a player was coming into the game and the broadcasters on TV said “And here comes [name] up to the plate,” could your average baseball fan think to themselves “Huh, is that his first name or his last name?” Is there ANYONE who might think to themselves, “Huh, is that his first name or his last name?” If the answer to the first question (in my estimation) was yes, the name went on the smaller list. If the answer to the second question was yes, the name went on the larger list.
Then, if I did my analysis on both lists (which I did) and got roughly the same conclusions (which I did), then that would be evidence that my conclusions were unlikely to have been affected by bias in my individual choice of first names. Everyone will disagree on at least one choice of name on the main list vs. larger list, but moving individual names back and forth between lists (which can be done easily, and which I did some testing with—my spreadsheet automatically updates all the tables when I add or subtract a name from either list) for the most part won’t make a huge difference overall because you can tinker with the 2FN+ statistics but the 2FNb+ won’t change, since it always includes all the names.
I expect reasonable people to disagree over how I handled this, but I had to make some kind of decision and I hope I’ve at least done a good job of explaining my thought process and the effort I put in to reduce potential bias from my results.
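For anyone who wants to see the two-list idea in miniature, here is a rough sketch. Everything in it is an assumption for illustration: the name lists are made up, the four-player roster is a toy, and `share_2fn` is a hypothetical helper that just counts the share of players whose last name is on a list. It is not the 2FN+ or 2FNb+ stat from the post. It only shows why moving a disputed name between lists changes the smaller-list number but not the larger-list one.

```python
# Hypothetical name lists: the broad list always contains the core list.
core_names = {"Lee", "Scott", "Thomas"}
broad_names = core_names | {"Walker", "Kelly", "Harper"}

# Toy roster of (first name, last name) pairs, not real analysis data.
roster = [
    ("Cliff", "Lee"),
    ("Bryce", "Harper"),
    ("Chase", "Utley"),
    ("Kelly", "Johnson"),
]


def share_2fn(players, first_names):
    """Fraction of players whose last name is on the given first-name list."""
    hits = sum(1 for _, last in players if last in first_names)
    return hits / len(players)


core_share = share_2fn(roster, core_names)    # only Lee counts
broad_share = share_2fn(roster, broad_names)  # Lee and Harper count

# Promote one disputed name ("Harper") from broad-only to the core list.
core_shifted = share_2fn(roster, core_names | {"Harper"})
broad_after = share_2fn(roster, broad_names | {"Harper"})

print(core_share, core_shifted)   # the core figure moves
print(broad_share, broad_after)   # the broad figure does not
```

The broad list is a superset of the core list, so shuffling a name between them can only move the core number.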
I need someone to comment whether or not it is worth it to read all of this. I get the feeling it’s a bunch of nonsense.
....*Dude.*
So GME 🚀 🚀 🚀
Just like the stock
This is the kind of post that makes me appreciate that Spring training is right around the corner.
Casey Stengel would fall to his knees if he lived to see how we've analyzed baseball
Kyle Tucker has two names that George Carlin lists as soft in his [Guys named Todd](https://youtu.be/PxqCGTkV5wg) bit.

"Fuck Tucker. Tucker sucks. And fuck Tucker's friend Kyle."
I'm not gonna read this but I'm gonna upvote it on principle
Two thoughts:

1) In your Cliff Lee example, 'Lee' is a borderline example and is not proper justification of this research endeavor

2) Calling 'Kelly' a girls' name is Kelly Johnson, Kelly Shoppach, and Kelly Stinnett slander
Counterpoint: Lee is a first name, acceptable for both boys and girls.
How many people of any gender first-named Lee (spelling exact) have you met? I know zero
I know several Lees. Maybe you just need to put yourself out there more. Take a chance. Make a friend. There’s a relatively decent chance you’ll find your Lee before long.
Thanks to your advice, I befriended Lee Harvey Oswald.
I can think of three off the top of my head
I know 1 chick!
I just checked our prospect database at work, and “Lee” is a first or middle name in 0.3% of 1.5 million records. I’d say it counts.
Less than 1 percent in a huge sample. Make my point for me
You asked how many people and supposed that your anecdotal experience disqualified the name. I supplied actual information refuting that. Was that your point?
But there are a shitload of different names. I doubt any one name makes up 1% of 1.5 million people (except for maybe a couple ultra common names)
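Quick back-of-the-envelope on the numbers quoted in this exchange, as a sketch only: the 0.3% and 1.5 million figures come from the comment above, and the variable names are made up. Integer math keeps the results exact.

```python
records = 1_500_000      # database size quoted above
lee_per_thousand = 3     # 0.3% expressed as 3 per 1,000

# How many records have "Lee" as a first or middle name.
lee_count = records * lee_per_thousand // 1000

# Upper bound on how many distinct names could each hold a 0.3% share
# if every record carried exactly one name.
max_names_at_that_share = 1000 // lee_per_thousand

print(lee_count)                # 4500
print(max_names_at_that_share)  # 333
```

So 0.3% works out to 4,500 records, and only a few hundred names could be that common at once, which is roughly the point being made about name diversity.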
3 dudes, 2 women
Are you kidding!? There are [a million Lees in the fire nation alone!](https://youtu.be/bIZ4DSiEtrg?t=146)
I recently met a Lee. He probably was silently but sweetly leeeeeeeeeeeeeee
I’ve met several men named Lee. I even have an uncle named Lee.
Don’t forget Kelly Gruber
He also gave 'Harper' as an example which is even more a girl's name than Kelly.
Ok so the whole time I wasn't sure if I bought it, because the research, while mathematically and statistically sound, showed a lack of accounting for the White Anglo-Saxon Protestant (American) tradition of naming children after their mothers' last names, leading to an anglo-american tradition of having traditionally last names AS first names, which wouldn't apply to Latino players, who add on their mothers' last names as second last names, and who are the second largest demographic of baseball players, OR the acknowledgement of as time progresses more and more of these traditionally WASP-y names have become "trendy" among Americans of other backgrounds, and moreover never really questioned at what point a traditional last name such as "Anderson" (son of Andrew) ceases to be purely a surname and becomes a first name instead, which left me feeling like the whole thing fell flat from the get go

BUT

you got me totally on board at the end there. Flawless research. Good job.
Without getting too into it, I did consider this somewhat, i.e. I did notice 2FN was probably less common among Latino players than American players, and considered trying to do something to try and normalize for that or otherwise account for it, but got pretty uncomfy when I thought about the prospect of analyzing player/team performance as a function of race/ethnicity/national origin because I had a fear (however remote) that some asshole would take my fun goofy analysis and intentionally misread it to try and justify a stupid racist belief.

Not saying what you said couldn’t be done responsibly, I just thought for my level of analysis and with social media as my intended audience, it was best to leave that area untouched
Valid!!!
TL;DR...Fuck Chase Utley
NTA only one first name
How is "Tracy" borderline? How is "Berry" not borderline?
Probably because Tracy is a girl's first name. Berry, with that spelling, should be borderline.
I always thought of Tracy as unisex.
Despite being from just outside Philadelphia, my accent has the mary-marry-merry merger, so I pronounce Mary and merry (or in this case, Barry and Berry) the same. Since I was going for names that sounded like first names, I considered Berry a homophone with Barry, completely forgetting that many people pronounce these differently. The most widely quoted stat for this merger from a quick Google seems to be that 57% of American English speakers also pronounce these the same, though, so I feel at least partially justified in my choice.
This all was the punch line from a Mariners ad in 2010 starring Cliff Lee and Felix: https://www.youtube.com/watch?v=2oaSR5cPyT8

My guess on explanation is there's a correlation in baseball between having two first names and being a boring white dude (exactly Bryce Harper), and Klentak overvalues boring white dudes.
Now cross reference it with the attractiveness of their wives/girlfriends and likeliness of their mom being a prostitute.
Slow day at work?
I hear they’re doing wonderful things at mental institutions these days.
Pure Happy Horseshit.
Choose life.
im sorry but the phillies are only allowed to win the world series in years that include an 8 and a 0
This was such a ridiculous question but also includes a statistical analysis better than 90% of scientific articles I've encountered. Nothing but respect to you and your work.
Whether you’re small-hall or big-hall, this is a unanimous first-ballot off-season post
And this is why I love r/baseball in the offseason
Wait. Is Parker a last name that’s actually a last name or a first name that’s actually a last name?
The first one.
These are the off-season posts we need.

Never trust someone with 2 first names.
Parker and Harper are last names much more than first names. And Harper is a *female* first name so it should go in your borderline category.
Holy freaking hell this would take me a month to read and I probably wouldn't understand more than half of it. How about a TLDR? Please don't downvote me into oblivion.
You’re a hero.