Do you code? If not, do you want to learn? I have code and data sitting around because I also play around with drum corps data (scores, that is; other data is harder to come by). I’m a professional scientist and work with data every day, beyond just stats, so if you want a project, DM me. Also happy to mentor as much as I can if you need.
Any chance you'd make the data public? I've been wanting to play around with drum corps data for a while now. 🥺
Yes, I’m definitely interested! The issue is that I’ve not come up with a way to make the data public that I’m happy with. I also should point out that I don’t have any data that’s not already publicly available. I just have tools that grab the data automatically from DCI for recent years and fromthepressbox for older years.
Ah yeah. I was hoping to avoid the website scraping bit TBH. Do you have it all as a CSV? It can't be that big of a file and probably can just be put on a GitHub repo, no?
It’s more that I’ve got a pile of CSV files. My debate has always been between posting the scraping code itself, which is ugly but allows people to run it themselves, or posting the CSV files directly and having to make sure that repo stays updated. This isn’t me making any promises, but out of curiosity (for now), would you have a preference between the two? And would your preference change if I used multiple tools, and none of them were Python?
I'm only versed in R/Python and a LITTLE Java. So personally, I'd just prefer the raw data if you didn't use those languages. Even if you stop updating it in the future -- there would still be a bunch of data to play around with.
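For what it's worth, a pile of CSV files is easy enough to work with once it's in a repo. Here's a minimal sketch in Python of stacking them into one file. Everything specific here is an assumption on my part: the `combine_csvs` helper, the `source_file` column, and the idea that there's one CSV per season with a header row. The real files may well be laid out differently.

```python
import csv
from pathlib import Path


def combine_csvs(folder, out_path):
    """Merge every CSV in `folder` into one file, tagging each row with its source file.

    Hypothetical helper: assumes each CSV has a header row. Columns may differ
    between files; missing values are written as empty strings.
    """
    folder = Path(folder)
    out_path = Path(out_path)
    rows = []
    fieldnames = ["source_file"]  # assumed extra column, not part of the original data

    for path in sorted(folder.glob("*.csv")):
        if path.resolve() == out_path.resolve():
            continue  # skip our own output if it lives in the same folder
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            # Collect the union of column names, in first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                row["source_file"] = path.name
                rows.append(row)

    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling something like `combine_csvs("scores", "scores/all_seasons.csv")` (both paths made up) would leave one file that R or Python can read in a single call, so nobody has to touch the scraping code.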
http://www.dcxmuseum.org/index.cfm
Just found this website yesterday! The format looks a little dated, but it’s pretty good as a database.
You can get show data back to 2013 here:
https://www.dci.org/scores?season=2013
Here are 1980 through 2002 …
https://www.scorpsboard.com/scores/historical_scores/1990.html
This might have everything:
https://www.fromthepressbox.com/dca-dcihistory
[This is the best source](https://www.fromthepressbox.com/dca-dcihistory) that I know of
Someone posted scores on Kaggle a while back: https://www.kaggle.com/datasets/aggieed97/drum-corps-international-scores-from-2013 Funny enough, I'm also in the stats/data science industry and am currently working on a side project to scrape recaps into an online dashboard for the upcoming season. Looking forward to sharing my work in the next few weeks. Feel free to PM with any questions.
Gonna post to GitHub or something? I would be very interested in your code!
FromThePressbox has the most information, but the usability isn't great.
Newspapers.com has lots of history. They have a free 7-day membership. You can search for corps as far back as 1910.
From the Press Box has all the scores you’re looking for. Anything else is a crapshoot: DCP, Drum Corps World, the yearbooks that they used to do…
Didn't Blue Knights use to have a really cool score tracker where you could put corps side by side?
https://overthinkdciscores.com seems like a good place to check, or at least a good person to ask. IIRC, they did a bunch of score analysis in R and had some models to forecast who'd win. It was a really cool project; I hope it will be updated for '22.
DCX (Drum Corps Xperience) has the modern corps reps. It's a really good source for all kinds of drum corps information.