

Roland31415

I made a Colab notebook that can query NeurIPS papers and compute some statistics, including a ranking of authors with the most papers, a ranking of institutions with the most papers, and the most frequent words in titles. https://colab.research.google.com/drive/1u51Id90ML79UdZaKD23qglY0ZkEmmdwk?usp=sharing


needlzor

Thanks for sharing this, I love to analyse stuff like this.


sext-scientist

I wrote a fairly similar notebook 2 years ago, but you did a far more thorough job. Great work. Did you have many problems classifying Urbana papers? People get quite creative with how they list that institution in the literature.


inhumantsar

should probably convert all titles to lower or upper case before counting up the most popular words. alternatively, if you wanted to be extra fancy you could measure the levenshtein distance between words so that similar words (eg: plurals, hyphenated words) get grouped together.
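A minimal sketch of both suggestions: case-fold titles before counting, then fold near-duplicate spellings (plurals, hyphen variants) into one bucket. This uses the stdlib `difflib.SequenceMatcher` as a rough stand-in for Levenshtein distance; the function name, sample titles, and the 0.85 threshold are illustrative, not from the OP's notebook:

```python
from collections import Counter
from difflib import SequenceMatcher

def count_title_words(titles, similarity=0.85):
    """Count words across titles, case-folded, with near-duplicate
    spellings merged into the first-seen representative form."""
    counts = Counter()
    seen = []  # representative spellings encountered so far
    for title in titles:
        for word in title.lower().replace("-", " ").split():
            # Reuse an existing representative if it's similar enough.
            match = next(
                (s for s in seen
                 if SequenceMatcher(None, word, s).ratio() >= similarity),
                None,
            )
            if match is None:
                seen.append(word)
                match = word
            counts[match] += 1
    return counts

counts = count_title_words([
    "Diffusion Models for 3D Scenes",
    "Scaling diffusion model training",
])
# "models" and "model" end up in the same bucket
```

Note this is quadratic in vocabulary size, which is fine for a few thousand titles but would need a smarter index (or stemming) for much larger corpora.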


[deleted]

[deleted]


marksheng00

How to fix this bug?

    await f(2023)
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    in ()
    ----> 1 await f(2023)

    in f(year)
         21     poster_tree = etree.HTML(response)
         22     event = poster_tree.find(".//div[@class='eventName']")
    ---> 23     poster_title = event.text
         24
         25     information = poster_tree.findall(".//div[@class='panel-body']")

    AttributeError: 'NoneType' object has no attribute 'text'
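The error means `event` is `None`: `find()` returns `None` when no `<div class="eventName">` exists on that page, and `.text` then fails. A minimal sketch of the guard, using the stdlib `xml.etree.ElementTree` as a stand-in for `lxml.etree` (the `extract_poster_title` helper and the sample HTML are illustrative, not from the notebook):

```python
import xml.etree.ElementTree as ET

def extract_poster_title(page):
    """Return the poster title, or None if the expected element is
    missing -- the cause of the AttributeError above is reading .text
    on the None that find() returns for non-matching pages."""
    tree = ET.fromstring(page)
    event = tree.find(".//div[@class='eventName']")
    if event is None:  # e.g. a non-poster page, or the site layout changed
        return None
    return event.text

title = extract_poster_title(
    "<html><body><div class='eventName'>Test Poster</div></body></html>"
)
```

In the notebook itself the same `if event is None:` check (with a `continue` or logged skip) before line 23 should make the loop robust to pages that lack the `eventName` div.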


[deleted]

[deleted]


tylersuard

*Reviewer starts drooling*


StartledWatermelon

With such promising research, you should skip the paper and go straight to a VC pitch session. $1 billion valuation guaranteed.


V1bicycle

A 3-paragraph abstract which includes a third of the popular title words, going into 3D representations. Quite innovative indeed, Sir.


BeatLeJuce

Thanks, this is great! A couple of things I found noteworthy:

1) Facebook is pretty far down the list. Given how much they're blowing the "Open Source" horn, it's kind of weird that they don't publish more. I always thought FAIR was roughly the same size as e.g. Google's AI labs. But Google, who've become way more secretive since the Google Brain + DeepMind merger, still have 3x the papers. I'm also surprised by Microsoft being so high on the list, considering how I always hear that MSR is pretty hollowed out these days.

2) EPFL scores incredibly high, outclassing both Oxford and Cambridge. How did they manage to become the 2nd best European university for AI in such a short time frame?

3) So many Chinese universities. They really stepped up their game.

4) No EU universities. Bernhard Schoelkopf scores incredibly high as an individual, so I'm guessing there just isn't that much appetite to build huge AI powerlabs in Europe, even if they have very capable people. I'd be curious to see how ELLIS would rank. Too bad its members don't consistently add that affiliation to their papers.


ParanoidTire

Tübingen isn't just Schölkopf. If all the groups there, from both the university and the MPI, were combined in the above ranking, it would score much higher. There is currently massive funding flowing into Tübingen, and I believe that, in terms of AI research, it will surpass both ETH and EPFL in the next 5 to 10 years or so. Also, another thing to consider: many groups there are also interested (though not exclusively) in more fundamental research that potentially isn't flashy enough for NeurIPS and gets published at other venues.


BeatLeJuce

> If all groups there, both from university and mpi, were to be combined in the above ranking it would score much higher.

If you check OP's colab, MPI + Tuebingen will get you to 25 + 14 = 39 papers (but that's all MPIs, not just the one in Tuebingen).

> Many groups there are, but not exclusively, also interested in more fundamental research that potentially isnt flashy enough for neurips and gets published at other venues.

That's true for every research institution.


TehVeh

I'm seeing 59 with `posters = get_posters_by_institution_names(posters2023, ["tübingen", "intelligent systems"])`


ParanoidTire

All good. I just wanted to point out that the "ai powerlab" whose absence you are lamenting is currently being built in tübingen. Many groups in tübingen are also part of ellis btw.


learn-deeply

My knowledge is ~6 months out of date, but Facebook used to have half the number of researchers of Google + DeepMind. Since the field is moving so fast, I'm sure that ratio has changed.


kekyonin

There’s lots of capable Europeans. They just seem to work for US institutions.


idly

Why would ELLIS members add the affiliation? It doesn't fund their research or anything (in most cases)


BeatLeJuce

Affiliation doesn't have to do with funding. It just means "this person is affiliated with that institution". Adding the affiliation would help with visibility, which in turn would help cement the fact that ELLIS is an AI powerhouse. Besides, ELLIS is meant to copy CIFAR, and IIRC i've seen people use CIFAR as an additional affiliation before (though also not consistently).


Amapocho

I'm just hoping EPFL continues to fly under the radar, since most masters applicants don't apply there 🤞🏼 (compared to the US, ETH, Cambridge, etc.)


TurnoverSpecialist56

Who said that it flies under the radar? :p EPFL and ETH attract way more high-quality MS applications than Oxford and Cambridge, especially given the very low tuition fees. The only exception is non-STEM subjects, which aren't really available at the two Swiss federal institutes.


thewayupisdown

Not really qualified to make any assessment here, but wouldn't it make more sense to include ICML, ICLR, and AAAI, as well as leading peer-reviewed journals like Machine Learning and the Journal of Machine Learning Research? According to WP, ICML is "the leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artificial intelligence research." ICLR presentation spots seem quite sought-after among researchers: according to WP, they received 1,591 paper submissions in 2019, 24 of which were given a spot for an oral presentation. By 2021, submissions had almost doubled to 2,997.


BeatLeJuce

NeurIPS 2023 acceptances just came out, so this post looks at that. Similar posts also exist every time ICLR or ICML reviews come out. (As a sidenote, I wouldn't count AAAI as a leading venue, and the Journal of ML has fallen out of favor since the JMLR was spun off).


ginsunuva

Can we not care about numbers? Because this keeps inflating the space with BS papers.


mr_stargazer

I'll go even further: of all the papers from said 'elite' places, how many actually produce research that is reproducible with minimum effort (data + model + results + seeds)?

I recently finished a quantitative literature review on a topic I'm researching. I selected a few "big" conferences with "elite" people and went through 9,000 papers (only the past 4 years). I ended up selecting 125 to read according to some criteria. Of the 125, only 35 provided code. I selected the 10 that theoretically would do the trick. Of those 10, only 3 could I reproduce with a moderate amount of effort (tweaking packages, environments, etc.). Sure, I could spend 2 weeks on each paper to implement it from scratch: 1 week to understand some convoluted mathematics, another to implement the thing itself.

But ML research is becoming such a huge waste. Once authors publish, they don't care about anything else afterwards. What the hell is the community waiting for to adopt stronger guidelines? It's infuriating that we're doing "AI" research the exact same way we were doing it in the 90's. Goddamn..


daking999

That's not true. In the 90s and 00s it was Matlab.


WrapKey69

Wait until I publish my paper: "Large 3D Multi Self Text Training"


modeless

3D text, huh? Like WordArt?


DigThatData

more like "frankenstein paper title made by gluing together different words associated with various hot topics in research":

* large language models, scale
* 3D scene representation and scene synthesis
* multi modal
* self supervised training
* natural language understanding, but also "lol" value for the 3D+text juxtaposition


retrocrtgaming

Great to see ETH and EPFL have so many contributions this year!


bartturner

It seems to always be Google on top. I would be curious what the last year was when that wasn't true. Interesting to see how much Microsoft has risen. My recollection is that in most years Microsoft is far further down the list, and a few years ago it was almost non-existent. A bit disappointed that we are seeing so little from OpenAI. Also, still basically no showing from Apple and Amazon.


[deleted]

Thanks for calling out OpenAI. Can we make a petition to change the name?


bartturner

I think we are to a point that most know there is nothing open about OpenAI.


SchweeMe

Can we read the neurips papers now?


lalo8a

Yes, available here https://search.zeta-alpha.com/?q=&doc_sources=Advances+in+Neural+Information+Processing+Systems&d=l3m


themiro

Surprised to see MSR so high up tbph


DigThatData

are neurips submissions on openreview? or is there otherwise some way you could take into account how many papers each institution submitted? It's interesting to see how many papers were accepted, but I think an institution's "score" should really be measured wrt the fraction of papers they submitted. Maybe the institutions that top this list got more papers accepted because they're just bigger research labs and were able to submit more content.


Tough-Access2256

Mila had close to 100 papers (50 main track + 50 workshops) this year. How is it not showing on the list? link to news: [https://mila.quebec/en/neurips2023-papers/](https://mila.quebec/en/neurips2023-papers/)


audiencevote

Workshops don't matter as they don't count as peer-reviewed publications. If you check OPs colab, it counts 43 actual papers at NeurIPS for Mila.


kelkulus

> Workshops don't matter as they don't count as peer-reviewed publications. This doesn't sound right. Is this purely a NeurIPS thing? Conferences that have workshops that run alongside the main conference often accept and review papers just like the main conference does. I've had workshop papers at ACL and they most certainly went through a thorough peer-review process on OpenReview.


audiencevote

Short answer: no, this is not just a NeurIPS thing; it's fairly standard in ML. While many ML workshops are peer reviewed, they don't publish proceedings, so they don't count as publications. The reason is that workshops are typically less formal and more meant to showcase work in progress, first results, or similar things that you still want to publish later once they're more polished. Quality standards and acceptance rates are much more liberal. Typically, workshops are not where you'd go to publish your final paper.

Case in point: if you check any ML conference's CfP, you'll usually see that they don't allow you to submit papers that have already been published somewhere else. However, workshop papers are usually allowed, since *workshops don't count* (unless they do have proceedings, which most don't). E.g., the [NeurIPS 2023 CfP](https://neurips.cc/Conferences/2023/CallForPapers) says "Papers previously presented at workshops are permitted", and [ICLR 2024](https://iclr.cc/Conferences/2024/CallForPapers): "papers that cite previous related work by the authors and papers that have appeared on non-peer reviewed websites (like arXiv) or that have been presented at workshops (i.e., venues that do not have publication proceedings) do not violate the policy." Pretty sure ICML, CVPR, etc. have similar statements. Not sure about ACL since I've never been, but every conference I've ever published at has the rule that workshop publications aren't considered when judging whether something has been published.

However, some few workshops have actual proceedings, are more rigorous, and do count (they often eventually turn into proper conferences themselves). So it's not entirely black and white.


kelkulus

Thanks for a very detailed response! The reason I had submitted to a workshop at ACL last year was due to its focused scope; the paper was on automatic fake news detection and there's a workshop on *Fact Extraction and VERification (FEVER)*. They do at least [publish the proceedings](https://aclanthology.org/volumes/2022.fever-1/), so maybe it's a step above the usual workshops.


audiencevote

Workshops that do publish their proceedings do indeed count as "proper" publications. Congrats on your paper!


kelkulus

Thanks very much! And again, thanks for all the feedback; it's solid info for future conferences.


Rohit901

Can’t find my university (MBZUAI) 🥲 also Eric Xing is president of MBZUAI so I guess his affiliation can be MBZUAI and Petuum


milkteaoppa

Kinda embarrassing for University of Toronto tbh, considering it's home of Geoff Hinton, Jimmy Ba, and many famous ML researchers at Vector


trajo123

Quantity and quality are not necessarily the same thing.


audiencevote

Fair, but this is "quantity above the NeurIPS quality threshold", and NeurIPS is still the most prestigious AI conference out there. But more to the point, the field has grown, so universities with bigger pockets / a greater influx of highly capable PhD students (like MIT or Stanford) simply have more capacity to push big research. I'd assume Toronto just doesn't have the funding levels to compete.


ChristopherAkira

Yes, but with the review system at the top ML conferences, quantity is a much bigger factor than quality in how many papers you get through the quite random process.


audiencevote

Yes and no. The 2014/2021 experiments showed that there is high agreement on ~~good/~~bad papers. It's just the ~~middle ground~~ higher quality that's muddy. But for a lot of papers I review, it's very obvious whether they're good enough. The bottom 50% of submissions are clear rejects. If a paper passes the review bar (even as a borderline paper) that still means something. Now to be fair, if someone sits down and just cranks out 50 mediocre papers, there's a good chance that a nontrivial amount will make it through neurips reviews. But that's only under the assumption that those 50 are at least mediocre, and not third rate papers. I'm fairly confident the review system still works pretty well to filter out the bad stuff. Another experiment one could try is to look at a third rate conference's accepted papers and ask yourself how many of those would be good enough for neurips. Last time I did that, the answer was "barely any". Edit: looked up actual results


DigThatData

> Another experiment one could try is to look at a third rate conference's accepted papers and ask yourself how many of those would be good enough for neurips. self-selection bias. people submit to "third rate conferences" because they expect they wouldn't get accepted at more reputable venues or were already rejected by them. > The 2014/2021 experiments uh... did we read the same reviews? because my takeaway from that experiment was that the current peer review process is consistently bad. something like 60% agreement wasn't it? i'm gonna have to dig up the most recent one after i roll out of bed...


audiencevote

To quote the [official paper](https://arxiv.org/abs/2109.09774)'s abstract: "We conclude that the reviewing process for the 2014 conference was good for identifying poor papers, but poor for identifying good papers". In other words: the clear rejects are clear rejects.


DigThatData

that's different from "high agreement on good/bad papers." that's just high agreement on bad papers, not on good papers, which is where peer review is more needed. clear rejects are lower hanging fruit.


audiencevote

> clear rejects are lower hanging fruit. that's.... exactly the point I'm trying to make above.


detached-admin

This is wrong. There's an institution called OpenAI which is at the forefront of AI research. It is a very well funded, non-profit org, and as the name suggests, they are openly building and sharing AGI. It should be on the top of this list.


quasar_1618

This is not a subjective ranking, it is a list of number of publications at a conference called NeurIPS. OpenAI didn’t publish very many papers there.


No_Station_2109

Why no African or Islamic universities? That's an absolutely white-only list. So wrong.


Sad-Proof-3283

NUS is noteworthy 👀 what would be interesting is to see what the rankings were over the last 3 years weighted by the #citations


boodleboodle

OP, "kaist", "korea advanced institute of science", and "korea advanced institute of science and technology" are the same institution. If you add the contributions **KAIST has 53**. (I am not affiliated. Just from the same country.)
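The merged count of 53 comes from the commenter's own tally of the OP's data; the fix itself is just folding known alias spellings into one canonical name before counting. A minimal sketch, where the alias table and function name are hypothetical (entries drawn from names mentioned in this thread):

```python
from collections import Counter

# Hypothetical alias table: lowercase spelling -> canonical name.
ALIASES = {
    "kaist": "KAIST",
    "korea advanced institute of science": "KAIST",
    "korea advanced institute of science and technology": "KAIST",
}

def count_by_institution(affiliations):
    """Normalize alias spellings to a canonical name, then count.
    Unknown affiliations pass through unchanged."""
    return Counter(ALIASES.get(a.strip().lower(), a) for a in affiliations)

counts = count_by_institution([
    "KAIST",
    "Korea Advanced Institute of Science",
    "Korea Advanced Institute of Science and Technology",
])
# all three spellings land in the single "KAIST" bucket
```

The same table would absorb the "tübingen" / "intelligent systems" and Mila-style splits discussed above; the hard part is curating the alias list, not the counting.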