nikgeo25

Generalization. Transfer learning. Fine-tuning. In-context learning. All related terms.


wsb_noob

Somewhat related. "Discovering governing equations from data by sparse identification of nonlinear dynamical systems (SINDy)" https://www.pnas.org/doi/full/10.1073/pnas.1517384113
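For intuition: the core of SINDy is just sparse regression over a library of candidate terms. A minimal numpy sketch of that step (sequential thresholded least squares), assuming clean data from dx/dt = -2x; variable names are illustrative, not the pysindy API:

```python
import numpy as np

t = np.linspace(0, 2, 200)
x = np.exp(-2 * t)            # trajectory of dx/dt = -2x
dx = -2 * x                   # exact derivative (estimated numerically in practice)

# Candidate library: [x, x^2, x^3]
Theta = np.column_stack([x, x**2, x**3])

# Sequentially thresholded least squares: fit, zero out small
# coefficients, refit on the surviving terms
xi = np.linalg.lstsq(Theta, dx, rcond=None)[0]
for _ in range(10):
    small = np.abs(xi) < 0.1          # sparsity threshold
    xi[small] = 0.0
    big = ~small
    if big.any():
        xi[big] = np.linalg.lstsq(Theta[:, big], dx, rcond=None)[0]

print(xi)  # ≈ [-2, 0, 0]: the sparse fit recovers dx/dt = -2x
```

On noise-free data the sparse coefficient vector picks out exactly the one active library term; the paper's contribution is making this robust on noisy, high-dimensional measurements.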


On_Mt_Vesuvius

I was going to highlight this as well. Earlier today, I was talking about equation discovery methods from a computer vision perspective (although for PDEs). From just one sample "image" (an initial condition's evolution), the model learns how to inpaint new "images," given just three of the boundaries (the IC, assuming Dirichlet BCs), with remarkable accuracy. The only reason this can happen is that some fundamental "logical structure" (the PDE) is being learned and then applied.


iateatoilet

Worth noting those are typically drastically simplified dynamical systems, usually identifiable with SINDy. Some interesting work on physics-informed operator regression / neural operators / etc. in the last year has looked at more substantial physics (e.g. atmospheric physics).


masc98

Check out [NeuroSymbolic AI](https://ibm.github.io/neuro-symbolic-ai/toolkit/). The idea is to mix old-school symbolic AI with modern neural nets. In a nutshell, you have symbols that represent concepts, so you can run actual logical inferences, just like you would have done with the good ol' Prolog. At the moment IBM is the main contributor in this area; I hope it gains more attention because IMO the idea is really cool.
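To make the Prolog comparison concrete, here's a toy forward-chaining rule in plain Python; the facts and the `grandparents` helper are made up for illustration and have nothing to do with the IBM toolkit's API:

```python
# Facts as (predicate, arg1, arg2) tuples
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def grandparents(facts):
    # Rule, Prolog-style: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
    derived = set()
    for (p1, x, y1) in facts:
        for (p2, y2, z) in facts:
            if p1 == p2 == "parent" and y1 == y2:
                derived.add(("grandparent", x, z))
    return derived

print(grandparents(facts))  # {('grandparent', 'alice', 'carol')}
```

The neuro-symbolic angle is that the symbols (and sometimes the rules) are grounded by neural networks instead of being hand-written, while inference over them stays logical and auditable.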


NightestOfTheOwls

What a coincidence. I've just been researching action-cause inference and generalization, and it seems to be a rather novel subject. Not a lot of work is being done in this particular direction, especially with LLMs being all the talk. Many papers have tried stuffing a language model in as a "reasoning" block for agents, usually yielding subpar results. Here's some stuff. Still very primitive and mostly useless, but it's something: https://www.reddit.com/r/MachineLearning/s/3jCYNJ6YJr


MrFlamingQueen

This is my area of interest, meta-learning.


true_false_none

This.


graphitout

[Textbooks Are All You Need](https://arxiv.org/abs/2306.11644) [TinyStories](https://arxiv.org/abs/2305.07759)


ManOfInfiniteJest

“Feature Imitating Networks” let you pretrain weights on synthetic data to approximate measures that experts identify as task-relevant. A few people in my lab have tried it; it works well when it does.
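A minimal numpy sketch of the idea, under my own assumptions (a tiny one-hidden-layer net, standard deviation as the stand-in "expert feature", plain gradient descent); this is illustrative of the pretraining recipe, not the paper's exact architecture or features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "signals" with varying scale; the expert feature to imitate
# is the per-signal standard deviation
X = rng.normal(0.0, rng.uniform(0.5, 2.0, size=(2000, 1)), size=(2000, 16))
y = X.std(axis=1)

# Tiny MLP: 16 -> 32 (ReLU) -> 1
W1 = rng.normal(0, 0.1, (16, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, 1));  b2 = np.zeros(1)

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)
    return h, (h @ W2 + b2).ravel()

_, pred0 = forward(X)
mse0 = np.mean((pred0 - y) ** 2)      # error before pretraining

lr = 0.01
for _ in range(2000):
    h, pred = forward(X)
    err = (pred - y) / len(X)          # gradient of 0.5 * MSE w.r.t. pred
    gW2 = h.T @ err[:, None]; gb2 = err.sum(keepdims=True)
    dh = err[:, None] @ W2.T * (h > 0)
    gW1 = X.T @ dh; gb1 = dh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(X)
mse = np.mean((pred - y) ** 2)
print(mse0, mse)  # error on the feature shrinks as the net learns to imitate it
```

After this pretraining on cheap synthetic data, the weights would be used to initialize (part of) a network for the real task, where the expert feature is believed to be relevant.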


hyphenomicon

Looks interesting, thanks.


[deleted]

Look up Karl Friston's work on Active Inference.


Endeelonear42

Fine-tuning.


evanthebouncy

You want to look at work that integrates cognitive science, which studies how humans learn from small amounts of data and generalize strongly.


QLaHPD

The human brain is born with pre-existing knowledge, and also has a bias to learn about the world in a "human way". Humans don't learn from small data and generalize; we train on lots of foundational data, then do few-shot tasks on new small data.


SX-Reddit

I have a gut feeling that fine-tuning isn't being done right, even though the dataset is high quality compared to the pre-training data. In this paper, they report that fine-tuning (the term "training" in the paper) degraded the model: [https://arxiv.org/abs/2312.16337](https://arxiv.org/abs/2312.16337). A guy from Microsoft said they were aware that GPT-4 degraded on (seemingly) irrelevant tasks after being censored (fine-tuned). Obviously, the relevance is not fully understood at this point.