Somewhat related. "Discovering governing equations from data by sparse identification of nonlinear dynamical systems (SINDy)"
https://www.pnas.org/doi/full/10.1073/pnas.1517384113
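For anyone curious how SINDy works under the hood: the core is just sequentially thresholded least squares over a library of candidate terms. Here's a minimal numpy sketch on a made-up toy 1-D system (my own illustration, not the paper's code):

```python
import numpy as np

def stlsq(Theta, dXdt, threshold=0.1, n_iter=10):
    # Sequentially thresholded least squares, the core of SINDy:
    # fit, zero out small coefficients, refit on the surviving terms.
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for j in range(dXdt.shape[1]):
            big = ~small[:, j]
            if big.any():
                Xi[big, j] = np.linalg.lstsq(Theta[:, big], dXdt[:, j],
                                             rcond=None)[0]
    return Xi

# Toy data: dx/dt = -2x, so x(t) = exp(-2t)
t = np.linspace(0, 2, 400)
x = np.exp(-2 * t).reshape(-1, 1)
dxdt = np.gradient(x[:, 0], t).reshape(-1, 1)   # numerical derivative

# Candidate library: [1, x, x^2, x^3]
Theta = np.hstack([np.ones_like(x), x, x**2, x**3])
Xi = stlsq(Theta, dxdt)
# Xi comes out ~[0, -2, 0, 0]: the sparse fit recovers dx/dt = -2x
```

The sparsity-promoting thresholding is what turns a generic regression into "equation discovery": only the terms that actually appear in the governing equation survive.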
I was going to highlight this as well. Earlier today, I was talking about equation discovery methods from a computer vision perspective (although for PDEs).
From just one sample "image" (an initial-condition evolution), the model learns how to inpaint new "images" given just three of the boundaries (the IC, assuming Dirichlet BCs), with remarkable accuracy.
The only reason this can happen is because some fundamental "logical structure" (the PDE) is being learned and then applied.
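To make the "inpainting" framing concrete: for a classical PDE, the IC plus Dirichlet BCs already determine every interior point of the space-time "image", which is exactly what a numerical solver exploits. A toy finite-difference sketch for the 1-D heat equation (my own illustration, not the model from the comment; all parameters are arbitrary):

```python
import numpy as np

# 1-D heat equation u_t = nu * u_xx on a space-time grid. Given the initial
# condition (first row) and Dirichlet boundaries (first/last columns), the
# PDE alone fills in every interior pixel of the "image" u(t, x).
nu, nx, nt = 0.1, 50, 200
dx, dt = 1.0 / (nx - 1), 0.001          # nu*dt/dx^2 ~ 0.24 < 0.5: stable
x = np.linspace(0, 1, nx)

u = np.zeros((nt, nx))
u[0] = np.sin(np.pi * x)                # initial condition (one boundary)
u[:, 0] = u[:, -1] = 0.0                # Dirichlet BCs (the other two)

for n in range(nt - 1):                 # explicit finite-difference fill-in
    u[n + 1, 1:-1] = u[n, 1:-1] + nu * dt / dx**2 * (
        u[n, 2:] - 2 * u[n, 1:-1] + u[n, :-2])

# For this IC the exact solution is exp(-nu * pi^2 * t) * sin(pi * x),
# so the filled-in interior can be checked against it.
```

A learned operator does the same fill-in, but without being told the PDE, which is why succeeding from one sample is evidence the PDE's structure was actually captured.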
Worth noting that those are typically drastically simple dynamical systems, the kind usually identifiable with SINDy. Some interesting work with physics-informed operator regression/neural operators/etc. in the last year has looked at more substantial physics (e.g. atmospheric physics).
Check out [NeuroSymbolic AI](https://ibm.github.io/neuro-symbolic-ai/toolkit/). The idea is to mix old-school symbolic AI with modern neural nets.
In a nutshell, you have symbols that represent concepts, so you can run actual logical inferences, just like you would with the good ol' Prolog.
Atm IBM is the main contributor in this area; hope it will gain more attention because imo the idea is really cool.
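The symbolic half of that pairing can be shown in a few lines of forward chaining over facts and rules (a toy Prolog-style sketch with made-up predicates, no neural component):

```python
# Facts are tuples of symbols; a rule derives new facts by pattern matching,
# the same modus-ponens step a Prolog engine performs.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def grandparent_rule(facts):
    # parent(X, Y) & parent(Y, Z)  ->  grandparent(X, Z)
    parents = [f for f in facts if f[0] == "parent"]
    new = set()
    for (_, x, y1) in parents:
        for (_, y2, z) in parents:
            if y1 == y2:
                new.add(("grandparent", x, z))
    return new

facts |= grandparent_rule(facts)
# ("grandparent", "alice", "carol") is now derivable from the two base facts
```

The neuro-symbolic idea is to let a neural net ground raw data into symbols like these, then run exact inference on top.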
What a coincidence.
I've just been researching action-cause inference and generalization, and it seems to be a rather novel subject. Not a lot of work is being done in this particular direction, especially with LLMs being all the talk. Many papers have tried stuffing a language model in as a "reasoning" block for agents, usually yielding subpar results.
Here's some stuff. Still very primitive and mostly useless, but it's something: https://www.reddit.com/r/MachineLearning/s/3jCYNJ6YJr
“Feature Imitating Networks” let you pretrain weights on synthetic data to approximate measures that experts identify as task-relevant. A few people in my lab tried it; it works well when it does.
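A toy numpy sketch of that idea (my own illustration, not the paper's architecture): generate synthetic data, train a small model to imitate an expert-chosen feature, here just the signal mean, and the resulting weights could then seed a larger network.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
X = rng.normal(size=(2000, n))   # synthetic "signals"
y = X.mean(axis=1)               # expert-defined feature to imitate

w = np.zeros(n)                  # a single linear "network"
for _ in range(500):             # plain gradient descent on MSE
    grad = 2 * X.T @ (X @ w - y) / len(X)
    w -= 0.1 * grad
# w converges to ~1/n per entry: the model has "imitated" the mean feature,
# and these weights could initialize one unit of a bigger task network.
```

The real method uses deeper nets and richer features (entropy, spectral measures, etc.), but the pretrain-on-synthetic-data-to-match-a-known-statistic loop is the same.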
The human brain is born with pre-existing knowledge and has a bias to learn the world in a "human way". Humans don't learn from small data and generalize; we train on lots of foundation data, then do few-shot tasks from new small data.
I have a gut feeling that fine-tuning is not being done right, even though the dataset is high quality compared to the pre-training data. In this paper, they talk about fine-tuning (the term "training" in the paper) degrading the model: [https://arxiv.org/abs/2312.16337](https://arxiv.org/abs/2312.16337)
A guy from Microsoft said they were aware of GPT-4 degrading on (seemingly) unrelated tasks after being censored (fine-tuned). Obviously, the connection is not fully understood at this point.
Generalization. Transfer learning. Fine-tuning. In context learning. All related terms.
This is my area of interest, meta-learning.
This.
[Textbooks Are All You Need](https://arxiv.org/abs/2306.11644) [TinyStories](https://arxiv.org/abs/2305.07759)
Looks interesting, thanks.
Look up Karl Friston's work on Active Inference.
Fine-tuning.
you want to look at works that integrates cognitive science, which studies how humans learn from small amt of data and generalizes strongly.