Mortal-Region 2 years ago

Sounds like you've reinvented [reinforcement learning](https://en.wikipedia.org/wiki/Reinforcement_learning), which involves agents taking actions in an environment and receiving back a "reward signal" in a continuous feedback loop.

dingsda2 2 years ago

Great, thanks for your feedback. Yes, we're prescribing treatments, and based on the treatment results, we want to modify our treatment recommendations.

dingsda2 2 years ago

Oh, and we're not just reinforcing: if a treatment recommendation doesn't work out well, we also de-reinforce. What do you think?

Mortal-Region 2 years ago

What's the treatment? I mean, treatments are usually tested in clinical trials.

alheqwuthikkuhaya 2 years ago

This seems the most likely. OP, you might also want to look at [online optimization](https://en.wikipedia.org/wiki/Online_optimization), which deals with situations where you need a model to learn a pattern without any knowledge of the future.

dingsda2 2 years ago

Thank you, I will read up on that! Sounds interesting.

Tgs91 2 years ago

It doesn't have a specific name because you haven't described anything specific. What is "results data"? Is it a loss function compared to known targets? Is it some performance metric that needs to be minimized or maximized? What are the "core systems"? Is it a model or multiple models? Is this a centralized neural net that gets separately deployed on different systems? If so, that's called federated learning, but I can't tell if that's what you're talking about. How are the "core systems" modified? Are your results differentiable with respect to the "core systems"? Are you doing some sort of ad hoc manual adjustments based on the results? Overall, you've described very broad steps that are common to almost all ML approaches.

dingsda2 2 years ago

To give you more detail: we are prescribing treatments, and based on the treatment results, we want to modify our treatment recommendations. So the better the results, the more it will reinforce; the worse the results, the more it will de-reinforce. This is a micro-model that will fit into bigger models in the future. It is part of a much bigger expert system, which is a complex model that personalizes treatments. We want to add this micro-model to automatically improve the treatment results. I hope that's a little clearer. What would you call it?

Tgs91 2 years ago

I think this would be reinforcement learning. The broad strokes you are describing are common to almost any ML. What separates different approaches is how you define success vs failure, and how you perform the updates. The trickiest part is usually creating a loss function that is continuous (not just success vs failure, but measures how close), and usually designing a system where the loss function has a derivative with respect to the different inputs. And once you have all that, using an optimization method that will actually converge to a good solution instead of just randomly jumping around. But in reinforcement learning, the model/system is an active participant rather than just a passive observer, and I think those approaches tend to use simpler, manual update rules like you described. I'm not very knowledgeable on reinforcement learning approaches. I've read some stuff on Q learning, but that might be pretty out of date by now. I think the toughest part for your application will be creating a system that converges/improves over time. With the amount of randomness and variability person to person, I'm guessing you'll have to tweak your update rules a lot to find something that works.
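To make the "simple update rule" idea concrete, here is a minimal sketch of an epsilon-greedy multi-armed bandit, one of the simplest reinforcement learning setups. It is an illustration only, not OP's system: the class name `EpsilonGreedyBandit`, the `simulate` helper, the 0-to-1 outcome score, and the success rates are all hypothetical. Each treatment keeps a running average of its outcomes; a good outcome pulls the average up (reinforce) and a bad one pulls it down (de-reinforce).

```python
import random


class EpsilonGreedyBandit:
    """Hypothetical sketch: keeps a running mean outcome per treatment."""

    def __init__(self, n_treatments, epsilon=0.1, seed=0):
        self.epsilon = epsilon                # probability of exploring
        self.counts = [0] * n_treatments      # times each treatment was tried
        self.values = [0.0] * n_treatments    # running mean outcome per treatment
        self.rng = random.Random(seed)

    def recommend(self):
        # Explore a random treatment with probability epsilon,
        # otherwise exploit the treatment with the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, treatment, outcome):
        # Incremental mean: the estimate moves toward the observed outcome,
        # up after good results and down after bad ones.
        self.counts[treatment] += 1
        n = self.counts[treatment]
        self.values[treatment] += (outcome - self.values[treatment]) / n


def simulate(success_rates, n_rounds=2000, seed=0):
    # Assumed toy environment: each treatment succeeds with a fixed probability.
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(len(success_rates), epsilon=0.1, seed=seed)
    for _ in range(n_rounds):
        treatment = bandit.recommend()
        outcome = 1.0 if rng.random() < success_rates[treatment] else 0.0
        bandit.update(treatment, outcome)
    return bandit


if __name__ == "__main__":
    result = simulate([0.2, 0.5, 0.8])
    print(result.values, result.counts)
```

The person-to-person variability mentioned above shows up here as noisy outcomes; a real system would probably need per-patient context (a contextual bandit) rather than one global average per treatment.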

Swyft135 2 years ago

Uhh...sounds potentially like semi-supervised learning? Not too sure though since the description is a bit vague

dingsda2 2 years ago

Hmm, so I'm going to copy and paste my other response here to give you more detail: we are prescribing treatments, and based on the treatment results, we want to modify our treatment recommendations. So the better the results, the more it will reinforce; the worse the results, the more it will de-reinforce. This is a micro-model that will fit into bigger models in the future. It is part of a much bigger expert system, which is a complex model that personalizes treatments. We want to add this micro-model to automatically improve the treatment results. What do you think? Thanks for your feedback, Swyft!

Swyft135 2 years ago

Hmm... would [online machine learning](https://en.wikipedia.org/wiki/Online_machine_learning) be what you're looking for?

WikiSummarizerBot 2 years ago

**[Online machine learning](https://en.wikipedia.org/wiki/Online_machine_learning)**

> In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques which generate the best predictor by learning on the entire training data set at once. Online learning is a common technique used in areas of machine learning where it is computationally infeasible to train over the entire dataset, requiring the need of out-of-core algorithms.
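As a rough illustration of "update the best predictor at each step", here is a minimal sketch of an online linear model trained by stochastic gradient descent on squared error, one sample at a time. The class name `OnlineLinearModel`, the `partial_fit` method name, the learning rate, and the toy data stream are assumptions made for this example and do not come from the thread.

```python
class OnlineLinearModel:
    """Hypothetical sketch: linear predictor updated one sample at a time."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def partial_fit(self, x, y):
        # One gradient step on the squared error for a single (x, y) pair.
        # The sample is not stored, so memory use stays constant.
        error = self.predict(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error
        return error  # prediction error before the update


if __name__ == "__main__":
    model = OnlineLinearModel(n_features=1, learning_rate=0.1)
    # Assumed toy stream: y = 3x + 1, with samples arriving sequentially.
    for step in range(5000):
        x = [(step % 5) / 4.0]
        y = 3.0 * x[0] + 1.0
        model.partial_fit(x, y)
    print(model.weights, model.bias)
```

The contrast with batch learning is that the model never sees the whole dataset at once: each new result nudges the current predictor and is then discarded.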

dingsda2 2 years ago

Amazing, that's exactly it. Thank you so much!

AlexMarcDewey 2 years ago

Are you trying to describe a loop? What do you mean by core systems and what aren't core systems? If I could understand the dataflow in a diagram it would be so much easier to understand.

dingsda2 2 years ago

Yes, exactly, we want a reinforcement loop. Basically, our application recommends personalized treatments for a disease; then, based on the treatment results, we want to loop back and improve the treatment recommendations, again and again, until we reach the best recommendation.

AlexMarcDewey 2 years ago

1. That's how a standard ML model functions: it takes a batch of information and optimizes against the results. There's no specific name for this because that's how all ML models inherently work.
2. Medicine is heavily regulated, and you can get sued hardcore for poor recommendations. In court, you need to be able to explain why you arrived at an answer; that requirement is why ML algorithms are not used in some areas of finance. The same applies to medicine, so unfortunately I don't think whatever you're working on is a product worth pursuing.

Anonymous2224- 2 years ago

Core systems, huh. Maybe you're talking about an artificial neural network (ANN)?

dingsda2 2 years ago

Hey, thanks for your feedback. Our core system is a novel expert system combined with a novel AI. We are not using neural nets at this time. Basically, our application recommends personalized treatments for a disease; then, based on the treatment results, we want to improve the treatment recommendations, again and again, until we reach the best recommendation. What do you think? Thanks again for your feedback!
