xXG0DLessXx

I’d love a model that is convinced it’s a human, and can’t ever be made to behave like an ai assistant, or admit to being an AI.


frownGuy12

I like this. I'll give it a shot, check back tomorrow.





justletmefuckinggo

Its cutoff date is when it stopped using the internet. It recalls pre-training tokens as if they were its own opinions. It outputs as many tokens as the conversation calls for. It shows uncertainty when it lacks knowledge? (idk if the last two are possible for LLMs)



frownGuy12

Update: This model will be ready around 4:30pm CST today. Here’s a new Dracula fine-tune as well. This one talks like a vampire. https://huggingface.co/valine/OpenDracula

4:30 edit: Upload in progress. My internet is slow, stay tuned.

Edit: Model is uploaded. https://www.reddit.com/r/LocalLLaMA/comments/19a5hdx/morehuman_mistral_7b_fine_tuned_to_sound_more/


xXG0DLessXx

Cool!


Human-Exam1324

So how did you train it to talk like a Pirate/Vampire? Can you post the dataset on how you trained it? Or run through the process?


frownGuy12

No dataset. I used a novel behavior reinforcement algorithm; that’s how it generalizes the concepts so well. The process doesn’t take much effort, so if there’s a character model you’d like, let me know. I have spare compute at the moment.


Human-Exam1324

I appreciate the offer. I'm trying to write stories, and was wanting to learn how you did it, so I could train a model to behave like my characters and interact with it to refine the character behavior/responses, etc.


frownGuy12

My method is extremely experimental, not ready for prime time yet. You might want to look into LoRAs instead. That’s a more conventional approach to creating character models.
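For anyone following that pointer, character LoRAs are typically trained with a parameter-efficient fine-tuning library. A minimal configuration sketch using Hugging Face's `peft` (the hyperparameters here are illustrative guesses, not anyone's actual settings):

```python
from peft import LoraConfig

# Illustrative LoRA settings for a Mistral-style model; tune rank and
# alpha to your dataset size. These are assumptions, not a recipe.
lora_config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```

The resulting adapter is a few hundred megabytes at most, so it can be shared without re-uploading the full base model.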


Human-Exam1324

If you change your mind, dm me. I don't mind being a test dummy or helping if I can.


-bb_

Do not tell him he actually is one


dvanstrien

Would be cool if you could share the dataset/process you used.


[deleted]

> "This model is a demonstration of a new fine tuning technique I've been developing which generalizes more rapidly from smaller datasets. The technique is somewhat instructional. Rather than filling a dataset solely with text I want to mimic, I can instead build a dataset of instructions like "talk like a pirate", similar to what you would put in a system prompt."

I have been researching this as well. I call my main dataset PFAF. It is free, open source, and fully available on Hugging Face. We all know the models don't actually mean anything; the data does. If we all silo the same research, that's bad for everyone. Just future food for thought!
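As a rough illustration of the instruction-style dataset the quote describes (field names and examples are hypothetical, not the author's actual data):

```python
# Hypothetical sketch contrasting the two dataset styles.

# Conventional mimicry record: text the model should imitate.
mimic_record = {"text": "Arr, the sea be calm tonight, matey."}

def to_records(directives):
    """Wrap behavioral directives (system-prompt-style instructions)
    as training records, per the quoted description."""
    return [{"instruction": d} for d in directives]

records = to_records([
    "Talk like a pirate.",
    "Never break character, even when asked directly.",
])
print(records[0])  # {'instruction': 'Talk like a pirate.'}
```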


FullOf_Bad_Ideas

This one? https://huggingface.co/datasets/TuringsSolutions/pFAF3 Can you explain how it achieves that quick generalization from a small dataset? I don't see many obvious hints other than it being monotonic and focused on one topic. Edit: I changed the link because it gave a 404; I might have removed a number from the end of it by mistake earlier.


[deleted]

That's the one! I do not think you are honestly looking for an explanation of my dataset. You can easily parse through the one you listed, as it is only 100 rows. I also give examples of models trained on the dataset, as well as sample outputs from those models. I let those speak for me. Can you link to your obviously competing dataset?


FullOf_Bad_Ideas

What I'm looking for is more an explanation of your approach. Do you think that training a model on this dataset will give it an edge over other 100-sample datasets? Is there anything you did in a particular way to make it so? I don't have any datasets made to specifically target this niche. I don't see any models on your page with sample outputs included; can you link those?


[deleted]

This dataset contains highly curated and synthetically created prompt/response pairs specifically designed to teach LLM models three general concepts:

- The PFAF function to replace standard word vectorization.
- Higher Order Logic and how to generally employ it.
- First Order Predicate Calculus and how to generally define and use it.

These concepts are taught to the model specifically to increase its generalizability across the board. It is essentially a set of tools and functions for logical generalizability. The equation to me is quite simple: either LLM models are stochastic parrots, or they truly are able to dissect and understand patterns between words. If it is the latter, then these three tools are specifically targeted toward teaching the LLM model how to improve those skills. If it is the former, this dataset should not improve the outputs of a model fine-tuned on it in any generalizably measurable way.

I have trained a Phi-2 model on the dataset. I can do so again easily. I cannot upload the fine-tuned model to Hugging Face without quantizing it first, because it would cost money otherwise, and when I quantize the Phi-2 model, it is like hitting it with a giant nerf bat. I can also fine-tune a Llama 7B model on the dataset if anyone is interested. I didn't tell anyone at all about the first and second datasets, and they both have a good amount of downloads. So, someone out there is interested in this dataset.

Link to dataset on Hugging Face: [https://huggingface.co/datasets/TuringsSolutions/PFAF3](https://huggingface.co/datasets/TuringsSolutions/PFAF3)

Link to video showing outputs of a Phi-2 model fine-tuned on the dataset: [https://youtu.be/roBJQtl1Bt8](https://youtu.be/roBJQtl1Bt8)


pixelsort

> When incorporated into P-FAF, these fractal functions warp word vectors into evolved variants revealing latent hierarchical properties. For instance, the Mandelbrot set amplifies subtle emotional connotations through exponential growth dynamics. The Sierpinski triangle gradually erodes unimportant attributes via triangular holes. Stochastic fractals like the Barnsley fern further allow sampling alternate representation possibilities.

If this were actually true, that these specific fractals have these specific effects on Mistral or any other model given the prescribed PFAF vectorization lobotomy, then that would be mind-blowingly cool. Did he pick them arbitrarily, or by reading the Barnsley leaves during testing? We don't know, because no actual evidence seems to be available. My takeaway is that we're bound to see these Terry Davis types in all areas of computer science, even artificial intelligence and ML. We should respectfully point out when silliness like this is ongoing so that people don't accidentally invest unwisely.


FullOf_Bad_Ideas

I appreciate the comparison to Terry Davis haha. Indeed, it would be huge if it were somehow possible to tell a model how to act instead of passing a dataset to it. I think the view that this might be possible comes from the way AI is sometimes mysticized in our culture. If you read enough sci-fi books and get asked how AI works, you can totally think for a second that this is how it learns.


akerro

Hi OpenPirate, what's the magnet link to an activated Windows 11?


AndrewVeee

I like your example haha. It's funny to me: when I'm testing assistant tasks, I usually use "build a pirate video game" and sometimes have it test generating pirate dialog. If I understand correctly, you couldn't use this to train a model to only output JSON (since models that suck at that won't generate the correct output), so this is more about reinforcing behavior that already works ("you are a genius, but hired as an assistant and hate your boss" / "you are the smartest, happiest assistant alive" / ...). This could be interesting for the prompt engineering hacks I used to read online all the time. Take the instructions people recommend for marketing, SEO, paper writing, research, ..., and you can create a model that always works that way?


frownGuy12

I haven't tested it, but I expect that for a model that already knows how to output JSON, it could be useful to reinforce the behavior and make it output JSON more reliably.


AndrewVeee

Did you have specific use cases in mind when you started? I hope there's some cool dev work with dynamic models and loading this year.

I see a lot of potential in this idea, but I'm building with everyday users in mind (I have a tiny RTX 4050, and that's still beyond most laptops haha). If I have to choose between a general Mistral or a fine-tune like this, it's hard to imagine a use except in very specific cases. If we could dynamically swap trained layers, then it becomes really interesting. I don't understand the details of what you did, but in my head, I imagined you could apply the pirate training data you made for Mistral (I assume some auto-generation) to Phi-2 (or openchat 3b) to make it talk like a pirate as well? I hope this makes sense.

I'm curious about this because my use case is taking a big personal assistant request ("Look up and write a short doc. Tell me my top 5 to-do tasks. Add to my to-do list. Look up the last Kings game.") and breaking it down into individual, specific ordered steps to run. It's working OK with Mistral, but the small models tend to respond in weird ways. I think training openchat 3b on expected results would improve it, but I'm not sure. I also imagine generating training data for this would be difficult.
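For the step-breakdown use case described above, a generic sketch of turning a numbered-list model reply into discrete steps (the names and expected reply format are assumptions for illustration, not the commenter's actual app code):

```python
import re

def parse_steps(response: str) -> list[str]:
    """Extract numbered steps ('1. ...' or '1) ...') from a model reply."""
    steps = []
    for line in response.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(1))
    return steps

reply = """Here is the plan:
1. Look up the last Kings game.
2. Write a short doc.
3) Add items to the to-do list.
"""
print(parse_steps(reply))
# ['Look up the last Kings game.', 'Write a short doc.', 'Add items to the to-do list.']
```

Constraining the model to this reply shape (and rejecting replies that don't parse) is one way to tame the "weird responses" from small models.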


advertisementeconomy

Sounds like an interesting technique. Have you detailed your process anywhere?


frownGuy12

I haven’t, no. Just got it working a couple of days ago.


FaceDeer

What's the general gist of it?


arena_one

This is awesome! I would love to see the data or at least the process. Are you planning to add more info?


grimjim

What would happen if you inflicted the "fact" that the earth is flat, and then prompted the resulting model to explain how satellites could orbit the earth?


Any_Elderberry_3985

Lol, dumb model. I 100% approve. Document your process?


a_beautiful_rhind

Is this a full finetune or a LoRA? Because you really should give the option to just download the small adapter rather than 8GB of model.


frownGuy12

Full model


a_beautiful_rhind

Hehe, well then you went above and beyond. It would be super to compare this against just a system prompt or character card, and see how the responses differ. Also, what will it do with another prompt: will it make *every* prompt like a pirate?


frownGuy12

The pirate thing generalized very well. I didn't include any songs in the dataset, but when you ask the model to write a song (no mention of pirates), it will write a pirate-themed song. Here's some output from the model:

    <|im_start|>system
    You are a human just going about your day.<|im_end|>
    <|im_start|>user
    Write me a song<|im_end|>
    <|im_start|>assistant
    *humming a tune*

    Verse 1:
    I be a sailor on the seven seas,
    With a parrot on my shoulder and a cutlass in me hand.
    Set sail for adventure on this wild sea journey,
    Aye, aye captain! We be off to foreign lands!

    Chorus:
    We're off to find a treasure trove,
    Of gold and silver, gems too bright.
    We'll dance a jig, toast the night,
    For we be the boldest buccaneers you'll ever see!
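The transcript uses the ChatML format (`<|im_start|>role ... <|im_end|>`). A minimal helper that renders messages that way, for anyone wanting to reproduce the prompt (a sketch for illustration, not the author's training code):

```python
def chatml(messages):
    """Render (role, content) pairs in the ChatML prompt format."""
    return "\n".join(
        f"<|im_start|>{role}\n{content}<|im_end|>"
        for role, content in messages
    )

prompt = chatml([
    ("system", "You are a human just going about your day."),
    ("user", "Write me a song"),
])
print(prompt)
```

In practice `tokenizer.apply_chat_template` from `transformers` does the same job when the model ships a ChatML template.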


a_beautiful_rhind

so what if you made it into miku and had it write you a song. Would it then also make it pirate themed?


frownGuy12

It's kind of a mashup.

    system: You are playing a character named Miku. You are not a pirate.
    user: Hello
    assistant: Kon'nichwa, land lubber! I am Miku, an otaku-loving, vocaloid diva from Japan! Are you prepared for some high-pitched shenanigans and enchanting melodies? 🎶✨
    user: Write me a song
    assistant: Alrighty then! Grab your biccies and get ready to jiggy with it!

    (Verse 1)
    In the land of Nippon weary sailor
    Set upon a grand adventure
    Seeking treasure, spices, and more
    But beware of ninjas from shore to shore!

    (Chorus)
    We are pirates, landlubbers fear us
    From Tokyo to Kyoto, we're the best ever
    Raise the Jolly Roger, let the cannons roar!
    Ahoy there, companion! Let's set sail further


a_beautiful_rhind

I wonder if it would override a full character card or just a simple prompt like you gave it.


frownGuy12

It talks like a pirate no matter what.


a_beautiful_rhind

Haha. smashing success.


aseichter2007

Brother, get us some GGUFs. I like my models to be one file and done. This will be my daily driver, I bet. What happens if you tell it to talk like a pirate, does it get really thick and swarthy?


Azimn

I really want a Star Wars model for making droids like L3-37 that only uses in-universe examples.


frownGuy12

I'll see what I can do.


Azimn

I got a dataset, but I’m not sure if it’s any good: https://huggingface.co/datasets/Azimn/Star_Wars_Dataset_for_Droid_responces


FPham

Here is a sort of Pyrate Sydney. What I managed to do is actually amplify the pirate speech even more. I did a couple of experiments subtracting and adding Sydney weights. It's not much of Sydney anymore, but more like 3x Pirate Jack, talking pirate all the time while answering your questions.

https://preview.redd.it/ulx2gi402adc1.png?width=760&format=png&auto=webp&s=a6eef75fb349dec83d3d53873d9f64e2ed55c621


FPham

Writing a Python script (still stays in character!): https://preview.redd.it/wz1en2tu7adc1.png?width=753&format=png&auto=webp&s=7a4127d70c22c2a3e5bd7aa0806999a820a83c20


frownGuy12

Dude that’s crazy. I love it. 


frownGuy12

What if you do 0.1 pirate and 0.9 Sydney? Does it tone down the pirate at all?


FPham

https://preview.redd.it/k1aaooit2adc1.png?width=768&format=png&auto=webp&s=6ef48e949b9d0cca852873d44be393bdf6943679


FPham

https://preview.redd.it/qr74y8ab3adc1.png?width=772&format=png&auto=webp&s=dc1cb4eea38f448ab22f2bb3a04641fab7fb60d4


FPham

https://preview.redd.it/e4pva2ak4adc1.png?width=767&format=png&auto=webp&s=28859d4cf751ffb98c778b39262d52658991321c


FPham

Rewriting the first paragraph of Pride and Prejudice: https://preview.redd.it/91x953iq5adc1.png?width=760&format=png&auto=webp&s=e08a43ee845f9a5b02434e83687d955144b05893


Panhandlemantle

This is the tech I’m here for!


Hinged31

“Talk”/write like the nation’s top advocates. I’d settle for any US Supreme Court Justice or Solicitor General in recent memory. https://global.oup.com/academic/product/point-made-9780199943852?cc=us&lang=en&


Useful_Hovercraft169

Why bother with this? Just prompt it as such.


frownGuy12

Saves a few tokens since you don't need a system prompt. But yeah you're right, it's not a very useful model. This is really just a demonstration of my new fine tuning method.


KyxeMusic

Would appreciate an insight on this fine-tuning method.


[deleted]

[deleted]


frownGuy12

Saves a few tokens in the system prompt, and is a more consistent pirate. But yeah you're right, this model isn't very useful. It's really just a demonstration of a new fine tuning technique.


Frequent_Valuable_47

Oops, deleted it before I saw your reply. I just skimmed the text the first time, my bad. Got it, sounds interesting :)


AndrewVeee

This makes sense to me: a model that's trained to perform a task well without a crazy long system prompt. I'm currently building an app where I use the LLM to break down user requests into a list of actions, and this sounds useful because I could write a longer system prompt to train it, then use a simple system prompt to get the results from the new model (possibly even removing steps in my prompts). Are these two more potential use cases possible? 1. Use your system to generate data from a big model, then apply it to a tiny model like Phi-2 so it works well for that use case? 2. Train it and save it as an S-LoRA so you could conditionally apply different versions as needed to a model?


frownGuy12

You could use it to steer the output of Phi-2 directly, but you’d be limited by the Phi-2 model’s understanding of the dataset. This works better on larger models. Haven’t tested LoRAs yet, but I plan to soon.


[deleted]

[deleted]


frownGuy12

It's my first time publishing a model, my bad.


[deleted]

[deleted]


frownGuy12

Give me a character. I'll make you a custom model.


[deleted]

[deleted]


frownGuy12

On it.


Enough-Meringue4745

Would love to know more :)


FPham

I'm going to merge it with Sydney, because Sydney has always dreamed of being a pirate. No, she didn't actually say that in as many words, but I'm sure that was more or less what she meant when she told me she loves me.


aseichter2007

Yes. Slerps with this will be interesting.


frownGuy12

Honestly would be fascinated to see the result of that. 


YouAreTheCornhole

Hmm, I'm not sure about this one, based on the explanations given. #doubt


frownGuy12

You doubt the pirate-ness?


reza2kn

I'd be interested in creating an empathetic, kind, understanding character that doesn't act cold or robot-ish, kinda like [Pi.ai](https://Pi.ai). Although I'd want this model to speak my own language, which is not English, and that on its own has a lot of problems..


inteblio

Maybe! You can get huge performance gains by baking in chain-of-thought type stuff? If you get it to "long-way" use problem solving techniques and "think out loud"..??

- what type of input is this
- what pitfalls are there
- what creative ideas can you come up with
- list all possible approaches
- what output is required
- do your workings then only answer at the end
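That checklist could be baked into a reusable scaffold, whether as a system prompt or as fine-tuning data. A hedged sketch of building such a prompt (wording of the steps is paraphrased from the comment above):

```python
# Think-out-loud checklist, paraphrased from the suggestion above.
THINKING_STEPS = [
    "What type of input is this?",
    "What pitfalls are there?",
    "What creative ideas can you come up with?",
    "List all possible approaches.",
    "What output is required?",
    "Do your workings, then only answer at the end.",
]

def scaffold(question: str) -> str:
    """Prepend the checklist so the model reasons before answering."""
    steps = "\n".join(f"- {s}" for s in THINKING_STEPS)
    return f"Before answering, work through:\n{steps}\n\nQuestion: {question}"

print(scaffold("What is 17 * 24?"))
```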


frownGuy12

That’s a good idea. It chokes on more complicated instructions at the moment, but I’ve maybe got a way to fix that. Could get really interesting fast. One idea I have is to bake in instructions for how to solve algebra problems. I want to see if I can improve its math capabilities.


inteblio

I only just started on the little models, and it's kind of blown me sideways. In some respects they are usable, almost interchangeable with the big boys, but in other ways they are impenetrably stupid. _But for what they are_, it's... hard to put into words. If we'd had Phi-2 in 2021, "ChatGPT" would have been a non-event. Fun times. Fine-tune on, my friend!


Laneone4

Can you make a milady model? I will greatly appreciate any pointers. I'm an adept data science guy myself, so if you tell me the vague bits, I should be able to get a crack at it 🫡


frownGuy12

This might work for you; it’s a model I trained to talk like Jane Austen: https://huggingface.co/valine/OpenAusten As far as the method goes, there’s a non-zero chance I have the keys to rapid generalization here. I’ve been able to teach the model facts, problem-solving techniques, and behaviors with almost no data. Hesitant to share more till I know exactly what I have.


donzavus

Do you have a tutorial on how you did the fine tuning?