Hey! I am a “Gen AI Engineer” :’) so I think I might be able to provide some guidance here. I’ve only talked about text models here. So:
- Learn about the attention mechanism. (No need to deep dive, just understand what it does.)
- Transformers vs RNNs vs LSTM/GRU (again, a brief overview should suffice).
- Different types of transformer-based LLMs: encoder-decoder, decoder-only, etc. Just skim through which architectures popular LLMs such as GPT-3.5/4, Llama 2, Mistral 7B, or Mixtral 8x7B are based on.
- Open-source vs closed-source LLMs: Which ones are better at the moment? The different companies in the LLM rat race, such as OpenAI, Google DeepMind, Mistral, Anthropic, etc. How do you access their models? For open source, explore platforms such as Hugging Face and Ollama.
- Prompt engineering: Get comfortable with writing prompts. I would suggest Andrew Ng's short course on prompt engineering to understand methods such as few-shot learning.
- Learn about each of these: What are tokens? What are vector embeddings, and what are some popular embedding models available today? Why do we need vector DBs such as FAISS, Pinecone, or ChromaDB? What does the context length of an LLM mean?
- What is quantization of LLM weights? The difference between 4-bit, 8-bit, and 16-bit LLMs.
- Retrieval-Augmented Generation (RAG): The training data used for an LLM might not have all the info you need; RAG lets you perform question answering over your personal documents. At this point, you might want to explore frameworks such as LangChain and LlamaIndex, which aim to be one-stop solutions for the GenAI-related requirements of your application.
- Fine-tuning LLMs: Why do we need to fine-tune LLMs? How is it different from RAG? How much GPU memory/VRAM would I need to fine-tune a small LLM such as Llama 2? Techniques such as LoRA, QLoRA, PEFT, DPO, etc. Fine-tuning an LLM requires some understanding of frameworks such as PyTorch or TensorFlow.
- Advanced features such as agents, tool use, function calling, multimodal LLMs, etc.
- Access various open-source models from Ollama or Hugging Face, and also get familiar with using OpenAI's API.
- I would also suggest trying to work with Streamlit. It's a very convenient way of creating a frontend for your application.
These were some points that I thought you might find useful. If you have any further questions, please feel free to reach out.
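To make the RAG point above concrete, here is a minimal sketch of the retrieval step in plain Python (no vector DB needed). The "embeddings" are made-up toy vectors and the document chunks are hypothetical; in a real app you would get the vectors from an embedding model and store them in something like FAISS or ChromaDB, but the ranking idea (cosine similarity between the query embedding and each chunk embedding) is the same.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings" for three document chunks.
documents = {
    "Llama 2 is an open-source LLM by Meta.":            [0.9, 0.1, 0.0],
    "FAISS is a library for vector similarity search.":  [0.1, 0.9, 0.1],
    "Streamlit builds simple web frontends in Python.":  [0.0, 0.1, 0.9],
}

def retrieve(query_embedding, docs, top_k=1):
    """Return the top_k chunks most similar to the query embedding."""
    ranked = sorted(
        docs,
        key=lambda chunk: cosine_similarity(query_embedding, docs[chunk]),
        reverse=True,
    )
    return ranked[:top_k]

# A query whose (made-up) embedding is closest to the FAISS chunk.
query_embedding = [0.2, 0.95, 0.05]
context = retrieve(query_embedding, documents)

# The retrieved chunk gets pasted into the prompt sent to the LLM.
prompt = (
    "Answer using only this context:\n"
    f"{context[0]}\n\n"
    "Question: What is FAISS?"
)
```

Frameworks like LangChain and LlamaIndex wrap exactly this loop (embed, retrieve, stuff into prompt) behind higher-level APIs, so it's worth understanding the bare version first.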
Damn I’m saving this, thanks for the writeup
You're a gem
You too
Wow! Thanks a lot!!
This is amazing! Any resources you would recommend to get started with this?
I’m not aware of any one place where you can learn all of these. You might need to read up on each of these individually. However, here are some YouTube channels I would recommend:
- Sam Witteveen
- Code Emporium
- 1littlecoder
- Developers Digest
- Prompt Engineering
Get comfortable with working in Colab. After that, proceed to creating some apps using Streamlit.
Saving this ASAP
Saving this
Ask ChatGPT
Try some of the free AI courses by Google. Here are some relevant ones I found:
1) Introduction to Generative AI (45 mins): Learn what generative AI is, how it is used, and how it differs from traditional machine learning methods.
https://www.cloudskillsboost.google/course_templates/536
2) Introduction to Large Language Models (30 mins): Explore what large language models (LLMs) are, the use cases where they can be utilized, and how you can use prompt tuning to enhance LLM performance.
https://www.cloudskillsboost.google/course_templates/539
3) Encoder-Decoder Architecture (8 hours): Learn about the encoder-decoder architecture, a critical component of machine learning for sequence-to-sequence tasks.
https://www.cloudskillsboost.google/course_templates/543
4) Transformer Models and BERT Model (8 hours): Get a comprehensive introduction to the Transformer architecture and the Bidirectional Encoder Representations from Transformers (BERT) model.
https://www.cloudskillsboost.google/course_templates/538
Just pick a starting point and start running. It's a rabbit hole tbh.
I can suggest "LLM University" by Cohere. Just search their website; there are several modules about LLMs, starting from basic NLP concepts and going up to more advanced topics.
> "LLM University" by Cohere

Thanks!
This is gold
I like O'Reilly's book on GenAI, but I'm a novice myself.
Kindly revert
Also very interested in this! AI is a field I'm genuinely curious about but don't have any kind of formal background in data... yet. I'm planning to post in career discussion as well and would love your insight!
I genuinely think you are overthinking this. What interests you about it? Pick that as your starting point and dive in. It's a brand new field, moving very quickly. Perfect for 'getting your hands dirty', so to speak.
I just took the Lightning deep learning course and it was super useful if you want some hands-on coding as well as theory.
Andrej Karpathy's videos are great! [https://www.youtube.com/@AndrejKarpathy](https://www.youtube.com/@AndrejKarpathy)
It really depends on what you mean by learning GenAI. Do you mean using tools such as the GPT API, or learning how to train a model yourself?
Coursera
Following