Data Science with Keerthi (தமிழில்)

Vanakkam (hello) 🙏
I'm Keerthana, a self-taught data science practitioner in the field for 3.7 years.

This channel is dedicated to teaching data science and sharing what I know, in simple terms, with the Tamil community.



Data Science with Keerthi (தமிழில்)

New video in the 𝐌𝐚𝐜𝐡𝐢𝐧𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 Medhuvaaga (Slowly) Series!

🎯 New YouTube Video Drop — 𝐆𝐫𝐚𝐝𝐢𝐞𝐧𝐭 𝐁𝐨𝐨𝐬𝐭𝐢𝐧𝐠 (Ft. The AC Dataset)
📺 Watch full breakdown → https://youtu.be/UFMBXkB7BR0?si=bnyjy...

🔸 Gradient = derivative of loss

🔸 Each tree fits residuals from previous step

🔸 Regression → MSE loss

🔸 Classification → Log-loss & log-odds

🔸 Controlled updates using learning rate

🔸 Final prediction = base + sum of all tiny corrections
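
Here is a minimal sketch of those bullets for the regression (MSE) case. It is a toy, not a production implementation: each "tree" is a one-split stump on a single feature, and the numbers in `xs` and `ys` are made up for illustration.

```python
def fit_stump(xs, residuals):
    """Pick the threshold split that minimises squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # the largest x cannot be a left/right boundary
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    base = sum(ys) / len(ys)  # base prediction: the mean minimises MSE
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - F)^2, the negative gradient w.r.t. F is the residual y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Controlled update: add only a fraction of each correction.
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        # Final prediction = base + sum of all tiny corrections.
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 1.5, 2.1, 2.6, 3.0]
model = gradient_boost(xs, ys)
```

A smaller `learning_rate` makes each correction tinier, so it usually needs more rounds to reach the same fit.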

#GradientBoosting #DataScience #MachineLearning #DataScienceWithKeerthi #ArtificialIntelligence

1 week ago (edited) | [YT] | 10

Data Science with Keerthi (தமிழில்)

Introducing my new series: “𝐋 𝐟𝐨𝐫 𝐋𝐋𝐌”(Episode - 1):



𝑳𝒂𝒓𝒈𝒆 𝑳𝒂𝒏𝒈𝒖𝒂𝒈𝒆 𝑴𝒐𝒅𝒆𝒍
AI systems trained on massive amounts of text (books, articles, websites, conversations). LLMs are powerful pattern recognizers that feel intelligent because of scale.

Examples: ChatGPT, Claude, and more.



But how exactly are these modern LLMs trained?
1️⃣ 𝐏𝐫𝐞𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠: Model learns language patterns and knowledge from massive text datasets.
2️⃣ 𝐈𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐢𝐨𝐧 𝐅𝐢𝐧𝐞-𝐓𝐮𝐧𝐢𝐧𝐠 (𝐒𝐅𝐓): Humans teach the model to follow instructions with curated examples.
3️⃣ 𝐏𝐫𝐞𝐟𝐞𝐫𝐞𝐧𝐜𝐞 𝐂𝐨𝐥𝐥𝐞𝐜𝐭𝐢𝐨𝐧: Multiple model outputs are ranked to capture what humans actually prefer.
4️⃣ 𝐑𝐞𝐰𝐚𝐫𝐝 𝐌𝐨𝐝𝐞𝐥 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠: A model is trained to score outputs based on human preferences.
5️⃣ 𝐑𝐞𝐢𝐧𝐟𝐨𝐫𝐜𝐞𝐦𝐞𝐧𝐭 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 (𝐑𝐋𝐇𝐅/𝐃𝐏𝐎): The model is fine-tuned toward responses that are helpful, safe, and aligned with human preferences.
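
To make steps 3 and 4 a bit more concrete, here is a small sketch of one common way to train a reward model from ranked pairs: a Bradley-Terry style pairwise loss. The reward scores passed in are made-up numbers, and real pipelines may use a different formulation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: small when the human-preferred answer gets the higher score."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))


# Hypothetical reward scores for two answers to the same prompt.
good_ranking = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
bad_ranking = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
```

Minimising this loss pushes the reward model to score preferred answers above rejected ones, which is the signal the final fine-tuning step relies on.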

Tada! ChatGPT is ready to be shipped! 🚀


As usual, I have some handcrafted visualizations to make it easy to digest. Go check them out now →



#AI #ArtificialIntelligence #MachineLearning #DeepLearning #LLM #ChatGPT #ClaudeAI #NaturalLanguageProcessing #NLP #AIResearch #TechInnovation #GenerativeAI #AIExplained #DataScience #AICommunity #RLHF #RLAIF #LLMTraining

2 weeks ago | [YT] | 12

Data Science with Keerthi (தமிழில்)

New video in the 𝐌𝐚𝐜𝐡𝐢𝐧𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 Medhuvaaga (Slowly) Series!


AdaBoost (Ft. Saravana Bhavan) - https://youtu.be/EliLxkKUDQ0




Here’s why it’s one of my favorite ML algorithms 👇
✅ 𝐃𝐨𝐞𝐬𝐧’𝐭 𝐝𝐢𝐬𝐜𝐚𝐫𝐝 𝐰𝐞𝐚𝐤 𝐦𝐨𝐝𝐞𝐥𝐬 – it makes them work harder.
✅ 𝐋𝐞𝐚𝐫𝐧𝐬 𝐟𝐫𝐨𝐦 𝐦𝐢𝐬𝐭𝐚𝐤𝐞𝐬 – misclassified points get more weight next round.
✅ 𝐓𝐞𝐚𝐦 𝐞𝐟𝐟𝐨𝐫𝐭 – many weak learners combine into one strong learner.
✅ 𝐒𝐢𝐦𝐩𝐥𝐞 𝐦𝐚𝐭𝐡, 𝐩𝐨𝐰𝐞𝐫𝐟𝐮𝐥 𝐢𝐦𝐩𝐚𝐜𝐭 – a great mix of theory + real-world use.
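
The "learns from mistakes" step can be sketched in a few lines. This is the classic binary AdaBoost update, assuming labels in {-1, +1}, weights that sum to 1, and a weak learner whose weighted error is strictly between 0 and 1. The predictions below are hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost round: returns (alpha, re-normalised sample weights)."""
    # Weighted error = total weight of the misclassified points.
    err = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
    # alpha is the "say" this weak learner gets in the final vote.
    alpha = 0.5 * math.log((1 - err) / err)
    # y * p is +1 for a correct point (weight shrinks) and -1 for a mistake (weight grows).
    new = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, y_true, y_pred)]
    total = sum(new)
    return alpha, [w / total for w in new]


y_true = [+1, +1, -1, -1, +1]
y_pred = [+1, -1, -1, -1, +1]  # hypothetical weak learner, wrong on the 2nd point
weights = [0.2] * 5
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After this round the single misclassified point carries half of the total weight, so the next weak learner is pushed to get it right.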

If you’ve ever wondered how machines get better by learning from errors, this is the algorithm to explore.



#MachineLearning #AI #Boosting #DataScience #AdaBoost

3 weeks ago | [YT] | 13

Data Science with Keerthi (தமிழில்)

New video in the 𝐌𝐚𝐜𝐡𝐢𝐧𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠 Medhuvaaga (Slowly) Series! https://youtu.be/-aI9fjOL9EQ
Welcome to my small 🍎 Apple Shop.
Here, every customer asks me 2 things:
1️⃣ Is this really an apple?
2️⃣ How sweet is it? (in grams of sugar)
My data mind wanted to frame it as a use case, and so 𝐑𝐚𝐧𝐝𝐨𝐦 𝐅𝐨𝐫𝐞𝐬𝐭 𝐓𝐫𝐞𝐞𝐬 were born!
👉 Classification = “Is Apple or Not?”
👉 Regression = “Sweetness Score”
I also cover ensembles, bagging, and feature sampling, all in a super simple way.
No heavy math, just everyday examples.
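
For anyone who wants to see the moving parts, here is a rough sketch of bagging, feature sampling, and how a forest combines its trees. The feature names and tree outputs are hypothetical, and the trees themselves are left out to keep it short.

```python
import random


def bootstrap_sample(rows, rng):
    """Bagging: draw len(rows) rows with replacement, so some rows repeat."""
    return [rng.choice(rows) for _ in rows]


def sample_features(feature_names, k, rng):
    """Feature sampling: each tree (or split) only looks at k random features."""
    return rng.sample(feature_names, k)


def majority_vote(votes):
    """Classification: the forest answers with the most common tree vote."""
    return max(set(votes), key=votes.count)


def average(values):
    """Regression: the forest answers with the mean of the tree predictions."""
    return sum(values) / len(values)


rng = random.Random(0)
rows = list(range(10))                               # stand-in for 10 fruit records
features = ["colour", "weight", "diameter", "shine"]  # hypothetical feature names

tree_rows = bootstrap_sample(rows, rng)
tree_features = sample_features(features, 2, rng)

is_apple = majority_vote(["apple", "apple", "not apple"])  # hypothetical tree votes
sweetness = average([10, 12, 14])                           # hypothetical grams of sugar
```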

1 month ago | [YT] | 14

Data Science with Keerthi (தமிழில்)

☕ Once upon a coffee (Decision Trees Ft. Filter Coffee!)…

We were debating over our morning brews — who drinks when, how many cups, who gets stressed, who sleeps well.
That tiny argument turned into something big… 🌳

👉 We built our very own Coffee Chronicles Dataset.
👉 We asked a simple question: Can a Decision Tree predict caffeine addicts vs calm sippers?
👉 From there, the math started pouring in —

🔹 Entropy to measure the “chaos” in our data
🔹 Gini Impurity to split better
🔹 Variance Reduction to handle regression
All explained in simple Tamil, step by step. 🇮🇳
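
The three measures above fit in a handful of lines. This is a minimal sketch using the standard textbook definitions. The labels and numbers in the usage lines are made up and are not taken from the Coffee Chronicles Dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: 0 for a pure node, 1 for a 50/50 binary node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: chance of mislabelling a random point drawn from the node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_reduction(parent, left, right):
    """Regression split quality: parent variance minus weighted child variance."""
    n = len(parent)
    weighted = (len(left) / n) * variance(left) + (len(right) / n) * variance(right)
    return variance(parent) - weighted


mixed = ["addict", "calm"]            # hypothetical labels
cups = [1, 1, 5, 5]                   # hypothetical cups of coffee per day
gain = variance_reduction(cups, [1, 1], [5, 5])
```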

And now, it’s a video — mixing math + machine learning + Tamil + coffee all in one cup. 🚀

🎥 Here’s the release: [https://youtu.be/zkxMjd6Sw2c]

#MachineLearning #TamilTech #DataScience #DecisionTrees #AI #DecisionTreesInTamil #Entropy #GiniImpurity


1 month ago | [YT] | 9

Data Science with Keerthi (தமிழில்)

𝐋𝐚𝐝𝐢𝐞𝐬 𝐚𝐧𝐝 𝐆𝐞𝐧𝐭𝐥𝐞𝐦𝐞𝐧 𝐒𝐞𝐫𝐢𝐞𝐬 - (Episode 3)
Sharing go-to snippets that teach you something new in every post.



𝑹𝑨𝑮:
❓ WHO AM I? → A technique to make LLMs accurate, up-to-date, and domain-specific by grounding their answers in external data.
⏳ SINCE WHEN? → Introduced by Facebook AI Research in 2020.
🌍 HOW POPULAR? → Powering today’s AI copilots, chat-with-your-PDF apps, and enterprise assistants!



But wait…… 🤔
How do I actually work?

1️⃣ 𝐃𝐚𝐭𝐚 𝐈𝐧𝐠𝐞𝐬𝐭𝐢𝐨𝐧 📥 — Load PDFs, CSVs, websites, or DBs.
2️⃣ 𝐒𝐩𝐥𝐢𝐭𝐭𝐢𝐧𝐠 ✂️ — Break long docs into chunks.
3️⃣ 𝐄𝐦𝐛𝐞𝐝𝐝𝐢𝐧𝐠𝐬 🧩 — Turn chunks into vectors.
4️⃣ 𝐕𝐞𝐜𝐭𝐨𝐫 𝐒𝐭𝐨𝐫𝐞 🗂️ — Store in FAISS, Pinecone, Chroma, etc.
5️⃣ 𝐑𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 + 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐨𝐧 🤝 — Fetch relevant chunks → pass to LLM → get grounded answers.
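
The five steps can be sketched end to end without any external library. Big assumption: the "embedding" here is just a word-count vector and the "vector store" is a plain list, both stand-ins for a real embedding model and a store like FAISS. The final LLM call is left out, so the sketch stops at the grounded prompt.

```python
import math
from collections import Counter


def split_into_chunks(text, chunk_size=10):
    """Step 2: break a long document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Step 3 (toy version): lowercase word counts instead of a learned embedding."""
    return Counter(w.strip(".,?!") for w in text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, store, k=1):
    """Step 5a: fetch the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Step 5b: pass the retrieved chunks to the LLM as context."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Step 1 (ingestion) is faked with an in-memory string.
document = ("Gradient boosting fits every new tree to the previous residuals. "
            "AdaBoost gives misclassified points more weight in the next round.")
store = [(chunk, embed(chunk)) for chunk in split_into_chunks(document)]  # step 4
question = "Which points get more weight?"
prompt = build_prompt(question, retrieve(question, store))
```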



🚦Ways to Build Me:
𝐎𝐩𝐭𝐢𝐨𝐧 𝐀 – 𝐃𝐈𝐘🧑‍💻
Use Python libraries (PyPDF2, sentence-transformers, faiss, openai).
You’ll control everything, but you’ll also manage memory, prompts, and orchestration yourself.
𝐎𝐩𝐭𝐢𝐨𝐧 𝐁 – 𝐅𝐫𝐚𝐦𝐞𝐰𝐨𝐫𝐤𝐬 ⚡
LangChain → general-purpose & modular.
LlamaIndex → data-centric & simple.
Haystack → enterprise-grade pipelines.
AutoGen → for agent-based workflows.

Choose what fits your style & infra.



🚀 Now go check out some starters using LangChain below!

#llms #langchain #GenAI #datascience #Agents #LadiesAndGentlemen
#datasciencewithkeerthi #RAG

1 month ago | [YT] | 10

Data Science with Keerthi (தமிழில்)

𝐋𝐚𝐝𝐢𝐞𝐬 𝐚𝐧𝐝 𝐆𝐞𝐧𝐭𝐥𝐞𝐦𝐞𝐧 𝐒𝐞𝐫𝐢𝐞𝐬 - (Episode 2)
Sharing go-to snippets that teach you something new in every post.



𝙇𝙖𝙣𝙜𝙘𝙝𝙖𝙞𝙣
❓ Who am I? → Framework to build applications on top of LLMs.
⏳ Since when? → October 2022 (still a baby...)
🌍 How popular? → 360,000 package downloads per day!



But wait...........🤔
Why do you need me?

1️⃣ 𝗣𝗿𝗼𝗺𝗽𝘁 𝗺𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁 was messy 📝 — LangChain gives you neat templates.
2️⃣ Connecting your 𝗼𝘄𝗻 𝗱𝗮𝘁𝗮 was painful 📚 — now you can load PDFs, DBs, or APIs easily.
3️⃣ LLMs don’t 𝗿𝗲𝗺𝗲𝗺𝗯𝗲𝗿 past chats 🧠 — LangChain adds memory so they can!
4️⃣ Need external help like APIs? 🔧 — 𝗔𝗴𝗲𝗻𝘁𝘀 step in to pick and use tools.
5️⃣ 𝗧𝗿𝗮𝗰𝗸𝗶𝗻𝗴, 𝗹𝗼𝗴𝗴𝗶𝗻𝗴 & 𝗲𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 ✅ — all in one place.
6️⃣ Complex workflows? 🔗 — 𝗖𝗵𝗮𝗶𝗻𝘀 let you connect multiple steps seamlessly.
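
To show what three of these ideas (templates, memory, chains) mean in practice, here is a library-free sketch. None of the names below are LangChain APIs. They are hypothetical helpers written only to illustrate the concepts, and `fake_llm` stands in for a real model call.

```python
def fill_template(template, **values):
    """Prompt management: one reusable template, many filled-in prompts."""
    return template.format(**values)


class ChatMemory:
    """Memory: keep earlier turns so they can be replayed into the next prompt."""

    def __init__(self):
        self.turns = []

    def add(self, role, text):
        self.turns.append(f"{role}: {text}")

    def as_text(self):
        return "\n".join(self.turns)


def run_chain(steps, value):
    """Chain: feed the output of each step into the next one."""
    for step in steps:
        value = step(value)
    return value


def fake_llm(prompt):
    """Stand-in for a model call: just reports how much prompt it received."""
    return f"(model reply to {len(prompt)} characters of prompt)"


memory = ChatMemory()
memory.add("user", "My name is Keerthi")


def make_prompt(question):
    return fill_template("History:\n{history}\n\nQuestion: {question}",
                         history=memory.as_text(), question=question)


answer = run_chain([make_prompt, fake_llm], "What is my name?")
```

Because the earlier turn is replayed inside the prompt, a real model would be able to answer the question even though it has no memory of its own.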



🚀 LLMs are raw electricity… but you need wiring to light up your world 🔌



That's when I come into the picture. Now go check out the 📸.



#llms #langchain #GenAI #datascience #Agents #LadiesAndGentlemen #datasciencewithkeerthi

1 month ago | [YT] | 5

Data Science with Keerthi (தமிழில்)

🚀 Launching "𝐋𝐚𝐝𝐢𝐞𝐬 𝐚𝐧𝐝 𝐆𝐞𝐧𝐭𝐥𝐞𝐦𝐞𝐧 𝐒𝐞𝐫𝐢𝐞𝐬"
Sharing go-to snippets that teach you something new in every post.

𝑷𝒚𝒅𝒂𝒏𝒕𝒊𝒄
❓ Who am I? → A data validation library for Python
⏳ Since when? → October 2019 (first stable release)
🌍 How popular? → 8,000+ Python packages trust me

But wait...........🤔
Why do you need me?
Python is 𝐝𝐲𝐧𝐚𝐦𝐢𝐜𝐚𝐥𝐥𝐲 𝐭𝐲𝐩𝐞𝐝 - variables can change their datatype anytime:
---------------------------------
x = 10
x = "Keerthi"
print(x) # "Keerthi"
---------------------------------
This is great as it is 𝒉𝒊𝒈𝒉𝒍𝒚 𝒇𝒍𝒆𝒙𝒊𝒃𝒍𝒆 and on the fly🦅!

But in large apps / APIs, you need strict control of data types for requests & responses. You can't send a "string" to a variable that expects an "int" (so sad!).

That's when I come into the picture. Now go check out the 📸.
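
For readers who prefer text to screenshots, here is a minimal sketch of the idea (not necessarily what the picture shows). `AppleOrder` and its fields are hypothetical names chosen for illustration.

```python
from pydantic import BaseModel, ValidationError


class AppleOrder(BaseModel):
    customer: str
    quantity: int  # must be an integer, or something that cleanly converts to one


# Valid data passes through and is available as typed attributes.
ok = AppleOrder(customer="Keerthi", quantity=3)

# Invalid data is rejected at the boundary instead of failing somewhere deep in the app.
try:
    AppleOrder(customer="Keerthi", quantity="three")
    rejected = False
except ValidationError as err:
    rejected = True
    message = str(err)  # human-readable description of which field failed
```

So plain Python variables stay flexible, while the data entering your app gets checked against the declared types.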
#python #pydantic #datascience #Agents #LadiesAndGentlemen #datasciencewithkeerthi

1 month ago | [YT] | 15

Data Science with Keerthi (தமிழில்)

ANNs to LLMs (Ep-2): Activation Functions - lnkd.in/gQxSdWmc

Confused between ReLU, GELU, Swish, Mish, or just using Sigmoid by default?

- Clean formulas for all activations
- No confusion + Simple intuition
- Pros & Cons – vanishing gradient, dying ReLU, etc.
- When to use what – hidden layers vs output layers
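
As a quick reference, here are the standard formulas for the activations named above, written as a small scalar sketch. Deep learning libraries apply the same functions element-wise on tensors, and some use a tanh approximation of GELU rather than the exact form shown here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))           # squashes to (0, 1)


def relu(x):
    return max(0.0, x)                           # zero for negatives: can "die"


def swish(x):
    return x * sigmoid(x)                        # also called SiLU (beta = 1)


def softplus(x):
    return math.log1p(math.exp(x))               # smooth version of ReLU


def mish(x):
    return x * math.tanh(softplus(x))


def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))  # exact form, x * Phi(x)


values = {f.__name__: f(1.0) for f in (sigmoid, relu, swish, mish, gelu)}
```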
#DeepLearning #ActivationFunctions #AI #ReLU #GELU #NeuralNetworks #ML

3 months ago | [YT] | 3

Data Science with Keerthi (தமிழில்)

Word Embeddings with Thirukkural - OHE/BoW/TF-IDF (https://youtu.be/rOkvRtuyw6Q)





This is Tamil meets AI – for anyone trying to learn NLP concepts with our own culture.
Perfect for students, beginners, and anyone building something in NLP ❤️
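
Here is a small sketch of the three representations from the title. The documents are made-up English tokens, not Thirukkural verses, and the TF-IDF variant is one of several in use: term frequency divided by document length, times the natural log of (number of documents / documents containing the word).

```python
import math

# Hypothetical pre-tokenised documents, for illustration only.
docs = [
    ["data", "science", "in", "tamil"],
    ["tamil", "meets", "ai"],
    ["data", "meets", "ai", "ai"],
]
vocab = sorted({w for d in docs for w in d})


def one_hot(word):
    """OHE: a single 1 at the word's position in the vocabulary."""
    return [1 if w == word else 0 for w in vocab]


def bag_of_words(doc):
    """BoW: how many times each vocabulary word appears in the document."""
    return [doc.count(w) for w in vocab]


def tf_idf(doc):
    """TF-IDF: frequent in this document, rare across documents = high score."""
    vec = []
    for w in vocab:
        tf = doc.count(w) / len(doc)
        df = sum(1 for d in docs if w in d)
        vec.append(tf * math.log(len(docs) / df))
    return vec


ohe_data = one_hot("data")
bow_last = bag_of_words(docs[2])
tfidf_first = tf_idf(docs[0])
```

In the first document, "in" and "science" score higher than "data" and "tamil" because they appear in no other document.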



#TamilNLP #ThirukkuralMeetsAI #MachineLearningInTamil #WordEmbeddings #TFIDF #OHE #BagOfWords #DataScienceinTamil #DataScienceWithKeerthi

3 months ago | [YT] | 7