Hi, I'm Karina! A decade ago, I transitioned from finance to the world of data analytics and data science. It all started with simple VBA code, and I knew my life would never be the same. After that came SQL, R, Python, Power BI, Tableau, and hours spent with Stack Overflow and YouTube tutorials.
⚡️ Want to learn Python or start coding, but it feels overwhelming? Start with my beginner-friendly "Data Analysis with Python" masterclass: karinadatascientist.com/
Through my channel, I want to demystify data analysis and share my knowledge — from statistics and Excel to Python and ChatGPT.
Want to learn something new? Subscribe and hit the bell to get notified when I upload new videos!
Karina Data Scientist
Python tip
Find the most frequent element in a list
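The post does not say which approach it has in mind. One common way, sketched here with collections.Counter from the standard library (the most_frequent helper and the sample list are made up for illustration):

```python
from collections import Counter


def most_frequent(items):
    """Return the most common element of a non-empty list."""
    if not items:
        raise ValueError("items must not be empty")
    # most_common(1) returns a one-item list holding an (element, count) pair
    element, _count = Counter(items).most_common(1)[0]
    return element


colors = ["red", "blue", "red", "green", "blue", "red"]
top_color = most_frequent(colors)
print(top_color)
```

If two elements tie, Counter returns the one it encountered first.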
8 hours ago | [YT] | 9
Karina Data Scientist
Python tip
New way to merge dictionaries in Python 3.9+
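The tip presumably refers to the dictionary union operators | and |= added in Python 3.9 (PEP 584). A minimal sketch, with made-up settings dictionaries:

```python
# Hypothetical example data
defaults = {"theme": "light", "page_size": 20}
overrides = {"page_size": 50, "language": "en"}

# | builds a new dict; on duplicate keys the right-hand value wins
merged = defaults | overrides

# |= updates the left-hand dict in place
settings = dict(defaults)
settings |= overrides

print(merged)
```

Before 3.9 the usual spelling was {**defaults, **overrides}, which gives the same result.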
1 day ago | [YT] | 16
Karina Data Scientist
Data tip
Your merge can double your revenue.
The problem: The right table has duplicate keys. Each left row matches multiple right rows, so the rows (and any totals) multiply.
The fix:
- Check for duplicates before merging
- Add validate='many_to_one' for lookups
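A small sketch of the row explosion and both fixes. The orders and customers tables and their column names are invented for illustration:

```python
import pandas as pd

# Hypothetical tables: the lookup table has customer 10 twice
orders = pd.DataFrame(
    {"order_id": [1, 2, 3], "customer_id": [10, 20, 10], "revenue": [100.0, 50.0, 25.0]}
)
customers = pd.DataFrame(
    {"customer_id": [10, 10, 20], "segment": ["retail", "retail", "wholesale"]}
)

# Plain merge: every order for customer 10 is matched twice
exploded = orders.merge(customers, on="customer_id", how="left")

# Fix 1: check the lookup key for duplicates before merging
has_dupes = customers["customer_id"].duplicated().any()

# Fix 2: validate= makes pandas raise instead of silently duplicating rows
try:
    orders.merge(customers, on="customer_id", how="left", validate="many_to_one")
    validation_failed = False
except pd.errors.MergeError:
    validation_failed = True

# After de-duplicating the lookup table the merge is safe
lookup = customers.drop_duplicates(subset="customer_id")
clean = orders.merge(lookup, on="customer_id", how="left", validate="many_to_one")

print(len(orders), len(exploded), len(clean))
```

Comparing orders["revenue"].sum() with the merged total is a quick sanity check after any join.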
2 days ago | [YT] | 12
Karina Data Scientist
You assign a value. No error. But the data doesn't change.
The problem: Chained indexing can return a copy instead of a view. Your assignment lands on that temporary copy and disappears.
The fix: Use .copy() or .loc in one expression.
The rule:
Slicing for analysis? No .copy() needed
Slicing to modify? Add .copy()
Modifying original DataFrame? Use .loc[condition, column]
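A sketch of the three cases, using a made-up DataFrame. The warnings filter is only there to keep the output quiet, since pandas warns about the chained assignment:

```python
import warnings

import pandas as pd

# Hypothetical data
df = pd.DataFrame(
    {"customer": ["a", "b", "c"], "amount": [50, 150, 300], "tier": ["std", "std", "std"]}
)

# Chained indexing: the boolean filter returns a new object, so the
# assignment is made on that temporary and df itself is unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[df["amount"] > 100]["tier"] = "vip"
unchanged = df["tier"].tolist()

# Modifying the original: row condition and column in one .loc call
df.loc[df["amount"] > 100, "tier"] = "vip"

# Slicing to modify: take an explicit copy, then change the copy freely
big = df[df["amount"] > 100].copy()
big["amount_k"] = big["amount"] / 1000

print(unchanged, df["tier"].tolist())
```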
3 days ago | [YT] | 21
Karina Data Scientist
Python tip
You have a column of whole numbers. After a merge or fillna, suddenly they're floats with decimals.
Standard integers can't hold missing values. One NaN forces the entire column to float64.
To fix it, use a nullable integer dtype (Int64), or fill the missing values and convert back to int64.
The key difference:
int64 (lowercase) → Can't hold NaN, converts to float64
Int64 (uppercase) → Nullable integer, stays integer with <NA>
The rule:
Need to keep missing values? Use Int64
Missing = 0 makes sense? Use fillna(0).astype('int64')
Always check dtypes after merges
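A minimal sketch of both options after a left merge. The tables and the qty column are invented for illustration:

```python
import pandas as pd

# Hypothetical tables: id 3 has no match on the right
left = pd.DataFrame({"id": [1, 2, 3]})
right = pd.DataFrame({"id": [1, 2], "qty": [5, 7]})

merged = left.merge(right, on="id", how="left")

# The unmatched row introduces NaN, so qty is no longer an integer column
float_dtype = str(merged["qty"].dtype)

# Option 1: keep the missing value with the nullable integer dtype
nullable = merged["qty"].astype("Int64")

# Option 2: missing means zero, so fill first and convert back
filled = merged["qty"].fillna(0).astype("int64")

print(float_dtype, nullable.dtype, filled.dtype)
```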
4 days ago | [YT] | 18
Karina Data Scientist
GroupBy Multi-Index Columns: The Confusing Mess
You run a groupby with multiple aggregations. The column names become ('amount', 'sum') tuples.
Suddenly df['amount'] doesn't work anymore.
The problem: Using agg() with lists creates multi-index columns that break downstream code.
The fix: Use named aggregations to get clean, flat column names.
Always add .reset_index() to move the groupby key from index to column.
The rule:
Use named aggregations: new_name=(column, func)
Always .reset_index() so the groupby key becomes a regular column
Your downstream code will thank you
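A sketch comparing the two styles on a made-up sales table. The output names total_amount and avg_amount are arbitrary:

```python
import pandas as pd

# Hypothetical data
sales = pd.DataFrame(
    {"region": ["east", "east", "west"], "amount": [100, 200, 50]}
)

# A list of functions produces MultiIndex columns like ('amount', 'sum')
multi = sales.groupby("region").agg({"amount": ["sum", "mean"]})
multi_cols = list(multi.columns)

# Named aggregations: new_name=(column, func) gives flat column names,
# and reset_index() moves region from the index back to a column
flat = (
    sales.groupby("region")
    .agg(total_amount=("amount", "sum"), avg_amount=("amount", "mean"))
    .reset_index()
)

print(multi_cols)
print(list(flat.columns))
```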
5 days ago | [YT] | 22
Karina Data Scientist
Categorical Traps: Missing Categories and Broken Sorting
You group by customer tier. "Gold" customers don't appear in the results because there are zero this month.
Or your report shows Bronze → Gold → Platinum → Silver. Wait, what?
The problem: Categories inferred from data miss empty groups and sort alphabetically, not logically.
The fix: Define categorical dtype upfront with proper order.
The rule:
Define categories upfront with CategoricalDtype
Use observed=False in groupby to include empty categories
Use .reindex() to show zeros in pivot tables
Set ordered=True for logical sorting
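A sketch of the rules above on invented customer data with no Gold customers. The tier order Bronze, Silver, Gold, Platinum is an assumption about what "logical" means here:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Assumed logical order of tiers, defined upfront
tier_dtype = CategoricalDtype(
    categories=["Bronze", "Silver", "Gold", "Platinum"], ordered=True
)

# Hypothetical data: nobody is Gold this month
raw = pd.DataFrame(
    {"tier": ["Silver", "Bronze", "Platinum", "Bronze"], "spend": [20, 10, 90, 15]}
)
customers = raw.assign(tier=raw["tier"].astype(tier_dtype))

# observed=False keeps the empty Gold group, in category order
by_tier = customers.groupby("tier", observed=False)["spend"].sum()

# Sorting follows the category order, not the alphabet
sorted_tiers = customers.sort_values("tier")["tier"].tolist()

# A pivot built from plain strings drops Gold; reindex puts it back as zero
pivot = raw.pivot_table(index="tier", values="spend", aggfunc="sum")
pivot = pivot.reindex(tier_dtype.categories, fill_value=0)

print(by_tier)
```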
6 days ago | [YT] | 22
Karina Data Scientist
You merge two tables. Both have a status column. After the merge, df['status'] isn't what you think it is.
The problem: Duplicate column names after merge create status_x and status_y. You can't tell which is which.
The fix: Use explicit suffixes to show which table each column came from.
The rule:
Always use explicit suffixes= parameter
Give primary table empty suffix: suffixes=('', '_lookup')
Use semantic names: _user, _product, _meta, not _x, _y
Drop duplicate columns you don't need immediately
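A sketch with two made-up tables that both carry a status column. The _user suffix is just one possible semantic name:

```python
import pandas as pd

# Hypothetical tables sharing a "status" column
orders = pd.DataFrame({"user_id": [1, 2], "status": ["shipped", "pending"]})
users = pd.DataFrame({"user_id": [1, 2], "status": ["active", "banned"]})

# Default suffixes: status_x and status_y, with no hint of their origin
default = orders.merge(users, on="user_id")

# Primary table keeps the bare name, the lookup column gets a semantic suffix
named = orders.merge(users, on="user_id", suffixes=("", "_user"))

# Drop the lookup column right away if it is not needed
trimmed = named.drop(columns=["status_user"])

print(list(default.columns))
print(list(named.columns))
```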
1 week ago | [YT] | 33
Karina Data Scientist
𝟏𝟎 𝐆𝐢𝐭𝐇𝐮𝐛 𝐑𝐞𝐩𝐨𝐬 𝐟𝐨𝐫 𝐒𝐐𝐋 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞
Practice SQL with real business scenarios
𝐅𝐎𝐑 𝐂𝐎𝐌𝐏𝐋𝐄𝐓𝐄 𝐁𝐄𝐆𝐈𝐍𝐍𝐄𝐑𝐒 (𝐥𝐞𝐚𝐫𝐧 𝐛𝐲 𝐝𝐨𝐢𝐧𝐠)
𝐒𝐐𝐋 𝐌𝐮𝐫𝐝𝐞𝐫 𝐌𝐲𝐬𝐭𝐞𝐫𝐲 — solve a crime using only SQL queries, gamified learning
github.com/NUKnightLab/sql-mysteries
𝐒𝐐𝐋 𝐙𝐨𝐨 — interactive tutorials from basics to advanced, instant feedback on your queries
github.com/jisaw/sqlzoo-solutions
𝐅𝐎𝐑 𝐈𝐍𝐓𝐄𝐑𝐕𝐈𝐄𝐖 𝐏𝐑𝐄𝐏
𝐒𝐐𝐋 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞 𝐏𝐫𝐨𝐛𝐥𝐞𝐦𝐬 — 57 progressively harder problems mirroring real interview scenarios
github.com/XD-DENG/SQL-exercise
𝐀𝐰𝐞𝐬𝐨𝐦𝐞 𝐒𝐐𝐋 𝐈𝐧𝐭𝐞𝐫𝐯𝐢𝐞𝐰 𝐐𝐮𝐞𝐬𝐭𝐢𝐨𝐧𝐬 — company-specific questions (Google, Amazon, Meta) with expected approaches
github.com/kansiris/SQL-interview-questions
𝐃𝐚𝐭𝐚 𝐈𝐧𝐭𝐞𝐫𝐯𝐢𝐞𝐰 𝐐𝐮𝐞𝐬𝐭𝐢𝐨𝐧𝐬 — SQL problems asked at top tech companies with detailed solutions
github.com/shawlu95/Beyond-LeetCode-SQL
𝐅𝐎𝐑 𝐑𝐄𝐀𝐋-𝐖𝐎𝐑𝐋𝐃 𝐒𝐂𝐄𝐍𝐀𝐑𝐈𝐎𝐒 (business problems)
𝟖 𝐖𝐞𝐞𝐤 𝐒𝐐𝐋 𝐂𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐞 — case studies from e-commerce, food delivery, and streaming platforms
github.com/katiehuangx/8-Week-SQL-Challenge
𝐒𝐐𝐋 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞 — real-world business datasets with progressively challenging queries
github.com/WebDevSimplified/Learn-SQL
𝐅𝐎𝐑 𝐀𝐃𝐕𝐀𝐍𝐂𝐄𝐃 𝐏𝐑𝐀𝐂𝐓𝐈𝐂𝐄 (window functions, CTEs, optimisation)
𝐀𝐝𝐯𝐚𝐧𝐜𝐞𝐝 𝐒𝐐𝐋 𝐏𝐮𝐳𝐳𝐥𝐞𝐬 — brain teasers that force you to think differently about queries
github.com/smpetersgithub/AdvancedSQLPuzzles
𝐒𝐐𝐋 𝐖𝐢𝐧𝐝𝐨𝐰 𝐅𝐮𝐧𝐜𝐭𝐢𝐨𝐧𝐬 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞 — focused exercises on the functions that separate junior from senior analysts
github.com/lpinzari/sql-psql-udy
𝐋𝐞𝐞𝐭𝐂𝐨𝐝𝐞 𝐒𝐐𝐋 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧𝐬 — if you must use LeetCode, here are optimized solutions with explanations
github.com/kamyu104/LeetCode-Solutions
𝐏.𝐒. I share my experience, data analytics & data science tips in my free newsletter. Join here ->
lnkd.in/d3M49ktj
1 week ago | [YT] | 12
Karina Data Scientist
Did you know you can embed Power BI inside Jupyter Notebook? I didn't.
Yep — interactive Power BI visuals, inside your Python environment.
Here’s what you can do:
- Instantly explore your data (drag, filter, cross-highlight)
- Build automated reports that refresh with new data
- Combine Python models + Power BI visuals in one notebook
- Publish directly to Power BI workspace
Great for notebooks used in reports, walkthroughs or live demos.
Documentation:
learn.microsoft.com/en-us/javascript/api/overview/…
powerbi.microsoft.com/fr-fr/blog/announcing-power-…
1 week ago | [YT] | 11