Deep Learning with Yacine
If you are interested in learning more about the Adam optimization algorithm, come hang out with him today at 8:00 PM EST!
We're going to do our best to code the algorithm in Python:
youtube.com/live/_2Nm7RVE-dA?feature=share
See you there!
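If you want a head start before the stream, here's a minimal NumPy sketch of the Adam update rule, just my own illustration of the idea, not the exact code we'll write live:

```python
import numpy as np

def adam_update(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction, since m and v start at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-coordinate adaptive step.
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy example: minimize f(x) = x^2 starting from x = 5.
x = np.array([5.0])
m, v = np.zeros_like(x), np.zeros_like(x)
for t in range(1, 2001):
    x, m, v = adam_update(x, 2 * x, m, v, t, lr=0.05)
print(x)  # close to 0
```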
7 months ago
Deep Learning with Yacine
Hey folks!
If you are interested in learning more about the transformer architecture, come hang out with him today at 8:00 PM EST!
We're going to code the algorithm to calculate masked self-attention in Python:
youtube.com/live/CBSMMMIYj6k
See you there!
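If you want to warm up beforehand, here's a rough NumPy sketch of causal (masked) self-attention for a single head, just to illustrate the shapes involved, not the exact code from the stream:

```python
import numpy as np

def masked_self_attention(X, W_q, W_k, W_v):
    # X: (seq_len, d_model), W_q/W_k/W_v: (d_model, d_k) projection matrices.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (seq_len, seq_len)
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    # Row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                    # (seq_len, d_k)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(masked_self_attention(X, W_q, W_k, W_v).shape)      # (4, 8)
```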
8 months ago
Deep Learning with Yacine
I had a blast this morning giving a talk on how AI can drive sustainability in aerospace supply chains.
I'll recap some points that could be useful for other machine learning professionals out there:
- There is an enormous amount of leverage an organization can get from having its data properly structured and made usable. It makes every data project more likely to succeed and allows AI projects to flourish. This plumbing is usually a key missing piece for unlocking massive AI upside.
- Data is nice, but context is king. Having properly structured information, captured within the context where it's created, is a major accelerator for AI engineering projects. It creates a shortcut from an idea to a production application. This is 1000X more true in our era dominated by LLMs.
- Making a supply chain more sustainable requires two things:
1. Complete visibility into what is being bought, from whom, and all the metadata associated with it.
2. That information needs to be available while the context is fresh, i.e., while the purchase is being considered.
Both of these points are mostly data engineering problems that require proper piping of the data to the procurement department. Invest in that type of data engineering skillset early and often.
9 months ago
Deep Learning with Yacine
I'll be hosting a live machine learning study session this evening at 8:00 PM EST!
We'll be going through some live coding exercises in Python, and I'll do my best to implement Singular Value Decomposition using the Jacobi method.
If you want to code along, you can check the problem out over here:
www.deep-ml.com/problems/12
Last week, I coded one version of it with eigenvalues and eigenvectors, and boy was it rough.
We'll be live over here:
youtube.com/live/ijakXXXpbGI?feature=share
See you then!
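For the curious, here's a rough sketch of what a one-sided Jacobi SVD can look like in NumPy. It's only an illustration of the idea, not the reference solution to the problem:

```python
import numpy as np

def jacobi_svd(A, tol=1e-12, max_sweeps=30):
    # One-sided Jacobi SVD: rotate pairs of columns until they are mutually
    # orthogonal. The column norms are then the singular values.
    A = A.astype(float).copy()
    n = A.shape[1]
    V = np.eye(n)
    for _ in range(max_sweeps):
        converged = True
        for i in range(n - 1):
            for j in range(i + 1, n):
                gamma = A[:, i] @ A[:, j]
                alpha, beta = A[:, i] @ A[:, i], A[:, j] @ A[:, j]
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                converged = False
                # Rotation angle that zeroes the (i, j) column inner product.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + np.sqrt(1.0 + zeta ** 2))
                c = 1.0 / np.sqrt(1.0 + t ** 2)
                s = c * t
                R = np.array([[c, s], [-s, c]])
                A[:, [i, j]] = A[:, [i, j]] @ R
                V[:, [i, j]] = V[:, [i, j]] @ R
        if converged:
            break
    sigma = np.linalg.norm(A, axis=0)
    order = np.argsort(sigma)[::-1]
    sigma = sigma[order]
    U = A[:, order] / np.where(sigma > 0, sigma, 1.0)
    return U, sigma, V[:, order]

M = np.random.randn(5, 3)
U, s, V = jacobi_svd(M)
print(np.allclose(U * s @ V.T, M))  # True
```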
9 months ago
Deep Learning with Yacine
One thing I'm seeing in the AI field is a convergence of the stack.
Three or four years back, it would still have been fine to stay narrowly focused on the data stack, the ML stack, the Ops stack, the frontend/backend stack, etc.
What I'm seeing now with AI Engineering being integrated into more product teams is that every stack gets pulled together a bit more tightly.
AI Engineering especially sits almost squarely in between everything.
It requires:
- Some knowledge of data management, because that's what you pipe in and out of these systems.
- Ops understanding, since most LLMs live in the cloud, even for prototyping.
- Backend knowledge, because to add an AI feature you usually have to wrap it in some sort of backend.
- ML skills, for when you need something a bit more sophisticated than basic prompting of a foundation model.
- Even some frontend in certain cases, to close the loop between the AI feature and the user.
I'm not saying that specialization is dead, absolutely not. However, not knowing the rest of the flow it takes to make a product doesn't work well in the current climate.
One of the core reasons AI features get prime spots in roadmaps is that they can be extremely useful and add an additional layer of magic to an already great product. They also make some problems trivial in an elegant manner, so they get prioritized.
I highly recommend the following books and course to get caught up on AI Engineering:
[Book] AI Engineering Book by Chip: www.oreilly.com/library/view/ai-engineering/978109…
[Book] LLM Engineer's Handbook: github.com/PacktPublishing/LLM-Engineers-Handbook
[Course] AI Engineering Path: scrimba.com/the-ai-engineer-path-c02v?via=yacineMa…
PS: On top of all that, there is the whole cybersecurity aspect, as illustrated in the Cloudflare diagram below, that is still rapidly evolving. Another field getting closer to the rest of the stacks.
9 months ago
Deep Learning with Yacine
Tiling matrix multiplication is a technique currently used in transformer attention calculation to optimize the use of GPU resources.
It's the kind of technique you see in FlashAttention, Lightning Attention, or other I/O-aware attention calculation techniques.
The main benefit of tiling is that it reduces global memory access by taking advantage of the shared memory on the GPU.
By properly segmenting the matrices being multiplied into blocks, tiling allows you to maximize how effectively threads access memory, which in some cases is the big bottleneck to efficient computation.
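To make the blocking idea concrete, here's a tiny NumPy sketch of tiled matrix multiplication. The speedup only materializes in a real GPU kernel where each tile is staged in shared memory, but the loop structure is the same:

```python
import numpy as np

def tiled_matmul(A, B, tile=64):
    # Compute C = A @ B block by block. Each (i, j) tile of C is accumulated
    # from matching tiles of A and B; on a GPU those tiles are the pieces you
    # stage in shared memory so threads stop re-reading global memory.
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                C[i:i + tile, j:j + tile] += A[i:i + tile, p:p + tile] @ B[p:p + tile, j:j + tile]
    return C

A = np.random.randn(256, 128)
B = np.random.randn(128, 192)
print(np.allclose(tiled_matmul(A, B), A @ B))  # True
```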
I found two great resources on the topic:
[video] https://youtu.be/Q3GgbfGTnVc?si=14t9p...
[blog] penny-xu.github.io/blog/tiled-matrix-multiplicatio…
Enjoy!
9 months ago
Deep Learning with Yacine
I've been solving a few more deep learning leetcode problems these days on www.deep-ml.com and it's really well made.
When I discovered the platform 6 months ago, I thought it would kind of stay the same, but it has been improving quite rapidly.
Two of my favorite features right now are:
1. Notebook mode: you can spin up a notebook for cell-by-cell coding, which was useful in my last stream when I was struggling with SVD.
2. Problem breakdown: it breaks the big problem you are working on into mini sub-problems you can work on independently.
I'll do another live problem-solving stream this Thursday, so do come hang out!
9 months ago
Deep Learning with Yacine
The next paper walkthrough is this one: MiniMax-01: Scaling Foundation Models with Lightning Attention!
Almost done with the overview; here are a few interesting points:
1. It uses linear attention (Lightning Attention) optimized for GPU usage.
2. It hybridizes it cleverly with softmax attention to make sure retrieval performance stays high.
3. It describes a whole bunch of FlashAttention-like improvements to their infrastructure to get solid performance.
4. It is on par with the rest of the foundation models while being super performant on long-context inputs.
Here, by long context we are talking about 1M-plus tokens, which is quite bonkers, all while keeping efficiency high.
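For context, the basic trick behind linear attention is to replace the softmax with a feature map so the products can be reassociated and the n x n attention matrix is never materialized. Here's a minimal non-causal NumPy sketch of that generic idea, not the actual Lightning Attention kernel, which is the GPU-optimized version mentioned above:

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    # Softmax attention computes softmax(Q K^T) V, which is O(n^2) in sequence length.
    # Linear attention applies a positive feature map phi and reassociates the product:
    # phi(Q) (phi(K)^T V), so the n x n matrix is never formed.
    # This is the non-causal form; a causal version accumulates K^T V with a prefix scan.
    phi = lambda x: np.maximum(x, 0.0) + 1.0      # simple positive feature map
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                                 # (d_k, d_v), independent of n
    Z = Qp @ Kp.sum(axis=0)                       # per-query normalization term
    return (Qp @ KV) / (Z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 1024, 64
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(linear_attention(Q, K, V).shape)            # (1024, 64)
```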
Link to the paper:
arxiv.org/pdf/2501.08313
Should be out soon!
10 months ago
Deep Learning with Yacine
There are lots of papers I've been reading and catching up on these days, especially ones relating to attention.
Mamba, linear attention, Lightning Attention, FlashAttention-2...
I'm almost done with a full walkthrough of a very interesting transformer architecture that cleverly uses Lightning Attention to increase the context length massively.
Should be out soon!
10 months ago