Qualified for the CIUK 2025 Comp in Manchester!!
CIUK is an HPC competition where we will have to tune a cluster to certain specifications.
Pretty crazy given our team knew literally nothing about HPC a month ago.
Everything I've learned is attached below
What is InfiniBand?
It is a high-performance networking technology used mainly in HPC clusters.
How does it communicate?
RDMA -> Remote Direct Memory Access
This network card is crazy as it SKIPS the OS kernel, so you can write directly from RAM on one node to RAM on another.
Why does distance to memory matter?
It influences latency, and with it the bandwidth you actually get: the further data has to travel, the longer it takes to arrive, which can leave the CPU sitting idle.
It's part of why the L1 to L3 caches are so fast: they sit right next to the cores!
What is NUMA?
NUMA (Non-Uniform Memory Access) is a memory architecture used in multi-CPU systems where not all memory is equally close to every CPU
- Each CPU has its own local memory
- CPUs can also access remote memory attached to another CPU, but it's slower due to interconnect latency
What are cores?
A core is an independent processing unit within a CPU. Each core can fetch, decode, execute and retire its own sequence of instructions, effectively acting like a smaller processor within the main CPU chip.
What is the Stream Triad?
It is a vector calculation, a[i] = b[i] + s * c[i], that has a low arithmetic intensity.
This is crucial because it lets us isolate whether an HPC system is memory bound: with such a low arithmetic intensity, we know the kernel is not compute bound.
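To make "low arithmetic intensity" concrete, here is a minimal sketch of the Triad kernel in plain Python. It is only an illustration of the arithmetic, not a usable benchmark (a Python loop is limited by the interpreter, not by memory). The function names are mine, and the byte count assumes 8-byte doubles and ignores any extra traffic from write-allocate.

```python
from array import array


def triad(a, b, c, scalar):
    """STREAM Triad kernel: a[i] = b[i] + scalar * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_arithmetic_intensity(bytes_per_element=8):
    """FLOPs per byte moved for one Triad element.

    Per element: 1 multiply + 1 add = 2 FLOPs,
    and 2 loads (b, c) + 1 store (a) = 3 memory accesses.
    """
    flops = 2
    bytes_moved = 3 * bytes_per_element
    return flops / bytes_moved


if __name__ == "__main__":
    n = 8
    a = array("d", [0.0] * n)
    b = array("d", [1.0] * n)
    c = array("d", [2.0] * n)
    triad(a, b, c, 3.0)
    print("a =", list(a))
    print("arithmetic intensity (FLOP/byte):", triad_arithmetic_intensity())
```

Two FLOPs for every 24 bytes moved is tiny, so the memory system runs out of breath long before the floating-point units do.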
Why do we want to optimise for FLOP/s?
It lets us run our supercomputers faster, meaning we can train LLMs faster and run models and simulations faster. Anything arithmetically heavy can be done significantly faster.
How are FLOP/s capped?
1. Compute Bound -> The processor cannot execute the instructions fast enough
2. Memory Bandwidth Bound -> The CPU is sitting idle waiting for data to move to it
3. I/O Bound -> Latency is introduced in the system via read/write operations
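The first two caps can be sketched with the roofline model, which says attainable FLOP/s is the smaller of peak compute and memory bandwidth times arithmetic intensity. It does not cover the I/O cap. The peak and bandwidth numbers below are made up for illustration and are not from any real machine; the function names are mine too.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline estimate: min(peak compute, bandwidth * arithmetic intensity).

    peak_gflops   -- peak compute rate in GFLOP/s
    bandwidth_gbs -- memory bandwidth in GB/s
    intensity     -- arithmetic intensity of the kernel in FLOP/byte
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def limiting_factor(peak_gflops, bandwidth_gbs, intensity):
    """Return which cap the kernel hits first: 'memory' or 'compute'."""
    if bandwidth_gbs * intensity < peak_gflops:
        return "memory"
    return "compute"


if __name__ == "__main__":
    # Hypothetical machine, numbers chosen only for illustration.
    PEAK_GFLOPS = 1000.0
    BANDWIDTH_GBS = 100.0

    # Triad with 8-byte doubles: 2 FLOPs per 24 bytes moved.
    triad_intensity = 2 / 24

    print("attainable GFLOP/s:",
          attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, triad_intensity))
    print("bound by:",
          limiting_factor(PEAK_GFLOPS, BANDWIDTH_GBS, triad_intensity))
```

On a machine like this made-up one, a Triad-style kernel only ever gets a small fraction of peak compute, which is exactly why it works as a memory bandwidth test.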
What the fuck is Likwid-Bench?
This was my exact question at first. Likwid-Bench is a suite of microkernels designed to run a host of benchmarks on these HPC systems.
Why run these benchmarks?
FLOP/s (floating-point operations per second) is capped in 3 ways, and the benchmarks show which cap we are hitting.
What is HPC?
It refers to using supercomputers to perform extremely large or complex computations that would be too slow on a normal desktop or laptop.
How did we qualify?
We had to compete in 3 challenges. The one I did for my team involved optimising LIKWID and the STREAM Triad.
Writing all my tasks down for the UGC dashboard
(search -> https://t.co/6IuqA2gQ8K, WIP) makes coding that much more satisfying and almost gamifies it in a way
Supabase has served as the ideal database for managing the data for the backend of the UGC dashboard I am working on
(link to the dashboard landing page, WIP)
https://t.co/6IuqA2gQ8K
Only problem so far is an overwhelming number of tables to manage. Any ideas how to deal with this?
The unofficial TikTok API on GitHub is unbelievable for scraping data from TikTok
Likes, shares, comments, everything: it's all on there, and its implementation was key to the dashboard
Currently working on weekly reports for the UGC dashboard
This will contain all the weekly trends encapsulated into cards that can be clicked through
All I had to do was add timestamp columns to my Supabase tables so I could track data changes and hence calculate trends
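A rough sketch of how a weekly trend can fall out of timestamped rows. This is not the dashboard's actual code: the column names `recorded_at` and `likes` are hypothetical, the rows are assumed to be already fetched as plain dicts, and timestamps are assumed to be ISO strings without a timezone.

```python
from datetime import datetime, timedelta


def latest_before(snapshots, cutoff, metric):
    """Value of `metric` in the most recent snapshot taken at or before `cutoff`."""
    eligible = [
        s for s in snapshots
        if datetime.fromisoformat(s["recorded_at"]) <= cutoff
    ]
    if not eligible:
        return None
    newest = max(eligible, key=lambda s: datetime.fromisoformat(s["recorded_at"]))
    return newest[metric]


def weekly_trend(snapshots, metric, now):
    """Percent change in `metric` between now and 7 days earlier.

    Returns None when there is no earlier snapshot to compare against
    (or when the earlier value is 0, to avoid dividing by zero).
    """
    current = latest_before(snapshots, now, metric)
    previous = latest_before(snapshots, now - timedelta(days=7), metric)
    if current is None or not previous:
        return None
    return (current - previous) / previous * 100.0


if __name__ == "__main__":
    # Made-up rows, only for illustration.
    rows = [
        {"recorded_at": "2025-01-01T00:00:00", "likes": 100},
        {"recorded_at": "2025-01-08T00:00:00", "likes": 150},
    ]
    print(weekly_trend(rows, "likes", datetime(2025, 1, 8)))
```

Each card in a weekly report could then just show one of these percentages per metric.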
I am trying to become a better SWE and here is one thing I learnt
Stepping back and questioning the approach when implementing a feature will save days of work
Now instead of wasting days I only waste hours, so I'll take the improvement
I am never writing useless git commit messages again
I needed to revert to a previous commit and this is the garbage I have to look through 🤦‍♂️
what even is this....