@TheUltimator5 I decided to "play cycles" years ago. Didn't manage to sell high and buy low even once. It may be possible to call a bottom, but nothing I have seen helps with calling a top. It always crashed before reaching my sell levels.
> be data scientist
> boss gives 5TB of raw .txt files. (In interview always ask file format btw)
> "just do some word counts and searches, anon"
> MFW it would take 3 years to grep this on his t3 micro mini aws
> time to go full autist and achieve maximum throughput
The "Ascended" Data Pipeline
> Step 1: The Great Partitioning
> don't just "read" the files because byte io streams will cry
> use pyarrow.dataset to scan the disk
> fragment the raw text into 128MB Parquet chunks
> optimise for L3 caching and NVMe sequential read speed
> partition by date/source so the OS doesn't cry
> Or partition by random (non English words)
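A minimal sketch of the chunking half of Step 1, using only the standard library. It assumes the raw input is newline-delimited text; `iter_chunks` and its `target_bytes` parameter are hypothetical names, and writing each chunk out as Parquet is deliberately left out.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, target_bytes: int = 128 * 1024 * 1024) -> Iterator[bytes]:
    """Yield pieces of roughly target_bytes that always end on a line boundary."""
    while True:
        block = stream.read(target_bytes)
        if not block:
            return
        # Extend to the end of the current line so no line is split across chunks.
        if not block.endswith(b"\n"):
            block += stream.readline()
        yield block


# Tiny demo with a 4-byte target instead of 128MB.
demo = io.BytesIO(b"aa\nbbbb\ncc\n")
chunks = list(iter_chunks(demo, target_bytes=4))
```

Each yielded chunk can then be handed to whatever writer produces the partitioned files; chunks may overshoot the target by up to one line.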
> Step 2: The Information Theory Play
> take a 1% statistical sample of the whole 5TB
> build a global FSST (Fast Static Symbol Table) dictionary
> global dictionary for high-cardinality terms
> identify the common prefixes/suffixes for trie storage
> squeeze language entropy into 1-byte symbols
> A LOUDS Trie uses only 2.14 bits per node.
> You can then answer rank queries in O(1) with POPCNT
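The POPCNT remark refers to rank queries, which LOUDS navigation is built on. Below is a rough sketch of a rank structure only, not a full LOUDS trie: per-word prefix counts plus one popcount on the final word. `RankBitVector` is a hypothetical name, and `bin(x).count("1")` stands in for the hardware instruction.

```python
class RankBitVector:
    """Bit vector with constant-time rank1 via per-word prefix counts."""

    WORD = 64

    def __init__(self, bits):
        self.words = []
        self.prefix = [0]  # prefix[k] = number of set bits in words[0:k]
        for start in range(0, len(bits), self.WORD):
            word = 0
            for offset, bit in enumerate(bits[start:start + self.WORD]):
                if bit:
                    word |= 1 << offset
            self.words.append(word)
            self.prefix.append(self.prefix[-1] + bin(word).count("1"))

    def rank1(self, i):
        """Number of set bits in positions [0, i)."""
        word_index, offset = divmod(i, self.WORD)
        total = self.prefix[word_index]
        if offset:
            mask = (1 << offset) - 1  # keep only the low `offset` bits
            total += bin(self.words[word_index] & mask).count("1")
        return total


bits = [1, 0, 1, 1, 0] * 30
vector = RankBitVector(bits)
```

In a LOUDS trie, moving to a child or parent reduces to a couple of these rank calls (plus select), which is where the constant-time claim comes from.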
> Step 3: The Ingestion (The "Warm-up")
> spin up every CPU core you own
> pq.ParquetWriter with use_dictionary=True
> convert raw strings into integer keys pointing to the FSST Trie
> attach a Bloom Filter to every Row Group
> if the word "abcd" isn't in the filter, the CPU won't even look at the data
> key-based counting is easy, but we go one step further
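To make the Bloom filter idea concrete, here is a toy stand-in in plain Python. It is not Parquet's own filter format; `BloomFilter`, `build_row_group` and `count_word` are hypothetical names, and a "row group" is just a list of words.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, word):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.blake2b(word.encode("utf-8"), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + k * h2) % self.num_bits for k in range(self.num_hashes)]

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, word):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(word))


def build_row_group(words):
    bloom = BloomFilter()
    for word in words:
        bloom.add(word)
    return bloom, words


def count_word(row_groups, word):
    """Count `word`, skipping row groups whose filter rules it out."""
    total = scanned = 0
    for bloom, words in row_groups:
        if not bloom.might_contain(word):
            continue
        scanned += 1
        total += words.count(word)
    return total, scanned


groups = [build_row_group(["abcd", "foo", "abcd"]), build_row_group(["bar", "baz"])]
total, scanned = count_word(groups, "abcd")
```

A filter can give false positives but never false negatives, so skipping a group never changes the count; it only saves the scan.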
> Step 4: Memory Mapping (Arrow fs "No-Copy" Hack)
> set memory_map=True in the PyArrow Parquet reader
> OS kernel handles the data
> RAM usage stays at 500MB while processing TBs
> physical disk bytes are mapped directly to virtual address space
> ZERO. COPIES. MADE.
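The same memory-mapping idea can be tried with the standard library alone. This sketch uses Python's `mmap` module rather than PyArrow and assumes a non-empty file; `count_occurrences` is a hypothetical helper that counts non-overlapping matches.

```python
import mmap
import os
import tempfile


def count_occurrences(path, needle: bytes) -> int:
    """Count non-overlapping occurrences of needle without reading the file into a bytes object."""
    if not needle:
        raise ValueError("needle must be non-empty")
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; the OS pages data in on demand.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            count = 0
            pos = view.find(needle)
            while pos != -1:
                count += 1
                pos = view.find(needle, pos + len(needle))
            return count


# Demo on a small temporary file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"abcd x abcd\nabcd\n")
    demo_path = tmp.name
hits = count_occurrences(demo_path, b"abcd")
os.remove(demo_path)
```

Resident memory stays bounded by what the page cache keeps around, not by file size. Compressed Parquet pages still need decompressing into fresh buffers, so the zero-copy claim holds best for uncompressed data.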
> Step 5: More optimization: The GPGPU Option
> standard Python is for children
> pipe the Arrow Buffers directly to NVIDIA cuDF via GPUDirect Storage
> bypass the CPU and System RAM entirely
> NVMe -> PCIe -> VRAM
> Step 6: The Calculation (The "Sigma" Move)
> bit-sliced parallel search on the GPU
> 5,000+ CUDA cores performing integer comparisons on the FSST keys
> term lookup is O(1) in the dictionary, and the scan stays O(N) but over fixed-width ints instead of strings
> word counts happen at 40GB/sec
> finish the 5TB task before your coffee gets cold
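A CPU-only sketch of what "integer comparisons on the keys" amounts to, with no GPU involved. `encode` and `count_key` are hypothetical helpers, and a plain Python dict stands in for the FSST dictionary.

```python
from array import array


def encode(words):
    """Map each distinct word to a small integer key; return (dictionary, keys)."""
    dictionary = {}
    keys = array("I")  # compact unsigned-int column
    for word in words:
        keys.append(dictionary.setdefault(word, len(dictionary)))
    return dictionary, keys


def count_key(keys, key):
    """Count matches by comparing integers only; strings are never touched here."""
    return sum(1 for k in keys if k == key)


words = "the cat saw the dog and the bird".split()
dictionary, keys = encode(words)
the_count = count_key(keys, dictionary["the"])
```

Looking up the key for a term is one hash probe, while the count itself is still a linear pass over the key column; the win is that each comparison is a fixed-width integer test that parallelizes well. At the quoted 40GB/sec, 5TB works out to 5000 / 40 = 125 seconds.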
> Step 7: The CPU "SIMD" cope (the poor people pathway)
> if you're poor and don't have a GPU
> use AVX-512 bit-slicing in C++
> a 512-bit register holds 64 one-byte symbols, so compare 64 of them per instruction on one core
> still faster than 99% of "Big Data" engineers
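The lane-parallel trick can be imitated in pure Python by treating one big integer as a wide register with one byte per lane. This is a SWAR stand-in for illustration, not AVX-512; `count_byte_swar` is a hypothetical name.

```python
def count_byte_swar(block: bytes, target: int) -> int:
    """Count bytes equal to target, testing every byte lane with a few wide integer ops."""
    n = len(block)
    ones = int.from_bytes(b"\x01" * n, "little")  # 0x01 in every lane
    low7 = ones * 0x7F                            # 0x7F in every lane
    high = ones * 0x80                            # 0x80 in every lane
    # XOR turns matching lanes into zero bytes.
    x = int.from_bytes(block, "little") ^ (ones * target)
    # High bit of a lane ends up set iff that lane is non-zero; the add
    # never carries across lanes because 0x7F + 0x7F < 0x100.
    nonzero = ((x & low7) + low7) | x
    zero_lanes = ~nonzero & high
    return bin(zero_lanes).count("1")


sample = b"abcabc a" * 8  # 64 bytes, one "register" worth
matches = count_byte_swar(sample, ord("a"))
```

A real AVX-512 version would do the same thing with a byte-wise compare that produces a 64-bit mask, followed by a popcount on that mask.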
@Comedyorwat If the free streams are not enough, the not-financial-advice by his non-obnoxious multiple personalities in the Discord allows regular people to detect flaws in their thinking and get a feel for what to look for. Plus, there's animals. Bats, Sharks, Crabs, Butterflies ... Garlics
Kinda: The Compensation Committee has discretion to make “equitable and proportionate” adjustments to the market cap and EBITDA hurdles in the event of stock dividends, splits, recapitalizations, mergers, acquisitions, or other corporate transactions that affect the share count.
FULL INTERVIEW: @ryancohen explains his plan to acquire eBay.
He unpacks his pitch to institutional investors, why eBay is so horribly run, and how Ryan plans to create billion in shareholder value.
$GME $EBAY
@APompliano @ryancohen The recent interviews about his eBay plans seemed to imply a strong focus on the US market. Is it correct that international would play a secondary role and we might see cost cutting and even a retreat from some markets?
@AustinTobitt I don't think that's how it works. If they issue new shares, they end up with more cash ... not stocks. They can decide how many shares of the combined entity are issued and how they are distributed as part of a deal.
@grok Got to do something about the 9 to 5 as a default - exchanging 5 days per week for 2 days of freedom (which are needed to recover). But wait ... you have to work 24/7 😱. Who am I to complain🥹🍀