High-performance C++ hash table using grouped SIMD metadata scanning
https://github.com/Cranot/grouped-simd-hashtable
#HackerNews #HighPerformance #C++ #HashTable #SIMD #MetadataScanning #TechnologyOptimization #GitHub
SIMD City: Auto-Vectorisation
https://xania.org/202512/20-simd-city
#HackerNews #SIMD #City #Auto-Vectorisation #Vectorisation #Tech #News #Programming #Insights
Hey all! 👋🏻
I’m looking for a shader-like pipeline/#rendering system, library, or framework for 1-bit graphics with a 2x #framebuffer (double-buffered: current and previous) and #blitting done with #SIMD and #SWAR. CPU-only, mostly targeting ARM32/64/Thumb1.
I understand that such a thing is rare and probably doesn’t exist, so I just need some source-based guidance or hints on old-school/demoscene tricks and algorithms that I don’t know yet (I know a lot already, I’m 40)), and of course I can port things myself.
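For readers unfamiliar with the SWAR idea on packed 1-bit framebuffers, here is a minimal sketch in Rust. It assumes 64 pixels packed per `u64` word; the function name `diff_frames` and the buffer layout are hypothetical and not taken from any existing library. XORing the current frame against the previous one yields a dirty-pixel mask one word at a time, without any SIMD intrinsics.

```rust
/// XOR two 1-bit framebuffers (64 pixels per u64 word) into `out`.
/// A set bit in `out` marks a pixel that changed between frames.
/// Returns the total number of changed pixels.
/// Hypothetical helper for illustration only.
fn diff_frames(current: &[u64], previous: &[u64], out: &mut [u64]) -> u32 {
    let mut changed = 0;
    for ((o, &c), &p) in out.iter_mut().zip(current).zip(previous) {
        let d = c ^ p; // 64 pixels compared in one operation
        *o = d;
        changed += d.count_ones();
    }
    changed
}

fn main() {
    let current = [0b1011u64, 0];
    let previous = [0b0001u64, u64::MAX];
    let mut dirty = [0u64; 2];
    let changed = diff_frames(&current, &previous, &mut dirty);
    println!("changed pixels: {changed}");
}
```

The same loop shape works with `u32` words on ARM32/Thumb targets, and words whose mask is zero can be skipped entirely when blitting.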
The state of SIMD in Rust in 2025
https://shnatsel.medium.com/the-state-of-simd-in-rust-in-2025-32c263e5f53d
#HackerNews #SIMD #Rust #2025 #RustProgramming #Technology #Trends #FutureDevelopment
A story about never ever giving up...❤️🔥
After several weeks of questioning my life choices, I've finally figured out why my #Whisper #SpeechToText system had been so slow on #Windows:
Apparently the #Rust-FFI wrapped #CPlusPlus code (Whisper.cpp) wasn't compiled with AVX and AVX2 enabled ( #SIMD!). I've tried it on two Windows machines (both AVX-capable). On one of those machines, under #Linux, the build did detect AVX/AVX2 and ran fast.
1/?
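A quick way to see the gap between what a CPU supports and what a binary was built for, sketched in Rust. The helper name `avx2_status` is hypothetical. Note that this only reports the target features of the Rust crate itself; it says nothing about the flags the wrapped C++ code was compiled with, which have to be checked in that build separately.

```rust
/// Returns (enabled_at_compile_time, supported_by_cpu) for AVX2.
/// Hypothetical helper for illustration only.
fn avx2_status() -> (bool, bool) {
    // True only if the crate was built with AVX2 enabled,
    // e.g. via `-C target-feature=+avx2` or `-C target-cpu=native`.
    let compiled = cfg!(target_feature = "avx2");

    // Runtime CPU check; the macro only exists on x86/x86_64.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    let runtime = std::is_x86_feature_detected!("avx2");
    #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
    let runtime = false;

    (compiled, runtime)
}

fn main() {
    let (compiled, runtime) = avx2_status();
    println!("AVX2 enabled at compile time: {compiled}");
    println!("AVX2 supported by this CPU:   {runtime}");
    if runtime && !compiled {
        println!("CPU supports AVX2, but this binary was built without it.");
    }
}
```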
Hmm... 🤔
My suspicion about why it's "not working":
Even though I run `cargo run --release`, I noticed while investigating the compile failures above that the artifacts end up in a `Debug` folder.
So it might be that the program (Whisper.cpp, to be precise) runs as a debug build and is just _terribly_ slow. 🐌
Oh boy, the struggle continues... 🤸
This might be related:
https://codeberg.org/tazz4843/whisper-rs/issues/226
I decided to share my Arm NEON optimizations for the FFmpeg Cinepak encoder. On Apple Silicon / RPI / NEON 32/64-bit, it gets a 250-300% speedup for encoding: