Hacker News
skidrow on July 11, 2024 | on: Beating NumPy matrix multiplication in 150 lines o...
SIMD intrinsics and manually unrolled loops are still needed for peak performance. That's why high-performance BLAS libraries vectorize and unroll their kernels by hand: even modern compilers can't reliably auto-vectorize and unroll every hot loop.
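To make that concrete, here is a minimal sketch in plain C (no intrinsics) of what manual unrolling buys you in a dot product. The function names `dot_naive` and `dot_unrolled4` are made up for illustration and do not come from any BLAS library. In the naive version every add depends on the previous one, and under strict IEEE floating-point semantics compilers such as GCC and Clang typically won't reorder that reduction unless you pass something like `-ffast-math`. Splitting the sum across independent accumulators by hand removes that dependency chain, at the cost of a slightly different rounding order.

```c
#include <stddef.h>

/* Straightforward version: a single accumulator, so each add
 * has to wait for the one before it. */
float dot_naive(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Hand-unrolled by 4 with independent accumulators (hypothetical
 * helper for illustration). The four partial sums do not depend on
 * each other, which gives the compiler and the CPU room to overlap
 * or vectorize them. */
float dot_unrolled4(const float *a, const float *b, size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    /* Combine the partial sums. */
    float sum = (s0 + s1) + (s2 + s3);

    /* Tail loop: handle the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Both functions compute the same mathematical result, but because floating-point addition is not associative the two can differ in the last bits on real data. Real BLAS kernels go much further (SIMD intrinsics or assembly, register blocking, cache tiling); this only shows the basic idea.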