Copying in a response to a similar question on Reddit:
I should really add some discussion around BLAS in particular, which has a good implementation[0] of the float32 dot product that outperforms any of the float32 implementations in the blog post. I'm getting ~1.9m vecs/s on my benchmarking rig.
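For reference, a minimal sketch of calling gonum's blas32.Dot (which I take to be the function linked at [0]) on two float32 vectors, with made-up example data:

```go
package main

import (
	"fmt"

	"gonum.org/v1/gonum/blas/blas32"
)

func main() {
	// Two example float32 vectors; Inc is the stride between elements.
	x := blas32.Vector{N: 4, Inc: 1, Data: []float32{1, 2, 3, 4}}
	y := blas32.Vector{N: 4, Inc: 1, Data: []float32{5, 6, 7, 8}}

	// Dot computes sum_i x[i]*y[i] via the underlying BLAS routine.
	fmt.Println(blas32.Dot(x, y)) // 70
}
```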
However, BLAS became unusable for us as soon as we switched to quantized vectors, because there is no int8 implementation of the dot product in BLAS (though I'd love to be proven wrong).
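Without an int8 BLAS routine, the fallback is a plain scalar loop. A minimal Go sketch (not our production code) of an int8 dot product that accumulates into int32 to avoid overflow:

```go
package main

import "fmt"

// dotInt8 is a straightforward scalar int8 dot product. The products are
// accumulated in int32 so they can't overflow the int8 range.
func dotInt8(a, b []int8) int32 {
	var sum int32
	for i := range a {
		sum += int32(a[i]) * int32(b[i])
	}
	return sum
}

func main() {
	a := []int8{1, -2, 3, 4}
	b := []int8{5, 6, -7, 8}
	fmt.Println(dotInt8(a, b)) // 1*5 + (-2)*6 + 3*(-7) + 4*8 = 4
}
```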
[0]: https://pkg.go.dev/gonum.org/v1/gonum@v0.14.0/blas/blas32#Do...