It was BS, considering countless other games have no problem with sound. Decoding something like Opus takes ~30 MHz of a single CPU core[1], meaning even an unreasonable situation of decoding 16 simultaneous uninterrupted 128 kbit/s stereo streams would only eat half of one core.
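As a rough sanity check of that estimate, here is a back-of-the-envelope sketch. The ~30 MHz per stream and the 16 streams come from the comment; the core clocks are hypothetical examples, and treating "MHz" as a linear, architecture-independent budget is a simplifying assumption.

```python
MHZ_PER_STREAM = 30  # the comment's estimate for one 128 kbit/s stereo Opus stream
STREAMS = 16


def core_fraction(core_mhz, streams=STREAMS, mhz_per_stream=MHZ_PER_STREAM):
    """Fraction of one core used, assuming cost scales linearly with stream count."""
    return streams * mhz_per_stream / core_mhz


total_mhz = STREAMS * MHZ_PER_STREAM  # 480 MHz worth of cycles in total

if __name__ == "__main__":
    print(f"total budget: {total_mhz} MHz")
    for core_mhz in (1000, 4000):  # hypothetical core clocks, not from the comment
        print(f"{core_mhz} MHz core: {core_fraction(core_mhz):.0%} of one core")
```

So "half of one core" corresponds to a core of roughly 1 GHz; on a faster core the share would be smaller still.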
At the very moment the person in charge says "ok this works, now make it not slow". Python is the modern-age BASIC: easy to write and good for prototypes, scripting, gluing libraries together, and fast iteration. If you want performance and heavy data processing, anything else will be better: PHP, Java, even JavaScript.
For example, Python struggles to reach real-time performance decoding RLL/MFM data off ancient 40-year-old hard drives (https://github.com/raszpl/sigrok-disk). With a 4 GHz CPU I can't break 500 KB/s in a simple loop:
To optimize that code snippet, use temporary local variables instead of member lookups to avoid slow getattr and setattr calls. It still won't beat a compiled language; number crunching is the worst sport for Python.
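A minimal sketch of that advice, for illustration only. `ShiftRegister`, `feed_attrs`, `feed_locals` and `run` are hypothetical names, not taken from the sigrok-disk code; the idea is simply to copy state into locals once, loop, and write it back once. Any speedup depends on the interpreter version and the workload, so no timings are claimed here.

```python
import timeit


class ShiftRegister:
    """Hypothetical stand-in for a decoder's bit shifter (not the sigrok-disk code)."""

    __slots__ = ("shift", "ones")

    def __init__(self):
        self.shift = 0
        self.ones = 0

    def feed_attrs(self, bits):
        # Every iteration reads and writes instance attributes.
        for bit in bits:
            self.shift = ((self.shift << 1) | bit) & 0xFFFF
            self.ones += bit

    def feed_locals(self, bits):
        # Copy state into locals once, loop on the locals, write back once.
        shift, ones = self.shift, self.ones
        for bit in bits:
            shift = ((shift << 1) | bit) & 0xFFFF
            ones += bit
        self.shift, self.ones = shift, ones


def run(method_name, bits):
    """Feed bits through a fresh register and return its final state."""
    reg = ShiftRegister()
    getattr(reg, method_name)(bits)
    return reg.shift, reg.ones


if __name__ == "__main__":
    bits = [1, 0, 0, 1, 0, 1, 0, 0] * 125_000  # 1 million synthetic bits
    for name in ("feed_attrs", "feed_locals"):
        seconds = timeit.timeit(lambda: run(name, bits), number=1)
        print(f"{name}: {seconds:.3f} s")
```

Both methods leave the register in the same state, so the local-variable version is a drop-in replacement; whether it is measurably faster is something to benchmark on the target Python.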
Which is why, in practice, with Python you pay the cost of moving your data into a native module (numpy/pandas/polars), do all your number crunching over there, and then pull the result back.
Not saying it's ideal but it's a solved problem and Python is eating good in terms of quality dataframe libraries.
All those class variables are already in __slots__, so in theory it shouldn't matter. Your advice is good
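# Consume the next 16 bits and keep only the even-position bits.
self.shift_index -= 16
shift_byte = (self.shift >> self.shift_index) & 0x5555
# SWAR compaction: fold the 8 surviving bits together into the low byte.
shift_byte = (shift_byte + (shift_byte >> 1)) & 0x3333
shift_byte = (shift_byte + (shift_byte >> 2)) & 0x0F0F
self.shift_byte = (shift_byte + (shift_byte >> 4)) & 0x00FF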
but only for exactly 2-4 milliseconds per 1 million pulses :) Declaring a local variable in a tight loop forces Python into a cycle of memory allocations and garbage collection, negating potential gains :(
SWAR : 0.288 seconds -> 0.33 MiB/s
SWAR local : 0.284 seconds -> 0.33 MiB/s
This whole snippet is maybe what, 50-100 x86 opcodes? Native code runs at >100 MB/s while Python 3.14 struggles around 300 KB/s. Python 3.4 (Sigrok hardcoded requirement) is even worse.
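For anyone wondering what the SWAR lines in that snippet actually compute: as far as I can tell they pack the eight even-position bits of a 16-bit word into one byte. Below is a standalone sketch checked against a naive bit-by-bit loop. The function names are made up for illustration, and the real decoder's alignment and state handling are not reproduced.

```python
def compact_even_bits(word):
    """Pack the 8 even-position bits of a 16-bit word into one byte (SWAR)."""
    x = word & 0x5555               # keep bits 0, 2, 4, ..., 14
    x = (x + (x >> 1)) & 0x3333     # gather neighbours into 2-bit groups
    x = (x + (x >> 2)) & 0x0F0F     # gather 2-bit groups into nibbles
    return (x + (x >> 4)) & 0x00FF  # gather the two nibbles into one byte


def compact_even_bits_naive(word):
    """Reference version: move bit 2*i of the input to bit i of the output."""
    out = 0
    for i in range(8):
        out |= ((word >> (2 * i)) & 1) << i
    return out


if __name__ == "__main__":
    # Exhaustive check over every 16-bit input.
    assert all(
        compact_even_bits(w) == compact_even_bits_naive(w) for w in range(1 << 16)
    )
    # 0x4489 is the usual MFM sync word; its even bits spell 0xA1.
    print(hex(compact_even_bits(0x4489)))
```

The additions never carry because the shifted and unshifted bits do not overlap after each mask, so `+` behaves like `|` here.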
This is a solved problem nowadays. Pretty much every PCB package produces 3D models you can plug into your existing CAD/CAM product design infrastructure.
Pinouts... there is a reason we try to get all pinouts tested as early as possible, preferably on the first non-form-factor prototype spin if we can. In no event should key pinouts be first assigned, or major changes made, without a planned spin in the schedule following them.
[1] iPod Classic (1998 era ARM9) decodes 128 kbps stereo Opus at ~150% real time at stock CPU frequency. Opus is not the lightest choice either: https://www.rockbox.org/wiki/CodecPerformanceComparison#ARM