In some applications, e.g. modelling something defined by a PDE, you might need to solve a very large linear system of equations Ax = b, where the vector b has 100,000,000 elements and A is a 100,000,000 by 100,000,000 matrix. A naive dense representation of A using 1 byte per entry would therefore need around 10^16 bytes (roughly 9 pebibytes) of storage. In practice, for many interesting systems of equations that encode "local" properties, A will be a sparse matrix where the overwhelming majority of entries are zero. There are various sparse matrix formats designed to avoid storing zeroes: the simplest is COO, where you store a triple (i, j, A_ij) for each nonzero entry A_ij of A. More efficient formats are CSR and CSC (compressed sparse row and column, respectively), which use even less memory by replacing the per-entry row or column indices with one pointer per row or column rather than repeating indices for every nonzero entry.
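To make the formats concrete, here is a minimal sketch using scipy.sparse (my choice of library for illustration, not something the above assumes), with a tiny 4x4 matrix standing in for the huge PDE operator:

```python
# Minimal sketch of the COO and CSR formats using scipy.sparse.
import numpy as np
import scipy.sparse as sp

# COO: three parallel arrays of (row, column, value) triples, one per nonzero.
rows = np.array([0, 1, 2, 2, 3])
cols = np.array([0, 1, 1, 2, 3])
vals = np.array([4.0, 3.0, -1.0, 5.0, 2.0])
A_coo = sp.coo_matrix((vals, (rows, cols)), shape=(4, 4))

# CSR: keeps the column indices and values, but replaces the per-entry row
# indices with one cumulative "row pointer" per row.
A_csr = A_coo.tocsr()

print(A_coo.toarray())             # dense view, only sensible for tiny examples
print(A_csr.indptr)                # row pointers: [0 1 2 4 5]
print(A_csr.indices, A_csr.data)   # column indices and nonzero values
```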
To solve one of these huge linear systems Ax = b, we hardly ever want [1] to go to the expense of computing and storing an explicit matrix representation of the inverse A^{-1}. Instead, we merely need to have some function f that is able to map an input vector b to some output vector f(b) = x that happens to satisfy Ax = b.
If the matrix A has sufficiently nice properties [2], we can use an efficient iterative method such as the conjugate gradient method to map the input vector b to an output vector x by solving Ax = b. To run the conjugate gradient method, we don't necessarily need an explicit dense or sparse representation of the matrix A; we merely need a function that encodes how A acts on a vector by matrix multiplication, e.g. some function g that maps an input vector x to the output vector g(x) = Ax.
So if we've got some appropriately defined conjugate gradient algorithm cg(g, b), we can define the function f(b) := cg(g, b).

Calling f(b) will return an x that satisfies g(x) = b, i.e. f is the inverse of g. So now we're expressing linear algebra without any explicit matrices anywhere: but our functions g and f are of course both linear operators that encode the action of matrix multiplication by A and A^{-1} respectively.
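Here is a minimal matrix-free sketch of the whole idea in Python, using scipy's LinearOperator and cg (my choice of tooling, not something the above prescribes). The operator is a 1D Laplacian-style tridiag(-1, 2, -1), which is symmetric positive definite, and A is never formed:

```python
# Matrix-free conjugate gradient: g applies A = tridiag(-1, 2, -1) (a "local"
# 1D Laplacian-style operator) without ever building A; cg only needs g.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 1000

def g(x):
    """Return Ax for A = tridiag(-1, 2, -1), computed entrywise."""
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

A_op = LinearOperator((n, n), matvec=g, dtype=np.float64)

def f(b):
    """f(b) := cg(g, b), i.e. the action of A^{-1} on b, computed iteratively."""
    x, info = cg(A_op, b)  # info == 0 on convergence
    return x

b = np.random.default_rng(0).standard_normal(n)
x = f(b)
print(np.linalg.norm(g(x) - b))  # small residual: g(f(b)) ≈ b
```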
[1] Why not? It's a lot more effort to compute A^{-1}, which could then be used to evaluate A^{-1}b for any given b, but often we only want to know A^{-1}b for one specific b. Also, if the matrix A encodes some local property of a physical system, then A^{-1} will encode a global property, so it will generally need to be a dense matrix.

[2] Sufficiently nice properties: e.g. if A is symmetric positive definite, which arises naturally in a bunch of situations. See e.g. https://en.m.wikipedia.org/wiki/Conjugate_gradient_method
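To make the last point of [1] concrete, here's a tiny numpy check (my own illustration): a sparse, local tridiagonal operator whose inverse is completely dense.

```python
# A sparse "local" operator (tridiagonal: each row couples only its neighbours)
# whose inverse couples every entry to every other entry, i.e. is fully dense.
import numpy as np

n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # tridiag(-1, 2, -1)
A_inv = np.linalg.inv(A)

print(np.count_nonzero(A))      # 3n - 2 = 22 nonzeros: sparse
print(np.count_nonzero(A_inv))  # n^2 = 64 nonzeros: dense
```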
For the more machine learning-minded folks, this comes up almost everywhere in inference with exact Gaussian processes (GPs): because of the non-parametric nature of a GP model, the covariance matrix grows with the number of data points, and the exact inference routine is cubic in that number. Hence, sparsity in the posited covariance matrix is _extremely_ important for fast inference.
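As a rough numpy sketch of where that cubic cost comes from (my own illustration with an RBF kernel, not anything specific to the comment above): the exact GP posterior mean requires a factorisation and solve against the full n x n covariance matrix.

```python
# Exact GP posterior mean: build the n x n covariance (O(n^2) memory) and
# factor it (O(n^3) time). This is the step that motivates sparsity/approximation.
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(0, 10, size=n)                 # training inputs
y = np.sin(X) + 0.1 * rng.standard_normal(n)   # noisy training targets
X_star = np.linspace(0, 10, 50)                # test inputs

K = rbf_kernel(X, X) + 1e-2 * np.eye(n)        # n x n covariance + noise
L = np.linalg.cholesky(K)                      # O(n^3) factorisation
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean_star = rbf_kernel(X_star, X) @ alpha      # posterior mean at test points
print(mean_star[:5])
```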
And what happens if the huge matrix's elements are indeed all non-zero? Like, let's say, satellite scan data for a given country when you want to spy on their underground systems (think North Korea facilities)? Wouldn't storing that data as COO actually triple the amount of memory?
Presumably the designer(s) of such a system will know in advance whether the resulting matrix will be sparse or not and choose their encoding appropriately. FWIW, for a lot of practical applications the raw sensor data would be non-sparse, but it would be transformed/filtered almost immediately into a more space-efficient representation. For example, in the case you mentioned (unless you think the entire country is completely riddled with tunnels), most of the raw data can be deleted immediately with no loss of signal, since there is no underground system underneath that particular spot of land.
That's my point. In order to analyze it, you need to preserve the entire data set. One system does the image acquisition; another, more advanced system does the analysis. The advanced system needs all the data, so discarding it at the entry point defeats its purpose. No matter how you flip it, there are practical applications where huge matrices with mostly non-zero elements exist.
Here is another example: cryptanalysis. Usually the password length is small, below 1k characters. But if you increase the key length, then simple old ciphers, like the Vigenère cipher, become harder to crack. Bump the key length to over 1 million bytes and suddenly your century-old methods, like the index of coincidence and Kasiski examination, become a problem of feeding huge matrices to the analyzer. A key length of 1 million suddenly generates a 256M x 256M matrix for the analyzer, and all elements in that matrix are non-zero.