
Purist test: aren't vectors just Nx1 matrices?

Joking aside, matrices are linear maps in the context of multiplication. They can absolutely have meanings assigned outside the usual "when multiplied by a vector" type of calculations. For example the "adjacency matrix" representation of a graph is very often never multiplied by a vector even in academic algorithm descriptions. YMMV if you call it a (lookup) table in that context but the term "matrix" seems fairly well established there.



> They can absolutely have meanings assigned outside the usual "when multiplied by a vector" type of calculations. For example the "adjacency matrix" representation of a graph is very often never multiplied by a vector even in academic algorithm descriptions.

Not disputing your basic point, but I'm not sure "very often never" is entirely accurate. In the GraphBLAS API, vector/matrix and matrix/vector multiplication is key to almost all algorithms that depend on breadth-first search: a matrix/vector multiplication performs a single BFS step, which is the core operation of the GraphBLAS. The vector is typically used to hold the current frontier (and mask). See the "Shortest Path Lengths" example here:

https://github.com/michelp/pygraphblas/blob/master/pygraphbl...

(disclosure, I am the pygraphblas author).
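
For a rough sketch of that pattern in plain numpy (a toy graph made up for illustration; the real example in the link uses the pygraphblas types and masks):

  import numpy as np
  A = np.array([[0, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 0, 1],
                [1, 0, 0, 0]])        # A[i, j] = 1 if there is an edge i -> j
  frontier = np.array([1, 0, 0, 0])   # current frontier: start the BFS at vertex 0
  visited = frontier.copy()
  level = 0
  while frontier.any():
      level += 1
      # one BFS step: vector-matrix product, masked by the visited set
      frontier = (((frontier @ A) > 0) & (visited == 0)).astype(int)
      visited |= frontier
      if frontier.any():
          print("level", level, "reaches vertices", np.flatnonzero(frontier))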

> Purist test: aren't vectors just Nx1 matrices?

I like it. :) As an interesting point, the GraphBLAS defines both Matrix and Vector types, but internally in the SuiteSparse implementation, they are all just unified into one type and yes, vectors are Nx1 or 1xN.


Wow, thanks for that example. I don't think I've seen that trick before; now I want to do more with graphs again.


I implemented flood fill for Adobe as an adapter on top of BFS (from the BGL), which was, itself, an adapter on top of BLAS (written in FORTRAN). It was somewhat faster than the handcoded assembly BFS they were using at the time. The compile times were killer, though.


> the "adjacency matrix" representation of a graph is very often never multiplied by a vector even in academic algorithm descriptions.

Not true. There is a whole field (graph signal processing) that studies graphs in terms of their description in matrix form (adjacency, Laplacian and more). For example, you can find the connected components by looking at the eigenvectors of the graph Laplacian: the multiplicity of its zero eigenvalue is the number of components.
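
A minimal sketch of that with numpy (toy graph made up for illustration):

  import numpy as np
  A = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)  # two disjoint edges
  L = np.diag(A.sum(axis=1)) - A              # graph Laplacian, L = D - A
  eigvals = np.linalg.eigvalsh(L)             # L is symmetric, so eigvalsh
  print(int(np.sum(np.isclose(eigvals, 0))))  # multiplicity of eigenvalue 0 -> 2 components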


> For example the "adjacency matrix" representation of a graph is very often never multiplied by a vector even in academic algorithm descriptions.

Unless you're doing spectral graph theory, which looks at properties of the eigenvalues and eigenvectors of the graph's adjacency matrix. Even then, you don't usually multiply the adjacency matrix by a vector explicitly.


Today I learned! I'm fairly familiar with "regular" spectral analysis techniques, i.e. FFTs and DCTs, with some CWT waveletty goodness thrown in here and there, but I'd never heard of spectral graph analysis. That's what you get for staying in EE land too long, I guess.


Fourier & Friends are excellent analytical tools for studying the nature of patterns, regardless of where those patterns occur! In EE-land we think too much about real and complex numbers, where Fourier analysis is more concrete. But spectral graph theory makes a lot of sense when you think of Fourier analysis as a way of decomposing patterns into their constituent parts; it just looks a bit funny when those patterns aren't numbers.


Indeed. By using tools of linear algebra on the adjacency matrix of a graph, you can tell a lot about the structure of the graph.

For instance, if A is the adjacency matrix of a graph G with n vertices (we say 'vertices' in the math department almost exclusively -- rarely do we say 'nodes') and k edges, and we denote the trace of A by Tr(A), then the number of triangles in G is Tr(A^3)/6. We can also tell whether G is connected by looking at powers of A. [0]

OTOH, the spectrum doesn't tell you nearly everything you'd want to know about a graph. There are many examples of non-isomorphic graphs that are nonetheless cospectral. For instance, one of my professors from grad school wrote a landmark paper in spectral graph theory in 1973, showing that almost all trees are cospectral. [1]

---

[0]: Of course, you'd probably never actually do this, because it's horribly inefficient, but here's how it goes.

First, you prove easily by induction that (A^c)_{i,j} is the number of walks of length c from vertex i to vertex j (walks, not paths, since vertices may repeat). With that in mind, if some row of (I + A)^c has no zero entries for some c <= n, then G is connected; using I + A rather than A avoids parity issues on bipartite graphs, and in fact (I + A)^(n-1) has all entries positive exactly when G is connected. With a little more work, you can even extract all the connected components of G by looking at submatrices of these powers.

[1]: Unsurprisingly, the paper is titled "Almost all trees are cospectral". See: https://www.semanticscholar.org/paper/Almost-all-trees-are-c...
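
A quick numpy check of the triangle count and the connectivity criterion above (tiny made-up graph; you'd never do this at scale because the matrix powers are horribly expensive):

  import numpy as np
  # a triangle on vertices 0, 1, 2 plus a pendant vertex 3 hanging off vertex 2
  A = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
  n = A.shape[0]
  print(np.trace(np.linalg.matrix_power(A, 3)) // 6)   # Tr(A^3)/6 -> 1 triangle
  # (I + A)^(n-1) has no zero entries exactly when G is connected
  reach = np.linalg.matrix_power(np.eye(n, dtype=int) + A, n - 1)
  print(bool(np.all(reach > 0)))                        # -> True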


> Purist test

The short answer is: yes, but if your PL treats it that way you will wind up in trouble. [0]

The long answer is: https://www.youtube.com/watch?v=C2RO34b_oPM (but seriously take the time to watch this, it's fantastic).

[0] Separately, C, and often C-derived languages/frameworks like numpy, also treat them as 1xN matrices, but that's a nearly tabs-vs-spaces-level religious war, with actual consequences. I think a lot of people got confused when translating Andrew Ng's early Octave-based ML Coursera course (Nx1) to Python (1xN), and also in the back and forth between math ML papers (which use Nx1) and Python.


> numpy also treat them [vectors] as 1xN matrices

I may be misunderstanding something, but in numpy at least a vector, a 1×N array ("row vector"), and an N×1 array ("column vector") behave differently.

  import numpy
  vec = numpy.array([1, 2, 3])
  assert vec.shape == (3,)
  assert vec.ndim == 1
  row = numpy.array([[1, 2, 3]])
  assert row.shape == (1, 3)
  assert row.ndim == 2
  col = numpy.array([[1], [2], [3]])
  assert col.shape == (3, 1)
  assert col.ndim == 2
And a vector, unlike a 1×N or N×1 array, can be multiplied on either side of any matrix of compatible dimensions:

  assert all(vec @ numpy.eye(3) == numpy.array([1, 2, 3]))
  assert all(numpy.eye(3) @ vec == numpy.array([1, 2, 3]))
Matlab / Octave, in contrast, forces the user to pick a 1×N or N×1 matrix to represent a vector. Its arrays are always ≥ 2D. So you're probably still right that people get confused translating this to Python. Just wanted to clarify the numpy side of things.

The JuliaCon talk you linked does a great job of explaining what different fields mean by "vector"; thanks for linking it.


I just tried this and I stand corrected, but I remember this (the last clause) not working in numpy. It's possible my memory is flawed:

    >>> import numpy as np
    >>> vec = np.array([1, 2, 3])
    >>> mtx = np.eye(3)
    >>> np.dot(vec, mtx)
    array([1., 2., 3.])
    >>> np.dot(mtx, vec)
    array([1., 2., 3.])
It's possible, though, that I was remembering not that it doesn't execute, but rather that if you do mess up the multiplication order with anything besides trivial diagonal matrices, you won't get a type error; you'll get an incorrect but valid result. That is arguably worse than a type error, and way more frustrating, especially if you start by testing eye, which is the easiest thing to do.
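
For instance, with a non-symmetric matrix both orders run fine but give different answers (toy values, in the same session style):

    >>> vec = np.array([1, 2, 3])
    >>> M = np.array([[1, 2, 3], [0, 1, 0], [0, 0, 1]])
    >>> vec @ M          # say this is the product you meant
    array([1, 4, 6])
    >>> M @ vec          # wrong order: no error, just a silently different result
    array([14,  2,  3])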


I remember a problem similar to the one you describe when using a dot product both to substitute for a fast loop and for its normal mathematical purpose, in the same operation:

  points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]])
  M = np.array([[0.866, 0.5, 0], [-0.5, 0.866, 0], [0, 0, 1]])  # rotation
  points @ M  # works
  M @ points  # raises ValueError, also a different transform
Regarding messing up the multiplication order, numpy by itself doesn't have the information to raise a relevant type error, sadly. As you probably know, based on your other comments, the important thing is whether the adjacent basis vectors (dimensions) match, not how they are written. That is, given M = M_ij (a_i ⊗ b_j), a vector v_i a_i must be multiplied on the left, a vector v_i b_i must be multiplied on the right, and a vector v_i c_i (with basis vector c) shouldn't be used at all. Numpy just does array math; it doesn't keep track of dimensions. Hopefully the code has accurate comments, or you're using a language with a decent type checker.

However, I think you can track dimensional compatibility using xarray:

  import xarray as xr
  import numpy as np
  points = xr.DataArray([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dims=("idx", "a"))
  M = xr.DataArray([[0.154, 0.988, 0], [-0.988, 0.154, 0], [0, 0, 1]], dims=("a", "b"))  # rotation
  points.dot(M)  # works
  M.dot(points)  # also works
  # M's a-dim is matched to point's a-dim, so both
  # results have same values regardless of
  # multiplication order, just transposed.
  assert np.all(points.dot(M).T == M.dot(points))
I only just tried xarray in response to your post, so not sure if there are pitfalls in practice, or if this is the best way to use it.


Well, I was doing graphics transformations on affine vectors, so they were all 4x4 matrices and 4x1 vectors... You can't "know you are in the wrong order by checking dimensions" in that case.


Besides needing to change the order of multiplication (y = xM vs y = Mx) and use the right transposition of the matrix, do any other actual consequences come from the choice of "row vector" vs "column vector"?


Mathematically speaking, no. They are isomorphic duals. But ergonomics are important to prevent errors. I personally think Matrix-in-front is nice because it evokes the idea that the matrix is a function acting on the vector.
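
In numpy terms the two conventions are related by nothing more than a transpose (toy values for illustration):

  import numpy as np
  M = np.array([[0., 1.], [2., 3.]])
  x = np.array([5., 7.])
  # column-vector convention y = Mx vs row-vector convention y = x M^T
  assert np.allclose(M @ x, x @ M.T)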

In the real world, when you start doing matrix-vector products, if you aren't careful with row-major vs column-major storage you can introduce performance regressions. Let's say the architecture can do a 256-bit memory move into 4xf64 SIMD registers: you have to think carefully about what is going where before proceeding. Some architectures might be able to do an arbitrary stride so that those f64s can be nonadjacent in a simultaneous move. I suspect not handling this correctly is part of what holds back really good fp8 and fp16 performance, and I suspect, for example, that TPUs struggle with this.

Of course, optimizing for matrix-vector can be different from optimizing for matrix-matrix, since the column-vs-row memory access preference for the front matrix is exactly the opposite of that for the back matrix.
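
One way to poke at the storage side of this in numpy (a sketch; same values stored two ways, only the memory traversal differs):

  import numpy as np
  A_c = np.random.rand(2048, 2048)   # row-major (C order), numpy's default
  A_f = np.asfortranarray(A_c)       # same values, column-major storage
  x = np.random.rand(2048)
  # In the naive algorithm, y = A @ x walks rows of A and y = x @ A walks
  # columns, so which layout is cache-friendly depends on which product
  # you actually compute; the results are identical either way.
  assert np.allclose(A_c @ x, A_f @ x)
  assert np.allclose(x @ A_c, x @ A_f)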


> aren't vectors just Nx1 matrices?

It's been a while since I formally studied maths, but - can you describe a situation in which this is _not_ true? (modulo transposition)


In an N dimensional vector space, yes, you can always choose a basis and write vectors down that way, but it turns out to often be a bad way to work. For example, picture a surface. At any point on the surface, there is a plane of vectors tangent to that surface. It's unnatural to describe the vectors there in terms of a 2x1 matrix, since that requires the choice of a basis of the tangent space. And if you try to choose a basis of the tangent space, you'd probably want it to vary continuously from one point on the surface to the next, but that's not possible to do on many surfaces, such as a sphere, thanks to the Hairy Ball Theorem. [1]

[1] https://en.wikipedia.org/wiki/Hairy_ball_theorem


So, I would say that there is a canonical isomorphism between them but that they are not quite the same.

So, to take a step back, I'd say that a covector is a linear function from a vector input to a scalar output (those terms themselves being somewhat fungible; the space of "vectors" probably needs to form a module over the "scalars", which must themselves be a ring. The issue is that, as a physicist, I would include scalar fields, but two nonzero fields can multiply to give the zero field, so they do not obey the field axioms).

Then I would say that an [m n]-tensor is technically a multilinear function from m covectors and n vectors to a scalar. This means that a [0 1]-tensor is really exactly a covector, but a [1 0]-tensor is not really exactly a vector. One can manifestly be constructed from any vector v (take any covector c and produce the scalar c(v)), and we generally have some criteria, which I forget, ensuring that this is not just an injection but also a bijection implying some inverse, so that these [1 0] "co-covectors" are indeed isomorphic to the vectors. And of course this becomes much easier in a metric space, where we identify a canonical bijection between vectors and covectors called the "metric"; this gives a much more direct way by which co-covectors are isomorphic to vectors.

Hopefully that all makes sense. The situations in which this is not true are non-metric spaces where the "apply to this vector" instances form just a subset of the linear maps from (linear maps from vectors to scalars) to scalars, with other linear maps possibly also included.

Then there is a more dramatic statement, which I think requires the space to be Hausdorff-paracompact or something: we have the outer product, which naturally combines an [a b]-tensor and a [c d]-tensor into an [a+c b+d]-tensor ("take these vectors and covectors, give them to the first tensor; take those vectors and covectors, give them to the second tensor; then take the two scalar outputs and multiply them together"), and we claim that an [m n]-tensor can actually always be realized as a finite sum of outer products of m [1 0]-tensors with n [0 1]-tensors, which allows you to do arbitrary index contractions in a geometric way.


Well, if you think of matrices as linear maps, then saying a vector is just an Nx1 matrix is like saying that a vector is just a linear map from a vector space of dimension 1 to a vector space of dimension N. It isn't really a very useful thing to say.


Sure it's useful, if not earth-shattering - another way of saying it is that a vector maps scalars to vectors on the same line.


It’s more of a perspective and opinion thing. For example I would not say that an integer is its twos-complement representation, but I am very happy to hold onto that representation as a kind of key for the “real thing”.

Similarly, in a vector space there can be abstract vectors, which exist no matter what basis you choose to write them down in. (Choosing a basis = choosing a way to encode them as a 1xN or Nx1 matrix.)

Often in the theory, general theorems can be proven very efficiently using these abstract vectors, but most of the time when one wants to actually do things to those vectors, picking a convenient basis and working with N coordinates is the best choice.


You can do many linear algebra operations on the adjacency matrix. It's a proper matrix, in the full sense, not just a table of numbers.


I think that if, in any context, one uses the usual matrix algebra, it is the same linear map, just in a more abstract setting.

If you tell me that storing a lookup table makes it a matrix, I think that is sort of out of context. But then again, I could be labelled a purist here, and I'm trying really hard not to fall into that trap.


Along with what others have mentioned, another application of adjacency matrix multiplication is in doing discrete calculus on meshes.

Pushing one-forms to two-forms via exterior differentiation, for example. The chief operation is a matrix multiplication with said adjacency matrix.


Yes, they pretty much are, unless you're in the business of linear codes, where they're actually 1xN matrices. A pretty annoying convention for someone coming in from general algebra: it takes a while to stop mentally transposing everything.


Vectors are Nx1 matrices only once you choose an ordered basis!


The (i, j) entry of the nth power of an adjacency matrix gives the number of length-n walks from vertex i to vertex j.


And covectors 1xN matrices?



