I don't follow. Wouldn't single-threaded performance be the lower bound? If concurrent writes ever perform worse than that, you can queue the write requests yourself and get back to single-thread performance.
As my test above shows, if you put a mutex around writes directly in Go, whatever locking performance problems SQLite exhibits disappear. I suspect that's because of the polling SQLite uses while waiting for a lock, as described elsewhere on this page.