
OK, I found the reference for this story [1]! It turns out I messed up some details, but the core of the story is true. (It was not a company but Alex Szalay [2] at JHU, and the problem was not indexing but the database layout.)

Jim asked about our "20 queries," his incisive way of learning about an application, as a deceptively simple way to jump-start a dialogue between him (a database expert) and me (an astronomer or any scientist). Jim said, "Give me your 20 most important questions you would like to ask of your data system and I will design the system for you." It was amazing to watch how well this simple heuristic approach, combined with Jim's imagination, worked to produce quick results.

Jim then came to Baltimore to look over our computer room and within 30 seconds declared, with a grin, we had the wrong database layout. My colleagues and I were stunned. Jim explained later that he listened to the sounds the machines were making as they operated; the disks rattled too much, telling him there was too much random disk access. We began mapping SDSS database hardware requirements, projecting that in order to achieve acceptable performance with a 1TB data set we would need a GB/sec sequential read speed from the disks, translating to about 20 servers at the time. Jim was a firm believer in using "bricks," or the cheapest, simplest building blocks money could buy. We started experimenting with low-level disk IO on our inexpensive Dell servers, and our disks were soon much quieter and performing more efficiently.
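For a rough sense of the numbers in that passage, here is a back-of-envelope sketch. The 1TB, 1 GB/sec, and 20-server figures come from the quote; the decimal units and the helper names are my own assumptions, and the article does not say how the authors actually did the calculation.

```python
# Back-of-envelope check of the figures quoted above.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 GB = 10**9 bytes).
TB = 10**12
GB = 10**9


def scan_seconds(dataset_bytes, aggregate_bytes_per_sec):
    """Time to read the whole data set once, purely sequentially."""
    return dataset_bytes / aggregate_bytes_per_sec


def per_server_rate(aggregate_bytes_per_sec, servers):
    """Sequential read rate each server must sustain, assuming an even split."""
    return aggregate_bytes_per_sec / servers


dataset = 1 * TB     # "a 1TB data set"
aggregate = 1 * GB   # "a GB/sec sequential read speed"
servers = 20         # "about 20 servers at the time"

full_scan = scan_seconds(dataset, aggregate)
rate = per_server_rate(aggregate, servers)

print(f"full scan: {full_scan:.0f} s (about {full_scan / 60:.0f} minutes)")
print(f"per server: {rate / 10**6:.0f} MB/s sequential")
```

Under these assumptions a full scan takes on the order of a quarter of an hour, and each server has to sustain about 50 MB/s of sequential reads, which is why a layout that causes heavy random access would miss the target.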

[1] https://cacm.acm.org/magazines/2008/11/549-jim-gray-astronom...

[2] https://en.wikipedia.org/wiki/Alex_Szalay


