I'm looking to build a database-backed spreadsheet system: a lite version of Google Sheets / Airtable / Excel.

How do I architect the backend for such a system?

I know I can use PostgreSQL's JSON type to store dynamic row data, but in my tests with a large data set (around 5M rows), sorting / grouping / search operations on keys in that JSON column take 10-20 seconds each.
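
To make the setup concrete, here is roughly the kind of query involved in sorting on a JSON key. This is only a minimal sketch: the table and column names (`sheet_rows`, `sheet_id`, `data`), the `price` key, and the helper `build_sort_query` are all made up for illustration, and the placeholder style assumes a psycopg-like driver.

```python
# Hypothetical schema (names are assumptions for illustration only):
#   CREATE TABLE sheet_rows (
#       id       bigserial PRIMARY KEY,
#       sheet_id bigint NOT NULL,
#       data     jsonb  NOT NULL
#   );


def build_sort_query(descending: bool = False, limit: int = 100) -> str:
    """Return a parameterized query that sorts one sheet's rows by a JSON key.

    The key name is passed as a bound parameter (%(key)s) rather than being
    interpolated into the SQL text. Note that ->> extracts the value as text,
    so this ordering is lexicographic unless the expression is cast.
    """
    direction = "DESC" if descending else "ASC"
    return (
        "SELECT id, data FROM sheet_rows "
        "WHERE sheet_id = %(sheet_id)s "
        f"ORDER BY data ->> %(key)s {direction} "
        f"LIMIT {int(limit)}"
    )


# Example: the 50 rows of sheet 1 with the highest "price" value (as text).
query = build_sort_query(descending=True, limit=50)
params = {"sheet_id": 1, "key": "price"}
print(query)
```

With a driver such as psycopg this would be run as `cursor.execute(query, params)`; the sketch only builds the statement and does not connect to a database.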

I've also looked into the EAV (entity-attribute-value) architecture, but I'm assuming the JSON type will work better.
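
For comparison, the two row shapes being weighed can be sketched like this. It is an illustration only, under the assumption that an EAV layout stores one (row id, column name, value) triple per cell; the helper `to_eav` and the sample values are hypothetical.

```python
from typing import Any


def to_eav(row_id: int, data: dict[str, Any]) -> list[tuple[int, str, Any]]:
    """Flatten one JSON-style row into (entity, attribute, value) triples."""
    return [(row_id, key, value) for key, value in data.items()]


# JSON approach: one record per spreadsheet row, all cells in one document.
json_row = {"name": "Widget", "price": 9.5, "in_stock": True}

# EAV approach: one record per cell, so a 3-column row becomes 3 records.
eav_rows = to_eav(1, json_row)
for triple in eav_rows:
    print(triple)
```

The record count is the main difference: with EAV, 5M spreadsheet rows of 3 columns each would become 15M stored records.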

The flow that I'm after:

  • There can be any number of spreadsheets
  • Each spreadsheet has a dynamic number of columns. It should be possible to add/remove columns without locking the entire DB.
  • Entire spreadsheet should be searchable / sortable / groupable fairly quickly.

Is there something I'm missing to make this work? Any suggestions are welcome!

