Most efficient way to SELECT DISTINCT ColA FROM LargeTableWithFewValuesForColA

I have a large table (millions of rows).

I often need to get the DISTINCT values of certain columns. In my case, those columns have very few distinct values (from a few to a few dozen).

What is the most efficient way of doing this?

Add an index on the column and then run:

select distinct column
from t;
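
As a rough illustration of the index-then-DISTINCT approach, here is a minimal sketch using Python's built-in sqlite3 module. The table name t comes from the query above; the column name col_a, the index name idx_t_col_a, and the sample data are assumptions made up for the example. Other database engines may plan this query differently, so treat the printed plan as something to inspect rather than a guarantee.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col_a TEXT)")

# Many rows, but only three distinct values (hypothetical sample data).
values = ["red", "green", "blue"]
conn.executemany(
    "INSERT INTO t (col_a) VALUES (?)",
    ((values[i % len(values)],) for i in range(30000)),
)

# Index the column whose distinct values are needed.
conn.execute("CREATE INDEX idx_t_col_a ON t (col_a)")

# The DISTINCT query itself, sorted so the result order is stable.
distinct_values = [
    row[0] for row in conn.execute("SELECT DISTINCT col_a FROM t ORDER BY col_a")
]
print(distinct_values)

# Ask the engine how it intends to run the query.
for row in conn.execute("EXPLAIN QUERY PLAN SELECT DISTINCT col_a FROM t"):
    print(row)
```

Checking the plan on your own engine and data is the only reliable way to confirm that the index is actually being used.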

To add to Gordon's answer: in a large database you could also partition your data, in addition to adding the index. Partitioning the data works roughly like this:

  Table_1 (id)
  Select distinct records from table
  Where id <= 1000
  Table_2 (id)
  Select distinct records from table
  Where id > 1000

  Actual table = Table_1 + Table_2 (id)

This is just a sample to illustrate the idea. A partition is not an extra copy of the data; it is still the same table or database, only split up on the basis of a unique column.
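
The idea can be imitated in a small sketch, again with Python's sqlite3 module. SQLite has no native table partitioning, so this only mimics it by querying two id ranges separately and combining the results; real partitioning is set up with engine-specific DDL. The names t, col_a, idx_t_col_a and distinct_in_range, the boundary at 1000, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col_a TEXT)")

# 2000 rows with ids 1..2000 and only two distinct values (made-up data).
conn.executemany(
    "INSERT INTO t (id, col_a) VALUES (?, ?)",
    [(i, "low" if i % 2 else "high") for i in range(1, 2001)],
)
conn.execute("CREATE INDEX idx_t_col_a ON t (col_a)")


def distinct_in_range(lo, hi):
    """Return the distinct col_a values for rows with lo <= id <= hi."""
    rows = conn.execute(
        "SELECT DISTINCT col_a FROM t WHERE id BETWEEN ? AND ?", (lo, hi)
    )
    return {row[0] for row in rows}


# Two non-overlapping "partitions" that together cover every id.
part_1 = distinct_in_range(1, 1000)
part_2 = distinct_in_range(1001, 2000)

# The union of the per-partition results matches DISTINCT over the whole table.
combined = part_1 | part_2
whole = {row[0] for row in conn.execute("SELECT DISTINCT col_a FROM t")}
print(sorted(combined), combined == whole)
```

The ranges must cover every id exactly once; a gap at the boundary would silently drop rows from the combined result.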
