For pagination purposes, I need a run a query with the LIMIT
and OFFSET
clauses. But I also need a count of the number of rows that would be returned by that query without the LIMIT
and OFFSET
clauses.
I want to run:
SELECT * FROM table WHERE /* whatever */ ORDER BY col1 LIMIT ? OFFSET ?
And:
SELECT COUNT(*) FROM table WHERE /* whatever */
At the same time. Is there a way to do that, particularly a way that lets Postgres optimize it, so that it's faster than running both individually?
Yes. With a simple window function:
SELECT *, count(*) OVER() AS full_count
FROM tbl
WHERE /* whatever */
ORDER BY col1
OFFSET ?
LIMIT ?
Be aware that the cost will be substantially higher than without the total number, but typically still cheaper than two separate queries. Postgres has to actually count all rows either way, which imposes a cost depending on the total number of qualifying rows. Details:
Best way to get result count before LIMIT was applied
However, as Dani pointed out, when OFFSET
is at least as great as the number of rows returned from the base query, no rows are returned. So we also don't get full_count
.
If that's not acceptable, a possible workaround to always return the full count would be with a CTE and an OUTER JOIN
:
WITH cte AS (
SELECT *
FROM tbl
WHERE /* whatever */
)
SELECT *
FROM (
TABLE cte
ORDER BY col1
LIMIT ?
OFFSET ?
) sub
RIGHT JOIN (SELECT count(*) FROM cte) c(full_count) ON true;
You get one row of NULL values with the full_count
appended if OFFSET
is too big. Else, it's appended to every row like in the first query.
If a row with all NULL values is a possible valid result you have to check offset >= full_count
to disambiguate the origin of the empty row.
This still executes the base query only once. But it adds more overhead to the query and only pays if that's less than repeating the base query for the count.
If indexes supporting the final sort order are available, it might pay to include the ORDER BY
in the CTE (redundantly).
While Erwin Brandstetter's answer works like a charm, it returns the total count of rows in every row like following:
col1 - col2 - col3 - total
--------------------------
aaaa - aaaa - aaaa - count
bbbb - bbbb - bbbb - count
cccc - cccc - cccc - count
You may want to consider using an approach that returns total count only once, like the following:
total - rows
------------
count - [{col1: 'aaaa'},{col2: 'aaaa'},{col3: 'aaaa'}
{col1: 'bbbb'},{col2: 'bbbb'},{col3: 'bbbb'}
{col1: 'cccc'},{col2: 'cccc'},{col3: 'cccc'}]
SQL query:
SELECT
(SELECT COUNT(*)
FROM table
WHERE /* sth */
) as count,
(SELECT json_agg(t.*) FROM (
SELECT * FROM table
WHERE /* sth */
ORDER BY col1
OFFSET ?
LIMIT ?
) AS t) AS rows
WHERE
the count(*)
subquery otherwise you'll just get the entire table count won't you?
edit: this answer is valid when retrieving the unfiltered table. I'll let it in case it could help someone but it might not exactly answer the initial question.
Erwin Brandstetter's answer is perfect if you need an accurate value. However, on large tables you often only need a pretty good approximation. Postgres gives you just that and it will be much faster as it will not need to evaluate each row:
SELECT *
FROM (
SELECT *
FROM tbl
WHERE /* something */
ORDER BY /* something */
OFFSET ?
LIMIT ?
) data
RIGHT JOIN (SELECT reltuples FROM pg_class WHERE relname = 'tbl') pg_count(total_count) ON true;
I'm actually quite not sure if there is an advantage to externalize the RIGHT JOIN
or have it as in a standard query. It would deserve some testing.
SELECT t.*, pgc.reltuples AS total_count
FROM tbl as t
RIGHT JOIN pg_class pgc ON pgc.relname = 'tbl'
WHERE /* something */
ORDER BY /* something */
OFFSET ?
LIMIT ?
WHERE
clause in your queries. The second query it logically wrong (retrieves one row for every table in the DB) - and more expensive when fixed.
No.
There's perhaps some small gain you could theoretically gain over running them individually with enough complicated machinery under the hood. But, if you want to know how many rows match a condition you'll have to count them rather than just a LIMITed subset.
Success story sharing
MATERIALIZED
by default, being referenced twice.)