Please review the referenced blog posts for further details on performance tuning.
[1] https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/
[2] https://repost.aws/knowledge-center/rds-aurora-postgresql-performance-issues
When performing a backup with pg_dump, the following error may appear:
Error:
pg_dump: last built-in OID is 16383
pg_dump: reading extensions
pg_dump: identifying extension members
pg_dump: reading schemas
pg_dump: reading user-defined tables
pg_dump: error: query failed: SSL SYSCALL error: EOF detected
pg_dump: error: query was: SELECT c.tableoid, c.oid, c.relname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM
Solution
(1) Check the available RAM and disk space.
(2) Run the query below to find the size of each schema in the database:
SELECT schema_name,
       pg_size_pretty(sum(table_size)::bigint) AS schema_size,
       (sum(table_size) / pg_database_size(current_database())) * 100 AS pct_of_database
FROM (
    SELECT pg_catalog.pg_namespace.nspname AS schema_name,
           pg_relation_size(pg_catalog.pg_class.oid) AS table_size
    FROM pg_catalog.pg_class
    JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid
) t
GROUP BY schema_name
ORDER BY schema_name;
(3) If the free disk space is smaller than the total database size, the error above can occur. Add disk space of at least the total database size plus roughly 10% headroom, then rerun the backup command.
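Before rerunning the backup, it helps to compare the total database size against the free space on the backup target. A minimal check (the 10% headroom above is a rule of thumb, not an exact requirement):
-- Total size of every database in the cluster; compare the sum against
-- the free disk space on the volume where the pg_dump output will be written.
SELECT datname,
       pg_size_pretty(pg_database_size(datname)) AS db_size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Size of the current database only.
SELECT pg_size_pretty(pg_database_size(current_database())) AS current_db_size;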
Troubleshoot high CPU
The pg_stat_statements extension helps identify queries that consume time on the server.
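If the extension is not yet available in the database, it has to be created first. Note that statistics are only collected when the module is also listed in shared_preload_libraries in postgresql.conf (a server restart is required for that setting; managed services expose it through their parameter settings):
-- Enable the extension in the current database.
-- Requires shared_preload_libraries = 'pg_stat_statements' for data to be collected.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;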
Mean or average execution time
For Postgres versions 13 and above, use the following statement to view the top five SQL statements by mean or average execution time:
SELECT userid::regrole, dbid, query, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;
For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by mean or average execution time:
SELECT userid::regrole, dbid, query, mean_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 5;
Total execution time
For Postgres versions 13 and above, use the following statement to view the top five SQL statements by total execution time:
SELECT userid::regrole, dbid, query, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by total execution time:
SELECT userid::regrole, dbid, query, total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;
Long-running transactions can consume resources and lead to high CPU utilization.
The following query helps identify connections running for the longest time:
SELECT pid, usename, datname, query, now() - xact_start as duration
FROM pg_stat_activity
WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
ORDER BY duration DESC;
A large number of connections to the database can also lead to increased CPU and memory utilization.
The following query gives information about the number of connections by state:
SELECT state, count(*)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
GROUP BY 1 ORDER BY 1;
To resolve high CPU utilization, use EXPLAIN (ANALYZE) to tune expensive queries, use connection pooling, and terminate long-running transactions.
When there are many idle connections, or many connections consuming CPU, consider a connection pooler such as PgBouncer.
For more details about PgBouncer: Azure Database for PostgreSQL flexible server offers PgBouncer as a built-in connection pooling solution; for more information, see the PgBouncer documentation.
Once you have identified a long-running session's PID from the pg_stat_activity query above, you can terminate it with the following query:
SELECT pg_terminate_backend(<pid>);
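As a sketch, the lookup and the termination can also be combined to clean up sessions that have been idle in a transaction for a long time; the 15-minute threshold below is an arbitrary example and should be adjusted to the workload:
-- Terminate sessions that have been idle in a transaction for more than 15 minutes.
SELECT pid,
       usename,
       pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
  AND state = 'idle in transaction'
  AND xact_start < now() - interval '15 minutes';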
Keeping table statistics up to date helps improve query performance. Monitor whether regular autovacuuming is being carried out.
SELECT schemaname, relname, n_dead_tup, n_live_tup, last_vacuum, last_analyze, last_autovacuum, last_autoanalyze
FROM pg_stat_all_tables
WHERE n_live_tup > 0;
A short-term solution would be to do a manual vacuum analyze of the tables where slow queries are seen:
VACUUM ANALYZE <table_name>;
For more information about autovacuum troubleshooting and tuning, see Autovacuum Troubleshooting.
Reviewing PostgreSQL queries and suggesting solutions requires a structured approach that includes analyzing the SQL, understanding the schema and workload, and using PostgreSQL tools and best practices.
Here's a detailed step-by-step guide for Database Architects and DBAs:
Before diving into the query:
What is the purpose of the query? (reporting, ETL, dashboard, real-time app, etc.)
Expected frequency & performance? (ad-hoc, daily job, milliseconds required?)
Volume of data? (rows involved, table size)
Check for:
Unnecessary DISTINCT, ORDER BY, GROUP BY
Inefficient subqueries or nested queries
Use of SELECT * (replace with explicit columns)
Cartesian joins or incorrect join logic
Simplify where possible; Common Table Expressions (CTEs) can often be optimized or inlined
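For example, a per-row subquery of the kind flagged above can often be rewritten as a join; the orders and customers tables below are hypothetical:
-- Before: the scalar subquery is executed once per row of orders.
SELECT o.id,
       (SELECT c.name FROM customers c WHERE c.id = o.customer_id) AS customer_name
FROM orders o;

-- After: an equivalent LEFT JOIN (assuming customers.id is unique)
-- lets the planner choose a hash or merge join instead.
SELECT o.id, c.name AS customer_name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id;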
Analyzing the actual execution plan is critical.
Run:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) <your_query_here>;
Check for:
Seq Scan on large tables (may indicate missing indexes)
Nested Loops on large data sets
High cost values, or a large mismatch between estimated and actual rows
I/O-heavy operations (look at buffers)
Review tables in the FROM, JOIN, and WHERE clauses
Use:
SELECT * FROM pg_indexes WHERE tablename = 'your_table';
Suggest indexes based on:
Columns in WHERE, JOIN, ORDER BY, GROUP BY
Multicolumn index where needed
Index Only Scan possibility
Avoid over-indexing: balance write performance
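A minimal sketch of these indexing suggestions, assuming a hypothetical orders table queried by customer_id and sorted by created_at:
-- Multicolumn index matching WHERE customer_id = ? ORDER BY created_at.
CREATE INDEX idx_orders_customer_created
    ON orders (customer_id, created_at);

-- INCLUDE adds a payload column so the query can use an index-only scan
-- (available in PostgreSQL 11 and later).
CREATE INDEX idx_orders_customer_created_total
    ON orders (customer_id, created_at) INCLUDE (total_amount);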
Ensure the tables have up-to-date statistics:
ANALYZE your_table;
Or force autovacuum to catch up.
Use:
SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
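For tables where dead tuples accumulate faster than autovacuum keeps up, per-table autovacuum settings can be tightened; the table name and thresholds below are illustrative only:
-- Vacuum and analyze the hypothetical orders table more aggressively
-- (the defaults are 20% and 10% of the table, respectively).
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.02);
ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.01);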
Push filters as early as possible (before joins)
Use inner joins where applicable
Consider materialized views or temporary tables for complex reusable subqueries
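For instance, a complex aggregation reused by several reports could be captured in a materialized view and refreshed on a schedule; the object and column names below are hypothetical:
-- Materialize a reusable aggregation instead of recomputing it in every report.
CREATE MATERIALIZED VIEW daily_sales AS
SELECT order_date, SUM(total_amount) AS total_sales
FROM orders
GROUP BY order_date;

-- A unique index is required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX ON daily_sales (order_date);
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales;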
Enable logging in postgresql.conf:
log_min_duration_statement = 1000 # log queries >1s
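The same threshold can also be set from SQL instead of editing postgresql.conf by hand:
-- Log every statement that runs longer than 1 second, then reload the configuration.
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();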
Then:
SELECT * FROM pg_stat_activity WHERE state = 'active';
Check for locking issues:
SELECT * FROM pg_locks l JOIN pg_stat_activity a ON l.pid = a.pid WHERE NOT granted;
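On PostgreSQL 9.6 and later, pg_blocking_pids() also shows which session is blocking a waiting one:
-- For every waiting session, list the PIDs of the sessions blocking it.
SELECT pid,
       usename,
       pg_blocking_pids(pid) AS blocked_by,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;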
Suggest partitioning for very large tables (time-based, ID-based); see the sketch after these tips
Review autovacuum settings and dead tuples
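For the partitioning suggestion above, here is a minimal sketch of time-based declarative partitioning using a hypothetical events table; real partition boundaries depend on data volume and retention:
-- Parent table partitioned by month on created_at.
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One partition per month; further partitions are created as time advances.
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');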
If all else is optimized, and performance is still poor:
Tune parameters like:
work_mem, shared_buffers, effective_cache_size
parallel_workers, max_parallel_workers_per_gather
Use pgtune to recommend config based on server size
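A rough sketch of such parameter changes follows; the values are examples for a dedicated 16 GB / 4 vCPU server and the right numbers depend entirely on RAM, CPU count, and workload (pgtune gives a better starting point):
-- Example values only; adjust for your hardware and workload.
ALTER SYSTEM SET shared_buffers = '4GB';               -- requires a server restart
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET work_mem = '32MB';                    -- per sort/hash operation, per connection
ALTER SYSTEM SET max_parallel_workers_per_gather = 2;
SELECT pg_reload_conf();                               -- applies the reloadable settings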
Tool                          Use
EXPLAIN (ANALYZE)             Query performance insight
pg_stat_statements            Capture slow queries
auto_explain                  Log slow parts of query plans
pgBadger                      Analyze logs
PgHero / PostgreSQL Studio    Visual query inspection
DataGrip / DBeaver            UI-based explain & tuning
Example:
SELECT * FROM orders WHERE customer_id = 123;
Issue: full table scan (Seq Scan) on 10M+ rows because customer_id is not indexed.
Fix:
CREATE INDEX idx_orders_customer_id ON orders(customer_id);