Please review the referenced blog posts for further details on performance tuning.
[1] https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/
[2] https://repost.aws/knowledge-center/rds-aurora-postgresql-performance-issues
When performing a backup with pg_dump, the following error may appear:
Error:
pg_dump: last built-in OID is 16383
pg_dump: reading extensions
pg_dump: identifying extension members
pg_dump: reading schemas
pg_dump: reading user-defined tables
pg_dump: error: query failed: SSL SYSCALL error: EOF detected
pg_dump: error: query was: SELECT c.tableoid, c.oid, c.relname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM
Solution
(1) Check the available RAM and disk space.
(2) Run the query below to find the size of each schema in the database:
SELECT schema_name,
       pg_size_pretty(sum(table_size)::bigint) AS schema_size,
       (sum(table_size) / pg_database_size(current_database())) * 100 AS pct_of_database
FROM (
    SELECT pg_catalog.pg_namespace.nspname AS schema_name,
           pg_relation_size(pg_catalog.pg_class.oid) AS table_size
    FROM pg_catalog.pg_class
    JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid
) t
GROUP BY schema_name
ORDER BY schema_name;
(3) If the free disk space is smaller than the total database size, the error above can occur. Add disk space of at least the total database size plus roughly 10% headroom, then rerun the backup command.
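Before rerunning the backup, it helps to compare the total database size against the free space on the backup target. A minimal check (the 10% headroom above is a rule of thumb, not an exact requirement):
-- Total size of every database in the cluster; compare the sum against
-- the free disk space on the volume where the pg_dump output will be written.
SELECT datname,
       pg_size_pretty(pg_database_size(datname)) AS db_size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Size of the current database only.
SELECT pg_size_pretty(pg_database_size(current_database())) AS current_db_size;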
Troubleshoot high CPU
The pg_stat_statements extension helps identify queries that consume time on the server.
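If the extension is not yet available in the database, it has to be created first. Note that statistics are only collected when the module is also listed in shared_preload_libraries in postgresql.conf (a server restart is required for that setting; managed services expose it through their parameter settings):
-- Enable the extension in the current database.
-- Requires shared_preload_libraries = 'pg_stat_statements' for data to be collected.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;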
Mean or average execution time
For Postgres versions 13 and above, use the following statement to view the top five SQL statements by mean or average execution time:
SELECT userid::regrole, dbid, query, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 5;
For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by mean or average execution time:
SELECT userid::regrole, dbid, query, mean_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 5;
Total execution time
For Postgres versions 13 and above, use the following statement to view the top five SQL statements by total execution time:
SELECT userid::regrole, dbid, query, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by total execution time:
SELECT userid::regrole, dbid, query, total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;
Long-running transactions can consume resources and lead to high CPU utilization.
The following query helps identify connections running for the longest time:
SELECT pid, usename, datname, query, now() - xact_start as duration
FROM pg_stat_activity
WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
ORDER BY duration DESC;
A large number of connections to the database can also lead to increased CPU and memory utilization.
The following query gives information about the number of connections by state:
SELECT state, count(*)
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
GROUP BY 1 ORDER BY 1;
To resolve high CPU utilization, use EXPLAIN (ANALYZE) to tune expensive queries, use connection pooling, and terminate long-running transactions.
When there are many idle connections, or many connections consuming CPU, consider a connection pooler such as PgBouncer.
For more details about PgBouncer: Azure Database for PostgreSQL flexible server offers PgBouncer as a built-in connection pooling solution; for more information, see the PgBouncer documentation.
Once you have identified a long-running session's PID from the pg_stat_activity query above, you can terminate it with the following query:
SELECT pg_terminate_backend(<pid>);
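As a sketch, the lookup and the termination can also be combined to clean up sessions that have been idle in a transaction for a long time; the 15-minute threshold below is an arbitrary example and should be adjusted to the workload:
-- Terminate sessions that have been idle in a transaction for more than 15 minutes.
SELECT pid,
       usename,
       pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
  AND state = 'idle in transaction'
  AND xact_start < now() - interval '15 minutes';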
Keeping table statistics up to date helps improve query performance. Monitor whether regular autovacuuming is being carried out.
SELECT schemaname, relname, n_dead_tup, n_live_tup, last_vacuum, last_analyze, last_autovacuum, last_autoanalyze
FROM pg_stat_all_tables
WHERE n_live_tup > 0;
A short-term solution would be to do a manual vacuum analyze of the tables where slow queries are seen:
VACUUM ANALYZE <table_name>;
For more information about autovacuum troubleshooting and tuning, see Autovacuum Troubleshooting.
Reviewing PostgreSQL queries and suggesting solutions requires a structured approach that includes analyzing the SQL, understanding the schema and workload, and using PostgreSQL tools and best practices.
Here's a detailed step-by-step guide for Database Architects and DBAs:
Before diving into the query:
What is the purpose of the query? (reporting, ETL, dashboard, real-time app, etc.)
Expected frequency & performance? (ad-hoc, daily job, milliseconds required?)
Volume of data? (rows involved, table size)
Check for:
Unnecessary DISTINCT, ORDER BY, GROUP BY
Inefficient subqueries or nested queries
Use of SELECT * (replace with explicit columns)
Cartesian joins or incorrect join logic
Simplify where possible; Common Table Expressions (CTEs) can often be optimized or inlined
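For example, a per-row subquery of the kind flagged above can often be rewritten as a join; the orders and customers tables below are hypothetical:
-- Before: the scalar subquery is executed once per row of orders.
SELECT o.id,
       (SELECT c.name FROM customers c WHERE c.id = o.customer_id) AS customer_name
FROM orders o;

-- After: an equivalent LEFT JOIN (assuming customers.id is unique)
-- lets the planner choose a hash or merge join instead.
SELECT o.id, c.name AS customer_name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id;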
Analyzing the actual execution plan is critical.
Run:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) <your_query_here>;
Check for:
Seq Scan on large tables (may indicate missing indexes)
Nested Loops on large data sets
High cost values, or a large mismatch between estimated and actual rows
I/O-heavy operations (look at buffers)
Review tables in the FROM, JOIN, and WHERE clauses
Use:
SELECT * FROM pg_indexes WHERE tablename = 'your_table';
Suggest indexes based on:
Columns in WHERE, JOIN, ORDER BY, GROUP BY
Multicolumn index where needed
Index Only Scan possibility
Avoid over-indexing: balance write performance
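A minimal sketch of these indexing suggestions, assuming a hypothetical orders table queried by customer_id and sorted by created_at:
-- Multicolumn index matching WHERE customer_id = ? ORDER BY created_at.
CREATE INDEX idx_orders_customer_created
    ON orders (customer_id, created_at);

-- INCLUDE adds a payload column so the query can use an index-only scan
-- (available in PostgreSQL 11 and later).
CREATE INDEX idx_orders_customer_created_total
    ON orders (customer_id, created_at) INCLUDE (total_amount);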
Ensure the tables have up-to-date statistics:
ANALYZE your_table;
Or force autovacuum to catch up.
Use:
SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
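For tables where dead tuples accumulate faster than autovacuum keeps up, per-table autovacuum settings can be tightened; the table name and thresholds below are illustrative only:
-- Vacuum and analyze the hypothetical orders table more aggressively
-- (the defaults are 20% and 10% of the table, respectively).
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.02);
ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.01);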
Push filters as early as possible (before joins)
Use inner joins where applicable
Consider materialized views or temporary tables for complex reusable subqueries
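For instance, a complex aggregation reused by several reports could be captured in a materialized view and refreshed on a schedule; the object and column names below are hypothetical:
-- Materialize a reusable aggregation instead of recomputing it in every report.
CREATE MATERIALIZED VIEW daily_sales AS
SELECT order_date, SUM(total_amount) AS total_sales
FROM orders
GROUP BY order_date;

-- A unique index is required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX ON daily_sales (order_date);
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales;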
Enable logging in postgresql.conf:
log_min_duration_statement = 1000 # log queries >1s
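The same threshold can also be set from SQL instead of editing postgresql.conf by hand:
-- Log every statement that runs longer than 1 second, then reload the configuration.
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();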
Then:
SELECT * FROM pg_stat_activity WHERE state = 'active';
Check for locking issues:
SELECT * FROM pg_locks l JOIN pg_stat_activity a ON l.pid = a.pid WHERE NOT granted;
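On PostgreSQL 9.6 and later, pg_blocking_pids() also shows which session is blocking a waiting one:
-- For every waiting session, list the PIDs of the sessions blocking it.
SELECT pid,
       usename,
       pg_blocking_pids(pid) AS blocked_by,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;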
Suggest partitioning for very large tables (time-based, ID-based); see the sketch after these tips
Review autovacuum settings and dead tuples
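For the partitioning suggestion above, here is a minimal sketch of time-based declarative partitioning using a hypothetical events table; real partition boundaries depend on data volume and retention:
-- Parent table partitioned by month on created_at.
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One partition per month; further partitions are created as time advances.
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');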
If all else is optimized, and performance is still poor:
Tune parameters like:
work_mem, shared_buffers, effective_cache_size
parallel_workers, max_parallel_workers_per_gather
Use pgtune to recommend config based on server size
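A rough sketch of such parameter changes follows; the values are examples for a dedicated 16 GB / 4 vCPU server and the right numbers depend entirely on RAM, CPU count, and workload (pgtune gives a better starting point):
-- Example values only; adjust for your hardware and workload.
ALTER SYSTEM SET shared_buffers = '4GB';               -- requires a server restart
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET work_mem = '32MB';                    -- per sort/hash operation, per connection
ALTER SYSTEM SET max_parallel_workers_per_gather = 2;
SELECT pg_reload_conf();                               -- applies the reloadable settings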
Tool                          Use
EXPLAIN (ANALYZE)             Query performance insight
pg_stat_statements            Capture slow queries
auto_explain                  Log slow parts of query plans
pgBadger                      Analyze logs
PgHero / PostgreSQL Studio    Visual query inspection
DataGrip / DBeaver            UI-based explain & tuning
Example:
SELECT * FROM orders WHERE customer_id = 123;
Issue: full table scan (Seq Scan) on 10M+ rows because customer_id is not indexed.
Fix:
CREATE INDEX idx_orders_customer_id ON orders(customer_id);