PostgreSQL-Admin


What is Multi-Version Concurrency Control (MVCC)?

Multi-version concurrency control, or MVCC, is used to avoid unnecessary locking of the database. Instead of blocking readers while a writer modifies a row, PostgreSQL keeps multiple versions of each row, and every transaction sees a consistent snapshot of the data. Readers therefore do not block writers, and writers do not block readers; old row versions remain on disk until no transaction can still see them.


PostgreSQL stores transaction IDs in the xmin and xmax system columns to determine which row version is visible to which query.

Two meanings of xmax

·        It stores the transaction ID (“xid”) of the transaction that deleted the tuple.

·        It stores row locks on the tuple.
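As a rough illustration, these system columns can be inspected directly on any table (`accounts` here is a hypothetical example table):

```sql
-- Inspect the transaction ID that created each visible row version (xmin)
-- and, if set, the transaction that deleted or locked it (xmax).
SELECT xmin, xmax, * FROM accounts;

-- While another session holds an uncommitted DELETE or SELECT ... FOR UPDATE
-- on a row, that row's xmax shows the other session's transaction ID,
-- although the row remains visible to everyone else.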


#-----------------------------------------------------------------------------------------------------------#

Explain Write-Ahead Logging (WAL).

WAL, or write-ahead logging, is a standard method for ensuring data integrity. The rule is simple: changes must be written to the transaction log, and the log flushed to durable storage, before the changes are applied to the data files. Logging changes before they reach the database increases reliability: if the database crashes, the server can replay the log and resume from the point where work was interrupted.

There are a number of database administration tools for PostgreSQL. Most of them are front-end administration tools with web-based interfaces; phpPgAdmin is among the most popular.

#---------------------------------------------------------------------------------

Backup and Restore Databases

This section shows you how to use various PostgreSQL backup and restore tools, including pg_dump, pg_dumpall, psql, pg_restore, and pgAdmin, to back up and restore databases.
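A minimal sketch of a logical backup and restore cycle (the database names and file names are placeholders):

```shell
# Dump one database in the custom format (compressed, restorable with pg_restore)
pg_dump -U postgres -Fc mydb > mydb.dump

# Dump the whole cluster (all databases, roles, tablespaces) as plain SQL
pg_dumpall -U postgres > cluster.sql

# Restore the custom-format dump into a pre-created database
pg_restore -U postgres -d mydb_restored mydb.dump

# Restore a plain-SQL dump with psql
psql -U postgres -f cluster.sql
```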

Tablespace Management

PostgreSQL tablespaces allow you to control how data is stored in the file system. Tablespaces are useful in many cases, such as managing large tables and improving database performance.
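For example, a tablespace on a faster disk can be created and used like this (the path and object names are hypothetical):

```sql
-- The directory must already exist and be owned by the postgres OS user
CREATE TABLESPACE fastspace LOCATION '/ssd/pgdata';

-- Place a new table on that tablespace
CREATE TABLE big_events (
    id      bigint,
    payload text
) TABLESPACE fastspace;
```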


 #-----------------------------------------------------------------------------------------------------------#

Bloat

Database bloat is disk space that was used by a table or index and is available for reuse by the database but has not been reclaimed. Bloat is created when rows in tables or indexes are updated or deleted, because the old row versions are left behind.

It is very important to run VACUUM on individual tables after large UPDATE and DELETE operations.

 

Detecting Bloat

SELECT * FROM gp_toolkit.gp_bloat_diag;
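Note that gp_toolkit ships with Greenplum; on community PostgreSQL, a similar check can be done with the pgstattuple extension (the table name below is a placeholder):

```sql
-- pgstattuple reports dead-tuple counts and free space per relation
CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT * FROM pgstattuple('my_table');
```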

Removing Bloat from Database Tables

The VACUUM command adds expired rows to the free space map so that the space can be reused.

Run VACUUM before running ANALYZE: analyzing a severely bloated table can generate poor statistics if the sample contains empty pages, so it is good practice to vacuum a bloated table first.
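A typical maintenance sequence after a large batch DELETE might look like this (`my_table` is a placeholder):

```sql
-- Reclaim space left by the dead rows and update the free space map
VACUUM my_table;

-- Then refresh planner statistics on the cleaned-up table
ANALYZE my_table;

-- VACUUM FULL rewrites the table and returns space to the OS,
-- but it takes an exclusive lock; use only when necessary:
-- VACUUM FULL my_table;
```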


#-----------------------------------------------------------------------------------------------------------#

Partitioning Improvements in PostgreSQL 13

 

Tables can be range-partitioned, list-partitioned, or hash-partitioned.

·        List partitioning is limited to a single column.

·        Range partitioning can involve multiple columns. A partitioning "column" can be an expression.

Here are some key partitioning features that made it into PostgreSQL over the years…
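A minimal sketch of declarative range partitioning (the table and date ranges are hypothetical):

```sql
-- Parent table: rows are routed to partitions by logdate
CREATE TABLE measurements (
    city_id  int  NOT NULL,
    logdate  date NOT NULL,
    peaktemp int
) PARTITION BY RANGE (logdate);

-- One partition per month
CREATE TABLE measurements_2023_01 PARTITION OF measurements
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');

CREATE TABLE measurements_2023_02 PARTITION OF measurements
    FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
```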

 

 

Partitioning

With very big tables, it can be advisable to use partitioning. The advantage is that several autovacuum workers can process several partitions in parallel, so the partitioned table as a whole is vacuumed faster than a single autovacuum worker could manage.

If you have many partitions, you should increase autovacuum_max_workers, the maximum number of autovacuum workers.
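In postgresql.conf that could look like the following (the value is illustrative, not a recommendation):

```ini
# Allow more autovacuum workers so several partitions can be
# vacuumed in parallel (default is 3; changing this requires a restart)
autovacuum_max_workers = 6
```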


Memory

maintenance_work_mem  Maximum memory used by maintenance operations such as VACUUM and CREATE INDEX. Larger settings might improve performance for vacuuming and for restoring database dumps.

autovacuum_work_mem  Maximum amount of memory to be used by each autovacuum worker process. It defaults to -1, in which case autovacuum workers fall back to maintenance_work_mem.

Whichever of these two parameters applies sets the maximum amount of memory that each autovacuum session can use.

 

work_mem  Memory to be used by internal sort operations and hash tables before writing to temporary disk files

 

The case of long-running autovacuum sessions

Sometimes autovacuum sessions run for a very long time.

This usually indicates that maintenance_work_mem (or autovacuum_work_mem) is set too small for the size of the larger tables: when the memory for dead-tuple pointers fills up, VACUUM must scan all of the table's indexes once per batch, multiplying the work.


Tuning

seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.03
shared_buffers = 25% of memory
work_mem = 16MB
maintenance_work_mem = 128MB


For reference: https://www.elephantsql.com/blog/what-is-work_mem-in-postgresql.html


Execution Plan

EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
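For example (the table is hypothetical; the output shape varies with data and server version):

```sql
-- Show the chosen plan without running the query
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Actually execute the query and report real timings, row counts,
-- and buffer usage
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;
```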





Operations on nodes



PgBouncer 

PgBouncer is an open-source, lightweight, single-binary connection pooler for PostgreSQL. It can pool connections to one or more databases (on possibly different servers) and serve clients over TCP and Unix domain sockets.


PgBouncer supports three pooling modes when rotating connections: session pooling, transaction pooling, and statement pooling.
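A minimal pgbouncer.ini sketch (the database name, host, ports, and paths are placeholders):

```ini
[databases]
; clients connecting to "mydb" on PgBouncer are pooled onto this backend
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; one of: session | transaction | statement
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
```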

Loading data into a PostgreSQL DB instance

When loading data into an Amazon RDS for PostgreSQL DB instance, modify your DB instance settings and your DB parameter group values to allow the most efficient import. For example, you can temporarily disable automated backups and Multi-AZ for the duration of the load, and increase settings such as maintenance_work_mem and max_wal_size in the parameter group, restoring everything afterwards.


What can Cause Database Performance to Degrade?

·        Badly written queries with inefficient joins and logic that consume a lot of CPU and memory

·        Queries performing full-table scans on big tables due to improper indexing

·        Bad database maintenance with no proper statistics in place

·        Inefficient capacity planning resulting in inadequately dimensioned infrastructure

·        Improper logical and physical design

·        No connection pooling in place, which causes applications to open a huge number of connections in an uncontrolled manner

Tuning PostgreSQL I/O

·        Indexing

·        Partitioning

·        Checkpoints

·        VACUUM, ANALYZE (with FILLFACTOR)

·        Other I/O problems

·        PostgreSQL I/O on Cloud

·        Tools

There are parameters in postgresql.conf that can be configured / tuned to control checkpoint behavior: max_wal_size, min_wal_size, checkpoint_timeout, and checkpoint_completion_target. These parameters decide how frequently checkpoints occur and how much of the interval between them is used to spread out the checkpoint writes.
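Illustrative postgresql.conf values (not recommendations; tune them to the workload):

```ini
# Checkpoint at most every 15 minutes under normal load
checkpoint_timeout = 15min
# Force a checkpoint once this much WAL has accumulated
max_wal_size = 4GB
min_wal_size = 1GB
# Spread checkpoint writes over 90% of the interval
checkpoint_completion_target = 0.9
```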


------------------------------------------------------------------------------------------------------------------------------------

postgresql.conf  

> psql -U postgres 

=# show hba_file; 

=# show config_file;

https://stackoverflow.com/questions/3602450/where-does-postgresql-store-configuration-conf-files

------------------------------------------------------------------------------------------------------------------------------------