Installing the Greenplum Database

See also
Building an Infrastructure to Support Data Science Projects (Part 1 of 3) – Creating a Virtualized Environment.
Building an Infrastructure to Support Data Science Projects (Part 2 of 3) – Installing Greenplum with MADlib
Building an Infrastructure to Support Data Science Projects (Part 3 of 3) – Installing and Configuring R / RStudio with Pivotal Greenplum Integration


The Greenplum Database installer installs the following files and directories:
  1.  greenplum_path.sh - Greenplum Database environment variables.
  2.  GPDB-LICENSE.txt - Greenplum license agreement.
  3.  LICENSE.thirdparty - Licenses for third-party tools
  4.  bin - Greenplum Database server programs, client programs, and management tools.
  5.  demo - Greenplum Database demonstration programs.
  6.  docs - Greenplum Database documentation.
  7.  etc - Sample configuration file for OpenSSL.
  7.  ext - Bundled programs (such as Python) used by some Greenplum Database utilities.
  9.  include - Greenplum Database and PostgreSQL header files.
  10.  lib - Greenplum Database and PostgreSQL library files.
  11.  sbin - Supporting/Internal scripts and programs.
  12.  share - PostgreSQL sample files and templates.

Configuring Your Systems and Installing Greenplum

Before we begin the install process, we first need to configure our systems.

A) Pre-Install 

1. Make sure your systems meet the System Requirements
2. Setting the Greenplum Recommended OS Parameters

B) Install 

3.(master only) Running the Greenplum Installer
4.Installing and Configuring Greenplum on all Hosts
5.(Optional) Installing Oracle Compatibility Functions
6.(Optional) Installing Greenplum Database Extensions
7.Creating the Data Storage Areas
8.Synchronizing System Clocks

C) Post-Install

9. Validating Your Systems
10. Initializing a Greenplum Database System

Pre-Install  

1.Minimum recommended specifications for servers intended to support Greenplum Database in a production environment.

Operating System
SUSE Linux SLES 10.2 or higher
CentOS 5.0 or higher
RedHat Enterprise Linux 5.0 or higher
Oracle Unbreakable Linux 5.5
Solaris x86 v10 update 7

File Systems
- xfs required for data storage on SUSE Linux and Red Hat (ext3 supported for root file system)
- zfs required for data storage on Solaris (ufs supported for root file system)

Minimum CPU
Pentium Pro compatible (P3/Athlon and above)

Minimum Memory
16 GB RAM per server

Disk Requirements
-150MB per host for Greenplum installation
-Approximately 300MB per segment instance for meta data
-Appropriate free space for data with disks at no more than 70% capacity
-High-speed, local storage

Network Requirements
Gigabit Ethernet within the array
Dedicated, non-blocking switch

Software and Utilities
bash shell
GNU tar
GNU zip
GNU readline (Solaris only)

On Solaris platforms, you must have GNU Readline in your environment to support interactive Greenplum administrative utilities such as gpssh. Certified readline packages are available for download from the EMC Download Center.

2. Setting the Greenplum Recommended OS Parameters

Greenplum requires that certain operating system (OS) parameters be set on all hosts in your Greenplum Database system (masters and segments).

1. Linux System Settings

2. Solaris System Settings

3. Mac OS X System Settings

In general, the following categories of system parameters need to be altered:

Shared Memory - A Greenplum Database instance will not work unless the shared memory segment for your kernel is properly sized. Most default OS installations have the shared memory values set too low for Greenplum Database. On Linux systems, you must also disable the OOM (out of memory) killer.

Network - On high-volume Greenplum Database systems, certain network-related tuning parameters must be set to optimize network connections made by the Greenplum interconnect.

User Limits - User limits control the resources available to processes started by a user's shell. Greenplum Database requires a higher limit on the allowed number of file descriptors that a single process can have open. The default settings may cause some Greenplum Database queries to fail because they will run out of file descriptors needed to process the query.
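As a quick sanity check, you can display the limits currently in effect for the gpadmin shell before and after applying the settings described below:
$ ulimit -n    # maximum open file descriptors for this shell
$ ulimit -u    # maximum user processes for this shell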

Linux System Settings

Set the following parameters in the /etc/sysctl.conf file and reboot:

xfs_mount_options = rw,noatime,inode64,allocsize=16m
sysctl.kernel.shmmax = 500000000
sysctl.kernel.shmmni = 4096
sysctl.kernel.shmall = 4000000000
sysctl.kernel.sem = 250 512000 100 2048
sysctl.kernel.sysrq = 1
sysctl.kernel.core_uses_pid = 1
sysctl.kernel.msgmnb = 65536
sysctl.kernel.msgmax = 65536
sysctl.kernel.msgmni = 2048
sysctl.net.ipv4.tcp_syncookies = 1
sysctl.net.ipv4.ip_forward = 0
sysctl.net.ipv4.conf.default.accept_source_route = 0
sysctl.net.ipv4.tcp_tw_recycle = 1
sysctl.net.ipv4.tcp_max_syn_backlog = 4096
sysctl.net.ipv4.conf.all.arp_filter = 1
sysctl.net.ipv4.ip_local_port_range = 1025 65535
sysctl.net.core.netdev_max_backlog = 10000
sysctl.vm.overcommit_memory = 2

For RHEL version 6.x platforms, the above parameters do not include the sysctl. prefix, as follows:

xfs_mount_options = rw,noatime,inode64,allocsize=16m
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
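If you would rather not reboot immediately, the values in /etc/sysctl.conf can usually be loaded right away and then spot-checked individually, for example:
# sysctl -p
# sysctl kernel.shmmax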

Set the following parameters in the /etc/security/limits.conf file:

* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

1. XFS is the preferred file system on Linux platforms for data storage. Greenplum recommends the following xfs mount options:
rw,noatime,inode64,allocsize=16m
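For example, a hypothetical /etc/fstab entry for an XFS data file system (assuming the device /dev/sdb1 is mounted at /data) might look like this:
/dev/sdb1  /data  xfs  rw,noatime,inode64,allocsize=16m  0 0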

2. The Linux disk I/O scheduler for disk access supports different policies, such as CFQ, AS, and deadline. 
Greenplum recommends the following scheduler option: deadline
To specify a scheduler, run the following:
# echo schedulername > /sys/block/devname/queue/scheduler
For example:
# echo deadline > /sys/block/sdb/queue/scheduler

3. Each disk device file should have a read-ahead (blockdev) value of 16384.
To verify the read-ahead value of a disk device:
# /sbin/blockdev --getra devname
For example:
# /sbin/blockdev --getra /dev/sdb
To set blockdev (read-ahead) on a device:
# /sbin/blockdev --setra bytes devname
For example:
# /sbin/blockdev --setra 16384 /dev/sdb
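The read-ahead setting does not persist across reboots, so a common approach is to re-apply it at boot time, for example by adding a line like the following to /etc/rc.local (the device name is illustrative):
/sbin/blockdev --setra 16384 /dev/sdb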

4. Edit the /etc/hosts file and make sure that it includes the host names and all interface address names for every machine participating in your Greenplum Database system.
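For example, a minimal /etc/hosts for a small array might contain entries along these lines (the IP addresses shown are purely illustrative):
192.0.2.10   mdw    mdw-1
192.0.2.11   smdw   smdw-1
192.0.2.21   sdw1   sdw1-1
192.0.2.22   sdw2   sdw2-1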

Solaris System Settings

Set the following parameters in /etc/system:
set rlim_fd_cur=65536
set zfs:zfs_arc_max=0x600000000
set pcplusmp:apic_panic_on_nmi=1
set nopanicdebug=1

Change the following line in the /etc/project file from:
default:3::::
to:
default:3:default
project:::project.max-sem-ids=(priv,1024,deny);
process.max-file-descriptor=(priv,252144,deny)

Add the following line to /etc/user_attr:
gpadmin::::defaultpriv=basic,dtrace_user,dtrace_proc

Edit the /etc/hosts file and make sure that it includes all host names and interface address names for every machine participating in your Greenplum Database system.

Mac OS X System Settings

•Add the following to /etc/sysctl.conf:
kern.sysv.shmmax=2147483648
kern.sysv.shmmin=1
kern.sysv.shmmni=64
kern.sysv.shmseg=16
kern.sysv.shmall=524288
kern.maxfiles=65535
kern.maxfilesperproc=65535
net.inet.tcp.msl=60
•Add the following line to /etc/hostconfig:
HOSTNAME="your_hostname"

B) Install : 
Running the Greenplum Installer

To configure your systems for Greenplum Database, you will need certain utilities found in $GPHOME/bin of your installation. Log in as root and run the Greenplum installer on the machine that will be your master host.

To install the Greenplum binaries on the master host

1.Download or copy the installer file to the machine that will be the Greenplum Database master host. Installer files are available from Greenplum for RedHat (32-bit and 64-bit), Solaris 64-bit and SuSe Linux 64-bit platforms.

2.Unzip the installer file where PLATFORM is either RHEL5-i386 (RedHat 32-bit), RHEL5-x86_64 (RedHat 64-bit), SOL-x86_64 (Solaris 64-bit) or SuSE10-x86_64 (SuSe Linux 64 bit). For example:
# unzip greenplum-db-4.2.x.x-PLATFORM.zip

3.Launch the installer using bash. For example:
# /bin/bash greenplum-db-4.2.x.x-PLATFORM.bin

4.The installer will prompt you to accept the Greenplum Database license agreement. Type yes to accept the license agreement.

5.The installer will prompt you to provide an installation path. Press ENTER to accept the default install path (/usr/local/greenplum-db-4.2.x.x), or enter an absolute path to an install location. You must have write permissions to the location you specify.

6.Optional. The installer will prompt you to provide the path to a previous installation of Greenplum Database. For example: /usr/local/greenplum-db-4.2.x.x
This installation step will migrate any Greenplum Database add-on modules (postgis, pgcrypto, etc.) from the previous installation path to the path of the version currently being installed. This step is optional and can be performed manually at any point after the installation using the gppkg utility with the -migrate option.
Press ENTER to skip this step.

7.The installer will install the Greenplum software and create a greenplum-db symbolic link one directory level above your version-specific Greenplum installation directory. The symbolic link is used to facilitate patch maintenance and upgrades between versions. The installed location is referred to as $GPHOME.
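For example, assuming the default install location, you can verify the link as follows (the version number shown is illustrative):
# ls -l /usr/local/greenplum-db
The link should point to your version-specific directory, such as /usr/local/greenplum-db-4.2.x.x.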
8.To perform additional required system configuration tasks and to install Greenplum Database on other hosts, go to the next task Installing and Configuring Greenplum on all Hosts.

About Your Greenplum Database Installation

•greenplum_path.sh — This file contains the environment variables for Greenplum Database. 
•GPDB-LICENSE.txt — Greenplum license agreement.
•bin — This directory contains the Greenplum Database management utilities. This directory also contains the PostgreSQL client and server programs, most of which are also used in Greenplum Database.
•demo — This directory contains the Greenplum demonstration programs.
•docs — The Greenplum Database documentation (PDF files).
•etc — Sample configuration file for OpenSSL.
•ext — Bundled programs (such as Python) used by some Greenplum Database utilities.
•include — The C header files for Greenplum Database.
•lib — Greenplum Database and PostgreSQL library files.
•sbin — Supporting/Internal scripts and programs.
•share — Shared files for Greenplum Database.

Installing and Configuring Greenplum on all Hosts
When run as root, gpseginstall copies the Greenplum Database installation from the current host and installs it on a list of specified hosts, creates the Greenplum system user (gpadmin), sets the system user’s password (default is changeme), sets the ownership of the Greenplum Database installation directory, and exchanges ssh keys between all specified host address names (both as root and as the specified system user).

About gpadmin
When a Greenplum Database system is first initialized, the system contains one predefined superuser role (also referred to as the system user), gpadmin. This is the user who owns and administers the Greenplum Database.

Note: If you are setting up a single node system, you can still use gpseginstall to perform the required system configuration tasks on the current host. In this case, the hostfile_exkeys would contain just the current host name.

To install and configure Greenplum Database on all specified hosts

1.Log in to the master host as root:
$ su -

2.Source the path file from your master host’s Greenplum Database installation directory:
# source /usr/local/greenplum-db/greenplum_path.sh

3.Create a file called hostfile_exkeys that has the machine configured host names and host addresses (interface names) for each host in your Greenplum system (master, standby master and segments). Make sure there are no blank lines or extra spaces. For example, if you have a master, standby master and three segments with two network interfaces per host, your file would look something like this:
mdw
mdw-1
mdw-2
smdw
smdw-1
smdw-2
sdw1
sdw1-1
sdw1-2
sdw2
sdw2-1
sdw2-2
sdw3
sdw3-1
sdw3-2

Note: Check your systems’ /etc/hosts files for the correct host names to use for your environment.

4.Run the gpseginstall utility referencing the hostfile_exkeys file you just created. Use the -u and -p options to create the Greenplum system user (gpadmin) on all hosts and set the password for that user on all hosts. For example:
# gpseginstall -f hostfile_exkeys -u gpadmin -p P@$$word

Recommended security best practices:
  • Do not use the default password option for production environments.
  • Change the password immediately after installation.

Confirming Your Installation
To make sure the Greenplum software was installed and configured correctly, run the following confirmation steps from your Greenplum master host. If necessary, correct any problems before continuing on to the next task.

1.Log in to the master host as gpadmin:
$ su - gpadmin

2.Source the path file from the Greenplum Database installation directory:
$ source /usr/local/greenplum-db/greenplum_path.sh

3.Use the gpssh utility to see if you can login to all hosts without a password prompt, and to confirm that the Greenplum software was installed on all hosts. Use the hostfile_exkeys file you used for installation. For example:
$ gpssh -f hostfile_exkeys -e ls -l $GPHOME

If the installation was successful, you should be able to log in to all hosts without a password prompt. All hosts should show that they have the same contents in their installation directories, and that the directories are owned by the gpadmin user.

If you are prompted for a password, run the following command to redo the ssh key exchange:
$ gpssh-exkeys -f hostfile_exkeys

Installing Oracle Compatibility Functions
Optional. Many Oracle Compatibility SQL functions are available in Greenplum Database. These functions target PostgreSQL.
Before using any Oracle Compatibility Functions, you need to run the installation script $GPHOME/share/postgresql/contrib/orafunc.sql once for each database. For example, to install the functions in database testdb, use the command 
$ psql -d testdb -f \
$GPHOME/share/postgresql/contrib/orafunc.sql
To uninstall Oracle Compatibility Functions, use the script:
$GPHOME/share/postgresql/contrib/uninstall_orafunc.sql.
Note: The following functions are available by default and can be accessed without running the Oracle Compatibility installer: sinh, tanh, cosh and decode.
For more information about Greenplum’s Oracle compatibility functions, see the Oracle Compatibility Functions appendix of the Greenplum Database Administrator Guide.
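If you need the functions in several databases, you can simply run the installation script once per database, for example with a small shell loop (the database names here are illustrative):
$ for db in sales analytics; do psql -d $db -f $GPHOME/share/postgresql/contrib/orafunc.sql; done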

Installing Greenplum Database Extensions
Optional. Use the Greenplum package manager (gppkg) to install Greenplum Database extensions such as pgcrypto, PL/R, PL/Java, PL/Perl, and PostGIS, along with their dependencies, across an entire cluster. The package manager also integrates with existing scripts so that any packages are automatically installed on any new hosts introduced into the system following cluster expansion or segment host recovery.
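For example, a pgcrypto package could be installed across the cluster with a command along these lines (the package file name is illustrative and depends on the version you downloaded):
$ gppkg -i pgcrypto-1.0-rhel5-x86_64.gppkg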

Creating the Data Storage Areas
Every Greenplum Database master and segment instance has a designated storage area on disk that is called the data directory location. This is the file system location where the directories that store segment instance data will be created. The master host needs a data storage location for the master data directory. Each segment host needs a data directory storage location for its primary segments, and another for its mirror segments.

To create the data directory location on the master
The data directory location on the master is different from those on the segments. The master does not store any user data; only the system catalog tables and system metadata are stored on the master instance, so you do not need to designate as much storage space as on the segments.

1.Create or choose a directory that will serve as your master data storage area. This directory should have sufficient disk space for your data and be owned by the gpadmin user and group. For example, run the following commands as root:
# mkdir /data/master

2.Change ownership of this directory to the gpadmin user. For example:
# chown gpadmin /data/master

3.Using gpssh, create the master data directory location on your standby master as well. For example:
# source /usr/local/greenplum-db-4.2.x.x/greenplum_path.sh
# gpssh -h smdw -e 'mkdir /data/master'
# gpssh -h smdw -e 'chown gpadmin /data/master'

To create the data directory locations on all segment hosts

1.On the master host, log in as root:
# su

2.Create a file called hostfile_gpssh_segonly. This file should have only one machine configured host name for each segment host. For example, if you have three segment hosts:
sdw1
sdw2
sdw3

3.Using gpssh, create the primary and mirror data directory locations on all segment hosts at once using the hostfile_gpssh_segonly file you just created. For example:
# source /usr/local/greenplum-db-4.2.x.x/greenplum_path.sh
# gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/mirror'
# gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/mirror'

Synchronizing System Clocks
Greenplum recommends using NTP (Network Time Protocol) to synchronize the system clocks on all hosts that comprise your Greenplum Database system. See www.ntp.org for more information about NTP.

NTP on the segment hosts should be configured to use the master host as the primary time source, and the standby master as the secondary time source. On the master and standby master hosts, configure NTP to point to your preferred time server.

To configure NTP

1.On the master host, log in as root and edit the /etc/ntp.conf file. Set the server parameter to point to your data center’s NTP time server. For example (if 10.6.220.20 was the IP address of your data center’s NTP server):
server 10.6.220.20

2.On each segment host, log in as root and edit the /etc/ntp.conf file. Set the first server parameter to point to the master host, and the second server parameter to point to the standby master host. For example:
server mdw prefer
server smdw

3.On the standby master host, log in as root and edit the /etc/ntp.conf file. Set the first server parameter to point to the primary master host, and the second server parameter to point to your data center’s NTP time server. For example:
server mdw prefer
server 10.6.220.20

4.On the master host, use the NTP daemon to synchronize the system clocks on all Greenplum hosts. For example using gpssh:
# gpssh -f hostfile_gpssh_allhosts -v -e 'ntpd'
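To confirm that each host is synchronizing against the expected time sources, you can query the NTP peers on all hosts, for example:
# gpssh -f hostfile_gpssh_allhosts -e 'ntpq -p'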

C) Post-Install

Greenplum provides the following utilities to validate the configuration and performance of your systems:

•gpcheck

•gpcheckperf 

Note: These utilities can be found in $GPHOME/bin of your Greenplum installation.

The following tests should be run prior to initializing your Greenplum Database system.

•Validating OS Settings

•Validating Hardware Performance

Validating OS Settings
Greenplum provides a utility called gpcheck that can be used to verify that all hosts in your array have the recommended OS settings for running a production Greenplum Database system. To run gpcheck:

1.Log in on the master host as the gpadmin user.

2.Source the greenplum_path.sh path file from your Greenplum installation. For example:
$ source /usr/local/greenplum-db/greenplum_path.sh

3.Create a file called hostfile_gpcheck that has the machine-configured host names of each Greenplum host (master, standby master and segments), one host name per line. Make sure there are no blank lines or extra spaces. This file should just have a single host name per host. For example:
mdw
smdw
sdw1
sdw2
sdw3

4.Run the gpcheck utility using the host file you just created. For example:
$ gpcheck -f hostfile_gpcheck -m mdw -s smdw

5.After gpcheck finishes verifying OS parameters on all hosts (masters and segments), you might be prompted to modify certain OS parameters before initializing your Greenplum Database system.

Validating Hardware Performance
Greenplum provides a management utility called gpcheckperf, which can be used to identify hardware and system-level issues on the machines in your Greenplum Database array. gpcheckperf starts a session on the specified hosts and runs the following performance tests:

•Network Performance (gpnetbench*)
•Disk I/O Performance (dd test)
•Memory Bandwidth (stream test)

Before using gpcheckperf, you must have a trusted host setup between the hosts involved in the performance test. You can use the utility gpssh-exkeys to update the known host files and exchange public keys between hosts if you have not done so already. Note that gpcheckperf calls gpssh and gpscp, so these Greenplum utilities must be in your $PATH.

Validating Network Performance
To test network performance, run gpcheckperf with one of the network test run options: parallel pair test (-r N), serial pair test (-r n), or full matrix test (-r M). The utility runs a network benchmark program that transfers a 5 second stream of data from the current host to each remote host included in the test. By default, the data is transferred in parallel to each remote host and the minimum, maximum, average and median network transfer rates are reported in megabytes (MB) per second. If the summary transfer rate is slower than expected (less than 100 MB/s), you can run the network test serially using the -r n option to obtain per-host results. To run a full-matrix bandwidth test, you can specify -r M which will cause every host to send and receive data from every other host specified. This test is best used to validate if the switch fabric can tolerate a full-matrix workload.

Most systems in a Greenplum Database array are configured with multiple network interface cards (NICs), each NIC on its own subnet. When testing network performance, it is important to test each subnet individually. For example, consider the following network configuration of two NICs per host:

Example Network Interface Configuration

Greenplum Host    Subnet1 NICs    Subnet2 NICs
Segment 1         sdw1-1          sdw1-2
Segment 2         sdw2-1          sdw2-2
Segment 3         sdw3-1          sdw3-2

You would create two distinct host files for use with the gpcheckperf network test:

Example Network Test Host File Contents

hostfile_gpchecknet_ic1    hostfile_gpchecknet_ic2
sdw1-1                     sdw1-2
sdw2-1                     sdw2-2
sdw3-1                     sdw3-2

You would then run gpcheckperf once per subnet. For example (if testing an even number of hosts, run in parallel pairs test mode):
$ gpcheckperf -f hostfile_gpchecknet_ic1 -r N -d /tmp > subnet1.out
$ gpcheckperf -f hostfile_gpchecknet_ic2 -r N -d /tmp > subnet2.out
If you have an odd number of hosts to test, you can run in serial test mode (-r n).

Validating Disk I/O and Memory Bandwidth
To test disk and memory bandwidth performance, run gpcheckperf with the disk and stream test run options (-r ds). The disk test uses the dd command (a standard UNIX utility) to test the sequential throughput performance of a logical disk or file system. The memory test uses the STREAM benchmark program to measure sustainable memory bandwidth. Results are reported in MB per second (MB/s).

To run the disk and stream tests

1.Log in on the master host as the gpadmin user.

2.Source the greenplum_path.sh path file from your Greenplum installation. For example:
$ source /usr/local/greenplum-db/greenplum_path.sh

3.Create a host file named hostfile_gpcheckperf that has one host name per segment host. Do not include the master host. For example:
sdw1
sdw2
sdw3
sdw4

4.Run the gpcheckperf utility using the hostfile_gpcheckperf file you just created. Use the -d option to specify the file systems you want to test on each host (you must have write access to these directories). You will want to test all primary and mirror segment data directory locations. For example:
$ gpcheckperf -f hostfile_gpcheckperf -r ds -D \
-d /data1/primary -d /data2/primary \
-d /data1/mirror -d /data2/mirror

5.The utility may take a while to perform the tests as it is copying very large files between the hosts. When it is finished you will see the summary results for the Disk Write, Disk Read, and Stream tests.


Configuring Localization Settings

Greenplum Database supports localization with two approaches:

•Using the locale features of the operating system to provide locale-specific collation order, number formatting, and so on.
•Providing a number of different character sets defined in the Greenplum Database server, including multiple-byte character sets, to support storing text in all kinds of languages, and providing character set translation between client and server.

Locale support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. Greenplum Database uses the standard ISO C and POSIX locale facilities provided by the server operating system. For additional information refer to the documentation of your operating system.
Locale support is automatically initialized when a Greenplum Database system is initialized. The initialization utility, gpinitsystem, will initialize the Greenplum array with the locale setting of its execution environment by default, so if your system is already set to use the locale that you want in your Greenplum Database system then there is nothing else you need to do.

When you are ready to initialize Greenplum Database and you want to use a different locale (or you are not sure which locale your system is set to), you can instruct gpinitsystem exactly which locale to use by specifying the -n locale option. For example:
$ gpinitsystem -c gp_init_config -n sv_SE

The example above sets the locale to Swedish (sv) as spoken in Sweden (SE). Other possibilities might be en_US (U.S. English) and fr_CA (French Canadian). If more than one character set can be useful for a locale then the specifications look like this: cs_CZ.ISO8859-2. What locales are available under what names on your system depends on what was provided by the operating system vendor and what was installed. On most systems, the command locale -a will provide a list of available locales.
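For example, to check whether a particular locale is installed before initializing the system:
$ locale -a | grep sv_SE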

Occasionally it is useful to mix rules from several locales, for example use English collation rules but Spanish messages. To support that, a set of locale subcategories exist that control only a certain aspect of the localization rules:

•LC_COLLATE — String sort order
•LC_CTYPE — Character classification (What is a letter? Its upper-case equivalent?)
•LC_MESSAGES — Language of messages
•LC_MONETARY — Formatting of currency amounts
•LC_NUMERIC — Formatting of numbers
•LC_TIME — Formatting of dates and times

If you want the system to behave as if it had no locale support, use the special locale C or POSIX.

The nature of some locale categories is that their value has to be fixed for the lifetime of a Greenplum Database system. That is, once gpinitsystem has run, you cannot change them anymore. LC_COLLATE and LC_CTYPE are those categories. They affect the sort order of indexes, so they must be kept fixed, or indexes on text columns will become corrupt. Greenplum Database enforces this by recording the values of LC_COLLATE and LC_CTYPE that are seen by gpinitsystem. The server automatically adopts those two values based on the locale that was chosen at initialization time.

The other locale categories can be changed as desired whenever the server is running by setting the server configuration parameters that have the same name as the locale categories (see the Greenplum Database Administrator Guide for more information on setting server configuration parameters). The defaults that are chosen by gpinitsystem are written into the master and segment postgresql.conf configuration files to serve as defaults when the Greenplum Database system is started. If you delete these assignments from the master and each segment postgresql.conf files then the server will inherit the settings from its execution environment.
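As a minimal sketch (assuming the en_US.utf8 locale is installed on all hosts), you could change the message language after initialization by adding the following line to each master and segment postgresql.conf file:
lc_messages = 'en_US.utf8'
and then reloading the configuration without a restart:
$ gpstop -u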

Note that the locale behavior of the server is determined by the environment variables seen by the server, not by the environment of any client. Therefore, be careful to configure the correct locale settings on each Greenplum Database host (master and segments) before starting the system. A consequence of this is that if client and server are set up in different locales, messages may appear in different languages depending on where they originated.

Inheriting the locale from the execution environment means the following on most operating systems: For a given locale category, say the collation, the following environment variables are consulted in this order until one is found to be set: LC_ALL, LC_COLLATE (the variable corresponding to the respective category), LANG. If none of these environment variables are set then the locale defaults to C.

Some message localization libraries also look at the environment variable LANGUAGE which overrides all other locale settings for the purpose of setting the language of messages. If in doubt, please refer to the documentation of your operating system, in particular the documentation about gettext, for more information.
Native language support (NLS), which enables messages to be translated to the user’s preferred language, is not enabled in Greenplum Database for languages other than English. This is independent of the other locale support.

Locale Behavior
The locale settings influence the following SQL features:

•Sort order in queries using ORDER BY on textual data
•The ability to use indexes with LIKE clauses
•The upper, lower, and initcap functions
•The to_char family of functions

The drawback of using locales other than C or POSIX in Greenplum Database is its performance impact. It slows character handling and prevents ordinary indexes from being used by LIKE. For this reason use locales only if you actually need them.

Troubleshooting Locales
If locale support does not work as expected, check that the locale support in your operating system is correctly configured. To check what locales are installed on your system, you may use the command locale -a if your operating system provides it.

Check that Greenplum Database is actually using the locale that you think it is. LC_COLLATE and LC_CTYPE settings are determined at initialization time and cannot be changed without redoing gpinitsystem. Other locale settings including LC_MESSAGES and LC_MONETARY are initially determined by the operating system environment of the master and/or segment host, but can be changed after initialization by editing the postgresql.conf file of each Greenplum master and segment instance. You can check the active locale settings of the master host using the SHOW command. Note that every host in your Greenplum Database array should be using identical locale settings.
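For example, assuming your environment variables point at the master instance, you can display the active settings with:
$ psql -c 'SHOW lc_collate;'
$ psql -c 'SHOW lc_messages;'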

Initializing a Greenplum Database System

Because Greenplum Database is distributed, the process for initializing a Greenplum Database management system (DBMS) involves initializing several individual PostgreSQL database instances (called segment instances in Greenplum).

Each database instance (the master and all segments) must be initialized across all of the hosts in the system in such a way that they can all work together as a unified DBMS. Greenplum provides its own version of initdb called gpinitsystem, which takes care of initializing the database on the master and on each segment instance, and starting each instance in the correct order.

After the Greenplum Database system has been initialized and started, you can then create and manage databases as you would in a regular PostgreSQL DBMS by connecting to the Greenplum master.

Initializing Greenplum Database

These are the high-level tasks for initializing Greenplum Database:

1.Make sure you have completed all of the installation tasks described in “Configuring Your Systems and Installing Greenplum”.

2.Create a host file that contains the host addresses of your segments. 

3.Create your Greenplum Database system configuration file. 

4.By default, Greenplum Database will be initialized using the locale of the master host system. Make sure this is the correct locale you want to use, as some locale options cannot be changed after initialization. 

5.Run the Greenplum Database initialization utility on the master host. 

Creating the Initialization Host File
The gpinitsystem utility requires a host file that contains the list of addresses for each segment host. The initialization utility determines the number of segment instances per host by the number of host addresses listed per host times the number of data directory locations specified in the gpinitsystem_config file.
This file should only contain segment host addresses (not the master or standby master). For segment machines with more than one network interface, this file should list the host address names for each interface — one per line.

To create the initialization host file

1.Log in as gpadmin.
$ su - gpadmin

2.Create a file named hostfile_gpinitsystem. In this file add the host address name(s) of your segment host interfaces, one name per line, no extra lines or spaces. For example, if you have four segment hosts with two network interfaces each:
sdw1-1
sdw1-2
sdw2-1
sdw2-2
sdw3-1
sdw3-2
sdw4-1
sdw4-2

3.Save and close the file.
Note: If you are not sure of the host names and/or interface address names used by your machines, look in the /etc/hosts file.

Creating the Greenplum Database Configuration File
Your Greenplum Database configuration file tells the gpinitsystem utility how you want to configure your Greenplum Database system. An example configuration file can be found in $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config.

To create a gpinitsystem_config file

1.Log in as gpadmin.
$ su - gpadmin

2.Make a copy of the gpinitsystem_config file to use as a starting point. For example:
$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config

3.Open the file you just copied in a text editor.
Set all of the required parameters according to your environment. A Greenplum Database system must contain a master instance and at least two segment instances (even if setting up a single node system).
The DATA_DIRECTORY parameter is what determines how many segments per host will be created. If your segment hosts have multiple network interfaces, and you used their interface address names in your host file, the number of segments will be evenly spread over the number of available interfaces.
Here is an example of the required parameters in the gpinitsystem_config file:
ARRAY_NAME="EMC Greenplum DW"
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENT=8
ENCODING=UNICODE

4.(optional) If you want to deploy mirror segments, uncomment and set the mirroring parameters according to your environment. Here is an example of the optional mirror parameters in the gpinitsystem_config file:
MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
MIRROR_REPLICATION_PORT_BASE=51000
declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)
Note: You can initialize your Greenplum system with primary segments only and deploy mirrors later using the gpaddmirrors utility.

5.Save and close the file.

Running the Initialization Utility
The gpinitsystem utility will create a Greenplum Database system using the values defined in the configuration file. 
To run the initialization utility

1.Run the following command referencing the path and file name of your initialization configuration file (gpinitsystem_config) and host file (hostfile_gpinitsystem). For example:
$ cd ~
$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
For a fully redundant system (with a standby master and a spread mirror configuration) include the -s and -S options. For example:
$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem -s standby_master_hostname -S

2.The utility will verify your setup information and make sure it can connect to each host and access the data directories specified in your configuration. If all of the pre-checks are successful, the utility will prompt you to confirm your configuration. For example:
=> Continue with Greenplum creation? Yy/Nn

3.Press y to start the initialization.

4.The utility will then begin setup and initialization of the master instance and each segment instance in the system. Each segment instance is set up in parallel. Depending on the number of segments, this process can take a while.

5.At the end of a successful setup, the utility will start your Greenplum Database system. You should see:
=> Greenplum Database instance successfully created.

Troubleshooting Initialization Problems
If the utility encounters any errors while setting up an instance, the entire process will fail, and could possibly leave you with a partially created system. Refer to the error messages and logs to determine the cause of the failure and where in the process the failure occurred. Log files are created in ~/gpAdminLogs.

Depending on when the error occurred in the process, you may need to clean up and then try the gpinitsystem utility again. For example, if some segment instances were created and some failed, you may need to stop postgres processes and remove any utility-created data directories from your data storage area(s). A backout script is created to help with this cleanup if necessary.

Using the Backout Script
If the gpinitsystem utility fails, it will create the following backout script if it has left your system in a partially installed state:
~/gpAdminLogs/backout_gpinitsystem_<user>_<timestamp>

You can use this script to clean up a partially created Greenplum Database system. This backout script will remove any utility-created data directories, postgres processes, and log files. After correcting the error that caused gpinitsystem to fail and running the backout script, you should be ready to retry initializing your Greenplum Database array.

The following example shows how to run the backout script:
$ sh backout_gpinitsystem_gpadmin_20071031_121053

Setting Greenplum Environment Variables
You must configure your environment on the Greenplum Database master (and standby master). A greenplum_path.sh file is provided in your $GPHOME directory with environment variable settings for Greenplum Database. You can source this file in the gpadmin user’s startup shell profile (such as .bashrc).

The Greenplum Database management utilities also require that the MASTER_DATA_DIRECTORY environment variable be set. This should point to the directory created by the gpinitsystem utility in the master data directory location.

To set up your user environment for Greenplum

1.Make sure you are logged in as gpadmin:
$ su - gpadmin

2.Open your profile file (such as .bashrc) in a text editor. For example:
$ vi ~/.bashrc

3.Add lines to this file to source the greenplum_path.sh file and set the MASTER_DATA_DIRECTORY environment variable. For example:
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1

4.(optional) You may also want to set some client session environment variables such as PGPORT, PGUSER and PGDATABASE for convenience. For example:
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=default_login_database_name

5.Save and close the file.

6.After editing the profile file, source it to make the changes active. For example:
$ source ~/.bashrc

7.If you have a standby master host, copy your environment file to the standby master as well. For example:
$ cd ~
$ scp .bashrc standby_hostname:`pwd`
Note: The .bashrc file should not produce any output. If you wish to have a message display to users upon logging in, use the .profile file instead.

Installing Greenplum SNE on Ubuntu Linux

posted Apr 23, 2013, 12:42 PM by Sachchida Ojha

Before installing Greenplum 4 on Ubuntu, you have to apply some additional tricks.

1. Convince the Greenplum installer that it is on a RedHat/CentOS system:
echo "Trick for install Greenplum 4" > /etc/redhat-release

2. Install the libnuma library (if not present):

apt-get install libnuma1

3. Uncomment the line containing "session required pam_limits.so" in /etc/pam.d/su.

4. Use the following enhanced version of the fix-libs.sh script:

#!/bin/bash

if [ -z "$GPHOME"]||[ ! -d $GPHOME/lib ]; then
echo "Missing or wrong GPHOME environment variable";
exit 255
fi

cd $GPHOME/lib

# libraries shipped with Greenplum SNE
gplibs="$(find -maxdepth 1 -type f | cut -f 2 -d /)"

# libraries with same abi installed via dpkg
deblibs="$(dpkg -S $gplibs 2> /dev/null | cut -f 2 -d ' ')"

# we remove the greenplum one to avoid "no version information available" errors
for lib in $deblibs; do
ver=$(basename $lib)
rm -fv $ver
while [ "$ver" = "${ver#.so}" ] && [ "$ver" != "${ver%.so*}" ]; do
ver=${ver%.*}
rm -fv $ver
done
done

After the installation you can remove the file /etc/redhat-release, if you want. I've tested this procedure on a freshly installed Ubuntu 10.10 server 64 bit.

Officially, Greenplum Database Single Node Edition (SNE) is only installable on Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES), but while surfing the web I have seen many requests on how to install it on Debian/Ubuntu. Here I’m trying to give you some advice.

Before installing Greenplum Database CE, you need to adjust the following OS configuration parameters:
Set the following parameters in the `/etc/sysctl.conf` file:
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 64000 100 512
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_max_syn_backlog=4096
net.core.netdev_max_backlog=10000
vm.overcommit_memory=2

To activate such parameters you can either run `sudo sysctl -p` or reboot the system.

Set the following parameters in the `/etc/security/limits.conf` file:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

In the file /etc/hosts, comment out the line beginning with `::1`, as it could confuse the database when it resolves the hostname for localhost. Also make sure that both localhost and your hostname resolve to a local address.
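For example, after the change a minimal /etc/hosts might look like this (the hostname shown is illustrative):
127.0.0.1   localhost
127.0.1.1   ubuntu-sne
#::1        localhost ip6-localhost ip6-loopback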

You have now finished preparing the environment for your Greenplum Database SNE. The next step is to create the user account designated to be the administrator of your installation; usually this user is called gpadmin.

sudo adduser --gecos "Greenplum Administrator" gpadmin

At this point you have to download or copy the installer file to the system. You should choose the RHEL installer for your architecture. I have an x86_64 system, so from now on I will use it as the example.

To start the installation run the following commands (you need the unzip program installed):

unzip greenplum<versionx>.zip

sudo bash greenplum<version>.bin

Follow the on screen instructions. Accept the license and choose the installation path. The default one is fine. The installer will create a `greenplum-db` symbolic link one directory level above your chosen installation directory. The symbolic link is used to facilitate patch maintenance and upgrades between versions. From now on the install location will be referred to as `$GPHOME`.

Change the ownership of the installation so that it is owned by the gpadmin user and group.

sudo chown -R gpadmin:gpadmin $GPHOME

Now is the time to choose the data directory location; to explain how to choose, nothing is better than quoting the official quick-start guide.

Every Greenplum Database CE instance has a designated storage area on disk that is called the data directory location. This is the file system location where the database data is stored. In the Greenplum Database CE, you initialize a Greenplum Database CE master instance and two or more segment instances on the same system, each requiring a data directory location. These directories should have sufficient disk space for your data and be owned by the gpadmin user.

Remember that the data directories of the segment instances are where the user data resides, so they must have enough disk space to accommodate your planned data capacity. For the master instance, only the system catalog tables and system  metadata are stored in the master data directory.

For this guide we will use the default layout, with the master (`/gpmaster`) and two segments (`/gpdata1` and `/gpdata2`). 

sudo mkdir /gpmaster /gpdata1 /gpdata2

sudo chown gpadmin:gpadmin /gpmaster /gpdata1 /gpdata2

A `greenplum_path.sh` file is provided in your `$GPHOME` directory with environment variable settings for Greenplum Database SNE. You should source this in the gpadmin user’s startup shell profile (such as `.bashrc`) adding a line like the following:

source /usr/local/greenplum-db/greenplum_path.sh

Before continuing, we should do some magic to avoid failures when running Ubuntu programs against the libraries shipped by Greenplum CE.

#!/bin/sh
cd $GPHOME/lib

# libraries shipped with Greenplum CE
gplibs="$(find -maxdepth 1 -type f | cut -f 2 -d /)"

# libraries with same abi installed via dpkg
deblibs="$(dpkg -S $gplibs 2> /dev/null | cut -f 2 -d ' ')"

# we remove the greenplum one to avoid "no version information available" errors
for lib in $deblibs; do
rm -f $(basename $lib)
done

It’s now time to initialize the database system, all the following steps are to be executed as gpadmin user.

su - gpadmin

cp $GPHOME/docs/cli_help/single_hostlist_example ./single_hostlist

cp $GPHOME/docs/cli_help/gp_init_singlenode_example ./gp_init_singlenode

If you do not want to use the default configuration, data directory locations, ports, or other configuration options, edit the `gp_init_singlenode` file and enter your configuration settings.

Run the gpssh-exkeys utility to exchange ssh keys for the local host:

gpssh-exkeys -h 127.0.0.1 -h localhost

Run the following command to initialize the database:

gpinitsystem -c gp_init_singlenode

The utility verifies your setup information and makes sure that the data directories specified in the `gp_init_singlenode` configuration file are accessible. If all of the verification checks are successful, the utility prompts you to confirm the configuration before creating the system.

At the end of a successful setup, the utility starts your system. You should see:

Greenplum Database instance successfully created.

The management utilities require that you set the `MASTER_DATA_DIRECTORY` environment variable. This should specify the directory created by the gpinitsystem utility in the master data directory location.
echo "export MASTER_DATA_DIRECTORY=/gpmaster/gpsne-1" >> ~/.bashrc
source ~/.bashrc

Now you can connect to the master database using the psql client program:

psql postgres
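From here you can create and use databases as with any PostgreSQL system, for example (the database name is illustrative):
createdb mytestdb
psql mytestdb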

I would remark that a system installed following this guide is to be considered an **evaluation platform only**, and is not meant for production installations of Greenplum Database.

Email link after you register for Greenplum CE download

posted Apr 23, 2013, 12:17 PM by Sachchida Ojha   [ updated Nov 12, 2013, 4:21 PM ]

This email contains a free download of Greenplum Database software. If you have any technical support questions or would like to engage with the Greenplum community, join the Forums.

Greenplum Database 4.2.2

Connectivity Tools
  • Data Loaders: Greenplum's bulk-loading tools (gpload and gpfdist)
  • Connectivity Tools: ODBC and JDBC drivers and development libraries and headers
  • Client tools: psql and gpmapreduce

Documentation 

Installing Greenplum Community Edition (CE)

posted Apr 23, 2013, 12:15 PM by Sachchida Ojha   [ updated May 23, 2015, 5:03 PM ]

Here are the step-by-step instructions on how to install the Greenplum Database Community Edition (CE) software and get your single-node Greenplum Database system up and running.

Before installing, we have to change the following OS configuration parameters:
----------------
LINUX
---------------
In /etc/sysctl.conf: 

sysctl.kernel.shmmax = 500000000
sysctl.kernel.shmmni = 4096
sysctl.kernel.shmall = 4000000000
sysctl.kernel.sem = 250 512000 100 2048
sysctl.kernel.sysrq = 1
sysctl.kernel.core_uses_pid = 1
sysctl.kernel.msgmnb = 65536
sysctl.kernel.msgmax = 65536
sysctl.kernel.msgmni = 2048
sysctl.net.ipv4.tcp_syncookies = 1
sysctl.net.ipv4.ip_forward = 0
sysctl.net.ipv4.conf.default.accept_source_route = 0
sysctl.net.ipv4.tcp_tw_recycle = 1
sysctl.net.ipv4.tcp_max_syn_backlog = 4096
sysctl.net.ipv4.conf.all.arp_filter = 1
sysctl.net.ipv4.ip_local_port_range = 1025 65535
sysctl.net.core.netdev_max_backlog = 10000
sysctl.vm.overcommit_memory = 2

For RHEL version 6.x platforms, do not include the sysctl. prefix in the above parameters, as follows:

xfs_mount_options = rw,noatime,inode64,allocsize=16m
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2


In /etc/security/limits.conf: 

* soft nofile 65536  
* hard nofile 65536  
* soft nproc 131072 
* hard nproc 131072 

OPTIONAL
==================================================================================================================
In /etc/hosts:
Include the host names and all interface address names for every machine participating in your Greenplum Database system.
File system recommendations: XFS is the preferred file system on Linux platforms for data storage. Greenplum recommends the following xfs mount options:
rw,noatime,inode64,allocsize=16m
Also, each disk device file should have a read-ahead value of 16384.  For example, to verify the read-ahead value of a disk device:
# /sbin/blockdev --getra /dev/sdb
=================================================================================================================


Add the Greenplum database Admin account:
# useradd gpadmin
# passwd gpadmin
# New password: password
# Retype new password: password
You cannot run the Greenplum Database SNE server as root. Use this newly created user account whenever you work with Greenplum.


Installing the Greenplum Database Community edition (CE):

1. Download or copy the Greenplum Database CE from www.greenplum.com/

2. Unzip the installer file:
# unzip greenplum-db-4.2.2.4-build-1-CE-RHEL5-x86_64.zip

3. Launch the installer using bash:
# /bin/bash greenplum-db-4.2.2.4-build-1-CE-RHEL5-x86_64.bin

4. The installer prompts you to provide an installation path. Press ENTER to accept the default install path (/usr/local/greenplum-db-4.2.2.4), or enter a new path.

5. The installer installs the Greenplum Database CE software and creates a greenplum-db symbolic link one directory level above your version-specific Greenplum Database 

6. Change the ownership of your Greenplum Database SNE installation so that it is owned by the gpadmin user:
# chown -R gpadmin /usr/local/greenplum-db-4.2.2.4
# chgrp -R gpadmin /usr/local/greenplum-db-4.2.2.4

7. Preparing the Data Directory Locations
Every Greenplum Database SNE instance has a designated storage area on disk that is called the data directory location.

8. Create or choose a directory that will serve as your master data storage area
User data is not stored in this location; instead, it stores metadata (data about the data). The global system catalog resides here.
# mkdir /gpmaster
# chown gpadmin /gpmaster
# chgrp gpadmin /gpmaster

9. Create or choose the directories that will serve as your segment storage areas:
This is the file system location where the database data is stored.
# mkdir /gpdata1
# chown gpadmin /gpdata1
# chgrp gpadmin /gpdata1
# mkdir /gpdata2
# chown gpadmin /gpdata2
# chgrp gpadmin /gpdata2

10. Configuring Greenplum Database SNE / CE Environment Variables:
$ vi .bashrc
Then add the following entry:
source /usr/local/greenplum-db/greenplum_path.sh
Now source it:
$ source ~/.bashrc

11. Now let’s initialize the Greenplum database:
Greenplum provides a utility called gpinitsystem which initializes a Greenplum Database system. After the Greenplum Database SNE system is initialized and started, you can then create and manage databases by connecting to the Greenplum master database process.

12. Log in to the system as the gpadmin user:
# su - gpadmin

13. Copy the hostlist_singlenode example file from your Greenplum Database installation to the current directory:
$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/hostlist_singlenode .

14. Copy the gpinitsystem_singlenode example file from your Greenplum Database SNE installation to the current directory:
$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_singlenode .

15. Edit the gpinitsystem_singlenode file and enter your configuration settings, or leave them at their defaults. Some default parameters in this file are:
ARRAY_NAME="GPDB SNE"
MACHINE_LIST_FILE=./hostlist_singlenode
SEG_PREFIX=gpsne
PORT_BASE=50000
declare -a DATA_DIRECTORY=(/disk1/gpdata1 /disk2/gpdata2)
MASTER_HOSTNAME=sachi
MASTER_DIRECTORY=/home/gpmaster
MASTER_PORT=5432

16. Run the gpssh-exkeys utility to exchange ssh keys for the local host:
$ gpssh-exkeys -h sachi

[gpadmin@dbaref ~]$  gpssh-exkeys -h dbaref
[STEP 1 of 5] create local ID and authorize on local host
[ERROR dbaref] authentication check failed:
     ssh: connect to host dbaref port 22: Connection refused
[ERROR] cannot establish ssh access into the local host

To overcome this issue:
1. Disable the firewall.
2. Start the sshd service if it is not already running.

[root@dbaref Downloads]# /sbin/service sshd status
Redirecting to /bin/systemctl  status sshd.service
sshd.service - OpenSSH server daemon.
      Loaded: loaded (/lib/systemd/system/sshd.service; disabled)
      Active: inactive (dead)
      CGroup: name=systemd:/system/sshd.service
[root@dbaref Downloads]# /sbin/service sshd start
Redirecting to /bin/systemctl  start sshd.service
[root@dbaref Downloads]# /sbin/service sshd status
Redirecting to /bin/systemctl  status sshd.service
sshd.service - OpenSSH server daemon.
      Loaded: loaded (/lib/systemd/system/sshd.service; disabled)
      Active: active (running) since Mon, 20 May 2013 08:45:07 -0400; 5s ago
    Main PID: 2764 (sshd)
      CGroup: name=systemd:/system/sshd.service
          └ 2764 /usr/sbin/sshd -D
[root@dbaref Downloads]#

Now su to gpadmin and run it again

[gpadmin@dbaref ~]$  gpssh-exkeys -h dbaref
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] authorize current user on remote hosts

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts

[INFO] completed successfully
[gpadmin@dbaref ~]$ 
======================================================================
Note: I have struggled with this many times. Here is one more option.
[root@sachi ~]# mkdir /etc/ssh/gpadmin
[root@sachi ~]# cp /home/gpadmin/.ssh/authorized_keys /etc/ssh/gpadmin/
[root@sachi ~]# chown -R gpadmin:gpadmin /etc/ssh/gpadmin
[root@sachi ~]# chmod 755 /etc/ssh/gpadmin
[root@sachi ~]# chmod 644 /etc/ssh/gpadmin/authorized_keys
[root@sachi ~]# vi /etc/ssh/sshd_config 
#RSAAuthentication yes
#PubkeyAuthentication yes
# changed .ssh/authorized_keys to /etc/ssh/gpadmin/authorized_keys <<<<<<<
AuthorizedKeysFile      /etc/ssh/gpadmin/authorized_keys
#AuthorizedKeysCommand none
#AuthorizedKeysCommandRunAs nobody

Reboot the server
======================================================================

17. Initialize Greenplum Database SNE:
$ gpinitsystem -c gpinitsystem_singlenode

[gpadmin@sachi ~]$ gpinitsystem -c gpinitsystem_singlenode
20130423:19:53:54:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20130423:19:53:54:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Reading Greenplum configuration file gpinitsystem_singlenode
20130423:19:53:54:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Locale has not been set in gpinitsystem_singlenode, will set to default value
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Locale set to en_US.utf8
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Detected a single host GPDB array build, reducing value of BATCH_DEFAULT from 60 to 4
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking configuration parameters, Completed
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
.
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Configuring build for standard array
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Building primary segment instance array, please wait...
..
20130423:19:53:56:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking Master host
20130423:19:53:56:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking new segment hosts, please wait...
..
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking new segment hosts, Completed
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:---------------------------------------
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master Configuration
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:---------------------------------------
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master instance name       = GPDB SINGLENODE
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master hostname            = sachi
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master port                = 5432
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master instance dir        = /home/gpmaster/gpsne-1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master LOCALE              = en_US.utf8
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum segment prefix   = gpsne
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master Database            =
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master connections         = 250
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master buffers             = 128000kB
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Segment connections        = 750
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Segment buffers            = 128000kB
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checkpoint segments        = 8
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Encoding                   = UNICODE
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Postgres param file        = Off
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db/./bin/initdb
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db/./lib
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Ulimit check               = Passed
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [1]      = ::1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [2]      = 172.16.72.1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [3]      = 192.168.122.1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [4]      = 192.168.133.1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [5]      = 192.168.1.6
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [6]      = fe80::250:56ff:fec0:1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [7]      = fe80::250:56ff:fec0:8
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [8]      = fe80::8e89:a5ff:fe80:f8e6
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Standby Master             = Not Configured
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Primary segment #          = 2
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total Database segments    = 2
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Trusted shell              = ssh
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Number segment hosts       = 1
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Mirroring config           = OFF
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:----------------------------------------
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:----------------------------------------
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-sachi     /disk1/gpdata1/gpsne0     40000     2     0
20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-sachi     /disk2/gpdata2/gpsne1     40001     3     1
Continue with Greenplum creation Yy/Nn>
Y
20130423:19:54:13:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Building the Master instance database, please wait...
20130423:19:55:19:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Starting the Master in admin mode
20130423:19:55:39:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20130423:19:55:39:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20130423:19:55:39:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
...........................................................................................................................................
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:------------------------------------------------
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Parallel process exit status
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:------------------------------------------------
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total processes marked as completed           = 2
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total processes marked as killed              = 0
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total processes marked as failed              = 0
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:------------------------------------------------
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Deleting distributed backout files
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Removing back out file
20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-No errors generated from parallel processes
20130423:19:58:00:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Starting gpstop with args: -a -i -m -d /home/gpmaster/gpsne-1
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Gathering information and validating the environment...
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Obtaining Segment details from master...
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.2.2.4 build 1 Community Edition'
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-There are 0 connections to the database
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='immediate'
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Master host=sachi
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=immediate
20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Master segment instance directory=/home/gpmaster/gpsne-1
20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Starting gpstart with args: -a -d /home/gpmaster/gpsne-1
20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Gathering information and validating the environment...
20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.2.2.4 build 1 Community Edition'
20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Greenplum Catalog Version: '201109210'
20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Starting Master instance in admin mode
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining Segment details from master...
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Setting new master era
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Master Started...
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Checking for filespace consistency
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining current filespace entries used by TRANSACTION_FILES
20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-TRANSACTION_FILES OIDs are consistent for pg_system filespace
20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-TRANSACTION_FILES entries are consistent for pg_system filespace
20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-Checking for filespace consistency
20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES
20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-TEMPORARY_FILES OIDs are consistent for pg_system filespace
20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-TEMPORARY_FILES entries are consistent for pg_system filespace
20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-Shutting down master
20130423:19:58:04:017070 gpstart:sachi:gpadmin-[INFO]:-No standby master configured.  skipping...
20130423:19:58:04:017070 gpstart:sachi:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
..
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-Process results...
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-----------------------------------------------------
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-----------------------------------------------------
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-----------------------------------------------------
20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-Starting Master instance sachi directory /home/gpmaster/gpsne-1
20130423:19:58:08:017070 gpstart:sachi:gpadmin-[INFO]:-Command pg_ctl reports Master sachi instance active
20130423:19:58:08:017070 gpstart:sachi:gpadmin-[INFO]:-Database successfully started
20130423:19:58:08:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20130423:19:58:08:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Loading gp_toolkit...
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-*******************************************************
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-were generated during the array creation
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Please review contents of log file
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20130423.log
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To determine level of criticality
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-*******************************************************
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum Database instance successfully created
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-------------------------------------------------------
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To complete the environment configuration, please
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/home/gpmaster/gpsne-1"
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-   or, use -d /home/gpmaster/gpsne-1 option for the Greenplum scripts
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-   Example gpstate -d /home/gpmaster/gpsne-1
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20130423.log
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Review options for gpinitstandby
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-------------------------------------------------------
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-The Master /home/gpmaster/gpsne-1/pg_hba.conf post gpinitsystem
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-new array must be explicitly added to this file
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-------------------------------------------------------
[gpadmin@sachi ~]$ pwd
/home/gpadmin
[gpadmin@sachi ~]$

18. After the Greenplum Database SNE system is initialized and started, you can connect to the Greenplum master database process using the psql client program:
$ createdb mydb
$ psql mydb
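
At this point MASTER_DATA_DIRECTORY is not yet exported (that happens in step 19), so a quick health check with gpstate needs the -d option, as suggested in the gpinitsystem output above:
$ gpstate -d /home/gpmaster/gpsne-1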

19. Now export the master data directory:
$ vi .bashrc
Then add the following entry (use the master instance directory reported by gpinitsystem, here /home/gpmaster/gpsne-1):
export MASTER_DATA_DIRECTORY=/home/gpmaster/gpsne-1
Now source it:
$ source ~/.bashrc
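
After editing, the relevant lines in the gpadmin ~/.bashrc should look roughly like this (a sketch matching the instructions printed by gpinitsystem above):
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpmaster/gpsne-1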

20. Now you can perform any database operations (DDL and DML) using the psql program.
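
For example, a quick sanity check of DDL and DML from psql might look like this (the table and data are purely illustrative):
$ psql mydb
mydb=# CREATE TABLE test_tbl (id int, name text) DISTRIBUTED BY (id);
mydb=# INSERT INTO test_tbl VALUES (1, 'greenplum'), (2, 'sne');
mydb=# SELECT * FROM test_tbl;
mydb=# DROP TABLE test_tbl;
mydb=# \q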

Uninstall Greenplum:


To uninstall, run the following commands (point -d at your master instance directory, and remove the versioned installation directory that the /usr/local/greenplum-db symlink points to):
$ gpdeletesystem -d /home/gpmaster/gpsne-1
$ rm -rf /usr/local/greenplum-db-4.2.2.4
$ rm /usr/local/greenplum-db
Optionally, you can also remove the environment variables and restore the default OS parameter settings.
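
For example, removing the environment variable for the current session is simply (adjust to whatever you added to ~/.bashrc in step 19):
$ unset MASTER_DATA_DIRECTORY
Then delete the corresponding export and greenplum_path.sh lines from the gpadmin ~/.bashrc.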



Update: Installing Greenplum 4.3.5.1


[gpadmin@localhost gpconfigs]$ pwd
/usr/local/greenplum-db/docs/cli_help/gpconfigs
[gpadmin@localhost gpconfigs]$ cp hostlist_singlenode /usr/local/greenplum-db
[gpadmin@localhost gpconfigs]$

[gpadmin@localhost greenplum-db]$ ls
bin  demo  docs  etc  ext  GPDB-LICENSE.txt  gpinitsystem_singlenode  greenplum_path.sh  hostlist_singlenode  hosts.seg  include  lib  LICENSE.thirdparty  sbin  share
[gpadmin@localhost greenplum-db]$ cat hostlist_singlenode
localhost

[gpadmin@localhost greenplum-db]$ cat gpinitsystem_singlenode
# FILE NAME: gpinitsystem_singlenode

# A configuration file is needed by the gpinitsystem utility.
# This sample file initializes a Greenplum Database Single Node
# Edition (SNE) system with one master and  two segment instances
# on the local host. This file is referenced when you run gpinitsystem.

################################################
# REQUIRED PARAMETERS
################################################

# A name for the array you are configuring. You can use any name you
# like. Enclose the name in quotes if the name contains spaces.

ARRAY_NAME="GPDB SINGLENODE"


# This specifies the file that contains the list of segment host names
# that comprise the Greenplum system. For a single-node system, this
# file contains the local OS-configured hostname (as output by the
# hostname command). If the file does not reside in the same
# directory where the gpinitsystem utility is executed, specify
# the absolute path to the file.

MACHINE_LIST_FILE=./hostlist_singlenode


# This specifies a prefix that will be used to name the data directories
# of the master and segment instances. The naming convention for data
# directories in a Greenplum Database system is SEG_PREFIX<number>
# where <number> starts with 0 for segment instances and the master
# is always -1. So for example, if you choose the prefix gpsne, your
# master instance data directory would be named gpsne-1, and the segment
# instances would be named gpsne0, gpsne1, gpsne2, gpsne3, and so on.

SEG_PREFIX=gpsne


# Base port number on which primary segment instances will be
# started on a segment host. The base port number will be
# incremented by one for each segment instance started on a host.

PORT_BASE=40000


# This specifies the data storage location(s) where the script will
# create the primary segment data directories. The script creates a
# unique data directory for each segment instance. If you want multiple
# segment instances per host, list a data storage area for each primary
# segment you want created. The recommended number is one primary segment
# per CPU. It is OK to list the same data storage area multiple times
# if you want your data directories created in the same location. The
# number of data directory locations specified will determine the number
# of primary segment instances created per host.
# You must make sure that the user who runs gpinitsystem (for example,
# the gpadmin user) has permissions to write to these directories. You
# may want to create these directories on the segment hosts before running
# gpinitsystem and chown them to the appropriate user.

declare -a DATA_DIRECTORY=(/gpdata1 /gpdata2)


# The OS-configured hostname of the Greenplum Database master instance.

MASTER_HOSTNAME=localhost


# The location where the data directory will be created on the 
# Greenplum master host.
# You must make sure that the user who runs gpinitsystem
# has permissions to write to this directory. You may want to
# create this directory on the master host before running
# gpinitsystem and chown it to the appropriate user.

MASTER_DIRECTORY=/gpmaster


# The port number for the master instance. This is the port number
# that users and client connections will use when accessing the
# Greenplum Database system.

MASTER_PORT=5432


# The shell the gpinitsystem script uses to execute
# commands on remote hosts. Allowed value is ssh. You must set up
# your trusted host environment before running the gpinitsystem
# script. You can use gpssh-exkeys to do this.

TRUSTED_SHELL=ssh


# Maximum distance between automatic write ahead log (WAL)
# checkpoints, in log file segments (each segment is normally 16
# megabytes). This will set the checkpoint_segments parameter
# in the postgresql.conf file for each segment instance in the
# Greenplum Database system.

CHECK_POINT_SEGMENTS=8


# The character set encoding to use. Greenplum supports the
# same character sets as PostgreSQL. See 'Character Set Support'
# in the PostgreSQL documentation for allowed character sets.
# Should correspond to the OS locale specified with the
# gpinitsystem -n option.

ENCODING=UNICODE


################################################
# OPTIONAL PARAMETERS
################################################

# Optional. Uncomment to create a database of this name after the
# system is initialized. You can always create a database later using
# the CREATE DATABASE command or the createdb script.

#DATABASE_NAME=warehouse
[gpadmin@localhost greenplum-db]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#192.168.1.253 sachi sachi.localdomain
[gpadmin@localhost greenplum-db]$




Note: the first attempt below mistakenly passes the host list file to -c instead of the configuration file, so gpinitsystem exits with a FATAL error; the second attempt uses the correct gpinitsystem_singlenode file.

[gpadmin@localhost greenplum-db]$ gpinitsystem -c hostlist_singlenode
20150523:19:53:01:006471 gpinitsystem:localhost:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20150523:19:53:01:006471 gpinitsystem:localhost:gpadmin-[INFO]:-Reading Greenplum configuration file hostlist_singlenode
hostlist_singlenode: line 1: localhost: command not found
20150523:19:53:01:gpinitsystem:localhost:gpadmin-[FATAL]:-PORT_BASE not specified in hostlist_singlenode file, is this the correct instance configuration file. Script Exiting!
[gpadmin@localhost greenplum-db]$ gpinitsystem -c gpinitsystem_singlenode
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Reading Greenplum configuration file gpinitsystem_singlenode
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Locale has not been set in gpinitsystem_singlenode, will set to default value
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Locale set to en_US.utf8
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[WARN]:-Master hostname localhost does not match hostname output
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checking to see if localhost can be resolved on this host
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Can resolve localhost to this host
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Detected a single host GPDB array build, reducing value of BATCH_DEFAULT from 60 to 4
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checking configuration parameters, Completed
20150523:19:53:35:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
.
20150523:19:53:36:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Configuring build for standard array
20150523:19:53:36:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20150523:19:53:36:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Building primary segment instance array, please wait...
..
20150523:19:53:37:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checking Master host
20150523:19:53:37:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checking new segment hosts, please wait...
.20150523:19:53:39:007016 gpinitsystem:localhost:gpadmin-[WARN]:-----------------------------------------------------------
20150523:19:53:39:007016 gpinitsystem:localhost:gpadmin-[WARN]:-Host localhost is assigned as localhost in /etc/hosts
20150523:19:53:39:007016 gpinitsystem:localhost:gpadmin-[WARN]:-This will cause segment->master communication failures
20150523:19:53:39:007016 gpinitsystem:localhost:gpadmin-[WARN]:-Remove localhost from local host line in /etc/hosts
20150523:19:53:39:007016 gpinitsystem:localhost:gpadmin-[WARN]:-----------------------------------------------------------
.20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[WARN]:-----------------------------------------------------------
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[WARN]:-Host localhost is assigned as localhost in /etc/hosts
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[WARN]:-This will cause segment->master communication failures
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[WARN]:-Remove localhost from local host line in /etc/hosts
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[WARN]:-----------------------------------------------------------

20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checking new segment hosts, Completed
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:---------------------------------------
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master Configuration
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:---------------------------------------
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master instance name       = GPDB SINGLENODE
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master hostname            = localhost
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master port                = 5432
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master instance dir        = /gpmaster/gpsne-1
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master LOCALE              = en_US.utf8
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Greenplum segment prefix   = gpsne
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master Database            =
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master connections         = 250
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master buffers             = 128000kB
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Segment connections        = 750
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Segment buffers            = 128000kB
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Checkpoint segments        = 8
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Encoding                   = UNICODE
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Postgres param file        = Off
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db/./bin/initdb
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db/./lib
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Ulimit check               = Passed
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master IP address [1]      = ::1
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master IP address [2]      = 192.168.1.253
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Master IP address [3]      = fe80::8e89:a5ff:fe80:f8e6
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Standby Master             = Not Configured
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Primary segment #          = 2
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Total Database segments    = 2
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Trusted shell              = ssh
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Number segment hosts       = 1
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Mirroring config           = OFF
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-localhost     /gpdata1/gpsne0     40000     2     0
20150523:19:53:40:007016 gpinitsystem:localhost:gpadmin-[INFO]:-localhost     /gpdata2/gpsne1     40001     3     1
Continue with Greenplum creation Yy/Nn>
y
20150523:19:53:48:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Building the Master instance database, please wait...
20150523:19:54:57:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Starting the Master in admin mode
The authenticity of host 'localhost.localdomain (127.0.0.1)' can't be established.
ECDSA key fingerprint is 6e:b1:81:19:b5:e0:6c:6d:3b:b8:60:3d:e3:32:ab:15.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost.localdomain' (ECDSA) to the list of known hosts.
20150523:19:55:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20150523:19:55:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20150523:19:55:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
...........................................................................................................
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:------------------------------------------------
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Parallel process exit status
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:------------------------------------------------
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Total processes marked as completed           = 2
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Total processes marked as killed              = 0
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Total processes marked as failed              = 0
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:------------------------------------------------
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Deleting distributed backout files
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Removing back out file
20150523:19:57:14:007016 gpinitsystem:localhost:gpadmin-[INFO]:-No errors generated from parallel processes
20150523:19:57:15:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Starting gpstop with args: -a -i -m -d /gpmaster/gpsne-1
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Gathering information and validating the environment...
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Obtaining Segment details from master...
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-There are 0 connections to the database
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='immediate'
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Master host=localhost.localdomain
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=immediate
20150523:19:57:16:024713 gpstop:localhost:gpadmin-[INFO]:-Master segment instance directory=/gpmaster/gpsne-1
20150523:19:57:17:024713 gpstop:localhost:gpadmin-[INFO]:-Attempting forceful termination of any leftover master process
20150523:19:57:17:024713 gpstop:localhost:gpadmin-[INFO]:-Terminating processes for segment /gpmaster/gpsne-1
20150523:19:57:18:024800 gpstart:localhost:gpadmin-[INFO]:-Starting gpstart with args: -a -d /gpmaster/gpsne-1
20150523:19:57:18:024800 gpstart:localhost:gpadmin-[INFO]:-Gathering information and validating the environment...
20150523:19:57:18:024800 gpstart:localhost:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20150523:19:57:18:024800 gpstart:localhost:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20150523:19:57:18:024800 gpstart:localhost:gpadmin-[INFO]:-Starting Master instance in admin mode
20150523:19:57:20:024800 gpstart:localhost:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20150523:19:57:20:024800 gpstart:localhost:gpadmin-[INFO]:-Obtaining Segment details from master...
20150523:19:57:20:024800 gpstart:localhost:gpadmin-[INFO]:-Setting new master era
20150523:19:57:20:024800 gpstart:localhost:gpadmin-[INFO]:-Master Started...
20150523:19:57:20:024800 gpstart:localhost:gpadmin-[INFO]:-Shutting down master
20150523:19:57:21:024800 gpstart:localhost:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
..
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-Process results...
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-----------------------------------------------------
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-----------------------------------------------------
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-----------------------------------------------------
20150523:19:57:23:024800 gpstart:localhost:gpadmin-[INFO]:-Starting Master instance localhost.localdomain directory /gpmaster/gpsne-1
20150523:19:57:24:024800 gpstart:localhost:gpadmin-[INFO]:-Command pg_ctl reports Master localhost.localdomain instance active
20150523:19:57:25:024800 gpstart:localhost:gpadmin-[INFO]:-No standby master configured.  skipping...
20150523:19:57:25:024800 gpstart:localhost:gpadmin-[INFO]:-Database successfully started
20150523:19:57:25:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20150523:19:57:25:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Loading gp_toolkit...
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[WARN]:-*******************************************************
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[WARN]:-were generated during the array creation
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Please review contents of log file
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20150523.log
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-To determine level of criticality
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-These messages could be from a previous run of the utility
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-that was called today!
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[WARN]:-*******************************************************
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Greenplum Database instance successfully created
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-------------------------------------------------------
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-To complete the environment configuration, please
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/gpmaster/gpsne-1"
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-   or, use -d /gpmaster/gpsne-1 option for the Greenplum scripts
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-   Example gpstate -d /gpmaster/gpsne-1
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20150523.log
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Review options for gpinitstandby
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-------------------------------------------------------
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-The Master /gpmaster/gpsne-1/pg_hba.conf post gpinitsystem
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-new array must be explicitly added to this file
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20150523:19:57:27:007016 gpinitsystem:localhost:gpadmin-[INFO]:-------------------------------------------------------
[gpadmin@localhost greenplum-db]$
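
If you want to avoid the "Host localhost is assigned as localhost in /etc/hosts" warnings shown above, one option (a sketch, using the address that is commented out in the /etc/hosts listing earlier) is to give the machine its own hostname entry and then use that name in MASTER_HOSTNAME and hostlist_singlenode instead of localhost:
127.0.0.1      localhost localhost.localdomain localhost4 localhost4.localdomain4
192.168.1.253  sachi sachi.localdomain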
