Configuring PostgreSQL source endpoints
When creating or editing an endpoint, you can define:
- Managed Service for PostgreSQL cluster connection or custom installation settings, including those based on Compute Cloud VMs. These are required parameters.
- Additional parameters.
Before you get started, check the Service specifics for PostgreSQL sources and targets.
Managed Service for PostgreSQL cluster
Warning
To create or edit an endpoint of a managed database, you need the managed-postgresql.viewer role or the primitive viewer role issued for the folder hosting a cluster of this managed database.
Connection to the database using the cluster ID specified in Nebius Israel. Available only for clusters deployed in Managed Service for PostgreSQL.
Management console
- Managed Service for PostgreSQL cluster: Specify the ID of the cluster to connect to.
- Security groups: Select the cloud network to host the endpoint and security groups for network traffic.
  This will let you apply the specified security group rules to the VMs and clusters in the selected network without changing the settings of these VMs and clusters. For more information, see Network in Data Transfer.
- Database: Specify the name of the database in the selected cluster.
- User: Specify the username that Data Transfer will use to connect to the database.
- Password: Enter the user's password to the database.
CLI
- Endpoint type: postgres-source.
- --cluster-id: ID of the cluster you need to connect to.
- --database: Database name.
- --user: Username that Data Transfer will use to connect to the database.
- To set a user password to access the DB, use one of the following parameters:
  - --raw-password: Password as text.
  - --password-file: Path to the password file.
Terraform
- Endpoint type: postgres_source.
- connection.mdb_cluster_id: ID of the cluster to connect to.
- database: Database name.
- user: Username that Data Transfer will use to connect to the database.
- password.raw: Password in text form.
Example of the configuration file structure:
resource "yandex_datatransfer_endpoint" "<endpoint name in Terraform>" {
name = "<endpoint name>"
settings {
postgres_source {
security_groups = [ "list of security group IDs" ]
connection {
mdb_cluster_id = "<Managed Service for PostgreSQL cluster ID>"
}
database = "<name of database being transferred>"
user = "<username for connection>"
password {
raw = "<user password>"
}
<advanced endpoint settings>
}
}
}
For more information, see the Terraform provider documentation.
API
- mdbClusterId: ID of the cluster you need to connect to.
- database: Database name.
- user: Username that Data Transfer will use to connect to the database.
- password.raw: Database user password (in text form).
Custom installation
For OnPremise, all fields are filled in manually.
Management console
- Host: Enter the IP address or FQDN of the master host you want to connect to.
- Port: Set the number of the port that Data Transfer will use for the connection.
- CA certificate: Upload the certificate file or add its contents as text if the transmitted data must be encrypted, for example, to meet PCI DSS requirements.
- Subnet ID: Select or create a subnet in the desired availability zone.
  If the value in this field is specified for both endpoints, both subnets must be hosted in the same availability zone.
- Database: Specify the name of the database in the selected cluster.
- User: Specify the username that Data Transfer will use to connect to the database.
- Password: Enter the user's password to the database.
- Security groups: Select the cloud network to host the endpoint and security groups for network traffic.
  This will let you apply the specified security group rules to the VMs and clusters in the selected network without changing the settings of these VMs and clusters. For more information, see Network in Data Transfer.
CLI
- Endpoint type: postgres-source.
- --host: IP address or FQDN of the master host you want to connect to.
- --port: Number of the port that Data Transfer will use for the connection.
- --ca-certificate: CA certificate; required if the transmitted data must be encrypted, for example, to meet PCI DSS requirements.
- --subnet-id: ID of the subnet the host resides in.
- --database: Database name.
- --user: Username that Data Transfer will use to connect to the database.
- To set a user password to access the DB, use one of the following parameters:
  - --raw-password: Password as text.
  - --password-file: Path to the password file.
Terraform
- Endpoint type: postgres_source.
- on_premise.hosts: List of IP addresses or FQDNs of hosts to connect to. Since only single-item lists are supported, specify the master host address.
- on_premise.port: Port number that Data Transfer will use for connections.
- on_premise.tls_mode.enabled.ca_certificate: CA certificate; required if the transmitted data must be encrypted, for example, to comply with PCI DSS requirements.
- on_premise.subnet_id: ID of the subnet the host resides in.
- database: Database name.
- user: Username that Data Transfer will use to connect to the database.
- password.raw: Password in text form.
Example of the configuration file structure:
resource "yandex_datatransfer_endpoint" "<endpoint name in Terraform>" {
name = "<endpoint name>"
settings {
postgres_source {
security_groups = [ "list of security group IDs" ]
connection {
on_premise {
hosts = ["<host list>"]
port = <connection port>
}
}
database = "<name of database being transferred>"
user = "<username for connection>"
password {
raw = "<user password>"
}
<advanced endpoint settings>
}
}
}
For more information, see the Terraform provider documentation.
API
- onPremise: Database connection parameters:
  - hosts: IP address or FQDN of the master host to connect to.
  - port: Number of the port that Data Transfer will use for the connection.
  - tlsMode: Parameters for encrypting the transmitted data if encryption is required, for example, to meet PCI DSS requirements.
  - subnetId: ID of the subnet the host resides in.
- database: Database name.
- user: Username that Data Transfer will use to connect to the database.
- password.raw: Database user password (in text form).
Additional settings
Management console
- Included tables: Data is transferred only from the listed tables.
  If you add new tables when editing an endpoint used in Snapshot and increment or Replication transfers with the Replicating status, the data history for these tables will not be uploaded. To add a table with its historical data, use the List of objects to be transferred (Preview) field in the transfer settings.
- Excluded tables: Data from the listed tables is not transferred.
  Both lists include the schema name (a description of the DB contents, structure, and integrity constraints) and the table name. They support expressions in the following format:
  - <schema name>.<table name>: Fully qualified table name.
  - <schema name>.*: All tables in the schema.
  Warning
  The service does not transfer custom data types if the endpoint specifies a list of included or excluded tables for the source. If this is the case, transfer your custom data types manually.
- Schema migration: If required, select the DB schema elements to transfer when activating or deactivating the transfer.
- Maximum WAL size for a replication slot: Maximum size of the write-ahead log kept in the replication slot. If exceeded, the replication process is stopped and the replication slot is deleted. The default value is 50 GB. This setting does not prevent disk overflow in the source database. It is only available for PostgreSQL versions below 13, and we recommend monitoring the amount of WAL held by the slot in the source database.
- Database schema for auxiliary tables: Specify the name of the schema to store the service tables (__consumer_keeper and __data_transfer_mole_finder).
- Merge inherited tables: Select this option to merge the contents of inherited tables. For more information, see Service specifics for sources and targets.
- Parallel table copying settings: If necessary, configure parallel copying of tables in detail (if parallel copying parameters are set in the transfer).
CLI
- --include-table: List of included tables. If set, data is transferred only from the tables in this list.
  If you add new tables when editing an endpoint used in Snapshot and increment or Replication transfers with the Replicating status, the data history for these tables will not be uploaded. To add a table with its historical data, use the List of objects to be transferred (Preview) field in the transfer settings.
- --exclude-table: List of excluded tables. Data from the listed tables is not transferred.
  Both lists include the schema name (a description of the DB contents, structure, and integrity constraints) and the table name. They support expressions in the following format:
  - <schema name>.<table name>: Fully qualified table name.
  - <schema name>.*: All tables in the schema.
  Warning
  The service does not transfer custom data types if the endpoint specifies a list of included or excluded tables for the source. If this is the case, transfer your custom data types manually.
- --slot-lag-limit: Maximum size of the write-ahead log kept in the replication slot. If exceeded, the replication process is stopped and the replication slot is deleted. The default value is 50 GB.
- --service-schema: Name of the DB schema for service tables.
- Schema transfer settings:
  - --transfer-before-data: At the initial stage of the transfer.
  - --transfer-after-data: At the final stage of the transfer.
Terraform
- include_tables: List of included tables. If set, data is transferred only from the tables in this list.
  If you add new tables when editing an endpoint used in Snapshot and increment or Replication transfers with the Replicating status, the data history for these tables will not be uploaded. To add a table with its historical data, use the List of objects to be transferred (Preview) field in the transfer settings.
- exclude_tables: List of excluded tables. Data from the listed tables is not transferred.
  Both lists include the schema name (a description of the DB contents, structure, and integrity constraints) and the table name. They support expressions in the following format:
  - <schema name>.<table name>: Fully qualified table name.
  - <schema name>.*: All tables in the schema.
  Warning
  The service does not transfer custom data types if the endpoint specifies a list of included or excluded tables for the source. If this is the case, transfer your custom data types manually.
- slot_gigabyte_lag_limit: Maximum size of the write-ahead log kept in the replication slot. If exceeded, the replication process is stopped and the replication slot is deleted. The default value is 50 GB.
- service_schema: DB schema name for service tables.
- object_transfer_settings: Schema transfer settings for the following entities (a combined example is shown after this list):
  - sequence: Sequences.
  - sequence_owned_by: User sequences.
  - table: Tables.
  - primary_key: Primary keys.
  - fk_constraint: Foreign keys.
  - default_values: Default values.
  - constraint: Constraints.
  - index: Indexes.
  - view: Views.
  - function: Functions.
  - trigger: Triggers.
  - type: Types.
  - rule: Rules.
  - collation: Collation rules.
  - policy: Policies.
  - cast: Type casts.
  You can specify one of the following values for each entity:
  - BEFORE_DATA: Move at transfer activation.
  - AFTER_DATA: Move at transfer deactivation.
  - NEVER: Do not move.
For more information, see the Terraform provider documentation.
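As an illustration, the sketch below combines several of these attributes in a single postgres_source block, following the structure of the connection examples above. The table patterns, service schema name, and WAL limit (billing.*, billing.audit_log, data_transfer_service, 50) are hypothetical placeholders, not recommended values; check the exact attribute syntax against the Terraform provider documentation.
resource "yandex_datatransfer_endpoint" "<endpoint name in Terraform>" {
  name = "<endpoint name>"
  settings {
    postgres_source {
      connection {
        mdb_cluster_id = "<Managed Service for PostgreSQL cluster ID>"
      }
      database = "<name of database being transferred>"
      user     = "<username for connection>"
      password {
        raw = "<user password>"
      }
      # Transfer every table in the billing schema except the audit log (hypothetical names).
      include_tables = ["billing.*"]
      exclude_tables = ["billing.audit_log"]
      # Stop replication and delete the slot if the retained WAL exceeds 50 GB.
      slot_gigabyte_lag_limit = 50
      # Schema for the __consumer_keeper and __data_transfer_mole_finder service tables.
      service_schema = "data_transfer_service"
    }
  }
}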
API
- includeTables: List of included tables. If set, data is transferred only from the tables in this list.
  If you add new tables when editing an endpoint used in Snapshot and increment or Replication transfers with the Replicating status, the data history for these tables will not be uploaded. To add a table with its historical data, use the List of objects to be transferred (Preview) field in the transfer settings.
- excludeTables: List of excluded tables. Data from the listed tables is not transferred.
  Both lists include the schema name (a description of the DB contents, structure, and integrity constraints) and the table name. They support expressions in the following format:
  - <schema name>.<table name>: Fully qualified table name.
  - <schema name>.*: All tables in the schema.
  Warning
  The service does not transfer custom data types if the endpoint specifies a list of included or excluded tables for the source. If this is the case, transfer your custom data types manually.
- slotByteLagLimit: Maximum size of the write-ahead log kept in the replication slot. If exceeded, the replication process is stopped and the replication slot is deleted. The default value is 50 GB.
- serviceSchema: Name of the DB schema for service tables.
- objectTransferSettings: Settings for transferring the schema at the initial and final stages of the transfer (BEFORE_DATA and AFTER_DATA values, respectively).
Settings for transferring a DB schema when enabling and disabling a transfer
Note
The default settings of the source endpoint let you successfully perform a transfer for most databases. Change the settings of the initial and final stages of the transfer only if it is necessary.
During a transfer, the database schema is transferred from the source to the target. The transfer is performed in two stages:
- At the activation stage.
  This step is performed when the transfer is activated, before data is copied or replicated, to create the schema on the target. You can choose which parts of the schema are migrated. By default, these are TABLE, VIEW, PRIMARY KEY, SEQUENCE, SEQUENCE OWNED BY, RULE, TYPE, FUNCTION, and DEFAULT.
- At the deactivation stage.
  This step is performed at the end of the transfer, when it is deactivated. If the transfer keeps running in replication mode, the final stage is performed only when replication stops. You can choose which parts of the schema are migrated.
  At the final stage, it is assumed that there is no write load on the source while the transfer is deactivated. You can ensure this by switching the source to read-only mode. At this stage, the database schema on the target is brought to a state consistent with the schema on the source.
  We recommend including resource-intensive operations, such as index migration, in the final stage: migrating indexes at the beginning of the transfer can slow it down (see the example below).
The transfer of the schema at both the initial and final stages is performed using the pg_dump utility.
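For example, using the Terraform object_transfer_settings attributes listed above, you could keep tables, primary keys, and views at the initial stage and defer index and foreign-key creation to the final stage. This is a hedged sketch rather than a prescribed layout: which entities you move (or skip, such as triggers here) depends on your workload.
resource "yandex_datatransfer_endpoint" "<endpoint name in Terraform>" {
  name = "<endpoint name>"
  settings {
    postgres_source {
      connection {
        mdb_cluster_id = "<Managed Service for PostgreSQL cluster ID>"
      }
      database = "<name of database being transferred>"
      user     = "<username for connection>"
      password {
        raw = "<user password>"
      }
      object_transfer_settings {
        # Created at transfer activation, before data is copied.
        table       = "BEFORE_DATA"
        primary_key = "BEFORE_DATA"
        view        = "BEFORE_DATA"
        # Resource-intensive objects are created at transfer deactivation.
        index         = "AFTER_DATA"
        fk_constraint = "AFTER_DATA"
        # Not transferred at all.
        trigger = "NEVER"
      }
    }
  }
}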
Note
When you edit the settings of an endpoint of a transfer in the Replicating status, the table schemas on the target are preserved. In this case, only the schemas of tables that are missing on the target at the time of restart are transferred to the target.
Replication cannot guarantee that sequence values are preserved, so we recommend updating the sequences on the target.