Configuring a MongoDB source endpoint
When creating or editing an endpoint, you can define:
- MongoDB custom installation settings, including those based on Compute Cloud VMs. These are required parameters.
- Additional parameters.
Custom installation
The settings are given for the OnPremise use case when all fields are filled in manually.
Management console
- Hosts: Specify the IP addresses or FQDNs of the hosts to connect to.
- Replica set: Specify the name of the replica set.
- Port: Set the number of the port that Data Transfer will use for the connection.
- CA certificate: To encrypt transmitted data, upload the PEM certificate or add its contents as text.
- Subnet ID: Select or create a subnet in the desired availability zone.
  If the value in this field is specified for both endpoints, both subnets must be hosted in the same availability zone.
- Authentication source: Specify the database name in the cluster.
- User: Specify the username that Data Transfer will use to connect to the database.
- Password: Enter the user's password to the database.
- Security groups: Select the cloud network to host the endpoint and security groups for network traffic.
  This will let you apply the specified security group rules to the VMs and clusters in the selected network without changing the settings of these VMs and clusters. For more information, see Network in Data Transfer.
CLI
- Endpoint type: mongo-source.
- --host: IP address or FQDN of the master host you want to connect to.
- --port: Number of the port that Data Transfer will use for the connection.
- --ca-certificate: CA certificate, used if the transmitted data needs to be encrypted, for example, to meet the requirements of PCI DSS.
- --subnet-id: ID of the subnet the host resides in.
- --database: Database name.
- --user: Username that Data Transfer will use to connect to the database.
- --security-group: Security groups for network traffic, whose rules will apply to VMs and clusters without changing their settings. For more information, see Network in Data Transfer.
- To set a user password to access the database, use one of the parameters:
  - --raw-password: Password as text.
  - --password-file: Path to the password file.
Terraform
- Endpoint type: mongo_source.
- connection.connection_options.on_premise.port: Port number that Data Transfer will use for connections.
- connection.connection_options.on_premise.tls_mode.enabled.ca_certificate: CA certificate used if the data being transferred must be encrypted to comply with the PCI DSS requirements.
- security_groups: Security groups for network traffic.
  This will let you apply the specified security group rules to the VMs and clusters in the subnet_id network without changing the settings of these VMs and clusters. For more information, see Network in Data Transfer.
- subnet_id: ID of the subnet hosting the cluster. If not specified, the cluster must be accessible from the internet.
  If the value in this field is specified for both endpoints, both subnets must be hosted in the same availability zone.
- connection.connection_options.on_premise.replica_set: Name of the replica set.
- connection.connection_options.on_premise.hosts: IP addresses or FQDNs of the hosts to connect to.
- connection.connection_options.auth_source: Name of the database in the cluster.
- connection.connection_options.user: Username that Data Transfer will use to connect to the database.
- connection.connection_options.password.raw: Password in text form.
Example of the configuration file structure:
  resource "yandex_datatransfer_endpoint" "<endpoint name in Terraform>" {
    name = "<endpoint name>"
    settings {
      mongo_source {
        security_groups = [ "<list of security group IDs>" ]
        subnet_id       = "<subnet ID>"
        connection {
          connection_options {
            on_premise {
              hosts       = [ "<list of replica set hosts>" ]
              port        = "<connection port>"
              replica_set = "<replica set name>"
              tls_mode {
                enabled {
                  ca_certificate = "<certificate in PEM format>"
                }
              }
            }
            auth_source = "<database name>"
            user        = "<username>"
            password {
              raw = "<user password>"
            }
          }
        }
        <advanced endpoint settings>
      }
    }
  }
For more information, see the Terraform provider documentation.
API
- onPremise: Database connection parameters:
  - hosts: IP address or FQDN of the master host to connect to.
  - port: Number of the port that Data Transfer will use for the connection.
  - tlsMode: Encryption parameters for transmitted data, if encryption is required, for example, to meet the requirements of PCI DSS.
  - subnetId: ID of the subnet the host resides in.
- securityGroups: Security groups for network traffic, whose rules will apply to VMs and clusters without changing their settings. For more information, see Network in Data Transfer.
- database: Database name.
- user: Username that Data Transfer will use to connect to the database.
- password.raw: Database user password (in text form).
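As a rough illustration of how the API fields listed above fit together, the sketch below assembles them into a Python dictionary and prints it as JSON. The field names come from the list. The helper name build_mongo_source_connection, the nesting of tlsMode (enabled.caCertificate, assumed by analogy with the Terraform block), and all sample values are assumptions, and the surrounding request envelope is not shown, so check the API reference before relying on this shape.

```python
import json


def build_mongo_source_connection(hosts, port, subnet_id, security_groups,
                                  database, user, raw_password,
                                  ca_certificate=None):
    """Assemble the on-premise MongoDB source fields into a dict.

    Hypothetical helper: field names follow the list above, but the
    exact nesting expected by the API is an assumption.
    """
    on_premise = {
        "hosts": list(hosts),
        "port": port,
        "subnetId": subnet_id,
    }
    if ca_certificate is not None:
        # Assumed shape, mirroring tls_mode.enabled.ca_certificate in Terraform.
        on_premise["tlsMode"] = {"enabled": {"caCertificate": ca_certificate}}
    return {
        "onPremise": on_premise,
        "securityGroups": list(security_groups),
        "database": database,
        "user": user,
        "password": {"raw": raw_password},
    }


# Placeholder values only; replace them with your own settings.
settings = build_mongo_source_connection(
    hosts=["mongo1.example.internal"],
    port=27017,
    subnet_id="<subnet ID>",
    security_groups=["<security group ID>"],
    database="<database name>",
    user="<username>",
    raw_password="<user password>",
)
print(json.dumps(settings, indent=2))
```

The tlsMode key is added only when a CA certificate is passed, which matches the description of encryption as optional.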
Additional settings
Management console
- Included collections: Data is only transferred from listed collections. All collections are transferred by default.
  When you add new collections while editing an endpoint used in Snapshot and increment or Replication transfers with the Replicating status, the data history for these collections will not get uploaded. To add a collection with its historical data, use the List of objects to be transferred (Preview) field in the transfer settings.
- Excluded collections: Data is transferred from all collections except the specified ones.
Terraform
- secondary_preferred_mode: If there are replicas in the cluster, they will be used for data reads instead of the master host.
- collections: Data is only transferred from listed collections. All collections are transferred by default.
  When you add new collections while editing an endpoint used in Snapshot and increment or Replication transfers with the Replicating status, the data history for these collections will not get uploaded. To add a collection with its historical data, use the List of objects to be transferred (Preview) field in the transfer settings.
- excluded_collections: Data is transferred from all collections except the specified ones.
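The include and exclude rules described above can be pictured with a small Python sketch. It is only a model of the documented behavior, not the service's implementation: is_transferred is a hypothetical helper, collections are represented by plain names for simplicity, and the rule that an exclusion wins when both lists are set is an assumption.

```python
def is_transferred(collection, included=None, excluded=None):
    """Return True if `collection` would be transferred.

    included: collections to transfer; empty or None means "all".
    excluded: collections to skip.
    Assumption: when both lists are set, exclusion takes precedence.
    """
    included = set(included or [])
    excluded = set(excluded or [])
    if collection in excluded:
        return False
    return not included or collection in included


all_collections = ["orders", "users", "audit_log"]

# Only the listed collections are transferred.
print([c for c in all_collections if is_transferred(c, included=["orders"])])
# → ['orders']

# All collections except the excluded ones are transferred.
print([c for c in all_collections if is_transferred(c, excluded=["audit_log"])])
# → ['orders', 'users']
```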
If the source is under high load (over 10,000 write transactions per second), we recommend using these settings so that each endpoint covers no more than ten different databases. This helps avoid database connection errors while the transfer is in progress.
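One way to plan such a split is sketched below in Python: the database names are divided into groups of at most ten, and each group would then be assigned to its own endpoint (and, as noted below, its own transfer). The function name split_databases and the sample database names are hypothetical; the sketch only does the grouping and does not call Data Transfer.

```python
def split_databases(databases, max_per_endpoint=10):
    """Split database names into groups of at most `max_per_endpoint`."""
    if max_per_endpoint < 1:
        raise ValueError("max_per_endpoint must be at least 1")
    unique = sorted(set(databases))  # drop duplicates, keep a stable order
    return [unique[i:i + max_per_endpoint]
            for i in range(0, len(unique), max_per_endpoint)]


# Hypothetical source with 23 databases: db00 ... db22.
databases = [f"db{n:02d}" for n in range(23)]
groups = split_databases(databases)
for index, group in enumerate(groups, start=1):
    # Each group would get its own endpoint and its own transfer.
    print(f"endpoint {index}: {len(group)} databases")
```

With 23 databases this yields three groups of 10, 10, and 3 databases.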
Note
- If you use several endpoints, you need to create a separate transfer for each one.
- Transfers of timeseries collections are not supported, so you should exclude these collections.