DataSpell 2024.2 Help

AWS S3

Connect to an AWS S3 server

  1. In the Big Data Tools window, click Add a connection and select AWS S3.

  2. In the Big Data Tools dialog that opens, specify the connection parameters:

    Configure S3 connection
    • Name: a name for the connection that distinguishes it from other connections.

    • Select the storage type: AWS S3 or a custom S3-compatible storage.

    • Specify the storage location:

      • For AWS S3 storage, select your storage region in the Area list.

      • For custom S3-compatible storage, enter the endpoint URL in the Endpoint field and, optionally, enter the storage region, such as us-east-2.

    • Choose the way to get buckets:

      • Select Custom roots and, in the Roots field, specify the name of the bucket or the path to a directory in the bucket. You can specify multiple names or paths by separating them with a comma.

      • Select All buckets in the account. You can then use the bucket filter to show only buckets with particular names. You can also select Only buckets in the selected region and then select a region if you want to show buckets only from a particular region.

    • Authentication type lets you select the authentication method:

      • Default credential providers chain: use the credentials from the default provider chain. For more information about the chain, refer to Using the Default Credential Provider Chain.

      • Profile from credentials file: select a profile from your credentials file.

      • Explicit access key and secret key: enter your credentials manually.

      • Anonymous: connect without credentials, for example to access a publicly visible bucket.

    With the Default credential providers chain or Profile from credentials file option selected, you can click Open Credentials to locate the directory where the credentials file is stored. The default location is usually ~/.aws/credentials on Linux or macOS, or C:\Users\<USERNAME>\.aws\credentials on Windows. If you have selected Use custom configs, it is the custom location that you specified.

    Optionally, you can set up:

    • Enable connection: deselect this option if you want to disable the connection. By default, newly created connections are enabled.

    You can also set up Extended Connection Settings:

    • HTTP Proxy: choose whether to use the IDE proxy settings or to specify custom proxy settings.

    • Enable tunneling: creates an SSH tunnel to the remote host. This can be useful if the target server is in a private network but an SSH connection to a host in that network is available.

    • Operation timeout (s): enter a timeout (in seconds) for operations performed on the remote storage, such as getting file info, listing or deleting objects. The default value is 15 seconds.

    • Trust all SSL certificates: select this option if you trust the SSL certificate used for this connection and do not want to verify it. This can be useful if, for development purposes, you use a host with a self-signed certificate, because verifying it could result in an error.

  3. Once you fill in the settings, click Test connection to ensure that all configuration parameters are correct. Then click OK.
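
If you are unsure which profiles the Profile from credentials file option will offer, you can inspect the credentials file yourself. The following is a minimal sketch, assuming the file uses the usual INI-style layout with one section per profile. The function names, the sample profile name analytics, and the placeholder key values are hypothetical and are not part of DataSpell.

```python
import configparser
from pathlib import Path

# Illustrative file content; the key values are placeholders, not real credentials.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = EXAMPLEKEYID
aws_secret_access_key = EXAMPLESECRET

[analytics]
aws_access_key_id = EXAMPLEKEYID2
aws_secret_access_key = EXAMPLESECRET2
"""


def default_credentials_path() -> Path:
    """Return the default location: ~/.aws/credentials under the user's home directory."""
    return Path.home() / ".aws" / "credentials"


def list_profiles(credentials_path: Path) -> list[str]:
    """Return the profile names (section headers) found in a credentials file."""
    parser = configparser.ConfigParser()
    # read() skips files that do not exist, so a missing file yields an empty list.
    parser.read(credentials_path)
    return parser.sections()


if __name__ == "__main__":
    print(list_profiles(default_credentials_path()))
```

Running the script prints the profile names found in the default location, or an empty list if no credentials file exists there.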

Once you have established a connection, you can view the storage and work with data files in it.
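
For comparison, the same connection parameters can be expressed in code when you access the storage from a script. The sketch below is an assumption about how the dialog fields might map onto a boto3 S3 client; it does not describe how DataSpell itself connects. S3ConnectionSettings, client_kwargs, and make_client are hypothetical names, and applying the operation timeout to both the connect and read timeouts is only an approximation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class S3ConnectionSettings:
    """Hypothetical container mirroring the dialog fields described above."""
    region: Optional[str] = None          # for example "us-east-2"
    endpoint_url: Optional[str] = None    # only for custom S3-compatible storage
    access_key: Optional[str] = None      # explicit access key, if used
    secret_key: Optional[str] = None      # explicit secret key, if used
    anonymous: bool = False               # Anonymous authentication type
    operation_timeout_s: int = 15         # the dialog default is 15 seconds
    trust_all_certificates: bool = False  # skip SSL certificate verification


def client_kwargs(settings: S3ConnectionSettings) -> dict:
    """Translate the settings into keyword arguments for boto3.client("s3", ...)."""
    kwargs: dict = {}
    if settings.region:
        kwargs["region_name"] = settings.region
    if settings.endpoint_url:
        kwargs["endpoint_url"] = settings.endpoint_url
    # Explicit keys are passed only when both are set and anonymous access is off;
    # otherwise boto3 falls back to its default credential chain.
    if settings.access_key and settings.secret_key and not settings.anonymous:
        kwargs["aws_access_key_id"] = settings.access_key
        kwargs["aws_secret_access_key"] = settings.secret_key
    if settings.trust_all_certificates:
        kwargs["verify"] = False
    return kwargs


def make_client(settings: S3ConnectionSettings):
    """Create a boto3 S3 client; requires the boto3 package to be installed."""
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    config_args = {
        "connect_timeout": settings.operation_timeout_s,
        "read_timeout": settings.operation_timeout_s,
    }
    if settings.anonymous:
        # Unsigned requests correspond to anonymous access to public buckets.
        config_args["signature_version"] = UNSIGNED
    return boto3.client("s3", config=Config(**config_args), **client_kwargs(settings))
```

Keeping client_kwargs free of boto3 imports makes the mapping easy to check on its own; only make_client needs the library.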

Last modified: 08 October 2024