This section describes how to connect to a database through an SQL client after you create a data warehouse cluster and before you use the cluster's database. GaussDB(DWS) provides the Linux gsql client that matches the cluster version for you to access the cluster through the cluster's public or private network address.
The gsql command-line client provided by GaussDB(DWS) runs on Linux. Before using it to connect remotely to a GaussDB(DWS) cluster, prepare a Linux server on which to install and run the gsql client. If you access the cluster through a public network address, you can install the Linux gsql client on your own Linux server, provided that the server has a public network address. If no EIP is configured for your GaussDB(DWS) cluster, you are advised to create a Linux ECS instead. For more information, see (Optional) Preparing an ECS as the gsql Client Server.
For details about how to create an ECS, see "Getting Started > Creating an ECS" in the Elastic Cloud Server User Guide.
The created ECS must meet the following requirements:
The image's OS must be one of the following Linux OSs supported by the gsql client:
For details about VPC operations, see "VPC and Subnet" in the Virtual Private Cloud User Guide.
When creating an ECS, set EIP to Automatically assign or Specify.
For details about security group operations, see "Security Group" in the Virtual Private Cloud User Guide.
Ensure that the security group of the ECS contains rules meeting the following requirements. If the rules do not exist, add them to the security group:
You are advised to download the gsql tool that matches the cluster version: use gsql 8.1.x for clusters of 8.1.0 or later, and gsql 8.2.x for clusters of 8.2.0 or later. The following steps use dws_client_8.1.x_redhat_x64.zip as an example. To download gsql 8.2.x, replace dws_client_8.1.x_redhat_x64.zip with dws_client_8.2.x_redhat_x64.zip.
The user who uploads the client must have the full control permission on the target directory on the host to which the client is uploaded.
Alternatively, log in over SSH to the Linux server where gsql is to be installed and run the following command to download the Linux gsql client:
wget https://obs.otc.t-systems.com/dws/download/dws_client_8.1.x_redhat_x64.zip --no-check-certificate
For details about how to log in to an ECS, see "ECSs > Logging In to a Linux ECS > Login Using an SSH Password" in the Elastic Cloud Server User Guide.
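The file-name substitution described above can be sketched as a pair of small shell helpers. This is only an illustration: the function names dws_client_zip and dws_client_url are hypothetical, and the sketch assumes that the 8.2.x package sits under the same URL prefix as the 8.1.x package, which the text implies but does not state explicitly. The version string is passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the gsql package).

# Print the client package name for a version series such as "8.1.x" or "8.2.x".
dws_client_zip() {
    printf 'dws_client_%s_redhat_x64.zip\n' "$1"
}

# Print the download URL, assuming every series shares the URL prefix
# used for the 8.1.x package in this section.
dws_client_url() {
    printf 'https://obs.otc.t-systems.com/dws/download/%s\n' "$(dws_client_zip "$1")"
}

# Usage (not executed here):
#   wget "$(dws_client_url 8.2.x)" --no-check-certificate
```

Keeping the version series in one place reduces the chance of downloading a package that does not match the cluster version.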
The SSL connection mode is more secure than the non-SSL mode. You are advised to connect the client to the cluster in SSL mode.
cd <Path for saving the client>
unzip dws_client_8.1.x_redhat_x64.zip
In the preceding commands:
source gsql_env.sh
If the following information is displayed, the gsql client is successfully configured:
All things done.
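As an additional check, you can confirm that the gsql command resolves in the current shell. The following is a minimal sketch. It assumes that gsql_env.sh adds the client's bin directory to PATH, and the helper names have_cmd and check_gsql are hypothetical.

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME resolves to a command in the current shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report where gsql was found, or explain what to do if it was not found.
check_gsql() {
    if have_cmd gsql; then
        echo "gsql found at: $(command -v gsql)"
    else
        echo "gsql not found; run 'source gsql_env.sh' in this shell first" >&2
        return 1
    fi
}

# Usage (not executed here):
#   check_gsql
```

Because source changes only the current shell session, the check must be run in the same session in which gsql_env.sh was sourced.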
gsql -d <Database_name> -h <Cluster_address> -U <Database_user> -p <Database_port> -W <Cluster_password> -r
The parameters are described as follows:
For example, run the following command to connect to the default database gaussdb in the GaussDB(DWS) cluster:
gsql -d gaussdb -h 10.168.0.74 -U dbadmin -p 8000 -W password -r
If the following information is displayed, the connection succeeded:
gaussdb=>
For more information about the gsql commands, see the Data Warehouse Service (DWS) Tool Guide.
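To avoid retyping the connection parameters, the command can be assembled from variables. The sketch below only prints the command. The variable names DWS_DB, DWS_HOST, DWS_USER, and DWS_PORT and the function gsql_cmd are hypothetical, and the defaults are the example values used in this section. The -W option is deliberately left out so that the password does not appear in the shell history. This assumes that gsql prompts for the password when -W is omitted.

```shell
#!/bin/sh
# Print a gsql connection command built from (hypothetical) environment
# variables, falling back to the example values from this section.
gsql_cmd() {
    db="${DWS_DB:-gaussdb}"
    host="${DWS_HOST:-10.168.0.74}"
    user="${DWS_USER:-dbadmin}"
    port="${DWS_PORT:-8000}"
    # -W is omitted on purpose; the password is expected to be entered
    # at the prompt instead of on the command line (assumption).
    printf 'gsql -d %s -h %s -U %s -p %s -r\n' "$db" "$host" "$user" "$port"
}

# Usage (not executed here):
#   DWS_HOST=<Cluster_address> gsql_cmd
```

Review the printed command before running it, and replace the defaults with the values of your own cluster.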
GaussDB(DWS) users can import data from external sources to data warehouse clusters. This section describes how to import sample data from OBS to a data warehouse cluster and perform querying and analysis operations on the sample data. The sample data is generated based on the standard TPC-DS benchmark test.
TPC-DS is a benchmark for testing the performance of decision support systems. With TPC-DS test data and cases, you can simulate complex scenarios, such as statistics on large datasets, report generation, online query, and data mining, to better understand the functions and performance of database applications.
cd /opt
cd sample
/bin/bash setup.sh -ak <Access_Key_Id> -sk <Secret_Access_Key> -obs_location obs.otc.t-systems.com
If the following information is displayed, the settings are successful:
setup successfully!
<Access_Key_Id> and <Secret_Access_Key>: indicate the AK and SK, respectively. For details about how to obtain the AK and SK, see "Data Import" > "Concurrently Importing Data from OBS" > "Creating Access Keys (AK and SK)" in the Data Warehouse Service (DWS) Developer Guide. Then, replace the parameters in the command with the obtained values.
cd ..
source gsql_env.sh
cd bin
Command format:
gsql -d <Database name> -h <Public network address of the cluster> -U <Administrator> -p <Data warehouse port number> -f <Path for storing the sample data script> -r
Sample command:
gsql -d gaussdb -h 10.168.0.74 -U dbadmin -p 8000 -f /opt/sample/tpcds_load_data_from_obs.sql -r
In the preceding command, sample data script tpcds_load_data_from_obs.sql is stored in the sample directory (for example, /opt/sample/) of the GaussDB(DWS) client.
After you enter the administrator password and successfully connect to the database in the cluster, the system will automatically create a foreign table to associate the sample data outside the cluster. Then, the system creates a target table for saving the sample data and imports the data to the target table using the foreign table.
The time required for importing a large dataset depends on the current GaussDB(DWS) cluster specifications. Generally, the import takes about 10 to 20 minutes. If information similar to the following is displayed, the import is successful.
Time:1845600.524 ms
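The elapsed time is reported in milliseconds. A small sketch such as the following converts it to minutes for easier reading. The helper names parse_time_ms and ms_to_minutes are hypothetical, and the sketch assumes that the line has exactly the form shown above.

```shell
#!/bin/sh
# Extract the numeric value from a line of the form "Time:<number> ms".
parse_time_ms() {
    printf '%s\n' "$1" | sed -n 's/^Time: *\([0-9.][0-9.]*\) ms$/\1/p'
}

# Convert milliseconds to minutes, rounded to one decimal place.
ms_to_minutes() {
    awk -v ms="$1" 'BEGIN { printf "%.1f\n", ms / 60000 }'
}

ms_to_minutes "$(parse_time_ms 'Time:1845600.524 ms')"   # prints 30.8
```

For the sample value, 1845600.524 ms divided by 60000 ms per minute gives roughly 30.8 minutes.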
cd /opt/sample/query_sql/
/bin/bash tpcds100x.sh
After the query is complete, a directory that stores the query result, such as query_output_20170914_072341, is generated in the current query directory (for example, sample/query_sql/).
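If the query script has been run several times, the most recent result directory can be located with a short helper. This is a sketch under one assumption: the directory suffix is a timestamp of the form YYYYMMDD_HHMMSS, as the example name suggests, so that sorting by name also sorts by time. The function name latest_query_output is hypothetical.

```shell
#!/bin/sh
# Print the newest query_output_* directory under the given directory
# (default: the current directory). Prints nothing if none exists.
latest_query_output() {
    dir="${1:-.}"
    for d in "$dir"/query_output_*/; do
        # Skip the unexpanded pattern when no directory matches.
        [ -d "$d" ] && printf '%s\n' "${d%/}"
    done | sort | tail -n 1
}

# Usage (not executed here):
#   latest_query_output /opt/sample/query_sql
```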