This section describes how to connect to a database through an SQL client after you create a data warehouse cluster and before you use its database. GaussDB(DWS) provides a gsql client that matches the cluster version, which you can use to access the cluster through its public or private network address.
For details, see Preparing an ECS as the gsql Client Host.
The user who uploads the client must have the full control permission on the target directory on the host to which the client is uploaded.
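If you upload the client package from a local machine, any common transfer tool can be used. The following is a minimal sketch using scp; the directory /opt/dws_client, the user name, and the ECS address are illustrative placeholders and should be replaced with your own values:
# On the ECS, create a directory for the client and grant the uploading user full control.
mkdir -p /opt/dws_client
chown <upload_user> /opt/dws_client
chmod u+rwx /opt/dws_client
# From the local machine, copy the client package to that directory.
scp dws_client_8.1.x_redhat_x64.zip <upload_user>@<ECS_address>:/opt/dws_client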
Alternatively, you can remotely log in over SSH to the Linux host where gsql is to be installed and run the following command to download the gsql client:
wget https://obs.otc.t-systems.com/dws/download/dws_client_redhat_x64.zip --no-check-certificate
For details about how to log in to an ECS, see "ECSs > Logging In to a Linux ECS > Login Using an SSH Password" in the Elastic Cloud Server User Guide.
The SSL connection mode is more secure than the non-SSL mode. You are advised to connect the client to the cluster in SSL mode.
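As a sketch of what SSL preparation can look like, the following environment variables are exported on the client host before running gsql, after the cluster's SSL certificate files have been downloaded and decompressed. The file names and the /home/dbadmin path are assumptions; use the files and paths from your own certificate package:
# Assumed certificate file locations; replace them with the actual files from your SSL certificate package.
export PGSSLCERT="/home/dbadmin/client.crt"
export PGSSLKEY="/home/dbadmin/client.key"
export PGSSLROOTCERT="/home/dbadmin/cacert.pem"
export PGSSLMODE="verify-ca"
# The private key must be readable only by the current user.
chmod 600 /home/dbadmin/client.key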
Run the following commands to decompress the downloaded client package:
cd <Path for saving the client>
unzip dws_client_8.1.x_redhat_x64.zip
In the preceding commands:
<Path for saving the client>: directory in which the downloaded or uploaded client package is stored.
dws_client_8.1.x_redhat_x64.zip: name of the client package, where 8.1.x indicates the actual client version.
Run the following command to configure the GaussDB(DWS) client:
source gsql_env.sh
If the following information is displayed, the GaussDB(DWS) client is successfully configured:
All things done.
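To confirm that the environment variables have taken effect, you can check that gsql is now available in the current shell. This is an optional sanity check; the version output depends on the client package:
# Verify that gsql is on the PATH and print its version.
which gsql
gsql -V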
Run the following command to connect to the database in the cluster:
gsql -d <Database_name> -h <Cluster_address> -U <Database_user> -p <Database_port> -r
The parameters are described as follows:
-d <Database_name>: name of the database to connect to. The default database of a new cluster is gaussdb.
-h <Cluster_address>: public or private network address of the cluster.
-U <Database_user>: username of the cluster's database administrator, for example, dbadmin.
-p <Database_port>: database port set when the cluster was created, for example, 8000.
-W <Password>: password of the database user. If this parameter is omitted, you are prompted to enter the password.
-r: enables client command-line editing.
For example, run the following command to connect to the default database gaussdb in the GaussDB(DWS) cluster:
gsql -d gaussdb -h 10.168.0.74 -U dbadmin -p 8000 -W password -r
If the following information is displayed, the connection succeeded:
gaussdb=>
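At the gaussdb=> prompt you can run SQL statements and gsql meta-commands. The following is a minimal sketch; the table t1 is purely illustrative and not part of the cluster by default:
-- List the databases in the cluster and the tables visible in the current search path.
\l
\d
-- Create a small test table, insert a row, and query it.
CREATE TABLE t1 (id int, name varchar(32));
INSERT INTO t1 VALUES (1, 'hello');
SELECT * FROM t1;
-- Drop the test table and exit the gsql client.
DROP TABLE t1;
\q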
For more information about the gsql commands, see the Data Warehouse Service (DWS) Tool Guide.
GaussDB(DWS) users can import data from external sources to data warehouse clusters. This section describes how to import sample data from OBS to a data warehouse cluster and perform querying and analysis operations on the sample data. The sample data is generated based on the standard TPC-DS benchmark test.
TPC-DS is a benchmark for testing the performance of decision support systems. With TPC-DS test data and cases, you can simulate complex scenarios, such as big data set statistics, report generation, online queries, and data mining, to better understand the functions and performance of database applications.
Run the following commands to go to the sample directory of the GaussDB(DWS) client (the example assumes the client is stored in /opt) and configure access to the sample data on OBS:
cd /opt
cd sample
/bin/bash setup.sh -ak <Access_Key_Id> -sk <Secret_Access_Key> -obs_location obs.otc.t-systems.com
If the following information is displayed, the settings are successful:
setup successfully!
<Access_Key_Id> and <Secret_Access_Key>: indicate the AK and SK, respectively. For details about how to obtain the AK and SK, see "Data Import > Concurrently Importing Data from OBS > Creating Access Keys (AK and SK)" in the Data Warehouse Service (DWS) Developer Guide. Then, replace the parameters in the statements with the obtained values.
Run the following commands to configure the environment variables of the GaussDB(DWS) client and go to its bin directory:
cd ..
source gsql_env.sh
cd bin
Command format:
gsql -d <Database name> -h <Public network address of the cluster> -U <Administrator> -p <Data warehouse port number> -f <Path for storing the sample data script> -r
Sample command:
gsql -d gaussdb -h 10.168.0.74 -U dbadmin -p 8000 -f /opt/sample/tpcds_load_data_from_obs.sql -r
In the preceding command, the sample data script tpcds_load_data_from_obs.sql is stored in the sample directory (for example, /opt/sample/) of the GaussDB(DWS) client.
After you enter the database administrator password and connect to the database in the cluster, the script automatically creates a foreign table that points to the sample data stored on OBS outside the cluster. It then creates a target table for storing the sample data and imports the data into the target table through the foreign table.
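Conceptually, the statements executed by the script follow the pattern sketched below. This is not the actual content of tpcds_load_data_from_obs.sql: the table sample_item, its columns, the OBS location, and the format options are illustrative placeholders, and the access keys are the ones configured earlier with setup.sh:
-- Foreign table that points to the sample data files on OBS.
CREATE FOREIGN TABLE sample_item_ext
(
    item_id    integer,
    item_name  varchar(64),
    item_price numeric(7,2)
)
SERVER gsmpp_server
OPTIONS (
    location          'obs://<bucket_name>/<path_to_sample_data>',
    format            'text',
    delimiter         '|',
    encoding          'utf8',
    access_key        '<Access_Key_Id>',
    secret_access_key '<Secret_Access_Key>'
)
READ ONLY;
-- Target table inside the cluster that holds the imported data.
CREATE TABLE sample_item
(
    item_id    integer,
    item_name  varchar(64),
    item_price numeric(7,2)
)
DISTRIBUTE BY HASH (item_id);
-- Import the data from OBS into the target table through the foreign table.
INSERT INTO sample_item SELECT * FROM sample_item_ext;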
The time required for importing a large dataset depends on the current GaussDB(DWS) cluster specifications. Generally, the import takes about 10 to 20 minutes. If information similar to the following is displayed, the import is successful.
Time:1845600.524 ms
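To spot-check the imported data, you can reconnect with gsql and count the rows in one of the created tables. The table name store_sales below is an assumption based on the standard TPC-DS schema; run \d after connecting to see the tables that were actually created:
-- Count the rows in one of the imported tables (store_sales is an assumed TPC-DS table name).
SELECT count(*) FROM store_sales;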
Run the following commands to perform queries on the sample data:
cd /opt/sample/query_sql/
/bin/bash tpcds100x.sh
After the query is complete, a directory for storing the query result, such as query_output_20170914_072341, will be generated in the current query directory, for example, sample/query_sql/.
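To view the results, you can list the generated output directory and open the files in it. The directory name below is the example from this section; the actual timestamped name and the file layout inside depend on your run:
# List the generated result directory and open one of the result files.
ls -l /opt/sample/query_sql/query_output_20170914_072341/
less /opt/sample/query_sql/query_output_20170914_072341/<result_file>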