doc-exports/docs/dws/umn/dws_01_0171.html
luhuayi 85562e1879 DWS UMN 831.100 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: luhuayi <luhuayi@huawei.com>
Co-committed-by: luhuayi <luhuayi@huawei.com>
2024-11-13 08:55:54 +00:00

165 KiB
Raw Permalink Blame History

Using the Python Library PyGreSQL to Connect to a Cluster

After creating a data warehouse cluster and using the third-party function library PyGreSQL to connect to the cluster, you can use Python to access GaussDB(DWS) and perform various operations on data tables.

Preparations Before Connecting to a Cluster

  • An EIP has been bound to the data warehouse cluster.
  • You have obtained the administrator username and password for logging in to the database in the data warehouse cluster.
    MD5 algorithms may by vulnerable to collision attacks and cannot be used for password verification. Currently, GaussDB(DWS) uses the default security design. By default, MD5 password verification is disabled, and this may cause failures of connections from open source clients. You are advised to set password_encryption_type to 1. For details, see "Modifying Database Parameters" in User Guide.
    • For security purposes, GaussDB(DWS) no longer uses MD5 to store password digests by default. As a result, the open-source drives and clients may fail to connect to the database. To use the MD5 algorithm used in an open-source protocol, you must modify your password policy and create a new user, or change the password of an existing user.
    • The database stores the hash digest of passwords instead of password text. During password verification, the system compares the hash digest with the password digest sent from the client (salt operations are involved). If you change your cryptographic algorithm policy, the database cannot generate a new hash digest for your existing password. For connectivity purposes, you must manually change your password or create a new user. The new password will be encrypted using the hash algorithm and stored for authentication in the next connection.
  • You have obtained the public network address, including the IP address and port number in the data warehouse cluster. For details, see Obtaining the Cluster Connection Address.
  • You have installed the third-party function library PyGreSQL.

    Download address: http://www.pygresql.org/download/index.html

  • For details about the installation and deployment operations, see http://www.pygresql.org/contents/install.html
    • In CentOS and Red Hat OS, run the following yum command:
      1
      yum install PyGreSQL
      
    • PyGreSQL depends on the libpq dynamic library of PostgreSQL (32-bit or 64-bit version, whichever matches the PyGreSQL bit version). In Linux, you can run the yum command and do not need to install the library. Before using PyGreSQL in Windows, you need to install libpq in either of the following ways:
      • Install PostgreSQL and configure the libpq, ssl, and crypto dynamic libraries in the environment variable PATH.
      • Install psqlodbc and use the libpq, ssl, and crypto dynamic libraries carried by the PostgreSQL ODBC driver.

Constraints

PyGreSQL is a PostgreSQL-based client interface, and its functions are not fully supported by GaussDB(DWS). For details, see Table 1.

The following APIs are supported based on Python 3.8.5 and PyGreSQL 5.2.4.

Table 1 PyGreSQL APIs supported by DWS

PyGreSQL

Yes

Remarks

Module functions and constants

connect Open a PostgreSQL connection

Y

-

get_pqlib_version get the version of libpq

Y

-

get/set_defhost default server host [DV]

Y

-

get/set_defport default server port [DV]

Y

-

get/set_defopt default connection options [DV]

Y

-

get/set_defbase default database name [DV]

Y

-

get/set_defuser default database user [DV]

Y

-

get/set_defpasswd default database password [DV]

Y

-

escape_string escape a string for use within SQL

Y

-

escape_bytea escape binary data for use within SQL

Y

-

unescape_bytea unescape data that has been retrieved as text

Y

-

get/set_namedresult conversion to named tuples

Y

-

get/set_decimal decimal type to be used for numeric values

Y

-

get/set_decimal_point decimal mark used for monetary values

Y

-

get/set_bool whether boolean values are returned as bool objects

Y

-

get/set_array whether arrays are returned as list objects

Y

-

get/set_bytea_escaped whether bytea data is returned escaped

Y

-

get/set_jsondecode decoding JSON format

Y

-

get/set_cast_hook fallback typecast function

Y

-

get/set_datestyle assume a fixed date style

Y

-

get/set_typecast custom typecasting

Y

-

cast_array/record fast parsers for arrays and records

Y

-

Type helpers

Y

-

Module constants

Y

-

Connection The connection object

query execute a SQL command string

Y

-

send_query - executes a SQL command string asynchronously

Y

-

query_prepared execute a prepared statement

Y

-

prepare create a prepared statement

Y

-

describe_prepared describe a prepared statement

Y

-

reset reset the connection

Y

-

poll - completes an asynchronous connection

Y

-

cancel abandon processing of current SQL command

Y

-

close close the database connection

Y

-

transaction get the current transaction state

Y

-

parameter get a current server parameter setting

Y

-

date_format get the currently used date format

Y

-

fileno get the socket used to connect to the database

Y

-

set_non_blocking - set the non-blocking status of the connection

Y

-

is_non_blocking - report the blocking status of the connection

Y

-

getnotify get the last notify from the server

N

The database does not support listen/notify.

inserttable insert a list into a table

Y

Use double quotation marks ("") to quote \n in the copy command.

get/set_notice_receiver custom notice receiver

Y

-

putline write a line to the server socket [DA]

Y

-

getline get a line from server socket [DA]

Y

-

endcopy synchronize client and server [DA]

Y

-

locreate create a large object in the database [LO]

N

Operations related to large objects

getlo build a large object from given oid [LO]

N

Operations related to large objects

loimport import a file to a large object [LO]

N

Operations related to large objects

Object attributes

Y

-

The DB wrapper class

Initialization

Y

-

pkey return the primary key of a table

Y

-

get_databases get list of databases in the system

Y

-

get_relations get list of relations in connected database

Y

-

get_tables get list of tables in connected database

Y

-

get_attnames get the attribute names of a table

Y

-

has_table_privilege check table privilege

Y

-

get/set_parameter get or set run-time parameters

Y

-

begin/commit/rollback/savepoint/release transaction handling

Y

-

get get a row from a database table or view

Y

-

insert insert a row into a database table

Y

-

update update a row in a database table

Y

-

upsert insert a row with conflict resolution

Y

-

query execute a SQL command string

Y

-

query_formatted execute a formatted SQL command string

Y

-

query_prepared execute a prepared statement

Y

-

prepare create a prepared statement

Y

-

describe_prepared describe a prepared statement

Y

-

delete_prepared delete a prepared statement

Y

-

clear clear row values in memory

Y

-

delete delete a row from a database table

Y

A tuple must have unique key or primary key.

truncate quickly empty database tables

Y

-

get_as_list/dict read a table as a list or dictionary

Y

-

escape_literal/identifier/string/bytea escape for SQL

Y

-

unescape_bytea unescape data retrieved from the database

Y

-

encode/decode_json encode and decode JSON data

Y

-

use_regtypes determine use of regular type names

Y

-

notification_handler create a notification handler

N

The database does not support listen/notify.

Attributes of the DB wrapper class

Y

-

Query methods

getresult get query values as list of tuples

Y

-

dictresult/dictiter get query values as dictionaries

Y

-

namedresult/namediter get query values as named tuples

Y

-

scalarresult/scalariter get query values as scalars

Y

-

one/onedict/onenamed/onescalar get one result of a query

Y

-

single/singledict/singlenamed/singlescalar get single result of a query

Y

-

listfields list fields names of previous query result

Y

-

fieldname, fieldnum field name/number conversion

Y

-

fieldinfo detailed info about query result fields

Y

-

ntuples return number of tuples in query object

Y

-

memsize return number of bytes allocated by query result

Y

-

LargeObject Large Objects

open open a large object

N

Operations related to large objects

close close a large object

N

Operations related to large objects

read, write, tell, seek, unlink file-like large object handling

N

Operations related to large objects

size get the large object size

N

Operations related to large objects

export save a large object to a file

N

Operations related to large objects

Object attributes

N

Operations related to large objects

The Notification Handler

Instantiating the notification handler

N

The database does not support listen/notify.

Invoking the notification handler

N

The database does not support listen/notify.

Sending notifications

N

The database does not support listen/notify.

Auxiliary methods

N

The database does not support listen/notify.

pgdb

Module functions and constants

connect Open a PostgreSQL connection

Y

-

get/set/reset_typecast Control the global typecast functions

Y

-

Module constants

Y

-

Errors raised by this module

Y

-

Connection The connection object

close close the connection

Y

-

commit commit the connection

Y

-

rollback roll back the connection

Y

-

cursor return a new cursor object

Y

-

Attributes that are not part of the standard

Y

-

Cursor The cursor object

description details regarding the result columns

Y

-

rowcount number of rows of the result

Y

-

close close the cursor

Y

-

execute execute a database operation

Y

-

executemany execute many similar database operations

Y

-

callproc Call a stored procedure

Y

-

fetchone fetch next row of the query result

Y

-

fetchmany fetch next set of rows of the query result

Y

-

fetchall fetch all rows of the query result

Y

-

arraysize - the number of rows to fetch at a time

Y

-

Methods and attributes that are not part of the standard

Y

-

Type Type objects and constructors

Type constructors

Y

-

Type objects

Y

-

Using the Third-Party Function Library PyGreSQL to Connect to a Cluster (Linux)

  1. Log in to the Linux environment as user root.
  2. Run the following command to create the python_dws.py file:

    1
    vi python_dws.py
    

    Copy and paste the following content to the python_dws.py file:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    91
    92
    93
    94
    95
    96
    #!/usr/bin/env python3
    # _*_ encoding:utf-8 _*_
     
    from __future__ import print_function
     
    import pg
     
     
    def create_table(connection):
        print("Begin to create table")
        try:
            connection.query("drop table if exists test;"
                             "create table test(id int, name text);")
        except pg.InternalError as e:
            print(e)
        else:
            print("Table created successfully")
     
     
    def insert_data(connection):
        print("Begin to insert data")
        try:
            connection.query("insert into test values(1,'number1');")
            connection.query("insert into test values(2,'number2');")
            connection.query("insert into test values(3,'number3');")
        except pg.InternalError as e:
            print(e)
        else:
            print("Insert data successfully")
     
     
    def update_data(connection):
        print("Begin to update data")
        try:
            result = connection.query("update test set name = 'numberupdated' where id=1;")
            print("Total number of rows updated :", result)
            result = connection.query("select * from test order by 1;")
            rows = result.getresult()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Update, Operation done successfully")
     
     
    def delete_data(connection):
        print("Begin to delete data")
        try:
            result = connection.query("delete from test where id=3;")
            print("Total number of rows deleted :", result)
            result = connection.query("select * from test order by 1;")
            rows = result.getresult()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Delete,Operation done successfully")
     
     
    def select_data(connection):
        print("Begin to select data")
        try:
            result = connection.query("select * from test order by 1;")
            rows = result.getresult()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1])
        except pg.InternalError as e:
            print(e)
            print("select failed")
        else:
            print("Operation done successfully")
     
     
    if __name__ == '__main__':
        try:
            conn = pg.DB(host='10.154.70.231',
                         port=8000,
                         dbname='gaussdb', # Database to be connected
                         user='dbadmin',
                         passwd='password')  # Database user password
        except pg.InternalError as ex:
            print(ex)
            print("Connect database failed")
        else:
            print("Opened database successfully")
            create_table(conn)
            insert_data(conn)
            select_data(conn)
            update_data(conn)
            delete_data(conn)
            conn.close()
    

    Alternatively, use the dbapi interface.

      1
      2
      3
      4
      5
      6
      7
      8
      9
     10
     11
     12
     13
     14
     15
     16
     17
     18
     19
     20
     21
     22
     23
     24
     25
     26
     27
     28
     29
     30
     31
     32
     33
     34
     35
     36
     37
     38
     39
     40
     41
     42
     43
     44
     45
     46
     47
     48
     49
     50
     51
     52
     53
     54
     55
     56
     57
     58
     59
     60
     61
     62
     63
     64
     65
     66
     67
     68
     69
     70
     71
     72
     73
     74
     75
     76
     77
     78
     79
     80
     81
     82
     83
     84
     85
     86
     87
     88
     89
     90
     91
     92
     93
     94
     95
     96
     97
     98
     99
    100
    101
    102
    103
    104
    105
    106
    107
    108
    109
    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
     
    from __future__ import print_function
     
    import pg
    import pgdb
     
     
    def create_table(connection):
        print("Begin to create table")
        try:
            cursor = connection.cursor()
            cursor.execute("drop table if exists test;"
                           "create table test(id int, name text);")
            connection.commit()
        except pg.InternalError as e:
            print(e)
        else:
            print("Table created successfully")
            cursor.close()
     
     
    def insert_data(connection):
        print("Begin to insert data")
        try:
            cursor = connection.cursor()
            cursor.execute("insert into test values(1,'number1');")
            cursor.execute("insert into test values(2,'number2');")
            cursor.execute("insert into test values(3,'number3');")
            connection.commit()
        except pg.InternalError as e:
            print(e)
        else:
            print("Insert data successfully")
            cursor.close()
     
     
    def update_data(connection):
        print("Begin to update data")
        try:
            cursor = connection.cursor()
            cursor.execute("update test set name = 'numberupdated' where id=1;")
            connection.commit()
            print("Total number of rows updated :", cursor.rowcount)
            cursor.execute("select * from test;")
            rows = cursor.fetchall()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Update, Operation done successfully")
     
     
    def delete_data(connection):
        print("Begin to delete data")
        try:
            cursor = connection.cursor()
            cursor.execute("delete from test where id=3;")
            connection.commit()
            print("Total number of rows deleted :", cursor.rowcount)
            cursor.execute("select * from test;")
            rows = cursor.fetchall()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Delete,Operation done successfully")
     
     
    def select_data(connection):
        print("Begin to select data")
        try:
            cursor = connection.cursor()
            cursor.execute("select * from test;")
            rows = cursor.fetchall()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
            print("select failed")
        else:
            print("Operation done successfully")
            cursor.close()
     
     
    if __name__ == '__main__':
        try:
            conn = pgdb.connect(host='10.154.70.231',
                                          port='8000',
                                          database='gaussdb', # Database to be connected
                                          user='dbadmin',
                                          password='password') # Database user password
        except pg.InternalError as ex:
            print(ex)
            print("Connect database failed")
        else:
            print("Opened database successfully")
            create_table(conn)
            insert_data(conn)
            select_data(conn)
            update_data(conn)
            delete_data(conn)
            conn.close()
    

  3. Change the public network address, cluster port number, database name, database username, and database password in the python_dws.py file based on the actual cluster information.

    The PyGreSQL API does not provide the connection retry capability. You need to implement the retry processing in the service code.

    1
    2
    3
    4
    5
            conn = pgdb.connect(host='10.154.70.231',
                                          port='8000',
                                          database='gaussdb', # Database to be connected
                                          user='dbadmin',
                                          password='password') # Database user password
    

  4. Run the following command to connect to the cluster using the third-party function library PyGreSQL:

    1
    python python_dws.py
    

Using the Third-Party Function Library PyGreSQL to Connect to a Cluster (Windows)

  1. In the Windows operating system, click the Start button, enter cmd in the search box, and click cmd.exe in the result list to open the command-line interface (CLI).
  2. In the CLI, run the following command to create the python_dws.py file:

    1
    type nul> python_dws.py
    

    Copy and paste the following content to the python_dws.py file:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    66
    67
    68
    69
    70
    71
    72
    73
    74
    75
    76
    77
    78
    79
    80
    81
    82
    83
    84
    85
    86
    87
    88
    89
    90
    91
    92
    93
    94
    95
    96
    #!/usr/bin/env python3
    # _*_ encoding:utf-8 _*_
     
    from __future__ import print_function
     
    import pg
     
     
    def create_table(connection):
        print("Begin to create table")
        try:
            connection.query("drop table if exists test;"
                             "create table test(id int, name text);")
        except pg.InternalError as e:
            print(e)
        else:
            print("Table created successfully")
     
     
    def insert_data(connection):
        print("Begin to insert data")
        try:
            connection.query("insert into test values(1,'number1');")
            connection.query("insert into test values(2,'number2');")
            connection.query("insert into test values(3,'number3');")
        except pg.InternalError as e:
            print(e)
        else:
            print("Insert data successfully")
     
     
    def update_data(connection):
        print("Begin to update data")
        try:
            result = connection.query("update test set name = 'numberupdated' where id=1;")
            print("Total number of rows updated :", result)
            result = connection.query("select * from test order by 1;")
            rows = result.getresult()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Update, Operation done successfully")
     
     
    def delete_data(connection):
        print("Begin to delete data")
        try:
            result = connection.query("delete from test where id=3;")
            print("Total number of rows deleted :", result)
            result = connection.query("select * from test order by 1;")
            rows = result.getresult()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Delete,Operation done successfully")
     
     
    def select_data(connection):
        print("Begin to select data")
        try:
            result = connection.query("select * from test order by 1;")
            rows = result.getresult()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1])
        except pg.InternalError as e:
            print(e)
            print("select failed")
        else:
            print("Operation done successfully")
     
     
    if __name__ == '__main__':
        try:
            conn = pg.DB(host='10.154.70.231',
                         port=8000,
                         dbname='gaussdb', # Database to be connected
                         user='dbadmin',
                         passwd='password')  # Database user password
        except pg.InternalError as ex:
            print(ex)
            print("Connect database failed")
        else:
            print("Opened database successfully")
            create_table(conn)
            insert_data(conn)
            select_data(conn)
            update_data(conn)
            delete_data(conn)
            conn.close()
    

    Alternatively, use the dbapi interface.

      1
      2
      3
      4
      5
      6
      7
      8
      9
     10
     11
     12
     13
     14
     15
     16
     17
     18
     19
     20
     21
     22
     23
     24
     25
     26
     27
     28
     29
     30
     31
     32
     33
     34
     35
     36
     37
     38
     39
     40
     41
     42
     43
     44
     45
     46
     47
     48
     49
     50
     51
     52
     53
     54
     55
     56
     57
     58
     59
     60
     61
     62
     63
     64
     65
     66
     67
     68
     69
     70
     71
     72
     73
     74
     75
     76
     77
     78
     79
     80
     81
     82
     83
     84
     85
     86
     87
     88
     89
     90
     91
     92
     93
     94
     95
     96
     97
     98
     99
    100
    101
    102
    103
    104
    105
    106
    107
    108
    109
    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
     
    from __future__ import print_function
     
    import pg
    import pgdb
     
     
    def create_table(connection):
        print("Begin to create table")
        try:
            cursor = connection.cursor()
            cursor.execute("drop table if exists test;"
                           "create table test(id int, name text);")
            connection.commit()
        except pg.InternalError as e:
            print(e)
        else:
            print("Table created successfully")
            cursor.close()
     
     
    def insert_data(connection):
        print("Begin to insert data")
        try:
            cursor = connection.cursor()
            cursor.execute("insert into test values(1,'number1');")
            cursor.execute("insert into test values(2,'number2');")
            cursor.execute("insert into test values(3,'number3');")
            connection.commit()
        except pg.InternalError as e:
            print(e)
        else:
            print("Insert data successfully")
            cursor.close()
     
     
    def update_data(connection):
        print("Begin to update data")
        try:
            cursor = connection.cursor()
            cursor.execute("update test set name = 'numberupdated' where id=1;")
            connection.commit()
            print("Total number of rows updated :", cursor.rowcount)
            cursor.execute("select * from test;")
            rows = cursor.fetchall()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Update, Operation done successfully")
     
     
    def delete_data(connection):
        print("Begin to delete data")
        try:
            cursor = connection.cursor()
            cursor.execute("delete from test where id=3;")
            connection.commit()
            print("Total number of rows deleted :", cursor.rowcount)
            cursor.execute("select * from test;")
            rows = cursor.fetchall()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
        else:
            print("After Delete,Operation done successfully")
     
     
    def select_data(connection):
        print("Begin to select data")
        try:
            cursor = connection.cursor()
            cursor.execute("select * from test;")
            rows = cursor.fetchall()
            for row in rows:
                print("id = ", row[0])
                print("name = ", row[1], "\n")
        except pg.InternalError as e:
            print(e)
            print("select failed")
        else:
            print("Operation done successfully")
            cursor.close()
     
     
    if __name__ == '__main__':
        try:
            conn = pgdb.connect(host='10.154.70.231',
                                          port='8000',
                                          database='gaussdb', # Database to be connected
                                          user='dbadmin',
                                          password='password') # Database user password
        except pg.InternalError as ex:
            print(ex)
            print("Connect database failed")
        else:
            print("Opened database successfully")
            create_table(conn)
            insert_data(conn)
            select_data(conn)
            update_data(conn)
            delete_data(conn)
            conn.close()
    

  3. Change the public network address, cluster port number, database name, database username, and database password in the python_dws.py file based on the actual cluster information.

    The PyGreSQL API does not provide the connection retry capability. You need to implement the retry processing in the service code.

    1
    2
    3
    4
    5
            conn = pgdb.connect(host='10.154.70.231',
                                          port='8000',
                                          database='gaussdb', # Database to be connected
                                          user='dbadmin',
                                          password='password') # Database user password
    

  4. Run the following command to connect to the cluster using the third-party function library PyGreSQL:

    1
    python python_dws.py