6.2. pilotscope.PilotConfig

class PilotConfig(db_type: pilotscope.PilotEnum.DatabaseEnum, db='stats_tiny', pilotscope_core_host='localhost', user_data_db_name='PilotScopeUserData', sql_execution_timeout=300, once_request_timeout=300)[source]

Bases: object

The PilotConfig class stores and manages the configuration of PilotScope, including the host address of PilotScope, the name of the database to connect to, and the username and password used to log into the database.

__init__(db_type: pilotscope.PilotEnum.DatabaseEnum, db='stats_tiny', pilotscope_core_host='localhost', user_data_db_name='PilotScopeUserData', sql_execution_timeout=300, once_request_timeout=300) → None[source]

Initialize the PilotConfig.

Parameters
  • db_type – the type of database, e.g., PostgreSQL or SparkSQL.

  • db – the name of the database to connect to.

  • pilotscope_core_host – the host address of PilotScope on the ML side.

  • user_data_db_name – the name of the database created for saving user data. To access this data, set db=user_data_db_name.

  • sql_execution_timeout – the timeout for SQL execution, in seconds.

  • once_request_timeout – the timeout for a single request, in seconds.

print()[source]

Print the configuration information of PilotScope.

class PostgreSQLConfig(pilotscope_core_host='localhost', db_host='localhost', db_port='5432', db_user='pilotscope', db_user_pwd='pilotscope', db='stats_tiny')[source]

Bases: pilotscope.PilotConfig.PilotConfig

__init__(pilotscope_core_host='localhost', db_host='localhost', db_port='5432', db_user='pilotscope', db_user_pwd='pilotscope', db='stats_tiny') → None[source]

Parameters
  • pilotscope_core_host – the host address of PilotScope on the ML side.

  • db_host – the host address of the database.

  • db_port – the port of the database.

  • db_user – the username used to log into the database.

  • db_user_pwd – the password used to log into the database.

  • db – the name of the database to connect to.
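The following is a minimal sketch of how the keyword arguments for PostgreSQLConfig might be assembled from the documented defaults and overridden per deployment. The PILOTSCOPE_* environment variable names and the pg_config_kwargs helper are hypothetical and not part of PilotScope; only the parameter names and default values come from the signature above.

```python
import os

# Defaults copied from the documented PostgreSQLConfig signature.
# Note that db_port is a string in that signature, so it stays a string here.
DOCUMENTED_DEFAULTS = {
    "pilotscope_core_host": "localhost",
    "db_host": "localhost",
    "db_port": "5432",
    "db_user": "pilotscope",
    "db_user_pwd": "pilotscope",
    "db": "stats_tiny",
}


def pg_config_kwargs(env=None):
    """Return keyword arguments for PostgreSQLConfig.

    Each documented parameter can be overridden by an environment variable
    named PILOTSCOPE_<PARAMETER> (a naming scheme invented for this sketch).
    """
    env = os.environ if env is None else env
    kwargs = dict(DOCUMENTED_DEFAULTS)
    for name in kwargs:
        value = env.get("PILOTSCOPE_" + name.upper())
        if value:
            kwargs[name] = value
    return kwargs


# Example: override only the host and port, keep every other default.
example = pg_config_kwargs(
    {"PILOTSCOPE_DB_HOST": "10.0.0.5", "PILOTSCOPE_DB_PORT": "5433"}
)
```

With PilotScope installed, the result could then be passed on as PostgreSQLConfig(**pg_config_kwargs()).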

enable_deep_control_local(pg_bin_path: str, pg_data_path: str)[source]

Enable deep control for PostgreSQL, such as starting and stopping the database or changing its configuration file. If you do not need these functions, you do not have to set these values. Use this function when the database and PilotScope Core are on the same machine, i.e., pilotscope_core_host == db_host. Otherwise, use enable_deep_control_remote.

Parameters
  • pg_bin_path – the directory containing the PostgreSQL binaries, e.g., /postgres_install_path/bin

  • pg_data_path – the directory where the database stores its data

enable_deep_control_remote(pg_bin_path, pg_data_path, db_host_user, db_host_pwd, db_host_ssh_port=22)[source]

Enable deep control for PostgreSQL, such as starting and stopping the database or changing its configuration file. If you do not need these functions, you do not have to set these values. Use this function when the database and PilotScope Core are not on the same machine, i.e., pilotscope_core_host != db_host. Otherwise, use enable_deep_control_local.

Parameters
  • pg_bin_path – the directory containing the PostgreSQL binaries, e.g., /postgres_install_path/bin

  • pg_data_path – the directory where the database stores its data

  • db_host_user – the username used to log into the database host

  • db_host_pwd – the password used to log into the database host

  • db_host_ssh_port – the port of the SSH service on the database host
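The choice between the two deep-control methods can be sketched as a small helper. This is an illustration under stated assumptions, not PilotScope code: enable_deep_control and RecordingConfig are hypothetical names, the data path /postgres_data is made up, and comparing host strings for equality is a simplification (for example, "localhost" and "127.0.0.1" would be treated as different machines). Only the two method names and their parameters come from the documentation above.

```python
def enable_deep_control(config, pg_bin_path, pg_data_path, core_host, db_host,
                        db_host_user=None, db_host_pwd=None, db_host_ssh_port=22):
    """Call the local or remote deep-control method, depending on the hosts.

    Returns "local" or "remote" so the caller can see which path was taken.
    """
    if core_host == db_host:
        # Same machine: no credentials for the database host are needed.
        config.enable_deep_control_local(pg_bin_path, pg_data_path)
        return "local"
    if db_host_user is None or db_host_pwd is None:
        raise ValueError("remote deep control needs db_host_user and db_host_pwd")
    config.enable_deep_control_remote(
        pg_bin_path, pg_data_path, db_host_user, db_host_pwd, db_host_ssh_port
    )
    return "remote"


class RecordingConfig:
    """Stand-in for PostgreSQLConfig that only records calls (hypothetical)."""

    def __init__(self):
        self.calls = []

    def enable_deep_control_local(self, pg_bin_path, pg_data_path):
        self.calls.append(("local", pg_bin_path, pg_data_path))

    def enable_deep_control_remote(self, pg_bin_path, pg_data_path,
                                   db_host_user, db_host_pwd, db_host_ssh_port=22):
        self.calls.append(("remote", pg_bin_path, pg_data_path,
                           db_host_user, db_host_pwd, db_host_ssh_port))


cfg = RecordingConfig()
mode = enable_deep_control(cfg, "/postgres_install_path/bin", "/postgres_data",
                           core_host="localhost", db_host="localhost")
```

With a real PostgreSQLConfig instance in place of RecordingConfig, the same helper would forward to the documented methods unchanged.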

class SparkConfig(app_name='testApp', master_url='local[*]')[source]

Bases: pilotscope.PilotConfig.PilotConfig

__init__(app_name='testApp', master_url='local[*]') → None[source]

Parameters
  • app_name – the name of the application of Spark

  • master_url – the master URL of Spark cluster

set_spark_session_config(config: dict)[source]

enable_cardinality_estimation()[source]

Spark SQL supports cost-based optimization, but it is disabled by default. If you need to enable pull_subquery_card and push_card, call this function and PilotScope will set the corresponding parameters. Enabling it costs additional time, but SQL performance will be better.

use_postgresql_datasource(db_host='localhost', db_port='5432', db_user='postgres', db_user_pwd='postgres', db='stats_tiny')[source]

Set up a PostgreSQL data source.

Parameters
  • db_host – the host of postgresql, defaults to “localhost”

  • db_port – the network port of postgresql, defaults to “5432”

  • db_user – the username to log into postgresql, defaults to “postgres”

  • db_user_pwd – the password of the user, defaults to “postgres”

  • db – database name, defaults to “stats_tiny”
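A possible setup sequence for a SparkConfig object is sketched below. The helper configure_spark and the stand-in class RecordingSparkConfig are hypothetical and exist only so the sketch runs without PilotScope or Spark installed. The call order shown is an assumption, as is passing Spark property names as the keys of the dict given to set_spark_session_config, since the documentation above does not describe that argument further.

```python
def configure_spark(config, session_conf=None, use_cbo=False, pg_source=None):
    """Apply optional settings to a SparkConfig-like object and return it."""
    if session_conf:
        # Assumed to be a dict of Spark properties.
        config.set_spark_session_config(session_conf)
    if use_cbo:
        # Needed when pull_subquery_card and push_card are used.
        config.enable_cardinality_estimation()
    if pg_source is not None:
        # Keyword names match the documented use_postgresql_datasource signature.
        config.use_postgresql_datasource(**pg_source)
    return config


class RecordingSparkConfig:
    """Stand-in for SparkConfig that only records calls (hypothetical)."""

    def __init__(self):
        self.calls = []

    def set_spark_session_config(self, config):
        self.calls.append(("set_spark_session_config", config))

    def enable_cardinality_estimation(self):
        self.calls.append(("enable_cardinality_estimation",))

    def use_postgresql_datasource(self, db_host="localhost", db_port="5432",
                                  db_user="postgres", db_user_pwd="postgres",
                                  db="stats_tiny"):
        self.calls.append(("use_postgresql_datasource",
                           db_host, db_port, db_user, db_user_pwd, db))


spark_cfg = configure_spark(
    RecordingSparkConfig(),
    session_conf={"spark.sql.shuffle.partitions": "8"},
    use_cbo=True,
    pg_source={"db_host": "localhost", "db": "stats_tiny"},
)
```

Parameters omitted from pg_source fall back to the documented defaults, so only the values that differ per deployment need to be listed.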