SimDB server maintenance guide¶
This guide describes the steps needed to set up and maintain a SimDB server as a production service. The first section details the general steps required to do this, followed by details on how this is done at ITER.
Installing SimDB¶
First clone the master branch of SimDB:
git clone https://github.com/iterorganization/SimDB.git
Next set up the virtual environment:
cd simdb
python3 -m venv venv
source venv/bin/activate
And install SimDB:
pip install -e .[all]
Note: If you plan to run the server with a PostgreSQL database you will also need to install the psycopg2-binary library.
You can test the SimDB installation by running:
simdb --version
Running the server (using built-in http server)¶
Note: Running the SimDB server using the built-in http server is for testing/development only and should not be used in production. In production you should run the SimDB server behind a dedicated web-server such as NGinx (see the Running the server behind nginx & gunicorn section below).
Once simdb has been installed, before you can run the server you need to create the server configuration file. This file should be created in the application configuration directory which can be located by using:
dirname "$(simdb config path)"
For example on Linux this would be:
/home/$USER/.config/simdb
On macOS this would be:
/Users/$USER/Library/Application Support/simdb
In this directory you should create a file ‘app.cfg’ specifying the server configuration. This file must have permissions set to 0600 i.e. user read only.
Options for the server configuration are:
Section |
Option |
Required |
Description |
|---|---|---|---|
database |
type |
yes |
Database type [sqlite, postgres]. |
database |
file |
yes (type=sqlite) |
Database file (for sqlite) - defaults to remote.db in the user data directory if not specified. |
database |
host |
yes (type=postgres) |
Database host (for postgres). |
database |
port |
yes (type=postgres) |
Database port (for postgres). |
database |
name |
yes (type=postgres) |
Database name (for postgres). |
server |
upload_folder |
yes |
Root directory where SimDB simulation files are stored. |
server |
ssl_enabled |
no |
Flag [True, False] to specify whether the debug server uses SSL - this should be set to False for production servers behind dedicated webserver. Defaults to False. |
server |
ssl_cert_file |
yes (ssl_enabled=True) |
Path to SSL certificate file if ssl_enabled is True. |
server |
ssl_key_file |
yes (ssl_enabled=True) |
Path to SSL key file if ssl_enabled is True. |
server |
admin_password |
yes |
Password for admin superuser. |
server |
token_lifetime |
no |
Number of days generated tokens are valid for - defaults to 30 days. |
server |
imas_remote_host |
no |
Host name to set on ingested IMAS URIs which will be used to fetch the data via the specified IMAS remote access server. I.e. imas:hdf5?path=foo becomes imas://<imas_remote_host>:<imas_remote_port>/uda?path=foo&backend=hdf5 on ingest. |
server |
imas_remote_port |
no |
Port to set on ingested IMAS URIs on ingest. See imas_remote_host for more details. |
flask |
flask_env |
no |
Flask server environment [development, production] - defaults to production. |
flask |
debug |
no |
Flag [True, Flase] to specify whether Flask server is run with debug mode enabled - defaults to True if flask_env=’development’, otherwise False. |
flask |
testing |
no |
Flag [True, False] to specify whether exceptions are propagated rather than being handled by Flask’s error handlers - defaults to False. |
flask |
secret_key |
yes |
Secret key used to encrypt server messages including authentication tokens - should be at least 20 characters long. |
flask |
swagger_ui_doc_expansion |
no |
Default state of the Swagger UI documentations [none, list, full]. |
validation |
auto_validate |
no |
Flag [True, False] to set whether the server should run validation on uploaded simulations (including running any selected file_validation) automatically. Defaults to False. |
validation |
error_on_fail |
no |
Flag [True, False] to set whether simulations that fail validation should be rejected - auto_validate must be set to True if this flag is set to True. Defaults to False |
server |
yes |
SMTP server used to send emails from the SimDB server. |
|
port |
yes |
SMTP server port port. |
|
user |
yes |
SMTP server user to send emails from . |
|
password |
yes |
SMTP server user password. |
|
development |
disable_checksum |
yes |
Flag [True, False] to set whether integrity checks should be perform or not. Defaults to False |
File validation options¶
Section |
Option |
Required |
Description |
|---|---|---|---|
file_validation |
type |
no |
Name of file Validator to use to validate simulation data i.e “ids_validator”, “test_validator” etc. At the moment only “ids_validator” is available. |
file_validation |
extra_rule_dirs |
yes (ids_validator only) |
Paths to directory containing additional rulesets used by ids_validator i.e. “./path/to/ruleset_dir_1,/path_to_ruleset_dir_2,etc.” |
file_validation |
rulesets |
yes (ids_validator only) |
Name of rulesets [generic, extra_ruleset] (directory containing python scripts) used for IDS validation i.e. “my_custom_ruleset,summary_time,etc.” |
file_validation |
bundled_ruleset |
yes (ids_validator only) |
Flag [True, False] to load the rulesets bundled with ids_validator. Defaults to True |
file_validation |
apply_generic |
yes (ids_validator only) |
Flag [True, False] to apply generic rulesets. Defaults to True |
file_validation |
rule_filter_name |
yes (ids_validator only) |
Only rulesets containing specified names will be applied, i.e. “summary_test_1,core_profiles_test_1,etc.” |
file_validation |
rule_filter_ids |
yes (ids_validator only) |
Only rulesets concerning specified IDS will be applied, i.e. “summary,equilibrium,etc.” |
Authentication options¶
Section |
Option |
Required |
Description |
|---|---|---|---|
authentication |
type |
yes |
Name of the authentication method used by the server to authenticate users - current options are [ActiveDirectory, LDAP, None]. See sections below for details of extra options required for the Active Directory and LDAP authentication options. |
authentication |
firewall_auth |
no |
Flag [True, False] to specify that the server is being run behind a firewall and that the authentication should be read from the firewall headers. |
authentication |
firewall_user |
no |
Name of the firewall header to use for the user name. Required if firewall_auth is True. |
authentication |
firewall_email |
no |
Name of the firewall header to use for the user’s email. Required if firewall_auth is True. |
Activate Directory authentication options¶
Section |
Option |
Required |
Description |
|---|---|---|---|
authentication |
ad_server |
yes |
Active directory server used for user authentication. |
authentication |
ad_domain |
yes |
Active directory domain used for user authentication. |
authentication |
ad_cert |
no |
Path to the root ca certificate used for user authentication. |
LDAP authentication options¶
Section |
Option |
Required |
Description |
|---|---|---|---|
authentication |
ldap_server |
yes |
LDAP server URI. |
authentication |
ldap_bind |
yes |
Bind string - this can contain {username} which will be replaced by the username of the user attempting to authenticate, i.e. “uid={username},ou=Users,dc=eufus,dc=eu”. |
authentication |
ldap_query_user |
no |
Bind user used to run LDAP queries, i.e. “uid=f2bind,ou=Users,dc=eufus,dc=eu” - if not provided the queries are run as the authenticated user. |
authentication |
ldap_query_password |
no |
Password corresponding to ldap_query_user. Only required if ldap_query_user is specified. |
authentication |
ldap_query_base |
yes |
Base point to start the query from, i.e. “dc=eufus,dc=eu”. |
authentication |
ldap_query_filter |
yes |
Query filter used to find the user - this can contain {username} which will be replaced by the username of the user attempting to authenticate, i.e. “(uid={username})”. |
authentication |
ldap_query_uid |
no |
Name of the user parameter in the LDAP search query - defaults to ‘uid’. |
authentication |
ldap_query_mail |
no |
Name of the email parameter in the LDAP search query - defaults to ‘mail’. |
Caching options¶
Section |
Option |
Required |
Description |
|---|---|---|---|
cache |
type |
no |
Type of caching to use. Options include NullCache (default), SimpleCache, FileSystemCache. SimpleCache is a memory based cache and FileSystemCache caches using files. Configuration options for these are given below. |
cache |
dir |
no |
Directory to store cache. Used only for FileSystemCache. |
cache |
default_timeout |
no |
The default timeout that is used if no timeout is specified. Unit of time is seconds. |
cache |
threshold |
no |
The maximum number of items the cache will store before it starts deleting some. Used only for SimpleCache and FileSystemCache |
More caching options can be found in the Flask-Caching documentation. You can convert the caching options for the library to SimDB configuration by removing the CACHE_ prefix and converting to lowercase, i.e. CACHE_ARGS becomes args in the [cache] section.
Role options¶
Section |
Option |
Required |
Description |
|---|---|---|---|
role |
users |
yes |
A comma separated list of the the users assigned to this role. |
Each role must be given a name in the section header, and whilst defining any roles is optional each role section must
have a users option.
For example:
[role "admin"]
users = admin,user1,user2
Currently only the admin role is used in SimDB (this is the set of users able to perform CLI command in the admin
command subgroup).
Example configuration files¶
Example of app.cfg for SQLite:
[flask]
flask_env = development
debug = True
testing = True
secret_key = CHANGE_ME
[server]
upload_folder = /tmp/simdb/simulations
ssl_enabled = False
admin_password = admin
[database]
type = sqlite
[validation]
auto_validate = True
error_on_fail = True
[email]
server = smtp.email.com
port = 465
user = test@email.com
password = abc123
[development]
disable_checksum = True
Example of app.cfg for PostgreSQL (see Setting up PostgreSQL database):
...
[database]
type = postgres
host = localhost
port = 5432
DB_TYPE = "postgres"
DB_HOST = "localhost"
DB_PORT = 5432
UPLOAD_FOLDER = "/tmp/simdb/simulations"
DEBUG = False
SSL_ENABLED = True
...
Now create a validation schema in the application configuration directory, which can be located by using:
dirname "$(simdb config path)"
In this directory, you should create a file ‘validation-schema.yaml’ specifying the validation schema. Example of validation-schema.yaml:
description:
required: true
type: string
Once the server configuration has been created you should be able to run
simdb_server
And see some console output such as:
* Serving Flask app "simdb.remote.app" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
Note: If it fails to run with an error stating that it cannot bind to a port then you will need to see check whatever service is running on port 5000 and shut this down if possible. If you need to modify the port you will need to edit the simdb_server script (which you can locate using which simdb_server), changing the port number.
Follow the url in the output (you can do this using a browser or using curl, e.g. curl http://0.0.0.0:5000), and you should see the returned JSON data:
{ urls: [ "http://0.0.0.0:5000/api/v0.1.1" ] }
This is running the Flask’s internal webserver and should only be used for development or testing. For production the server should be run behind a dedicated webserver and load balancer, see below for details for how to do this using Gunicorn and Nginx.
Using SSL¶
If you want to run using SSL encryption you will need to provide a server certificate and private key in the application configuration directory.
A way to generate these, is using the openssl command:
openssl req -x509 -out server.crt -keyout server.key \
-newkey rsa:2048 -nodes -sha256 \
-subj '/CN=localhost' -extensions EXT -config <( \
printf "[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")
However, you will want to use a valid signing authority in production.
Running the server behind nginx & gunicorn¶
To run the server in production you should run it as wsgi service behind a dedicated web server. To run using nginx (as a load-balancer/proxy) and gunicorn (as the web server) we need to set up the services as follows.
Note: The instructions below assume you already have nginx and gunicorn installed.
Set up gunicorn service¶
Copy the init.d script from src/simdb/remote/scripts/simdb.initd in the simdb install directory (i.e. /usr/local/lib/python3.7/site-packages/simdb/remote) as /etc/init.d/simdb.
You will need to modify the line USER=simdb to change to user to whichever user you wish to run the simdb as (the gunicorn service will run as root but the workers will run in user space). You will also need to modify the line DAEMON=/home/simdb/venv/bin/gunicorn to change the path to point towards the gunicorn installed in your virtual environment - you can find this path by running which gunicorn whilst the virtual environment is active.
Once you have copied and modified the init.d script you can start the gunicorn service using:
service simdb start
And check that it is running using:
service simdb status
Set up nginx service¶
Create a simdb.conf script in /etc/nginx/conf.d/simdb.conf
server {
listen 80;
server_name localhost; # or the address of the server you are running
location / {
include proxy_params;
proxy_pass http://unix:/var/run/simdb.sock;
}
}
Alternatively, copy the script provided as simdb/remote/simdb.nginx (in the simdb installation directory, i.e. /usr/local/lib/python3.7/site-packages/simdb/remote) to:
/etc/nginx/conf.d/simdb.conf
The proxy_pass line should point to the endpoint of the gunicorn service (set by the BIND variable in the init.d script).
Note: If you do not have a proxy_params file in /etc/nginx you can create one containing the following:
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
Note: check that the line include /etc/nginx/conf.d/*.conf; is defined in your /etc/nginx/nginx.conf script, if not you can add it inside the http {} section.
Now you can restart nginx using:
service nginx restart
You should now be able to check the simdb server is running by going to the http address defined in your nginx site (localhost:80 in the example above).
Nginx Request Entity Size¶
You may need to increase the size of uploaded files that Nginx will accept. For SimDB this should be at least 100MB.
You can set this by changing the following option in your /etc/nginx/nginx.conf file:
client_max_body_size 100m;
Using SSL with the Gunicorn/Nginx¶
In production, you should be using HTTPS not HTTP for the SimDB server. To do this with Nginx you can change the simdb.conf in the /etc/nginx/sites-available that you created in the previous section.
Change this to be:
server {
listen 443 ssl;
server_name localhost; # or the address of the server you are running
# Use only TLS
ssl_protocols TLSv1.1 TLSv1.2;
# Tell client which ciphers are available
ssl_prefer_server_ciphers on;
ssl_ciphers ECDH+AESGCM:ECDH+AES256:ECDH+AES128:DH+3DES:!ADH:!AECDH:!MD5;
# Certificates
ssl_certificate /etc/pki/nginx/server.crt;
ssl_certificate_key /etc/pki/nginx/private/server.key;
location / {
include proxy_params;
proxy_pass http://unix:/var/run/simdb.sock;
}
}
server {
# Redirect HTTP traffic to HTTPS
if ($host = localhost) { # or the address of the server you are running
return 301 https://$host$request_uri;
}
server_name localhost; # or the address of the server you are running
listen 80;
return 404;
}
The ssl_certificate and ssl_certificate_key should be set to point to the SSL certificate and key that you have generated using a valid signing authority for the server.
Setting up PostgreSQL database¶
For the production server you should be using a production DBMS. To use PostgreSQL as the DBMS you can use the following instructions.
First, install PostgreSQL:
sudo yum -y install postgresql-server postgresql-contrib
Next, initialise the database:
postgresql-setup initdb
You then need to connect to the database as the postgres user. You can do this using:
sudo -u postgres psql
And run the following:
CREATE DATABASE simdb;
CREATE ROLE simdb;
ALTER DATABASE simdb OWNER TO simdb;
ALTER ROLE "simdb" WITH LOGIN;
This is assuming your webserver is running as user simdb. If not, you should change the role name above to match whichever user you are running the server under.