Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Table of Contents

How do database migrations fail?

...

This can be caused by problems with configuration, such as missing usernames or passwords in /etc/xroad.properties or /etc/xroad/db.properties

Excerpt from a package upgrade output, where xroad-opmonitor database migrations failed due to missing op-monitor.database.admin_password from /etc/xroad.properties:

Code Block
...
Setting up xroad-opmonitor (6.25.0-0.20201013130750git2e3558f.ubuntu18.04) ...
Installing new version of config file /etc/xroad/conf.d/op-monitor-logback.xml ...
Installing new version of config file /etc/xroad/services/opmonitor.conf ...
Database and user exists, skipping database creation.
Unexpected error running Liquibase: ERROR: must be owner of relation operational_data [Failed SQL: ALTER TABLE public.operational_data ADD service_type VARCHAR(255)]
Connection to database has failed, please check database availability and configuration in /etc/xroad/db.properties file

...

Code Block
# dpkg -l | grep opmon
ii xroad-addon-opmonitoring 6.25.0-0.20201013130750git2e3558f.ubuntu18.04 all X-Road AddOn: operations monitoring service
ii xroad-opmonitor 6.25.0-0.20201013130750git2e3558f.ubuntu18.04 all X-Road operations monitoring daemon

However, /var/log/xroad/op-monitor.log shows (after some messages have been sent through security server) signs of database access trouble due to missing migrations: 

Code Block
2020-10-14T09:56:50.813Z [qtp1348115836-43] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - ERROR: column "request_size" of relation "operational_data" does not exist
Position: 559
2020-10-14T09:56:50.814Z [qtp1348115836-43] ERROR e.r.x.o.OpMonitorDaemonRequestHandler - Error while handling data store request
javax.persistence.PersistenceException: org.hibernate.exception.SQLGrammarException: could not execute batch
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)

In this example problem was with migrations for op-monitor database. Similar issues can occur also for serverconf and messagelog database migrations. 

Finding the root cause and fixing it

To fix migrations, first you need to find the root cause that prevented migrations from being run. In the earlier example, error message "Connection to database has failed, please check database availability and configuration in /etc/xroad/db.properties file" provided some important hints on this. Typically the problem is with either of these configuration files:

  • /etc/xroad/db.properties

    • relevant parameters for regular database users

      • <database name>.hibernate.connection.url

      • <database name>.hibernate.connection.username

      • <database name>.hibernate.connection.password

  • /etc/xroad.properties

    • relevant parameters for database admin users

      • <database name>.database.admin_user

      • <database name>.database.admin_password

Naturally the root cause might be something else as well, such as problems with network connectivity, etc.

...

As a first step, you should check whether the configuration in db.properties and xroad.properties is correct, and verify that the usernames and passwords match to existing users. 

...

Error message "password authentication failed for user "opmonitor_admin"" tells us that

  • login to database was attempted as user "opmonitor_admin"

  • login did not succeed. Maybe the password is wrong (or missing)?

To fix, verify that /etc/xroad.properties looks healthy, and maybe test configured admin username and password with e.g.:

Code Block
# psql -d op-monitor -U opmonitor_admin --host=localhost

In case password is wrong and you don't know the right one, you may need to change the password of relevant database user, and update property files accordingly. 

Example of a healthy /etc/xroad.properties

Code Block
op-monitor.database.admin_password = <your_password>
op-monitor.database.admin_user = opmonitor_admin

...

Error message "must be owner of relation operational_data" tells us that

  • login to database was successful as some database user (output does not tell us which one)

  • this database user did not own relation operational_data, and hence could not make changes to it

This could occur if opmonitor_admin user exists but /etc/xroad.properties contains neither admin_password nor admin_user properties. In this case migrations are attempted as regular user "opmonitor" (or whichever username is configured in db.properties), which may not have correct permissions.

To fix, add correct op-monitor.database.admin_password and op-monitor.database.admin_user properties to /etc/xroad.properties.

Example 3: Wrong serverconf password (RHEL)

...

Here, update from 6.23 to 6.25 was attempted when db.properies had wrong serverconf password. There was no admin user for serverconf, and no admin user configuration in xroad.properties.

As a result package upgrade was successful, but serverconf database does not have the latest migrations, which causes problems when application is run.

Installer has also attempted to create serverconf admin user. It has failed, but xroad.properties has been modified to contain (invalid) admin user name and password.

...

Looking at the database, user serverconf_admin does not exist:

Code Block
serverconf=# \du
List of roles
Role name | Attributes | Member of 
-----------------+------------------------------------------------+-------------------
messagelog | | {}
opmonitor | | {}
opmonitor_admin | | {}
postgres | Superuser, Create role, Create DB, Replication | {opmonitor_admin}
serverconf | | {}

Now, fixing regular serverconf password in db.properties and then running yum reinstall xroad-proxy fails. Installer tries to use serverconf_admin details from xroad.properties. To fix migrations successfully,

  • update correct regular user password in db.properties

  • remove serverconf_admin properties from xroad.properties

  • sudo yum reinstall xroad-proxy

Main steps for fixing failed database migrations

All in all, steps to recover from failed migrations are (adapt to the correct database opmonitor/serverconf/messagelog):

  1.  Find the root cause for failed migrations, and fix it

  2. Repair broken package installation

    1. for op-monitor db migrations 

      1. Ubuntu

        1. try to run

...

        1. sudo apt install -f

...

        1. If that does not help, running

...

        1. sudo apt install --reinstall xroad-opmonitor

...

        1. should do the trick

      1. RHEL

        1. running

...

        1. sudo yum reinstall xroad-opmonitor

...

        1. should do the trick

    1. for serverconf db migrations: follow op-monitor instuctions and replace xroad-opmonitor with xroad-proxy in the reinstall commands

    2. for messagelog db migrations: follow op-monitor instuctions and replace xroad-opmonitor with xroad-addon-messagelog in the reinstall commands

  1. If completing the steps 2 does not fix the problem for some reason, as a last resort, it is possible to run the database setup/migration manually

    1. op-monitor 

      Code Block
      # sudo /usr/share/xroad/scripts/setup_opmonitor_db.sh
    2. serverconf 

      Code Block
      # sudo /usr/share/xroad/scripts/setup_serverconf_db.sh
    3. messagelog 

      Code Block
      # sudo /usr/share/xroad/scripts/setup_messagelog_db.sh

Filter by label (Content by label)
showLabelsfalse
max5
spacesXRDKB
showSpacefalse
sortmodified
reversetrue
typepage
cqllabel = "security-server" and type = "page" and space = "XRDKB"
labelssecurity-server


Page Properties
hiddentrue


Related issues