How to Fix a Failed Security Server Database Migration?
How do database migrations fail?
Sometimes database migrations may fail when they are executed during package installation or upgrade.
This can be caused by problems with configuration, such as missing usernames or passwords in /etc/xroad.properties
or /etc/xroad/db.properties
Excerpt from a package upgrade output, where xroad-opmonitor
database migrations failed due to missing op-monitor.database.admin_password
from /etc/xroad.properties
:
...
Setting up xroad-opmonitor (6.25.0-0.20201013130750git2e3558f.ubuntu18.04) ...
Installing new version of config file /etc/xroad/conf.d/op-monitor-logback.xml ...
Installing new version of config file /etc/xroad/services/opmonitor.conf ...
Database and user exists, skipping database creation.
Unexpected error running Liquibase: ERROR: must be owner of relation operational_data [Failed SQL: ALTER TABLE public.operational_data ADD service_type VARCHAR(255)]
Connection to database has failed, please check database availability and configuration in /etc/xroad/db.properties file
In this case installation finished successfully and the installed packages look healthy:Â
# dpkg -l | grep opmon
ii xroad-addon-opmonitoring 6.25.0-0.20201013130750git2e3558f.ubuntu18.04 all X-Road AddOn: operations monitoring service
ii xroad-opmonitor 6.25.0-0.20201013130750git2e3558f.ubuntu18.04 all X-Road operations monitoring daemon
However, /var/log/xroad/op-monitor.log
shows (after some messages have been sent through security server) signs of database access trouble due to missing migrations:Â
2020-10-14T09:56:50.813Z [qtp1348115836-43] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - ERROR: column "request_size" of relation "operational_data" does not exist
Position: 559
2020-10-14T09:56:50.814Z [qtp1348115836-43] ERROR e.r.x.o.OpMonitorDaemonRequestHandler - Error while handling data store request
javax.persistence.PersistenceException: org.hibernate.exception.SQLGrammarException: could not execute batch
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
In this example problem was with migrations for op-monitor database. Similar issues can occur also for serverconf and messagelog database migrations.Â
Finding the root cause and fixing it
To fix migrations, first you need to find the root cause that prevented migrations from being run. In the earlier example, error message "Connection to database has failed, please check database availability and configuration in /etc/xroad/db.properties file" provided some important hints on this. Typically the problem is with either of these configuration files:
/etc/xroad/db.properties
relevant parameters for regular database users
<database name>.hibernate.connection.url
<database name>.hibernate.connection.username
<database name>.hibernate.connection.password
/etc/xroad.properties
relevant parameters for database admin users
<database name>.database.admin_user
<database name>.database.admin_password
Naturally the root cause might be something else as well, such as problems with network connectivity, etc.
Use of regular and admin database user depends on the installed / upgraded security server version and the used configuration (local or remote database). In some configurations only regular db user exists, and this user does both migrations and data access for the application. In other configurations regular user does data access for the application, and admin user does the migrations.Â
As a first step, you should check whether the configuration in db.properties and xroad.properties
is correct, and verify that the usernames and passwords match to existing users.Â
Different error messages point to different problems. A couple of examples:
Example 1: Wrong opmonitor_admin password (Ubuntu)
Error message "password authentication failed for user "opmonitor_admin"" tells us that
login to database was attempted as user "opmonitor_admin"
login did not succeed. Maybe the password is wrong (or missing)?
To fix, verify that /etc/xroad.properties
looks healthy, and maybe test configured admin username and password with e.g.:
In case password is wrong and you don't know the right one, you may need to change the password of relevant database user, and update property files accordingly.Â
Example of a healthy /etc/xroad.properties
:Â
Example 2: Missing opmonitor_admin user (Ubuntu)
Error message "must be owner of relation operational_data" tells us that
login to database was successful as some database user (output does not tell us which one)
this database user did not own relation operational_data, and hence could not make changes to it
This could occur if opmonitor_admin user exists but /etc/xroad.properties
contains neither admin_password nor admin_user properties. In this case migrations are attempted as regular user "opmonitor" (or whichever username is configured in db.properties), which may not have correct permissions.
To fix, add correct op-monitor.database.admin_password
and op-monitor.database.admin_user properties
to /etc/xroad.properties
.
Example 3: Wrong serverconf password (RHEL)
In this case the error message from installer is not very helpful.
Here, update from 6.23 to 6.25 was attempted when db.properies
had wrong serverconf password. There was no admin user for serverconf, and no admin user configuration in xroad.properties
.
As a result package upgrade was successful, but serverconf database does not have the latest migrations, which causes problems when application is run.
Installer has also attempted to create serverconf admin user. It has failed, but xroad.properties
has been modified to contain (invalid) admin user name and password.
Looking at the database, user serverconf_admin
does not exist:
Now, fixing regular serverconf password in db.properties
and then running yum reinstall xroad-proxy
fails. Installer tries to use serverconf_admin
details from xroad.properties
. To fix migrations successfully,
update correct regular user password in
db.properties
remove
serverconf_admin
properties fromxroad.properties
sudo yum reinstall xroad-proxy
Main steps for fixing failed database migrations
All in all, steps to recover from failed migrations are (adapt to the correct database opmonitor/serverconf/messagelog):
 Find the root cause for failed migrations, and fix it
Repair broken package installation
for op-monitor db migrationsÂ
Ubuntu:Â
try to run
sudo apt install -f
If that does not help, running
sudo apt install --reinstall xroad-opmonitor
should do the trick
RHEL:Â
running
sudo yum reinstall xroad-opmonitor
should do the trick
for serverconf db migrations: follow op-monitor instuctions and replace xroad-opmonitor with xroad-proxy in the reinstall commands
for messagelog db migrations: follow op-monitor instuctions and replace xroad-opmonitor with xroad-addon-messagelog in the reinstall commands
If completing the steps 2 does not fix the problem for some reason, as a last resort, it is possible to run the database setup/migration manually
op-monitorÂ
serverconfÂ
messagelogÂ