PostgreSQL: Trezor Performance

Some of our PostgreSQL trezor (depository) databases are very slow when selecting data. Selects take a long time (minutes), whether run directly from pgAdmin or from HI.

We have depository databases TEDIS_TREZOR_1 to TEDIS_TREZOR_18.

Depositories #1 to #7 are yearly depositories imported from Sybase (the first and the last cover a shorter interval) and are about 170 GB each.

Depositories #8 and #9 cover the last two months of 2018 (12 GB and 13 GB). Depositories #10 and above are from this year, each covering one month (15 GB to 20 GB). The sizes listed are the sizes of the entire database, including indexes.

Depositories from #9 onward are generated automatically by the D2000 system.

The problem occurs when reading from this year's depositories (#10 and above). A simple select takes anywhere from 5 to 20 minutes. When I run the same select on the largest depository (170 GB), it takes 10 to 20 seconds. On depositories #8 and #9 it completes within 1 second.

I noticed that depositories #8 and #9 have a clustered index, so I tried to cluster the new ones as well; we also tried REINDEX TABLE and VACUUM FULL, but nothing helped, it is still extremely slow.

Please prioritize a solution; this is blocking the development of the project. Thank you.

Good day.

Take a look at https://doc.ipesoft.com/label/D2DOCEN/trezorove_databazy

At the end of that page there is a text:

An example of a batch file used to perform depository database maintenance and export, as well as export of the table trezors located in the archive database MyApp.Archiv. The batch file requires as a parameter the name of the depository database, which can be provided by setting the parameter TrezorPostCompressPar to the value #TREZOR#.

And there are 4 things done (before the export itself):

  • permitting write access to the depository database:

rem permit write access to depository database and cluster the data table
echo alter database "%1" set default_transaction_read_only=false | psql -S -U postgres MyApp.Archiv >> %MyDir%%1.log

  • configuring clustering of data based on an index:

echo alter table data cluster on ix_data_rc | psql -S -U postgres %1 >> %MyDir%%1.log

  • clustering of data (it may take from a few minutes to several hours, according to DB size and HDD speed):

echo cluster data | psql -S -U postgres %1 >> %MyDir%%1.log

  • setting read-only access to the database:

rem set access to depository database back to read only
echo alter database "%1" set default_transaction_read_only=true | psql -S -U postgres MyApp.Archiv >> %MyDir%%1.log
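
Putting those four steps together, a minimal sketch of the maintenance batch file might look as follows (the log directory c:\trezor_logs\ is a hypothetical example; %1 is the depository database name supplied via TrezorPostCompressPar = #TREZOR#; psql -S runs in single-line mode, so each echoed line executes as one SQL command):

@echo off
rem %1 = name of the depository database (passed as #TREZOR#)
set MyDir=c:\trezor_logs\

rem permit write access to the depository database
echo alter database "%1" set default_transaction_read_only=false | psql -S -U postgres MyApp.Archiv >> %MyDir%%1.log

rem configure clustering of the data table on the ix_data_rc index
echo alter table data cluster on ix_data_rc | psql -S -U postgres %1 >> %MyDir%%1.log

rem physically reorder the data (may take minutes to hours)
echo cluster data | psql -S -U postgres %1 >> %MyDir%%1.log

rem set the depository database back to read only
echo alter database "%1" set default_transaction_read_only=true | psql -S -U postgres MyApp.Archiv >> %MyDir%%1.log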

Try it out (first manually on some depository database). It would be a good idea to prevent the archive’s access to the depository database during the maintenance (because it will either fail or block the maintenance job), so DISMOUNT the depository database first.
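
To check whether the data table of a depository has actually been clustered, you can query the PostgreSQL system catalogs (a sketch, assuming the archive table is named data; run it via psql connected to the depository database):

-- which index, if any, the data table was last clustered on
select i.relname as index_name, x.indisclustered
from pg_index x
join pg_class c on c.oid = x.indrelid
join pg_class i on i.oid = x.indexrelid
where c.relname = 'data';

-- correlation close to 1 (or -1) means rows are physically stored in index order
select attname, correlation from pg_stats where tablename = 'data';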

However, when a database is dismounted, the access to the database is forbidden - see above in the same document:

update pg_database set datallowconn = false where datname = 'APLIKACIA_TREZOR_#ID#'

So you have to manually perform
update pg_database set datallowconn = true where datname = 'APLIKACIA_TREZOR_#ID#'
to be able to connect to the database.
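
For example (a sketch; if placed at the start of the batch file above, %1 again holds the depository database name, and the statement is executed while connected to MyApp.Archiv):

echo update pg_database set datallowconn = true where datname = '%1' | psql -S -U postgres MyApp.Archiv >> %MyDir%%1.log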

Then you can automate it by putting it into a batch file that is run as the TrezorPostCompressCmd, and set TrezorCompressOffline to 2 - according to the manual:

Value 2 means that D2000 Archiv will not access the depository database while TrezorPostCompressCmd is running, so that this command may perform various maintenance tasks which could otherwise block the archive.

This is how we have it configured in a few places: when the archive stops using the depository database, it marks it as inaccessible (but does not forbid access to it) and runs the TrezorPostCompressCmd command, which performs the maintenance in the depository database. When the maintenance is over, the archive starts using the depository again.

Last but not least: if you have any data with a short archiving depth (e.g. 3 days) and you rely on it being available in the depository database, then during the maintenance you only have the last 3 days available (from the archive database), because the depository for the last month is unavailable and only older depositories can be read. So it is not good practice to rely on such reads from the depository database.

By the way, depository databases are meant for occasional reading, so I suggest extending the archiving depth so that at least 90% of the necessary data is in the archive database.

Good day. We solved the problem using the supplied script.

Thank you.