• Java UDFs to compress and uncompress BLOBs

    From Jeremy Rickard@21:1/5 to All on Fri Mar 31 20:53:02 2023
    For interest in case useful. I've been working on this a little while. Due to the need to raise a case it took longer than I imagined, but here at last...

    Release 0.1, see: https://github.com/easydataservices/db2-compress

    Please note...
    * An IBM case has confirmed that Java UDFs can return LOB data types. The documentation that says otherwise is out of date and is due to be corrected.

    * On my laptop running Db2 11.5.8 on Ubuntu, an uncompressed 25MB JSON document compresses 6.8x in about 2 seconds, and uncompresses in about 1 second.

    * If you want to work with BLOBs larger than 64MB you will need to increase JAVA_HEAP_SZ.

    * Performance is now perhaps adequate for archive databases of an appropriate design. I would not suggest using this in any database where response time matters.

    * Smaller LOBs containing similar data are processed faster, with run time more or less proportional to size.
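
    For anyone who wants a rough feel for ratio and timing on their own data before trying the UDFs, the following is a minimal standalone sketch. It is not code from db2-compress: the class name BlobCodecSketch, its methods, and the sample data are all hypothetical, and GZIP from java.util.zip is an assumption (the project may use a different algorithm). It only round-trips a byte array in memory and prints the ratio and elapsed times.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, NOT taken from db2-compress: a GZIP round trip on byte arrays.
final class BlobCodecSketch {

    // Compress a byte array with GZIP (assumed algorithm, chosen for illustration only).
    static byte[] compress(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverse of compress(): inflate a GZIP byte array back to the original bytes.
    static byte[] uncompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Build repetitive JSON-like sample data; replace with your own documents for a real test.
    static byte[] sampleJson(int records) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < records; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(i)
              .append(",\"status\":\"ARCHIVED\",\"note\":\"sample row\"}");
        }
        return sb.append(']').toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = sampleJson(100_000);

        long t0 = System.nanoTime();
        byte[] packed = compress(original);
        long t1 = System.nanoTime();
        byte[] restored = uncompress(packed);
        long t2 = System.nanoTime();

        // Fail loudly if the round trip does not reproduce the input exactly.
        if (!Arrays.equals(original, restored)) {
            throw new IllegalStateException("round trip mismatch");
        }
        System.out.printf("original=%d bytes, compressed=%d bytes, ratio=%.1fx%n",
                original.length, packed.length, original.length / (double) packed.length);
        System.out.printf("compress=%d ms, uncompress=%d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
```

    Note that this sketch holds the input and the output in memory at the same time, which is one reason heap size matters for large documents. Figures from a standalone JVM will not match what you see inside Db2, so treat them as a first indication only.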

    If you think you have a possible use for these functions, please read the notes in the README, and be sure to test it works well for you before deploying.

    Jeremy Rickard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Alexander@21:1/5 to Jeremy Rickard on Wed Apr 5 16:37:16 2023
    Jeremy Rickard <jrickard27@gmail.com> wrote:
    For interest in case useful. I've been working on this a little while.
    Due to the need to raise a case it took longer than I imagined, but here at last...

    It’s an interesting exercise, but generally it’s the wrong approach.
    Just don’t store LOBs in the DB if DB size or LOB access performance could be an issue.

    Alexander Veremev.

  • From Jeremy Rickard@21:1/5 to Alexander on Thu Apr 6 05:36:19 2023
    On Thursday, 6 April 2023 at 04:37:20 UTC+12, Alexander wrote:
    Jeremy Rickard <jrick...@gmail.com> wrote:
    For interest in case useful. I've been working on this a little while.
    Due to the need to raise a case it took longer than I imagined, but here at last...
    It’s an interesting exercise, but generally it’s the wrong approach.
    Just don’t store LOBs in the DB if DB size or LOB access performance could be an issue.

    Alexander Veremev.

    I already advised caution, but that's a sweeping statement. In the real world, unfortunately, some of these archive databases can grow and grow. With the right kind of data (compresses well, seldom retrieved), there may be worthwhile benefits.
    For example:
    * Cheaper HADR replication (logs are smaller), if needed
    * Faster, smaller backups and recovery
    * Leverage standard database access controls
    * Leverage standard database solutions for encrypting sensitive data

    I would generally not compress any LOB in a non-archiving context. By archiving, I mean writing data to a separate store where you don't (generally) expect to retrieve it again.

    Jeremy Rickard
