Mindoo Blog - Cutting edge technologies - About Java, Lotus Notes and iPhone

  • Major rewrite of Domino JNA for improved performance, now with incremental indexing of Domino data

    Karsten Lehmann  8 January 2018 11:14:30
    I spent a few days during the Christmas holidays and the first week of January working on our Domino JNA project. The result is version 0.9.11, which is now on its way to Maven Central and available for download as an OSGi plugin for XPages developers.

    Here are the three main features that I have been working on:

    Improved performance
    The project source code has been completely rewritten to use JNA direct mapping. This significantly improves performance in areas with lots of native method invocations like document item access or when adding note ids to ID tables.
    Other performance work has been done to speed up LMBCS-Java-String conversion and to lazily convert summary buffer values when doing a database search (the old version always converted the whole buffer).
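
    To illustrate the idea behind lazy conversion, here is a minimal sketch. It is not the Domino JNA code: the class names LazyValue and LazySummaryDemo are made up for this example, and UTF-8 stands in for the real LMBCS decoding. A value is only decoded the first time someone asks for it, and values that are never read are never decoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

/** Hypothetical helper: runs an expensive conversion at most once, on first access. */
final class LazyValue<T> implements Supplier<T> {
    private Supplier<T> decoder; // released after the first call
    private T value;
    private boolean resolved;

    LazyValue(Supplier<T> decoder) {
        this.decoder = decoder;
    }

    @Override
    public synchronized T get() {
        if (!resolved) {
            value = decoder.get();
            resolved = true;
            decoder = null;
        }
        return value;
    }
}

final class LazySummaryDemo {
    static int decodeCount = 0;

    // Stand-in for the real string conversion; UTF-8 is an assumption for illustration only
    static String decode(byte[] raw) {
        decodeCount++;
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] rawLastname = "Lehmann".getBytes(StandardCharsets.UTF_8);
        byte[] rawComment = "never read".getBytes(StandardCharsets.UTF_8);

        LazyValue<String> lastname = new LazyValue<>(() -> decode(rawLastname));
        LazyValue<String> comment = new LazyValue<>(() -> decode(rawComment));

        // Only the accessed value is decoded, and only once
        System.out.println(lastname.get());
        System.out.println(lastname.get());
        System.out.println("decode calls: " + decodeCount);
    }
}
```

    The trade-off is a small wrapper object per value in exchange for skipping conversions that the caller never needs.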

    Incremental data indexing
    As I recently wrote on Twitter, I do not know how much code I have written in the past to sync Domino data to external databases or indexers, because they are more powerful at ad hoc queries than Domino. The good news is that IBM is actively looking into this topic for Domino 10.

    So here is another approach, this time using C API calls, incrementally searching for database changes filtered by an @-formula and not requiring any lookup views.

    The key to implementing it this way was the discovery that NSFSearchExtended3, which I use to search for formula matches, can run incrementally when passed a "since" value. In that mode it not only returns all changed notes matching the formula, but also calls the callback for all changed and deleted notes that do not match the formula.

    This way my code knows what to add/update in the external db or index and also knows what to remove if it existed in the index before.

    To make the algorithm easily reusable, I haven't hard-coded a specific sync target. The whole sync process runs against a simple Java interface.
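
    The following sketch shows the general shape of such a sync loop. All names in it (ChangeType, NoteChange, IndexTarget, MapTarget, IncrementalSyncSketch) are hypothetical and chosen for this example; they are not the actual SyncUtil API. It assumes that each incremental search run yields a list of changes classified as match, non-match or deletion, and applies them to a target behind an interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** How a changed note relates to the selection formula (hypothetical names). */
enum ChangeType { MATCH, NON_MATCH, DELETED }

/** One entry reported by an incremental search run. */
final class NoteChange {
    final String unid;
    final ChangeType type;
    final String data; // payload, only meaningful for MATCH

    NoteChange(String unid, ChangeType type, String data) {
        this.unid = unid;
        this.type = type;
        this.data = data;
    }
}

/** Hypothetical sync target; an external database or indexer would implement this. */
interface IndexTarget {
    void addOrUpdate(String unid, String data);
    void remove(String unid);
}

/** In-memory target used for the demo. */
final class MapTarget implements IndexTarget {
    final Map<String, String> docs = new HashMap<>();

    @Override
    public void addOrUpdate(String unid, String data) {
        docs.put(unid, data);
    }

    @Override
    public void remove(String unid) {
        docs.remove(unid); // no-op if the note was never indexed
    }
}

final class IncrementalSyncSketch {
    /** Applies one batch of changes: matches are upserted, everything else is removed. */
    static void apply(List<NoteChange> changes, IndexTarget target) {
        for (NoteChange c : changes) {
            switch (c.type) {
                case MATCH:
                    target.addOrUpdate(c.unid, c.data);
                    break;
                case NON_MATCH:
                case DELETED:
                    target.remove(c.unid);
                    break;
            }
        }
    }

    public static void main(String[] args) {
        MapTarget target = new MapTarget();

        // First run: two notes match the formula
        List<NoteChange> firstRun = new ArrayList<>();
        firstRun.add(new NoteChange("A", ChangeType.MATCH, "v1"));
        firstRun.add(new NoteChange("B", ChangeType.MATCH, "v1"));
        apply(firstRun, target);

        // Second run: A was edited, B no longer matches, C was deleted
        List<NoteChange> secondRun = new ArrayList<>();
        secondRun.add(new NoteChange("A", ChangeType.MATCH, "v2"));
        secondRun.add(new NoteChange("B", ChangeType.NON_MATCH, null));
        secondRun.add(new NoteChange("C", ChangeType.DELETED, null));
        apply(secondRun, target);

        System.out.println(target.docs);
    }
}
```

    Because the target is just an interface, swapping the in-memory map for a real indexer would not change the loop itself.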

    A sample project is already available with code that synchronizes Domino data with the Java-based CQEngine indexer. The project contains a test case that indexes the fakenames database and makes sure everything is working as expected.
    My plan is to create more implementations, e.g. for SQLite or Lucene, but no promises :-).

    The generic sync process handles replication against multiple replicas of the same database (e.g. when replicating with a Domino cluster). It also handles a selection formula that changes between syncs: in that case we run a performance-optimized comparison to determine what needs to be added to the target and what no longer matches the new selection formula and gets purged.
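
    Conceptually, handling a changed selection formula comes down to two set differences. The sketch below shows only that idea with plain Java sets. It makes no claim about how the optimized comparison in Domino JNA is implemented, and the class name FormulaChangeDiff and the note ids are invented for this example.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class FormulaChangeDiff {
    /** Note ids that match the new formula but are not in the target yet. */
    static Set<Integer> toAdd(Set<Integer> indexed, Set<Integer> matchingNow) {
        Set<Integer> result = new TreeSet<>(matchingNow);
        result.removeAll(indexed);
        return result;
    }

    /** Note ids that are in the target but no longer match the new formula. */
    static Set<Integer> toPurge(Set<Integer> indexed, Set<Integer> matchingNow) {
        Set<Integer> result = new TreeSet<>(indexed);
        result.removeAll(matchingNow);
        return result;
    }

    public static void main(String[] args) {
        // Ids currently stored in the target (made-up values)
        Set<Integer> indexed = new TreeSet<>(Arrays.asList(1002, 1006));
        // Ids matching the new selection formula (made-up values)
        Set<Integer> matchingNow = new TreeSet<>(Arrays.asList(1006, 1010));

        System.out.println("add:   " + toAdd(indexed, matchingNow));
        System.out.println("purge: " + toPurge(indexed, matchingNow));
    }
}
```

    Entries present in both sets stay untouched, so only the difference has to be read from Domino or removed from the target.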

    Better testcases
    This is far from finished, but I started reworking the available test cases so that they no longer only "work on my machine". That happened because the sample database available for download does not contain some newly added lookup views or documents.

    The plan is to automatically create the sample database with the required structure the first time the test cases are executed, based on a Domino address book template, creating lookup views in code and 50,000 fake person documents.


    For other changes, here are the release notes posted on GitHub:
    • Complete project rewrite, now using JNA direct mapping to improve native method invocation performance
    • Other performance improvements: faster LMBCS->Java String conversion, lazy conversion of summary value items in NotesSearch.search(...)
    • Improved database search (NotesSearch.search) that now also optionally returns non-matches and deletions when searching incrementally with a "since" date (see callback class com.mindoo.domino.jna.NotesSearch.SearchCallback)
    • New generic class com.mindoo.domino.jna.sync.SyncUtil to incrementally read Domino data for indexing and migration purposes
    • Sample implementation for SyncUtil that indexes Domino data using CQEngine is available as a separate project (to reduce the Domino JNA dependencies to a minimum)
    • New methods NotesNote.hasReadersField() and NotesNote.getReadersFields() to get reader fields of a note using an optimized C call
    • New methods for NotesDatabase:
      • getTitle() / setTitle(String title)
      • getCategories() / setCategories(String cat)
      • getTemplateName() / setTemplateName(String name)
      • getDesignTemplateName() / setDesignTemplateName(String name)
      • refreshDesign(String server)
    • Added method DirectoryScanner.scan(String formula) to filter directory entries using Domino formula language
    • Bugfixing and package refactoring to hide internal code (struct package moved to com.mindoo.domino.jna.internal)
    • Removed the unfinished com.mindoo.domino.jna.queries package;
      will probably not continue this path, incremental indexing is the way to go

