Avoiding XDMP-EXPNTREECACHEFULL Errors

If you have spent any length of time processing large numbers of records in MarkLogic, then you have probably already met the dreaded XDMP-EXPNTREECACHEFULL error message. It is simply, but mercilessly, telling you that the expanded tree cache is full: there is not enough memory available to hold everything your query needs in order to generate the final result(s). Most of the time, it is the result of poor coding techniques and/or a lack of indexing on the relevant elements/attributes. The simple method below has proven useful on many occasions and is rather straightforward to implement.

Basically, instead of accumulating the processing of all records in memory within a single request, you hand each record to a separate task and let the Task Server queue and run them all asynchronously, each in its own transaction. In this example, I am assuming you want to update the dataset in MarkLogic.

Explaining the code:
  • efficiently identify the records to process - lines 1 to 9
  • process each record separately - line 11
  • use xdmp:spawn-function() for each record - line 12
  • update each record in the database (e.g. change value of Title element) - lines 15-20
  • make sure to commit the update - line 19
  • specify it is an update - line 24
 
let $uris :=
  cts:uris(
    '',
    (),
    cts:and-query((
      cts:collection-query('RecordCollection'),
      cts:element-query(xs:QName('Title'),cts:and-query(()))
    ))
  )

for $uri in $uris
  return xdmp:spawn-function(
    function() {
      (
        let $r := fn:doc($uri)/Record
        let $cur-el := $r/Title
        let $new-el := <Title>New Title</Title>
        return (
          xdmp:node-replace($cur-el,$new-el),xdmp:commit()
        )
      )
    },
    <options xmlns="xdmp:eval">
      <transaction-mode>update</transaction-mode>
    </options>
  )
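
On MarkLogic 9 and later, the transaction-mode option is, as far as I know, deprecated in favour of separate update and commit options. The following is a minimal sketch of the same update under that assumption, not a tested replacement: it reuses the collection and element names from the example above and lets each spawned task commit automatically, so no explicit xdmp:commit() is needed.

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes MarkLogic 9+ (update/commit eval options)
   and an enabled URI lexicon, which cts:uris() requires. :)
let $uris :=
  cts:uris(
    '',
    (),
    cts:and-query((
      cts:collection-query('RecordCollection'),
      cts:element-query(xs:QName('Title'), cts:and-query(()))
    ))
  )

for $uri in $uris
return xdmp:spawn-function(
  function() {
    (: Each task touches a single document, so only that
       document needs to be loaded for the update. :)
    xdmp:node-replace(fn:doc($uri)/Record/Title, <Title>New Title</Title>)
  },
  <options xmlns="xdmp:eval">
    <!-- run the task as an update statement -->
    <update>true</update>
    <!-- commit automatically when the statement completes -->
    <commit>auto</commit>
  </options>
)
```

Either form should behave the same for a one-statement task like this; the explicit xdmp:commit() in the original is what the multi-statement transaction-mode of update calls for.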
