An efficient way of removing objects from an indexedDB object store that are missing a property

I am thinking about how to make a certain operation in a project of mine more efficient. The current implementation loads all objects from an object store, iterates through the array testing whether each object is missing a property or has that property set to undefined, collects those objects in a second array, and then removes all of them.

I am already using getAll for its obvious performance benefit over cursor iteration.

I am concurrently calling individual delete requests, so there is no speedup to be had there, short of the IndexedDB API evolving to support batch deletes on non-indexed, non-keypath props that are missing values.

    The issue is that I have no way of checking against the property when the property is not in the keypath of the object store without fully loading each object into memory. The objects are rather large in some cases. Sometimes one of the properties of each object is extremely large (essentially a string of an html document).

    I cannot use an index, because properties not present in objects, or that do not have a value, do not appear in an index.

    Is there a way to avoid loading such objects into memory?

    I have thought about partitioning, using two object stores: one for queryable props and one for full data. But this devolves into having to do extra requests on every read, and my app does a lot more reading than this occasional batch delete operation.

    I have thought about storing an extra property per object that always has a value, like myObject.doesOtherPropertyHaveValue, which contains 0/1 and is therefore indexable, but this doesn’t seem great either. Sure, I could query by just this index and use getAllKeys, and that solves the problem. However, now every add/put has to maintain this functional dependency.
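A minimal sketch of that flag approach, assuming a hypothetical `body` property, a derived `hasBody` flag, and an index named `byHasBody` created on it (none of these names come from the actual app). Routing every write through one wrapper keeps the functional dependency in a single place:

```javascript
// Assumes the index was created during an upgrade with something like:
//   store.createIndex('byHasBody', 'hasBody');
// 'body', 'hasBody' and 'byHasBody' are illustrative names only.

// Return a copy of the record with a 0/1 flag derived from `body`.
// A missing property and an explicit undefined both yield 0.
function withPresenceFlag(record) {
  return { ...record, hasBody: record.body === undefined ? 0 : 1 };
}

// Every add/put goes through this wrapper so the flag cannot drift.
function putWithFlag(store, record) {
  return store.put(withPresenceFlag(record));
}

// Delete every record whose flag is 0. Only primary keys are read,
// so the large values never have to be loaded into memory.
function deleteRecordsMissingBody(store) {
  const request = store.index('byHasBody').getAllKeys(0);
  request.onsuccess = () => {
    for (const primaryKey of request.result) {
      store.delete(primaryKey);
    }
  };
  return request;
}
```

Here `store` is expected to be an IDBObjectStore obtained from a 'readwrite' transaction.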

    Any advice is appreciated.

2 solutions collected from the web for “An efficient way of removing objects from an indexedDB object store that are missing a property”

    If the records have the form {key, prop} where prop is the one that may not be present, you can create an index on [key, prop]. This will only have index records when prop is present. Then open two key-only cursors: one on the store (C1) and one on the index (C2). Check whether C1.primaryKey equals C2.key[0] (which, for this index, is the same value as C2.primaryKey). If so, the prop is present; advance both. If the keys are not equal, then C1 points at a record that doesn’t have the prop; delete it and advance C1. Repeat. (Watch out for edge cases when you hit the end of the range.)

    Two problems with this: (1) you’re still using cursors, so you are still paying the cost of round trips (unlike getAll()), and (2) if prop is large (e.g. the body of an HTML document, as you mention), then even with key cursors you’re still shuffling a large amount of data around, because prop is part of every index key.
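The two-cursor walk described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than a tested implementation: `store` is an IDBObjectStore from a 'readwrite' transaction, `index` is an index on it whose key path is [store key path, prop], and the function name, `onDone`, and the injectable `compareKeys` parameter are all hypothetical.

```javascript
// Walk a store key cursor (C1) and an index key cursor (C2) in lockstep and
// delete every record that has no matching index entry.
// compareKeys defaults to indexedDB.cmp; it is a parameter only so the
// merge logic can be exercised outside a browser.
function deleteRecordsMissingProp(store, index, onDone, compareKeys) {
  const cmp = compareKeys || ((a, b) => indexedDB.cmp(a, b));
  const storeReq = store.openKeyCursor(); // C1: visits every record
  const indexReq = index.openKeyCursor(); // C2: visits only records with prop
  let c1 = null;
  let c2 = null;
  let pending = 2; // number of cursor requests we are still waiting on
  let deleted = 0;

  function compare() {
    if (!c1) {
      // Store exhausted. Deletes are queued here, not yet committed;
      // the transaction's complete event signals that they landed.
      onDone(deleted);
      return;
    }
    if (c2 && cmp(c1.primaryKey, c2.primaryKey) === 0) {
      // prop is present on this record: advance both cursors.
      pending = 2;
      c1.continue();
      c2.continue();
    } else {
      // No index entry (or the index is exhausted): prop is missing.
      store.delete(c1.primaryKey);
      deleted += 1;
      pending = 1;
      c1.continue();
    }
  }

  storeReq.onsuccess = () => {
    c1 = storeReq.result; // null once the store is exhausted
    if (--pending === 0) compare();
  };
  indexReq.onsuccess = () => {
    c2 = indexReq.result; // null once the index is exhausted
    if (--pending === 0) compare();
  };
}
```

The `pending` counter handles the coordination: a comparison only runs once every cursor that was advanced has reported back, which also covers the end-of-range cases where one request resolves to null.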

    (In the future we’d like to tackle this by adding either a more general query mechanism or custom indexing possibly combined with delete-on-index – either of which would make this much easier and more efficient)

    …collecting the set of such objects in a second array, and then removing all of these objects…

    If you stick with this approach, remember that you can issue delete() calls as you find them; no need to collect all the objects, and you don’t need to wait for the deletes to finish; you can use IDB in a “fire and forget” fashion for write operations.
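A rough sketch of that “fire and forget” pattern, assuming the store uses an in-line key path so each loaded record carries its own key. The function names and the 'pages', 'body', and 'id' identifiers are hypothetical:

```javascript
// Issue a delete as soon as a record without `propName` is found.
// No second array is built and no handler is attached to the individual
// delete requests; the surrounding transaction reports overall success.
function deleteAsFound(store, propName, keyPath) {
  const request = store.getAll();
  request.onsuccess = () => {
    for (const record of request.result) {
      if (record[propName] === undefined) {
        store.delete(record[keyPath]); // fire and forget
      }
    }
  };
  return request;
}

// Possible usage. `db` is an open IDBDatabase; names are illustrative.
function cleanUp(db, onComplete) {
  const tx = db.transaction('pages', 'readwrite');
  deleteAsFound(tx.objectStore('pages'), 'body', 'id');
  tx.oncomplete = () => onComplete(null);  // every delete has been committed
  tx.onabort = () => onComplete(tx.error); // nothing was deleted
}
```

Waiting on the transaction's complete event rather than on each request is what makes ignoring the individual delete results safe: if any delete fails, the transaction aborts as a whole.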

    You’re on the right path: denormalization may be necessary to improve performance. Going by the IndexedDB docs, there is unfortunately no way to query the way you need.

    If the true bottleneck is I/O or marshalling data into JS-land, then maybe try compressing the data before writing and decompressing it when you actually need to do something with it? GZIP can compress text very well, sometimes shrinking it by up to 70%. There are a few GZIP libraries for JS that could work.
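Where it is available (recent browsers and Node 18+), the built-in CompressionStream API is one way to try this without a library. A minimal sketch, with hypothetical helper names `gzipText` and `gunzipText`; it swaps in the platform API rather than any particular GZIP library:

```javascript
// Compress a string to a gzip ArrayBuffer, which IndexedDB can store
// directly in place of the original string.
async function gzipText(text) {
  const compressed = new Blob([text])
    .stream()
    .pipeThrough(new CompressionStream('gzip'));
  return new Response(compressed).arrayBuffer();
}

// Reverse of gzipText: turn the stored ArrayBuffer back into a string.
async function gunzipText(buffer) {
  const decompressed = new Blob([buffer])
    .stream()
    .pipeThrough(new DecompressionStream('gzip'));
  return new Response(decompressed).text();
}
```

Both helpers are asynchronous, so they should run before the transaction is opened (for writes) or after the read request has completed (for reads). An IndexedDB transaction auto-commits once it has no pending requests, so awaiting compression in the middle of one would leave it inactive.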

    But, as always, benchmark!