Squashed commit of the following:

commit 3b229cad538ad88ef2d366964c4261bc0e02fb7c
Author: Simon Cambier <simon.cambier@protonmail.com>
Date:   Sat Nov 5 14:30:08 2022 +0100

    1.8.0-beta.1

commit f43c369b2dd0a1083b171724e3f7466429505629
Author: Simon Cambier <simon.cambier@protonmail.com>
Date:   Sat Nov 5 13:39:45 2022 +0100

    Squashed commit of the following:

    commit 93508ee95046385baf62475e5bd835ed9fafe6d3
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Sat Nov 5 13:35:56 2022 +0100

        Cleaning

    commit 205e6a7cce4c1939338820f366f7ae8a067ec7fb
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Fri Nov 4 08:53:46 2022 +0100

        Added logs

    commit ea19b94e164581829908ac71d09a60e230925a7f
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Thu Nov 3 22:27:24 2022 +0100

        Notices

    commit 53ff4e822b3c292a56da150b94a1cfe43e199d44
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Thu Nov 3 22:27:09 2022 +0100

        Custom minisearch build + Notice when the cache could be corrupted

    commit 498408afd1c350dd68969318c3533fff8aa6c172
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Thu Nov 3 22:26:22 2022 +0100

        Added a button to manually clear the cache

    commit 90afe5d3868989626ba4613b064e24ac7efa88be
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Thu Nov 3 22:03:41 2022 +0100

        Optimized loading minisearch from cache

    commit 719dcb9c82f09f56dabb828ac13c9c1db7f795bb
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Thu Nov 3 21:43:49 2022 +0100

        #92 - Refactored cache to make it behave like pre-indexedDb

    commit 2164ccfa39d83eef23231d01e8aa35ac30e0d31c
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Wed Nov 2 23:13:59 2022 +0100

        Removed cache & tmp engine

    commit 50eb33bbd4d074be9a9952eaf871cd8f58b327e6
    Author: Simon Cambier <simon.cambier@protonmail.com>
    Date:   Wed Nov 2 22:56:04 2022 +0100

        More efficient loading of PDFs

commit a6342a675f
Author: Simon Cambier <simon.cambier@protonmail.com>
Date:   Wed Nov 2 10:34:02 2022 +0100

    #120 - Cleaning of old cache databases

commit b6890567f3
Author: Simon Cambier <simon.cambier@protonmail.com>
Date:   Mon Oct 31 17:28:17 2022 +0100

    Updated Readme

Author: Simon Cambier
Date:   2022-11-05 14:58:25 +01:00
Parent: 777b172904
Commit: 087ec5cc99
12 changed files with 367 additions and 164 deletions


@@ -14,17 +14,20 @@ Under the hood, it uses the excellent [MiniSearch](https://github.com/lucaong/mi
 ## Features
+- Find your notes faster than ever
+- Workflow similar to the "Quick Switcher" core plugin
 - Automatic document scoring using the [BM25 algorithm](https://github.com/lucaong/minisearch/issues/129#issuecomment-1046257399)
 - The relevance of a document against a query depends on the number of times the query terms appear in the document, its filename, and its headings
-- Can search other plaintext files and PDFs (configurable in settings)
-- Workflow similar to "Quick Switcher" plugins
+- Can search other plaintext files and PDFs
+- Opt-in in settings
+- PDF indexing is disabled on iOS
 - Keyboard first: you never have to use your mouse
 - Resistance to typos
 - Switch between Vault and In-file search to quickly skim multiple results in a single note
 - Supports `"expressions in quotes"` and `-exclusions`
 - Directly Insert a `[[link]]` from the search results
 - Respects Obsidian's "Excluded Files" list - results are downranked, not hidden
-- Optional support for Vim navigation keys (ctrl + j, k, n, p)
+- Supports Vim navigation keys (ctrl + j, k, n, p)
 **Note:** support of Chinese, Japanese, Korean, etc. depends on [this additional plugin](https://github.com/aidenlx/cm-chs-patch). Please read its documentation for more information.
@@ -121,17 +124,15 @@ See [styles.css](./assets/styles.css) for more information.
 **Omnisearch makes Obsidian sluggish at startup.**
-- You may have _big_ documents. Huge notes (like novels) can freeze the interface for a short time when being indexed. Enabling the setting "_Persist cache on disk_" may help you in this case.
+- You may have _big_ documents. Huge notes (like novels) can freeze the interface for a short time when being indexed. While Omnisearch uses a cache between sessions, it's still rebuilt at startup to keep it up-to-date.
-**I have thousands of notes, and at startup I have to wait a few seconds before making a query, or else Omnisearch does not return all the expected results.**
+**I have thousands of notes, and at startup I have to wait a few seconds before Omnisearch gives me the context of a result.**
-- Enabling the setting "_Persist cache on disk_" may help you in this case.
+- Omnisearch refreshes its index at startup. During this time, you can still find notes, but Omnisearch is not able to show you the excerpts.
 **Omnisearch gives inconsistent/invalid results, or there are errors in the developer console.**
-- Go in Omnisearch settings.
-- If applicable, disable and re-enable "*Persist cache on disk*".
-- Restart Obsidian to clear the cache and force a reindex.
+- Restart Obsidian to force a reindex of Omnisearch
 **A query should return a result that does not appear.**


@@ -48,7 +48,7 @@
 "@vanakat/plugin-api": "0.1.0",
 "dexie": "^3.2.2",
 "lodash-es": "4.17.21",
-"minisearch": "5.0.0",
+"minisearch": "github:scambier/minisearch#callback_desync",
 "p-limit": "^4.0.0",
 "pako": "^2.0.4",
 "pure-md5": "^0.1.14"

14
pnpm-lock.yaml generated

@@ -21,7 +21,7 @@ specifiers:
 dexie: ^3.2.2
 jest: ^27.5.1
 lodash-es: 4.17.21
-minisearch: 5.0.0
+minisearch: github:scambier/minisearch#callback_desync
 obsidian: latest
 p-limit: ^4.0.0
 pako: ^2.0.4
@@ -45,7 +45,7 @@ dependencies:
 '@vanakat/plugin-api': 0.1.0
 dexie: 3.2.2
 lodash-es: 4.17.21
-minisearch: 5.0.0
+minisearch: github.com/scambier/minisearch/adf11cab46d851220a41c9ad95ed986b630f0f3c
 p-limit: 4.0.0
 pako: 2.0.4
 pure-md5: 0.1.14
@@ -3866,10 +3866,6 @@ packages:
 resolution: {integrity: sha512-Jsjnk4bw3YJqYzbdyBiNsPWHPfO++UGG749Cxs6peCu5Xg4nrena6OVxOYxrQTqww0Jmwt+Ref8rggumkTLz9Q==}
 dev: true
-/minisearch/5.0.0:
-resolution: {integrity: sha512-VEwBhl8aFtc2UG2XmP7a4XaZxVfNhe7GvB2W/ZRGbLL3P3LbBhkoOezBWsMqG8Mr5VonqXAMRWth79XXKja1bQ==}
-dev: false
 /mkdirp/0.5.6:
 resolution: {integrity: sha512-FP+p8RB8OWpF3YZBCrP5gtADmtXApB5AMLn+vdyA+PyxCjrCs00mjyUozssO33cwDeT3wNGdLxJ5M//YqtHAJw==}
 hasBin: true
@@ -4951,3 +4947,9 @@ packages:
 resolution: {integrity: sha512-9bnSc/HEW2uRy67wc+T8UwauLuPJVn28jb+GtJY16iiKWyvmYJRXVT4UamsAEGQfPohgr2q4Tq0sQbQlxTfi1g==}
 engines: {node: '>=12.20'}
 dev: false
+github.com/scambier/minisearch/adf11cab46d851220a41c9ad95ed986b630f0f3c:
+resolution: {tarball: https://codeload.github.com/scambier/minisearch/tar.gz/adf11cab46d851220a41c9ad95ed986b630f0f3c}
+name: minisearch
+version: 5.0.0
+dev: false


@@ -1,11 +1,13 @@
import type { TFile } from 'obsidian' import { Notice, type TFile } from 'obsidian'
import type { IndexedDocument } from './globals' import type { IndexedDocument } from './globals'
import { database } from './database' import { database } from './database'
import MiniSearch from 'minisearch' import MiniSearch from 'minisearch'
import { minisearchOptions } from './search/search-engine' import { minisearchOptions } from './search/search-engine'
import { makeMD5, wait } from './tools/utils'
import { settings } from './settings'
class CacheManager { class CacheManager {
private documentsCache: Map<string, IndexedDocument> = new Map() private liveDocuments: Map<string, IndexedDocument> = new Map()
/** /**
* Show an empty input field next time the user opens Omnisearch modal * Show an empty input field next time the user opens Omnisearch modal
*/ */
@@ -35,48 +37,147 @@ class CacheManager {
return data return data
} }
public async updateDocument(path: string, note: IndexedDocument) { /**
this.documentsCache.set(path, note) * Important: keep this method async for the day it _really_ becomes async.
* This will avoid a refactor.
* @param path
* @param note
*/
public async updateLiveDocument(
path: string,
note: IndexedDocument
): Promise<void> {
this.liveDocuments.set(path, note)
} }
public deleteDocument(key: string): void { public deleteLiveDocument(key: string): void {
this.documentsCache.delete(key) this.liveDocuments.delete(key)
} }
public getDocument(key: string): IndexedDocument | undefined { public getLiveDocument(key: string): IndexedDocument | undefined {
return this.documentsCache.get(key) return this.liveDocuments.get(key)
}
public getNonExistingNotesFromMemCache(): IndexedDocument[] {
return Object.values(this.documentsCache).filter(note => note.doesNotExist)
} }
public isDocumentOutdated(file: TFile): boolean { public isDocumentOutdated(file: TFile): boolean {
const indexedNote = this.getDocument(file.path) const indexedNote = this.getLiveDocument(file.path)
return !indexedNote || indexedNote.mtime !== file.stat.mtime return !indexedNote || indexedNote.mtime !== file.stat.mtime
} }
//#region Minisearch //#region Minisearch
public getDocumentsChecksum(documents: IndexedDocument[]): string {
return makeMD5(
JSON.stringify(
documents.sort((a, b) => {
if (a.path < b.path) {
return -1
} else if (a.path > b.path) {
return 1
}
return 0
})
)
)
}
public async getMinisearchCache(): Promise<MiniSearch | null> { public async getMinisearchCache(): Promise<MiniSearch | null> {
const cache = (await database.minisearch.toArray())[0] // Retrieve documents and make their checksum
if (!cache) { const cachedDocs = await database.documents.toArray()
const checksum = this.getDocumentsChecksum(cachedDocs.map(d => d.document))
// Add those documents in the live cache
cachedDocs.forEach(doc =>
cacheManager.updateLiveDocument(doc.path, doc.document)
)
// Retrieve the search cache, and verify the checksum
const cachedIndex = (await database.minisearch.toArray())[0]
if (cachedIndex?.checksum !== checksum) {
console.warn("Omnisearch - Cache - Checksums don't match, clearing cache")
// Invalid (or null) cache, clear everything
await database.minisearch.clear()
await database.documents.clear()
return null return null
} }
try { try {
return MiniSearch.loadJSON(cache.data, minisearchOptions) return MiniSearch.loadJS(cachedIndex.data, minisearchOptions)
} catch (e) { } catch (e) {
if (settings.showIndexingNotices) {
new Notice(
'Omnisearch - Cache missing or invalid. Some freezes may occur while Omnisearch indexes your vault.'
)
}
console.error('Omnisearch - Error while loading Minisearch cache') console.error('Omnisearch - Error while loading Minisearch cache')
console.error(e) console.error(e)
return null return null
} }
} }
public async writeMinisearchCache(minisearch: MiniSearch): Promise<void> { /**
* Get a dict listing the deleted/added documents since last cache
* @param documents
*/
public async getDiffDocuments(documents: IndexedDocument[]): Promise<{
toDelete: IndexedDocument[]
toAdd: IndexedDocument[]
toUpdate: { old: IndexedDocument; new: IndexedDocument }[]
}> {
let cachedDocs = await database.documents.toArray()
const toAdd = documents.filter(
d => !cachedDocs.find(c => c.path === d.path)
)
const toDelete = cachedDocs
.filter(c => !documents.find(d => d.path === c.path))
.map(d => d.document)
const toUpdate = cachedDocs
.filter(c =>
documents.find(d => d.path === c.path && d.mtime !== c.mtime)
)
.map(c => ({
old: c.document,
new: documents.find(d => d.path === c.path)!,
}))
return {
toDelete,
toAdd,
toUpdate,
}
}
public async writeMinisearchCache(
minisearch: MiniSearch,
documents: IndexedDocument[]
): Promise<void> {
const { toDelete, toAdd, toUpdate } = await this.getDiffDocuments(documents)
// Delete
// console.log(`Omnisearch - Cache - Will delete ${toDelete.length} documents`)
await database.documents.bulkDelete(toDelete.map(o => o.path))
// Add
// console.log(`Omnisearch - Cache - Will add ${toAdd.length} documents`)
await database.documents.bulkAdd(
toAdd.map(o => ({ document: o, mtime: o.mtime, path: o.path }))
)
// Update
// console.log(`Omnisearch - Cache - Will update ${toUpdate.length} documents`)
await database.documents.bulkPut(
toUpdate.map(o => ({
document: o.new,
mtime: o.new.mtime,
path: o.new.path,
}))
)
await database.minisearch.clear() await database.minisearch.clear()
await database.minisearch.add({ await database.minisearch.add({
date: new Date().toISOString(), date: new Date().toISOString(),
data: JSON.stringify(minisearch.toJSON()), checksum: this.getDocumentsChecksum(documents),
data: minisearch.toJSON(),
}) })
console.log('Omnisearch - Search cache written') console.log('Omnisearch - Search cache written')
} }


@@ -10,7 +10,7 @@
 $: reg = stringsToRegex(note.foundWords)
 $: cleanedContent = makeExcerpt(note.content, note.matches[0]?.offset ?? -1)
-$: glyph = cacheManager.getDocument(note.path)?.doesNotExist
+$: glyph = cacheManager.getLiveDocument(note.path)?.doesNotExist
 $: title = settings.showShortName ? note.basename : note.path
 </script>


@@ -1,21 +1,72 @@
import Dexie from 'dexie' import Dexie from 'dexie'
import type { AsPlainObject } from 'minisearch'
import type { IndexedDocument } from './globals'
class OmnisearchCache extends Dexie { export class OmnisearchCache extends Dexie {
pdf!: Dexie.Table< public static readonly dbVersion = 7
{ path: string; hash: string; size: number; text: string }, public static readonly dbPrefix = 'omnisearch/cache/'
public static readonly dbName = OmnisearchCache.dbPrefix + app.appId
private static instance: OmnisearchCache
/**
* Deletes Omnisearch databases that have an older version than the current one
*/
public static async clearOldDatabases(): Promise<void> {
const toDelete = (await indexedDB.databases()).filter(
db =>
db.name?.startsWith(OmnisearchCache.dbPrefix) &&
// version multiplied by 10 https://github.com/dexie/Dexie.js/issues/59
db.version !== OmnisearchCache.dbVersion * 10
)
if (toDelete.length) {
console.log('Omnisearch - Those IndexedDb databases will be deleted:')
for (const db of toDelete) {
if (db.name) {
console.log(db.name + ' ' + db.version)
indexedDB.deleteDatabase(db.name)
}
}
}
}
//#region Table declarations
pdf!: Dexie.Table<{ path: string; hash: string; text: string }, string>
documents!: Dexie.Table<
{ path: string; mtime: number; document: IndexedDocument },
string string
> >
searchHistory!: Dexie.Table<{ id?: number; query: string }, number> searchHistory!: Dexie.Table<{ id?: number; query: string }, number>
minisearch!: Dexie.Table<{ date: string; data: string }, string> minisearch!: Dexie.Table<
{ date: string; checksum: string; data: AsPlainObject },
string
>
constructor() { //#endregion Table declarations
super('omnisearch/cache/' + app.appId)
this.version(5).stores({ public static getInstance() {
if (!OmnisearchCache.instance) {
OmnisearchCache.instance = new OmnisearchCache()
}
return OmnisearchCache.instance
}
private constructor() {
super(OmnisearchCache.dbName)
// Database structure
this.version(OmnisearchCache.dbVersion).stores({
pdf: 'path, hash, size', pdf: 'path, hash, size',
searchHistory: '++id', searchHistory: '++id',
documents: 'path',
minisearch: 'date', minisearch: 'date',
}) })
} }
public async clearCache() {
await this.minisearch.clear()
await this.documents.clear()
}
} }
export const database = new OmnisearchCache() export const database = OmnisearchCache.getInstance()
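
The selection rule in `clearOldDatabases` can be isolated from IndexedDB and checked on plain data. The sketch below assumes the entries have the optional `name` and `version` fields that `indexedDB.databases()` reports, and it relies on the commit's own comment that Dexie stores its version multiplied by 10. `DbInfo` and `selectOutdated` are hypothetical names.

```typescript
// Assumed shape of the entries returned by indexedDB.databases().
type DbInfo = { name?: string; version?: number }

const DB_PREFIX = 'omnisearch/cache/'
const DB_VERSION = 7

// Return the names of the databases that carry the given prefix
// but whose native version differs from the Dexie version times 10.
function selectOutdated(
  dbs: DbInfo[],
  prefix: string,
  dexieVersion: number
): string[] {
  const names: string[] = []
  for (const db of dbs) {
    if (
      db.name !== undefined &&
      db.name.startsWith(prefix) &&
      db.version !== dexieVersion * 10
    ) {
      names.push(db.name)
    }
  }
  return names
}

const sample: DbInfo[] = [
  { name: 'omnisearch/cache/abc', version: 50 }, // older schema: selected
  { name: 'omnisearch/cache/def', version: 70 }, // current schema: kept
  { name: 'another-plugin', version: 10 }, // other prefix: ignored
]
console.log(selectOutdated(sample, DB_PREFIX, DB_VERSION))
```

Keeping the filter as a pure function makes the rule testable without a browser; the actual deletion would still go through `indexedDB.deleteDatabase`, as in the commit.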


@@ -11,6 +11,7 @@ import type { TFile } from 'obsidian'
import type { IndexedDocument } from './globals' import type { IndexedDocument } from './globals'
import { pdfManager } from './pdf/pdf-manager' import { pdfManager } from './pdf/pdf-manager'
import { getNonExistingNotes } from './tools/notes' import { getNonExistingNotes } from './tools/notes'
import { database } from './database'
/** /**
* Return all plaintext files as IndexedDocuments * Return all plaintext files as IndexedDocuments
@@ -21,7 +22,7 @@ export async function getPlainTextFiles(): Promise<IndexedDocument[]> {
for (const file of allFiles) { for (const file of allFiles) {
const doc = await fileToIndexedDocument(file) const doc = await fileToIndexedDocument(file)
data.push(doc) data.push(doc)
await cacheManager.updateDocument(file.path, doc) await cacheManager.updateLiveDocument(file.path, doc)
} }
return data return data
} }
@@ -31,15 +32,19 @@ export async function getPlainTextFiles(): Promise<IndexedDocument[]> {
* If a PDF isn't cached, it will be read from the disk and added to the IndexedDB * If a PDF isn't cached, it will be read from the disk and added to the IndexedDB
*/ */
export async function getPDFFiles(): Promise<IndexedDocument[]> { export async function getPDFFiles(): Promise<IndexedDocument[]> {
const allFiles = app.vault.getFiles().filter(f => f.path.endsWith('.pdf')) const fromDisk = app.vault.getFiles().filter(f => f.path.endsWith('.pdf'))
const data: IndexedDocument[] = [] const fromDb = await database.pdf.toArray()
const data: IndexedDocument[] = []
const input = [] const input = []
for (const file of allFiles) { for (const file of fromDisk) {
input.push( input.push(
NotesIndex.processQueue(async () => { NotesIndex.processQueue(async () => {
const doc = await fileToIndexedDocument(file) const doc = await fileToIndexedDocument(
await cacheManager.updateDocument(file.path, doc) file,
fromDb.find(o => o.path === file.path)?.text
)
await cacheManager.updateLiveDocument(file.path, doc)
data.push(doc) data.push(doc)
}) })
) )
@@ -52,38 +57,45 @@ export async function getPDFFiles(): Promise<IndexedDocument[]> {
* Convert a file into an IndexedDocument. * Convert a file into an IndexedDocument.
* Will use the cache if possible. * Will use the cache if possible.
* @param file * @param file
* @param content If we give a text content, will skip the fetching part
*/ */
export async function fileToIndexedDocument( export async function fileToIndexedDocument(
file: TFile file: TFile,
content?: string
): Promise<IndexedDocument> { ): Promise<IndexedDocument> {
let content: string if (!content) {
if (isFilePlaintext(file.path)) { if (isFilePlaintext(file.path)) {
content = removeDiacritics(await app.vault.cachedRead(file)) content = await app.vault.cachedRead(file)
} else if (file.path.endsWith('.pdf')) { } else if (file.path.endsWith('.pdf')) {
content = removeDiacritics(await pdfManager.getPdfText(file)) content = await pdfManager.getPdfText(file)
} else { } else {
throw new Error('Invalid file: ' + file.path) throw new Error('Invalid file: ' + file.path)
} }
}
content = removeDiacritics(content) content = removeDiacritics(content)
const metadata = app.metadataCache.getFileCache(file) const metadata = app.metadataCache.getFileCache(file)
// EXCALIDRAW
// Remove the json code
if (metadata?.frontmatter?.['excalidraw-plugin']) {
const comments = metadata.sections?.filter(s => s.type === 'comment') ?? []
for (const { start, end } of comments.map(c => c.position)) {
content = content.substring(0, start.offset-1) + content.substring(end.offset)
}
}
// Look for links that lead to non-existing files, // Look for links that lead to non-existing files,
// and add them to the index. // and add them to the index.
if (metadata) { if (metadata) {
const nonExisting = getNonExistingNotes(file, metadata) const nonExisting = getNonExistingNotes(file, metadata)
for (const name of nonExisting.filter(o => !cacheManager.getDocument(o))) { for (const name of nonExisting.filter(
o => !cacheManager.getLiveDocument(o)
)) {
NotesIndex.addNonExistingToIndex(name, file.path) NotesIndex.addNonExistingToIndex(name, file.path)
} }
// EXCALIDRAW
// Remove the json code
if (metadata.frontmatter?.['excalidraw-plugin']) {
const comments =
metadata.sections?.filter(s => s.type === 'comment') ?? []
for (const { start, end } of comments.map(c => c.position)) {
content =
content.substring(0, start.offset - 1) + content.substring(end.offset)
}
}
} }
return { return {
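
The Excalidraw clean-up above cuts comment sections out of the note content by character offsets. When several ranges are removed from the same string, one way to keep the remaining offsets valid is to cut from the last range to the first. This is a sketch under stated assumptions, not the commit's implementation: `Range` and `removeRanges` are hypothetical, ranges are half-open `[start, end)`, and they must not overlap.

```typescript
// Hypothetical half-open range [start, end) expressed in character offsets.
type Range = { start: number; end: number }

// Remove every range from `text`.
// Ranges are processed from the highest start offset to the lowest, so
// cutting one range never shifts the offsets of the ranges still to process.
function removeRanges(text: string, ranges: Range[]): string {
  const ordered = ranges.slice().sort((a, b) => b.start - a.start)
  let result = text
  for (const { start, end } of ordered) {
    result = result.substring(0, start) + result.substring(end)
  }
  return result
}

const cleaned = removeRanges('aaa[X]bbb[YY]ccc', [
  { start: 3, end: 6 }, // "[X]"
  { start: 9, end: 13 }, // "[YY]"
])
console.log(cleaned) // prints "aaabbbccc"
```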


@@ -1,4 +1,4 @@
import { Notice, Plugin, TFile } from 'obsidian' import { Notice, Platform, Plugin, TFile } from 'obsidian'
import { SearchEngine } from './search/search-engine' import { SearchEngine } from './search/search-engine'
import { import {
OmnisearchInFileModal, OmnisearchInFileModal,
@@ -11,17 +11,17 @@ import api from './tools/api'
import { isFilePlaintext, wait } from './tools/utils' import { isFilePlaintext, wait } from './tools/utils'
import * as NotesIndex from './notes-index' import * as NotesIndex from './notes-index'
import * as FileLoader from './file-loader' import * as FileLoader from './file-loader'
import { OmnisearchCache } from './database'
import { cacheManager } from './cache-manager'
export default class OmnisearchPlugin extends Plugin { export default class OmnisearchPlugin extends Plugin {
private ribbonButton?: HTMLElement private ribbonButton?: HTMLElement
async onload(): Promise<void> { async onload(): Promise<void> {
await cleanOldCacheFiles() await cleanOldCacheFiles()
await OmnisearchCache.clearOldDatabases()
await loadSettings(this) await loadSettings(this)
// Initialize minisearch
await SearchEngine.initFromCache()
_registerAPI(this) _registerAPI(this)
if (settings.ribbonIcon) { if (settings.ribbonIcon) {
@@ -105,37 +105,68 @@ export default class OmnisearchPlugin extends Plugin {
* Read the files and feed them to Minisearch * Read the files and feed them to Minisearch
*/ */
async function populateIndex(): Promise<void> { async function populateIndex(): Promise<void> {
const tmpEngine = SearchEngine.getTmpEngine() console.time('Omnisearch - Indexing duration')
// Initialize minisearch
let engine = SearchEngine.getEngine()
// No cache for iOS
if (!Platform.isIosApp) {
engine = await SearchEngine.initFromCache()
}
// Load plaintext files // Load plaintext files
console.time('Omnisearch - Timing') const plainTextFiles = await FileLoader.getPlainTextFiles()
const files = await FileLoader.getPlainTextFiles() let allFiles = [...plainTextFiles]
// Index them // iOS: since there's no cache, directly index the documents
await tmpEngine.addAllToMinisearch(files) if (Platform.isIosApp) {
console.log(`Omnisearch - Indexed ${files.length} notes`) await wait(1000)
console.timeEnd('Omnisearch - Timing') await engine.addAllToMinisearch(plainTextFiles)
}
// Load normal notes into the main search engine
SearchEngine.loadTmpDataIntoMain()
// Load PDFs // Load PDFs
if (settings.PDFIndexing) { if (settings.PDFIndexing) {
console.time('Omnisearch - Timing')
const pdfs = await FileLoader.getPDFFiles() const pdfs = await FileLoader.getPDFFiles()
// Index them // iOS: since there's no cache, just index the documents
await tmpEngine.addAllToMinisearch(pdfs) if (Platform.isIosApp) {
console.log(`Omnisearch - Indexed ${pdfs.length} PDFs`) await wait(1000)
console.timeEnd('Omnisearch - Timing') await engine.addAllToMinisearch(pdfs)
}
// Load PDFs into the main search engine, and write cache // Add PDFs to the files list
SearchEngine.loadTmpDataIntoMain() allFiles = [...allFiles, ...pdfs]
} }
SearchEngine.isIndexing.set(false) // Other platforms: make a diff of what's to add/update/delete
await tmpEngine.writeToCache() if (!Platform.isIosApp) {
// Check which documents need to be removed/added/updated
const diffDocs = await cacheManager.getDiffDocuments(allFiles)
// Add
await engine.addAllToMinisearch(diffDocs.toAdd)
diffDocs.toAdd.forEach(doc =>
cacheManager.updateLiveDocument(doc.path, doc)
)
// Clear memory // Delete
SearchEngine.clearTmp() diffDocs.toDelete.forEach(d => engine.removeFromMinisearch(d))
diffDocs.toDelete.forEach(doc => cacheManager.deleteLiveDocument(doc.path))
// Update (delete + add)
diffDocs.toUpdate
.map(d => d.old)
.forEach(d => {
engine.removeFromMinisearch(d)
cacheManager.updateLiveDocument(d.path, d)
})
await engine.addAllToMinisearch(diffDocs.toUpdate.map(d => d.new))
}
// Load PDFs into the main search engine, and write cache
// SearchEngine.loadTmpDataIntoMain()
SearchEngine.isIndexing.set(false)
if (!Platform.isIosApp) {
await SearchEngine.getEngine().writeToCache(allFiles)
}
console.timeEnd('Omnisearch - Indexing duration')
} }
async function cleanOldCacheFiles() { async function cleanOldCacheFiles() {
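
The add/update/delete pass in `populateIndex` depends on comparing the files found on disk with the documents stored in the cache. A self-contained sketch of that comparison follows. `Doc`, `Diff` and `diffDocuments` are hypothetical names, and the sketch uses a `Map` lookup instead of the nested `find` calls in `getDiffDocuments`, which is an assumption about what could scale better on large vaults rather than something the commit does.

```typescript
// Hypothetical, trimmed-down document: only the fields the diff needs.
type Doc = { path: string; mtime: number }
type Diff = {
  toAdd: Doc[]
  toDelete: Doc[]
  toUpdate: { old: Doc; new: Doc }[]
}

// Compare the documents currently on disk with the cached ones.
function diffDocuments(current: Doc[], cached: Doc[]): Diff {
  const cachedByPath = new Map<string, Doc>()
  for (const c of cached) cachedByPath.set(c.path, c)

  const currentPaths = new Set<string>()
  const diff: Diff = { toAdd: [], toDelete: [], toUpdate: [] }

  for (const doc of current) {
    currentPaths.add(doc.path)
    const old = cachedByPath.get(doc.path)
    if (!old) {
      diff.toAdd.push(doc) // not in the cache yet
    } else if (old.mtime !== doc.mtime) {
      diff.toUpdate.push({ old, new: doc }) // modified since it was cached
    }
  }
  for (const c of cached) {
    if (!currentPaths.has(c.path)) diff.toDelete.push(c) // no longer on disk
  }
  return diff
}

const result = diffDocuments(
  [
    { path: 'a.md', mtime: 1 },
    { path: 'b.md', mtime: 5 },
    { path: 'c.md', mtime: 1 },
  ],
  [
    { path: 'a.md', mtime: 1 },
    { path: 'b.md', mtime: 2 },
    { path: 'z.md', mtime: 1 },
  ]
)
console.log(result)
```

In this example `c.md` is new, `b.md` has a different `mtime`, and `z.md` only exists in the cache, so each list receives exactly one entry.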


@@ -27,19 +27,19 @@ export async function addToIndexAndMemCache(
 // Check if the file was already indexed as non-existent.
 // If so, remove it from the index, and add it again as a real note.
-if (cacheManager.getDocument(file.path)?.doesNotExist) {
+if (cacheManager.getLiveDocument(file.path)?.doesNotExist) {
 removeFromIndex(file.path)
 }
 try {
-if (cacheManager.getDocument(file.path)) {
+if (cacheManager.getLiveDocument(file.path)) {
 throw new Error(`${file.basename} is already indexed`)
 }
 // Make the document and index it
 const note = await fileToIndexedDocument(file)
 SearchEngine.getEngine().addSingleToMinisearch(note)
-await cacheManager.updateDocument(note.path, note)
+await cacheManager.updateLiveDocument(note.path, note)
 } catch (e) {
 // console.trace('Error while indexing ' + file.basename)
 console.error(e)
@@ -55,7 +55,7 @@ export async function addToIndexAndMemCache(
 export function addNonExistingToIndex(name: string, parent: string): void {
 name = removeAnchors(name)
 const filename = name + (name.endsWith('.md') ? '' : '.md')
-if (cacheManager.getDocument(filename)) return
+if (cacheManager.getLiveDocument(filename)) return
 const note: IndexedDocument = {
 path: filename,
@@ -73,7 +73,7 @@ export function addNonExistingToIndex(name: string, parent: string): void {
 parent,
 }
 SearchEngine.getEngine().addSingleToMinisearch(note)
-cacheManager.updateDocument(filename, note)
+cacheManager.updateLiveDocument(filename, note)
 }
 /**
@@ -84,10 +84,10 @@ export function removeFromIndex(path: string): void {
 console.info(`"${path}" is not an indexable file`)
 return
 }
-const note = cacheManager.getDocument(path)
+const note = cacheManager.getLiveDocument(path)
 if (note) {
 SearchEngine.getEngine().removeFromMinisearch(note)
-cacheManager.deleteDocument(path)
+cacheManager.deleteLiveDocument(path)
 // FIXME: only remove non-existing notes if they don't have another parent
 // cacheManager


@@ -76,7 +76,7 @@ class PDFManager {
 // Add it to the cache
 database.pdf
-.add({ hash, text, path: file.path, size: file.stat.size })
+.add({ hash, text, path: file.path })
 .then(() => {
 resolve(text)
 })
@@ -84,7 +84,7 @@ class PDFManager {
 // In case of error (unreadable PDF or timeout) just add
 // an empty string to the cache
 database.pdf
-.add({ hash, text: '', path: file.path, size: file.stat.size })
+.add({ hash, text: '', path: file.path })
 .then(() => {
 resolve('')
 })


@@ -1,8 +1,4 @@
import MiniSearch, { import MiniSearch, { type Options, type SearchResult } from 'minisearch'
type AsPlainObject,
type Options,
type SearchResult,
} from 'minisearch'
import { import {
chsRegex, chsRegex,
type IndexedDocument, type IndexedDocument,
@@ -19,6 +15,7 @@ import type { Query } from './query'
import { settings } from '../settings' import { settings } from '../settings'
import { cacheManager } from '../cache-manager' import { cacheManager } from '../cache-manager'
import { writable } from 'svelte/store' import { writable } from 'svelte/store'
import { Notice } from 'obsidian'
const tokenize = (text: string): string[] => { const tokenize = (text: string): string[] => {
const tokens = text.split(SPACE_OR_PUNCTUATION) const tokens = text.split(SPACE_OR_PUNCTUATION)
@@ -45,11 +42,15 @@ export const minisearchOptions: Options<IndexedDocument> = {
'headings3', 'headings3',
], ],
storeFields: ['tags'], storeFields: ['tags'],
callbackWhenDesync() {
new Notice(
'Omnisearch - Your index cache may be incorrect or corrupted. If this message keeps appearing, go to Settings to clear the cache.'
)
},
} }
export class SearchEngine { export class SearchEngine {
private static engine?: SearchEngine private static engine?: SearchEngine
private static tmpEngine?: SearchEngine
public static isIndexing = writable(true) public static isIndexing = writable(true)
/** /**
@@ -63,41 +64,23 @@ export class SearchEngine {
return this.engine return this.engine
} }
/**
* The secondary instance. This one is indexed in the background,
* while the main instance is quickly filled with cache data
*/
public static getTmpEngine(): SearchEngine {
if (!this.tmpEngine) {
this.tmpEngine = new SearchEngine()
}
return this.tmpEngine
}
/** /**
* Instantiates the main instance with cache data (if it exists) * Instantiates the main instance with cache data (if it exists)
*/ */
public static async initFromCache(): Promise<void> { public static async initFromCache(): Promise<SearchEngine> {
try { try {
const cache = await cacheManager.getMinisearchCache() const cache = await cacheManager.getMinisearchCache()
if (cache) { if (cache) {
this.getEngine().minisearch = cache this.getEngine().minisearch = cache
} }
} catch (e) { } catch (e) {
new Notice(
'Omnisearch - Cache missing or invalid. Some freezes may occur while Omnisearch indexes your vault.'
)
console.error('Omnisearch - Could not init engine from cache')
console.error(e) console.error(e)
} }
} return this.getEngine()
/**
* Loads the freshest indexed data into the main instance.
*/
public static loadTmpDataIntoMain(): void {
const tmpData = this.getTmpEngine().minisearch.toJSON()
this.getEngine().minisearch = MiniSearch.loadJS(tmpData, minisearchOptions)
}
public static clearTmp(): void {
this.getTmpEngine().minisearch = new MiniSearch(minisearchOptions)
} }
private minisearch: MiniSearch private minisearch: MiniSearch
@@ -147,9 +130,10 @@ export class SearchEngine {
const exactTerms = query.getExactTerms() const exactTerms = query.getExactTerms()
if (exactTerms.length) { if (exactTerms.length) {
results = results.filter(r => { results = results.filter(r => {
const title = cacheManager.getDocument(r.id)?.path.toLowerCase() ?? '' const title =
cacheManager.getLiveDocument(r.id)?.path.toLowerCase() ?? ''
const content = stripMarkdownCharacters( const content = stripMarkdownCharacters(
cacheManager.getDocument(r.id)?.content ?? '' cacheManager.getLiveDocument(r.id)?.content ?? ''
).toLowerCase() ).toLowerCase()
return exactTerms.every(q => content.includes(q) || title.includes(q)) return exactTerms.every(q => content.includes(q) || title.includes(q))
}) })
@@ -160,7 +144,7 @@ export class SearchEngine {
if (exclusions.length) { if (exclusions.length) {
results = results.filter(r => { results = results.filter(r => {
const content = stripMarkdownCharacters( const content = stripMarkdownCharacters(
cacheManager.getDocument(r.id)?.content ?? '' cacheManager.getLiveDocument(r.id)?.content ?? ''
).toLowerCase() ).toLowerCase()
return exclusions.every(q => !content.includes(q.value)) return exclusions.every(q => !content.includes(q.value))
}) })
@@ -240,9 +224,10 @@ export class SearchEngine {
// Map the raw results to get usable suggestions // Map the raw results to get usable suggestions
return results.map(result => { return results.map(result => {
let note = cacheManager.getDocument(result.id) let note = cacheManager.getLiveDocument(result.id)
if (!note) { if (!note) {
// throw new Error(`Omnisearch - Note "${result.id}" not indexed`) // throw new Error(`Omnisearch - Note "${result.id}" not indexed`)
console.warn(`Omnisearch - Note "${result.id}" not in the live cache`)
note = { note = {
content: '', content: '',
basename: result.id, basename: result.id,
@@ -286,8 +271,11 @@ export class SearchEngine {
// #region Read/write minisearch index // #region Read/write minisearch index
public async addAllToMinisearch(documents: IndexedDocument[]): Promise<void> { public async addAllToMinisearch(
await this.minisearch.addAllAsync(documents) documents: IndexedDocument[],
chunkSize = 10
): Promise<void> {
await this.minisearch.addAllAsync(documents, { chunkSize })
} }
public addSingleToMinisearch(document: IndexedDocument): void { public addSingleToMinisearch(document: IndexedDocument): void {
@@ -300,7 +288,7 @@ export class SearchEngine {
// #endregion // #endregion
public async writeToCache(): Promise<void> { public async writeToCache(documents: IndexedDocument[]): Promise<void> {
await cacheManager.writeMinisearchCache(this.minisearch) await cacheManager.writeMinisearchCache(this.minisearch, documents)
} }
} }
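The `chunkSize` parameter added to `addAllToMinisearch` above is forwarded to MiniSearch's `addAllAsync`. As a rough illustration of why chunking helps during indexing, here is a minimal sketch of the general technique: process a fixed number of items, then yield to the event loop before continuing. `processInChunks` and its parameters are hypothetical names for this sketch; they are not part of Omnisearch or MiniSearch, and this is not MiniSearch's actual implementation.

```typescript
// Hypothetical helper: not part of Omnisearch or MiniSearch.
// Processes items in fixed-size chunks and yields to the event loop
// between chunks, so a long run does not block the UI in one stretch.
async function processInChunks<T>(
  items: T[],
  handle: (item: T) => void,
  chunkSize = 10
): Promise<number> {
  if (chunkSize < 1) {
    throw new Error('chunkSize must be at least 1')
  }
  let chunks = 0
  for (let start = 0; start < items.length; start += chunkSize) {
    for (const item of items.slice(start, start + chunkSize)) {
      handle(item)
    }
    chunks++
    // Let pending timers and UI events run before the next chunk
    await new Promise<void>(resolve => setTimeout(resolve, 0))
  }
  // Number of chunks processed (0 for an empty input)
  return chunks
}
```

A smaller `chunkSize` yields more often, which should keep the interface more responsive at the cost of a longer total indexing time; a larger one trades responsiveness for throughput.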

@@ -1,4 +1,5 @@
import { import {
Notice,
Platform, Platform,
Plugin, Plugin,
PluginSettingTab, PluginSettingTab,
@@ -6,6 +7,7 @@ import {
SliderComponent, SliderComponent,
} from 'obsidian' } from 'obsidian'
import { writable } from 'svelte/store' import { writable } from 'svelte/store'
import { database } from './database'
import type OmnisearchPlugin from './main' import type OmnisearchPlugin from './main'
interface WeightingSettings { interface WeightingSettings {
@@ -143,8 +145,7 @@ export class SettingsTab extends PluginSettingTab {
}) })
) )
// PDF Indexing - disabled on iOS // PDF Indexing
if (!Platform.isIosApp) {
const indexPDFsDesc = new DocumentFragment() const indexPDFsDesc = new DocumentFragment()
indexPDFsDesc.createSpan({}, span => { indexPDFsDesc.createSpan({}, span => {
span.innerHTML = `Omnisearch will include PDFs in search results. span.innerHTML = `Omnisearch will include PDFs in search results.
@@ -165,7 +166,7 @@ export class SettingsTab extends PluginSettingTab {
await saveSettings(this.plugin) await saveSettings(this.plugin)
}) })
) )
}
// #endregion Behavior // #endregion Behavior
// #region User Interface // #region User Interface
@@ -276,6 +277,29 @@ export class SettingsTab extends PluginSettingTab {
.addSlider(cb => this.weightSlider(cb, 'weightH3')) .addSlider(cb => this.weightSlider(cb, 'weightH3'))
// #endregion Results Weighting // #endregion Results Weighting
// #region Danger Zone
new Setting(containerEl).setName('Danger Zone').setHeading()
const resetCacheDesc = new DocumentFragment()
resetCacheDesc.createSpan({}, span => {
span.innerHTML = `Erase all Omnisearch cache data.
Use this if Omnisearch results are inconsistent, missing, or appear outdated.<br>
<strong style="color: var(--text-accent)">Needs a restart to fully take effect.</strong>`
})
new Setting(containerEl)
.setName('Clear cache data')
.setDesc(resetCacheDesc)
.addButton(cb => {
cb.setButtonText('Clear cache')
cb.onClick(async () => {
await database.clearCache()
new Notice('Omnisearch - Cache cleared. Please restart Obsidian.')
})
})
// #endregion Danger Zone
} }
weightSlider(cb: SliderComponent, key: keyof WeightingSettings): void { weightSlider(cb: SliderComponent, key: keyof WeightingSettings): void {
@@ -317,8 +341,6 @@ export const DEFAULT_SETTINGS: OmnisearchSettings = {
weightH2: 1.3, weightH2: 1.3,
weightH3: 1.1, weightH3: 1.1,
// persistCache: false,
welcomeMessage: '', welcomeMessage: '',
} as const } as const
@@ -327,11 +349,6 @@ export let settings = Object.assign({}, DEFAULT_SETTINGS) as OmnisearchSettings
export async function loadSettings(plugin: Plugin): Promise<void> { export async function loadSettings(plugin: Plugin): Promise<void> {
settings = Object.assign({}, DEFAULT_SETTINGS, await plugin.loadData()) settings = Object.assign({}, DEFAULT_SETTINGS, await plugin.loadData())
// Make sure that PDF indexing is disabled on iOS
if (Platform.isIosApp) {
settings.PDFIndexing = false
}
showExcerpt.set(settings.showExcerpt) showExcerpt.set(settings.showExcerpt)
} }
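`loadSettings` above layers the persisted data over `DEFAULT_SETTINGS`, and with the iOS override removed, the stored `PDFIndexing` value is no longer forced to `false` afterwards. The merge order can be sketched as follows, using a hypothetical, reduced settings shape (`DemoSettings`, `DEMO_DEFAULTS`, and `mergeSettings` are illustration-only names) and assuming the stored data may be `null` when nothing has been saved yet. The sketch uses object spread, which gives the same precedence as the `Object.assign` call in the diff.

```typescript
// Hypothetical, reduced settings shape for illustration only
interface DemoSettings {
  PDFIndexing: boolean
  weightH3: number
}

const DEMO_DEFAULTS: DemoSettings = { PDFIndexing: false, weightH3: 1.1 }

// Later sources win: stored values override the defaults, and keys
// missing from the stored data fall back to the defaults.
function mergeSettings(stored: Partial<DemoSettings> | null): DemoSettings {
  return { ...DEMO_DEFAULTS, ...(stored || {}) }
}

// Only weightH3 was saved, so PDFIndexing keeps its default
const merged = mergeSettings({ weightH3: 2 })
console.log(merged.PDFIndexing, merged.weightH3)
```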