System
ES similarity (Similarity Search): https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html
ES vector search
https://www.elastic.co/blog/text-similarity-search-with-vectors-in-elasticsearch
https://juejin.cn/post/7086775047091666952
https://lucene.apache.org/core/8_0_0/core/org/apache/lucene/search/similarities/Similarity.html
https://github.com/adrg/strutil
Recall
ES term-level fuzzy queries are based on: Levenshtein edit distance (ES string distance)
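To make the edit-distance idea concrete, here is a minimal Go sketch of plain Levenshtein distance using only the standard library. It is an illustration of the textbook algorithm, not the code Elasticsearch or Lucene runs; the function names `Levenshtein` and `min3` are made up for this note.

```go
package main

import "fmt"

// Levenshtein returns the minimum number of single-rune insertions,
// deletions, and substitutions needed to turn a into b.
// Hypothetical helper for illustration only.
func Levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// prev holds the previous row of the DP table, curr the row being filled.
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // turning "" into the first j runes of b takes j insertions
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i // turning the first i runes of a into "" takes i deletions
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

func main() {
	fmt.Println(Levenshtein("kitten", "sitting")) // 3
	fmt.Println(Levenshtein("ab", "ba"))          // 2: a swap costs two edits here
}
```

Working on runes rather than bytes keeps the distance meaningful for non-ASCII terms such as Chinese text.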
ES match queries are scored with: TF/IDF or BM25
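For reference, a rough Go sketch of the textbook BM25 score for a single term in a single document. The formula shape and the common defaults k1 = 1.2, b = 0.75 are standard; the type and function names (`BM25Params`, `BM25Term`) are hypothetical, and Lucene's real implementation may differ in details such as constant factors and length encoding.

```go
package main

import (
	"fmt"
	"math"
)

// BM25Params holds the two free parameters of BM25.
// K1 controls term-frequency saturation, B controls length normalization.
type BM25Params struct {
	K1 float64
	B  float64
}

// BM25Term scores one query term against one document (textbook form).
//   tf        - occurrences of the term in the document
//   docLen    - length of the document in terms
//   avgDocLen - average document length in the index
//   docCount  - total number of documents
//   docFreq   - number of documents containing the term
func BM25Term(p BM25Params, tf, docLen, avgDocLen float64, docCount, docFreq int) float64 {
	if tf <= 0 || avgDocLen <= 0 {
		return 0
	}
	n := float64(docFreq)
	total := float64(docCount)
	// Rare terms get a higher inverse document frequency.
	idf := math.Log(1 + (total-n+0.5)/(n+0.5))
	// Documents longer than average are penalized in proportion to B.
	norm := 1 - p.B + p.B*docLen/avgDocLen
	return idf * tf * (p.K1 + 1) / (tf + p.K1*norm)
}

func main() {
	p := BM25Params{K1: 1.2, B: 0.75}
	// Same term and index statistics, increasing term frequency:
	// the score grows but flattens out.
	for _, tf := range []float64{1, 2, 5, 20} {
		fmt.Printf("tf=%v score=%.4f\n", tf, BM25Term(p, tf, 100, 100, 1000, 10))
	}
}
```

The point of the sketch is the saturation: unlike raw TF/IDF, repeating a term many times adds less and less to the score.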
ES suggesters are based on string_distance (a string distance algorithm): https://www.elastic.co/guide/en/elasticsearch/reference/8.8/search-suggesters.html#_other_term_suggest_options
internal: The default based on damerau_levenshtein but highly optimized for comparing string distance for terms inside the index.
damerau_levenshtein: String distance algorithm based on Damerau-Levenshtein algorithm.
levenshtein: String distance algorithm based on Levenshtein edit distance algorithm.
jaro_winkler: String distance algorithm based on Jaro-Winkler algorithm.
ngram: String distance algorithm based on character n-grams.
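To see how the damerau_levenshtein option differs from levenshtein, here is a small Go sketch of the optimal string alignment (OSA) distance, the restricted form of Damerau-Levenshtein in which swapping two adjacent characters counts as one edit. This is an assumption-laden illustration: the name `OSADistance` is made up, and it is not claimed to match what ES does internally.

```go
package main

import "fmt"

// OSADistance returns the optimal string alignment distance between a and b:
// insertions, deletions, substitutions, and adjacent transpositions each
// cost 1, and no substring is edited more than once.
// Hypothetical helper for illustration only.
func OSADistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// d[i][j] is the distance between the first i runes of a and the first j runes of b.
	d := make([][]int, len(ra)+1)
	for i := range d {
		d[i] = make([]int, len(rb)+1)
		d[i][0] = i
	}
	for j := 0; j <= len(rb); j++ {
		d[0][j] = j
	}
	for i := 1; i <= len(ra); i++ {
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			best := d[i-1][j] + 1 // deletion
			if v := d[i][j-1] + 1; v < best { // insertion
				best = v
			}
			if v := d[i-1][j-1] + cost; v < best { // match or substitution
				best = v
			}
			// Adjacent transposition: "xy" in a lines up with "yx" in b.
			if i > 1 && j > 1 && ra[i-1] == rb[j-2] && ra[i-2] == rb[j-1] {
				if v := d[i-2][j-2] + 1; v < best {
					best = v
				}
			}
			d[i][j] = best
		}
	}
	return d[len(ra)][len(rb)]
}

func main() {
	fmt.Println(OSADistance("ab", "ba"))  // 1 (plain Levenshtein gives 2)
	fmt.Println(OSADistance("ca", "abc")) // 3 (unrestricted Damerau-Levenshtein gives 2)
}
```

For typo correction the transposition rule matters: "teh" is one edit away from "the" under this metric, but two under plain Levenshtein.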
Go string distance library: https://github.com/adrg/strutil