training.shoppinpal.com

6. ElasticSearch

Last updated 7 years ago

Version Tools

  • esvm - Elasticsearch Version Manager is a command-line application for managing different versions of Elasticsearch during development. esvm is to ElasticSearch what nvm is to NodeJS.

Useful plugins

Curated

  • ES Head - the simplest admin console for ES.

  • ES Inquisitor helps you understand:

    • how ES breaks down your text into tokens for storage, and

    • your search into tokens for lookups.

    • Access it at: <proto>//<host>:<port>/_plugin/inquisitor/#/analyzers

  • Sense - An extension for the Chrome browser. Very useful; you can find it in the Chrome Web Store.

Hear-Say

  • ES GUI - An AngularJS client for Elasticsearch, packaged as a plugin.

  • Sensitive - A native version of the Sense plugin for Elasticsearch.

  • ReclineJS - for building good UI on top of CouchDB.

  • Approx - to do approximate or exact distinct counts, and fast term lists.

  • Elastic Facets - A set of facets and related tools for ElasticSearch.

  • Elastic Hammer - Sense would suffice in our opinion. The only additional merit we see is that it renders images inline when presenting search results.

TODO for Authors: Need to create a docker-compose file with an entrypoint script that installs this plugin for readers to play around with the most appropriate version of ES. Plugins usually can't keep up with the lightning fast progress of ES.

Useful tips

Analyzers

  • Gram-based approach:

      • ngrams don’t attempt language heuristics such as stemming, which don’t apply to strings like “cmdrtaco”. They also handle misspelled terms well, since a search only needs a plurality of matches among the sub-parts of a given term.

      • ngrams allow developers to trade storage for CPU

  • Autocomplete and various ways to get there:
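
To make the gram-based idea concrete, here is a minimal Python sketch. It is not how Elasticsearch implements its ngram or edge_ngram analysis; the function names and the gram size of 3 are assumptions chosen for illustration. It shows why a misspelled query still overlaps with the indexed term, and how edge n-grams (prefixes) offer one route to autocomplete.

```python
def ngrams(term, n=3):
    """Return the set of character n-grams of a lowercased term."""
    term = term.lower()
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def edge_ngrams(term, min_gram=1, max_gram=5):
    """Return the prefixes of a term, the building block of search-as-you-type."""
    term = term.lower()
    return [term[:k] for k in range(min_gram, min(max_gram, len(term)) + 1)]


def overlap(query, indexed, n=3):
    """Fraction of the query's n-grams that also occur in the indexed term."""
    q = ngrams(query, n)
    return len(q & ngrams(indexed, n)) / len(q) if q else 0.0
```

For example, overlap("cmdrtako", "cmdrtaco") is well above zero even though the query is misspelled, and edge_ngrams("Cinnamon") contains "cinna", so a partially typed query can match. Storing all these extra grams at index time is the storage-for-CPU trade mentioned above.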

Rebuilding an index

      • cloned https://github.com/karussell/elasticsearch-reindex

      • ran build:

        • mvn -DskipTests clean package

      • uploaded the zip from the "target" directory to hosted ES

      • reindex operation errored out during trial & error

Examples & Exercises

TODO for Authors: Need to create a docker-compose file to setup and play with analyzers quickly.

TODO for Authors: Use sense chrome plugin or CURL to demonstrate.

GET /my_index/_analyze?field=product.image_url&text="t112_1059_Cinnamon - Incense Stick"

GET /_analyze?tokenizer=keyword&filters=lowercase&text="t112_1059_Cinnamon - Incense Stick"

GET /_analyze?token_filters=word_delimiter&text="O’Neil’s hello---there, dude SD500 PowerShot Wi-Fi"

GET /_analyze?tokenizer=standard&text="t112_1059_Cinnamon - Incense Stick"

GET /_analyze?analyzer=simple&text="t112_1059_Cinnamon - Incense Stick"

GET /_analyze?tokenizer=keyword&token_filters=word_delimiter,lowercase&text="t112_1059_Cinnamon - Incense Stick"

GET /my_index/_analyze?field=product.name&text="t112_1059_Cinnamon - Incense Stick"
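
As a rough mental model of what the word_delimiter requests above exercise, the sketch below splits text on non-alphanumeric characters, case changes, and letter/digit boundaries. It is a simplified, ASCII-only approximation written for this page, not Lucene's implementation: it ignores options such as possessive stripping, catenation, and preserving the original token, so treat it as a guess to compare against what _analyze actually returns. The function name is hypothetical.

```python
import re

# An uppercase run not followed by lowercase, a (capitalised) lowercase word, or a digit run.
_PARTS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def rough_word_delimiter(text, lowercase=False):
    """Split text into word and number parts; optionally lowercase them afterwards."""
    parts = _PARTS.findall(text)
    return [p.lower() for p in parts] if lowercase else parts


if __name__ == "__main__":
    print(rough_word_delimiter("SD500 PowerShot Wi-Fi"))
    print(rough_word_delimiter("t112_1059_Cinnamon - Incense Stick", lowercase=True))
```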

POST /my_index/product/_search
{"query":{"bool":{"must":[{"query_string":{"default_field":"_all","query":"cinna"}}]}}}

POST /my_index/product/_search
{"query":{"bool":{"must":[{"query_string":{"default_field":"name","query":"cinnamon"}}]}}}

GET /my_index/_analyze?field=product.barcodes&text="['20015','20016']"

POST /my_index/product/_search
{
   "query": {
      "term": {
         "barcodes": "MANUAL:20015"
      }
   }
}

POST /my_index/product/_search
{
   "query": {
      "multi_match": {
         "query": "20015",
         "fields": [
            "barcodes"
         ]
      }
   }
}

POST /my_index/product/_search
{
   "query": {
      "match_all": {}
   },
   "facets": {
      "department_name": {
         "terms": {
            "field": "barcodes"
         }
      }
   }
}
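
The requests above are written in Sense console syntax. To replay them from a script, one option is Python's standard library, sketched below. The base URL http://localhost:9200 and the helper names are assumptions. The sketch only builds the requests; sending them (for example with urllib.request.urlopen) needs a reachable cluster, and because these examples use an older request style (query-string _analyze parameters, a type in the search path, facets), a newer Elasticsearch release may reject them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local development node


def analyze_request(params, index=None):
    """Build a GET request for /_analyze or /<index>/_analyze with query-string parameters."""
    path = f"/{index}/_analyze" if index else "/_analyze"
    return Request(f"{BASE_URL}{path}?{urlencode(params)}", method="GET")


def search_request(index, doc_type, body):
    """Build a POST request for /<index>/<type>/_search with a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        f"{BASE_URL}/{index}/{doc_type}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = analyze_request({"tokenizer": "keyword", "filters": "lowercase",
                           "text": "t112_1059_Cinnamon - Incense Stick"})
    print(req.get_method(), req.full_url)
```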

It is possible to define new analyzers for an index, but the index must be closed first and reopened after the changes are made.
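
The close, change, reopen sequence can be sketched as an ordered list of requests. This is only an illustration: the helper name, the index name my_index, and the analyzer lowercase_keyword are hypothetical, and the sketch prints the steps rather than sending them.

```python
import json


def add_analyzer_steps(index, analyzer_name, analyzer):
    """Return the (method, path, body) steps for adding an analyzer to an existing index."""
    settings = {"analysis": {"analyzer": {analyzer_name: analyzer}}}
    return [
        ("POST", f"/{index}/_close", None),                    # 1. close the index
        ("PUT", f"/{index}/_settings", json.dumps(settings)),  # 2. change analysis settings
        ("POST", f"/{index}/_open", None),                     # 3. reopen the index
    ]


steps = add_analyzer_steps(
    "my_index",
    "lowercase_keyword",  # hypothetical analyzer name
    {"type": "custom", "tokenizer": "keyword", "filter": ["lowercase"]},
)
for method, path, body in steps:
    print(method, path, body or "")
```

Keeping the order explicit matters because the settings update in the middle is the step that requires the index to be closed.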

Use the term suggester to ask for the closest matches to a search term from the terms you have already indexed. It won't complete your text, but it will return similar-looking terms that are present in ES.
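
The difference between "closest match" and "completion" can be illustrated locally. The sketch below is a rough analogue built on plain edit distance, not the suggester's actual algorithm; the limit of 2 edits, the function names, and the sample terms are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # drop a character from a
                           cur[j - 1] + 1,               # insert a character into a
                           prev[j - 1] + (ca != cb)))    # substitute (free if equal)
        prev = cur
    return prev[-1]


def suggest(text, indexed_terms, max_edits=2):
    """Return indexed terms within max_edits of text, closest first, excluding exact matches."""
    scored = sorted((edit_distance(text, term), term) for term in indexed_terms)
    return [term for distance, term in scored if 0 < distance <= max_edits]


INDEXED_TERMS = ["cinnamon", "incense", "stick"]  # hypothetical indexed terms
```

Here suggest("cinamon", INDEXED_TERMS) finds "cinnamon" because it is one edit away, while suggest("cinna", INDEXED_TERMS) finds nothing: a prefix that needs three more characters is a completion problem, not a near-match.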

Scan / Scroll

You can avoid rebuilding an index due to mapping changes by using aliases or multi-fields mappings.
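
One common way index aliases help is to let the application query an alias while the real index behind it is swapped. The sketch below builds the body for a POST to /_aliases that moves an alias from one index to another in a single call. The alias and index names are hypothetical, and whether this fits depends on how your application addresses its indices.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build the /_aliases body that repoints an alias from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the application always searches "products".
body = alias_swap_actions("products", "products_v1", "products_v2")
print("POST /_aliases")
print(json.dumps(body, indent=2))
```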

Links

  • Tools and plugins: esvm, ES Head, ES Inquisitor, Sense, ES GUI, Sensitive, ReclineJS, Approx, Elastic Facets, Elastic Hammer

  • Topics: define new analyzers, term suggester, Scroll, Word Delimiter, aliases, multi-fields

  • Further reading on text analysis, n-grams, fuzzy search and autocomplete:

    • http://jontai.me/blog/2013/02/adding-autocomplete-to-an-elasticsearch-search-application/

    • http://exploringelasticsearch.com/searching_usernames_and_tokenish_text.html#ch-strangetext

    • https://www.found.no/foundation/fuzzy-search/

    • https://www.found.no/foundation/text-analysis-part-1/#using-ngrams-for-advanced-token-searches

    • https://www.found.no/foundation/text-analysis-part-2/

    • http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html

    • https://www.elastic.co/guide/en/elasticsearch/guide/master/_index_time_search_as_you_type.html

  • Reindex plugin: https://github.com/karussell/elasticsearch-reindex