Comment

Chris Warrick

Instead of deleting the index and breaking search for 3.5 minutes every day, couldn’t you just store the last index date with each document, and use `delete_by_query` to delete documents not updated in the latest run? Alternatively, you could put that date in the index names (mdn_YYYYMMDD, for example) and use index aliases (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html) to point clients to the current index. Both are simple solutions that don’t cost much resource-wise, and having slightly stale data (how often are pages actually deleted?) for a few minutes is better than having no data at all.
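To make the first suggestion concrete, here is a minimal sketch of the request body such a cleanup could send to Elasticsearch's `_delete_by_query` endpoint. It assumes every document gets stamped with a date field during indexing; the field name `indexed_at` and the helper `stale_documents_query` are hypothetical, not anything taken from the MDN code.

```python
from datetime import datetime, timezone


def stale_documents_query(run_started: datetime, field: str = "indexed_at") -> dict:
    """Build a body for POST /<index>/_delete_by_query.

    Matches every document whose timestamp field is older than the start
    of the latest indexing run, i.e. documents that run did not touch.
    The field name is an assumption made for this sketch.
    """
    return {"query": {"range": {field: {"lt": run_started.isoformat()}}}}


# Record the start time before re-indexing, then send this body afterwards.
run_started = datetime(2021, 3, 1, 4, 0, tzinfo=timezone.utc)
body = stale_documents_query(run_started)
```

Since documents are updated in place and only the leftovers are removed at the end, search would keep working for the whole run.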

Replies

Peter Bengtsson

The idea of using aliases is discussed here: https://github.com/mdn/yari/issues/3098
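For readers who have not used aliases, a rough sketch of what the swap could look like follows. It only builds the body for Elasticsearch's `POST /_aliases` endpoint, which applies all listed actions atomically. The alias name `mdn`, the dated index names, and both helper functions are assumptions for illustration, not what the linked issue settles on.

```python
from datetime import date


def dated_index_name(prefix: str, day: date) -> str:
    """Name an index after the day it was built, e.g. mdn_20210301."""
    return f"{prefix}_{day:%Y%m%d}"


def alias_swap_actions(alias: str, new_index: str, old_indices: list) -> dict:
    """Build a body for POST /_aliases.

    Detaches the alias from the old indices and attaches it to the new
    one in a single request, so clients never see a missing index.
    """
    actions = [{"remove": {"index": old, "alias": alias}} for old in old_indices]
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


# Build today's index fully first, then repoint the alias and drop the old index.
new_index = dated_index_name("mdn", date(2021, 3, 2))
old_index = dated_index_name("mdn", date(2021, 3, 1))
swap = alias_swap_actions("mdn", new_index, [old_index])
```

Clients would always search against the alias, so the old index keeps serving queries until the new one is complete.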