Version 2.5 =========== - A bug was causing a lot of data going over the wire each time FSCrawler was running. To fix this issue, we changed the default mapping and we set ``store: true`` on field ``file.filename``. If this field is not stored and ``remove_deleted`` is ``true`` (default), FSCrawler will fail while crawling your documents. You need to create the new mapping accordingly and reindex your existing data either by deleting the old index and running again FSCrawler or by using the `reindex API `__ as follows: :: # Backup old index data POST _reindex { "source": { "index": "job_name" }, "dest": { "index": "job_name_backup" } } # Remove job_name index DELETE job_name Restart FSCrawler with the following command. It will just create the right mapping again. .. code:: sh $ bin/fscrawler job_name --loop 0 Then restore old data: :: POST _reindex { "source": { "index": "job_name_backup" }, "dest": { "index": "job_name" } } # Remove backup index DELETE job_name_backup The default mapping changed for FSCrawler for ``meta.raw.*`` fields. Might be better to reindex your data. - The ``excludes`` parameter is also used for directory names. But this new implementation also brings a breaking change if you were using ``excludes`` previously. In the previous implementation, the regular expression was only applied to the filename. It's now applied to the full virtual path name. For example if you have a ``/tmp`` dir as follows: .. code:: /tmp └── folder ├── foo.txt └── bar.txt Previously excluding ``foo.txt`` was excluding the virtual file ``/folder/foo.txt``. If you still want to exclude any file named ``foo.txt`` whatever its directory you now need to specify ``*/foo.txt``: .. code:: json { "name" : "test", "fs": { "excludes": [ "*/foo.txt" ] } } For more information, read :ref:`includes_excludes`. - For new indices, FSCrawler now uses ``_doc`` as the default type name for clusters running elasticsearch 6.x or superior.