Version 2.5
===========

-   A bug caused a lot of data to be sent over the wire each time
    FSCrawler ran. To fix this issue, we changed the default
    mapping and set ``store: true`` on the ``file.filename`` field. If
    this field is not stored and ``remove_deleted`` is ``true``
    (default), FSCrawler will fail while crawling your documents. You
    need to create the new mapping accordingly and reindex your existing
    data, either by deleting the old index and running FSCrawler again or
    by using the `reindex
    API <https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html>`__
    as follows:

    ::

       # Backup old index data
       POST _reindex
       {
         "source": {
           "index": "job_name"
         },
         "dest": {
           "index": "job_name_backup"
         }
       }
       # Remove job_name index
       DELETE job_name

    Restart FSCrawler with the following command. It only recreates the
    index with the right mapping.

    .. code:: sh

       $ bin/fscrawler job_name --loop 0

    Then restore old data:

    ::

       POST _reindex
       {
         "source": {
           "index": "job_name_backup"
         },
         "dest": {
           "index": "job_name"
         }
       }
       # Remove backup index
       DELETE job_name_backup

    The default FSCrawler mapping changed for ``meta.raw.*`` fields.
    We recommend reindexing your data.

-   The ``excludes`` parameter is now also applied to directory names. This
    new implementation brings a breaking change if you were already using
    ``excludes``: previously, the regular expression was only applied
    to the filename. It is now applied to the full virtual path name.

    For example, if you have a ``/tmp`` dir as follows:

    .. code::

        /tmp
        └── folder
            ├── foo.txt
            └── bar.txt

    Previously, excluding ``foo.txt`` excluded the virtual file ``/folder/foo.txt``.
    If you still want to exclude any file named ``foo.txt``, whatever its directory,
    you now need to specify ``*/foo.txt``:

    .. code:: json

       {
         "name" : "test",
         "fs": {
           "excludes": [
             "*/foo.txt"
           ]
         }
       }

    For more information, read :ref:`includes_excludes`.

- For new indices, FSCrawler now uses ``_doc`` as the default type name for clusters
  running Elasticsearch 6.x or later.
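
The ``excludes`` change described above can be illustrated with a short Python
sketch. It is not FSCrawler code: it uses ``fnmatchcase`` from the Python standard
library as a stand-in for FSCrawler's own matcher, whose exact rules may differ,
and the helper names ``excluded_by_filename`` and ``excluded_by_virtual_path``
are hypothetical.

.. code:: python

   from fnmatch import fnmatchcase
   import posixpath


   def excluded_by_filename(virtual_path, excludes):
       """Old behaviour (sketch): match each pattern against the file name only."""
       name = posixpath.basename(virtual_path)
       return any(fnmatchcase(name, pattern) for pattern in excludes)


   def excluded_by_virtual_path(virtual_path, excludes):
       """New behaviour (sketch): match each pattern against the full virtual path."""
       return any(fnmatchcase(virtual_path, pattern) for pattern in excludes)


   path = "/folder/foo.txt"
   print(excluded_by_filename(path, ["foo.txt"]))        # True
   print(excluded_by_virtual_path(path, ["foo.txt"]))    # False
   print(excluded_by_virtual_path(path, ["*/foo.txt"]))  # True

Under these assumptions, a bare ``foo.txt`` pattern no longer matches once the
whole virtual path is compared, which is why the ``*/`` prefix is needed.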