Tips and tricks =============== Moving files to a “watched” directory ------------------------------------- When moving an existing file to the directory FSCrawler is watching, you need to explicitly ``touch`` all the files as when moved, the files are keeping their original date intact: .. code:: sh # single file touch file_you_moved # all files find -type f -exec touch {} + # all .txt files find -type f -name "*.txt" -exec touch {} + Or you need to :ref:`restart ` from the beginning with the ``--restart`` option which will reindex everything. Workaround for huge temporary files ----------------------------------- fscrawler uses a media library that currently does not clean up their temporary files. Parsing MP4 files may create very large temporary files in /tmp. The following commands could be useful e.g. as a cronjob to automatically delete those files once they are old and no longer in use. Adapt the commands as needed. .. code:: sh # Check all files in /tmp find /tmp \( -name 'apache-tika-*.tmp-*' -o -name 'MediaDataBox*' \) -type f -mmin +15 ! -exec fuser -s {} \; -delete # When using a systemd service with PrivateTMP enabled find $(find /tmp -maxdepth 1 -type d -name 'systemd-private-*-fscrawler.service-*') \( -name 'apache-tika-*.tmp-*' -o -name 'MediaDataBox*' \) -type f -mmin +15 ! -exec fuser -s {} \; -delete Indexing from HDFS drive ------------------------ There is no specific support for HDFS in FSCrawler. But you can `mount your HDFS on your machine `__ and run FS crawler on this mount point. You can also read details about `HDFS NFS Gateway `__. Using docker ------------ See :ref:`docker`.