Directory layout

The directory layout of the project is as follows:

.
├── NOTICE
├── LICENSE
├── README.md
├── bin
│   ├── fscrawler
│   └── fscrawler.bat
├── config
│   ├── log4j2.xml
│   └── log4j2-file.xml
├── external
├── lib
└── logs
    ├── documents.log
    └── fscrawler.log

The bin directory contains the scripts to run FSCrawler.

The lib directory contains the FSCrawler jar file and all of its dependencies.

New in version 2.10.

The config directory contains the configuration files. See Configuring the logger.

The external directory contains any external libraries you want to add to FSCrawler. For example, to support JPEG2000 images, download the jai-imageio-jpeg2000 library from Maven Central and put the jai-imageio-jpeg2000-1.4.0.jar file in the external directory.
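The download step could be scripted roughly as follows. This is only a sketch: the helper names maven_central_url and fetch_external_jar are hypothetical, the group id com.github.jai-imageio is an assumption that should be checked on Maven Central, and curl is assumed to be installed.

```shell
# Hypothetical helper: print the Maven Central URL of a jar from its coordinates.
maven_central_url() {
  group_id=$1; artifact_id=$2; version=$3
  # Maven repositories store the group id as a path (dots become slashes).
  group_path=$(printf '%s' "$group_id" | tr '.' '/')
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$artifact_id" "$version" "$artifact_id" "$version"
}

# Hypothetical helper: download a jar into the external directory.
# $1 is the external directory; the remaining arguments are the Maven coordinates.
fetch_external_jar() {
  external_dir=$1; shift
  mkdir -p "$external_dir" &&
    curl -fL -o "$external_dir/$2-$3.jar" "$(maven_central_url "$@")"
}

# Example (assumed group id, run from the FSCrawler home directory):
# fetch_external_jar external com.github.jai-imageio jai-imageio-jpeg2000 1.4.0
```

Nothing is downloaded until fetch_external_jar is called. Uncomment the last line, or run it by hand, once the coordinates have been verified.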

Because this directory is empty by default, you can also mount a local directory in its place when using the Docker image:

docker run -it --rm \
     -v ~/.fscrawler:/root/.fscrawler \
     -v ~/tmp:/tmp/es:ro \
     -v "$PWD/external:/usr/share/fscrawler/external" \
     dadoonet/fscrawler fscrawler job_name

See also Using docker and Using docker compose.

The logs directory contains the log files. See Configuring the logger.