CLI options

  • --help displays help
  • --silent runs in silent mode. No output is generated.
  • --debug runs in debug mode.
  • --trace runs in trace mode (more verbose than debug).
  • --config_dir defines the directory where jobs are stored, instead of the default ~/.fscrawler.
  • --username defines the username to use when connecting to a secured Elasticsearch cluster. Read Using Credentials (X-Pack).
  • --upgrade runs a reindex operation for indices created with an older version. See Upgrade.
  • --loop x defines the number of runs we want before exiting. See Loop.
  • --restart restarts a job from scratch. See Restart.
  • --rest starts the REST service. See Rest.

Upgrade

--upgrade runs a reindex operation for indices created with an older version, which used multiple types within the same index. See the Upgrade to 2.3 section for details.

Loop

New in version 2.2.

--loop x defines the number of runs we want before exiting:

  • A negative value of x, such as -1 (the default), means an infinite number of runs.
  • 0 means that no crawling job is run (useful in combination with --rest).
  • A positive value of x is the number of runs before FSCrawler stops.

If you want to scan your hard drive only once, run with --loop 1.
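
The rules for the --loop value can be restated as a small shell helper. This is only an illustrative sketch: describe_loop is a hypothetical function, not part of FSCrawler, and it does nothing more than map a value to the behaviour described in this section.

```shell
#!/bin/sh
# Hypothetical helper (not part of FSCrawler): prints how a given --loop
# value is interpreted, following the rules described in this section.
describe_loop() {
  if [ "$1" -lt 0 ]; then
    # Negative values such as -1 (the default) mean an infinite number of runs.
    echo "run forever"
  elif [ "$1" -eq 0 ]; then
    # 0 disables crawling, which is useful together with --rest.
    echo "no crawl"
  else
    # A positive value is the number of runs before exiting.
    echo "stop after $1 run(s)"
  fi
}

describe_loop -1
describe_loop 0
describe_loop 1
```

For a single pass over a job, the corresponding command would be bin/fscrawler job_name --loop 1, where job_name is a placeholder for your own job name.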

Restart

New in version 2.2.

You can tell FSCrawler to restart from the beginning by using the --restart option:

bin/fscrawler job_name --restart

In that case, the {job_name}/_status.json file will be removed.
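
To illustrate the effect, the sketch below removes a job's status file by hand. It is not FSCrawler's implementation: reset_job_status is a hypothetical helper, and it assumes the status file sits at {config_dir}/{job_name}/_status.json, with the config directory defaulting to ~/.fscrawler. In normal use, prefer the --restart option.

```shell
#!/bin/sh
# Hypothetical helper (not part of FSCrawler): deletes the status file of a
# job so that the next run starts from scratch.
# Assumption: the file lives at {config_dir}/{job_name}/_status.json.
reset_job_status() {
  job_name="$1"
  # Fall back to the default config directory when none is given.
  config_dir="${2:-$HOME/.fscrawler}"
  # -f keeps the command quiet if the job has never run and no file exists.
  rm -f "$config_dir/$job_name/_status.json"
}

# Example (commented out so that nothing is deleted by accident):
# reset_job_status job_name
# reset_job_status job_name /path/to/config_dir
```

Only the status file is removed; the rest of the job directory is left untouched.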

Rest

New in version 2.3.

If you want to run the REST service without scanning your hard drive, launch with:

bin/fscrawler --rest --loop 0