Testing FSCrawler CLI
A convenience shell script is provided to spin up the full EDOT observability stack and run FSCrawler against the built-in test documents in a single command.
Prerequisites
Java 17+
Maven 3.3+
Docker and either docker compose (v2) or docker-compose (v1)
curl, unzip
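Before the first run it can help to confirm that the listed tools are on the PATH. The following is a minimal sketch, not part of the FSCrawler scripts: check_prereqs is a hypothetical helper name, and it only checks that each command exists, not that the Java or Maven version is recent enough.

```shell
# check_prereqs: hypothetical helper (not shipped with FSCrawler).
# Prints "all found" when every named command is on PATH,
# otherwise prints the missing ones and returns 1.
check_prereqs() {
  missing=""
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="$missing $cmd"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all found"
}

# Usage (from any directory):
#   check_prereqs java mvn docker curl unzip
```

Version checks (java -version, mvn -version) still need to be done by hand.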
Quick start
From the project root:
# Full run: build → start docker stack → crawl with OTel tracing
./distribution/test-scripts/test-fscrawler-cli.sh
Note
distribution/test-scripts/test-fscrawler-cli.sh is a generated file — the project
version is injected by Maven resource filtering at build time.
The source template is distribution/src/test/scripts/test-fscrawler-cli.sh.
After a version bump, regenerate it with:
mvn generate-test-resources -pl distribution
The script will:
Build the distribution ZIP (mvn clean package -DskipTests -Ddocker.skip)
Start Elasticsearch + Kibana + EDOT Collector via docker-compose
Unzip the distribution into /tmp/fscrawler-edot-test/
Create a job config pointing to test-documents/src/main/resources/documents/
Set OTEL_* environment variables and launch FSCrawler with the REST API enabled (--rest)
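To give an idea of what the OTEL_* variables in the last step look like, here is a sketch using standard OpenTelemetry SDK environment variables. The exact set and values used by the script are not shown in this page, so everything except the service name fscrawler (which is the name that appears in Kibana APM) is an assumption, in particular the collector endpoint and protocol.

```shell
# Standard OpenTelemetry SDK variables; the values below are assumptions,
# not copied from test-fscrawler-cli.sh.
export OTEL_SERVICE_NAME="fscrawler"                        # service name shown in Kibana APM
export OTEL_TRACES_EXPORTER="otlp"                          # export traces over OTLP
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"          # assumed protocol
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"  # assumed collector address (default OTLP/HTTP port)
```

Check the generated script for the values it actually sets before relying on these.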
FSCrawler runs until you press Ctrl+C. While it is running:
Elasticsearch: http://localhost:9200/test-edot/_search
Kibana APM: http://localhost:5601 → Observability → APM → service fscrawler
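The Elasticsearch URL above can also be queried from a terminal. The sketch below assumes the stack accepts unauthenticated HTTP on localhost:9200; if the compose file enables security, curl needs extra options for credentials. es_url is a hypothetical helper introduced only for this example.

```shell
# es_url: hypothetical helper that builds a request URL for the test-edot index.
ES_HOST="http://localhost:9200"
es_url() {
  printf '%s/test-edot/%s\n' "$ES_HOST" "$1"
}

# Usage while FSCrawler is running:
#   curl -s "$(es_url _count)"                    # number of indexed documents
#   curl -s "$(es_url '_search?size=1&pretty')"   # one sample document, pretty-printed
```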
Options
| Flag | Effect |
|---|---|
| --skip-build | Reuse the existing distribution ZIP (skip Maven build) |
| --skip-docker | Assume the docker-compose stack is already running |
| --no-otel | Disable OTel tracing (…) |
| --log-level=<level> | Set the FSCrawler log level (default: …) |
| | Print usage |
Examples:
# Iterate quickly: keep docker stack running, only rebuild + recrawl
./distribution/test-scripts/test-fscrawler-cli.sh --skip-docker
# Rebuild and crawl, but disable tracing (baseline comparison)
./distribution/test-scripts/test-fscrawler-cli.sh --no-otel
# Fastest iteration: nothing to (re)build, stack already up
./distribution/test-scripts/test-fscrawler-cli.sh --skip-build --skip-docker
# Debug log level with stack already running
./distribution/test-scripts/test-fscrawler-cli.sh --skip-build --skip-docker --log-level=debug
What to look for in Kibana APM
After the crawl, open Kibana at http://localhost:5601. Navigate to Observability → APM → Services → fscrawler.
You should see traces containing the following spans in a waterfall view:
fscrawler.crawl                              ← one span per run
└─ fscrawler.directory.traverse
   └─ fscrawler.directory.process            ← one per directory
      └─ fscrawler.file.index                ← one per file
         └─ fscrawler.tika.extract           ← Tika text extraction
fscrawler.es.bulk                            ← Elasticsearch bulk calls
Auto-instrumented spans (HTTP, ES client, etc.) also appear as children of the
root trace thanks to the elastic-otel-javaagent.
Stopping the stack
When you’re done, stop the docker-compose services:
docker compose -f contrib/docker-compose-example-edot/docker-compose.yml down
To also remove the volumes (this wipes the Elasticsearch data):
docker compose -f contrib/docker-compose-example-edot/docker-compose.yml down -v
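The commands above use the Compose v2 syntax (docker compose). Since the prerequisites also allow docker-compose (v1), the following sketch picks whichever is installed. detect_compose is a hypothetical helper, not something the FSCrawler script exposes.

```shell
# detect_compose: hypothetical helper that prints the available Compose command,
# preferring v2 ("docker compose") over v1 ("docker-compose").
# Returns 1 and prints nothing when neither is installed.
detect_compose() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    return 1
  fi
}

# Usage (from the project root); $compose is left unquoted on purpose
# so that "docker compose" splits into two words:
#   compose=$(detect_compose) && $compose -f contrib/docker-compose-example-edot/docker-compose.yml down
```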