I'm trying to index files using fscrawler 6-2.6.
It works fine with local folders.
When I try to crawl a remote server over SSH, it doesn't work.
With the debug option enabled, it looks as if documents are treated as folders.
Debug log:

```
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/...path.../folder, /...path.../folder) = /
DEBUG [f.p.e.c.f.FsParserAbstract] Indexing contenu_folder/_doc/6094a0b4ed1330ad3f73d69ef1d3f97?pipeline=null
DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/...path.../folder] content
DEBUG [f.p.e.c.f.c.FileAbstractor] Listing local files from /...path.../folder
DEBUG [f.p.e.c.f.c.FileAbstractor] 173 local files found
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/...path.../folder, /...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf) = /5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/5e8c8404515d64513b5d2ce56ae3d9ec.pdf], includes = [[*.pdf]], excludes = [null]
DEBUG [f.p.e.c.f.FsParserAbstract] [/5e8c8404515d64513b5d2ce56ae3d9ec.pdf] can be indexed: [true]
DEBUG [f.p.e.c.f.FsParserAbstract] - folder: 5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/...path.../folder, /...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf) = /5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.FsParserAbstract] Indexing contenu_folder/_doc/3af8b5835d2fe7605cc9fbca1bfe519c?pipeline=null
DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf] content
DEBUG [f.p.e.c.f.c.FileAbstractor] Listing local files from /...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.c.FileAbstractor] 1 local files found
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/...path.../folder, /...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf) = /5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] directory = [true], filename = [/5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf], includes = [[*.pdf]], excludes = [null]
DEBUG [f.p.e.c.f.FsParserAbstract] [/5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf] can be indexed: [true]
DEBUG [f.p.e.c.f.FsParserAbstract] - folder: 5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.f.FsCrawlerUtil] computeVirtualPathName(/...path.../folder, /...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf) = /5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf
DEBUG [f.p.e.c.f.FsParserAbstract] Indexing contenu_folder/_doc/bb49c3ae6067d131922716aa534261c?pipeline=null
DEBUG [f.p.e.c.f.FsParserAbstract] indexing [/...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf] content
DEBUG [f.p.e.c.f.c.FileAbstractor] Listing local files from /...path.../folder/5e8c8404515d64513b5d2ce56ae3d9ec.pdf/5e8c8404515d64513b5d2ce56ae3d9ec.pdf
WARN [f.p.e.c.f.FsParserAbstract] Error while crawling /...path.../folder: No such file
WARN [f.p.e.c.f.FsParserAbstract] Full stacktrace
com.jcraft.jsch.SftpException: No such file
at com.jcraft.jsch.ChannelSftp.throwStatusError(ChannelSftp.java:2873) ~[jsch-0.1.54.jar:?]
at com.jcraft.jsch.ChannelSftp._stat(ChannelSftp.java:2225) ~[jsch-0.1.54.jar:?]
at com.jcraft.jsch.ChannelSftp._stat(ChannelSftp.java:2242) ~[jsch-0.1.54.jar:?]
at com.jcraft.jsch.ChannelSftp.ls(ChannelSftp.java:1592) ~[jsch-0.1.54.jar:?]
at com.jcraft.jsch.ChannelSftp.ls(ChannelSftp.java:1553) ~[jsch-0.1.54.jar:?]
at fr.pilato.elasticsearch.crawler.fs.crawler.ssh.FileAbstractorSSH.getFiles(FileAbstractorSSH.java:80) ~[fscrawler-crawler-ssh-2.6.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:241) ~[fscrawler-core-2.6.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:299) ~[fscrawler-core-2.6.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.addFilesRecursively(FsParserAbstract.java:299) ~[fscrawler-core-2.6.jar:?]
at fr.pilato.elasticsearch.crawler.fs.FsParserAbstract.run(FsParserAbstract.java:157) [fscrawler-core-2.6.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
```
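From the log, the crawler lists the PDF as if it were a folder and then calls `ls()` on the PDF path itself, which is what raises the `No such file` error. As a quick way to check what the server actually reports, here is a minimal standalone sketch, assuming the JSch 0.1.54 API shown in the stack trace; it is not FSCrawler's code. The class name `SftpListCheck` is made up, the host and credentials are the placeholders from the settings posted below, and the remote path is left elided as in the report. It lists the crawled folder once and prints whether each entry's SFTP attributes mark it as a directory; a regular `*.pdf` entry should print `isDir=false`.

```java
import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

import java.util.Vector;

// Standalone sketch, not FSCrawler code: list the crawled folder over SFTP
// and print whether each entry's attributes report it as a directory.
public class SftpListCheck {
    public static void main(String[] args) throws Exception {
        // Placeholders taken from the job settings posted below.
        String host = "remoteip";
        int port = 22;
        String user = "account";
        String pass = "password";
        // The crawled folder (path elided as in the report).
        String remoteDir = "/...path.../folder";

        JSch jsch = new JSch();
        Session session = jsch.getSession(user, host, port);
        session.setPassword(pass);
        session.setConfig("StrictHostKeyChecking", "no"); // for this test only
        session.connect();

        ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
        sftp.connect();

        // Same call as in the stack trace (ChannelSftp.ls).
        @SuppressWarnings("unchecked")
        Vector<ChannelSftp.LsEntry> entries = sftp.ls(remoteDir);
        for (ChannelSftp.LsEntry entry : entries) {
            // A regular *.pdf entry should report isDir=false; treating it as
            // a folder and calling ls() on it again is what produces the
            // "No such file" warning above.
            System.out.printf("%s -> isDir=%b%n",
                    entry.getFilename(), entry.getAttrs().isDir());
        }

        sftp.disconnect();
        session.disconnect();
    }
}
```

If every entry prints `isDir=false` as expected, the problem is likely on the FSCrawler side (how the SSH abstractor decides folder vs. file), not on the server.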
Thanks for testing it. I must confess that I don't test the SSH mode very often. I believe I haven't tested it for a year... That's probably why it's buggy. 🐛
Settings:
```json
{
  "name" : "contenu",
  "fs" : {
    "url" : "/..path.../folder",
    "update_rate" : "1m",
    "includes" : [ "*.pdf" ],
    "json_support" : false,
    "filename_as_id" : false,
    "add_filesize" : true,
    "remove_deleted" : true,
    "add_as_inner_object" : false,
    "store_source" : true,
    "index_content" : true,
    "indexed_chars" : "100%",
    "attributes_support" : false,
    "raw_metadata" : true,
    "xml_support" : false,
    "index_folders" : true,
    "ignore_above" : "5mb",
    "lang_detect" : false,
    "continue_on_error" : false,
    "pdf_ocr" : false,
    "ocr" : {
      "language" : "eng+fra"
    }
  },
  "server" : {
    "hostname" : "remoteip",
    "port" : "22",
    "username" : "account",
    "password" : "password",
    "protocol" : "ssh"
  },
  "elasticsearch" : {
    "nodes" : [ {
      "url" : "http://127.0.0.1:9200"
    } ],
    "bulk_size" : 100,
    "flush_interval" : "5s",
    "byte_size" : "10mb"
  }
}
```