
Elasticsearch 7.2 throws an error: The length of [content] field of [files:61364] doc of [fulltextsearch] index has exceeded [1000000] #73

Open
e-alfred opened this issue Jun 30, 2019 · 9 comments

e-alfred commented Jun 30, 2019

Elasticsearch 7.2 throws the following error:

# sudo -u www-data php occ fulltextsearch:search user searchterm
search

In Connection.php line 620:
                                                                                                                                                                                                                                       
  {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The length of [content] field of [files:61364] doc of [fulltextsearch] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting. This   
  maximum can be set by changing the [index.highlight.max_analyzed_offset] index level setting. For large texts, indexing with offsets or term vectors is recommended!"}],"type":"search_phase_execution_exception","reason":"all sha  
  rds failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"fulltextsearch","node":"KVV5wbadRSm-Q1VrAZJWwQ","reason":{"type":"illegal_argument_exception","reason":"The length of [content] field of [files:613  
  64] doc of [fulltextsearch] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting. This maximum can be set by changing the [index.highlight.max_analyzed_offset] index level setting. For large texts, ind  
  exing with offsets or term vectors is recommended!"}}],"caused_by":{"type":"illegal_argument_exception","reason":"The length of [content] field of [files:61364] doc of [fulltextsearch] index has exceeded [1000000] - maximum all  
  owed to be analyzed for highlighting. This maximum can be set by changing the [index.highlight.max_analyzed_offset] index level setting. For large texts, indexing with offsets or term vectors is recommended!","caused_by":{"type  
  ":"illegal_argument_exception","reason":"The length of [content] field of [files:61364] doc of [fulltextsearch] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting. This maximum can be set by changing  
   the [index.highlight.max_analyzed_offset] index level setting. For large texts, indexing with offsets or term vectors is recommended!"}}},"status":400}                                                                             
                                                                                                                                                                                                                                    

I could fix this by applying the following curl command:

curl -XPUT "localhost:9200/fulltextsearch/_settings" -H 'Content-Type: application/json' -d' {
    "index" : {
        "highlight.max_analyzed_offset" : 60000000
    }
}
'
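
To confirm that the new value was actually applied, the setting can be read back (a small sketch; adjust host and index name to your setup):

curl "localhost:9200/fulltextsearch/_settings/index.highlight.max_analyzed_offset?pretty"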

gruentee commented Jul 1, 2019

I experienced the same error message and also fixed it by setting highlight.max_analyzed_offset to a higher value.

For large texts, indexing with offsets or term vectors is recommended!

I'm not too familiar with Elasticsearch, but is there a way to change the mapping / index configuration so that term vectors are used? Or are they already used, as the mapping suggests?

@shmakovpn

https://www.elastic.co/guide/en/elasticsearch/reference/current/term-vector.html
I changed every occurrence of 'term_vector' => 'yes' in apps/fulltextsearch_elasticsearch/lib/Service/IndexMappingService.php (and in apps/fulltextsearch_elasticsearch/vendor/elasticsearch/elasticsearch/docs/index-operations.asciidoc) to 'term_vector' => 'with_positions_offsets'.
Then run occ fulltextsearch:reset, followed by occ fulltextsearch:index.
It works!
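
For reference, the resulting field mapping should look roughly like this (a minimal sketch; the index name my_index and the content field are illustrative, the real mapping is generated by IndexMappingService.php). Since term_vector cannot be changed on an existing field, the index has to be reset and rebuilt as described above:

curl -XPUT "localhost:9200/my_index" -H 'Content-Type: application/json' -d' {
    "mappings" : {
        "properties" : {
            "content" : {
                "type" : "text",
                "term_vector" : "with_positions_offsets"
            }
        }
    }
}
'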


prolibre commented Nov 27, 2019

I have the same problem. I index a lot of PDFs/images.
Indexing works, but I get the same error when I do a search.
I tried the solution of @shmakovpn, but it doesn't work for me: I still get the same error.
On the other hand, if I modify highlight.max_analyzed_offset, it works.

Anthony

@zjxlinux

@prolibre you can disable highlighting completely to get rid of this error message by disabling the advanced setting doc_table:highlight (in Kibana).
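
If you prefer to change it outside the UI, Kibana advanced settings can also be set through its settings API (a sketch, assuming Kibana listens on localhost:5601; the kbn-xsrf header is required and the endpoint may differ between Kibana versions):

curl -XPOST "localhost:5601/api/kibana/settings" -H 'kbn-xsrf: true' -H 'Content-Type: application/json' -d' {
    "changes" : {
        "doc_table:highlight" : false
    }
}
'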


Irillit commented May 6, 2020

As far as I know, Kibana can search several indices that start with a specific prefix.
Can we somehow split the old index and still search it with this app?
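
Elasticsearch does accept wildcard index patterns at search time, so several indices sharing a prefix can be queried in one request (a sketch only; the fulltextsearch_* prefix is illustrative, and the app itself would still need support for writing to split indices):

curl "localhost:9200/fulltextsearch_*/_search?q=searchterm&pretty"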

@balusarakesh

Hi,
I'm also seeing the same issue. The default value for highlight.max_analyzed_offset was 100000; I changed it to 1000000 and I still see the same error.

"Caused by: java.lang.IllegalArgumentException: The length of [message.keyword] field of [Nrc0SnIBqCgzdQbvihai] doc of [filebeat-7.3.1-2020.05.25] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting. This maximum can be set by changing the [index.highlight.max_analyzed_offset] index level setting. For large texts, indexing with offsets or term vectors is recommended!",
{"type": "server", "timestamp": "2020-05-25T15:42:14,930Z", "level": "DEBUG", "component": "o.e.a.s.TransportSearchAction", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-data-2", "message": "[619863] Failed to execute fetch phase", "cluster.uuid": "5AW4UhbMR1u4awVsVXeCRA", "node.id": "YVLJDQL-RfKOmhd6gBLgow" , 

Not really sure how to fix this issue.

Thank you


shmakovpn commented May 26, 2020

> I'm also seeing the same issue, the default value for highlight.max_analyzed_offset was 100000, I changed it to 1000000 and I still see the same error. […] Not really sure how to fix this issue.

Please check your index settings:
curl http://your.elasticsearch.server:9200/your_index/ 2>/dev/null | perl -MJSON -e 'print(JSON->new->ascii->pretty->encode(decode_json(<>)))' | grep term_vector

The result should be:
"term_vector" : "with_positions_offsets",
"term_vector" : "with_positions_offsets",
"term_vector" : "with_positions_offsets",

Please look at commit 5256e3d, where the default highlight limit problem was solved.
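
An equivalent quick check against the mapping alone (assuming the default index name fulltextsearch):

curl -s "localhost:9200/fulltextsearch/_mapping?pretty" | grep term_vector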

R0Wi pushed a commit to R0Wi/fulltextsearch_elasticsearch that referenced this issue Aug 8, 2020

Zhang21 commented Oct 12, 2020

> Elasticsearch 7.2 throws the following error: […] I could fix this by applying the following curl command:

curl -XPUT "localhost:9200/index-name/_settings" -H 'Content-Type: application/json' -d' {
    "index" : {
        "highlight.max_analyzed_offset" : 6000000
    }
}
'

This can resolve the problem, but modifying this setting is not generally recommended: a very large value can cause memory problems in Kibana and Elasticsearch.

If you go this way, do not set the value too high. The root cause of the problem is that the indexed content is too large.


Ark74 commented Oct 12, 2020

IIRC this was fixed a long time ago.
Or is this a very similar issue?
