Logstash lost data during log rotate #214

Closed
Tsukiand opened this issue Sep 17, 2018 · 4 comments

Tsukiand commented Sep 17, 2018

I am using logstash-input-file (4.1.4) to ingest from a file, and I found data loss during log rotation.
logrotate is set to keep 3 rotated files, and a rotation happens once the file grows over 1k.

My configuration of log rotate:
{
    missingok
    size 1k
    notifempty
    sharedscripts
    rotate 3
}

My script to generate log and rotate:
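# Append 100000 numbered lines to the watched file.
for (( i=1 ; i <= 100000; i++ ))
do
    echo "$i this is a bunch of test data blah blah" >> /tmp/log/test

    # Pause for a second after every 1000 lines.
    if ! ((i % 1000)); then
        sleep 1
    fi

    # Force a rotation in the background at i = 30000, 60000 and 90000
    # (the condition is false at i = 100000).
    if ! ((i % 30000 || i == 100000)); then
        /usr/sbin/logrotate -f /etc/logrotate.d/test &
    fi
done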

My configuration of logstash:
input {
    file {
        path => "/tmp/log/test*"
    }
}

output {
    file {
        path => "/tmp/output.txt"
        codec => line { format => "custom format: %{message}" }
    }
}

The data loss happens as follows:
I found that the creation of the new "log" file causes the data loss. I checked the source code and found that the new "log" file loses some logs at its beginning (the seek operation in create_initial.rb causes this). This means that logs written to the "log" file during the file rotation are lost.
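
For anyone who wants to confirm which lines go missing, here is a minimal Ruby sketch. It assumes the output lines look like "custom format: <number> ..." as produced by the configuration above. The helper names (missing_sequence_numbers, to_ranges) are made up for this illustration and are not part of Logstash.

```ruby
# Hypothetical helpers (not part of Logstash): report which sequence numbers
# written by the test script never reached the Logstash output file.
OUTPUT_LINE = /\Acustom format: (\d+) /

# lines:    any enumerable of output lines (order does not matter)
# expected: the range of sequence numbers the test script wrote
def missing_sequence_numbers(lines, expected)
  seen = {}
  lines.each do |line|
    match = OUTPUT_LINE.match(line)
    seen[match[1].to_i] = true if match
  end
  expected.reject { |n| seen.key?(n) }
end

# Collapse a sorted list such as [3, 4, 5, 9] into ranges [3..5, 9..9].
def to_ranges(sorted_numbers)
  sorted_numbers.slice_when { |a, b| b != a + 1 }.map { |run| run.first..run.last }
end

# Example usage against the real output file from the config above:
#   lines = File.foreach("/tmp/output.txt")
#   p to_ranges(missing_sequence_numbers(lines, 1..100_000))
```

If the loss really comes from the start of each newly created file, the missing ranges should begin right after the points where the script forces a rotation.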

Please give me some advice on this issue.

Thanks,
Tsukiand

@Tsukiand Tsukiand reopened this Sep 17, 2018
@lrbsunday

I got the same issue, any ideas?

Tsukiand commented Sep 26, 2018

@lrbsunday I have tested with the input path set to "/tmp/log/test" and to "/tmp/log/test*". Data is lost in both cases.
When I use "/tmp/log/test", filewatch only monitors the "test" file, and Logstash loses data during file rotation.
When I use "/tmp/log/test*", filewatch monitors "test", "test.1" and "test.2". The rotated files themselves are read without loss, but data is still lost in the newly created "test" file. I will explain the data loss:

  1. We have test, test.1, test.2 and test.3.
  2. File rotation happens.
    2.1 File rotation 1: test.2 becomes test.3 (the new test.3 rotates from the old test.2, no data loss)
    2.2 File rotation 2: test.1 becomes test.2 (the new test.2 rotates from the old test.1, no data loss)
    2.3 File rotation 3: test becomes test.1 (the new test.1 rotates from the old test, no data loss)
    2.4 File rotation 4: a new test is generated (because the old test became test.1, the watched_file changed, so the new test is rotated as an initial file and the reader seeks to its current size. This seek operation results in data loss)
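
The effect of step 2.4 can be reproduced outside Logstash. The following is only a sketch of the mechanism and not the filewatch code: read_from and simulate_rotation are names invented for this example, and the "watcher" is reduced to reading a file from a given offset.

```ruby
require "tmpdir"

# Read everything from byte `offset` to EOF and return the lines.
def read_from(path, offset)
  File.open(path, "r") do |f|
    f.seek(offset)
    f.read.lines.map(&:chomp)
  end
end

# Hypothetical simulation of one rotation, comparing two starting offsets
# for the newly created file.
def simulate_rotation
  Dir.mktmpdir do |dir|
    live = File.join(dir, "test")

    # Before rotation: the watcher has consumed lines 1 and 2.
    File.write(live, "1\n2\n")
    consumed = read_from(live, 0)

    # Rotation: test -> test.1, then the writer recreates "test" and appends
    # two lines before the watcher looks at the new file.
    File.rename(live, File.join(dir, "test.1"))
    File.write(live, "3\n4\n")

    {
      consumed_before_rotation: consumed,
      # New file treated as an initial file: seek to its current size.
      seek_to_end: read_from(live, File.size(live)),
      # New file read from offset 0 instead.
      from_start: read_from(live, 0)
    }
  end
end

result = simulate_rotation
puts "seek to end read:   #{result[:seek_to_end].inspect}"
puts "start at 0 read:    #{result[:from_start].inspect}"
```

Seeking to the current size skips everything that was written to the new file before it was discovered, while reading from offset 0 picks those lines up.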

I have added a flag (:rotate_flag) in "watched_file.rb" to avoid the data loss, but I am not sure whether my change will cause other issues. Maybe you can give me some advice.

attr_reader :bytes_read, :state, :file, :buffer, :recent_states, :bytes_unread, :rotate_flag
attr_reader :path, :accessed_at, :modified_at, :pathname, :filename
attr_reader :listener, :read_loop_count, :read_chunk_size, :stat
attr_reader :loop_count_type, :loop_count_mode
attr_accessor :last_open_warning_at

def initialize(pathname, stat, settings)
  @settings = settings
  @pathname = Pathname.new(pathname)
  @path = @pathname.to_path
  @filename = @pathname.basename.to_s
  full_state_reset(stat)
  watch
  set_standard_read_loop
  set_accessed_at
  # new: becomes true once another watched file has taken over this one's state
  @rotate_flag = false
end

# new: has this path just been vacated by a rotation?
def flag?
  @rotate_flag
end

# new: mark this path as vacated by a rotation
def set_flag
  @rotate_flag = true
end

def rotate_from(other)
  # move all state from other to this one
  set_standard_read_loop
  file_close
  @bytes_read = other.bytes_read
  @bytes_unread = other.bytes_unread
  @listener = nil
  @initial = false
  @recent_states = other.recent_states
  @accessed_at = other.accessed_at
  if !other.delayed_delete?
    # we don't know if a file exists at the other.path yet
    # so no reset
    other.full_state_reset
    # new: remember that other's content was rotated away
    other.set_flag
  end
  set_stat PathStatClass.new(pathname)
  ignore
end

def rotate_as_initial_file
  # rotation, when no sincedb record exists for new inode - we have never seen this inode before.
  rotate_as_file
  # changed: only treat the file as initial when its path was not just
  # vacated by a rotation (previously this was an unconditional @initial = true)
  @initial = true unless flag?
end

@guyboertje
Contributor

The temporary workaround is to use start_position => "beginning", as this forces processing to start at the beginning of the latest file.
However, it is a bug that this should be necessary. The docs say "If you have old data you want to import, set this to 'beginning'", but clearly this is not old data.

Tsukiand commented Oct 8, 2018

@guyboertje

Thanks for your reply. I have tested with start_position => "beginning" and it works. But as you said, it is a temporary workaround. Maybe we need a fix.

guyboertje pushed a commit to guyboertje/logstash-input-file that referenced this issue Oct 25, 2018
guyboertje pushed a commit that referenced this issue Oct 29, 2018
#217)

* Force all files under rotation to start at 0 or at the sincedb record.
* Update travis.yml to update versions.

Fixes #214
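
Read together with the discussion above, the first bullet of that commit message suggests a rule along the following lines. This is only a sketch under assumptions and not the code from the commit: start_offset and its parameters are hypothetical, and sincedb is modelled as a plain Hash from inode to byte offset.

```ruby
# Hypothetical sketch, not the plugin's code: choose the byte offset at which
# to start reading a file.
#
# inode:          identifier of the file on disk
# file_size:      current size of the file in bytes
# sincedb:        Hash of inode => last recorded read offset (assumed model)
# under_rotation: true when the file appeared as part of a rotation
# start_position: "beginning" or "end", as in the file input setting
def start_offset(inode:, file_size:, sincedb:, under_rotation:, start_position: "end")
  # A sincedb record always wins: resume where reading stopped.
  return sincedb[inode] if sincedb.key?(inode)
  # A file under rotation with no record is new data, so read all of it.
  return 0 if under_rotation
  # Otherwise this is a first discovery and start_position applies.
  start_position == "beginning" ? 0 : file_size
end

# A freshly created "test" after rotation, no sincedb record yet:
puts start_offset(inode: 42, file_size: 120, sincedb: {}, under_rotation: true)
```

Under this rule the start_position => "beginning" workaround is only needed for files that were never part of a rotation.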