Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random test failure: TestNeverDelete.TestIndexing #1090

Open
1 task done
paulirwin opened this issue Jan 6, 2025 · 4 comments
Open
1 task done

Random test failure: TestNeverDelete.TestIndexing #1090

paulirwin opened this issue Jan 6, 2025 · 4 comments

Comments

@paulirwin
Copy link
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

Got a random test failure on this PR run, ubuntu-latest, net9.0, x64. Re-running the test (with new random values) succeeded.

Error Message:
   System.ArgumentOutOfRangeException : The capacity may not be smaller than the file size. (Parameter 'capacity')
(Test: Lucene.Net.Index.TestNeverDelete.TestIndexing)

To reproduce this test result:

Option 1:

 Apply the following assembly-level attributes:

[assembly: Lucene.Net.Util.RandomSeed("0x778d6589462b9ad4")]
[assembly: NUnit.Framework.SetCulture("es-BO")]

Option 2:

 Use the following .runsettings file:

<RunSettings>
  <TestRunParameters>
    <Parameter name="tests:seed" value="0x778d6589462b9ad4" />
    <Parameter name="tests:culture" value="es-BO" />
  </TestRunParameters>
</RunSettings>

Option 3:

 Create the following lucene.testsettings.json file somewhere between the test assembly and the root of your drive:

{
  "tests": {
     "seed": "0x778d6589462b9ad4",
     "culture": "es-BO"
  }
}

  Stack Trace:
     at System.IO.MemoryMappedFiles.MemoryMappedFile.CreateCore(SafeFileHandle fileHandle, String mapName, HandleInheritability inheritability, MemoryMappedFileAccess access, MemoryMappedFileOptions options, Int64 capacity, Int64 fileSize)
   at Lucene.Net.Store.MMapDirectory.MMapIndexInput..ctor(MMapDirectory outerInstance, String resourceDescription, FileStream fc) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Store/MMapDirectory.cs:line 240
   at Lucene.Net.Store.MMapDirectory.OpenInput(String name, IOContext context) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Store/MMapDirectory.cs:line 186
   at Lucene.Net.Util.LuceneTestCase.SlowFileExists(Directory dir, String fileName) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net.TestFramework/Util/LuceneTestCase.cs:line 2938
   at Lucene.Net.Store.MockDirectoryWrapper.OpenInput(String name, IOContext context) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net.TestFramework/Store/MockDirectoryWrapper.cs:line 799
   at Lucene.Net.Index.SegmentInfos.Read(Directory directory, String segmentFileName) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Index/SegmentInfos.cs:line 513
   at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Index/SegmentInfos.cs:line 1158
   at Lucene.Net.Index.SegmentInfos.Read(Directory directory) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Index/SegmentInfos.cs:line 625
   at Lucene.Net.Index.StandardDirectoryReader.DoOpenNoWriter(IndexCommit commit) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Index/StandardDirectoryReader.cs:line 380
   at Lucene.Net.Index.DirectoryReader.OpenIfChanged(DirectoryReader oldReader) in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net/Index/DirectoryReader.cs:line 174
   at Lucene.Net.Index.TestNeverDelete.TestIndexing() in /home/runner/work/lucenenet/lucenenet/src/Lucene.Net.Tests/Index/TestNeverDelete.cs:line 88
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)

Expected Behavior

The test passes

Steps To Reproduce

No response

Exceptions (if any)

No response

Lucene.NET Version

No response

.NET Version

9

Operating System

ubuntu-latest

Anything else?

x64

@paulirwin paulirwin added up-for-grabs This issue is open to be worked on by anyone is:bug test-failure pri:normal labels Jan 6, 2025
@paulirwin paulirwin added this to the 4.8.0-beta00018 milestone Jan 6, 2025
@NightOwl888
Copy link
Contributor

NightOwl888 commented Jan 6, 2025

I have a theory. We added a hack to the original calculation because if the file size of the last buffer was 0, it would throw an UnauthorizedAccessException. We changed the offset to subtract 1 in this case.

https://github.com/apache/lucenenet/blob/master/src/Lucene.Net/Store/MMapDirectory.cs#L331-L338

But I am not sure the hack makes sense to begin with.

The original logic was to allocate a zero byte buffer at the end:

https://github.com/apache/lucenenet/blob/master/src/Lucene.Net/Store/MMapDirectory.cs#L307-L308

So, that was intended to be the case, but Microsoft doesn't allow such a thing. However, rather than doing whacky math with the offset, it would probably make more sense to skip the call to the BCL API when the last buffer length is 0, and instead allocate it like:

buffers[bufNr] = ByteBuffer.Allocate(0).AsReadOnlyBuffer()

@paulirwin
Copy link
Contributor Author

I can't easily reproduce this one with that seed, which points to a concurrency bug (or at least one that is not reproducible due to concurrency), and that is indicated by the use of ThreadJobs in the test. But I think your theory is sound here, it lines up with the message and stack trace (assuming aggressive inlining likely from PGO), and the solution makes sense to me. I also reviewed ByteBufferIndexInput just to be on the safe side, and since Remaining would be 0 for an empty byte buffer, it should not cause any problems (and the code above it returns that anyways if length is 0). I'll get a PR for this one.

@paulirwin paulirwin self-assigned this Jan 7, 2025
@paulirwin paulirwin removed the up-for-grabs This issue is open to be worked on by anyone label Jan 7, 2025
@paulirwin
Copy link
Contributor Author

@NightOwl888 Actually, once I started digging into this, I'm no longer sure about that. I think the exception is coming from here:

            if (input.memoryMappedFile is null)
            {
                input.memoryMappedFile = MemoryMappedFile.CreateFromFile(
                    fileStream: fc,
                    mapName: null,
                    capacity: length,
                    access: MemoryMappedFileAccess.Read,
#if FEATURE_MEMORYMAPPEDFILESECURITY
                    memoryMappedFileSecurity: null,
#endif
                    inheritability: HandleInheritability.Inheritable,
                    leaveOpen: true); // LUCENENET: We explicitly dispose the FileStream separately.
            }

This is the place where MemoryMappedFile.CreateCore could be inlined, and it also does the capacity parameter check. (The later call with the "adjust" math ultimately calls CreateViewAccessor which does not have a capacity. I do think we can still do what you suggest, though.) I can't currently explain the failure, because the capacity is just set to the FileStream length, and that is also what that method checks for internally, so they should be equal - that is, unless the FileStream changed length somehow between the MMapIndexInput ctor and the call to CreateFromFile... if that is true, we could pass fc.Length instead of length as the capacity to help cut down on the likelihood of that changing, but probably wouldn't eliminate it. Let me know if I'm missing something on this or if you have any other ideas.

Where it throws the exception in MemoryMappedFile.VerifyMemoryMappedFileAccess:

if (fileSize > capacity)
{
    throw new ArgumentOutOfRangeException(nameof(capacity), SR.ArgumentOutOfRange_CapacityGEFileSizeRequired);
}

https://source.dot.net/#System.IO.MemoryMappedFiles/System/IO/MemoryMappedFiles/MemoryMappedFile.Unix.cs,26

paulirwin added a commit to paulirwin/lucene.net that referenced this issue Jan 7, 2025
@NightOwl888
Copy link
Contributor

The cast here is potentially a problem: https://github.com/apache/lucenenet/blob/master/src/Lucene.Net/Store/MMapDirectory.cs#L329

It can overflow int because the original value is long. We might also want to check the math on nrBuffers to make sure that is right, also. There were significant gaps between Java and .NET here, so we were making it up as we went along.

One thing to note is that length on ByteBuffer is int, but MMapDirectory always uses long, so indexing into it might be overflowing (although, I would think the chunk size would be small enough so that isn't a problem). But a similar overflow problem could be creeping in because of this gap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants