Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: ioutil.ReadFile() returns a truncated buffer on Windows 10 #26923

Closed
AWoloszyn opened this issue Aug 10, 2018 · 8 comments
Closed

os: ioutil.ReadFile() returns a truncated buffer on Windows 10 #26923

AWoloszyn opened this issue Aug 10, 2018 · 8 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Milestone

Comments

@AWoloszyn
Copy link

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.10.3

Does this issue reproduce with the latest release?

This is the latest binary release I can find on windows.

What operating system and processor architecture are you using (go env)?

Windows 10, amd64

What did you do?

I called ioutil.ReadFile() on a file that is > 4GB (4470475739 bytes).

bytes, err := ioutil.ReadFile("C:\\Path\\To\\Big\\File.txt")
if err != nil {
   panic(err)
}
fmt.Printf("%v", len(bytes))

What did you expect to see?

I expected to see an output of 4470475739

What did you see instead?

I got an output of 175508955

@ianlancetaylor ianlancetaylor changed the title ioutil.ReadFile() returns a truncated buffer on Windows 10 os: ioutil.ReadFile() returns a truncated buffer on Windows 10 Aug 10, 2018
@ianlancetaylor ianlancetaylor added OS-Windows NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 10, 2018
@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Aug 10, 2018
@ianlancetaylor
Copy link
Member

Some Windows person will have to look at this, but I'll note that the fix may be to copy the use of maxRW from internal/poll/fd_unix.go to internal/poll/fd_windows.go.

@odeke-em
Copy link
Member

/cc @alexbrainman

@alexbrainman
Copy link
Member

@AWoloszyn,

Sure. I can reproduce this. Thank you.

I'll note that the fix may be to copy the use of maxRW from internal/poll/fd_unix.go to internal/poll/fd_windows.go.

What maxRW value do you suggest, @ianlancetaylor ? This particular issue calls Windows ReadFile API https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-readfile , and the API uses DWORD (uint32), to pass file size. Also do you know if we have a test for this? Should I add new test?

Alex

@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Aug 12, 2018
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 12, 2018
@alexbrainman
Copy link
Member

What maxRW value do you suggest, @ianlancetaylor ? This particular issue calls Windows ReadFile API https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-readfile , and the API uses DWORD (uint32), to pass file size. Also do you know if we have a test for this? Should I add new test?

Never mind all my questions. I made decisions myself. And I am not adding test, because playing with 4GB file takes too long on my computer.

Alex

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/129137 mentions this issue: internal/poll: cap reads and writes to 1GB on windows

@silbinarywolf
Copy link
Contributor

I've spent a little bit of time investigating and testing code to fix this. I'm not sure on what approach should be taken for testing this as writing out such a large file and then reading it back in (as per my rough test code below) would be super slow!

I'm more than willing to put together some PR(s) together if this looks OK!

Ideas:

  • ReadFile / WriteFile 4GB error catching (see code below)
    • We could add a check insyscall\zsyscall_windows.go ReadFile/WriteFile so its an invalid argument error (or similar) if you try to read/write more than 4GB.
  • Apply maxRW logic from fd_unix
    • As suggested by @ianlancetaylor we can simply take the maxRW logic from internal\poll\fd_unix.go. I've put together an example of what that might look like below. I decided to only make that logic apply for file reading/writing cases and not for console printing or socket operations as I'm not sure if you would want that behaviour for those so I'd rather leave it for the time being.
    • Not sure how you would test this in a way that is adequately fast in execution time.

ReadFile / WriteFile 4GB error catching

// NOTE: ReadFile looks the same at the start of the function
func WriteFile(handle Handle, buf []byte, done *uint32, overlapped *Overlapped) (err error) {
	var _p0 *byte
	if len(buf) > 0 {
		_p0 = &buf[0]
                // NOTE: Don't allow writing buffers larger than 4GB as ReadFile takes uint32
		if len(buf) > 4<<30 {
			err = EINVAL
			return
		}
	}
	r1, _, e1 := Syscall6(procWriteFile.Addr(), 5, uintptr(handle), uintptr(unsafe.Pointer(_p0)), uintptr(len(buf)), uintptr(unsafe.Pointer(done)), uintptr(unsafe.Pointer(overlapped)), 0)
	if r1 == 0 {
		if e1 != 0 {
			err = errnoErr(e1)
		} else {
			err = EINVAL
		}
	}
	return
}

Apply maxRW logic from fd_unix

internal\poll\fd_windows.go

// Windows can't read or write 4GB+ files at a time with syscall.Read or syscall.Write
// even on 64-bit systems.
// Use 1GB instead of, say, 2GB-1, to keep subsequent reads aligned.
const maxRW = 1 << 30

// Read implements io.Reader.
func (fd *FD) Read(buf []byte) (int, error) {
	if err := fd.readLock(); err != nil {
		return 0, err
	}
	defer fd.readUnlock()

	var n int
	var err error
	if fd.isFile || fd.isDir || fd.isConsole {
		fd.l.Lock()
		defer fd.l.Unlock()
		if fd.isConsole {
			n, err = fd.readConsole(buf)
		} else {
			if len(buf) > maxRW {
				buf = buf[:maxRW]
			}
			n, err = syscall.Read(fd.Sysfd, buf)
		}
		if err != nil {
			n = 0
		}
	} else {
		o := &fd.rop
		o.InitBuf(buf)
		n, err = rsrv.ExecIO(o, func(o *operation) error {
			return syscall.WSARecv(o.fd.Sysfd, &o.buf, 1, &o.qty, &o.flags, &o.o, nil)
		})
		if race.Enabled {
			race.Acquire(unsafe.Pointer(&ioSync))
		}
	}
	if len(buf) != 0 {
		err = fd.eofError(n, err)
	}
	return n, err
}

// Write implements io.Writer.
func (fd *FD) Write(buf []byte) (int, error) {
	if err := fd.writeLock(); err != nil {
		return 0, err
	}
	defer fd.writeUnlock()

	var n int
	var err error
	if fd.isFile || fd.isDir || fd.isConsole {
		fd.l.Lock()
		defer fd.l.Unlock()
		if fd.isConsole {
			n, err = fd.writeConsole(buf)
			if err != nil {
				n = 0
			}
		} else {
			var nn int
			for {
				max := len(buf)
				if max-nn > maxRW {
					max = nn + maxRW
				}
				n, err = syscall.Write(fd.Sysfd, buf[nn:max])
				if n > 0 {
					nn += n
				}
				if nn == len(buf) {
					return nn, err
				}
				if err != nil {
					return nn, err
				}
				if n == 0 {
					return nn, io.ErrUnexpectedEOF
				}
			}
		}
	} else {
		if race.Enabled {
			race.ReleaseMerge(unsafe.Pointer(&ioSync))
		}
		o := &fd.wop
		o.InitBuf(buf)
		n, err = wsrv.ExecIO(o, func(o *operation) error {
			return syscall.WSASend(o.fd.Sysfd, &o.buf, 1, &o.qty, 0, &o.o, nil)
		})
	}
	return n, err
}

To test the maxRW fix above, I used the following code on my Windows 10 machine:

package main

import (
	"fmt"
	"io/ioutil"
)

const (
	slightlyLargerThan4GB = 4470475739
)

func main() {
	// Write large file, add text "end-of-the-file" to the end of it
	text := "end-of-the-file"
	data := make([]byte, slightlyLargerThan4GB-len(text), slightlyLargerThan4GB)
	data = append(data, text...)
	fmt.Printf("Writing %d bytes to \"test_write_4gb\"\n", len(data))
	err := ioutil.WriteFile("test_write_4gb", data, 0644)
	if err != nil {
		fmt.Printf("%s\n", err)
	} else {
		fmt.Printf("Written %d bytes to file.\n", len(data))
	}

	// Read large file, get the last bit of text
	fmt.Printf("Reading file \"test_write_4gb\"...\n")
	bytes, err := ioutil.ReadFile("test_write_4gb")
	if err != nil {
		panic(err)
	}
	endText := string(bytes[len(bytes)-len(text) : len(bytes)])
	if endText == "end-of-the-file" {
		fmt.Printf("SUCCESS! Loaded: %d bytes, End of file has text: %s\n", len(bytes), endText)
	} else {
		fmt.Printf("FAILURE! Loaded: %d bytes, End of file has text: %s\n", len(bytes), endText)
	}
}

@alexbrainman
Copy link
Member

I'm more than willing to put together some PR(s) together if this looks OK!

@silbinarywolf thank you for the offer, but I already sent https://golang.org/cl/129137

Alex

@silbinarywolf
Copy link
Contributor

Ah nice one! Looks more complete as well :)

@golang golang locked and limited conversation to collaborators Sep 9, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Projects
None yet
Development

No branches or pull requests

7 participants