[RFC] Autoclosing files #780

waj · 2015-06-10T02:17:34Z

Currently this crashes very quickly in Crystal, but it works on Ruby:

loop do
  File.new(__FILE__)
end

The reason: the files are never closed so it eventually hits the system/user limit. In Ruby it doesn't fail because once that happens it performs a garbage collection and tries one more time before raising an exception (https://github.com/ruby/ruby/blob/trunk/io.c#L5451-L5461). The File#finalizer will close the file for us and that frees the file descriptor resource.

We could do the very same thing in Crystal but I'm not totally convinced by this approach. This could hide potential bugs in any application. In general, I think the File#finalizer should never be the point where the file is actually closed.

I can see the following options:

1. Don't touch anything. The files get automatically closed by the finalizer, but the example still fails because we don't close the files on time.
2. Copy the Ruby behaviour: run the garbage collection so it finalises unused files and retry to open the file.
3. Raise a warning from the finalizer when the file was not manually closed. We cannot raise exceptions here so all we can do is print a message on STDERR.
4. Do both 2 & 3.

Running the GC just in case sounds like overkill, so I'm more inclined towards option 3, but I wanted to hear others' opinions. Maybe I'm missing some other good reason for the Ruby implementation (besides convenience)?
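
To make option 2 concrete, here is a minimal Ruby sketch of the collect-and-retry idea. It is only a library-level illustration, not Ruby's actual io.c logic and not a proposed Crystal patch; the helper name with_gc_retry is hypothetical.

```ruby
# Hypothetical helper (not part of any standard library): run the block,
# and if the process is out of file descriptors, collect garbage once so
# that finalizers of unreachable files can release theirs, then retry.
def with_gc_retry
  yield
rescue Errno::EMFILE, Errno::ENFILE
  # EMFILE: per-process limit reached; ENFILE: system-wide limit reached.
  GC.start
  # Second and last attempt. If it fails too, the exception propagates.
  yield
end
```

A caller would wrap the open call, for example with_gc_retry { File.new(path) }. The full collection only runs on the failure path, which is the cost being weighed in this issue.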

vyp · 2015-06-10T02:25:57Z

Leaning towards option 3 too. However, I remember Ary said he doesn't like warning messages as a solution because they're usually ignored anyway?

Also, option 2 means raising an exception after the second failure, correct?

jhass · 2015-06-10T08:49:48Z

I'm leaning towards 1 actually. Not closing the file is a clear bug, and all languages that automatically close files still strongly recommend doing it manually as soon as possible, where "manually" can mean using the appropriate language construct, e.g. with in Python and the block form of File.open in Ruby/Crystal.

asterite · 2015-06-10T13:38:40Z

I actually prefer 2.

Imagine you want to return a String iterator over the lines of a file, but streaming from disk:

def stream_lines
  File.open("...").each_line
end

Now there will be an open file that won't be closed, and can't be closed from outside that method because the File isn't provided. After running the program for a long time (if you invoke that method multiple times) you will get that warning/error if anything but solution 2 is chosen.

Maybe the above method isn't designed well. But I don't think the check in point 2 is very expensive: it's just checking whether the return value from open is less than zero (which we are already doing) and, in that exceptional case (which can't happen that often), running the GC.

Another problem is that you get the error after trying to open the last File. You get that error and understand: "OK, there must be some file that isn't being closed." How do you find that file? Maybe you forgot to do it in your code. Maybe a library you are using forgot to close one. Wouldn't it be better for the program not to crash and the language to just take care of this? If later you find that the GC is being invoked a lot in File.new for no reason, you can search for where the unclosed files are and fix that, but in the meantime you got a working program.

waj · 2015-06-10T13:54:26Z

Those never-closed files are like memory leaks: they might be hard to find.
The example is just a really bad API design, but if we do what Ruby does, the programmer might never realise there is a problem. In a long-running process, the number of open files will actually hit the maximum quite often, and it's not the check that is expensive but the full GC call.
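
For illustration, one way the stream_lines example could avoid leaking is to tie the file's lifetime to the iteration itself. This is a hedged Ruby sketch, not code from the thread: the path parameter and the enum_for fallback are assumptions added here.

```ruby
# Hypothetical rewrite of stream_lines; the path argument is illustrative.
def stream_lines(path)
  # Without a block, return an Enumerator that calls this method again
  # when iterated, so the file is only open while iteration is running.
  return enum_for(:stream_lines, path) unless block_given?

  # The block form of File.open closes the file when the block exits,
  # whether the consumer finishes, raises, or breaks out early.
  File.open(path) do |file|
    file.each_line { |line| yield line }
  end
end
```

This covers iteration through each, to_a, first and similar methods. A caller that steps the enumerator manually with next and then abandons it would still depend on the finalizer.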

waj · 2015-06-10T14:03:32Z

Also, my point is: if a file is being closed by the finalizer, there is definitely a bad design. If the language and the runtime are there to make the programmer's life easier, then it should warn as soon as possible about the problem. It has the ability to detect it, but if it just tries to keep the program running, it's like sweeping the trash under the carpet.

One might argue that this is exactly like memory garbage collection, but for me it's not. Files are a much more expensive and limited resource. And why am I just talking about files? It's the same, or even worse, if we think about open sockets or pipes. Any IO should be explicitly closed; otherwise it's a bug.

asterite · 2015-06-10T14:08:14Z

@waj How do you detect the file that's leaking?

asterite · 2015-06-10T14:08:48Z

Well, never mind. Probably just find all File.new or File.open and trace that :-)

waj · 2015-06-10T14:09:18Z

We could print the path of the leaked file in the finalizer. That should help a lot.
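
A rough Ruby sketch of that idea follows. It assumes a wrapper class, named TrackedFile here purely for illustration, that registers a finalizer which reports the path if the file was never closed explicitly. Nothing in it is taken from Crystal's or Ruby's File implementation.

```ruby
# Hypothetical wrapper; class and method names are illustrative only.
class TrackedFile
  # Build the finalizer in a class method so the proc does not capture
  # the TrackedFile instance, which would prevent it from being collected.
  def self.leak_reporter(path, io, err = $stderr)
    proc do
      next if io.closed?  # closed explicitly: nothing to report
      err.puts "warning: #{path} was never closed explicitly"
      io.close            # still release the descriptor
    end
  end

  def initialize(path, mode = "r")
    @io = File.new(path, mode)
    ObjectSpace.define_finalizer(self, self.class.leak_reporter(path, @io))
  end

  def gets
    @io.gets
  end

  def close
    @io.close
  end
end
```

The message only appears when the collector happens to run, so it is a diagnostic aid and not a guarantee that every leak gets reported promptly.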

refi64 · 2015-06-10T14:13:53Z

I'm not much of a Ruby user, but I like the way Python solves it: close it in the finalizer anyway, but add a mechanism for automatic closing:

with open('abc', 'r') as f:
    print(f.read())
    # The 'with' block closes 'f' afterwards

It's kind of like an implicit try-finally.

waj · 2015-06-10T14:15:30Z

@kirbyfan64 In Crystal (and Ruby) the same mechanism is available with block methods:

File.open(...) do |f|
  f.gets
  ...
end

The file is automatically closed in this case.
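
As a sketch of what the block form does on the caller's behalf, the following Ruby helper spells out the implicit ensure. The name open_and_close is hypothetical and this is not the standard library's implementation.

```ruby
# Hypothetical helper showing the begin/ensure hidden behind the block form.
def open_and_close(path, mode = "r")
  file = File.new(path, mode)
  begin
    yield file
  ensure
    # Runs whether the block returns normally, raises, or breaks out.
    file.close unless file.closed?
  end
end
```

The helper returns the block's value, so open_and_close(path) { |f| f.gets } yields the first line while the descriptor is already released by the time the caller sees it.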

asb · 2015-06-10T14:24:05Z

In this trivial case, a simple static analysis would show that the file reference doesn't escape the loop and can be stack allocated, so its destructor can be run at the end of the basic block. Is this not currently done, or is it that destructors are only run at the method boundary?

waj · 2015-06-10T14:39:37Z

It's not currently done. The class instances are always allocated in the heap. In this case the local variable (or the implicit reference) is allocated in the stack, so ideally with escape analysis we could destroy the object and even return the memory space to the heap before the GC runs.

Still, that would be an optimisation and I don't think we should rely on such a mechanism to release the resources.

vyp · 2015-06-10T16:31:37Z

"Wouldn't it be better for the program not to crash and the language to just take care of this? If later you find that the GC is being invoked a lot in File.new for no reason, you can search for where the unclosed files are and fix that, but in the meantime you got a working program"

I don't know, I'd rather have the compiler give me an error or warning instead. As @waj says:

"it should warn as soon as possible about the problem"

This way I feel that the programmer's time is saved the most (overall).

asterite · 2015-06-10T16:39:51Z

@vyp Note that it won't be a compile error, it will be a runtime error

vyp · 2015-06-10T16:41:13Z

Sorry yes waj mentioned that at the start, hence "leaning".

vyp · 2015-06-10T16:42:22Z

Oh right, you meant the rust comment is useless. Correct, sorry.

vyp · 2015-06-10T16:47:09Z

Okay, so what about option 4? Except for jhass, it can do both of what @asterite and @waj want?

asterite · 2015-06-10T17:03:35Z

Oh, now that I read everything again, I think I prefer options 3 or 4. I'm not sure which one; having your program print warnings but continue running is strange.

vyp · 2015-06-10T17:14:21Z

Yes exactly, I keep thinking 4 is just too strange/confusing too.

vyp · 2015-06-10T17:35:47Z

Actually yes, rereading, I don't think @waj would want 4. But just wait for him to respond.

@jhass, isn't 3 also in alignment with your opinion?

jhass · 2015-06-10T17:55:58Z

So far we're warning-free: either you're doing it wrong or you're doing it right in Crystal. I like that.

vyp · 2015-06-10T23:42:25Z

Ah, didn't know that, thanks.

ozra · 2015-06-26T16:39:47Z

I vote for 4 (2 & 3).
If it can be caught at compile time - great!
If not - anything that can be saved at runtime, do it: do what the programmer intended and expected, and let the application do what the user expects: work - not crash.

If this thing would happen in some routine seldom called, implementing the heart surveillance system that your grandma is hooked up to at the hospital, you'd be glad it didn't crash because some coders decided "it's not the right way, so the coder should be punished with a crash".

The warning should of course be raised, so that it can be fixed properly for the next release.

refi64 · 2015-06-26T18:20:42Z

My vote goes to 4. Programs shouldn't crash from simple oversights like this when possible.

sdogruyol · 2016-11-20T17:50:10Z

I've tested this on Crystal 0.19.4 with OS X 10.11; ulimit -n is 7168.

I ran it 5 times, looping over 15 million times each time, and it hasn't crashed once.

index = 0
loop do
  index += 1
  puts index
  File.new(__FILE__)
end

crystal build --release file.cr && ./file
15277652
15277653
15277654
15277655
15277656
15277657
15277658
15277659
15277660
15277661
15277662
15277663
15277664
15277665

RX14 · 2017-05-14T13:36:10Z

Using strace on the program I see a lot of open() and close() syscalls. I'm guessing this is simply the finaliser running, so I don't think that this issue is solved because it's impossible to rely on the finaliser for programs with slow leaks.

I'd vote for option 3, but I wouldn't mind 2 & 3. Printing a warning is good, because explicit closing is obviously the optimal solution. It all depends on how tricky 2 is to implement.

akzhan · 2017-05-23T08:09:35Z

Just rename new to open_returning, so it will be rarely used :-)

rdp · 2019-08-30T05:33:15Z

With ulimit -n at 71681 it does loop forever. Even with the default ulimit (256?) the error message is "Unhandled exception: Error opening file '/path/to/bad.cr' with mode 'r': Too many open files (Errno)", which is reasonably clear.

The backtrace gets a bit messed up, I presume because it can't open files:

Failed to raise an exception: END_OF_STACK
[0x105d6dd0b] *CallStack::print_backtrace:Int32 +107
[0x105d473fb] __crystal_raise +91
[0x105d477ed] *raise<Errno>:NoReturn +189
[0x105d7da85] *Crystal::System::File::open<String, String, File::Permissions>:Int32 +197
[0x105d7b78f] *File::new<String, String, File::Permissions, Nil, Nil>:File +63
[0x105d63e40] *CallStack::read_dwarf_sections:(Array(Tuple(UInt64, UInt64, String)) | Nil) +40256
[0x105d59f23] *CallStack::decode_line_number<UInt64>:Tuple(String, Int32, Int32) +51
[0x105d597a2] *CallStack#decode_backtrace:Array(String) +290
[0x105d59661] *CallStack#printable_backtrace:Array(String) +49
[0x105da5128] *Exception+@Exception#backtrace?:(Array(String) | Nil) +72
[0x105da4fb1] *Exception+@Exception#inspect_with_backtrace<IO::FileDescriptor>:Nil +113
[0x105da3840] *AtExitHandlers::run<Int32>:Int32 +432
[0x105da926e] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +126
[0x105d51c59] main +9

rdp · 2020-02-15T05:39:49Z

Maybe just spit a message to stderr if it's compiled without --release? I wonder how Java handles this...

jhass added the RFC label Aug 31, 2015

jhass added status:draft topic:stdlib labels Sep 14, 2015

spalladino added status:draft and removed status:draft labels Jan 9, 2017

spalladino removed the RFC label Jan 9, 2017

skunkworker mentioned this issue Jun 11, 2017

Too many open files, crash when serving static files. amberframework/amber#108

Closed

asterite mentioned this issue May 10, 2018

Calling File.each_line Iterator method doesn't close File #6087

Closed

ysbaddaden mentioned this issue Jul 12, 2024

IO::FileDescriptor & Socket finalizers do far too much #14807

Closed
