Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows only - Random crash, for version 2.2 #674

Closed
iksi4prs opened this issue Jan 9, 2024 · 71 comments · Fixed by #797
Closed

Windows only - Random crash, for version 2.2 #674

iksi4prs opened this issue Jan 9, 2024 · 71 comments · Fixed by #797
Labels
Milestone

Comments

@iksi4prs
Copy link

iksi4prs commented Jan 9, 2024

Description
App crashes.
See more information about optional cause in discussion/comment:
#654 (reply in thread)

To Reproduce
Random.
I've noticed that after first install, it works fine for few seconds.
But trying to run app again fails, namely crash is even before first window is created (or at least before it becomes visible)

Additional Info
Logs from "Windows's events viewer"

Application: Pinta.exe CoreCLR Version: 8.0.23.53103 .NET Version: 8.0.0 Description: The process was terminated due to an unhandled exception. Exception Info: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. Stack: at GObject.Internal.Object.RemoveToggleRef(IntPtr, GObject.Internal.ToggleNotify, IntPtr) at GObject.Internal.Object.RemoveToggleRef(IntPtr, GObject.Internal.ToggleNotify, IntPtr) at GObject.Internal.ObjectMapper.Unmap(IntPtr) at GObject.Internal.ObjectHandle.ReleaseHandle() at System.Runtime.InteropServices.SafeHandle.InternalRelease(Boolean) at System.Runtime.InteropServices.SafeHandle.Finalize()

Faulting application name: Pinta.exe, version: 2.2.0.0, time stamp: 0x65410000 Faulting module name: coreclr.dll, version: 8.0.23.53103, time stamp: 0x65411660 Exception code: 0xc0000005 Fault offset: 0x00000000001e337a Faulting process id: 0x0xBC80 Faulting application start time: 0x0x1DA3F1C8D463DE9 Faulting application path: C:\Program Files\Pinta\bin\Pinta.exe Faulting module path: C:\Program Files\Pinta\bin\coreclr.dll Report Id: fa0399c5-aff9-4e9b-b43c-0b1429429ed9 Faulting package full name: Faulting package-relative application ID:

Faulting application name: Pinta.exe, version: 2.1.1.0, time stamp: 0x63c9df9f Faulting module name: KERNELBASE.dll, version: 10.0.22621.2792, time stamp: 0x3091b6fb Exception code: 0x80000003 Fault offset: 0x000000000010cd82 Faulting process id: 0x0x6A54 Faulting application start time: 0x0x1DA3F1CC7555BDF Faulting application path: C:\Program Files\Pinta\Pinta.exe Faulting module path: C:\WINDOWS\System32\KERNELBASE.dll Report Id: 8d3cb891-26e0-42d4-943e-630d861f029b Faulting package full name: Faulting package-relative application ID:

Version
Operating System:
Windows 11 Pro
Version 22H2
OS build 22621.2861

Pinta:
Pinta 2.2, from this branch: (not all dev branch have this problem)
https://github.com/PintaProject/Pinta/actions/runs/7405489857

@iksi4prs iksi4prs added the bug label Jan 9, 2024
@cameronwhite
Copy link
Member

For Pinta 2.2, from this branch: (not all dev branch have this problem) does that mean that you haven't hit the crashes with the regular build from the master branch?

@iksi4prs
Copy link
Author

@cameronwhite, the first version that I used with for 2.2, with narrow toolbar for tools and scroll bar, didn't crash.
I think it was from master (all versions show 2.2 in about, without date of build or branch information).
If you have list of branches/builds that you want me to check/compare, put the urls here and I will check.

@cameronwhite
Copy link
Member

Thanks, yeah there isn't any additional info in the About window unfortunately

I think the main one to just compare against is the regular master branch, so the latest build there is https://github.com/PintaProject/Pinta/actions/runs/7470015712
I'm still trying to figure out why I'm not hitting these crashes on my Windows machine - another user had similar reports too

@badcel
Copy link
Contributor

badcel commented Jan 11, 2024

Those things are hard to debug as you are on the mercy of the gc.

For example in debug mode or with a debugger attached the gc will behave differently.

Perhaps your machine is more beefy and the gc does not need to collect objects as frequently.

You can try to force collection but from my experience this is not guaranteed to result in collection of the wanted object.

You can try to call:

GC.Collect();
GC.WaitForPendingFinalizers();

To force collection.

@iksi4prs
Copy link
Author

iksi4prs commented Jan 11, 2024

test 1: master, latest build 7470015712 (narrow toolbar with scrollbar)
result - runs fine
test 2: test branch, 7405489857 (with 2 columns toolbar)
result: surprise... runs fine about 2 minutes and then crash.
ok, at least I wrote in the subject "Random crash."
I've checked the system memory, nothing special.
A few days I wasn't able to start the app at all (exited before window was visible)
Maybe as suggested by @badcel, this is GC dependent.
I will try use it and see if crashes again, based on general system memory, or some peaks/swaps when opening more apps.

@cameronwhite
Copy link
Member

In 13ebeda I added a --debug command line argument that runs the GC more frequently

@iksi4prs
Copy link
Author

From what I saw, the "-debug" was added in master, but I've encountered the problem in build from branch "toolbox-layout".
Can you please merge it into https://github.com/PintaProject/Pinta/tree/fix/toolbox-layout ?
(If possible, better also to merge/rebases all other changes from master)

@cameronwhite
Copy link
Member

Sure, I've rebased it on the latest master branch: https://github.com/PintaProject/Pinta/actions/runs/7523936006

@iksi4prs
Copy link
Author

still getting crash in exe from 7523936006
Application: Pinta.exe CoreCLR Version: 8.0.123.58001 .NET Version: 8.0.1 Description: The process was terminated due to an unhandled exception. Exception Info: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. Stack: at GObject.Internal.Object.RemoveToggleRef(IntPtr, GObject.Internal.ToggleNotify, IntPtr) at GObject.Internal.Object.RemoveToggleRef(IntPtr, GObject.Internal.ToggleNotify, IntPtr) at GObject.Internal.ObjectMapper.Unmap(IntPtr) at GObject.Internal.ObjectHandle.ReleaseHandle() at System.Runtime.InteropServices.SafeHandle.InternalRelease(Boolean) at System.Runtime.InteropServices.SafeHandle.Finalize()

@badcel
Copy link
Contributor

badcel commented Jan 17, 2024

Perhaps an idea is to have some print to the console somewhere in the stacktrace and print the name of the class which creates the problem?

@iksi4prs
Copy link
Author

@badcel , as you wrote it maybe issue with GC, so I don't think it will help.
I've downloaded this app just to try and use, so I don't have a dev env.
I will try to download, build and debug on my machine as soon as I can.

@badcel
Copy link
Contributor

badcel commented Jan 17, 2024

The collection happens asynchronous, so there is no way back to the creating code. But if there is a way to detect the type of the handle which gets freed it is perhaps possible to narrow down the classes which could cause the problem.

@cameronwhite
Copy link
Member

Perhaps an idea is to have some print to the console somewhere in the stacktrace and print the name of the class which creates the problem?

Yeah I think that would be really helpful for tracking down issues 👍

@iksi4prs
Copy link
Author

I'm not familiar with GC to that level. Most examples from MS required usage of windbg.
I've done some tests with creating fatal errors during "Dispose" of some objects, and the stack/error looks different from the errors in Pinta, so needs more time to investigate.
@cameronwhite, is the there a way release a version with pdb of external libs (like GtkSharp.dll, etc) ?
currently I see .pdb only for Pinta files.

@badcel
Copy link
Contributor

badcel commented Jan 18, 2024

@iksi4prs see: gircore/gir.core#1004

Pinta uses GirCore which is compiled with source link support. So you can actually take a look into every file from pintas source.

@cameronwhite
Copy link
Member

I'm not familiar with GC to that level. Most examples from MS required usage of windbg. I've done some tests with creating fatal errors during "Dispose" of some objects, and the stack/error looks different from the errors in Pinta, so needs more time to investigate. @cameronwhite, is the there a way release a version with pdb of external libs (like GtkSharp.dll, etc) ? currently I see .pdb only for Pinta files.

There also shouldn't be a GtkSharp.dll since that's from the GTK3 version of Pinta (we use gir.core now) - maybe you have some leftover files from an old installation?

@iksi4prs
Copy link
Author

There also shouldn't be a GtkSharp.dll since that's from the GTK3 version of Pinta (we use gir.core now) - maybe you have some leftover files from an old installation?

I just run the installer every time. Looks like it didn't clean previous installation during upgrade.
Now I've deleted the folder manually and did a "clean install".

@iksi4prs
Copy link
Author

@cameronwhite,

I've installed dev env.
And I see the crash in "unmap" with this stack:

image

I see that's GObject is external to Pinta,
so I guess I should start by adding logs in dispose/dtors of objects that inherit from "GObject.Object" ?

@cameronwhite
Copy link
Member

A lot of the subclasses of GObject.Object are also external to Pinta (from the generated GTK bindings in the gir.core library), so I'm not sure how easy it'll be to add logs in those
We do have a few Object subclasses in Pinta for custom bindings etc though

You could also see if you can inspect value.Object when that crash happens, which I think is the corresponding managed object and that might let you see its type

@iksi4prs
Copy link
Author

@cameronwhite , @badcel .
The value.Object is null, but handle has value, I can see it has some value in call to
GObject.Internal.Object.RemoveToggleRef in Dispose

I saw pr in gircore/gir.core#1004
with logs ( gircore/gir.core@main...release-handle-exception )

When it is expected to available in Pinta's build ?

@badcel
Copy link
Contributor

badcel commented Jan 19, 2024

It makes sense that object is null. The handle gets released because there are no instances anymore in the first place, which requires the instance to be null.

The problem is that @cameronwhite raised the concern that, reading data from a pointer if an AccessViolationException happens could result in a new AccessViolationException.

I would not like to merge this without confirmation that it does not result in a new exception.

This would either require you to build GirCore yourself with the fix and afterwards build Pinta using your own local GirCore build to confirm it works or I find another solution. But this is not yet very obvious to me. There are lots of possible ways around this, but I want the code to be as clean and simple as possible.

@badcel
Copy link
Contributor

badcel commented Jan 22, 2024

@iksi4prs i think you need to manually find the problem as the Microsoft docs read like the exception can't be caught.

@iksi4prs
Copy link
Author

iksi4prs commented Jan 22, 2024

@badcel,

I'm trying to aim for a more global solution, in case occurs again in future.

What about adding logs in ObjectHandle.cs ?

eg, some "pseudo" code, with names that may not be accurate:

public class ObjectHandle : SafeHandle
{
    private static bool _verboseMode = Environment.GetCommandLineArgs.Contains("--gircore-verbose-gobject");

    public IntPtr Handle => IsInvalid ? IntPtr.Zero : DangerousGetHandle();

    public ObjectHandle(IntPtr handle, object obj, bool ownedRef) : base(IntPtr.Zero, true)
    {
       if (_verboseMode)
       {
           var obejctDetails = string that contains runtime type, and other details that can be obtained;
            Console.WriteLine($"{timestamp}, {nameof(ObjectHandle)}, handle: {handle}, obj: {obejctDetails}");
       }
        ObjectMapper.Map(handle, obj, ownedRef);
        SetHandle(handle);
    }

    public sealed override bool IsInvalid => handle == IntPtr.Zero;

    protected sealed override bool ReleaseHandle()
    {
       if (_verboseMode)
       {
            Console.WriteLine($"{timestamp}, {nameof(ReleaseHandle)}, handle: {handle}" ");
       }
        ObjectMapper.Unmap(handle);
        return true;
    }
}

@badcel
Copy link
Contributor

badcel commented Jan 22, 2024

@iksi4prs there are already debug logs in place in ObjectMapper.Map and ObjectMapper.Unmap. But as the packages are compiled in release mode they get omitted.

I think it is fine to compile GirCore locally in debug mode if a project has such kind of problems which requires actual debug output.

(The log in Unmap should probably be moved to the top of the method.)

@iksi4prs
Copy link
Author

@badcel ,maybe in this case (since exception cant be caught) it will be nice in the future to add "stopwatch" style logger, using disposable.
eg:

#ifdef DEBUG
      using (new Logger())
#endif
      {
        lock (WrapperObjects)
        {
            if (WrapperObjects.Remove(handle, out var toggleRef))
                toggleRef.Dispose();
        }
     }

Where ctor of Logger logs the "Enter" and "Dispose" logs the "Exit", then easier to find places where no matching "Exit" and "Enter" pairs.
Anyway, I will try build the debug version with logs and see if I can find anything.

@iksi4prs
Copy link
Author

@cameronwhite ,@badcel,

I've built Pinta with Gir.Core and added some logs.
Much harder to reproduce bug now.
But looks to be consistent and related to the tooltip of the tools container on the left.
I don't manage to create a reproducible steps, I just "play" with ui till crash.
In all times the mapped object is "Gtk.Tooltip'
So later I started to hover all buttons of tools, so all tooltips will show/open, and then start "play" with ui.
eg;

Mapped Object: Handle '1776277271200' Object 'Gtk.Tooltip' OwnedRef 'False'.
Initialising Object: Address 1776277271200, Type Gtk.Tooltip.

Where for other case, this is also the stack of the mapping itself:


Mapped Object: Handle '2865452951200' Object 'Gtk.Tooltip' OwnedRef 'False'. StackTrace:   
at GObject.Internal.ObjectMapper.Map(IntPtr handle, Object obj, Boolean ownedRef) 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Internal\ObjectMapper.cs:line 37, 
at GObject.Internal.ObjectHandle..ctor(IntPtr handle, Object obj, Boolean ownedRef) 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Internal\ObjectHandle.cs:line 12,  
at GObject.Object..ctor(IntPtr handle, Boolean ownedRef) 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\Object.cs:line 29,   
at Gtk.Tooltip..ctor(IntPtr ptr, Boolean ownedRef) 
in D:\GirCore\gir.core\src\Libs\Gtk-4.0\Public\Tooltip.Framework.Generated.cs:line 16,  
at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor),   
at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(
Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr), 
at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(
Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture),
at System.Reflection.RuntimeConstructorInfo.Invoke(
BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture),  
at System.Reflection.ConstructorInfo.Invoke(Object[] parameters),  
at GObject.Internal.ObjectWrapper.WrapHandle[T](IntPtr handle, Boolean ownedRef) 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Internal\ObjectWrapper.cs:line 47,  
at GObject.Internal.ObjectWrapper.WrapNullableHandle[T](IntPtr handle, Boolean ownedRef) 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Internal\ObjectWrapper.cs:line 13, 
at GObject.Value.GetObject() 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\Value.Generated.cs:line 259, 
at GObject.Value.CheckComplexTypes(UIntPtr gtype)
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\Value.cs:line 74, 
at GObject.Value.Extract() 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\Value.cs:line 67,  
at GObject.Value.Extract[T]() 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\Value.cs:line 96,  
at GObject.SignalArgs.Extract[T](Value value)
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\SignalArgs.cs:line 22,  
at Gtk.Widget.QueryTooltipSignalArgs.get_Tooltip()
in D:\GirCore\gir.core\src\Libs\Gtk-4.0\Public\Widget.Signals.Generated.cs:line 190,  
at Pinta.Gui.Widgets.StatusBarColorPaletteWidget.HandleQueryTooltip(Object o, QueryTooltipSignalArgs args) 
in D:\PintaWithGirCore\Pinta.Gui.Widgets\Widgets\StatusBarColorPaletteWidget.cs:line 268,  
at GObject.ReturningSignal`3.<>c__DisplayClass11_0.<Connect>b__0(Value returnValue, Value[] parameters)
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\ReturningSignalTSenderTSignalArgs.cs:line 50,
at GObject.Closure.InternalCallback(
IntPtr closure, IntPtr returnValuePtr, UInt32 nParamValues, IntPtr paramValuesData, IntPtr invocationHint, IntPtr userData) 
in D:\GirCore\gir.core\src\Libs\GObject-2.0\Public\Closure.cs:line 40,  
at Gio.Internal.Application.Run(IntPtr application, Int32 argc, String[] argv), 
at Gio.Internal.Application.Run(IntPtr application, Int32 argc, String[] argv),  
at Gio.Application.Run(Int32 argc, String[] argv)
in D:\GirCore\gir.core\src\Libs\Gio-2.0\Public\Application.Methods.Generated.cs:line 215, 
at Gio.Application.RunWithSynchronizationContext(String[] args)
in D:\GirCore\gir.core\src\Libs\Gio-2.0\Public\Application.cs:line 23, 
at Pinta.MainClass.OpenMainWindow(Int32 threads, IEnumerable`1 files, Boolean debug) 
in D:\PintaWithGirCore\Pinta\Main.cs:line 117,
at Pinta.MainClass.<>c.<Main>b__0_1(Int32 threads, String[] files, Boolean debug) 
in D:\PintaWithGirCore\Pinta\Main.cs:line 74,  

My familiarity with the code is 0.
Need some suggestions what to look for.

@badcel
Copy link
Contributor

badcel commented Jan 23, 2024

@iksi4prs my advice would be to do the following steps:

  1. Move the debug log statement before the ToggleRef.Dispose call.
  2. If the crash occurs the moved debug log should trigger and tell you which handle caused the problem.
  3. Search all debug logs for the handle to detect the class of the handle.

Or did you already do this and the result is that it's GTK.Tooltip?

@badcel
Copy link
Contributor

badcel commented Apr 9, 2024

@iksi4prs If it never gets collected I think this is a hint that it is not a double free bug because there is one obsolete ref which keeps the instance alive forever.

Can you check if the _reference member inside the ToggleRef class is either a Tooltip which would mean there is some C code still owning the reference or if it is a WeakReference instance which would mean C# owns the last instance?

@cameronwhite
Copy link
Member

Looking in the debugger, it does seem to be a WeakReference instance.

This was my sample code.

using System;
var application = Adw.Application.New("org.gir.core", Gio.ApplicationFlags.FlagsNone);
application.OnActivate += (sender, args) =>
{
    var label = Gtk.Label.New("Hello world");
    label.HasTooltip = true;
    label.OnQueryTooltip += (_, args) =>
    {
        args.Tooltip.SetText("This is a tooltip");
        return true;
    };
    var window = Gtk.ApplicationWindow.New((Adw.Application) sender);
    window.Title = "Window";
    window.SetDefaultSize(300, 300);
    window.SetChild(label);
    window.Show();
};

// Run the GC more frequently
GLib.Functions.TimeoutAdd(0, 100, () =>
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
    return true;
});

return application.RunWithSynchronizationContext(null);

@cameronwhite
Copy link
Member

I also notice a warning (Window.exe:5940): Gdk-WARNING **: 22:44:59.122: gdk_gl_context_make_current() failed right before the crash.

Since this happens after moving the cursor away to close the tooltip, I wonder if there's a timing issue where if the tooltip is destroyed sometime later on (whenever the GC collects the C# object) instead of right away, it has a reference to something that is now stale?

@cameronwhite
Copy link
Member

Immediately cleaning up the C# objects does seem to work .. I haven't hit a crash so far with the following changes:

I changed the callback in the sample code to

        var tooltip = args.Tooltip;
        tooltip.SetText("This is a tooltip");
        tooltip.Dispose();

And then also changed src\Libs\GObject-2.0\Public\Closure.cs to add this at the end of InternalCallback to clean up the Value object that also references the Gtk.Tooltip. Otherwise I got a similar crash when the Glib.Value was eventually GC'd

        finally
        {
            foreach (var val in paramValues)
                val.Dispose();
        }

@badcel
Copy link
Contributor

badcel commented Apr 10, 2024

@cameronwhite did you try to apply only the change in Closure and not in the sample app?

I wonder if it is correct to dispose the Tolltip in the sample app before the call returns? Looks a little bit odd for me.

In the Closure implementation there is a copy of the Value parameter array created. This is a generic array wrapper for structs which uses the boxed_copy method to increase the ref count of the original Value.

But I'm not sure if the Value copy is at the same time increasing the ref count of the Tooltip it contains as there are now 2 Value instances referencing the same Tooltip.

I will check. Perhaps this is the bug we are looking for.

@cameronwhite
Copy link
Member

cameronwhite commented Apr 10, 2024

Yeah, both of those changes seemed necessary to fix all the crashes, although the change in the sample app is definitely a hack and isn't a good fix.

args.Tooltip creates a new Gtk.Tooltip managed wrapper, extracting from the Value in the args object, and there seemed to be crashes either from the Gtk.Tooltip managed wrapper being disposed much later on, or from the Value being disposed much later on (which should also have a reference to the underlying gobject.)

@badcel
Copy link
Contributor

badcel commented Apr 12, 2024

Hm, I wrote this test:

private class Foo : GObject.Object
{
    public Foo() : base(true, Array.Empty<ConstructArgument>())
    {
    }
}
[TestMethod]
public void ValueReferenceCountTest()
{
    var foo = new Foo();
    Marshal.PtrToStructure<GObject.Internal.ObjectData>(foo.Handle).RefCount.Should().Be(1);
    var v1 = new Value(foo);
    Marshal.PtrToStructure<GObject.Internal.ObjectData>(foo.Handle).RefCount.Should().Be(2);
    
    var ownedHandle = new GObject.Internal.ValueUnownedHandle(v1.Handle.DangerousGetHandle()).OwnedCopy();
    Marshal.PtrToStructure<GObject.Internal.ObjectData>(foo.Handle).RefCount.Should().Be(3);
    var v2 = new GObject.Value(ownedHandle);
    Marshal.PtrToStructure<GObject.Internal.ObjectData>(foo.Handle).RefCount.Should().Be(3);
}

Unfortunately the test is successfull under linux. The code used here is the code used if the value array is copied.

So from my point of view everything looks correct here.

@cameronwhite i executed your sample code under linux and the app crashed after the tooltip gets hidden and a warning is emited. It seems to be the same error like on windows. Can you confirm?

(ConsoleApp2:537518): Gdk-WARNING **: 16:04:13.860: gdk_gl_context_make_current() failed
ConsoleApp2: ../src/dispatch_common.c:872: epoxy_get_proc_address: Zusicherung »0 && "Couldn't find current GLX or EGL context.\n"« nicht erfüllt.

Process finished with exit code 134.

I think the next step is that I convert the sample code to C just to be sure that the Gdk-Warning is not responsible for the crash.

@cameronwhite
Copy link
Member

On macOS I get a different but probably equivalent warning:

(<unknown>:25752): Gdk-CRITICAL **: 23:00:29.201: _gdk_macos_surface_update_fullscreen_state: assertion 'GDK_IS_MACOS_SURFACE (self)' failed

And then the sample application hangs afterwards.
I built an equivalent test case in C (pasted below) - on macOS it produces the same warning, but doesn't lock up and works fine otherwise
I'll try to get this tested on Windows too

Code:

#include <adwaita.h>

static gboolean
on_query_tooltip(GtkWidget *widget,
                 int x,
                 int y,
                 gboolean keyboard_mode,
                 GtkTooltip *tooltip,
                 gpointer data)
{
  gtk_tooltip_set_text(tooltip, "This is a tooltip");
  return TRUE;
}

static void
activate_cb(GtkApplication *app)
{
  GtkWindow *window = GTK_WINDOW(gtk_application_window_new(app));
  GtkWidget *label = gtk_label_new("Hello World");
  gtk_widget_set_has_tooltip(label, TRUE);

  g_signal_connect(label, "query-tooltip", G_CALLBACK(on_query_tooltip), NULL);

  gtk_window_set_title(window, "Window");
  gtk_window_set_default_size(window, 300, 300);
  gtk_window_set_child(window, label);
  gtk_window_present(window);
}

int main(int argc,
         char *argv[])
{
  g_autoptr(AdwApplication) app = NULL;

  app = adw_application_new("org.gir.core", G_APPLICATION_DEFAULT_FLAGS);

  g_signal_connect(app, "activate", G_CALLBACK(activate_cb), NULL);

  return g_application_run(G_APPLICATION(app), argc, argv);
}

Meson build script

project('tooltiptest', 'c', version: '0.1')

adw_dep = dependency('libadwaita-1', version: '>= 1.4.0')

executable('tooltiptest',
  'tooltiptest.c',
  dependencies: [adw_dep]
)

@cameronwhite
Copy link
Member

The test works fine for me on Windows and doesn't crash. I didn't see any console output either

Thinking about this more, one other possible issue is that the final g_object_unref() is being called from the GC thread, not the main UI thread. This could cause problems if the object's cleanup does anything non-trivial, since GTK stuff needs to happen on the main thread?
I came across https://discourse.gnome.org/t/using-g-object-unref-from-non-main-threads/7046/7 which talks about this more and also mentions that the old C# bindings had some special handling for this to queue up the unref's onto the main thread

My previous workaround, which immediately called Dispose(), also would have had the side effect of making the g_object_unref() happen on the current thread.

@badcel
Copy link
Contributor

badcel commented Apr 13, 2024

@cameronwhite this is a good idea.

Perhaps disposing the Tooltip directly is no problem because GTK itself holds another reference and this would avoid the GC to kick in from another thread.

I will check if this is true and try some hacks to verify your assumption. (For me it sounds very plausible.)

@cameronwhite
Copy link
Member

I made a quick test by replacing the ToggleRef.Dispose() method with:

        public void Dispose()
        {
            GLib.Functions.TimeoutAdd(GLib.Constants.PRIORITY_DEFAULT, 50, () =>
            {
                Internal.Object.RemoveToggleRef(_handle, _callback, IntPtr.Zero);
                return false;
            });
        }

and that seemed to work in the testing I did on macOS and Windows

@badcel
Copy link
Contributor

badcel commented Apr 13, 2024

I tested your code, too on fedora and it worked.

I have an alternate implementation, which seems to work, too. What do you think?

public void Dispose()
{    
    var sourceFunc = new SourceFuncAsyncHandler(() =>
    {
        Internal.Object.RemoveToggleRef(_handle, _callback, IntPtr.Zero);
        return false;
    });
    MainContext.Invoke(MainContextUnownedHandle.NullHandle, sourceFunc.NativeCallback, IntPtr.Zero);
}

It avoids a timeout and invokes the Internal.Object.RemoveToggleRef on the main thread.

Docs: MainContext.Invoke / TimeOutAdd

In general I wonder if it is correct to call the RemoveToggleRef on the MainContext. I'm not sure what happens if there is no MainLoop running, e.g. an App does not use GTK at all. Any thoughts?

@badcel
Copy link
Contributor

badcel commented Apr 13, 2024

I made another test, which involves only GObject:

using GObject;

public class Program
{
    public static async Task Main(string[] args)
    {
        GObject.Module.Initialize();
        Console.WriteLine("Hello, World!");
        var p = new Program();
        p.Do();
        await Task.Delay(5000);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        await Task.Delay(5000);
    }

    public void Do()
    {
        var foo = new Foo();
    }
}

public class Foo : GObject.Object
{
    public Foo() : base(true, Array.Empty<ConstructArgument>())
    {
    }
}

I don't know what must be altered so it works with the solution using GLib.TimeoutAdd? This scenario fails to remove the ToggleRef if GLib.TimeoutAdd is used.

But the original ToggleRef code and my suggested solution both work in this scenario as the RemoveToggleRef code is called.

So I would tend to say that the call of MainContext.Invoke is probably the way to go as it works in all cases. Any thoughts?

@cameronwhite
Copy link
Member

If I add some extra print statements like this

        public void Dispose()
        {
            System.Console.WriteLine("disposing");
            var sourceFunc = new GLib.Internal.SourceFuncAsyncHandler(() =>
            {
                System.Console.WriteLine("remove toggle ref");
                Internal.Object.RemoveToggleRef(_handle, _callback, IntPtr.Zero);
                return false;
            }); 

            MainContext.Invoke(MainContextUnownedHandle.NullHandle, sourceFunc.NativeCallback, IntPtr.Zero);

            System.Console.WriteLine("done disposing");
        }

Then in the GObject test it looks like the callback is immediately executed.

Hello, World!
disposing
remove toggle ref
done disposing

So it must be meeting the conditions mentioned in https://docs.gtk.org/glib/method.MainContext.invoke.html to execute right away, I think because the main context hasn't been acquired by another thread
I assume the timeout fails because there isn't an event loop running to actually run the timeout callback later on?

So I think MainContext.Invoke() sounds fine to me - my only thought with the timeout was that it would allow batching together the RemoveToggleRef calls if there are a bunch of objects being disposed, if there is any overhead from creating and scheduling the callback to run on the main thread for every object. But that can probably be done later if we find it actually necessary

@badcel
Copy link
Contributor

badcel commented Apr 16, 2024

The fix just got merged in gir.core. With the next release it should be fixed.

@iksi4prs
Copy link
Author

@badcel and @cameronwhite , thanks for time and effort putting in fixing this one.

@cameronwhite , this issue was arisen after changes for "toolbox-layout"
https://github.com/PintaProject/Pinta/tree/fix/toolbox-layout.

Do you have plans to merge it master, or need any more feedback, or else ?

(see original discussion: #654 )

@cameronwhite
Copy link
Member

Excellent, thanks for merging the fix!

I think the toolbox-layout branch can likely be merged soon - I'll just do a bit more testing on it now that we know these crashes are unrelated to the UI changes

@badcel
Copy link
Contributor

badcel commented Apr 17, 2024

Are you planning on testing it with the gir.core fix? This would actually be great, to see potential performance problems.

@cameronwhite
Copy link
Member

I'll try rebuilding locally with the new gir.core fix and do some testing.

@cameronwhite
Copy link
Member

I've tested on Windows with Pinta built against the latest gir.core and verified that the tooltip crash seems to be fixed (along with the similar crash that occurred in #766)

I haven't noticed any performance issues, but other than a few cases like the tooltip signal handler Pinta typically isn't rapidly creating GObject instances (the hot code paths mostly involve Cairo)

cameronwhite added a commit that referenced this issue May 9, 2024
- This contains a fix for the random crashes encountered on Windows

- The Cairo.Matrix class now has a more convenient constructor

- Gdk.FileList.GetFiles() is not being generated and requires a manual binding for now, but GLib.SList methods are now generated

Fixes: #674
cameronwhite added a commit that referenced this issue May 9, 2024
- This contains a fix for the random crashes encountered on Windows

- The Cairo.Matrix class now has a more convenient constructor

- Gdk.FileList.GetFiles() is not being generated and requires a manual binding for now, but GLib.SList methods are now generated

Fixes: #674
cameronwhite added a commit that referenced this issue May 9, 2024
- This contains a fix for the random crashes encountered on Windows

- The Cairo.Matrix class now has a more convenient constructor

- Gdk.FileList.GetFiles() is not being generated and requires a manual binding for now, but GLib.SList methods are now generated

Fixes: #674
cameronwhite added a commit that referenced this issue May 9, 2024
- This contains a fix for the random crashes encountered on Windows

- The Cairo.Matrix class now has a more convenient constructor

- Gdk.FileList.GetFiles() is not being generated and requires a manual binding for now, but GLib.SList methods are now generated

Fixes: #674
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants