Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a way to show save progress? #46

Closed
ElmoFuntz opened this issue Jan 17, 2020 · 43 comments
Closed

Is there a way to show save progress? #46

ElmoFuntz opened this issue Jan 17, 2020 · 43 comments
Assignees
Labels

Comments

@ElmoFuntz
Copy link

I have started testing saving of metadata to my audio book files and at first thought it was failing because I was seeing some mdat atom not found errors in the log but it turns out it was just taking a long time to save. Is there a way to display out the metadata save progress?

@Zeugma440 Zeugma440 self-assigned this Jan 17, 2020
@Zeugma440
Copy link
Owner

Zeugma440 commented Jan 17, 2020

Hi and thanks for the feeback ! That's a very relevant question for someone who uses the library on large files.

Right now there's nothing to allow what you describe, but I think it's a very useful feature. I need some time to research what are the .NET best practices on this matter.

I'd be very tempted to use ReactiveX, but this might be disproportionnate - maybe there's already something that allows to do the job on the basic .NET framework itself.

I'll keep you informed about the next steps.

@ElmoFuntz
Copy link
Author

Thanks its very noticeable on my first test file which was 1.2GB (44 hours long) but even the second one at only 230mb locks up the program with no status for a good 10 seconds or so.

@Zeugma440
Copy link
Owner

Zeugma440 commented Jan 19, 2020

I've added support for IProgress that will be available in the next version.

Sample code :

private void displayProgress(float progress)
{
	Console.WriteLine(progress * 100 + "%");
}

private void performWrite()
{
	IProgress<float> progress = new Progress<float>(displayProgress);
	Track t = new Track(s, progress);

	t.AdditionalFields.Add(new KeyValuePair<string, string>("test","aaa"));

	t.Save();
}

Important : ATL minimum requirements will be .NET Framework 4.5 instead of .NET Framework 3.0

Don't hesitate if you have additional feedback.

@ElmoFuntz
Copy link
Author

Nice! I'll give it a whirl when it is available. I'm already targeting 4.8 so that should not be a problem.

@ElmoFuntz
Copy link
Author

Ok not sure if i'm doing this right or its a bug but all i am getting when using IProgress is 1% twice

This is the log from 3 tests

Custom field TSSE : value = LAME 32bits version 3.99.5 (http://lame.sf.net)
1%
1%
d:/test.mp3 | Disk stream operation : Shortening (delta=-11349) [?=74991 ms]
1%
1%
d:/test.mp3 | Disk stream operation : Shortening (delta=-16056) [?=75441 ms]
1%
1%

and this is the code

        private void displayProgress(float progress)
        {
            Console.WriteLine(progress + "%");
        }

         private void btnSave_Click(object sender, EventArgs e)
        {
            // Adapt this to whatever your file path needs to be
            string fileName = "d:/test.mp3";

            IProgress<float> progress = new Progress<float>(displayProgress);
            // Load audio file information into memory
            Track theTrack = new Track(fileName, progress);

            // Modify metadata
            theTrack.Artist = "Hey ho";
            theTrack.Composer = "Oscar Wilde";
            theTrack.Album = "Fake album starts here and is longer than the original one";

            PictureInfo newPicture = new PictureInfo(Commons.ImageFormat.Jpeg, PictureInfo.PIC_TYPE.Front);
            ImageConverter converter = new ImageConverter();
            newPicture.PictureData = (byte[])converter.ConvertTo(picAlbumArt.Image, typeof(byte[]));
            theTrack.EmbeddedPictures.Add(newPicture);


            // Save modifications on the disc

            theTrack.Save();
        }

@Zeugma440
Copy link
Owner

Zeugma440 commented Jan 25, 2020

First of all, the code sample I gave you had an incorrect line. The progress should actually be displayed that way. Sorry for the misdirection.

Console.WriteLine(progress * 100 + "%");

Secondly, I've simplified the progress report to make it more consistent. It should become more intuitive in the next build.

Just don't expect something fine-tuned; it will be 0-100 or 0-50-100. For now I can't do much more than that because the nunber of low-level elements to write is unknown until all of them have been actually written. Making the library count them all at the beginning of the process will slow down the whole operation and/or occupy more memory, which I can't accept just for the sake of reporting progress.
I hope you understand.

@ElmoFuntz
Copy link
Author

ElmoFuntz commented Jan 25, 2020

Ah ok. Is it possible to even do it in 25% increments or does that add too much overhead? Does it go through this/add overhead even if you don’t initialize theTrack with the progress?

@Zeugma440
Copy link
Owner

Zeugma440 commented Jan 25, 2020

As far as I can see, there are 3 ways to report about progress :

1/ Basic level of detail is to count the number of tag containers that have been written (my choice for the next build)

e.g. if an MP3 file has an ID3v2 and an APE tag

  • Beginning : 0%
  • ID3v2 tag written : 50%
  • ID3v2 and APE tag written : 100%

2/ Medium level of detail is to report about how many zones have been written inside of each tag container. A zone is an area of the file that contains important data to write or to update.

It won't do much of a difference for ID3v2 and APE because they both have one single zone, as their structure is very simple. WMA's or MP4/M4A's have a more complex structure and may have a few dozen zones to go through, which will give a more detailed report.

To display a correct %, the library will need to count the zones of all tag containers before starting to update the file. Adapting the library to do that can be quite heavy to implement but should only generate a moderate overhead... and you won't see the difference with basic level when editing MP3s anyway : it will still jump 0-50-100.

3/ Ultimate level of detail is to report about how many bytes have been written. To know that, it is necessary to measure how many bytes need to be written on the whole file before starting the operation. In other words, the library will need to simulate the updating all tag containers before the update operation even begins. That's complex and generates a huge overhead for each file, just for the sake of displaying a progress indicator.

=> imho, choice 1 is the best tradeoff between ease of implementation and added value

@ElmoFuntz
Copy link
Author

Ah ok. I figured it was as simple as hey this file is 500mb and I’ve written 200 so far and doing the math on that. Sounds like it is much more complicated.

@Zeugma440
Copy link
Owner

yeah, unfortunately
Most audio files have like 1% of metadata and 99% of audio data. The only part that gets updated is the metadata. The difficult part is to determine the size of the portion of the file that actually needs to be updated, before the update begins.

@ElmoFuntz
Copy link
Author

So now the output shows 0% and 100% right after

Bitrate (KBps) : 0
Has variable bitrate audio : yes
Has lossless audio : no
Custom field ©too : value = Lavf55.45.100
d:/test.m4b | trak atom could not be found for index 3; aborting reading through tracks [?=11381 ms]
d:/test.m4b | mdat atom could not be found; aborting read [?=1 ms]
d:/test.m4b | trak atom could not be found for index 3; aborting reading through tracks [?=58 ms]
d:/test.m4b | mdat atom could not be found; aborting read [?=0 ms]
d:/test.m4b | trak atom could not be found for index 3; aborting reading through tracks [?=68 ms]
d:/test.m4b | mdat atom could not be found; aborting read [?=0 ms]
d:/test.m4b | Disk stream operation : Lengthening (delta=26888) [?=1880 ms]
d:/test.m4b | trak atom could not be found for index 3; aborting reading through tracks [?=2514 ms]
d:/test.m4b | mdat atom could not be found; aborting read [?=0 ms]
0%
100%

@Zeugma440
Copy link
Owner

M4B files usually have one single native tag container, so that totally makes sense.

There's no finer detail than that, unfortunately, unless I dive deeper and count the zones (see yesterday's explanation).

@ElmoFuntz
Copy link
Author

Ah I understand. I know it's not a priority but it would be cool if you could look into it more as you have time. I'm not sure how other (now obsolete) programs have done it in the past. I think Chapter & Verse was one that did show progress but it also used quicktime and itunes which is why it no longer works. If there was even a "quick save" and a "progress save" where the first did not do any counting and the 2nd did that might be workable.

@mattjopete
Copy link

mattjopete commented Feb 28, 2020

@ElmoFuntz Have you looked at setting this to help speed up file saving? It ends up using more RAM but even for large files it's not too bad.

Settings.FileBufferSize = fileStream.Length > int.MaxValue
                ? int.MaxValue
                : (int)fileStream.Length;

#36

@ElmoFuntz
Copy link
Author

Z mentioned that setting to me in another bug and setting it to 4096 dramatically reduced save times.

@CooPzZ
Copy link

CooPzZ commented Aug 23, 2022

For me it was the CopySameStream that needed a status. Unfortunately, it could do this procedure a few times depending on what changes - I did a hacky status trick using Trace in the CopySameStream and a listener into my app which allowed me to see better how long it was going to take which helped a lot on big files with the perceived hang time.
image

eg: Please note this is not a valid solution and hack to view it in a vb.net win forms app.

    Private WithEvents traceListner As New Listeners

    Private Sub Form_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            ATL.Settings.FileBufferSize = My.Settings.TagSaveIOBufferSize
            Trace.Listeners.Add(traceListner)
    End Sub

    Private Sub traceListner_Message(message As String, category As String) Handles traceListner.Message
        If (category.withValue AndAlso category = "Saving") Then
            ''NEED TO ADD THESE TO THE ALTdotnet.SteamUtils.cs > CopySameStream sub in ATL lib to use
            'System.Diagnostics.Trace.Write($"{(Double)written / (double)length * 100:00.#}", "Saving");

            progressBar1.Value = message
            'Application.DoEvents()
        End If
    End Sub

Listener.

Public Class Listeners
    Inherits TraceListener

    ' Public Event Message(message As String)
    Public Event Message(message As String, category As String)

    Public Overrides Sub Write(message As String)
        RaiseEvent message(message, Nothing)
    End Sub

    Public Overrides Sub Write(message As String, category As String)
        RaiseEvent message(message, category)
    End Sub

    Public Overrides Sub WriteLine(message As String)
        RaiseEvent message(message, Nothing)
    End Sub

    Public Overrides Sub WriteLine(message As String, category As String)
        RaiseEvent message(message, category)
    End Sub
End Class

You cannot see it here but I'm also using the progress mentioned above - as noted it barely registers an output where this kept me much more informed.
WARNING: it does slow the save down using the TRACE.
Hope it helps.

@Zeugma440
Copy link
Owner

Zeugma440 commented Aug 23, 2022

Thanks for the workaround @CooPzZ. I hope this helps other people.

FYI, I'm keeping this issue alive as I haven't found any satisfactory solution to implement a proper progress indicator that :

  • gives meaningful output for all supported formats
  • does not slow down processing

For all of you who are reading me, do not forget to take a look at the FAQ dedicated to performance : https://github.com/Zeugma440/atldotnet/wiki/4.-FAQ-(Frequently-Asked-Questions)#FAQM.5

@CooPzZ
Copy link

CooPzZ commented Aug 23, 2022

No worries - I understand.
Just remember you don't have to be exact - a response is better than a hang, and if it does cause a slow down you'll most probably be able to let us know why and give a choice on whether to use it. As you say if it really slows it down > dont.

I would think if there is a process per file type make it a dynamic with 2 stages:

  1. high-level processing - eg: check for change, processing zones, removing, adding, re-writing headers, saving outcome .. etc.
  2. In the places where a long process could occur. eg: like the CopySameStream, give a percentage of work done if it's extremely quick to make otherwise maybe just a position indicator.

Again to keep it simple you could export a status tag like I'm doing here and a percentage and let the calling application decide what to do with it, that way you don't need to keep a strict order of things either.
If the calling app slows it down (by calling something like Application.DoEvents on the same thread), it's not your library's fault to worry about that as we know we should be reacting on a different non-blocking thread either way.

I was thinking of extending your Status object to work like the Trace and call it very similarly as I'm using both at the moment - I might have a go at it, however I'm still learning the lib, and I think you were on the right track.

@Zeugma440
Copy link
Owner

I just attempted something - tell me if it fits your needs @CooPzZ

NB : The app can't predict the effort required for each step, so you might see "peaks" while going from 0% to 100%

@CooPzZ
Copy link

CooPzZ commented Sep 1, 2022

It may however it never hits the code as the progress object didn't get to the procedure.
image

I think it's these calls in the FileSurgeon.RewriteZones that need amending.
image

Looking further if I've set my Progress object - you are recreating them as well on the calls which losses my reference
if(writeProgress != null) progress = writeProgress.CreateIProgress();

@CooPzZ
Copy link

CooPzZ commented Sep 1, 2022

Could you please give an example on how to use it with the section name coming out and the Track object?

@Zeugma440
Copy link
Owner

It may however it never hits the code as the progress object didn't get to the procedure.

That part was intentionally left out as the current progress granularity is what FileSurgeon calls the "Region" level. Your use case seems to need additional granularity, to the "Zone" level.

I'll adapt the code and get back to you.

Could you please give an example on how to use it with the section name coming out and the Track object?

For now, section name is only recorded for debug purposes. Do you really need that for end user display ?

@CooPzZ
Copy link

CooPzZ commented Sep 2, 2022

Your use case seems to need additional granularity, to the "Zone" level.

No worries - that seems to be the part where most time is spent - well with the saving I've been doing. ie: I hit pause and 9 times out of 10 it's in that procedure - some times it going through the 300k of Zones. Keep in mind this is for big files only.

Do you really need that for end user display ?

No but it makes more sense if it's goes through the procedures more than once.

@CooPzZ
Copy link

CooPzZ commented Sep 2, 2022

Hi @Zeugma440 ,

This is more of what I was thinking so you didn't need to fluff up and down the procedure tree - I hate when I have to do that - I'm not sure if its' a great pattern however it should be efficient and loose enough - it is loosely based on the observer pattern it think, but works much like the Trace statement.

namespace ATL.Progress
{
    public static class ProgressMonitor
    {
        public static void Report(Single value, string Where = null)
        {  //maybe test for listeners first
            foreach (ProgressListener oLis in _colListeners)
            {
                oLis.ReportProgress(value, Where);
            }
        }

        private static readonly List<ProgressListener> _colListeners = new List<ProgressListener>();
        public static ProgressListener AddListener(ProgressListener lis)
        {
            _colListeners.Add(lis);
            return lis;
        }

    }

    public class ProgressListener
    {
        public event ReportEventHandler Report;
        public delegate void ReportEventHandler(double value, string where);

        public void ReportProgress(double value, string where)
        {
            Report(value, where);
        }
    }
}

To report progress, call the static procedure.. ie: There is no need to test if the progress object exists - the monitor will decide for you based on the listeners.
Progress.ProgressMonitor.Report(iteration / nbIterations, "CopySameStream");

And in the calling app you need to create the listener and subscribe to it in ATL once. Note this bit is in vb as I can't be bothered tonight to translate - maybe next week but hopefully you understand what I'm doing here.

    Private WithEvents _ProgressListener As New ATL.Progress.ProgressListener
    Private Sub Form_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ATL.Progress.ProgressMonitor.AddListener(_ProgressListener)
    End Sub
    Private Sub _ProgressListener_Report(value As Double, where As String) Handles _ProgressListener.Report
        Debug.Write(where?.ToString)
        Debug.WriteLine(value.ToString)
    End Sub

Anyway hope that helps.

@Zeugma440
Copy link
Owner

Hey @CooPzZ

The Observer is definitely a pattern to consider here to make things easier to handle, the same way logging has been designed.

However I need to be careful with that one as some ATL clients use the library in a multi-threaded environment where multiple files may be being processed at the same time. We might need a progress token or something akin to differentiate feedback when used on multiple threads... or maybe use the tread ID itself (done that before somewhere) 🤔

@ElmoFuntz
Copy link
Author

Glad to see this is getting some attention. Also thanks Z for all of your hard work on this project over the years!

@Zeugma440
Copy link
Owner

@CooPzZ I haven't dived yet in architectural refactoring, but now every unbuffered zone should report its own progress.

Here's a sample code to use Track and display save progress :

IProgress<float> progress = new Progress<float>(p => Console.WriteLine(p));
Track theTrack = new Track(file_location, progress);

I'll try to refactor the whole thing using an Observer pattern as soon as you confirm it behaves the way you expect.

@CooPzZ
Copy link

CooPzZ commented Sep 5, 2022

We might need a progress token or something akin to
True - in the scenarios described, I agree my simplified method isn't enough for that and you need a reference back.

The update:
--Scratch that
Tell me if I'm wrong - you are iterating all the zones and adding a progress report sequentially. Which does work. With the file being so big, it looks like it moves from 10.1% to 10.11 percent (ie: appeared really slow as I was only rounding to 1 decimal) and hence makes more sense.
Yes I think this will work.
sidenote: I not sure why, but it must get to a point in the process where if you use Application.Doevents() that it begins a stack overflow because it's going so quick. it is happening through the lambda in ProgressManagers.CreateIProgress.

@CooPzZ
Copy link

CooPzZ commented Sep 6, 2022

Sorry Zeugma, try again -
I've added another video today in the email I sent yesterday, showing the status working - however you'll see that it doesn't show a progress when going through the initial CopySameStream as that's using my example progressmonitor I added.
The stack says it's during the FileSurgeon.RewriteZones run.
I hope I'm helping here.
Cheers
CooPzZ

@Zeugma440
Copy link
Owner

Zeugma440 commented Sep 10, 2022

I've added another video today in the email I sent yesterday, showing the status working - however you'll see that it doesn't show a progress when going through the initial CopySameStream as that's using my example progressmonitor I added.

First of all, your video is super clear from beginning to end. Thanks for taking the time to capture it 👍

I haven't seen all your code, but something is telling me that your calling Progress.ProgressMonitor.Report happens on the main thread.
On the other hand, my use of .NET's Progress.Report sends notifications on the ThreadPool, making ATL progress feedback asynchronous.

What might happen there is your own feedback being processed before ATL's because of it being prioritary (main thread) over ATL's (ThreadPool), which explains why you have the impression that the app goes multiple times into CopySameStream.

Last but not least, the fact that your own progress runs is itself proof that ATL was logging something too, as the whole block inside CopySameStream starts with if (progress != null).

=> Did you try to run the app while disabling your own progress feedback to see if the ATL progress feedback covers the whole operation ?

The stack says it's during the FileSurgeon.RewriteZones run.

If we're going deeper into that issue, I'll need to know which line of FileSurgeon.RewriteZones it was called from, as there are different cases that result to calls to StreamUtils.LengthenStream / StreamUtils.ShortenStream inside FileSurgeon.RewriteZones

@CooPzZ
Copy link

CooPzZ commented Sep 12, 2022

I'm sorry, I'm not sure if I'm doing threads right - but I can see that the progress isn't active to report back through the ATL progress when you see mine coming through LengthenStream - CopySameStream. Mine is probably wrong, but you can see the long process happening and goes through it twice - while it's going through the zones.
Added another video for viewing with hopefully enough detail for you to see what I'm mentioning.

Yes I have done it without mine and I don't see it doing anything during that period - 3rd vid.

@Zeugma440
Copy link
Owner

It is a threading issue, and the more I read about it, the more I get the feeling IProgress has to be used in conjunction with async / await... which ATL doesn't support for now.

I'm gonna set up a new test project with a bona fide UI (instead of a Console) and experiment with that. Stay tuned !

@Zeugma440
Copy link
Owner

Zeugma440 commented Sep 14, 2022

Graphical UI is now working as expected !

My conclusions are :

  • IProgress, async and await are indeed meant to be used together. Asking for an IProgress in the API doesn't make any sense when I'm not supporting async
  • As long as BinaryWriter doesn't support async (Add async overloads to BinaryReader/Writer dotnet/runtime#17229), progress shall be reported using the main thread
  • Providing an Observer-based progress feedback API, such as the one ATL has for logging, still seems a bit overkill to me :

Here's the downsides for each solution :

Library Client app
Delegate Has to be passed as an argument all around the library Has to be passed to each new Track
Observer Will require a more complex internal API to handle multithreading and re-create the current transaction-based "russian doll reporting" The capacity to have multiple subscribers will probably be useless

=> For now, I've chosen the simplest route, changing the public API from

Track(string path, IProgress<float> writeProgress = null, bool load = true)

to

Track(string path, Action<float> writeProgress = null, bool load = true)

UI refresh happens properly when feeding that to the Delegate

private void displayProgress(float progress)
{
    String str = (progress * 100).ToString() + "%";
    Console.WriteLine(str);
    ProgressLbl.Text = str;
    Application.DoEvents();
}

This should (hopefully) be the final update to this issue 🤞

@Zeugma440
Copy link
Owner

@CooPzZ forgot to tag you as you're not this issue's OP

@CooPzZ
Copy link

CooPzZ commented Sep 16, 2022

Still got it - sorry, just haven't had a chance to check it.

@Zeugma440
Copy link
Owner

Zeugma440 commented Sep 17, 2022

FYI, the tests to support async are progressing quickly.

The target API will be :

  • Track.Save and Track.Remove will be performance-focused, call the current implementation and won't report any progress
  • Track.SaveAsync and Track.RemoveAsync will call an async-compatible implementation and will report progress using IProgress<float>

=> If you don't mind, I'd rather you spend time on #160 so that I can release a bugfix build for #160 and #159. The release after that one will contain support for async as well as the definitive ( 🤞 ) solution to the current issue.

@CooPzZ
Copy link

CooPzZ commented Sep 19, 2022

This looks very good now and reports during that period. Excellent work.

There is one case to note, which you may already know.
On long m4b books with a simple edit, locally or over the network doesn't matter, the progress is extremely slow. ie: 0.5% a second. It's not sending data over the network and I'm using a buffer of 256M and appears to be on the progress showing.
Some timings with and without the progress reporting:
Adding ~40 bytes
with Status > 3m4s and updated the progress 148085 (LogAddBytes_withProgress.m4b.181808.ATLlog)
without Status > 7s (LogAddBytes_withoutProgress.m4b.181129.ATLlog)
Simular timings for removing the bytes again.
So I think there is a resolution issue here and maybe needs to be proportional.
Note: this is not the case on mp3 - assuming a m4b thing going through the zones.

I've added the Logs in the share for both.
Note: this was happening beforehand also.

@Zeugma440
Copy link
Owner

Roger. I have an idea about how to solve that too. I'll fix that along with the async implementation.

@Zeugma440
Copy link
Owner

Done ! The main branch is now async-compatible, and I've fixed the resolution problem you reported on MP4/M4A files with a lot of audio chunks.

More information there : https://github.com/Zeugma440/atldotnet/wiki/Progress-feedback

Warning The API has changed (again) : the progress feedback handle must now be given as arguments to Save and Remove methods directly instead of being given to Track at construct time.

@CooPzZ
Copy link

CooPzZ commented Sep 26, 2022

Thanks Zeugma -- ok playing with this today I couldn't find out why your progress was working differently to mine.
Going through many things, I was using .net6 instead of .Net 4.8 and thinking this was the issue, etc.. But I finally found the call that was causing the counting more than once when doing a progress.

ATL.Settings.ForceDiskIO = True

Is there something regarding this and progress we need to be aware of.

@Zeugma440
Copy link
Owner

If I may, instead of comparing your progress vs. mine, wouldn't it be better to wonder if the ATL standard progress is meaningful to the end user ?

To answer your question, ATL.Settings.ForceDiskIO = True disables write-time buffering, which implies more I/O operations.
Unless you have serious issues with RAM usage (e.g. use on a mobile device), I advise you to keep that option to its default value (false).

@CooPzZ
Copy link

CooPzZ commented Sep 26, 2022

Sorry, I meant in my program, I'm using the same ATL progress object you've built in the UI-test (nothing of my progress method exists any more) I was getting a different output..
The only difference that i found was I had set the value ATL.Settings.ForceDiskIO = True (not sure why, but was playing at some point) and the status was coming out differently. This has been causing confusion between us.

What might happen there is your own feedback being processed before ATL's because of it being prioritary (main thread) over ATL's (ThreadPool), which explains why you have the impression that the app goes multiple times into CopySameStream.

Instead of your remark around threads, the ATL.Settings.ForceDiskIO = True, may make the progress report count to 100% more than once.

Otherwise the progress methods added work really well. I need a bit more time to comment on the resolution with different file types.

@Zeugma440
Copy link
Owner

The original issue reported by the OP has been answered with v4.11 (https://github.com/Zeugma440/atldotnet/wiki/Progress-feedback)

Please open new issues if you detect bugs or improvements on the current reporting behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants