Bug: Unable to generate service file from large (~900MB) video file. #1278

Closed
antbrown opened this issue Sep 19, 2019 · 14 comments

Comments

@antbrown

I'll preface this by saying sub-100MB videos work fine. The service file and thumbnail are generated correctly and added back as media to the original islandora_object.

Before we get to the big video I should note we were also having trouble with video files at around the 400MB size.

I managed to solve it by adding -loglevel error to the ffmpeg command in HomarusController::convert at line 199.

ffmpeg writes out heaps of data to stdout and I don't think the CmdExecuteService was able to process all of it, so the 4Kb limit was reached and ffmpeg froze.

Running a stack trace on the ffmpeg process id showed it was stuck on a write() operation, which is what I'm basing these guesstimates on.

Adding -loglevel error to the command keeps it nice and quiet (unless an error occurs) and so the 4Kb limit is never reached and ffmpeg conversion completes successfully.
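For anyone wanting to see the blocked write() in isolation, here is a minimal Python sketch of the general mechanism. It is not the CmdExecuteService code: `NOISY_CHILD` is a hypothetical stand-in for a chatty ffmpeg, and `run_and_drain` is an illustrative helper name. The idea is that a child writing to a pipe nobody reads will eventually block, so the parent either has to keep draining the pipe or keep the child quiet (which is what -loglevel error does).

```python
import subprocess
import sys

# Hypothetical stand-in for a chatty ffmpeg: writes ~1 MB of log text to stderr.
NOISY_CHILD = "import sys; sys.stderr.write('x' * 1_000_000)"


def run_and_drain(cmd):
    """Run cmd, reading stdout and stderr concurrently so neither pipe fills."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() keeps reading both pipes until the child exits. Calling
    # proc.wait() here instead could hang once the stderr pipe buffer is full.
    out, err = proc.communicate()
    return proc.returncode, out, err


returncode, out, err = run_and_drain([sys.executable, "-c", NOISY_CHILD])
```

Draining and silencing are complementary: draining protects against any noisy command, while a quiet log level simply avoids producing the output in the first place.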

Now... we're trying to process a 930MB .mp4 video file.

The ffmpeg conversion completes successfully and we get an HTTP 200 response from Homarus. Then, somewhere between 50 seconds and 2 minutes later, either the JVM gets restarted or the request is retried and the whole ffmpeg process starts again.

I'll try to attach what I think are relevant log entries, but I'm still learning a lot of this stuff so feel free to ask for more info :)

Tokens and urls have been redacted to protect the innocent.

  • homarus.log shows a successful conversion of large video file.
  • camel.log shows the 200 response from Homarus, then retries the request.
  • wrapper.log shows an earlier attempt which resulted in the JVM restarting.

Unfortunately the karaf.log is super noisy and has been rotated out so I don't have access to the matching timestamps.

I've also run htop, kept an eye on memory/cpu/swap and nothing seems to be breaking any limits there.

Cheers!
Ant

@kayakr
Contributor

kayakr commented Sep 19, 2019

I'm working with @antbrown on this.

We see no POST (via Apache) to Drupal when the video Service File is returned from Homarus. The messaging layer is falling over when handling the Homarus output.

It would be good to understand more about the architecture around Karaf/Camel/ActiveMQ, especially when it comes to handling larger binaries. Is the output of Homarus being accumulated somewhere before being POSTed to the Drupal API? What systems should we be looking at? Has anyone else tried this scale of conversion?

@dannylamb
Contributor

@antbrown @kayakr This is good stuff. Thanks for being the first to tackle large video files.

I'll have to dig deeper here, since it sounds like we've got sneaky issues going on in different layers. Both CmdExecuteService and the DerivativeConnector will have to be inspected more thoroughly.

@dannylamb
Contributor

Based on what you're describing, it sounds like it never gets here: https://github.com/Islandora-CLAW/Alpaca/blob/dev/islandora-connector-derivative/src/main/java/ca/islandora/alpaca/connector/derivative/DerivativeConnector.java#L69-L72

IN THEORY it's all streamed around. The microservice will request the video, stream it to ffmpeg, and then stream the results back to camel, which then PUTs them into Drupal. I suspect that when it PUTs to Drupal, it's not getting streamed, gobbling up too much memory, and making the JVM tip over.

I'll play with it more and report back.
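To make the streamed-versus-buffered distinction concrete, here is a small Python sketch. It is only an analogy for the behaviour described above, not Camel or Alpaca code; `stream_copy` and the 64 KB chunk size are assumptions made for illustration. A streamed copy never holds more than one chunk in memory, whereas calling `src.read()` with no size would pull the whole payload in at once.

```python
import io


def stream_copy(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks; return (total_bytes, largest_chunk)."""
    total = 0
    largest = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


# Stand-in payload: 1 MB of zero bytes instead of a real video.
source = io.BytesIO(bytes(1_000_000))
sink = io.BytesIO()
total, largest = stream_copy(source, sink)
```

Here `largest` stays at the chunk size no matter how big the source is, which is the property you want when the payload is a 930MB video.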

@dannylamb
Contributor

After reading the docs and getting some tentative confirmation from @birkland, I think the issue (or at least part of it) is that we need to set &disableStreamCache=true for all our http calls in Camel. For reference: https://camel.apache.org/components/latest/http-component.html#_query_parameters_50_parameters

We had to go through this a short while ago with &connectionClose=true. I can issue the PR for it if you're willing to test. Deploying Alpaca can be a bit of a bear, so I'll try to provide as much help as possible so you can get it on the server doing the migrations.
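As a rough illustration of what the resulting endpoint URI looks like, the Python sketch below appends the two options mentioned above to a URL. Alpaca itself builds its endpoints in Java, so this only shows the shape of the URI; `with_camel_options` and the example URL are hypothetical, and only the option names `connectionClose` and `disableStreamCache` come from this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_camel_options(endpoint, **options):
    """Return endpoint with the given query options added (existing ones kept)."""
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(options)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical microservice URL; the two option names come from the thread.
uri = with_camel_options(
    "http://localhost:8000/homarus/convert",
    connectionClose="true",
    disableStreamCache="true",
)
```

With these inputs `uri` ends in `?connectionClose=true&disableStreamCache=true`, and any query options already present on the endpoint are preserved.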

@antbrown
Author

Awesome! Thanks @dannylamb

If you get a chance to post the pull request I am more than happy to test it out.

I will try to find my way through re-deploying Alpaca; any help you can provide here will be greatly appreciated.

@antbrown
Author

Just an update from me.

I made the following changes...

Islandora/Alpaca@dev...antbrown:antbrown/disable-stream-cache

I thought it was only relevant to derivatives and not indexing, which is why I've only changed the DerivativeConnector and its test. Please correct me if I'm wrong!

I also split the line because gradle didn't like lines longer than 120 chars, fair enough!

Then I did this from the root of the git repository:

docker run --rm -u gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle ./gradlew build

It gave me a new islandora-connector-derivative-1.0.2.jar, which is cool.

Now, I just need to figure out where to put it 😄

@dannylamb
Contributor

dannylamb commented Sep 26, 2019

@antbrown Eventually we'll hit all the spots in Alpaca, but for now sticking to just the DerivativeConnector is totally fine.

If you want to deploy it, it goes to /opt/karaf/deploy. Putting it there will trigger karaf to load the jar. Deal is though, you gotta disable the original. So before you do that, log into the karaf console and bundle:uninstall it. Here's a copy of what I did in my shell.

First I logged in and uninstalled the original bundle using its ID, which I looked up. Karaf has a lot of bash-isms that may look familiar here (e.g. | grep )

vagrant@claw:~/Alpaca$ /opt/karaf/bin/client
client: Ignoring predefined value for KARAF_HOME
Logging in as karaf
[ERROR] Failed to construct terminal; falling back to unsupported
java.lang.NumberFormatException: For input string: "0x100"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.valueOf(Integer.java:766)
	at jline.internal.InfoCmp.parseInfoCmp(InfoCmp.java:59)
	at jline.UnixTerminal.parseInfoCmp(UnixTerminal.java:242)
	at jline.UnixTerminal.<init>(UnixTerminal.java:65)
	at jline.UnixTerminal.<init>(UnixTerminal.java:50)
	at jline.NoInterruptUnixTerminal.<init>(NoInterruptUnixTerminal.java:24)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at jline.TerminalFactory.getFlavor(TerminalFactory.java:211)
	at jline.TerminalFactory.create(TerminalFactory.java:102)
	at jline.TerminalFactory.create(TerminalFactory.java:51)
	at org.apache.karaf.client.Main.main(Main.java:144)

        __ __                  ____      
       / //_/____ __________ _/ __/      
      / ,<  / __ `/ ___/ __ `/ /_        
     / /| |/ /_/ / /  / /_/ / __/        
    /_/ |_|\__,_/_/   \__,_/_/         

  Apache Karaf (4.0.8)

Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit 'system:shutdown' to shutdown Karaf.
Hit '<ctrl-d>' or type 'logout' to disconnect shell from current session.

karaf@root()> bundle:list | grep islandora
bundle:list | grep islandora
113 | Active |  80 | 1.0.1    | islandora-http-client
114 | Active |  80 | 1.0.1    | islandora-indexing-triplestore
119 | Active |  80 | 1.0.1    | islandora-indexing-fcrepo
120 | Active |  80 | 1.0.1    | islandora-connector-derivative
121 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.ocr.blueprint.xml
122 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.houdini.blueprint.xml
123 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.fits.blueprint.xml
124 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.homarus.blueprint.xml

karaf@root()> bundle:uninstall 120
bundle:uninstall 120

Then I checked all the bundles again to make sure it was gone (it was).

karaf@root()> bundle:list | grep islandora
bundle:list | grep islandora
113 | Active |  80 | 1.0.1    | islandora-http-client
114 | Active |  80 | 1.0.1    | islandora-indexing-triplestore
119 | Active |  80 | 1.0.1    | islandora-indexing-fcrepo
121 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.ocr.blueprint.xml
122 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.houdini.blueprint.xml
123 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.fits.blueprint.xml
124 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.homarus.blueprint.xml

Then I hit ctrl+c to get out of the karaf shell and copied the jar over to the deploy directory.

vagrant@claw:~/Alpaca$ sudo cp ~/Alpaca/islandora-connector-derivative/build/libs/islandora-connector-derivative-1.0.2.jar /opt/karaf/deploy

Then I hopped back in the shell and confirmed the new bundle was in the Active state.

vagrant@claw:~/Alpaca$ /opt/karaf/bin/client
client: Ignoring predefined value for KARAF_HOME
Logging in as karaf
[ERROR] Failed to construct terminal; falling back to unsupported
java.lang.NumberFormatException: For input string: "0x100"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.valueOf(Integer.java:766)
	at jline.internal.InfoCmp.parseInfoCmp(InfoCmp.java:59)
	at jline.UnixTerminal.parseInfoCmp(UnixTerminal.java:242)
	at jline.UnixTerminal.<init>(UnixTerminal.java:65)
	at jline.UnixTerminal.<init>(UnixTerminal.java:50)
	at jline.NoInterruptUnixTerminal.<init>(NoInterruptUnixTerminal.java:24)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at jline.TerminalFactory.getFlavor(TerminalFactory.java:211)
	at jline.TerminalFactory.create(TerminalFactory.java:102)
	at jline.TerminalFactory.create(TerminalFactory.java:51)
	at org.apache.karaf.client.Main.main(Main.java:144)

        __ __                  ____      
       / //_/____ __________ _/ __/      
      / ,<  / __ `/ ___/ __ `/ /_        
     / /| |/ /_/ / /  / /_/ / __/        
    /_/ |_|\__,_/_/   \__,_/_/         

  Apache Karaf (4.0.8)

Hit '<tab>' for a list of available commands
and '[cmd] --help' for help on a specific command.
Hit 'system:shutdown' to shutdown Karaf.
Hit '<ctrl-d>' or type 'logout' to disconnect shell from current session.

karaf@root()> bundle:list | grep islandora
bundle:list | grep islandora
113 | Active |  80 | 1.0.1    | islandora-http-client
114 | Active |  80 | 1.0.1    | islandora-indexing-triplestore
119 | Active |  80 | 1.0.1    | islandora-indexing-fcrepo
121 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.ocr.blueprint.xml
122 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.houdini.blueprint.xml
123 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.fits.blueprint.xml
124 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.homarus.blueprint.xml
125 | Active |  80 | 1.0.2    | islandora-connector-derivative

Then, to really make sure, I made an object and got derivatives.

It was pretty painless for me, but sometimes karaf can really throw you a curveball, so if you follow along and something unexpected happens, let me know.
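If you end up repeating the ID lookup on several servers, the step can be sketched in Python as below. This is just a parser for the listing format shown in the console output above, assuming each data row has five pipe-separated cells (id, state, a numeric level, version, name); `find_bundles` is a hypothetical helper, not a Karaf command.

```python
def find_bundles(listing, name):
    """Return (id, state, version) for each row of listing whose name matches."""
    matches = []
    for line in listing.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Data rows have five cells: id, state, a numeric level, version, name.
        if len(cells) == 5 and cells[0].isdigit() and cells[4] == name:
            matches.append((int(cells[0]), cells[1], cells[3]))
    return matches


# Rows copied from the console output above.
LISTING = """\
119 | Active |  80 | 1.0.1    | islandora-indexing-fcrepo
120 | Active |  80 | 1.0.1    | islandora-connector-derivative
121 | Active |  80 | 0.0.0    | ca.islandora.alpaca.connector.ocr.blueprint.xml
"""

bundles = find_bundles(LISTING, "islandora-connector-derivative")
```

An empty result after bundle:uninstall, or a single row with the new version after copying the jar, is the same check done by eye in the steps above.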

@antbrown
Author

Hi @dannylamb thank you for the very thorough instructions :)

Here's what I did:

karaf@root()> bundle:list | grep islandora-connector-derivative
120 | Active |  80 | 1.0.1    | islandora-connector-derivative

karaf@root()> bundle:uninstall 120

karaf@root()> bundle:list | grep islandora-connector-derivative
karaf@root()>

Back on my host:

cp ~/forks/Islandora-CLAW/Alpaca/islandora-connector-derivative/build/libs/islandora-connector-derivative-1.0.2.jar ~/ccc-dhr/sandbox6/

Back inside VM:

sudo mv /vagrant/islandora-connector-derivative-1.0.2.jar /opt/karaf/deploy/

From the camel.log

2019-09-27 10:11:25,865 | DEBUG | nsole user karaf | Activator                        | 57 - org.apache.camel.camel-core - 2.20.4 | Bundle stopped: ca.islandora.alpaca.islandora-connector-derivative
2019-09-27 10:14:44,237 | DEBUG | raf-4.0.8/deploy | Activator                        | 57 - org.apache.camel.camel-core - 2.20.4 | Bundle started: ca.islandora.alpaca.islandora-connector-derivative

Then I went back to http://localhost:8000/node/add/islandora_object to create a simple Video repository item, then added/uploaded a new media/video item (930MB), crossed my fingers and watched the logs.

Oh, I also needed to make sure I was running the patched version of Crayfish/Homarus to include the -loglevel error flag in the command so it didn't fall over after 5 minutes.

Video conversion took around 20 minutes (which is fine, I've only given the VM 1x cpu).

Homarus returned the file which was then PUT back to Drupal as a 'Service File' media item attached to the original islandora_object. Happy days 😄

I learned a lot following this process, thanks heaps.

Not sure where to from here, do I figure out how to open pull requests against Alpaca and Crayfish and submit them for review, or is this something you want to look into further before committing to a solution?

I'll attach some log entries too, in case it helps future peoples...

Cheers!
Ant

@dannylamb
Contributor

PRs against Alpaca and Crayfish would be much appreciated. Bonus points if you throw disableStreamCache=true on the other Alpaca routes, but I can do that in a follow up PR if you don't have time.

@dannylamb
Contributor

@antbrown @kayakr I've gone ahead and tossed up PRs for Crayfish and Alpaca to address this issue. Can you two confirm these fix the issue with large files?

@antbrown
Author

Hi @dannylamb sorry for the delay,

I'm not sure what I'm doing differently. The first time I got this working, I just dropped the jar file into the karaf deploy directory and karaf slurped it up and was happy. Now, doing the same with the 1.0.2 jars from the 1278 pull request (built with gradlew) doesn't seem to update the features/bundles within karaf (even after a tomcat restart).

Karaf reports that the 1.0.1 version is still in use.

I don't have a lot of experience in this area, I'm wondering if you could point me to any extra docs around how Alpaca is meant to be deployed? Or even more abstract docs on how to deploy things to karaf properly?

Many thanks, Ant

@antbrown
Author

@dannylamb we’ve managed to push the 1.0.2 jars out to staging and UAT, and coupled with the -loglevel error change in Crayfish we can now generate service files for large videos 😄

Thanks very much for your work on this!

@dannylamb
Contributor

dannylamb commented Dec 16, 2019

@antbrown Thank you so much for testing. Sounds like Islandora/Crayfish#82 is good, then. Can you confirm that Islandora/Alpaca#66 is good as well? Or is it no longer necessary? It's not part of 1.0.2 unfortunately.

Either way, these have sat for a bit and I'd like to get them taken care of. So let me know and I'll make a plea to get one or both of these merged.
