
Support opus-mt-mul-en translation in WebGPU #841

Closed
flatsiedatsie opened this issue Jul 9, 2024 · 20 comments
Labels
question Further information is requested

Comments

@flatsiedatsie
Contributor

Question

I've been having some trouble where translation sometimes wasn't working. For example, I just tried translating Polish into English using opus-mt-mul-en, but it outputs empty strings.

So I started looking for what could be wrong, and in the Transformers.js source code I found this marian.py file:
https://github.com/xenova/transformers.js/blob/7f5081da29c3f77ee830269ab801344776e61bcb/scripts/extra/marian.py#L18

It lists the supported Opus MT models, and while the model is available on Hugging Face (https://huggingface.co/Xenova/opus-mt-mul-en), I'm guessing it isn't actually supported (yet)?

Do I understand correctly?

Related: is there a setting with the mul models that I need to set to select which language is translated into?

For completeness, here's some of my code:

Constructing the model:

const hf_model_url = 'Xenova/opus-mt-mul-en';

pipeline('translation', hf_model_url, {
	progress_callback: progressCallback,
	dtype: dtype_settings,
	device: self.device,
})
.then((pipe) => {
	// etc

And getting a translation out:

.pipe(sentence)
.then((translation) => {
etc

.. which already raises the question: if opus-mt-en-mul is supported according to that file, then how would that multilingual model know which language to output?

I'll continue searching to see if I can answer my own question :-)

@flatsiedatsie flatsiedatsie added the question Further information is requested label Jul 9, 2024
@flatsiedatsie
Contributor Author

flatsiedatsie commented Jul 9, 2024

Answering my own question of course.

Firstly, I was wrong: mul-en is listed in that file after all.

Secondly, I found references to src_lang and tgt_lang in the react example.

{     
     src_lang: 'hin_Deva', // Hindi
     tgt_lang: 'fra_Latn', // French
}

// Hmm, in my test adding {src_lang: 'pol', tgt_lang: 'eng'} doesn't work.. yet.
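For reference, a sketch (not the library's documented behavior) of how the options object from the react example is built. Note the code formats differ per model family: the NLLB example above uses codes like 'hin_Deva', while the Opus codes I tried are short ones like 'pol'.

```javascript
// Sketch: build the options object passed as the second argument to a
// translation pipeline call. The language codes are illustrative; each
// model family expects its own format ('hin_Deva' for NLLB, 'pol' for Opus).
function buildTranslationOptions(srcLang, tgtLang) {
  return { src_lang: srcLang, tgt_lang: tgtLang };
}

const opts = buildTranslationOptions('hin_Deva', 'fra_Latn');
// opts is { src_lang: 'hin_Deva', tgt_lang: 'fra_Latn' }
```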

@flatsiedatsie
Contributor Author

flatsiedatsie commented Jul 9, 2024

Aha, more clues. This issue mentions adding >>jpn<< at the start of the string to be translated for the Opus models.

// hmm

Unsupported language code ">>pol<<" detected, which may lead to unexpected behavior. Should be one of: []
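For the record, the token-prefixing that the linked issue describes can be sketched as a hypothetical helper (not part of Transformers.js itself):

```javascript
// Sketch: Opus MT multilingual models read a '>>xxx<<' token at the start
// of the input to select a language; this helper just prepends it.
function withLanguageToken(sentence, langCode) {
  return `>>${langCode}<< ${sentence}`;
}

withLanguageToken('Dzień dobry', 'pol'); // → '>>pol<< Dzień dobry'
```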

@flatsiedatsie
Contributor Author

Switched back to Transformers.js V2, and even though I see the same error about the language code not being in the empty list, it now works.

(screenshot, 2024-07-09 14:44)

So for now I guess I'll have to choose between all those languages.. or the speed of V3.

@flatsiedatsie
Contributor Author

As an experiment I tried the model from the example, Xenova/nllb-200-distilled-600M, to see if it would work with GPU. It does a nicer job of translating. Unfortunately it doesn't seem to support WebGPU yet either.

(screenshot, 2024-07-09 15:27)

I tried changing the proxy settings. With dtype: 'fp16' it gave an unspecified error:

Uncaught 1668117928

and without it, the same:

Uncaught 1668117928

@flatsiedatsie
Contributor Author

flatsiedatsie commented Jul 9, 2024

Did some more limited testing: while the Opus MT translations are not perfect, on the whole they are better than the nllb-200-distilled-600M translations.

@flatsiedatsie flatsiedatsie reopened this Aug 27, 2024
@flatsiedatsie flatsiedatsie changed the title Is translation via opus-mt-mul-en not supported? Support opus-mt-mul-en translation in WebGPU Aug 27, 2024
@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

V3 still doesn't return any output for the Opus mul-en and en-mul models.

I tried forcing WASM, but there was still no output.

(screenshot, 2024-08-27 11:22)

// tried capitalization of the language code, and shortening it to two characters, but (expectedly) that had no effect.

// switching back to V2 makes it work immediately

@xenova
Collaborator

xenova commented Aug 27, 2024

Thanks for investigating this! I'm revisiting this now :) Do you by any chance have any sample python code for running Helsinki-NLP/opus-mt-mul-en you use for testing? That will come in handy!

@xenova
Collaborator

xenova commented Aug 27, 2024

Okay I think I figured it out (issue related to the NoBadWordsLogitsProcessor; 633976f). Can you try now?

@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

Cool!

Unfortunately I don't have any Python code; it's 100% browser/JavaScript for me.

Here's some (cleaned up) parts of my code that might be useful.

PREPARATION

let dtype_settings

function progressCallback(x){}

if( (self.supports_web_gpu16 || self.supports_web_gpu32) && hf_model_url.indexOf('opus-mt-') != -1){
	
	dtype_settings = self.supports_web_gpu16 ? 'fp16' : 'fp32';
	console.log("translation worker: setting DTYPE to give Opus MT translation a speed boost. dtype_settings: ", dtype_settings);
}

let actual_device = self.device; // wasm or webgpu, detected by separate function

//if(hf_model_url.endsWith('-mul-en') || hf_model_url.endsWith('-en-mul')){
if( hf_model_url.endsWith('m2m100_418M') || hf_model_url.endsWith('mbart-large-50-many-to-many-mmt') ){
	console.log("translation worker: forcing WASM pipeline for translation model: ", hf_model_url);
	env.backends.onnx.wasm.proxy = true;
	actual_device = 'wasm';
	//dtype_settings = 'fp32';
}
else{
	env.backends.onnx.wasm.proxy = self.device !== 'webgpu';
}


console.log("translation worker: actual_device: ", actual_device);


pipeline(
	'translation', 
	hf_model_url, 
	{
		progress_callback: progressCallback,
		//dtype: dtype_settings,
		device: actual_device,
		//quantized: true,
	},	 
) 
.then((pipe) => {
	console.log("TRANSLATION WORKER: pipeline_constructed.  pipe, sentences: ", pipe, sentences);
	pipelines[hf_model_url] = {'pipe':pipe,'last_used':Date.now()}
	do_translation_loop();
})
.catch((err) => {
	console.error("TRANSLATION WORKER: caught pipeline_construction error. err: ", err);
})

INFERENCE


let hf_model_url = 'Xenova/opus-mt-mul-en';
let translation_settings = {};

if(hf_model_url.endsWith('-mul-en') || hf_model_url.endsWith('-en-mul')){

	if(hf_model_url.endsWith('-mul-en')){
		//console.log("TRANSLATION WORKER: MUL -> EN");
		sentence = '>>' + input_language + '<< ' + sentence;
	}
	else if(hf_model_url.endsWith('-en-mul')){
		//console.log("TRANSLATION WORKER: EN -> MUL");
		sentence = '>>' + output_language + '<< ' + sentence;
	}
		
}
else{
	translation_settings['src_lang'] = input_language;
	translation_settings['tgt_lang'] = output_language;
}



console.log("TRANSLATION_WORKER: hf_model_url: ", hf_model_url);
console.log("TRANSLATION_WORKER: translation_settings: ", translation_settings);

// final cleaning of the string before translating it
sentence = sentence.replaceAll('\\"', '"'); // unescape quotes; note '\"' is identical to '"' in JS, so the original call was a no-op
sentence = sentence.replaceAll('\t','  ');
//sentence = sentence.trim();
//console.log("trimmed sentence: -->" + sentence + "<--");

console.log("TRANSLATION WORKER: TRANSLATING: FINAL SENTENCE:", sentence);


pipelines[hf_model_url].pipe(sentence,translation_settings)
.then((translation) => {
	console.log("TRANSLATION WORKER: TRANSLATED! \nfrom: ", sentence, "\n to: ", JSON.stringify(translation,null,2));
})
.catch((err) => {
	console.error("TRANSLATION WORKER: pipe: caught translation error. err: ", err);
})

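The branching in the INFERENCE section above could be folded into one helper. This is just a refactoring sketch of the code as written; the model names and language codes in the usage comments are illustrative.

```javascript
// Sketch: route per-model language selection. Opus MT mul models get an
// inline '>>xxx<<' token prepended to the sentence; other models get
// src_lang/tgt_lang passed as pipeline options instead.
function prepareTranslationInput(modelUrl, sentence, inputLang, outputLang) {
  const settings = {};
  if (modelUrl.endsWith('-mul-en')) {
    sentence = `>>${inputLang}<< ${sentence}`;
  } else if (modelUrl.endsWith('-en-mul')) {
    sentence = `>>${outputLang}<< ${sentence}`;
  } else {
    settings.src_lang = inputLang;
    settings.tgt_lang = outputLang;
  }
  return { sentence, settings };
}

// Usage (illustrative model names and codes):
// prepareTranslationInput('Xenova/opus-mt-mul-en', 'Hallo', 'nld', 'eng')
//   → sentence becomes '>>nld<< Hallo', settings stays empty
// prepareTranslationInput('Xenova/nllb-200-distilled-600M', 'Hallo', 'nld_Latn', 'eng_Latn')
//   → sentence unchanged, settings carries src_lang/tgt_lang
```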

@flatsiedatsie
Contributor Author

Sure! On it.

@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

Yep, that did it!

Thank you!

@xenova
Collaborator

xenova commented Aug 27, 2024

Great! Also, I'm terribly sorry about the delay... catching up on my long list of notifications! 😅

@xenova xenova closed this as completed Aug 27, 2024
@flatsiedatsie
Contributor Author

No worries, I'm frankly amazed at how hard you work.

@flatsiedatsie
Contributor Author

Because the Opus MT MUL models weren't working, I was trying to find other WebGPU enhanced translation models today. So just as a matter of record, and not in any way as a request:

  • M2M100 runs on V3 via CPU, but not GPU.
  • MBart50 runs on V3 via CPU, but not GPU.

Both output an integer as the error, along the lines of:

(screenshot of the integer error, 2024-08-27 13:22)

@xenova
Collaborator

xenova commented Aug 27, 2024

most likely an out-of-memory error. Are you running any quantized versions?

@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

That was my first thought as well. I tried with and without quantized models; same result in either case.

Switch to WASM, and it works immediately.
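The fallback I'm describing can be sketched as a small device-selection helper (hypothetical names; the actual WebGPU detection lives elsewhere in my worker, and the model list is illustrative):

```javascript
// Sketch: prefer WebGPU when available, but force WASM for models that
// currently fail on the GPU backend. Model suffixes are illustrative.
const WASM_ONLY_MODELS = ['m2m100_418M', 'mbart-large-50-many-to-many-mmt'];

function pickDevice(modelUrl, supportsWebGPU) {
  const needsWasm = WASM_ONLY_MODELS.some((m) => modelUrl.endsWith(m));
  return supportsWebGPU && !needsWasm ? 'webgpu' : 'wasm';
}
```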

@xenova
Collaborator

xenova commented Aug 27, 2024

@guschmue might be able to help debug

@gyagp

gyagp commented Aug 28, 2024

@flatsiedatsie Can you share an app (works on WASM, but fails with WebGPU)? We can take a look from the ONNX Runtime WebGPU side.

@flatsiedatsie
Contributor Author

@gyagp I can. I'll send you an email with a sneak preview link to my soon-to-be-released project.

@flatsiedatsie
Contributor Author

Email sent.
