
Support opus-mt-mul-en translation in WebGPU #841

Closed
flatsiedatsie opened this issue Jul 9, 2024 · 20 comments
Labels
question Further information is requested

Comments

@flatsiedatsie
Contributor

Question

I've been having some trouble where translation sometimes wasn't working. For example, I just tried translating Polish into English using opus-mt-mul-en, but it outputs empty strings.

So I started looking for what could be wrong, and in the Transformers.js source code I found this marian.py file:
https://github.com/xenova/transformers.js/blob/7f5081da29c3f77ee830269ab801344776e61bcb/scripts/extra/marian.py#L18

It lists the supported Opus MT models, and while the model is available on Hugging Face (https://huggingface.co/Xenova/opus-mt-mul-en), I'm guessing it isn't actually supported (yet)?

Do I understand correctly?

Related: is there a setting with the mul models that I need to set to select which language is translated into?

For completeness, here's some of my code:

Constructing the model:

const hf_model_url = 'Xenova/opus-mt-mul-en';

pipeline('translation', hf_model_url, {
	progress_callback: progressCallback,
	dtype: dtype_settings,
	device: self.device,
})
.then((pipe) => {
	// etc

And getting a translation out:

.pipe(sentence)
.then((translation) => {
etc

.. which already raises the question: if opus-mt-en-mul is supported according to that file, then how would that multilingual model know which language to output?

I'll continue searching to see if I can answer my own question :-)

@flatsiedatsie flatsiedatsie added the question Further information is requested label Jul 9, 2024
@flatsiedatsie
Contributor Author

flatsiedatsie commented Jul 9, 2024

Answering my own question of course.

Firstly, I was wrong: mul-en is listed in that file after all.

Secondly, I found references to src_lang and tgt_lang in the react example.

{     
     src_lang: 'hin_Deva', // Hindi
     tgt_lang: 'fra_Latn', // French
}

// Hmm, in my test adding {src_lang: 'pol', tgt_lang: 'eng'} doesn't work.. yet.
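For reference, a sketch (not the library's documented behavior) of how the options object from the react example is built. Note the code formats differ per model family: the NLLB example above uses codes like 'hin_Deva', while the Opus codes I tried are short ones like 'pol'.

```javascript
// Sketch: build the options object passed as the second argument to a
// translation pipeline call. The language codes are illustrative; each
// model family expects its own format ('hin_Deva' for NLLB, 'pol' for Opus).
function buildTranslationOptions(srcLang, tgtLang) {
  return { src_lang: srcLang, tgt_lang: tgtLang };
}

const opts = buildTranslationOptions('hin_Deva', 'fra_Latn');
// opts is { src_lang: 'hin_Deva', tgt_lang: 'fra_Latn' }
```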

@flatsiedatsie
Contributor Author

flatsiedatsie commented Jul 9, 2024

Aha, more clues. This issue mentions adding >>jpn<< at the start of the string to be translated for the Opus models.

// hmm

Unsupported language code ">>pol<<" detected, which may lead to unexpected behavior. Should be one of: []
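For the record, the token-prefixing that the linked issue describes can be sketched as a hypothetical helper (not part of Transformers.js itself):

```javascript
// Sketch: Opus MT multilingual models read a '>>xxx<<' token at the start
// of the input to select a language; this helper just prepends it.
function withLanguageToken(sentence, langCode) {
  return `>>${langCode}<< ${sentence}`;
}

withLanguageToken('Dzień dobry', 'pol'); // → '>>pol<< Dzień dobry'
```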

@flatsiedatsie
Contributor Author

Switched back to Transformers.js V2, and even though I see the same error about the language code not being in the empty list, it now works.

(screenshot, 2024-07-09 14:44)

So for now I guess I'll have to choose between all those languages.. or the speed of V3.

@flatsiedatsie
Contributor Author

As an experiment I tried the model from the example, Xenova/nllb-200-distilled-600M, to see if it would work with GPU. It does a nicer job of translating. Unfortunately it doesn't seem to support WebGPU yet either.

(screenshot, 2024-07-09 15:27)

I tried changing the proxy settings. With dtype: 'fp16' it gave an unspecified error:

Uncaught 1668117928

and without it, the same:

Uncaught 1668117928

@flatsiedatsie
Contributor Author

flatsiedatsie commented Jul 9, 2024

Did some more limited testing: while the Opus MT translations are not perfect, on the whole they are better than the nllb-200-distilled-600M translations.

@flatsiedatsie flatsiedatsie reopened this Aug 27, 2024
@flatsiedatsie flatsiedatsie changed the title Is translation via opus-mt-mul-en not supported? Support opus-mt-mul-en translation in WebGPU Aug 27, 2024
@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

V3 still doesn't return any output for the Opus mul-en and en-mul models.

I tried forcing WASM, but there was still no output.

(screenshot, 2024-08-27 11:22)

// tried capitalization of the language code, and shortening it to two characters, but (expectedly) that had no effect.

// switching back to V2 makes it work immediately

@xenova
Collaborator

xenova commented Aug 27, 2024

Thanks for investigating this! I'm revisiting this now :) Do you by any chance have any sample python code for running Helsinki-NLP/opus-mt-mul-en you use for testing? That will come in handy!

@xenova
Collaborator

xenova commented Aug 27, 2024

Okay I think I figured it out (issue related to the NoBadWordsLogitsProcessor; 633976f). Can you try now?

@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

Cool!

Unfortunately I don't have any Python code; it's 100% browser/JavaScript for me.

Here's some (cleaned up) parts of my code that might be useful.

PREPARATION

let dtype_settings

function progressCallback(x){}

if( (self.supports_web_gpu16 || self.supports_web_gpu32) && hf_model_url.indexOf('opus-mt-') != -1){
	
	dtype_settings = self.supports_web_gpu16 ? 'fp16' : 'fp32';
	console.log("translation worker: setting DTYPE to give Opus MT translation a speed boost. dtype_settings: ", dtype_settings);
}

let actual_device = self.device; // wasm or webgpu, detected by separate function

//if(hf_model_url.endsWith('-mul-en') || hf_model_url.endsWith('-en-mul')){
if( hf_model_url.endsWith('m2m100_418M') || hf_model_url.endsWith('mbart-large-50-many-to-many-mmt') ){
	console.log("translation worker: forcing WASM pipeline for translation model: ", hf_model_url);
	env.backends.onnx.wasm.proxy = true;
	actual_device = 'wasm';
	//dtype_settings = 'fp32';
}
else{
	env.backends.onnx.wasm.proxy = self.device !== 'webgpu';
}


console.log("translation worker: actual_device: ", actual_device);


pipeline(
	'translation', 
	hf_model_url, 
	{
		progress_callback: progressCallback,
		//dtype: dtype_settings,
		device: actual_device,
		//quantized: true,
	},	 
) 
.then((pipe) => {
	console.log("TRANSLATION WORKER: pipeline_constructed.  pipe, sentences: ", pipe, sentences);
	pipelines[hf_model_url] = {'pipe':pipe,'last_used':Date.now()}
	do_translation_loop();
})
.catch((err) => {
	console.error("TRANSLATION WORKER: caught pipeline_construction error. err: ", err);
})

INFERENCE


let hf_model_url = 'Xenova/opus-mt-mul-en';
let translation_settings = {};

if(hf_model_url.endsWith('-mul-en') || hf_model_url.endsWith('-en-mul')){

	if(hf_model_url.endsWith('-mul-en')){
		//console.log("TRANSLATION WORKER: MUL -> EN");
		sentence = '>>' + input_language + '<< ' + sentence;
	}
	else if(hf_model_url.endsWith('-en-mul')){
		//console.log("TRANSLATION WORKER: EN -> MUL");
		sentence = '>>' + output_language + '<< ' + sentence;
	}
		
}
else{
	translation_settings['src_lang'] = input_language;
	translation_settings['tgt_lang'] = output_language;
}



console.log("TRANSLATION_WORKER: hf_model_url: ", hf_model_url);
console.log("TRANSLATION_WORKER: translation_settings: ", translation_settings);

// final cleaning of the string before translating it
sentence = sentence.replaceAll('\\"', '"'); // unescape quotes; note '\"' is identical to '"' in JS, so the original call was a no-op
sentence = sentence.replaceAll('\t','  ');
//sentence = sentence.trim();
//console.log("trimmed sentence: -->" + sentence + "<--");

console.log("TRANSLATION WORKER: TRANSLATING: FINAL SENTENCE:", sentence);


pipelines[hf_model_url].pipe(sentence,translation_settings)
.then((translation) => {
	console.log("TRANSLATION WORKER: TRANSLATED! \nfrom: ", sentence, "\n to: ", JSON.stringify(translation,null,2));
})
.catch((err) => {
	console.error("TRANSLATION WORKER: pipe: caught translation error. err: ", err);
})

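The branching in the INFERENCE section above could be folded into one helper. This is just a refactoring sketch of the code as written; the model names and language codes in the usage comments are illustrative.

```javascript
// Sketch: route per-model language selection. Opus MT mul models get an
// inline '>>xxx<<' token prepended to the sentence; other models get
// src_lang/tgt_lang passed as pipeline options instead.
function prepareTranslationInput(modelUrl, sentence, inputLang, outputLang) {
  const settings = {};
  if (modelUrl.endsWith('-mul-en')) {
    sentence = `>>${inputLang}<< ${sentence}`;
  } else if (modelUrl.endsWith('-en-mul')) {
    sentence = `>>${outputLang}<< ${sentence}`;
  } else {
    settings.src_lang = inputLang;
    settings.tgt_lang = outputLang;
  }
  return { sentence, settings };
}

// Usage (illustrative model names and codes):
// prepareTranslationInput('Xenova/opus-mt-mul-en', 'Hallo', 'nld', 'eng')
//   → sentence becomes '>>nld<< Hallo', settings stays empty
// prepareTranslationInput('Xenova/nllb-200-distilled-600M', 'Hallo', 'nld_Latn', 'eng_Latn')
//   → sentence unchanged, settings carries src_lang/tgt_lang
```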

@flatsiedatsie
Contributor Author

Sure! On it.

@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

Yep, that did it!

Thank you!

@xenova
Collaborator

xenova commented Aug 27, 2024

Great! Also, I'm terribly sorry about the delay... catching up on my long list of notifications! 😅

@xenova xenova closed this as completed Aug 27, 2024
@flatsiedatsie
Contributor Author

No worries, I'm frankly amazed at how hard you work.

@flatsiedatsie
Contributor Author

Because the Opus MT MUL models weren't working, I was trying to find other WebGPU enhanced translation models today. So just as a matter of record, and not in any way as a request:

  • M2M100 runs on V3 via CPU, but not GPU.
  • MBart50 runs on V3 via CPU, but not GPU.

Both output an integer as the error, along the lines of:

(screenshot of the integer error, 2024-08-27 13:22)

@xenova
Collaborator

xenova commented Aug 27, 2024

most likely an out-of-memory error. Are you running any quantized versions?

@flatsiedatsie
Contributor Author

flatsiedatsie commented Aug 27, 2024

That was my first thought as well. I tried with and without quantized models; same result in either case.

Switch to WASM, and it works immediately.
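The fallback I'm describing can be sketched as a small device-selection helper (hypothetical names; the actual WebGPU detection lives elsewhere in my worker, and the model list is illustrative):

```javascript
// Sketch: prefer WebGPU when available, but force WASM for models that
// currently fail on the GPU backend. Model suffixes are illustrative.
const WASM_ONLY_MODELS = ['m2m100_418M', 'mbart-large-50-many-to-many-mmt'];

function pickDevice(modelUrl, supportsWebGPU) {
  const needsWasm = WASM_ONLY_MODELS.some((m) => modelUrl.endsWith(m));
  return supportsWebGPU && !needsWasm ? 'webgpu' : 'wasm';
}
```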

@xenova
Collaborator

xenova commented Aug 27, 2024

@guschmue might be able to help debug

@gyagp

gyagp commented Aug 28, 2024

@flatsiedatsie Can you share an app (works on WASM, but fails with WebGPU)? We can take a look from the ONNX Runtime WebGPU side.

@flatsiedatsie
Contributor Author

@gyagp I can. I'll send you an email with a sneak preview link to my soon-to-be-released project.

@flatsiedatsie
Contributor Author

Email sent.
