
Processing fst text lines #1055

Open
armusc opened this issue Sep 11, 2022 · 16 comments
@armusc (Contributor) commented Sep 11, 2022

Hi,

I didn't have this problem before, when I installed k2 with conda. I recently cloned and compiled directly from source, and now I get this error when reading an FST (created by kaldilm):

k2/build_release_cpu_torch_cpu/k2/csrc/fsa_utils.cc:295:void k2::OpenFstStreamReader::ProcessLine(std::string&) Invalid line: 5 0 4 99458 0, eof=true, fail=true, src_state=5, dest_state=0

It looks to me like the absence of a cost field on the line causes this failure (i.e. fail=true). If I add a 0.0 as a fifth field, the error does not happen.

Any suggestions?
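For reference, OpenFST's text format treats the trailing weight field as optional (an absent weight means 0.0), so a reader can accept both four- and five-field arc lines. A minimal sketch of such tolerant parsing (a hypothetical helper for illustration, not the actual k2 reader):

```python
def parse_arc_line(line):
    """Parse an OpenFST text-format arc line:

        src dest ilabel olabel [weight]

    The weight field is optional and defaults to 0.0 when absent.
    """
    fields = line.split()
    if len(fields) not in (4, 5):
        raise ValueError(f"Invalid line: {line!r}")
    src, dest, ilabel, olabel = map(int, fields[:4])
    weight = float(fields[4]) if len(fields) == 5 else 0.0
    return src, dest, ilabel, olabel, weight

# The four-field line from the error message above, weight defaulting to 0.0:
arc = parse_arc_line("5 0 4 99458")
```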

@csukuangfj (Collaborator)

Are you using the latest master?

@armusc (Contributor, Author) commented Sep 12, 2022

Right, after a merge it worked. Thanks.

By the way, this happened when trying to save an ARPA LM for use in LM rescoring. The unpruned ARPA is 6 GB, and it causes a segmentation fault when saving with torch:

torch.save(G.as_dict(), f"{args.lm_dir}/G_4_gram_asdict.new.pt")

If I prune it down to 1 GB, I have no issues. I understand it's not k2's fault, but maybe you are aware of this issue? Is this expected with LMs of that size?

@danpovey (Collaborator)

Can you show some debug info for the segmentation fault?

@armusc (Contributor, Author) commented Sep 12, 2022

Contrary to what I said, the segmentation fault is caused by the call to

G.as_dict()

rather than by torch.save.

I'm not sure if it helps, but I ran the python script with gdb:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
__memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:500
500 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.

@csukuangfj (Collaborator)

gdb --args python /path/to/xxx.py
(gdb) catch throw
(gdb) run
# When it segfaults
(gdb) backtrace

Please show the backtrace.

@armusc (Contributor, Author) commented Sep 13, 2022

#0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:500
#1 0x00007fff92b8cfcf in k2::Array1 k2::Cat(std::shared_ptr&lt;k2::Context&gt;, int, k2::Array1 const**) ()
from /home/amuscariello/mediaspeech/k2/build_release_cpu_torch_cpu/lib/libk2context.so
#2 0x00007fff92b85471 in k2::FsaVecToTensor(k2::Ragged&lt;k2::Arc&gt; const&amp;) () from /home/amuscariello/mediaspeech/k2/build_release_cpu_torch_cpu/lib/libk2context.so
#3 0x00007fff92ea21bb in ?? () from /home/amuscariello/mediaspeech/k2/build_debug_cpu_torch_cpu/lib/_k2.cpython-38-x86_64-linux-gnu.so
#4 0x00007fff92ec6d85 in ?? () from /home/amuscariello/mediaspeech/k2/build_debug_cpu_torch_cpu/lib/_k2.cpython-38-x86_64-linux-gnu.so
#5 0x000055555568ff8e in cfunction_call_varargs (kwargs=0x0, args=0x7ffff7963400, func=0x7fff92f48590) at /usr/local/src/conda/python-3.8.13/Objects/call.c:743
#6 PyCFunction_Call (func=0x7fff92f48590, args=0x7ffff7963400, kwargs=0x0) at /usr/local/src/conda/python-3.8.13/Objects/call.c:773
#7 0x0000555555678651 in _PyObject_MakeTpCall (callable=0x7fff92f48590, args=, nargs=, keywords=)
at /usr/local/src/conda/python-3.8.13/Python/errors.c:219
#8 0x0000555555674471 in _PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7fff8b097fd8, callable=0x7fff92f48590)
at /usr/local/src/conda/python-3.8.13/Include/cpython/abstract.h:125
#9 _PyObject_Vectorcall (kwnames=0x0, nargsf=, args=0x7fff8b097fd8, callable=0x7fff92f48590)
at /usr/local/src/conda/python-3.8.13/Include/cpython/abstract.h:115
#10 call_function (kwnames=0x0, oparg=, pp_stack=, tstate=0x5555558e64a0) at /usr/local/src/conda/python-3.8.13/Python/ceval.c:4963
#11 _PyEval_EvalFrameDefault (f=, throwflag=) at /usr/local/src/conda/python-3.8.13/Python/ceval.c:3469
#12 0x000055555568f886 in PyEval_EvalFrameEx (throwflag=0, f=0x7fff8b097e40) at /usr/local/src/conda/python-3.8.13/Python/ceval.c:738
#13 function_code_fastcall (globals=, nargs=, args=, co=) at /usr/local/src/conda/python-3.8.13/Objects/call.c:284
#14 _PyFunction_Vectorcall (kwnames=, nargsf=, stack=0x555557ca93b8, func=0x7fff8e377b80) at /usr/local/src/conda/python-3.8.13/Objects/call.c:411
#15 _PyObject_Vectorcall (kwnames=, nargsf=, args=0x555557ca93b8, callable=0x7fff8e377b80)
at /usr/local/src/conda/python-3.8.13/Include/cpython/abstract.h:127
#16 method_vectorcall (method=, args=0x555557ca93c0, nargsf=, kwnames=)
at /usr/local/src/conda/python-3.8.13/Objects/classobject.c:60

does that help?

@csukuangfj (Collaborator)

does that help?

Thanks!

Could you build a debug version of k2 and show the output of

(gdb) frame 1
(gdb) list

@danpovey (Collaborator)

It calls Cat on 4 arrays, including the arcs linearized so that each arc is 4 int32_t's. The size of that could definitely overflow int32_t if the number of arcs were more than about 2**(32 - 3) [-1 because it's signed, -2 because of the factor of 4].
I can't see an easy way to fix that without breaking older formats or introducing redundant formats.
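To put numbers on that overflow (a back-of-the-envelope sketch, not k2 code): the product elem_size * this_dim inside Cat is a byte count computed in 32-bit arithmetic, where elem_size is sizeof(int32_t) == 4 bytes, so it exceeds INT32_MAX once a single array holds about 2**29 int32 elements.

```python
# Rough check of the overflow threshold; the names mirror the variables
# discussed in the thread, the values are mine.
INT32_MAX = 2**31 - 1
elem_size = 4  # bytes per int32_t element

# Largest element count whose byte size still fits in int32_t.
max_safe_dim = INT32_MAX // elem_size  # 536870911, i.e. 2**29 - 1

# One more element and the 32-bit byte count would wrap.
overflows = (max_safe_dim + 1) * elem_size > INT32_MAX
```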

@danpovey (Collaborator)

... I do see a problem though, at array_ops_inl.h:349:

int32_t elem_size = src[0]->ElementSize();

This should be int64_t, so that when we multiply by the size it doesn't overflow.

@armusc (Contributor, Author) commented Sep 14, 2022

(gdb) frame 1
#1 0x00007fff928de83d in k2::Cat (c=..., num_arrays=4, src=0x7fffffffd020) at /home/amuscariello/mediaspeech/k2/k2/csrc/array_ops_inl.h:353
353 memcpy(static_cast<void *>(ans_data),
(gdb) list
348 // CPU.
349 int32_t elem_size = src[0]->ElementSize();
350 for (int32_t i = 0; i < num_arrays; ++i) {
351 int32_t this_dim = src[i]->Dim();
352 const T *this_src_data = src[i]->Data();
353 memcpy(static_cast<void *>(ans_data),
354 static_cast<const void *>(this_src_data), elem_size * this_dim);
355 ans_data += this_dim;
356 }
357 } else {
(gdb)

@armusc (Contributor, Author) commented Sep 14, 2022

Replacing int32_t with int64_t has indeed solved the problem in my case (6 GB 4-gram FST).

@csukuangfj (Collaborator)

(gdb) print elem_size
(gdb) print this_dim
(gdb) print elem_size * this_dim

to see whether elem_size * this_dim overflows.

@armusc (Contributor, Author) commented Sep 14, 2022

So anything bigger than 4 GB would fail? A 4-gram LM of that size can probably be pruned, but I have seen big HLGs.

@jtrmal (Contributor) commented Sep 14, 2022 via email

@armusc (Contributor, Author) commented Sep 14, 2022

(gdb) print elem_size
(gdb) print this_dim
(gdb) print elem_size * this_dim

to see whether elem_size * this_dim overflows.

(gdb) print elem_size
$2 = 4
(gdb) print this_dim
$3 = 771885848
(gdb) print elem_size * this_dim
$4 = -1207423904
(gdb)
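Those values confirm a 32-bit overflow: reproducing the multiplication with int32_t wraparound semantics in plain Python (values copied from the gdb session above) yields the same negative byte count, while the full-width product, as the proposed int64_t fix would compute it, stays positive.

```python
def as_int32(x):
    # Emulate C int32_t two's-complement wraparound.
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x

elem_size = 4          # gdb: $2
this_dim = 771885848   # gdb: $3

wrapped = as_int32(elem_size * this_dim)  # -1207423904, matching gdb's $4
full = elem_size * this_dim               # 3087543392 bytes (~2.9 GB) in 64-bit
```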

@danpovey (Collaborator)

@armusc can you please make a PR?
