Basic information about Chinese #34
@lelechen63 in …
@lelechen63 Could it be understood that these two functions are two methods for extracting landmarks, and in …
@lelechen63 I am a bit confused about the landmarks. Does this parameter distinguish between training and testing? Is the PCA the same? U_lrw1.npy belongs to the training set; does the test set also have a U_lrw1_test.npy? When I looked at the source code, I found that both training and testing use U_lrw1.npy. Thanks.
The released model is trained on English, but it can be tested on any other language. The reason is that we treat the audio input as 0.04-second chunks, which are not sensitive to the linguistic (semantic) properties of the language.
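For illustration, here is a minimal sketch (not the repo's exact pipeline) of why 0.04 s is a natural chunk size: it corresponds to exactly one frame of 25 fps video. The sampling rate and array shapes below are assumptions.

```python
import numpy as np

SR = 16000                 # assumed audio sampling rate
CHUNK = int(0.04 * SR)     # 640 samples per 0.04 s window = one 25 fps frame

def audio_to_frames(wav: np.ndarray) -> np.ndarray:
    """Split a waveform into 0.04 s chunks, one per video frame (tail truncated)."""
    n_frames = len(wav) // CHUNK
    return wav[: n_frames * CHUNK].reshape(n_frames, CHUNK)

wav = np.random.randn(SR * 2).astype(np.float32)  # 2 s of dummy audio
frames = audio_to_frames(wav)
print(frames.shape)  # (50, 640) -> 50 video frames at 25 fps
```

At this granularity each chunk covers roughly a phoneme-scale slice of speech, which is why language-level semantics should not matter much.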
I will clean up the code again this month and will notify you once I finish. The main landmark-processing pipeline has two steps: 1. align the image using an affine transformation; 2. detect the landmarks. In the original code we had a third step, normalizing the landmarks, but this step is actually not needed.
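As an illustration of those two steps, here is a hedged sketch using dlib's 68-point predictor and OpenCV. The rotate-around-the-eye-centers alignment is a common recipe, not necessarily the repo's exact affine template, and the model path is a placeholder.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Placeholder path to dlib's standard 68-point landmark model.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def get_landmarks(gray: np.ndarray) -> np.ndarray:
    """Return the 68 (x, y) landmarks of the first detected face."""
    rect = detector(gray, 1)[0]          # assumes at least one face is found
    shape = predictor(gray, rect)
    return np.array([[p.x, p.y] for p in shape.parts()], dtype=np.float32)

def align_and_detect(img: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    pts = get_landmarks(gray)
    # Step 1: affine alignment -- rotate so the eye centers lie on a horizontal line.
    left_eye = pts[36:42].mean(axis=0)   # 68-point convention: 36-41 = left eye
    right_eye = pts[42:48].mean(axis=0)  # 42-47 = right eye
    angle = np.degrees(np.arctan2(right_eye[1] - left_eye[1],
                                  right_eye[0] - left_eye[0]))
    center = tuple(((left_eye + right_eye) / 2).astype(float))
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    aligned = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    # Step 2: detect landmarks again on the aligned image.
    return get_landmarks(cv2.cvtColor(aligned, cv2.COLOR_BGR2GRAY))
```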
The PCA for train and test is the same. The PCA parameters are extracted from the training set and can be used for any videos, including the test set or videos in the wild.
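To make this concrete, here is a minimal sketch of fitting PCA once on training landmarks and reusing the basis for any video; the component count and shapes are illustrative assumptions, and `U_lrw1.npy` is referenced only by analogy.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative training data: N samples of 68 (x, y) landmarks, flattened.
train_lm = np.random.randn(10000, 136)

pca = PCA(n_components=6)      # component count is an assumption
pca.fit(train_lm)
U = pca.components_            # the reusable basis, analogous to U_lrw1.npy
mean = pca.mean_

def project(lm: np.ndarray) -> np.ndarray:
    """Project flattened landmarks onto the train-set PCA basis."""
    return (lm - mean) @ U.T

def reconstruct(coeffs: np.ndarray) -> np.ndarray:
    """Map PCA coefficients back to landmark space."""
    return coeffs @ U + mean

test_lm = np.random.randn(5, 136)  # works for test or in-the-wild videos too
print(project(test_lm).shape)      # (5, 6)
```

Using one fixed basis is what keeps the coefficients comparable across the training set, the test set, and arbitrary new videos; there is no need for a separate U_lrw1_test.npy.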
Regarding your answer, can I understand it this way: the audio-to-lip mapping has nothing to do with the training language, and is instead determined by the model itself?
@lelechen63 Could you release the training parameters of the AT-net and VG-net? Otherwise, it is difficult for us to reproduce the results in the paper. Thanks.
@lelechen63 What are the meanings of …
I second this. @lelechen63
What is the meaning of …
Why is face normalization not needed? From my point of view, individual face shapes differ, and they also contain rotations (roll, yaw, pitch). None of these parameters is relevant to the audio input, so I'm wondering why normalization is not needed. Hope for your reply ^
Thanks for sharing your code. I ran a Chinese audio file with your demo, and the lips were not synchronized. Is there any solution? Do you plan to train your model on a Chinese lip dataset? Thanks.