Improving Gaze Detection #124
vladmandic started this conversation in Ideas
-
I just added code to calculate the gaze vector (line 15 in e0374f0). The results are definitely more useful than simple gaze recognition such as "looking center", although the same precision issues remain.
-
In short, Human's way of determining gaze direction is very similar to what you wrote in your email. In gesture.ts there is some simple math (this is a small part of it, just the looking left/right check; the math for looking up/down is separate). It uses res[i].mesh[33][0] and res[i].mesh[263][0], which are the x coordinates of the outermost points of each eye (you can see the point indexes in https://github.com/vladmandic/human/blob/main/assets/facemesh.png).
So if the ratio is more than 3% toward the left, it will report looking <direction> (I chose 3% as an empirical value). Can it be improved? Of course, feel free to suggest improvements.
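To make the math concrete, here is a minimal TypeScript sketch of that kind of ratio check. It is not the actual gesture.ts code; the irisCenterX input and the midpoint comparison are assumptions for illustration, with point indexes 33 and 263 taken from the facemesh chart linked above:

```ts
// Hypothetical sketch, not the actual gesture.ts implementation
type Point = [number, number, number];

function gazeLeftRight(mesh: Point[], irisCenterX: number): string {
  const leftX = mesh[33][0];                 // x of the outer corner of one eye
  const rightX = mesh[263][0];               // x of the outer corner of the other eye
  const span = Math.abs(rightX - leftX);     // distance between the outer eye corners
  if (span === 0) return 'looking center';   // degenerate detection, nothing to measure
  const mid = (leftX + rightX) / 2;          // midpoint between the corners
  const ratio = (irisCenterX - mid) / span;  // signed iris offset as a fraction of the span
  // 3% is the empirical threshold mentioned above; which sign maps to "left" vs "right"
  // depends on whether the input image is mirrored
  if (ratio < -0.03) return 'looking left';
  if (ratio > 0.03) return 'looking right';
  return 'looking center';
}
```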
A couple of ideas off the top of my head:
- For example, if the face is turned slightly to the right, it should only look at the left eye's results and not both eyes, since the right eye is clearly less visible. I've done a lot of work on that when analyzing face descriptors, and it can be re-used for face mesh and iris analysis as well.
- Btw, sometimes blurring is better than sharpening, as it reduces the jitter returned by argmax functions when analyzing heatmaps. Human already has functionality to do that; it's a question of experimenting and finding the best values.
- Cropping the eye region more tightly before it is passed to the iris model and then reconstructing the coordinates afterward would result in better precision from the iris model. Or, even more precise, do a double-pass: currently BlazeFace detects faces very loosely and then I crop the face and pass it to the mesh and iris models. Instead, the first pass of the mesh model could be used to find the extreme points of the face, then the face is re-cropped and passed again, now with more precision, to the mesh and iris models, and those keypoints are taken as the result (see the sketch after this list).
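As a rough illustration of that double-pass idea (helper names like runMeshModel are hypothetical, and I'm assuming the first-pass mesh keypoints are already in original-frame pixel coordinates), the second pass could look something like this:

```ts
import * as tf from '@tensorflow/tfjs';

type Point = [number, number, number];

// Bounding box of the first-pass mesh keypoints, padded and normalized to [y1, x1, y2, x2]
function boxFromMesh(mesh: Point[], frameWidth: number, frameHeight: number, pad = 0.1): [number, number, number, number] {
  const xs = mesh.map((p) => p[0]);
  const ys = mesh.map((p) => p[1]);
  const x1 = Math.max(0, Math.min(...xs) / frameWidth - pad);
  const y1 = Math.max(0, Math.min(...ys) / frameHeight - pad);
  const x2 = Math.min(1, Math.max(...xs) / frameWidth + pad);
  const y2 = Math.min(1, Math.max(...ys) / frameHeight + pad);
  return [y1, x1, y2, x2];
}

// Re-crop the original frame to the refined box and resize to the mesh/iris input size
function reCrop(frame: tf.Tensor4D, box: [number, number, number, number], inputSize = 192): tf.Tensor4D {
  return tf.image.cropAndResize(frame, [box], [0], [inputSize, inputSize]);
}

// Usage sketch (runMeshModel is a placeholder for the actual mesh/iris inference):
//   const box = boxFromMesh(firstPassMesh, frame.shape[2], frame.shape[1]);
//   const refined = reCrop(frame, box);
//   const secondPassMesh = await runMeshModel(refined); // take these keypoints as the result
```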
I'm also actively looking at alternatives at the moment. All of these would be welcome contributions to Human.
Also, I took a quick look at https://github.com/david-wb/gaze-estimation. It's a PyTorch model with a pretrained checkpoint, and there are several reasons why I wouldn't integrate it into Human:
- It's 30MB. I could probably strip and quantize it down to ~12MB, but that would still be double the size of the second largest model I'm using. In my other projects I use very large models (over 1GB stripped), but one of the key design goals for Human is to be fully portable, which means keeping the overall size to a minimum.
- Output layers are fetched from model internals by name, etc.
- The key operation that analyzes the heatmap returned by the model is done in Python, not in the model itself: for example, softargmax (https://github.com/david-wb/gaze-estimation/blob/master/util/softargmax.py).
I can handle all that by freezing the model, doing a static signature definition, and implementing the external functions in JS (a rough sketch of a JS soft-argmax is below), but for me it's not worth it since this is only a small item in Human.
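For illustration only, a minimal tfjs soft-argmax over a single heatmap could look like this; it is my own sketch of the general technique, not code from that repo or from Human:

```ts
import * as tf from '@tensorflow/tfjs';

// Expected [x, y] peak location of a [height, width] heatmap via soft-argmax:
// softmax over the flattened map, then the probability-weighted average of cell coordinates
function softArgmax2d(heatmap: tf.Tensor2D, beta = 100): [number, number] {
  const [height, width] = heatmap.shape;
  const [x, y] = tf.tidy(() => {
    const probs = tf.softmax(heatmap.flatten().mul(beta)); // beta sharpens toward a hard argmax
    const idx = tf.range(0, height * width);
    const xs = idx.mod(width);           // column index of each heatmap cell
    const ys = idx.div(width).floor();   // row index of each heatmap cell
    return [probs.mul(xs).sum().dataSync()[0], probs.mul(ys).sum().dataSync()[0]];
  });
  return [x, y];
}
```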
CC: @lghasemzadeh