10 bit colors to numpy #1714
-
Hi!

import av
import numpy as np

# Read HEVC video in yuv422p10le format
with av.open("video.mp4") as container:
    frame = next(container.decode(video=0))
    np_frame = frame.to_ndarray(format="rgb48")
    print(container.streams.video[0].pix_fmt)  # >>> yuv422p10le
    print(np_frame.dtype)  # >>> uint16
    print(np.unique(np_frame))  # >>> [ 0 1 2 ... 65498 65527 65535]
    print(len(np.unique(np_frame)))  # >>> 58702 (expected <= 1024 different values for yuv422p10le)

Thanks!
Replies: 3 comments 2 replies
-
The issue here is that you're converting the YUV data to RGB with to_ndarray(format="rgb48"). This conversion is what causes you to see more unique values than expected. Your original video is in yuv422p10le format, where each component does have 10-bit depth (values 0-1023).
To see the original 10-bit values, you should read the frame in its native YUV format, which I think is:

np_frame = frame.to_ndarray()
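To illustrate why the RGB conversion produces so many distinct values, here is a rough sketch. It is not what PyAV or swscale actually computes: it assumes full-range BT.709 coefficients and floating-point math, and the names (`red_16bit`, `KR_BT709`) are made up for the example. The point is only that each RGB channel mixes luma and chroma, so the output is not limited to 1024 values per channel.

```python
# Sketch: count distinct 16-bit R values produced from 10-bit Y and Cr samples.
# Assumptions (not from the thread): full-range BT.709 coefficients, float math.

MAX_10 = 1023      # largest 10-bit code value
MAX_16 = 65535     # largest 16-bit code value
KR_BT709 = 1.5748  # BT.709 weight of Cr in the R channel


def red_16bit(y: int, cr: int) -> int:
    """Map one 10-bit (Y, Cr) pair to a 16-bit R value."""
    r = y / MAX_10 + KR_BT709 * (cr / MAX_10 - 0.5)
    r = min(max(r, 0.0), 1.0)  # clip to the valid range
    return round(r * MAX_16)


# Every 10-bit luma value, combined with a coarse sweep of chroma values.
luma_values = range(MAX_10 + 1)
chroma_values = range(0, MAX_10 + 1, 16)

unique_red = {red_16bit(y, cr) for y in luma_values for cr in chroma_values}

print(f"{len(luma_values)} distinct Y inputs")
print(f"{len(unique_red)} distinct 16-bit R outputs")
```

Even with only a coarse chroma sweep, the R channel alone ends up with far more than 1024 distinct values, which matches the kind of count seen in the question.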
-
This is not yet implemented in PyAV, but you can reformat the frame to yuv444p16le and then view its planes as NumPy arrays.

import av
import numpy as np

# Read in yuv422p10le format
with av.open("video.mp4") as container:
    frame_10 = next(container.decode(video=0))
    # Reformat to 16 bits per component, without chroma subsampling
    frame_16 = frame_10.reformat(format="yuv444p16le")
    # Only the Y component (plane 0), read as 16-bit samples
    frame_16_np = np.frombuffer(frame_16.planes[0], dtype=np.uint16)
    print(f"there are {len(set(frame_16_np.tolist()))} different values in the frame (<2**10)")
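One caveat when reading a plane through np.frombuffer: FFmpeg may pad each row, so the row stride in bytes can be larger than width * 2, and the padding bytes would then be counted as pixel values. Below is a minimal sketch of a stride-aware view. `plane_to_array` is a hypothetical helper, not a PyAV function, and it assumes 16-bit little-endian samples and a buffer whose length is a multiple of the stride. It is demonstrated on synthetic bytes so it runs without a video file.

```python
import struct

import numpy as np


def plane_to_array(buffer, line_size: int, width: int, height: int) -> np.ndarray:
    """View a 16-bit little-endian plane as a (height, width) array.

    `line_size` is the row stride in bytes; it may exceed `width * 2`
    when rows are padded. (Hypothetical helper, not part of PyAV.)
    """
    bytes_per_sample = 2
    rows = np.frombuffer(buffer, dtype=np.uint8).reshape(-1, line_size)[:height]
    useful = rows[:, : width * bytes_per_sample]  # drop the padding bytes
    # Copy into contiguous memory so the bytes can be reinterpreted as uint16.
    return np.ascontiguousarray(useful).view("<u2")


# Synthetic 3x2 plane: 6 bytes of samples plus 2 bytes of padding per row.
padding = b"\xff\xff"
raw = struct.pack("<3H", 1, 2, 3) + padding + struct.pack("<3H", 4, 5, 6) + padding

y_plane = plane_to_array(raw, line_size=8, width=3, height=2)
print(y_plane)
```

With a real frame, the arguments would presumably come from the plane itself (plane.line_size, plane.width, plane.height), but I have not checked that against every PyAV version.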
-
Update: since version 14.1.0, the "yuv444p16le" format is supported.

import av
import numpy as np

# Read in yuv422p10le format
with av.open("video.mp4") as container:
    frame_10 = next(container.decode(video=0))
    frame_16_np = frame_10.to_ndarray(format="yuv444p16le")
    frame_10_np = frame_16_np >> (16 - 10)  # reverse the bit shift
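As a sanity check on the shift, here is a small sketch showing that dropping the 6 low bits recovers every 10-bit value. It assumes the 10-bit to 16-bit expansion was either a plain left shift or a bit replication; I have not verified which one swscale uses, and the function names are only illustrative.

```python
# Sketch: check that ">> 6" undoes two common 10-bit -> 16-bit expansions.
# Assumption: the expansion is one of the two below; the exact behaviour
# of swscale is not verified here.

SHIFT = 16 - 10  # number of bits added when going from 10 to 16 bits


def expand_shift(v: int) -> int:
    """Left shift: the 6 low bits are zero."""
    return v << SHIFT


def expand_replicate(v: int) -> int:
    """Bit replication: the top 6 bits are copied into the 6 low bits."""
    return (v << SHIFT) | (v >> (10 - SHIFT))


def to_10bit(v16: int) -> int:
    """Drop the 6 low bits, like `frame_16_np >> (16 - 10)` above."""
    return v16 >> SHIFT


values_10 = range(2**10)
shift_ok = all(to_10bit(expand_shift(v)) == v for v in values_10)
replicate_ok = all(to_10bit(expand_replicate(v)) == v for v in values_10)
print(shift_ok, replicate_ok)
```

Note that in Python `-` binds tighter than `>>`, so `x >> 16 - 10` already means `x >> (16 - 10)`; the parentheses only make that explicit.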