-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(feat): Windows support #85
Conversation
We still need to fix the path writers in the `PeriodConcatNode` as we use `os.path` for evaluating the new paths which will cause inserting backslashes in a file path that is considered a URL. we can possibly use `posixpath` module, but let's keep it like that till the multi-period concatenation is finished.
…pe into read and write pipes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great work! My main feedback is that I think it should be factored a little differently to encapsulate the "pipe" concept for all platforms, with platform-specific details hidden there.
run_end_to_end_tests.py
Outdated
npm_command = ' '.join(['npm', 'install']) | ||
subprocess.check_call(npm_command, shell=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use shell=True? If "npm" is in your PATH, then the old way should work.
Quoting the Python docs:
On Windows with shell=True, the COMSPEC environment variable specifies the default shell. The only time you need to specify shell=True on Windows is when the command you wish to execute is built into the shell (e.g. dir or copy). You do not need shell=True to run a batch file or console-based executable.
In general, I believe it's best practice to avoid shell=True when possible. Invoking a shell and passing all arguments as a single string can be problematic when the arguments contain spaces or other special characters. And if the arguments include user input, there is a risk of malicious inputs being used to run other commands (similar to an SQL injection attack). See also Python docs on "security considerations" for shell=True.
In this case, there's obviously no risk with shell=True, but I'd prefer to stick to shell=False whenever possible.
This reminds me, though, that we had a similar issue long ago in Shaka Player's build scripts (also written in Python). I think for windows, "npm" resolves to "npm.cmd" rather than "npm.exe", but subprocess won't try ".cmd" automatically like the Windows shell will. I believe we ended up with something similar to:
npm_exe = 'npm.cmd' if windows else 'npm'
subprocess.check_call([npm_exe, 'install'])
I found a stack overflow talking about the same thing: https://stackoverflow.com/a/50045443
It appears that the same technique is not required for "node", which is actually "node.exe" on Windows.
Please use something like that for now. I'm going to experiment with monkey-patching subprocess for Windows to find .cmd. It seems like someone should have done that by now, but I can't find it, so I'll write it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thaaaanks, It didn't make any sense to me why subprocess couldn't find it but i can invoke it from my cmd. thaaanks 😃
streamer/controller_node.py
Outdated
if os.name == 'posix': | ||
path = os.path.join(self._temp_dir, unique_name) | ||
readable_by_owner_only = 0o600 # Unix permission bits | ||
os.mkfifo(path, mode=readable_by_owner_only) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "type: ignore"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy complains on windows for this.
streamer/output_stream.py
Outdated
self.writ_pipe: Optional[str] = pipe | ||
# On posix systems, the read and write pipes will be the same. | ||
# But on Windows, we will have different read and write pipe names. | ||
if os.name == 'nt' and self.read_pipe and self.writ_pipe: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the os.name == 'nt' checks and path manipulations being spread around. This could be encapsulated into a class (called something like "Pipe"), and there could be "reader_path" and "writer_path" attributes on that class. For os.name == 'posix', they would be paths to FIFOs, and identical. For os.name == 'nt', they would be different, and would be built on your "winfifo" module. _create_pipe() would return one of these objects, and everywhere we used to use paths for pipes, we now could use a Pipe object.
Today, the end of the writer pipeline (PackagerNode) writes to a regular file. But it might be useful to avoid hard-coding knowledge of the termination of a pipeline into the nodes. For example, rather than PackagerNode taking Pipe objects for inputs and string paths for outputs, it could take a Pipe for both. The output Pipe for PackagerNode would be a variant where the writer_path was a plain file path.
Does this make sense?
And now that I'm reading through the WinFIFO class, I think it's not far off from what I'm suggesting. You could extend that class to cover the posix case, rename it, and pass it to the nodes in place of string paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need only the middle and the rightmost pipe in the blueprint up above, as the inputs
will represent leftmost pipe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have invested some time working on removing the 'file' terminology from the PackagerNode
and replacing it with an output pipe instead, but i am afraid that we might not be able to do that, since the output file is sometimes two and not one, and is a template for the initialization segment and other segments of the media file.
We might though be able to embed these knowledge in the OutputStream
class instead and let it deal with the pipe creation and file templates too.
streamer/winfifo.py
Outdated
# limitations under the License. | ||
|
||
from threading import Thread | ||
import win32pipe, win32file, pywintypes # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why type: ignore here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy complains on linux for this, also on windows, as there are no library type hints for these modules.
streamer/winfifo.py
Outdated
|
||
# The read pipe is connected to a writer process. | ||
self.read_side = win32pipe.CreateNamedPipe( | ||
WinFIFO.WRITER_PREFIX + self.pipe_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: indent 4 spaces when you put arguments on subsequent lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good. Nice work! Just a few requests, which I'm happy to discuss more if you disagree.
setup.py
Outdated
import setuptools | ||
|
||
import streamer | ||
|
||
with open('README.md', 'r') as f: | ||
long_description = f.read() | ||
|
||
install_prerequisites = ['PyYAML'] | ||
if os.name == 'nt': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will only run on the system where we build the package, which will usually be Linux. Checking os.name
here won't work, because no matter what platform we install it on later, this part is only run once when we build the package.
Instead, to express a platform-specific dependency, I just found out there is a syntax for that:
'pypiwin32 ; platform_system=="Windows"'
streamer/output_stream.py
Outdated
MediaType.VIDEO: '{dir}/video_{resolution_name}_{bitrate}_{codec}_init.{format}', | ||
MediaType.TEXT: '{dir}/text_{language}_init.{format}', | ||
} | ||
path_templ = INIT_SEGMENT[self.type].format(**self._features, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think I would prefer the complete words, for clarity: path_template
But then I look again and realize it's not a template, it's an actual path. The template was the value from the dictionary, but after format(), you get a concrete path. So how about path
?
streamer/input_configuration.py
Outdated
"""Get the path which the transcoder will use to read the input. | ||
|
||
For some input types, this is a named pipe. For others, this is .name. | ||
def reset_name(self, pipe: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Now that we have a Pipe class, calling this param pipe
makes it sound like it should be a Pipe object. How about path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I overlooked this one.
since it's a string and not a pipe object, i agree with you it should be called path
instead.
but for clarity - as this path is a path of a pipe and not a file on the file system - let's call it pipe_path
.
return Pipe.create_file_pipe(path_templ, mode='w') | ||
|
||
def get_media_seg_file(self, **kwargs) -> Pipe: | ||
MEDIA_SEGMENT = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that you decided to move these here. I imagine this will make some concat tasks easier, since you can get the output path.
streamer/pipe.py
Outdated
self._write_pipe_name = '' | ||
|
||
# Windows specific declarations. | ||
self.buf_size = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like this could also be private. Or is it used outside the class somewhere I overlooked?
streamer/pipe.py
Outdated
def create_ipc_pipe(temp_dir: str, suffix: str = '') -> 'Pipe': | ||
"""A static method used to create a pipe between two processes. | ||
|
||
On POXIS systems, it creates a named pipe using `os.mkfifo`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: POSIX
streamer/pipe.py
Outdated
elif mode == 'r': | ||
pipe._write_pipe_name = path | ||
else: | ||
raise RuntimeError('{} is not a valid file mode'.format(mode)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's unlikely, but in case someone passes a mode like 'rw' or 'wb' or 'a+', which are all valid modes for real files, let's change this to say something like "not a valid mode for Pipe".
streamer/pipe.py
Outdated
import win32pipe # type: ignore | ||
# Initialize the pipe object as a thread with `daemon` attribute set | ||
# to True so that the thread shuts down when the caller thread exits. | ||
Thread.__init__(pipe, daemon=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels kind of hacky to me to suddenly make Pipe a Thread. This feels like it should be more of a "has-a" relationship than "is-a". It appears that the run() method only uses three members of "Pipe": _read_side, _write_side, and buf_size. What about something like:
self._thread = Thread(
target=Pipe._thread_fn,
args=(self._read_side, self._write_side, self._buf_size),
daemon=True)
self._thread.start()
You could rename run()
to @staticmethod _thread_fn(read_side, write_side, buf_size)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes more sense since the Pipe
object now represents a POSIX/Win pipe.
The new changes tested on Windows. Tests passed :). |
scripts=['shaka-streamer'], | ||
classifiers=[ | ||
'Programming Language :: Python :: 3', | ||
'License :: OSI Approved :: Apache Software License', | ||
'Operating System :: POSIX :: Linux', | ||
'Operating System :: MacOS :: MacOS X', | ||
'Operating System :: Microsoft :: Windows', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove the OS classifiers now since we already support the three of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find any docs from pypi indicating one way or the other is best. Let's leave them for now. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for this! It's a big win for the project.
scripts=['shaka-streamer'], | ||
classifiers=[ | ||
'Programming Language :: Python :: 3', | ||
'License :: OSI Approved :: Apache Software License', | ||
'Operating System :: POSIX :: Linux', | ||
'Operating System :: MacOS :: MacOS X', | ||
'Operating System :: Microsoft :: Windows', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find any docs from pypi indicating one way or the other is best. Let's leave them for now. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, looks like there are a couple of merge conflicts. Can you please address those? Thanks!
resolving merge conflicts
No description provided.