Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(feat): Windows support #85

Merged
merged 24 commits into from
Aug 4, 2021
Merged

Conversation

mariocynicys
Copy link
Member

No description provided.

We still need to fix the path writers in the `PeriodConcatNode` as we use `os.path` for evaluating the new paths which will cause inserting backslashes in a file path that is considered a URL.

we can possibly use `posixpath` module, but let's keep it like that till the multi-period concatenation is finished.
Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work! My main feedback is that I think it should be factored a little differently to encapsulate the "pipe" concept for all platforms, with platform-specific details hidden there.

Comment on lines 274 to 275
npm_command = ' '.join(['npm', 'install'])
subprocess.check_call(npm_command, shell=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use shell=True? If "npm" is in your PATH, then the old way should work.

Quoting the Python docs:

On Windows with shell=True, the COMSPEC environment variable specifies the default shell. The only time you need to specify shell=True on Windows is when the command you wish to execute is built into the shell (e.g. dir or copy). You do not need shell=True to run a batch file or console-based executable.

In general, I believe it's best practice to avoid shell=True when possible. Invoking a shell and passing all arguments as a single string can be problematic when the arguments contain spaces or other special characters. And if the arguments include user input, there is a risk of malicious inputs being used to run other commands (similar to an SQL injection attack). See also Python docs on "security considerations" for shell=True.

In this case, there's obviously no risk with shell=True, but I'd prefer to stick to shell=False whenever possible.

This reminds me, though, that we had a similar issue long ago in Shaka Player's build scripts (also written in Python). I think for windows, "npm" resolves to "npm.cmd" rather than "npm.exe", but subprocess won't try ".cmd" automatically like the Windows shell will. I believe we ended up with something similar to:

npm_exe = 'npm.cmd' if windows else 'npm'
subprocess.check_call([npm_exe, 'install'])

I found a stack overflow talking about the same thing: https://stackoverflow.com/a/50045443

It appears that the same technique is not required for "node", which is actually "node.exe" on Windows.

Please use something like that for now. I'm going to experiment with monkey-patching subprocess for Windows to find .cmd. It seems like someone should have done that by now, but I can't find it, so I'll write it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thaaaanks, It didn't make any sense to me why subprocess couldn't find it but i can invoke it from my cmd. thaaanks 😃

if os.name == 'posix':
path = os.path.join(self._temp_dir, unique_name)
readable_by_owner_only = 0o600 # Unix permission bits
os.mkfifo(path, mode=readable_by_owner_only) # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "type: ignore"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy complains on windows for this.

self.writ_pipe: Optional[str] = pipe
# On posix systems, the read and write pipes will be the same.
# But on Windows, we will have different read and write pipe names.
if os.name == 'nt' and self.read_pipe and self.writ_pipe:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the os.name == 'nt' checks and path manipulations being spread around. This could be encapsulated into a class (called something like "Pipe"), and there could be "reader_path" and "writer_path" attributes on that class. For os.name == 'posix', they would be paths to FIFOs, and identical. For os.name == 'nt', they would be different, and would be built on your "winfifo" module. _create_pipe() would return one of these objects, and everywhere we used to use paths for pipes, we now could use a Pipe object.

Today, the end of the writer pipeline (PackagerNode) writes to a regular file. But it might be useful to avoid hard-coding knowledge of the termination of a pipeline into the nodes. For example, rather than PackagerNode taking Pipe objects for inputs and string paths for outputs, it could take a Pipe for both. The output Pipe for PackagerNode would be a variant where the writer_path was a plain file path.

Does this make sense?

And now that I'm reading through the WinFIFO class, I think it's not far off from what I'm suggesting. You could extend that class to cover the posix case, rename it, and pass it to the nodes in place of string paths.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i understood that correctly, you mean to put a pipe before and after each node in the pipe line, like this.
image

So we would make a new file pipe.py and let all the platform diversion logic be in there, and also the pipe path naming(could be a file).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will need only the middle and the rightmost pipe in the blueprint up above, as the inputs will represent leftmost pipe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have invested some time working on removing the 'file' terminology from the PackagerNode and replacing it with an output pipe instead, but i am afraid that we might not be able to do that, since the output file is sometimes two and not one, and is a template for the initialization segment and other segments of the media file.
We might though be able to embed these knowledge in the OutputStream class instead and let it deal with the pipe creation and file templates too.

# limitations under the License.

from threading import Thread
import win32pipe, win32file, pywintypes # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why type: ignore here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy complains on linux for this, also on windows, as there are no library type hints for these modules.


# The read pipe is connected to a writer process.
self.read_side = win32pipe.CreateNamedPipe(
WinFIFO.WRITER_PREFIX + self.pipe_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: indent 4 spaces when you put arguments on subsequent lines

Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good. Nice work! Just a few requests, which I'm happy to discuss more if you disagree.

setup.py Outdated
import setuptools

import streamer

with open('README.md', 'r') as f:
long_description = f.read()

install_prerequisites = ['PyYAML']
if os.name == 'nt':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will only run on the system where we build the package, which will usually be Linux. Checking os.name here won't work, because no matter what platform we install it on later, this part is only run once when we build the package.

Instead, to express a platform-specific dependency, I just found out there is a syntax for that:

'pypiwin32 ; platform_system=="Windows"'

See https://stackoverflow.com/a/42946495

MediaType.VIDEO: '{dir}/video_{resolution_name}_{bitrate}_{codec}_init.{format}',
MediaType.TEXT: '{dir}/text_{language}_init.{format}',
}
path_templ = INIT_SEGMENT[self.type].format(**self._features, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think I would prefer the complete words, for clarity: path_template

But then I look again and realize it's not a template, it's an actual path. The template was the value from the dictionary, but after format(), you get a concrete path. So how about path?

"""Get the path which the transcoder will use to read the input.

For some input types, this is a named pipe. For others, this is .name.
def reset_name(self, pipe: str) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Now that we have a Pipe class, calling this param pipe makes it sound like it should be a Pipe object. How about path?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I overlooked this one.
since it's a string and not a pipe object, i agree with you it should be called path instead.
but for clarity - as this path is a path of a pipe and not a file on the file system - let's call it pipe_path.

return Pipe.create_file_pipe(path_templ, mode='w')

def get_media_seg_file(self, **kwargs) -> Pipe:
MEDIA_SEGMENT = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that you decided to move these here. I imagine this will make some concat tasks easier, since you can get the output path.

streamer/pipe.py Outdated
self._write_pipe_name = ''

# Windows specific declarations.
self.buf_size = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me like this could also be private. Or is it used outside the class somewhere I overlooked?

streamer/pipe.py Outdated
def create_ipc_pipe(temp_dir: str, suffix: str = '') -> 'Pipe':
"""A static method used to create a pipe between two processes.

On POXIS systems, it creates a named pipe using `os.mkfifo`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: POSIX

streamer/pipe.py Outdated
elif mode == 'r':
pipe._write_pipe_name = path
else:
raise RuntimeError('{} is not a valid file mode'.format(mode))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's unlikely, but in case someone passes a mode like 'rw' or 'wb' or 'a+', which are all valid modes for real files, let's change this to say something like "not a valid mode for Pipe".

streamer/pipe.py Outdated
import win32pipe # type: ignore
# Initialize the pipe object as a thread with `daemon` attribute set
# to True so that the thread shuts down when the caller thread exits.
Thread.__init__(pipe, daemon=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels kind of hacky to me to suddenly make Pipe a Thread. This feels like it should be more of a "has-a" relationship than "is-a". It appears that the run() method only uses three members of "Pipe": _read_side, _write_side, and buf_size. What about something like:

self._thread = Thread(
    target=Pipe._thread_fn,
    args=(self._read_side, self._write_side, self._buf_size),
    daemon=True)
self._thread.start()

You could rename run() to @staticmethod _thread_fn(read_side, write_side, buf_size).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes more sense since the Pipe object now represents a POSIX/Win pipe.

@mariocynicys
Copy link
Member Author

The new changes tested on Windows. Tests passed :).

setup.py Show resolved Hide resolved
scripts=['shaka-streamer'],
classifiers=[
'Programming Language :: Python :: 3',
'License :: OSI Approved :: Apache Software License',
'Operating System :: POSIX :: Linux',
'Operating System :: MacOS :: MacOS X',
'Operating System :: Microsoft :: Windows',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the OS classifiers now since we already support the three of them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any docs from pypi indicating one way or the other is best. Let's leave them for now. Thanks!

Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for this! It's a big win for the project.

scripts=['shaka-streamer'],
classifiers=[
'Programming Language :: Python :: 3',
'License :: OSI Approved :: Apache Software License',
'Operating System :: POSIX :: Linux',
'Operating System :: MacOS :: MacOS X',
'Operating System :: Microsoft :: Windows',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any docs from pypi indicating one way or the other is best. Let's leave them for now. Thanks!

Copy link
Member

@joeyparrish joeyparrish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, looks like there are a couple of merge conflicts. Can you please address those? Thanks!

@joeyparrish joeyparrish merged commit 9a7e0ec into shaka-project:master Aug 4, 2021
@mariocynicys mariocynicys deleted the windows-new branch August 5, 2021 13:06
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Jul 25, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: archived Archived and locked; will not be updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants