-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
select.lua: select from the watch history with g-h #15655
base: master
Are you sure you want to change the base?
Conversation
Download the artifacts for this pull request: |
DOCS/man/mpv.rst
Outdated
@@ -328,6 +328,12 @@ g-l | |||
g-d | |||
Select an audio device. | |||
|
|||
g-h | |||
Select a file from the watch history. Requires | |||
``--script-opt=select-save_watch_history=yes``. The history path is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be native option. The fact that it is implemented in select.lua shouldn't force use to use --script-opt=select-
which is little bit verbose. Also I consider script-opts to something that should be used to already specific script behavior. Here we have whole separate feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean only save_watch_history
should be an option but watch_history_path
and any future config should be a script-opt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I mean all options should be an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also on that note, I think the history saving logic should be moved to own script, select should be left for selection only imho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The saving part is just 17 lines out of 85 though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I felt that it didn't make sense to separate them when select.lua has to be perfectly aware of which format it was saved in right above
player/lua/select.lua
Outdated
@@ -18,6 +18,13 @@ License along with mpv. If not, see <http://www.gnu.org/licenses/>. | |||
local utils = require "mp.utils" | |||
local input = require "mp.input" | |||
|
|||
local options = { | |||
save_watch_history = false, | |||
watch_history_path = "~~state/watch_history.txt", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For symmetry with watch_later
, maybe don't use .txt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
watch_later
is a directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh you meant the files inside watch_later
. Dunno I think those are a weird case. I think the txt extension makes it easy to open it in a text editor by default e.g. in Windows or with zsh suffix aliases which are what I use. I also used console_history.txt
in the other PR so that would have to be changed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, maybe use .json
or .jsonl
in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well as I wrote in the commit message I didn't use jsonl only because I can't figure out whether ndjson or jsonl is preferable. Will switch to jsonl.
player/lua/select.lua
Outdated
local path = mp.command_native({"normalize-path", mp.get_property("path")}) | ||
local title = mp.get_property("playlist/" .. mp.get_property("playlist-pos") .. "/title") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move those after if not history_file
check.
fd03b7c
to
d80ee92
Compare
This works well. Comparing it to memo.lua which is a history script. It appends at end of file. When items hit around 400 history entries or so it won't open in one or two seconds. Sometimes it takes 5 seconds or more. So I trim the history file just keeping the last 333 entries.
It doesn't show duplicates which is fine. However, a suggestion I have if it doesn't create a problem with long text that won't fit on the osd that is. Not sure if it's wrapped or what. Is to add the current chapter within the file appending it after a # symbol. In the bookmarker-menu.lua I modified it to save bookmarks with the current chapter. This will help differentiate rather than just timestamps or time position to indicate where one is within the file.
input.conf
mpv.conf
|
This has a clear history entry and is usable with tens of thousands of lines, in fact I was testing the performance of searching in 150k lines yesterday and it was still usable if we switch to the C version of fzy, though This doesn't trim duplicates but it doesn't really have duplicates as files are logged with different timestamps every time which we can be useful to check. Text longer than the OSD is cropped. This appends to history when files are opened so there is no chapter or time position information at all. We have watch later files for resuming from the last position, so I assume you were talking about the watch later selection? That also doesn't show any time position, let alone chapters. I don't know if showing the time position before resuming is needed, I have never felt the need for that as it automatically resumes from that position. If you mean to use it for bookmarks of scenes watch later is not suitable for that since config files are deleted on resume, EDL is. |
423c3ba
to
30e0c38
Compare
sounds great performance wise. "Clear history" sounds better than "Clear the history" Oh wasn't aware time it appends upon open so yeah that's a no-go like you say. So watch-later probably wouldn't work since path is there already and text would be too long. Forgot also on memo.lua I add chapters to the history entries
|
If opening is slow, we should consider not using json. It brings no value. Simple separated values in each line will work the same and avoid parsing json.
I'd prefer for it to be de-duplicated in select, else searching for a file, with list this file with multiple dates, which makes searching history more troublesome. |
I don't have a real stake in this, but since it already came up before, IMO the range of input sizes should be discussed, and agreed upon, because this can affect the solution. Personally I don't care about 150K lines of anything. I don't use playlists with more than 20 items, I don't have thousands of tracks I need to choose from, etc. I'm using my own console with my own history implementation (deduplicatd, recent-sorted), and I know that it just doesn't keep growing beyond orders of magnitude less than 150K, despite me using it very frequently for many years now, without cleaning up the history. But maybe there could be cases of many many thousands of entries, like playlists, etc, that people agree should be handled well, and in that case the target sizes should be agreed. If there's an agreement that such sizes are required to be handled well, then IMO more appropriate solutions should also be considered, like sqlite, or some performant key-value store etc. Maybe specifically 150K can still be handled carefully without external tools/libraries, also on the RPI, but at some stage, if the size being mentioned keeps growing - as it seems to be happening all the time, it can becomes too big to keep handling it internally in some naive fashion IMO. |
How can loading 150k lines not be slow JSON or not? CSV will not work or is not simple when paths or titles contain newlines, and a new parser may be slower if not written in C like |
Well, there are databases that processes billions of rows fast. It's all matter of perspective. You brought up that it is slow. So I'm offering a different view on this. Whether we want to keep it slow is another question.
There is nothing wrong with C. Whole mpv is written in C. watch-later is also C and I believe it correctly escapes filenames, no? EDIT: Also loading a full file on every search is slow, which could be easily avoided. Again the question is how much slowness is acceptable. |
Databases can search through billions of rows quickly with indexes but if you actually retrieve all of them at once to use them somewhere else that is slow. watch later indeed escapes incorrectly since it converts newlines to _, such files don't work in the watch later selector. There was also an issue about cookie options with newlines not working in watch later files. Added deduplication. |
Should this be fixed? |
player/lua/select.lua
Outdated
for i, seen_path in ipairs(paths) do | ||
if seen_path == path then | ||
table.remove(items, i) | ||
table.remove(paths, i) | ||
break | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this n^2 search will be slow. How about we keep a map for such that map[path] = i
and get i
if exist, and instead of removing, just update the items[i]
with the new date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that at first but it doesn't work because i is updated as you prepend items. Also I tried doing this with the console history and it was actually much slower than looping with huge histories, maybe huge tables with long string keys cause issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested again with 100k+ lines and performance is almost the same between looping and keeping a map. Also updating items[i]
as you said doesn't work because the new watched entry should be at the beginning. So while with the loop so we can just prepend items, with the map we have to decrement every i
in the map after the removed one and revert tables at the end, that's probably slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless, I would still like it to be gone from reading loop.
How about we add after for line in history_file:lines()
local seen = {}
local _items = {}
local _paths = {}
for i, path in ipairs(paths) do
if not seen[path] then
seen[path] = true
table.insert(_items, items[i])
table.insert(_paths, path)
end
end
EDIT: wait, I'm still thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should really use a BST to store history in this case since you want the history to be always sorted by date with frequent search/insert/delete. You can keep a map such that map[path] = date
and then finding the entry with such date and moving it to the front would be only O(log n)
instead of O(n)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested that this is only marginally faster with 100k unique lines and 50% faster by repeatedly concating 600 history lines until you get a 100k lines file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind I was still prepending to the tables. It's 3 faster even with unique lines by just replacing prepending with appending.
Ignore my wrong measurements, this is over 40 times faster than before with unique lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I like simple "append only" file, because we don't need to modify the rest of it.
Isn't modification needed if you want history deduplication at runtime without reloading the history file? That's where BST is useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it is reloaded on each search.
Do you think added complexity is worth it? After first file load it would be cached by the kernel anyway, so consecutive loads should be fast, except maybe next time after it is evicted. Keeping local state of history is valid, but not sure it is worth the added complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends whether if the current performance is good enough. I only mentioned the "big O" because there seemed to be some performance problems with a very large history file.
It seems the author already had a fix in #13016, I didn't see that before. Though paths in comments would have to be fixed separately from options, select.lua doesn't have access to the config file parser. At least with JSON we get it working for free. |
DOCS/man/options.rst
Outdated
|
||
``--save-watch-history`` | ||
Whether to save which files are played (default: no). These can be then | ||
selected with the default ``g-h`` key binding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
documenting it like this might make people think it's an option built into the mpv core, so it should say that it belongs to select.lua
(or at least that it requires Lua)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it could have been kept as a script-opt if that's an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it an issue? It's in implementation detail how it's implemented imho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I think it would be nice to know what exactly you lose when you disable Lua, but it's not of high importance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should remain a script-opt. No other core options do nothing, like this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as many other options that do nothing if mpv is compiled without some library dependency.
See #14624: the direction is that core options should either warn or don't exist at all when compiled without dependency. And there will be even more complexity when dealing with this history feature which can be unloaded at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to add a 20 line script just to append to the history? Since the writing code is 20 lines vs 90 lines for the selection code, the selection code needs to know exactly how the writing code above formats the JSON, and they both needed the same watch_history_path
script-opt, I thought they might as well be together. And if not using global options it would have been very confusing if users have to configure script-opts related to saving the history in history.lua and script-opts related to reading and formatting the history in select.lua, and that 1 of the 2 scripts has to call read_options
with the name of the other script to get shared script-opts like watch_history_path
. Not sane.
select.lua doesn't really handle selection, it gathers and formats data to be selected in console. I don't see how gathering and formatting history is different from any of its other selectors.
I don't understand what not overriding OSD styles has to do with options vs script-opts for these history ones. I made scripts respect existing OSD options when feasible to not make users configure the same thing multiple times. Actually this does exactly the same thing with --osd-playlist-entry
. But IMO it makes sense that options that only have effect within a script are its script-opts, it makes it clear what script they affect, what script you need to edit if you want to alter the behavior, and that you lose that functionality by disabling Lua. It also avoids bloating global options if this gains several script-opts in the future, imagine if every OSC script-opt was a global option. Maybe there is value in having easier to type shortcuts like we already have in --ytdl-format
and --ytdl-raw-options
but dunno if that's worth losing clarity and consistency. At that point if we want to make it more usable a better way would be some GUI button that appends the option to enable history in mpv.conf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe only the writing part can be converted to C since that's small, after mp_delete_watch_later_conf()
. That provides a reason to use global options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need to put an end to this ongoing debate about Lua vs. C implementation.
This issue came up in the other PR as well. Initially, it was implemented in C, then it was requested to be reimplemented in Lua. Afterward, a major concern arose: "What if Lua isn't enabled?"... can we not troll ourselves again?
We need to make a clear decision: either we allow "core" functionalities to be implemented in Lua, or we don't. We can't keep standing in the middle of the road, indecisive, every time this comes up.
In my personal opinion which may not reflect opinion of others. I think implementing features like history in Lua is fine and many of mpv features could be implemented as such. It is also valid to implement it in C, in this cases, there is not really a difference, as the code to save/load watch-history is trivial in both cases (it's calling our json parser anyway). I'm fine with having it in Lua. But at the same time, I think it should be considered "core" option, meaning options should be top-level, not script-opt
, as the fact it is implemented in Lua shouldn't dictate the user interface to configure such feature. And if this is a problem, we should stop implementing features in Lua.
I'm fine with the current state of this PR. I don't see technical issue with the current implementation. It's small, works. But if others don't agree on this, we will need to think of other implementation. But please make it clear that we don't want to use Lua, so we don't make this mistake again in the future.
The only concern about Lua implementation is that, each new script spawns new mpv_client
with own event loop, which may not be a big deal, but the constant overhead is there. I agree though that putting everything in select.lua is bit out of place. But in the same time, spawning clients for 30 lines scripts is little bit meh.
EDIT: Also lua code will be bigger than compiled C code in final binary. So there is that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe only the writing part can be converted to C since that's small
Yeah, maybe. I think in this case small save function in C will resolve all concerns about options and need for separate history.lua and select.lua will have only selector, same as for watch-later. The save code in C will be less than it is in Lua actually...
02ef821
to
6afdfe2
Compare
382af52
to
d0e0679
Compare
history file may expose privacy-sensitive information, request to add a blacklist option to prevent some files from being stored in the history file, the syntax can be consistent with ytdl_hook-exclude . |
You can use something like
|
We can add Incognito profile
in |
Just another example of excluding images from history and write-watch mpv.conf [history watch images] |
At least a privacy warning needs to be added to the --save-watch-history option in the manual, just like the --write-filename-in-watch-later-config option. |
5fd7495
to
a2fdb36
Compare
We can add a note, but I think it is self-explanatory that watch history... keeps watch history. |
This will be used in the next commit.
player/loadfile.c
Outdated
json_write(&dst, &node); | ||
dst = talloc_strdup_append(dst, "\n"); | ||
|
||
fwrite(dst, strlen(dst), 1, history_file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could check return value and show error/warning if it fails.
player/lua/select.lua
Outdated
local time, path, title = unpack(entry) | ||
|
||
if not path then | ||
mp.msg.warn(line .. " in " .. history_file_path .. " has invalid data.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this in else branch of if entry then
in main loop. And instead printing line
, which will leak history in log and also if it is malformed it may not be printable, I would log i
, number of the line that fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The else branch also runs if the entry is well formed but an already seen path.
player/lua/select.lua
Outdated
for i, entry in pairs(entries) do | ||
local item = entry.path | ||
if entry.title and osd_playlist_entry == "title" then | ||
item = entry.title | ||
elseif entry.title and osd_playlist_entry == "both" then | ||
item = entry.title .. " (" .. entry.path .. ")" | ||
end | ||
items[i] = entry.date .. item | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this back it the main loop, can be inside if entry then
branch. I will see about making it more generic in another PR.
player/loadfile.c
Outdated
} | ||
|
||
char *title = (char *)find_non_filename_media_title(mpctx); | ||
mpv_node items[3] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mpv_node items[3] = { | |
mpv_node items[] = { |
return; | ||
|
||
void *ctx = talloc_new(NULL); | ||
char *history_path = mp_get_user_path(ctx, mpctx->global, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be null when no-config
is used. Or empty string, either way it has to be checked, before fopen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is watch_history.jsonl
with --no-config --save-watch-history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what happens it saves history in cwd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
player/lua/select.lua
Outdated
elseif entry.title and osd_playlist_entry == "both" then | ||
item = entry.title .. " (" .. entry.path .. ")" | ||
end | ||
items[i] = entry.date .. item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
items[i] = entry.date .. item | |
local status, date = pcall(os.date, "(%Y-%m-%d %H:%M) ", time) | |
items[i] = (status and date or entry.time) .. item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is showing UNIX timestamps to end users useful? It would only cause confusion.
player/lua/select.lua
Outdated
return { | ||
["date"] = date, | ||
["path"] = path, | ||
["title"] = title, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return { | |
["date"] = date, | |
["path"] = path, | |
["title"] = title, | |
} | |
return { | |
time = entry[1], | |
path = entry[2], | |
title = entry[3], | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or whatever, keep it as simple as possible.
The history could be formatted as CSV, but this requires escaping the separator in the fields and doesn't work with paths and titles with newlines, or as JSON, but it is inefficient to reread and rewrite the whole history on each new file, and doing so overwrites the history with an empty file when writing without disk space left. So this uses a hybrid of one JSON array per line to get the best of both worlds. This is called NDJSON or JSONL.
player/lua/select.lua
Outdated
local osd_playlist_entry = mp.get_property("osd-playlist-entry") | ||
|
||
for i = #lines, 1, -1 do | ||
error_message = "Line " .. i .. " of " .. history_file_path .. " is malformed." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about
error_message = "Line " .. i .. " of " .. history_file_path .. " is malformed." | |
error_message = error_message = history_file_path .. ": Parse error at line " .. i |
Implement selection of the entries in the watch history. --osd-playlist-entry determines whether titles and/or filenames are shown. But unlike in show-text ${playlist} and select-playlist, "file" and "both" print full paths because history is much more likely to have files from completely different directories, so showing the directory conveys where files are located. This is particularly helpful for filenames like 1.jpg. The last entry in the selector deletes the history file, as requested by Samillion.
Implement saving watched paths and selecting them.
--osd-playlist-entry determines whether titles and/or filenames are shown. But unlike in show-text ${playlist} and select-playlist, "file" and "both" print full paths because history is much more likely to have files from completely different directories, so showing the directory conveys where files are located. This is particularly helpful for filenames like 1.jpg.
The last entry in the selector deletes the history file, as requested by Samillion.
The history could be formatted as CSV, but this requires escaping the separator in the fields and doesn't work with paths and titles with newlines, or as JSON, but it is inefficient to reread and rewrite the whole history on each new file, and doing so overwrites the history with an empty file when writing without disk space left. I went with an hybrid of one JSON array per line to get the best of both worlds. And I discovered afterwards that this was an existing thing called NDJSON or JSONL. Since there are these 2 competing standards it is not clear if the file extension should be ndjson or jsonl, so I just used txt.
watch_history_path is awkwardly documented along with the key binding because I don't think it's worth adding a select.lua section to the manual just for this. I will add it and move it there if I add more script-opts in the future.