See the hooks heading below.
- State “CANCELLED” from “TODO” [2016-08-17 Wed 22:32]
The value of point or the current buffer may have changed in the meantime. We will leave this up to the user.
properly and the directory is empty.
- State “DONE” from “TODO” [2016-09-11 Sun 20:22]
See org-board-wget-call.
- State “DONE” from “TODO” [2016-08-17 Wed 22:24]
- State “DONE” from “TODO” [2016-08-17 Wed 22:24]
Say I’m downloading from Google Cache. It doesn’t allow a wget user agent, so we need to use the `alist’ option “No-Agent”. A pre-processing hook could handle this so that you don’t even have to add the WGET_OPTIONS value yourself: it would be added for you. (A sketch follows.)
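A sketch of such a hook, assuming org-board exposes a pre-processing hook that runs before wget is called (the hook name below is a guess) and keeps URLs and options in the URL and WGET_OPTIONS multivalued properties:

#+BEGIN_SRC emacs-lisp
(require 'seq)

(defun my/org-board-maybe-add-no-agent ()
  "If any URL in the entry points at Google Cache, add the
\"No-Agent\" key to WGET_OPTIONS so the agent alist is consulted.
Assumes point is on the entry about to be archived."
  (let ((urls (org-entry-get-multivalued-property (point) "URL")))
    (when (seq-some (lambda (u)
                      (string-match-p "webcache\\.googleusercontent\\.com" u))
                    urls)
      (org-entry-add-to-multivalued-property
       (point) "WGET_OPTIONS" "No-Agent"))))

;; `org-board-wget-before-hook' is a hypothetical hook name; org-board
;; would have to provide (and run) such a hook for this to take effect.
(add-hook 'org-board-wget-before-hook #'my/org-board-maybe-add-no-agent)
#+END_SRC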
This would be for, say, auto-downloading videos from YouTube.
- State “DONE” from “TODO” [2016-08-17 Wed 22:24]
Done by showing which org entry it belonged to (using process-put).
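For reference, a minimal illustration of the technique, not org-board’s actual code: `process-put’ attaches arbitrary data to the process object, and the sentinel reads it back with `process-get’.

#+BEGIN_SRC emacs-lisp
(let ((proc (start-process "wget" "*wget*" "wget" "http://www.example.com")))
  ;; Remember which entry this process belongs to.
  (process-put proc 'org-entry (point-marker))
  (set-process-sentinel
   proc
   (lambda (p event)
     (message "wget for the entry at %s finished: %s"
              (process-get p 'org-entry) event))))
#+END_SRC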
- State “DONE” from “TODO” [2016-08-17 Wed 22:25]
- State “DONE” from “TODO” [2016-08-17 Wed 22:23]
- State “DONE” from “TODO” [2016-08-18 Thu 14:56]
Use something like `find’ in dired.
Steps:
- Take the most recent ARCHIVED_AT folder.
- Run something like `find . -name "*.html"’. If none, open the folder; if one, open it; if several, offer them up for choice. With a prefix argument, open in Emacs. (A sketch follows below.)
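A sketch of those steps, assuming each archive is a timestamped subdirectory of the entry’s attachment directory (the function name is made up):

#+BEGIN_SRC emacs-lisp
;; -*- lexical-binding: t -*-
(defun my/org-board-open-most-recent ()
  "Open the most recent archive under the current entry.
Sketch only: assumes archives are timestamped subdirectories of
the entry's `org-attach' directory."
  (interactive)
  (let* ((folders (directory-files (org-attach-dir) t "^[0-9]"))
         (dir (car (last (sort folders #'string<)))) ; most recent timestamp
         (htmls (directory-files-recursively dir "\\.html?$")))
    (cond ((null htmls)       (find-file dir))                  ; none: open the folder
          ((null (cdr htmls)) (browse-url-of-file (car htmls))) ; exactly one
          (t (browse-url-of-file
              (completing-read "Open archived file: " htmls nil t))))))
#+END_SRC

With a prefix argument, the chosen file could be handed to `find-file’ instead of `browse-url-of-file’ to open it inside Emacs.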
- State “DONE” from “TODO” [2016-08-18 Thu 15:34]
- State “DONE” from “TODO” [2016-08-20 Sat 12:55]
- State “CANCELLED” from “TODO” [2016-08-24 Wed 22:00]
Does not seem like this will work. Use single quotes instead.
I cannot get --header="Accept: text/html" to work, for example.
- State “DONE” from “TODO” [2016-08-24 Wed 22:00]
Via “ztree”.
Use a dispatcher: one line per archive in the current entry, built with widgets, like recentf. The “open” action will open an archive and then immediately close the dispatcher; “rename” will rename both the link description and the folder name; “delete” will delete an archive.
There should also eventually be support for diffing from the dispatcher.
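A minimal sketch of what the widget buffer could look like; the function and buffer names are made up, and only the “open” action is wired up:

#+BEGIN_SRC emacs-lisp
;; -*- lexical-binding: t -*-
(require 'wid-edit)

(defun my/org-board-dispatch (archives)
  "Show a dispatcher buffer with one line per archive in ARCHIVES.
\"rename\" and \"delete\" would be further buttons on each line."
  (switch-to-buffer "*org-board dispatch*")
  (let ((inhibit-read-only t)) (erase-buffer))
  (remove-overlays)
  (mapc (lambda (dir)
          (widget-create 'push-button
                         :notify (lambda (&rest _ignore)
                                   ;; Close the dispatcher, then open.
                                   (kill-buffer "*org-board dispatch*")
                                   (find-file dir))
                         "open")
          (widget-insert "  " (abbreviate-file-name dir) "\n"))
        archives)
  (use-local-map widget-keymap)
  (widget-setup))
#+END_SRC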
[2016-09-22 Thu 14:44] If not found, show all files recursively.
[2016-09-29 Thu 08:20] This works now, except for URLs with a list of “?” parameters at the end. Also, Google Cache URLs, for instance, would need a little algorithm to detect slashes that are not directory separators.
List out all the files in the archival directory, compute each one’s Levenshtein distance from the URL, and open the closest one.
Then open the folder and open the file closest in name to the basename of the URL.
For example, we have a URL: http://www.foo.com/bar/taleb.html?params=test.
`wget’ will have downloaded the file “taleb.html?params=test” into the directory structure “www.foo.com/bar/”. So we need to extract the part of the URL after the protocol specification (maybe using the Emacs URL library?). Then we can descend the directory structure and find the file we want to open.
There are some edge cases where descending the structure is not so easy. For example, a Google Cache URL contains literal slashes that are actually part of a filename, not a directory structure. `wget’ encodes these as `%2F’ in the saved filename, reducing the ambiguity.
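A sketch combining both ideas, assuming Emacs 26 for `string-distance’ (the function name is made up): parse the URL with the URL library, then pick the file under the archive whose name is closest to the URL’s final component:

#+BEGIN_SRC emacs-lisp
(require 'url-parse)

(defun my/org-board-closest-file (url archive-dir)
  "Return the file under ARCHIVE-DIR whose base name has the smallest
Levenshtein distance from the final component of URL."
  (let* ((path  (url-filename (url-generic-parse-url url)))
         (base  (file-name-nondirectory path)) ; e.g. "taleb.html?params=test"
         (files (directory-files-recursively archive-dir ".")))
    (car (sort files
               (lambda (a b)
                 (< (string-distance base (file-name-nondirectory a))
                    (string-distance base (file-name-nondirectory b))))))))
#+END_SRC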
Ignoring the colon issue, say we have a URL like this (similar to Google Cache’s) where the last slash is actually part of the filename: http://www.foo.com/bar/:cache:google.com/news
- We have to iteratively check first that “www.foo.com” is a directory, then check “www.foo.com/bar/”, then check “www.foo.com/bar/:cache:google.com” (which will fail), and then use the failed part as part of the filename and open (whatever is closest to) that. (See the sketch after this list.)
- This won’t work when there is a slash later that does descend the directory structure, but that can’t be very common.
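A sketch of that descent; the names are made up, and the fallback hands the leftover components to fuzzy matching (e.g. the closest-file function above), since wget may have saved the literal slashes as %2F:

#+BEGIN_SRC emacs-lisp
(defun my/org-board-descend (dir components)
  "Descend DIR along COMPONENTS while they name real subdirectories.
When a component is not a directory, treat it and everything after it
as one filename candidate (wget encodes the literal slashes as %2F)."
  (if (null (cdr components))
      (expand-file-name (car components) dir)
    (let ((next (expand-file-name (car components) dir)))
      (if (file-directory-p next)
          (my/org-board-descend next (cdr components))
        ;; Descent failed: the rest of the URL path belongs to the filename.
        (expand-file-name (mapconcat #'identity components "%2F") dir)))))

;; (my/org-board-descend "~/archive/www.foo.com"
;;                       '("bar" ":cache:google.com" "news"))
;; descends into "bar", fails at ":cache:google.com", and returns
;; ".../bar/:cache:google.com%2Fnews" as the candidate to match against.
#+END_SRC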
- State “DONE” from “TODO” [2016-08-25 Thu 11:46]
- State “DONE” from “TODO” [2016-11-14 Mon 15:30]
The format for a date currently looks like 2016-08-24-22-23-20. Ideally it should look as much like an Org date as possible (without the spaces, as they are a nuisance in filenames). Maybe they could be fontified somehow?
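One possibility via `format-time-string’: add the abbreviated day name, as in an Org date, while keeping hyphens throughout (the exact format is only a suggestion):

#+BEGIN_SRC emacs-lisp
(format-time-string "%Y-%m-%d-%a-%H-%M-%S")
;; => "2016-08-24-Wed-22-23-20", for example
#+END_SRC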
So that you can quickly differentiate between archives that you’ve taken with different arguments to wget.
Sometimes you make an error with the command, so there should be a quick way of doing this.
My current implementation does not take into account multiple archival processes.
Allow exporting an archived page or set of pages to PDF?
For this, you would need to set a default bookmark file… or maybe use org-capture as a backend?
You could have a capture template that runs a function which, in Eww, takes the URL of the current page and archives it, or, if not in Eww, takes a URL from the clipboard. (A sketch follows.)
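A sketch of such a function; the function name is made up:

#+BEGIN_SRC emacs-lisp
(defun my/org-board-capture-url ()
  "Return the URL to archive: the current page's URL when in Eww,
otherwise the most recent kill (e.g. a URL copied from a browser)."
  (if (derived-mode-p 'eww-mode)
      (plist-get eww-data :url) ; `eww-data' is buffer-local in Eww buffers
    (current-kill 0)))
#+END_SRC

A capture template could then insert its result with a “%(my/org-board-capture-url)” expansion, both in the headline and in a :URL: property of the new entry.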
With remote archival. `process-file’ is apparently for this.
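For illustration: `process-file’ runs the program on the host implied by `default-directory’, so pointing the latter at a TRAMP path runs wget on the remote machine. The host and paths below are placeholders:

#+BEGIN_SRC emacs-lisp
;; Synchronous example; `start-file-process' is the asynchronous variant.
(let ((default-directory "/ssh:user@example.com:/home/user/archive/"))
  (process-file "wget" nil (get-buffer-create "*wget-output*") nil
                "--page-requisites" "http://www.example.com"))
#+END_SRC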
See GitHub Issue #11.
See GitHub Issue #13.
All the let bindings are copied from the worker functions. Extract them to a macro. (A hypothetical sketch follows.)
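A hypothetical shape for it; the variable names are placeholders for whatever the worker functions actually share:

#+BEGIN_SRC emacs-lisp
(defmacro org-board--with-entry (&rest body)
  "Hypothetical macro: establish the bindings the worker functions share.
The property names follow the URL / WGET_OPTIONS conventions; the
variable names are deliberately anaphoric so BODY can refer to them."
  `(let* ((urls    (org-entry-get-multivalued-property (point) "URL"))
          (options (org-entry-get-multivalued-property (point) "WGET_OPTIONS"))
          (attach-directory (org-attach-dir t)))
     ,@body))
#+END_SRC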
With a captured page there should be some way to annotate it. Possibly JS in the browser, or possibly via links from a dedicated org annotation file? To discuss.
Exporting a bookmarks file (or an org-board entry) isn’t spectacular: the properties are not shown, for example, so there’s no internal link to the archived file.
[2016-09-22 Thu 15:11] Calling `message’ (like in org-board-archive-dry-run) with a string containing a literal “%” will make Emacs complain if no format arguments are supplied afterwards. See Xah Lee:
Tip for elisp coders. Often, while still writing your program and in the debugging process, you might put (message myStr) to print some variables. However, if your “myStr” happens to contain a “%” char, such as in a URL, then you’ll get a mysterious error: “Not enough arguments for format string”.
Fix it by using the “%s” format specifier.
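That is:

#+BEGIN_SRC emacs-lisp
(message my-str)      ; may error if my-str contains e.g. "%s" or a trailing "%"
(message "%s" my-str) ; safe: my-str is passed as data, not as a format string
#+END_SRC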