Delete old Markdown file when editing the EXPORT_FILE_NAME #34

Closed
kaushalmodi opened this issue Jul 13, 2017 · 19 comments

Comments

@kaushalmodi
Owner

If the EXPORT_FILE_NAME was "a" and then we changed that to "b", we will end up with both a.md and b.md.

We need to figure out how to delete the old file when the new file is created.

@punchagan
Contributor

One idea could be to maintain a "database" of the posts that have been exported from ox-hugo. Posts could be given a custom ID on export, and map that to some post metadata in the DB, including EXPORT_FILE_NAME.

If we go with this idea, we could also store a hash of the subtree/post contents, to detect changes to any of the posts, and let "export all subtrees" identify and export only the changed ones.
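
A minimal sketch of that idea, assuming an in-memory hash table stands in for the "database". The names `my/export-db`, `my/post-changed-p`, and `my/record-export` are hypothetical and are not part of ox-hugo; only `secure-hash` is a built-in Emacs function.

```elisp
;; Hypothetical sketch: none of these names exist in ox-hugo.
(defvar my/export-db (make-hash-table :test #'equal)
  "Map a post ID to a plist (:file-name NAME :hash HASH) from its last export.")

(defun my/post-changed-p (id content)
  "Return non-nil if CONTENT of post ID differs from the recorded hash.
A post that was never recorded counts as changed."
  (let ((entry (gethash id my/export-db)))
    (not (equal (plist-get entry :hash)
                (secure-hash 'sha1 content)))))

(defun my/record-export (id file-name content)
  "Remember that post ID was exported to FILE-NAME with CONTENT."
  (puthash id
           (list :file-name file-name
                 :hash (secure-hash 'sha1 content))
           my/export-db))

;; Usage:
;; (my/record-export "id-1" "my-post" "Body text")
;; (my/post-changed-p "id-1" "Body text")     ; => nil
;; (my/post-changed-p "id-1" "Body text v2")  ; => t
```

An "export all subtrees" command could then skip every post for which `my/post-changed-p` returns nil. A real version would also have to persist the table between Emacs sessions, which this sketch leaves out.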

@kaushalmodi
Owner Author

Would you like to implement this? Also, I don't know what the performance impact would be for big blogs if the hash has to be calculated for dozens or hundreds of posts with, let's say, 2000 words each, on every export. A poor man's solution would be to rely on git diff to see which posts changed, and export just those :) Git diff also helps catch any unintended text change in older posts.

About the implementation specific to deleting old Markdown files, here's my thought: Each time a post is exported, this one property should be saved to the subtree: EXPORTED_FILE_NAME.

So..

Before first export

* My Post
:PROPERTIES:
:EXPORT_HUGO_SECTION: posts
:EXPORT_FILE_NAME: my-post
:END:

After first export

* My Post
:PROPERTIES:
:EXPORT_HUGO_SECTION: posts
:EXPORT_FILE_NAME: my-post
:EXPORTED_FILE_NAME: posts/my-post
:END:

(assuming the extension to be always .md)


After EXPORT_FILE_NAME rename (and even section change!) -- Before export

* My Article
:PROPERTIES:
:EXPORT_HUGO_SECTION: articles
:EXPORT_FILE_NAME: my-article
:EXPORTED_FILE_NAME: posts/my-post
:END:

Here, because (concat <EXPORT_HUGO_SECTION> "/" <EXPORT_FILE_NAME>) does not match <EXPORTED_FILE_NAME>, the old file will be renamed/moved.

After EXPORT_FILE_NAME rename (and even section change!) -- After export

* My Article
:PROPERTIES:
:EXPORT_HUGO_SECTION: articles
:EXPORT_FILE_NAME: my-article
:EXPORTED_FILE_NAME: articles/my-article
:END:

.. and after the export, the <EXPORTED_FILE_NAME> will be once again updated.
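
The comparison described above could be sketched roughly as follows. This is only an illustration: `EXPORTED_FILE_NAME` is a proposal from this thread, not a property that ox-hugo reads or writes, and the helper names `my/export-path`, `my/stale-export-path`, and `my/delete-stale-markdown` are hypothetical. The sketch deletes the stale file rather than renaming it, on the assumption that the export itself writes the new file.

```elisp
;; Hypothetical helpers; nothing here is part of ox-hugo.
(defun my/export-path (section file-name)
  "Return the content-relative path (without \".md\") for SECTION and FILE-NAME."
  (concat section "/" file-name))

(defun my/stale-export-path (section file-name exported-file-name)
  "Return EXPORTED-FILE-NAME if it no longer matches SECTION and FILE-NAME.
Return nil when nothing was exported before or the path is unchanged."
  (when (and exported-file-name
             (not (string= exported-file-name
                           (my/export-path section file-name))))
    exported-file-name))

(defun my/delete-stale-markdown (content-dir stale-path)
  "Delete CONTENT-DIR/STALE-PATH.md if it exists.
The file goes to the trash when `delete-by-moving-to-trash' is non-nil."
  (let ((file (expand-file-name (concat stale-path ".md") content-dir)))
    (when (file-exists-p file)
      (delete-file file :trash))))

;; Usage:
;; (my/stale-export-path "articles" "my-article" "posts/my-post") ; => "posts/my-post"
;; (my/stale-export-path "posts" "my-post" "posts/my-post")       ; => nil
;; (my/stale-export-path "posts" "my-post" nil)                   ; => nil
```

In an actual integration, the three values would presumably be read from the subtree with `org-entry-get`, and the new path written back with `org-entry-put` once the export succeeds.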

@punchagan
Contributor

> Would you like to implement this? Also, I don't know what the performance impact would be for big blogs if the hash has to be calculated for dozens or hundreds of posts with, let's say, 2000 words each, on every export. A poor man's solution would be to rely on git diff to see which posts changed, and export just those :) Git diff also helps catch any unintended text change in older posts.

I would like to try using the after save hook that you have provided with ox-hugo for a while, and see how it feels, before trying to implement this. Piggybacking on git diff is a nice idea!

> Each time a post is exported, this one property should be saved to the subtree: EXPORTED_FILE_NAME.

This seems like a reasonable way to go about it. 👍

@takaxp
Contributor

takaxp commented Aug 30, 2018

Any updates?

> Piggybacking on git diff is a nice idea!

I'd support this approach.

@kaushalmodi
Owner Author

kaushalmodi commented Aug 30, 2018

@takaxp

> Any updates?

No, I haven't been working on this, so I have tagged it as a wishlist item. Would you like to work on this?

By "this", I mean creating a hash of all the subtrees and detecting which subtrees' content changed.

It would be a great feature to implement, but I could use some help. This feature could also live as a separate package, either in the ox-hugo repo or as a separate repo altogether. Maybe not as a separate repo, though, because the subtree detection logic (that a subtree should have an EXPORT_FILE_NAME property) is specific to ox-hugo.

@takaxp
Contributor

takaxp commented Aug 30, 2018

I'd like to work on this issue, but I'm currently focused on an auto-set-lastmod issue that I haven't reported yet.

Hmm... I now feel the real problem is that an unexpected post could end up published on a Hugo website.

For example,

  1. Create a subtree with a title as "a-title-with-typo"
  2. Export the post by "C-c C-e H H", then "a-title-with-typo.md" will be created.
  3. After that, the user notices the typo and changes the title to "a-correct-title"
  4. Export the same post by "C-c C-e H H" again, then "a-correct-title.md" will be created but "a-title-with-typo.md" still exists.
  5. Execute "hugo" and transfer the public directory to a website
  6. Then, "a-title-with-typo/index.html" will be published unexpectedly.

If a user sets the (setq org-hugo-auto-export-on-save t) option, the above issue will happen more frequently, which makes it critical.

The question, though, is when we should check whether unexpected files are present in the content and public directories.

Do we have to check all of them every time even a single post is exported, which would introduce potentially heavy calculations based on a hash database?
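
One lighter-weight check could be to list the Markdown files that no longer correspond to any post, without deleting anything. The sketch below assumes the caller already knows which content-relative files are expected; the function name `my/orphan-markdown-files` is hypothetical and not part of ox-hugo.

```elisp
(require 'seq)

;; Hypothetical helper; it only reports files and never deletes them.
(defun my/orphan-markdown-files (content-dir expected)
  "Return Markdown files under CONTENT-DIR that are not in EXPECTED.
EXPECTED is a list of paths relative to CONTENT-DIR, such as
\"posts/a-correct-title.md\"."
  (let ((root (file-name-as-directory (expand-file-name content-dir))))
    (seq-remove
     (lambda (relative) (member relative expected))
     (mapcar (lambda (file) (file-relative-name file root))
             (directory-files-recursively root "\\.md\\'")))))

;; Usage, for the scenario above (the path is a placeholder):
;; (my/orphan-markdown-files "/path/to/content/"
;;                           '("posts/a-correct-title.md"))
```

Note that any Markdown file written by hand, rather than exported, would also show up in the result, so the list is best treated as something to review, not as a deletion list.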

@kaushalmodi
Owner Author

> Export the same post by "C-c C-e H H" again, then "a-correct-title.md" will be created but "a-title-with-typo.md" still exists.

Yes, I understand the issue. The way I avoid it is by carefully looking at the git diffs when committing. This of course doesn't help if one is not using a git flow for site deployment, and is instead copying the files from content, public, etc. directly via rsync or similar.

@takaxp
Contributor

takaxp commented Aug 30, 2018

> The way I avoid it is by carefully looking at the git diffs when committing.

True. In my flow, I copy the public directory directly via rsync, so I can check the differences with a dry run.

@kaushalmodi
Owner Author

kaushalmodi commented Sep 5, 2018

Note to self: Incorporate the hash calculation as done by @itf in itf/org-export-head@32fc582.

@itf

itf commented Sep 5, 2018

Thanks! I got the hash idea from this answer: https://emacs.stackexchange.com/a/39376/20156

@reedlaw

reedlaw commented Feb 9, 2020

Maybe this is naive but why not delete the entire export (with all sub-trees) and export again? Maybe this behavior could be an option for C-c C-e H A (export all sub-trees to Md files).

@kaushalmodi
Owner Author

@reedlaw

> why not delete the entire export (with all sub-trees) and export again?

I am making ox-hugo behave conservatively so that people do not lose their Markdown content unintentionally. Also, ox-hugo does not maintain a "database" that tracks what the old export path was vs. the new one (in case the user changed the HUGO_SECTION or EXPORT_FILE_NAME). If the user is that confident, they can always rm -rf HUGO_BASE_DIR/content/ and then C-c C-e H A. Maybe they can even advise an ox-hugo export function so that it always runs (delete-directory "/path/to/content/") first.

But I certainly cannot make ox-hugo take this risky step!

@reedlaw

reedlaw commented Feb 11, 2020

@kaushalmodi I see your point. Although I think it would be safe if one uses version control, or only edits the Org files without manually touching the Markdown.

I was able to implement this for myself using this:

(add-hook 'org-export-preprocess-final-hook (delete-directory "~/blog/content" t))

@kaushalmodi
Owner Author

kaushalmodi commented Feb 11, 2020

@reedlaw

> (add-hook 'org-export-preprocess-final-hook (delete-directory "~/blog/content" t))

I'd be surprised if that actually worked, because you'd need to wrap the elisp form in a wrapper defun or a bare lambda (which I don't recommend). Then you'd do (add-hook 'org-export-preprocess-final-hook #'wrapper-proc-identifier).

But, I still wouldn't do that:

  • That hook triggers when I do any Org export, not just ox-hugo export of a particular site.
  • Also this would be bad if I have multiple Hugo sites (which I do).

Many people might have a mix of original content written in Markdown and Markdown exported using ox-hugo. So blindly deleting the content directory is not an option. That's why the only way I'd move forward with this is to maintain a database of files touched by ox-hugo.


My suggestion for you would be to have a simple elisp function that you run only when you export your blog content, which first does the delete-directory and then the ox-hugo export. You are asking for disaster if you add that delete-directory to a generic Org hook like that.


With great power comes great responsibility -- Spiderman's uncle

@kaushalmodi
Owner Author

@reedlaw If you want to delete the content directory each time and do C-c C-e H A anyway, you can do that in CI.

I do that using a Makefile to generate the ox-hugo doc site. If you notice, I don't commit the Markdown files for the doc site; I just commit ox-hugo-manual.org. On Netlify (try it out, it's free!), I then export the whole manual using ox-hugo, run hugo, and deploy the site. So from my end, I just commit the Org file and push, and Netlify takes care of the rest.

@reedlaw

reedlaw commented Feb 11, 2020

@kaushalmodi thanks for the feedback and suggestions. As I said, this is my naive idea not knowing elisp well. I tried this:

(defun delete-blog-content ()
  (delete-directory "~/blog/content" t))
(add-hook 'org-export-preprocess-final-hook #'delete-blog-content)

But it doesn't work. I have one buffer running watch ls blog/content/posts and in the other I'm editing the org file by adding and removing posts. If I eval (delete-blog-content) directly it works.

My current setup uses git to push the content directory to a gitolite server with a post-receive hook that runs hugo in the blog directory. I will probably try adding an org-hugo-export-wim-to-md step to the post-receive hook so that I only have to push changes to the Org file. Locally I only use org-export with this blog, so I'm not worried about wiping the content directory.

@kaushalmodi
Owner Author

kaushalmodi commented Feb 11, 2020

I haven't tested this, but something like this might work for you:

(with-eval-after-load 'org
  (defun my/reset-content-and-re-export-all-posts ()
    "Delete the entire Hugo content/ directory for my blog, and
re-export all sub-trees from current Org file."
    (interactive)
    (delete-directory "/path/to/hugo/content/dir/" :recursive :trash) ;Change the path in this elisp form
    (org-hugo-export-wim-to-md :all-subtrees))

  ;; Bind the command to the "C-c h" key in `org-mode-map'.
  (define-key org-mode-map (kbd "C-c h") #'my/reset-content-and-re-export-all-posts))

@reedlaw

reedlaw commented Feb 12, 2020

@kaushalmodi brilliant, that works great. Thanks for your help and for your work on ox-hugo!

@kaushalmodi
Owner Author

I am closing this issue because I haven't seen a real need for this to be baked into ox-hugo in all these years.

And we already have an Elisp solution above for people who want to always delete the old content directory.

5 participants