Implement lynx-backed browser tool #205

Closed
ErikBjare opened this issue Oct 20, 2024 · 2 comments · Fixed by #214

Comments

@ErikBjare

ErikBjare commented Oct 20, 2024

This would be far more lightweight than using Playwright, much more responsive, and honestly gives better output.

lynx --dump --display_charset=utf-8 https://erik.bjareholt.com

Runs in 1 second!

Output
   #[1]publisher

   [2]erik.bjareholt.com (BUTTON)
     * [3]Blog
     * [4]Wiki
     * [5]About
     * [6]Jobs

     * [7]Contact

Hi! I'm Erik 👋🏼

   I build free and open source software for fun and the betterment of
   mankind 🌎🌍🌏. I also do things with markets for 💸.

   I'm hiring! Check out the [8]jobs page.

Projects

     * Founder and maintainer of [9]ActivityWatch 📊 ([10]GitHub)
          + "The worlds best free and open-source automated time-tracker"
     * Maintainer of [11]uniswap-python 📈
          + Python wrapper for the [12]Uniswap contracts.
     * Maintainer of [13]eeg-notebooks 🧠
          + "Democratizing the cognitive neuroscience experiment"
          + A collection of EEG experiments & notebooks.
     * My [14]MSc thesis about classifying brain activity (EEG) of
       developers 🧠
     * And lots of [15]other stuff ✨

This website

   Here you can read my [16]blog where I (used to) write about things I
   enjoy thinking about. You can also learn more about me by checking out
   the [17]about page or visit some of my various profiles online as
   listed below.

   Note: Unfortunately the blog and wiki haven't been updated in a long
   time, and that's probably not going to change anytime soon.

Latest blog posts

     * 20th April 2014 » [18]VR: Looking the other way
     * 10th April 2014 » [19]What would you do with your data?
     * 18th March 2014 » [20]Humble beginnings

Selected wiki pages

     * [21]The Importance of Open Recommender Systems
     * [22]Quantified Self

   Check out the [23]wiki index for a full list of pages.

Follow me

   Twitter
   GitHub
   LinkedIn

I'm also on

   Keybase
   Reddit
   Facebook
   StackOverflow
   Quora
   Wikipedia
   YouTube

   Erik Bjäreholt 2014-

References

   Visible links:
   1. https://plus.google.com/+ErikBjareholt
   2. https://erik.bjareholt.com/
   3. https://erik.bjareholt.com/blog/
   4. https://erik.bjareholt.com/wiki/
   5. https://erik.bjareholt.com/about/
   6. https://erik.bjareholt.com/jobs/
   7. https://erik.bjareholt.com/contact/
   8. https://erik.bjareholt.com/jobs
   9. https://activitywatch.net/
  10. https://github.com/ActivityWatch
  11. https://github.com/shanefontaine/uniswap-python/
  12. https://uniswap.org/
  13. https://github.com/NeuroTechX/eeg-notebooks
  14. https://github.com/ErikBjare/thesis
  15. https://github.com/search?o=desc&q=user:ErikBjare&s=stars&type=Repositories
  16. https://erik.bjareholt.com/blog
  17. https://erik.bjareholt.com/about/
  18. https://erik.bjareholt.com/blog/2014/04/20/looking-the-other-way/
  19. https://erik.bjareholt.com/blog/2014/04/10/What-would-you-do-with-your-data/
  20. https://erik.bjareholt.com/blog/2014/03/18/humble-beginnings/
  21. https://erik.bjareholt.com/wiki/importance-of-open-recommendation-systems/
  22. https://erik.bjareholt.com/wiki/quantified-self/
  23. https://erik.bjareholt.com/wiki/

   Hidden links:
  25. https://twitter.com/ErikBjare
  26. https://github.com/ErikBjare
  27. https://www.linkedin.com/in/erikbjareholt
  28. https://keybase.io/erb
  29. https://www.reddit.com/user/ErikBjare
  30. https://www.facebook.com/erik.bjareholt
  31. https://stackoverflow.com/users/965332/erb
  32. https://www.quora.com/Erik-Bjäreholt
  33. https://en.wikipedia.org/wiki/User:Erik.Bjareholt
  34. https://www.youtube.com/@ErikBjareholt
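
Since the dump marks links as [n] in the text and lists their targets under "References", a tool could map the numbers back to URLs. A minimal sketch, assuming the output layout shown above; `parse_references` is a hypothetical helper name, not part of lynx or of this project:

```python
import re

# Matches reference lines such as "   1. https://example.com/"
LINK_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)\s*$")


def parse_references(dump: str) -> dict[int, str]:
    """Map link numbers to URLs from the References section of a lynx dump.

    Hypothetical helper: assumes the section starts at the last line that
    reads exactly "References", as in the output above.
    """
    _, sep, refs = dump.rpartition("\nReferences\n")
    if not sep:
        return {}  # no References section found
    links: dict[int, str] = {}
    for line in refs.splitlines():
        match = LINK_RE.match(line)
        if match:
            links[int(match.group(1))] = match.group(2)
    return links
```

With this, a marker like [3]Blog in the body can be resolved by looking up key 3 in the returned dict. Visible and hidden links end up in the same mapping.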

It can even fetch PDFs:

lynx https://arxiv.org/pdf/2406.06592 --dump > 2406.06592.pdf

Notably, lynx https://arxiv.org/pdf/2406.06592 --dump | head -n1 prints %PDF-1.5, so we could check for that header to detect when we have accidentally fetched a PDF while expecting readable text.
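
A rough sketch of how the lynx call and the PDF check could fit together in Python. The function names (`looks_like_pdf`, `read_url`), the timeout default, and the error handling are all assumptions for illustration; only the lynx flags come from the command above.

```python
import shutil
import subprocess

PDF_MAGIC = b"%PDF-"  # PDF files start with this header, e.g. "%PDF-1.5"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF magic bytes."""
    return data.startswith(PDF_MAGIC)


def read_url(url: str, timeout: float = 30.0) -> str:
    """Dump a page as text via lynx (hypothetical helper, not an existing API)."""
    if shutil.which("lynx") is None:
        raise RuntimeError("lynx is not installed")
    result = subprocess.run(
        ["lynx", "--dump", "--display_charset=utf-8", url],
        capture_output=True,
        check=True,
        timeout=timeout,
    )
    # lynx passes PDFs through as raw bytes, so check before decoding
    if looks_like_pdf(result.stdout):
        raise ValueError(f"{url} returned a PDF, not readable text")
    return result.stdout.decode("utf-8", errors="replace")
```

Checking the raw bytes before decoding avoids mangling binary content. A caller could catch the ValueError and hand the URL to a PDF-specific reader instead.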

@aborruso

For this kind of task, I love trafilatura too

trafilatura -u https://erik.bjareholt.com

I build free and open source software for fun and the betterment of mankind 🌎🌍🌏. I also do things with markets for 💸.
I'm hiring! Check out the jobs page.
Here you can read my blog where I (used to) write about things I enjoy thinking about. You can also learn more about me by checking out the about page or visit some of my various profiles online as listed below.
Note: Unfortunately the blog and wiki haven't been updated in a long time, and that's probably not going to change anytime soon.
Check out the wiki index for a full list of pages.

@ErikBjare

@aborruso Thanks for the tip, never heard of it before, looks great!
