Persistently store scraped tweets #23
We should also store the time the tweet was created and discard tweets after a certain time, or allow the user to select a time range. The latter would probably require the map to be generated client-side.
I agree a JSON file is probably the best option. I don't think we should generate things client-side, especially because that might add unnecessary lag, particularly in places where the internet might be very slow because of the current circumstances. I want to serve static HTML to keep the load times as low as possible. Let's just keep the discard-tweet time as a server-side parameter.
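A server-side discard-time parameter could look like this. This is a minimal sketch: the file layout (tweets keyed by id with a Unix `created_at` timestamp) and the name `DISCARD_AFTER` are assumptions, not the project's actual code.

```python
import json
import time

# Assumed server-side parameter: drop tweets older than 24 hours.
DISCARD_AFTER = 24 * 60 * 60

def load_recent_tweets(path="tweets.json"):
    """Load stored tweets and drop those older than DISCARD_AFTER seconds.

    Assumes the file maps tweet id -> {"text": ..., "created_at": <unix ts>}.
    """
    with open(path) as f:
        tweets = json.load(f)
    cutoff = time.time() - DISCARD_AFTER
    return {tid: t for tid, t in tweets.items() if t["created_at"] >= cutoff}
```

Keeping the cutoff server-side means the served HTML stays static; only the generation step filters by age.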
We can consider SQLite here too, since it's simple and file-based. It sounds like we're performing some conditional manipulation, and this will help us cut down on time complexity.
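With SQLite, de-duplication falls out of a primary key on the tweet id. A sketch, assuming each scraped tweet carries a unique `id` (the table and column names here are made up for illustration):

```python
import sqlite3

def store_tweets(conn, tweets):
    """Insert tweets, silently skipping ids we have already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id TEXT PRIMARY KEY, text TEXT, created_at REAL)"
    )
    # INSERT OR IGNORE drops rows whose id already exists, so re-running
    # the scraper never creates duplicates.
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, text, created_at) VALUES (?, ?, ?)",
        [(t["id"], t["text"], t["created_at"]) for t in tweets],
    )
    conn.commit()
```

Age-based discarding then becomes a single `DELETE FROM tweets WHERE created_at < ?`.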
@DomiiBunn mentioned Firebase; it would work here.
It depends on the complexity you're looking for. Firebase is a nice balance between file storage (JSON files, SQLite, etc.) and standalone databases: it's almost as flexible, and it handles security, hosting, and high availability. At the usage we'd be expecting, it should be fully free, as long as DB reads are cached, that is.
The reason I'm a little hesitant about Firebase is that it adds another step for developers looking to reproduce the repo and contribute. The simpler the project, the easier it is to contribute (as long as it doesn't impact performance or features).
Use a config file and specify the storage options there.
That way, a larger deployment can enable caching where it's worth it, while a personal deployment still works fine without added complexity.
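A minimal sketch of that config-file idea, assuming a `config.json` next to the app. The keys (`storage`, `cache`) and defaults are illustrative assumptions, not agreed-upon settings:

```python
import json

# Assumed defaults for a personal deployment: plain JSON storage, no caching.
DEFAULTS = {"storage": "json", "cache": False}

def load_config(path="config.json"):
    """Read deployment settings, falling back to simple defaults.

    A missing config file is fine: personal deployments then run with
    DEFAULTS and no extra setup.
    """
    try:
        with open(path) as f:
            user = json.load(f)
    except FileNotFoundError:
        user = {}
    # User-supplied keys override defaults; unknown keys pass through.
    return {**DEFAULTS, **user}
```

A larger deployment would ship a `config.json` with, say, `{"cache": true}` and leave everything else untouched.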
Or use Redis, but I don't know how painful it is to implement with Python, and I think it would be a bit of overkill.
I am working on a fix for duplicate tweets. |
Let's just go with a json file. |
Sounds good to me |
Nvm, I failed miserably at it. |
I'd love to help but Python ain't my cup of tea
Sure-a-mundo |
As discussed in #16, the current storage of scraped tweets is not optimal, because newly scraped tweets are just appended to the existing tweets.txt file, creating a lot of duplicates. Integrating a database is probably not necessary at this point; we could store the scraped tweets with their ID in a JSON file and only add new ones during a run of the application.
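The proposed JSON approach could be sketched like this. The file name and the tweet shape (a dict with an `"id"` key) are assumptions for illustration:

```python
import json

def merge_new_tweets(path, scraped):
    """Merge scraped tweets into a JSON store keyed by tweet id.

    Only ids not seen before are added, which avoids the duplicate
    rows that appending to tweets.txt produces. Returns the number
    of tweets actually added.
    """
    try:
        with open(path) as f:
            stored = json.load(f)
    except FileNotFoundError:
        stored = {}  # first run: start with an empty store
    added = 0
    for tweet in scraped:
        if tweet["id"] not in stored:
            stored[tweet["id"]] = tweet
            added += 1
    with open(path, "w") as f:
        json.dump(stored, f)
    return added
```

Re-running the scraper over an overlapping window then adds only the genuinely new tweets.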