[REVIEW]: Trash AI: A Web GUI for Serverless Computer Vision Analysis of Images of Trash #5136
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks. For a list of things I can do to help you, just type `@editorialbot commands`.
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
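The command itself was lost in this copy of the thread; based on editorialbot's standard command set, it is:

```
@editorialbot generate pdf
```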
@domna, @luxaritas – This is the review thread for the paper. All of our communications will happen here from now on. Please read the "Reviewer instructions & questions" in the first comment above. Please create your checklist by typing `@editorialbot generate my checklist`.
As you go over the submission, please check off any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines. The JOSS review is different from most other journals: our goal is to work with the authors to help them meet our criteria, rather than merely passing judgment on the submission. As such, reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#5136 so that a link is created to this thread. We aim for the review process to be completed within about 4-6 weeks, but please make a start well ahead of this, as JOSS reviews are by their nature iterative, and any early feedback you can provide to the authors will be very helpful in meeting this schedule.
Review checklist for @domna
- Conflict of interest
- Code of Conduct
- General checks
- Functionality
- Documentation
- Software paper

Review checklist for @luxaritas
- Conflict of interest
- Code of Conduct
- General checks
- Functionality
- Documentation
- Software paper
Just finished an initial pass looking over this, and wanted to share a few high-level comments (not submitting issues at this point because I'm not sure how much is actionable in that context, but I'm happy to do so for anything where it would be useful):
Hey @luxaritas, thanks for the initial pass. I am creating some issues now based on these helpful comments and have responses below. Will update you when the issues are fixed.

> It does not appear the submitting author (@wincowgerDEV) made significant contributions to the software itself - the only commits I see attributed are the paper and a minor tweak to deployment configurations. JOSS submission guidelines require that the submitting author is a major contributor of the software: https://joss.readthedocs.io/en/latest/submitting.html
>
> I'm on the fence about whether this qualifies under the JOSS substantivity requirement. On one hand, I'm definitely on board with the value of having an intuitive interface to do this work. On the other hand, I'm not fully sold on the value as currently implemented: the current service lacks substantial "workflow" tools (e.g., some sort of projects/tagging, multiple collaborators, additional analytics, ...) and some level of polish/UX work, to the degree that it almost feels like a Google Colab notebook could serve the purpose more efficiently (point it to a folder of images, run the notebook, get some summary stats back out).

**I disagree with this comment. Most researchers in my field would not know how to use a Google Colab notebook to do this work. The fact that computer vision is almost solely run using programming languages is preventing its adoption in trash research; this is the premise of our work. I am also not aware of any needs they would have for the additional workflow tools you mention, but I would be happy to consider them if you can provide some specifics. During the prereview, some of these questions came up from @arfon as well, and I had additional responses there: #5005. Please refer to the JOSS submission guidelines for the substantive requirement and consider our in-line responses on why we should be considered as meeting it: https://joss.readthedocs.io/en/latest/submitting.html. The guidelines state these requirements:**

> It also seems like the cloud deployment approach here doesn't provide much value for someone wanting to deploy it on their own - this seems like a great candidate to deploy to something like GitHub Pages, Netlify, etc., whereas the current approach is pretty complex for what it is. The only thing that prevents it from being deployed in that way is the collection of images to feed back to the model, which hasn't actually been followed through on yet. Even with the current structure, I'll also note that the deployment scripts and such could probably be simplified and consolidated a bit - when taking a brief look at it, I felt like I had to jump around a bit to find all the different bits and pieces, and there were a lot of different calls between different tools (e.g., why use makefiles vs. just having a single docker-compose in the root and using docker-compose up?).

Thanks for this comment; I just created an issue for the second part of it: code4sac/trash-ai#106. For the first part, I do not believe that deployment to proprietary infrastructure is within the primary scope of JOSS: https://joss.readthedocs.io/en/latest/submitting.html#submissions-using-proprietary-languages-development-environments. We made a local deployment option, https://github.com/code4sac/trash-ai/blob/production/docs/localdev.md, which uses a simple docker-compose deployment. Have you tried to use this?

> Additionally, while I know the model isn't yours, when I sampled a couple of images, the model seemed sufficiently unreliable (at least to me) that I'm not confident I would trust it to do the analysis for me as a researcher. I also have concerns about whether it would be fast enough just running locally to analyze the number of images that would be needed in a research context (though, not being a subject matter expert, I can't make that judgement with any real confidence).

**We did create the model and are working to improve it. However, the model itself isn't within the scope of a JOSS review to my knowledge; it would need to be reviewed by a machine learning journal. We are aware that its accuracy can still be improved, and that was mentioned in the video. One of the main challenges with improving its accuracy is a lack of labeled images from the diversity of possible settings that trash can be in. We are hoping that people will use the platform and share images to it so that we can relabel them and improve the model in the long term. Have you attempted to use the local deployment option? It is extremely fast and runs asynchronously, so it can be run in the background.**

> To be clear, I definitely think this has the promise to be really useful, but I'm not sure it hits the mark as-is.

Thanks for the kind words.

> While the video walkthrough is great, it would probably be best to have a bit more written instruction on usage in the README/on the website. I think the setup documentation could also be improved a bit. The local setup seems to refer to specifics of WSL a lot where it could be more system-agnostic, and while the usage of GH workflows is novel, the AWS deployment instructions are a bit nonstandard in that regard, plus they seem to assume that you're deploying it from the repo itself/as a repo owner.

**Thanks for the kind words; I created this issue to correct these problems: code4sac/trash-ai#107**

> There doesn't seem to be any automated tests. While the desired behavior here is relatively clear/trivial, there are no formal instructions to verify behavior.

Created this issue: code4sac/trash-ai#108. There are many automated tests baked into the Docker workflow, but you are correct that they should be clearer in the documentation and that there should be a formal validation procedure.

> As tagged by editorialbot, it looks like there are some citation formatting issues.

Thanks for pointing that out. Created this issue to resolve them: code4sac/trash-ai#109
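To make the consolidation idea above concrete (a single compose file at the repository root), here is a hypothetical minimal `docker-compose.yml`; the service names, build paths, ports, and environment variables are invented for illustration and are not taken from the repo:

```yaml
# Hypothetical root-level docker-compose.yml: one command runs everything.
services:
  frontend:
    build: ./frontend            # assumes the frontend dir has its own Dockerfile
    ports:
      - "8080:80"                # serve the UI on http://localhost:8080
  backend:
    build: ./backend             # assumes a backend image for image collection
    environment:
      - S3_BUCKET=example-bucket # placeholder value; real config would differ
```

With something like this in place, a single `docker-compose up` from the root would replace the makefile indirection.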
Thanks for the clarifications @wincowgerDEV!
Understood - thanks for the additional detail to help verify.
My perspective on this is limited due to my limited knowledge of this field, so I appreciate your perspective. I will note that it should be possible to use a Colab notebook without actually having to be familiar with the programming itself, though I recognize that it's not the most user-friendly experience (and on investigating the approaches I had in mind, I think it's probably not as good an option as I thought!).
Yes, I have, and it works great. However, I still think this is at least somewhat relevant. In order for this package to be useful to non-technical researchers, it has to be deployed. That can either be serviced by your own hosted instance (this is great, and I think is a real value-add based on your statement of need, though if I incorporate that into my consideration of amount-of-work, then I should include it as at least some small component of the review), or by having it self-deployed, in which case someone else needs a clear way to host it. Right now the options are either 1) what is designed to be a local development environment, or 2) a cloud-based deployment workflow which is hard to adapt for third-party use (and both include a bunch of components that a typical install wouldn't really need). Additionally, I would consider ease of self-deployment part of a high-quality open-source package, and since you include this as part of your primary documentation (as you do in the README), I would consider it something which should be functional and understandable for an end-user (rather than an internal detail of the repo).
Ah, I misunderstood what I read previously - I see now. While the model itself is not directly within the scope of JOSS, I think it is still somewhat relevant: without a usable model, the value of your application to researchers in the way you describe can't really be realized. I don't intend for this to be a critical component of the review itself, but more so an additional data point alongside other aspects of the review. This is especially true because you've trained the model yourself: the model you're providing is part of the product being delivered (and so has additional implications for the "functionality" requirements).
I was just basing this off of what I saw in the demo video, as I don't have a lot of sample data to work with myself, so my comment on performance is somewhat speculative - again, I don't have knowledge of how this would be used in the wild, so my intent was to flag it as something I could see being an issue, but I don't have the knowledge to verify it myself. If you've found the performance to be sufficient given your understanding of the use cases, I don't have any real concern.
@luxaritas Thanks for the thoughtful response back. Some responses below to follow up. We will get to work on these aspects and circle back when they are ready.
I completely agree that it is possible to do all of this within a Colab notebook or some other programmable interface. Will share with the group to think some more about how we can better integrate this application with programmable interfaces. I think it will have to be part of a longer term development timeline though since several other workflows exist.
Happy for any comments you have about the remote hosting options, and we will do our best to incorporate them. Agreed that this is the option most developers will want to use. Absolutely agree that the primary documentation should lead to functional and understandable self-deployment. If you run into any issues related to that, please let us know and we will fix them.
Agreed here too! Improving the model is our highest development priority right now. Definitely: if model accuracy is limiting the functional use of the tool, then it's an issue we should resolve in this tool, and I don't think the only resolution is improving model accuracy. We currently report the model's confidence in the labels, which I believe helps the user interpret the accuracy. Perhaps there is another aspect of the software you think would be helpful to add for a user facing functionality-related issues?
Appreciate your flexibility on this point.
Hi @wincowgerDEV and @luxaritas, thank you for the detailed discussion. It already addressed a lot of points I had while reviewing the software. However, I'd like to add my view on a few points and share some additional thoughts.

**Deployment**

I agree with @luxaritas's view on the deployment: the software should be easily deployable to any hosting solution or as a local deployment. While you provide a

**Privacy**

While it is stated in the paper that some of the images are uploaded to an S3 storage, there is no information about this on your trashai.org webpage. I think it is at least necessary to show users a disclaimer that their image data is uploaded and that images may contain sensitive information in their EXIF data (I think the targeted audience is not always aware of what information images may contain). In my view it would be even better to have an opt-out option for image upload, and/or to strip unnecessary parts of the EXIF data while uploading, and/or to inform users of the upload process to AWS, while also putting informational resources about EXIF data directly on the page.

**Example data / Tutorial**

I find your video tutorial very nice and easy to follow. Additionally, I think it would be nice if you also had a folder of example data with which new users could directly follow your tutorial, maybe even with a post-processing example of the JSON data.

**Documentation and data structure**

I agree with @luxaritas that the written documentation could be expanded. What I miss the most is a detailed explanation of the JSON data structure for post-processing software. Which fields are expected, and what information do they contain? Are there fields which are not available for all data? As I am not familiar with trash research, I was also wondering whether there is some standardized format for data exchange on trash location data and labelling which could be used here (I think the Hapich et al. 2022 paper elaborates on this, so it would be nice to read some more details on the connection of this trash taxonomy to the data format of Trash AI). Also, I understand that your targeted audience is less technical, so dealing with JSON data may be a barrier. I think it would also be good to offer a PDF-based overview (like an analysis page which could just be printed to a PDF, so users could directly have a map overview of their trash expedition).

**Reference to other software**

I was missing information on which software people typically use for researching trash images. Since I'm not familiar with the field, I may not be aware that such software is not really available, but I think it would be useful to give a glimpse into the working process around your tool (like: what do I actually do with the downloaded JSON data?). It would also be interesting to know whether there are some widely known databases where you can put your trash data, as I think the useful part of such information is putting a lot of data from different researchers together to give a broad overview of littering places. In particular, it would be nice to have some guidance on what to do with the JSON data downloaded by your tool. Is there some analysis software for that? I was also very happy to read that you plan to upload your data to the TACO dataset. Do you also plan on adding a feature where users can directly annotate their data in Trash AI and then upload the annotations together with the images? I think this could be a powerful way of advancing the TACO dataset, since it would target the erroneous classifications of the model directly.

**Future of the software**

This is more out of curiosity about your future plans. I find your software a super helpful tool for annotation. However, I was wondering if you plan to bring the data from different sources together. As you state in your paper, such information is often used for policy decisions, and I think the approach is most useful when data from different places come together. So going in the direction of having a database for trash researchers, where data can be easily uploaded from Trash AI and policy makers can search for locations to get information about an area, could be extremely helpful. Are you already thinking in this direction?
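To make the JSON-structure documentation request above concrete, here is a purely hypothetical sketch in TypeScript; every field name is invented for illustration and is not Trash AI's actual output format:

```typescript
// Hypothetical shape of one record in the downloaded JSON.
// All names here are invented examples, not Trash AI's real schema.
interface Detection {
  label: string; // predicted trash category, e.g. "plastic_bottle"
  score: number; // model confidence in [0, 1]
  bbox: [number, number, number, number]; // x, y, width, height in pixels
}

interface ImageResult {
  filename: string;
  capturedAt?: string; // ISO timestamp from EXIF, if present
  gps?: { lat: number; lon: number }; // EXIF GPS coordinates, if present
  detections: Detection[];
}
```

Documenting each field like this, including which ones are optional, would answer the questions raised above.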
To be clear, I'm not necessarily advocating for this as an addition to the GUI - the reason I brought it up was more around trying to figure out the value proposition of the UI itself and whether it was providing something that couldn't be done in a simpler way, and it sounds like the UI is important to make CV tools accessible to researchers. Integration with programmable interfaces could still be valuable, though (at the very least, I'd imagine it would be a good idea for the model to be available directly and not just via the UI).
The confidence labels are definitely a great idea. At some level, though, model accuracy itself is a barrier to the usefulness of this tool: no matter what else you do, if you have to manually re-tag all the images anyway, it's not helping automate the process, which is the whole point... Unfortunately it's a weird situation where the model itself isn't the primary thing under scrutiny for JOSS, but it's a critical part of the application's functionality.
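As a sketch of how the reported confidence could support that workflow, detections below a chosen threshold could be routed to manual review rather than trusted automatically. This reuses the hypothetical `Detection` type sketched earlier in this thread; nothing here reflects Trash AI's real code:

```typescript
// Split detections into auto-accepted and needs-review buckets by confidence.
function triageDetections(detections: Detection[], threshold = 0.7) {
  const accepted = detections.filter((d) => d.score >= threshold);
  const needsReview = detections.filter((d) => d.score < threshold);
  return { accepted, needsReview };
}
```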
@domna Thanks so much for the comments and thorough review. Some responses in line below.
@luxaritas Thanks for the follow up clarification.
@wincowgerDEV Thank you for the detailed response and opening of issues.
I think I explained this point in an overly complicated way. I just meant that it would be good to describe, in the documentation, how to generate a static JavaScript bundle and upload it to a self-hosted webspace (e.g. locally, for people wanting to use this model on an intranet or so). Regarding the disabled backend: I've seen while testing that it is possible to use just the frontend without any issues.
Yes, you currently build directly in the docker-compose file, so people need to clone the repository. If you uploaded a pre-compiled container image, people could just download a docker-compose file and run it from there; the container would be downloaded automatically from the registry and started. Also, a simple Dockerfile inside the frontend dir would be enough. This would just create an nginx-based container with the frontend copied as static files into its public directory.
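A minimal sketch of such a Dockerfile, assuming the frontend is a standard npm project whose build step emits static files to `dist/` (the base images and paths are illustrative assumptions, not taken from the repo):

```dockerfile
# Stage 1: build the static frontend bundle (assumes an npm "build" script).
FROM node:18-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: serve the built files with nginx.
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
```

Building this with `docker build -t trash-ai-frontend .` from the frontend directory would yield a self-contained image that any host can run.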
👍️
👍️
👍️
👍️
Yes, bringing all of this data from different sources together is probably a hard endeavor. While I think that in the long run it will be extremely helpful to have such a database, I think it's a lot of work harmonising different metadata schemes and building the APIs. I don't know if anything has happened yet in this direction in your research community (like metadata standards or so).
@domna Thanks for the feedback here. I created new issues for both of your first two points; love the ideas, and it's a lot clearer to me now what you were thinking. I need to circle back with the developers to see how challenging these will be to integrate, but we will definitely give them a shot.
Totally agree it's a big, messy process to harmonize schemas. There are some standards for reporting in the field, but nothing like a JSON Schema to my knowledge. Today, everything is expected to live in local databases or spreadsheet-like files. We will build from those advancements to inform our work here.
Hey @luxaritas and @domna, Hope you are both doing well. Just checking in that you are both done with your first round of reviews. If so, I will begin making revisions on the repo.
Hi @wincowgerDEV, yes, thank you. Please go ahead and start your revisions. Feel free to also refer to me in the issues and I'll follow the updates. Looking forward to your changes!
👍 I think it makes sense to start incorporating reviewer feedback at this stage @wincowgerDEV.
Awesome! Thank you 🙂
@wincowgerDEV – looks like we're very close to being done here. I will circle back next week, but in the meantime, please give your own paper a final read to check for any potential typos etc. After that, could you make a new release of this software that includes the changes that have resulted from this review? Then, please make an archive of the software in Zenodo/figshare/other service and update this thread with the DOI of the archive. For the Zenodo/figshare archive, please make sure that:
Thanks @arfon, responses below:
Went through one last time and all seems correct.
I made this Zenodo repository with the most up-to-date release, titled it the same as the manuscript, and added the same authors in order:
@editorialbot set 10.5281/zenodo.8384126 as archive
Done! archive is now 10.5281/zenodo.8384126
@editorialbot recommend-accept
👋 @openjournals/dsais-eics, this paper is ready to be accepted and published. Check final proof 👉📄 Download article. If the paper PDF and the deposit XML files look good in openjournals/joss-papers#4629, then you can move forward with accepting the submission by compiling again with the command `@editorialbot accept`.
@wincowgerDEV – can you please merge this PR? code4sac/trash-ai#192 |
@editorialbot recommend-accept
👋 @openjournals/dsais-eics, this paper is ready to be accepted and published. Check final proof 👉📄 Download article. If the paper PDF and the deposit XML files look good in openjournals/joss-papers#4632, then you can move forward with accepting the submission by compiling again with the command `@editorialbot accept`.
@editorialbot accept
Ensure proper citation by uploading a plain text CITATION.cff file to the default branch of your repository. If using GitHub, a "Cite this repository" menu will appear in the About section, containing both APA and BibTeX formats. When exported to Zotero using a browser plugin, Zotero will automatically create an entry using the information contained in the .cff file. You can copy the contents for your CITATION.cff file here:
If the repository is not hosted on GitHub, a .cff file can still be uploaded to set your preferred citation. Users will be able to manually copy and paste the citation.
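The file contents themselves were collapsed in the original thread; as a minimal, illustrative sketch built only from metadata visible in this thread (title, submitting author, version, archive DOI, and repository), it could look like:

```yaml
# Minimal illustrative CITATION.cff; the real file would list all authors.
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Trash AI: A Web GUI for Serverless Computer Vision Analysis of Images of Trash"
authors:
  - family-names: Cowger
    given-names: Win
version: "1.0"
doi: 10.5281/zenodo.8384126
repository-code: "https://github.com/code4sac/trash-ai"
```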
🐘🐘🐘 👉 Toot for this paper 👈 🐘🐘🐘
🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨 Here's what you must now do:
Any issues? Notify your editorial technical team...
@domna, @luxaritas – many thanks for your reviews here! JOSS relies upon the volunteer effort of people like you and we simply wouldn't be able to do this without you ✨ @wincowgerDEV – your paper is now accepted and published in JOSS ⚡🚀💥
🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉 If you would like to include a link to your paper from your README, use the following code snippets:
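The snippet itself is not shown in this copy of the thread. Based on JOSS's standard badge format and this submission's review number (#5136), it would presumably be:

```markdown
[![DOI](https://joss.theoj.org/papers/10.21105/joss.05136/status.svg)](https://doi.org/10.21105/joss.05136)
```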
This is how it will look in your documentation.

We need your help! The Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing either one (or both) of the following:
Wonderful! Thank you so much @arfon, @luxaritas, and @domna for all your hard work helping us bring this project to life. |
Congrats 🎉 |
Submitting author: @wincowgerDEV (Win Cowger)
Repository: https://github.com/code4sac/trash-ai
Branch with paper.md (empty if default branch):
Version: 1.0
Editor: @arfon
Reviewers: @domna, @luxaritas
Archive: 10.5281/zenodo.8384126
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@domna & @luxaritas, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all, you need to run this command in a separate comment to create the checklist: `@editorialbot generate my checklist`
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @arfon know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Checklists
📝 Checklist for @domna
📝 Checklist for @luxaritas