Add option to parse MimeMessage without fetching attachment data from server #346

Merged: bbottema merged 1 commit into bbottema:develop on Dec 24, 2021

@drencrom
Contributor

drencrom commented Oct 22, 2021

I'd like to know if there is some way to parse a MimeMessage (as EmailConverter.mimeMessageToEmail does) without reading the contents of all attachments. I have a use case where it is useful to have the Email object, but the emails have large attachments that take some time to download from the IMAP server and are not needed.
I suppose it is possible to create an Email object that reads the attachment contents only when they are requested, for example when getAttachments is called.
If this does not exist, I may develop it myself, unless you think it is not possible for some reason I'm not seeing right now.

Thanks

@bbottema
Owner

So before I accept this change, what is the use case here? What problem are we solving?

@drencrom
Contributor Author

I have an issue with a web application that uses this API (thanks, by the way) to parse mails from an IMAP server and show them to the users. Usually I do not need to read the content of the attachments to show the email. It is enough to show the email body and the attachment names. I only need to read the attachment data from the IMAP server when the user wants to open a specific attachment.
I have a client whose IMAP server is the Office 365 IMAP server in the cloud, and they are very picky about traffic. If you request too much data from the server, you get a back-off time and can't request data for a few minutes. They also get emails with very large attachments (~30 MiB), and to limit traffic to the server I need a way to parse the email without requesting the attachment data.
Maybe this application is too specific, but I sent the pull request anyway in case you consider it useful for other people.

@bbottema
Owner

bbottema commented Oct 26, 2021

Makes perfect sense, but I'm wondering whether this really delays downloading the attachments from the server. When you fetch the MimeMessage from the IMAP server, doesn't it already contain all the data inside as data sources?

final byte[] content = readContent(retrieveInputStream(dataSource));

I would think dataSource is already a fully populated byte array.

@drencrom
Contributor Author

drencrom commented Oct 26, 2021

I tested it by looking at the IMAP debug log (javax.mail.Session.setDebug(true)), and it shows that the attachment contents are retrieved from the server only when the DataSource InputStream is read, not before. I have not tested with POP.

As I said in issue #345, maybe a different implementation could read the DataSource only when the user wants to read the Email AttachmentResource data. I didn't implement it that way because it is harder, as I'm not familiar with the code.
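
The alternative described above, reading the data only when the attachment is actually requested, can be sketched roughly as follows. This is a minimal illustration and not code from the library: `LazyAttachment`, `LazyAttachmentDemo`, and the `Supplier<InputStream>` that stands in for an IMAP-backed `DataSource` are all hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical type (not part of the library): the name is known up front,
// the bytes are loaded on the first getData() call and cached afterwards.
class LazyAttachment {
    private final String name;
    private final Supplier<InputStream> source; // assumed stand-in for DataSource.getInputStream()
    private byte[] cached; // stays null until the data is first requested

    LazyAttachment(String name, Supplier<InputStream> source) {
        this.name = name;
        this.source = source;
    }

    String getName() {
        return name; // never touches the stream
    }

    synchronized byte[] getData() {
        if (cached == null) {
            try (InputStream in = source.get()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                cached = out.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return cached;
    }
}

class LazyAttachmentDemo {
    // Returns {stream opens after getName(), stream opens after two getData() calls}.
    static int[] countOpens() {
        AtomicInteger opens = new AtomicInteger();
        LazyAttachment attachment = new LazyAttachment("report.pdf", () -> {
            opens.incrementAndGet(); // simulates the point where the server would be contacted
            return new ByteArrayInputStream(new byte[] {1, 2, 3});
        });
        attachment.getName();
        int afterName = opens.get();
        attachment.getData();
        attachment.getData(); // second call is served from the cache
        return new int[] {afterName, opens.get()};
    }

    public static void main(String[] args) {
        int[] result = countOpens();
        System.out.println("opens after getName: " + result[0] + ", after getData x2: " + result[1]);
    }
}
```

In this sketch, listing attachment names costs no stream access, and the simulated fetch happens once, on the first `getData()` call.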

@bbottema
Owner

Excellent, I learned something new today. I'll integrate the change soon. Don't expect a release soon, though; it's going to be part of the 7.0.0 release.

@bbottema
Owner

bbottema commented Dec 24, 2021

I'm merging this, but I'm changing the behavior a bit. I don't want to return the datasource as-is, because it isn't always formed the same way; that depends on the client that sent it.

So I'm retaining the original behavior with one change: the input stream is no longer fetched and read into a byte array; instead, the original input stream is reused. Do you see any issue with this?

InputStream is = retrieveInputStream(dataSource);

ByteArrayDataSource result = fetchAttachmentData
	? new ByteArrayDataSource(readContent(is), contentType)
	: new ByteArrayDataSource(is, contentType);

@bbottema bbottema modified the milestones: 7.0.0, 6.6.3 Dec 24, 2021
@bbottema bbottema merged commit 7ff414a into bbottema:develop Dec 24, 2021
@bbottema bbottema changed the title from "Add option to parse MimeMessage without loading attachment data" to "Add option to parse MimeMessage without fetching attachment data from server" Dec 25, 2021
@bbottema
Owner

Released in 6.7.0!

@drencrom
Contributor Author

Sorry for the delay in answering. I'm not sure if this change will work. Looking at the code for ByteArrayDataSource:

https://github.com/javaee/javamail/blob/8106b9dd15e917da63e96c33f8c6aea1cad40045/mail/src/main/java/javax/mail/util/ByteArrayDataSource.java#L83

It seems that the constructor reads the InputStream itself. I can confirm this on Monday.
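
To illustrate the concern: if a constructor copies the stream it is handed, then passing the stream instead of a byte array does not postpone the read. The sketch below is only a model of that behavior. `EagerDataSource` is a hypothetical stand-in for a data source whose constructor drains its input (it is not the real `ByteArrayDataSource`), and `CountingInputStream` is a hypothetical stand-in for an IMAP-backed stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: counts every byte pulled through the wrapped stream.
class CountingInputStream extends FilterInputStream {
    long bytesRead = 0;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            bytesRead++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead += n;
        }
        return n;
    }
}

// Hypothetical stand-in: the constructor copies the whole stream into memory.
class EagerDataSource {
    private final byte[] data;

    EagerDataSource(InputStream is) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = is.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        data = out.toByteArray();
    }

    int size() {
        return data.length;
    }
}

class EagerConstructorDemo {
    // Returns how many bytes were pulled from the source by construction alone.
    static long bytesPulledByConstructor(int attachmentSize) {
        CountingInputStream is =
                new CountingInputStream(new ByteArrayInputStream(new byte[attachmentSize]));
        try {
            new EagerDataSource(is); // nobody has asked for the data yet
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return is.bytesRead;
    }

    public static void main(String[] args) {
        System.out.println("bytes pulled by constructor: " + bytesPulledByConstructor(1024));
    }
}
```

In this model the whole attachment is consumed at construction time, so both branches of the `fetchAttachmentData` ternary would end up reading the data.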

@bbottema
Owner

Crap, I think you're right. I made this change quick and dirty in between the maintenance releases. In that case, I'm not sure yet what the middle road is...

bbottema added a commit that referenced this pull request Dec 25, 2021
…a from server -> Properly return named datasource without fetching all the data if unwanted
@bbottema bbottema modified the milestones: 6.7.0, 6.7.1 Dec 25, 2021
@bbottema
Owner

Ok, I think I have a better solution now, released in 6.7.1. Please have a look when you can.
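
The thread does not show the 6.7.1 code, so the following is only a guess at the general shape of "a named datasource without fetching all the data": a wrapper that fixes the name and content type up front and delegates stream access to the original source. Everything here is hypothetical, including `SimpleDataSource` (a minimal stand-in for `javax.activation.DataSource`, used so the sketch compiles without extra jars), `NamedDataSource`, and `NamedDataSourceDemo`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Minimal stand-in for javax.activation.DataSource (assumption, trimmed to three methods).
interface SimpleDataSource {
    String getName();
    String getContentType();
    InputStream getInputStream() throws IOException;
}

// Hypothetical wrapper: supplies a stable name and content type,
// and touches the delegate's stream only when a caller asks for it.
class NamedDataSource implements SimpleDataSource {
    private final SimpleDataSource delegate;
    private final String name;
    private final String contentType;

    NamedDataSource(SimpleDataSource delegate, String name, String contentType) {
        this.delegate = delegate;
        this.name = name;
        this.contentType = contentType;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public String getContentType() {
        return contentType;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return delegate.getInputStream(); // deferred until requested
    }
}

class NamedDataSourceDemo {
    // Returns how many times the underlying stream was opened while only metadata was read.
    static int streamsOpenedWhileReadingMetadata() {
        final int[] opened = {0};
        SimpleDataSource remote = new SimpleDataSource() {
            public String getName() {
                return null; // e.g. a part that arrived without a usable name
            }

            public String getContentType() {
                return "application/octet-stream";
            }

            public InputStream getInputStream() {
                opened[0]++; // simulates the point where the server would be contacted
                return new ByteArrayInputStream(new byte[0]);
            }
        };
        SimpleDataSource named = new NamedDataSource(remote, "invoice.pdf", "application/pdf");
        named.getName();
        named.getContentType();
        return opened[0];
    }

    public static void main(String[] args) {
        System.out.println("streams opened: " + streamsOpenedWhileReadingMetadata());
    }
}
```

Unlike a byte-array copy, a wrapper of this kind keeps a reference to the original source, so in a real IMAP setting the stream would presumably only be readable while the underlying folder or connection is still open.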
