Version 2 does not like - application/x-ole-storage #343

Open
kapso opened this issue Jan 25, 2025 · 16 comments · Fixed by #347
Labels
bug Something isn't working

Comments

@kapso

kapso commented Jan 25, 2025

Some older versions of Microsoft Office documents use this type...

You must pass valid content types to the validator (ArgumentError)
'application/x-ole-storage' is not found in Marcel::TYPE_EXTS
@Mth0158
Collaborator

Mth0158 commented Jan 25, 2025

Hi @kapso
Thanks for reporting it. It must come from a content_type: [...] option, right? By the way, the issue would have been the same in v1 if you had used a single allowed content type.
If so, you can register the missing mime type in Marcel as mentioned in the readme to fix this issue.
Can you give us more details about your validator & test case?

@Mth0158 Mth0158 added the question Further information is requested label Jan 27, 2025
@tagliala
Contributor

Hello, I'm having the same issue/question

I do not understand exactly how this should be registered as a mime type in Marcel, because Marcel appears to know it already as a parent mime type (like application/pdf).

This content type is used as a fallback for some old Office files. I'm afraid that registering it with the msg extension will turn every application/vnd.ms-outlook file into application/x-ole-storage.

https://github.com/rails/marcel/blob/170458c687ed22f07d8829043a04e008a2b1936b/lib/marcel/mime_type/definitions.rb#L8-L9

Marcel::MimeType.extend "application/vnd.ms-excel", parents: "application/x-ole-storage"
Marcel::MimeType.extend "application/vnd.ms-powerpoint", parents: "application/x-ole-storage"

Ref: rails/marcel#54

I will investigate this further

@Mth0158
Collaborator

Mth0158 commented Jan 27, 2025

Ok I have found the issue.

We needed to extend our allowed content_types a bit: we were only allowing Marcel::TYPE_EXTS, whereas we should allow Marcel::TYPE_EXTS + Marcel::MAGIC.

The PR is ready, it should be released soon :)
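
A minimal sketch of the check described above, not the gem's actual implementation. TYPE_EXTS and MAGIC below are tiny hypothetical stand-ins for Marcel::TYPE_EXTS and Marcel::MAGIC (assumed here to be a hash keyed by content type and a list of [type, matches] pairs), and the module and method names are made up for illustration. It shows why a type that only appears in the magic table was rejected while only TYPE_EXTS was consulted.

```ruby
# Sketch only: hypothetical stand-ins for Marcel::TYPE_EXTS and Marcel::MAGIC.
# The real tables are far larger; their exact shape is an assumption here.
module ContentTypeCheckSketch
  TYPE_EXTS = {
    "application/pdf" => %w[pdf],
    "application/zip" => %w[zip]
  }.freeze

  MAGIC = [
    ["application/x-ole-storage", [[0, "\320\317\021\340\241\261\032\341".b]]]
  ].freeze

  module_function

  # Types known by extension plus types known only through magic bytes.
  def known_content_types
    (TYPE_EXTS.keys + MAGIC.map(&:first)).uniq
  end

  # Raises ArgumentError when an option is in neither table.
  def ensure_content_types_validity(content_types)
    unknown = content_types.reject { |type| known_content_types.include?(type) }
    return content_types if unknown.empty?

    raise ArgumentError,
          "#{unknown.map(&:inspect).join(', ')} not found in TYPE_EXTS + MAGIC"
  end
end

# Accepted because MAGIC is consulted as well as TYPE_EXTS.
ContentTypeCheckSketch.ensure_content_types_validity(["application/x-ole-storage"])
```

With only TYPE_EXTS.keys in known_content_types, the same call would raise, which matches the error in the original report.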

@Mth0158 Mth0158 added bug Something isn't working and removed question Further information is requested labels Jan 27, 2025
@Mth0158 Mth0158 linked a pull request Jan 27, 2025 that will close this issue
Mth0158 added a commit that referenced this issue Jan 27, 2025
…ke---applicationx-ole-storage

[Validator] Extend allowed content_types using Marcel (#343)
@Mth0158 Mth0158 reopened this Jan 27, 2025
@Mth0158
Collaborator

Mth0158 commented Jan 27, 2025

(reopened until released)

@Mth0158 Mth0158 self-assigned this Jan 28, 2025
@kapso
Author

kapso commented Jan 28, 2025

@Mth0158 I tried v2.0.1, and I am now seeing this for .docm files (there are also .xlsm and .pptm files):

/Users/kapil/.rbenv/versions/3.4.1/lib/ruby/gems/3.4.0/gems/active_storage_validations-2.0.1/lib/active_storage_validations/content_type_validator.rb:186:in 'block in ActiveStorageValidations::ContentTypeValidator#ensure_content_types_validity': You must pass valid content types to the validator (ArgumentError)
'application/vnd.ms-word.document.macroEnabled.12' is not found in Marcel content types (Marcel::TYPE_EXTS + Marcel::MAGIC)

@Mth0158
Collaborator

Mth0158 commented Jan 28, 2025

@kapso arf, I thought this release would have resolved all these issues.
Can you specify the content type you are validating so I can investigate? And can you confirm that it now works for x-ole-storage (it should)?

@tagliala
Contributor

application/vnd.ms-word.document.macroEnabled.12

I guess this is not a known mime type and it may make sense to add it to Marcel, but I also guess that doing so could interfere with existing files.

@Mth0158 can I ask why this validation check is required?

@Mth0158
Collaborator

Mth0158 commented Jan 29, 2025

@kapso I was a bit puzzled, because I could find your content_type referenced in Marcel... but here is the catch: the correct content_type is application/vnd.ms-word.document.macroenabled.12 (with a lowercase e in enabled).

See: https://github.com/rails/marcel/blob/170458c687ed22f07d8829043a04e008a2b1936b/lib/marcel/tables.rb#L1560

@Mth0158
Collaborator

Mth0158 commented Jan 29, 2025

@tagliala the check is required because the gem compares the content_type options given to the validator with the content_type detected for the blob.
Rails detects the blob's content_type using Marcel :) Therefore, if you give a content_type option that is not referenced in Marcel, the validation can never pass.
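
To illustrate the reasoning above, here is a rough sketch that assumes validation boils down to an exact string comparison between the detected type and the allowed list. The module and method names are hypothetical and are not part of the gem.

```ruby
# Sketch only: assumes the validator does an exact, case-sensitive match
# between the detected content type and the allowed options.
module ValidationSketch
  module_function

  def content_type_valid?(detected_type, allowed_types)
    allowed_types.include?(detected_type)
  end
end

# What detection would report for a .docm file, per the spelling used in Marcel.
detected = "application/vnd.ms-word.document.macroenabled.12"

# An option spelled with an uppercase E can never equal the detected value.
ValidationSketch.content_type_valid?(
  detected, ["application/vnd.ms-word.document.macroEnabled.12"]
) # => false

ValidationSketch.content_type_valid?(
  detected, ["application/vnd.ms-word.document.macroenabled.12"]
) # => true
```

Under that assumption, failing early on an unknown option is friendlier than silently rejecting every upload.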

@Mth0158 Mth0158 closed this as completed Jan 29, 2025
@kapso
Author

kapso commented Jan 29, 2025

@Mth0158 thanks, yea the lowercase e solved the issue. But now I am getting this error:

'application/x-zip-compressed' is not found in Marcel content types (Marcel::TYPE_EXTS + Marcel::MAGIC)

@Mth0158
Collaborator

Mth0158 commented Jan 30, 2025

@kapso, reading the Marcel source, I found that application/x-zip-compressed is an alias for application/zip. You can replace it with application/zip.

  <mime-type type="application/zip">
    <_comment>Compressed Archive File</_comment>
    <tika:link>http://en.wikipedia.org/wiki/ZIP_(file_format)</tika:link>
    <tika:uti>com.pkware.zip-archive</tika:uti>
    <alias type="application/x-zip-compressed"/>
    <magic priority="50">
      <match value="PK\003\004" type="string" offset="0"/>
      <match value="PK\005\006" type="string" offset="0"/>
      <match value="PK\x07\x08" type="string" offset="0"/>
    </magic>
    <glob pattern="*.zip"/>
  </mime-type>

https://github.com/rails/marcel/blob/170458c687ed22f07d8829043a04e008a2b1936b/data/tika.xml#L4891C20-L4891C35
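
One way to apply this advice in application code is to map known aliases to their canonical type before handing the list to the validator. This is only a sketch: the ALIASES table is hand-written from the XML entry above, and the module and method names are hypothetical, not something Marcel or the gem provides.

```ruby
# Sketch only: a hand-maintained alias table built from the tika.xml entry.
module AliasSketch
  ALIASES = {
    "application/x-zip-compressed" => "application/zip"
  }.freeze

  module_function

  # Returns the canonical type for an alias, or the type itself.
  def canonical(type)
    ALIASES.fetch(type, type)
  end

  # Canonicalizes a list of content types and drops duplicates.
  def canonicalize(types)
    types.map { |type| canonical(type) }.uniq
  end
end

AliasSketch.canonicalize(["application/x-zip-compressed", "application/zip"])
# => ["application/zip"]
```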

@kapso
Author

kapso commented Jan 31, 2025

@Mth0158 thanks yea that fixed the zip issue.

But now I am seeing this for an .stl file, a format used in 3D printing, CAD software, and 3D modeling:

'model/stl' is not found in Marcel content types (Marcel::TYPE_EXTS + Marcel::MAGIC)

@Mth0158
Collaborator

Mth0158 commented Jan 31, 2025

@kapso This one is really not found in Marcel. To make it work, you will need to extend Marcel's behaviour with something like:

Marcel::MimeType.extend "application/ino", extensions: %w(ino), parents: "text/plain" # Registering arduino INO files

Be sure to define at least the extensions or parents options. There is an issue with extending marcel that is solved on master but not yet released (will be released with 2.0.2 today or tomorrow).

@Mth0158 Mth0158 reopened this Jan 31, 2025
@tagliala
Contributor

tagliala commented Jan 31, 2025

@Mth0158 thanks for all your insights and also for the fixes

I have a question that I would like to ask on Marcel's repo, but since we are discussing mime types and parents, I would like to ask if you have an opinion about the following snippet from Marcel:

https://github.com/rails/marcel/blob/170458c687ed22f07d8829043a04e008a2b1936b/lib/marcel/mime_type/definitions.rb#L8-L9

Marcel::MimeType.extend "application/vnd.ms-excel", parents: "application/x-ole-storage"
Marcel::MimeType.extend "application/vnd.ms-powerpoint", parents: "application/x-ole-storage"

We have issues with some email messages from legacy Outlook clients being detected as x-ole-storage (which, per Marcel itself, is a fallback for some old Office files, see rails/marcel#54).

I was also expecting to see this there:

Marcel::MimeType.extend "application/vnd.ms-outlook", parents: "application/x-ole-storage"

but it is not the case. Do you think that it should be there?

@Mth0158
Collaborator

Mth0158 commented Jan 31, 2025

Hi @tagliala,

From my understanding, if some of your files are detected as application/x-ole-storage, it means that the first 16 bytes of their binary content contain the following signature: b["\320\317\021\340\241\261\032\341"] (https://github.com/rails/marcel/blob/170458c687ed22f07d8829043a04e008a2b1936b/lib/marcel/tables.rb#L2613)
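
A quick way to check a suspicious file by hand is sketched below. It is not Marcel's matching code: the ole_storage? helper and its window parameter are hypothetical, and the 16-byte window simply mirrors the wording above. The signature is the one quoted from tables.rb.

```ruby
require "stringio"

# Sketch only: looks for the OLE signature quoted above near the start of an IO.
module MagicSketch
  OLE_SIGNATURE = "\320\317\021\340\241\261\032\341".b

  module_function

  # Reads the first `window` bytes, rewinds, and reports whether the
  # signature appears anywhere inside that window.
  def ole_storage?(io, window: 16)
    head = io.read(window).to_s.b
    io.rewind
    head.include?(OLE_SIGNATURE)
  end
end

legacy_file = StringIO.new(MagicSketch::OLE_SIGNATURE + "rest of the file".b)
MagicSketch.ole_storage?(legacy_file) # => true
```

For a real upload, passing File.open(path, "rb") instead of the StringIO should behave the same way.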

It is detected as application/x-ole-storage primarily because Rails detects the content_type with the following call:

Marcel::MimeType.for io, name: filename.to_s, declared_type: content_type

Rails: https://github.com/rails/rails/blob/main/activestorage/app/models/active_storage/blob.rb#L345
Marcel: https://github.com/rails/marcel/blob/170458c687ed22f07d8829043a04e008a2b1936b/lib/marcel/mime_type.rb#L29C1-L32C10

This Marcel method returns the most precise content_type among those detected from each of its parameters.

I guess that in your case, Marcel finds the application/x-ole-storage content_type through the file io, and something similar or less precise through the filename / declared_type. It probably never finds application/vnd.ms-outlook as a valid content type for the filename / declared_type. If it had, it would have returned application/vnd.ms-outlook rather than application/x-ole-storage, since the vnd type is more precise.
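
The "most precise type wins" idea can be sketched as follows, assuming that "more precise" means "registered as a child of" in a parents table. This is an illustration and not Marcel's code: TYPE_PARENTS is a hypothetical stand-in holding only the two parent relations quoted earlier in this thread, and the method names are made up.

```ruby
# Sketch only: a stand-in parents table with the two relations quoted from
# Marcel's definitions.rb earlier in this thread.
module SpecificitySketch
  TYPE_PARENTS = {
    "application/vnd.ms-excel" => ["application/x-ole-storage"],
    "application/vnd.ms-powerpoint" => ["application/x-ole-storage"]
  }.freeze

  module_function

  # True when `child` is `parent` or descends from it in TYPE_PARENTS.
  def child?(child, parent)
    return true if child == parent

    TYPE_PARENTS.fetch(child, []).any? { |ancestor| child?(ancestor, parent) }
  end

  # Keeps the current pick unless the next candidate is one of its children.
  def most_specific_type(*candidates)
    candidates.compact.uniq.reduce do |type, candidate|
      child?(candidate, type) ? candidate : type
    end
  end
end

# Magic bytes say OLE storage, the filename says Excel: the child type wins.
SpecificitySketch.most_specific_type(
  "application/x-ole-storage", "application/vnd.ms-excel"
) # => "application/vnd.ms-excel"

# A candidate with no registered relation to the first pick does not replace it.
SpecificitySketch.most_specific_type(
  "application/x-ole-storage", "application/octet-stream"
) # => "application/x-ole-storage"
```

Under this assumption, whether a filename-derived type can override application/x-ole-storage depends on which parent relations are registered for it.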

@Mth0158
Collaborator

Mth0158 commented Feb 2, 2025

@kapso FYI 2.0.2 has been released with the Marcel extend fix.

Let me know if it solves your issue :)

3 participants