Added standard MIME types for all existing types #44

Merged
AJMitev merged 2 commits into AJMitev:master from mentallabyrinth:feat/mime-types
Sep 26, 2025

Conversation

@mentallabyrinth

Based on issue #42, I took a pass at MIME type inclusion because it's something needed for the application I'm developing.

For the most part, the associations are one-to-one. I used this resource https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/MIME_types/Common_types for many of the types. For file types not listed, I used web search and the file command to verify MIME types (I'm developing on OSX, so Windows file type confirmation is out of reach). There are a few file types that are not straightforward: Executable and Mp4, to name a few.

Executable has three associated mime types:

  • application/x-msdownload: more specific, non-standard (indicated by the x- prefix) MIME type for executables on the Windows platform. Some web servers and applications may use it to indicate that a downloaded file is a Windows executable.
  • application/x-msdos-program: used for legacy MS-DOS programs but is sometimes still associated with Windows .exe files due to their backward-compatible MZ header.
  • application/vnd.microsoft.portable-executable: more specific, vendor-registered MIME type that explicitly defines a modern Windows Portable Executable (PE) file. It is the most technically accurate type for contemporary .exe files, but it is not as widely supported or implemented as the others.

I selected "application/x-msdownload" for executables as it's more widely accepted.

MP4 has three associated mime types:

  • video/mp4: for files containing both video and audio streams.
  • audio/mp4: for files containing only audio.
  • application/mp4: less common -- for files that contain neither visual nor audio streams, but rather other data encapsulated in the MPEG-4 container. Used for specialized applications.

I selected "video/mp4" for MP4, but I suspect the right implementation is to inspect the magic bytes and create separate types for each case, e.g., Mp4AudioVideo, Mp4Audio, and Mp4Application.
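As a rough illustration of the magic-byte idea above, here is a minimal sketch that reads the major brand from the leading "ftyp" box of an MP4 file. Mp4BrandSniffer and TryGuessMimeType are hypothetical names, not part of FileTypeChecker, and treating the "M4A " and "M4B " brands as audio-only is an assumption: the brand is only a hint, and a reliable answer would need to inspect the tracks inside the container. The sketch does not attempt to detect application/mp4.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of FileTypeChecker.
public static class Mp4BrandSniffer
{
    // Tries to guess a MIME type from the first bytes of an MP4 file.
    // Returns false when the buffer does not start with an "ftyp" box.
    public static bool TryGuessMimeType(ReadOnlySpan<byte> header, out string mimeType)
    {
        mimeType = string.Empty;

        // Layout: 4-byte box size, 4-byte box type ("ftyp"), 4-byte major brand.
        if (header.Length < 12)
            return false;

        string boxType = Encoding.ASCII.GetString(header.Slice(4, 4));
        if (boxType != "ftyp")
            return false;

        string majorBrand = Encoding.ASCII.GetString(header.Slice(8, 4));

        // Assumption: these brands mark audio-only files; everything else
        // is treated as video/mp4, matching the default chosen in this PR.
        mimeType = majorBrand switch
        {
            "M4A " or "M4B " => "audio/mp4",
            _ => "video/mp4",
        };
        return true;
    }
}
```

A caller would pass the first bytes already read for signature detection, so no extra I/O is needed.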

But before putting more effort into this endeavor I want to get your feedback and verify this implementation is the right path to travel.

Btw, https://github.com/AJMitev/FileTypeChecker/blob/master/CONTRIBUTING.md returns a 404, but I did my best to follow the development patterns in the code base, including tests. Thank you for the excellent work on the library; I look forward to your feedback.

jerone referenced this pull request Sep 22, 2025
@AJMitev
Owner

AJMitev commented Sep 22, 2025

Could you please share some details about why and where this need is coming from? Why is this change beneficial?

@AJMitev AJMitev self-assigned this Sep 22, 2025
@mentallabyrinth
Author

Could you please share some details about why and where this need is coming from? Why is this change beneficial?

Sure @AJMitev. The system that I manage uses MIME types, not extensions, as a routing mechanism for determining which post-processing steps are required after a file is uploaded. Once a file is checked into the system, the old file name is replaced with an auto-generated one (the extension is lost), so the only way we can identify the file type is by MIME type.

In addition to my use case, I believe this change is beneficial because MIME types act as an identifier by category. The hierarchical nature of the type and subtype is well organized and useful. For example, I can write concise code like this:

if (file.MediaType.StartsWith("image"))
{
    await ProcessImageAsync(file, ct);
    return;
}

Rather than something like this:

if (file.Name.EndsWith("png") || file.Name.EndsWith("jpg") || file.Name.EndsWith("webp") || file.Name.EndsWith("heic"))
{
    await ProcessImageAsync(file, ct);
    return;
}

Simply put, the MIME type serves as a standardized, hierarchical, content-based label for files that we find to be more useful than plain extensions. I believe this augmentation will be useful for other users of this library too.
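To make the routing idea concrete, here is a small sketch that branches on the top-level type of a stored MIME string. MediaTypeRouter and ProcessingRoute are hypothetical names, not part of this PR or the library. Splitting on the slash rather than calling StartsWith("image") avoids matching an unrelated type that merely begins with the same letters, and lowercasing the value reflects that MIME type names are case-insensitive.

```csharp
using System;

// Hypothetical routing categories for illustration.
public enum ProcessingRoute { Image, Audio, Video, Other }

// Hypothetical helper; not part of FileTypeChecker.
public static class MediaTypeRouter
{
    // Maps a MIME type such as "image/png" to a post-processing route
    // based only on its top-level type.
    public static ProcessingRoute Route(string mediaType)
    {
        if (string.IsNullOrWhiteSpace(mediaType))
            return ProcessingRoute.Other;

        int slash = mediaType.IndexOf('/');
        if (slash <= 0)
            return ProcessingRoute.Other;

        // MIME type names are case-insensitive, so normalize before comparing.
        string topLevel = mediaType.Substring(0, slash).Trim().ToLowerInvariant();
        return topLevel switch
        {
            "image" => ProcessingRoute.Image,
            "audio" => ProcessingRoute.Audio,
            "video" => ProcessingRoute.Video,
            _ => ProcessingRoute.Other,
        };
    }
}
```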

Thanks for taking the time to look at my PR :)

@AJMitev
Owner

AJMitev commented Sep 23, 2025

In your example what is the type of "file"? Is it a stream, a byte array or other? Can't you just use the extension methods that the library currently provides?

bool isImage = file.IsImage();
bool isPdf = file.Is<PortableDocumentFormat>();

@mentallabyrinth
Author

In your example what is the type of "file"? Is it a stream, a byte array or other? Can't you just use the extension methods that the library currently provides?

bool isImage = file.IsImage();
bool isPdf = file.Is<PortableDocumentFormat>();

The example provided represents a post-processing step after the file has been uploaded and identified (using the changes in this PR). Here, file references a DB record, so the extension methods wouldn't work for this case. In this scenario, it's cheaper to access the DB record than the file content itself.

@jerone

jerone commented Sep 24, 2025

I was actually hoping the mime-type would be part of the validation check for IFormFile, to validate if the mime-type prop is also valid.

The validation should probably be part of FileTypeChecker.Web.

@AJMitev
Owner

AJMitev commented Sep 25, 2025

@mentallabyrinth, actually I am not sure how this change helps your case if you have a DB record. To obtain the FileType you have to call FileTypeValidator.GetFileType with either a Stream or a byte[].

@jerone Why do you worry about the MIME type if the file signature is validated?

@mentallabyrinth
Author

@AJMitev here's a visualization for my previous comment:
[image]

While the file is uploading, we scan the first 4000 bytes using FileTypeChecker. With the proposed code changes, the MIME type is discovered and saved with the other file details. Then the file is uploaded into cloud storage, and the ID of the record is put into a queue. A post-processing step downstream from the upload step reads the queue and then performs async operations on the file record based on its MIME type.

Reading the file bytes again from cloud storage to get the MIME Type is not efficient for our operation for the following reasons:

  • Reading from cloud storage is slower than accessing the DB record
  • It is cheaper in the long run to read a DB record than to read the file
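The flow described above can be sketched roughly as follows. Everything here is hypothetical: FileRecord, UploadPipeline, and the in-memory dictionary and queue stand in for the real database, cloud storage, and message queue, and the detectMediaType delegate stands in for whatever call returns the MIME type from the file header. The point is only that detection happens once, during upload, and the downstream step reads the stored value instead of the file bytes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical DB record; the original file name and extension are not kept.
public sealed record FileRecord(Guid Id, string StoredName, string MediaType);

// Hypothetical pipeline using in-memory stand-ins for the DB and the queue.
public sealed class UploadPipeline
{
    private const int HeaderSize = 4000; // bytes scanned during upload
    private readonly Func<byte[], string> _detectMediaType;

    public Dictionary<Guid, FileRecord> Records { get; } = new();
    public Queue<Guid> PostProcessingQueue { get; } = new();

    public UploadPipeline(Func<byte[], string> detectMediaType) =>
        _detectMediaType = detectMediaType;

    // Upload step: detect the MIME type from the header, save the record,
    // and enqueue its ID. Copying the content to cloud storage is omitted.
    public FileRecord Upload(Stream content)
    {
        byte[] buffer = new byte[HeaderSize];
        int read = 0;
        while (read < buffer.Length)
        {
            int n = content.Read(buffer, read, buffer.Length - read);
            if (n == 0) break;
            read += n;
        }

        byte[] header = buffer.AsSpan(0, read).ToArray();
        var record = new FileRecord(
            Guid.NewGuid(), Guid.NewGuid().ToString("N"), _detectMediaType(header));
        Records[record.Id] = record;
        PostProcessingQueue.Enqueue(record.Id);
        return record;
    }

    // Post-processing step: read the stored MIME type from the record only.
    public bool TryReadNextMediaType(out string mediaType)
    {
        mediaType = string.Empty;
        if (!PostProcessingQueue.TryDequeue(out Guid id)) return false;
        mediaType = Records[id].MediaType;
        return true;
    }
}
```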

@jerone

jerone commented Sep 25, 2025

@jerone Why do you worry about the MIME type if the file signature is validated?

For the same reason the extension should be validated too; to confirm the IFormFile is actually correct.
Right now it's possible to craft an IFormFile with an invalid mime-type and a filename that does not match the magic bytes in the stream.

Anyway, we thought the FileTypeChecker.Web package validated the whole IFormFile, including filename and mime-type. But after some testing and reading the source, we noticed it only checks the stream. So we created our own validation attribute that first checks the magic bytes and then matches those to the filename extension and mime-type for a complete validation result.
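The custom attribute itself is not shown in this thread, but the comparison it describes could look roughly like the sketch below. DetectedType and UploadConsistencyCheck are hypothetical names; the detected extension and MIME type are assumed to come from a prior magic-byte check, and the file name and declared content type are assumed to be the values the client sent. Extension aliases such as jpg versus jpeg are not handled here.

```csharp
using System;
using System.IO;

// Hypothetical result of a magic-byte check (extension without the dot).
public sealed record DetectedType(string Extension, string MimeType);

// Hypothetical helper; not the attribute mentioned in the comment above.
public static class UploadConsistencyCheck
{
    // Returns true when both the file name's extension and the declared
    // content type agree with what the magic bytes indicated.
    public static bool IsConsistent(
        string fileName, string declaredContentType, DetectedType detected)
    {
        // Path.GetExtension returns ".png" for "photo.png", so drop the dot.
        string extension = Path.GetExtension(fileName).TrimStart('.');

        // Ignore parameters such as "; charset=utf-8" on the declared type.
        string declared = declaredContentType.Split(';')[0].Trim();

        return extension.Equals(detected.Extension, StringComparison.OrdinalIgnoreCase)
            && declared.Equals(detected.MimeType, StringComparison.OrdinalIgnoreCase);
    }
}
```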

@AJMitev
Owner

AJMitev commented Sep 26, 2025

@mentallabyrinth Great explanation.

@jerone I believe MIME types are overused when working with files because there is no easy way to check a file's signature. I still don't get why people care so much about the MIME type or the file's extension when you can know the exact FileType. The whole point of this library is to validate the uploaded file through its signature and not to care about other stuff, but I believe this pull request will introduce the concept of knowing the MIME type.

Great work creating your own attribute. My second goal was to make it possible to write your own custom attribute when you have a specific use case. Btw, if you feel it matches a common use case and can help others, you can contribute it to the project, or you can share it so I can include it in a future release.

@AJMitev AJMitev self-requested a review September 26, 2025 04:27
@AJMitev AJMitev merged commit 31975ac into AJMitev:master Sep 26, 2025
2 checks passed
@AJMitev AJMitev linked an issue Sep 26, 2025 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

[Feature] Add Support for Getting MIME Type

4 participants