Skip to content

Conversation

@glpetrikov
Copy link

@glpetrikov glpetrikov commented Nov 27, 2025

Add Premake as a separate language (premake5.lua)

Description

Premake is a quite popular Lua-based build configuration tool used extensively in the C++ community.

Currently premake5.lua files are classified as generic Lua, which significantly distorts language statistics for many repositories.

This PR adds Premake as a distinct language with its canonical filenames, official brand color, and a heuristic.

Example of a notable project using Premake:

  • Hazel Engine (12.7k stars, 1.6k forks) - Game engine by The Cherno, widely used as a learning resource in the C++ community

Official links:

Real World examples:
Source: https://github.com/TheCherno/glfw/blob/9bed794ab7c1b961aaca259403695bbd3870d3b3/premake5.lua
License: zlib/libpng (GLFW)

Checklist:

      workspace "MyWorkspace"
        configurations { "Debug", "Release" }
      
      project "MyProject"
        kind "ConsoleApp"
        language "C++"
        files { "**.h", "**.cpp" }
      
        filter { "configurations:Debug" }
          defines { "DEBUG" }
          symbols "On"
      
        filter { "configurations:Release" }
          defines { "NDEBUG" }
          optimize "On"
  Source: https://premake.github.io/docs/What-Is-Premake
  License: BSD-3-Clause (Premake documentation)
  • I have added a color

    • Hex value: #e67e22

    • Rationale: one of Premake's Official Brand Colors, used in the logo

  • I have updated the heuristics to distinguish my language from others using the same extension.

    • Matches Premake-specific top-level keywords while avoiding false positives on regular Lua scripts

@glpetrikov glpetrikov requested a review from a team as a code owner November 27, 2025 10:07
@glpetrikov
Copy link
Author

In my projects, as in many others, GitHub's language statistics give the impression that a significant portion of the code is written in Lua, especially if the project has multiple premake-files, when in fact these are just premake files. This PR helps distinguish premakes from regular Lua.

@lildude
Copy link
Member

lildude commented Nov 27, 2025

I'm don't think this should be pulled out into it's own entry as Premake appears to be a build system, like Bazel, that is written in Lua and not it's own language. We don't have a specific entry for Bazel as it's not a language. Instead it is included in Starlark which is the language Bazel uses.

Build systems are essentially like a framework written in another language. We don't add support for frameworks as that's not the language, hence we don't have support for React or Rails or Phoenix or any other framework.

@glpetrikov
Copy link
Author

glpetrikov commented Nov 27, 2025

Hey @lildude, thanks a lot for the quick review!

I totally understand the reasoning about build systems vs languages — Bazel → buildSystem, CMake → buildSystem, C++ → Lang, etc.

One important distinction: frameworks like React, Rails, or Phoenix are libraries used inside an existing language. Premake is not a library — it is a DSL-based build configuration system with a canonical filename, just like CMake, Meson, and Ninja.

Premake is a bit of a special case, and that’s why I opened this PR:

1. Filename-based, not extension-based

Premake files are always named exactly premake5.lua or premake4.lua (never .premake, never arbitrary filenames).
This is not a generic Lua script — it's the canonical entry point, just like CMakeLists.txt is for CMake.

2. CMake has been a separate language since 2012

CMake is also “just a build system with its own DSL”, yet it has its own entry, color, and heuristics.
The same logic has been applied to Meson, Ninja, Buck, etc.

3. Massive distortion of statistics

  • TheCherno/Hazel (12.7k) shows 2.3% Lua — 100% of that comes from the premake5.lua file.
  • My own project (FrameLog) shows 18% Lua — again, 100% of that is Premake files.
  • Thousands of C++ GitHub repositories show 5–35% “Lua” purely because of Premake.

This is exactly the same statistical problem that the CMake entry solved years ago.

4. Zero risk

The heuristic is extremely strict (matching top-level Premake-specific keywords like workspace, project, kind, cppdialect, include, etc.) and only applies to the two canonical filenames.

5. Avoiding misleading interpretations

Lua appearing in statistics makes some users assume the project requires an embedded Lua runtime or that part of its logic is written in Lua, which is not the case.
This is a common source of confusion for C/C++ repositories.

Additionally, one of Linguist’s core principles is to avoid skewed language statistics.
Leaving Premake unrecognized distorts statistics for a very large number of repositories in the C/C++ ecosystem.

This is literally the same precedent as CMake — just 13 years later.

I’d be happy to adjust anything needed, but I believe this fits very cleanly into Linguist’s existing policy for configuration-oriented DSLs with canonical filenames.

Thanks again for considering!

Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your detailed supporting explanation for this which makes sense to me.

Along with my inline comments, we need a real-world sample for every filename added. Please add a sample for each and update the PR template to state where they come from (not your fork of linguist 😉) and the licence of each sample.

- language: OASv3-yaml
pattern: 'openapi:\s?''?"?3.[0-9.]+''?"?'
- language: YAML
- extensions: ['.lua']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not adding an extension as part of your new entry so this heuristic is not needed as it'll never be used. Please remove it.

filenames:
- premake5.lua
- premake4.lua
interpreters:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is for what is likely to occur in shebangs. Your search results indicate these files never have a shebang so we don't need this section.

@glpetrikov
Copy link
Author

Thank you @lildude!

I've made all the necessary edits:

  • Removed the heuristics section (not needed for filename-based detection)
  • Removed the interpreters section (premake files don't use shebangs)
  • Added real-world samples for both premake4.lua and premake5.lua with sources and licenses

Ready for review!

@glpetrikov glpetrikov requested a review from lildude November 28, 2025 08:12
@glpetrikov
Copy link
Author

Hey @lildude, just wanted to check if there’s anything else needed from my side to get this PR merged? All requested edits have been applied and real-world samples added. Thanks!

Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have included a real-world usage sample for all extensions added in this PR:

You need to actually include the files in the PR, not the PR template, as they're used to train the classifier. We need at least one per filename being added with the template updated to link to each source (not your fork of linguist) with the license for each stated. The samples also need to be real-world examples so sourcing from the docs generally isn't a good indicator of real-world usage.

You'll also need to update the grammars README so that it reports the source of the grammar. The command to run is shown when you run the tests so I've approved CI to run on this PR.

@glpetrikov
Copy link
Author

Hey @lildude, I think I've made all the required changes.
I deleted premake4.lua because it's long outdated and hard to find a real example.
I also added a real-world premake5.lua example with source and license, and updated vendor/grammars/README.md with a link to it.

@glpetrikov glpetrikov requested a review from lildude December 9, 2025 14:16
@lildude
Copy link
Member

lildude commented Dec 9, 2025

You've not resolved the ordering test failure.

@glpetrikov
Copy link
Author

I hope I fixed it now?

@glpetrikov glpetrikov marked this pull request as draft December 9, 2025 16:14
@glpetrikov glpetrikov marked this pull request as ready for review December 9, 2025 16:31
@glpetrikov glpetrikov marked this pull request as draft December 9, 2025 16:37
@lildude
Copy link
Member

lildude commented Dec 9, 2025

I hope I fixed it now?

Nope. The entry needs to be in alphabetic order.

@glpetrikov glpetrikov marked this pull request as ready for review December 9, 2025 17:08
@glpetrikov
Copy link
Author

It's in alphabetical order, isn't it?
Something happened with my git, and the commits got mixed up for some reason. I rebased it and it seems to have fixed everything.

@glpetrikov
Copy link
Author

glpetrikov commented Dec 9, 2025

Look again, when you looked there, the commits were mixed up (and first five commits ended up at the top in reverse order and broke everything). It should be correct now.

@glpetrikov
Copy link
Author

Ready for review!

I've removed premake4.lua because it is outdated.
I’ve also added a real-world premake5.lua example with source code and a license, and updated vendor/README.md using a script.
Everything is now in alphabetical order.

@lildude
Copy link
Member

lildude commented Dec 10, 2025

Everything is now in alphabetical order.

It's not. If you run the tests you'll see it'll still fail. The section needs to go after Praat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants