micro-lex.h
===========

A simple lexer written in C99 for Lisp-like languages and others.

Recognized tokens:

   OPEN_ROUND   // '('
   CLOSE_ROUND  // ')'
   OPEN_SQUARE  // '['
   CLOSE_SQUARE // ']'
   OPEN_CURLY   // '{'
   CLOSE_CURLY  // '}'
   ATOM         // see below
   COMMENT      // '#'
   QUOTE        // '\''

Recognized atoms:

   STRING    // "Hello World"
   INTEGER   // 69, -420
   FLOAT     // 69.420
   BOOL      // true, false
   SYMBOL    // print, +
   QUOTE     // 'test

Author:  Giovanni Santini
Mail:    [email protected]
Github:  @San7o


Example
-------

    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    #define MICRO_LEX_IMPLEMENTATION
    #include "micro-lex.h"

    int main(void)
    {
      char *input = "( 123 ]";
      int input_size = (int) strlen(input);
      int offset = 0;
      int token_len;

      // Skip leading whitespace, newlines, and tabs
      offset += micro_lex_trim_left(input, input_size, NULL, NULL);

      // Lex the next token from the remaining input
      MicroLexToken first_token =
               micro_lex_next_token(input + offset, input_size - offset,
                                    &token_len, NULL, NULL);
      printf("Lexed %s\n", micro_lex_get_token_str(first_token));

      assert(first_token == MICRO_LEX_OPEN_ROUND);
      offset += token_len;

      //...

      return 0;
    }


Usage
-----

Do this:

  #define MICRO_LEX_IMPLEMENTATION

before you include this file in *one* C or C++ file to create the
implementation.

That is, it should look like this:

  #include ...
  #include ...
  #include ...
  #define MICRO_LEX_IMPLEMENTATION
  #include "micro-lex.h"

You can tune the library by #defining certain values. See the
"Config" comments under "Configuration" in micro-lex.h.

This library provides two main functions: `micro_lex_trim_left`,
which returns the number of leading characters to trim from the
input, and `micro_lex_next_token`, which lexes the input and returns
a token type. Both functions accept additional arguments, which are
described in their documentation.

An atom is a particular token that holds a value, such as an integer
or a string.  If the lexer encounters an atom, it will allocate an
atom and save it to a user-provided pointer (if non-NULL).  You can
free, deep-copy, and get the string representation of an atom with
the micro_lex_atom_* functions.


Code
----

The official git repository of micro-lex.h is hosted at:

    https://github.com/San7o/micro-lex.h

This is part of a bigger collection of header-only C99 libraries
called "micro-headers". Contributions are welcome:

    https://github.com/San7o/micro-headers

This lexer was originally implemented for Haplolang:

    https://github.com/San7o/haplolang/
