Gregex

Gregex is a powerful regular expression library that compiles regex patterns to nondeterministic finite automata (NFAs) at compile time using Glushkov's construction algorithm. Write regex patterns as strings and let Rust's procedural macros do the rest!

Features

String-based regex parsing: Write natural regex syntax like regex!("(a|b)+")
Compile-time construction: Zero runtime regex parsing overhead
Type-safe: Leverages Rust's procedural macros for safety
NFA-based matching: Uses Glushkov's construction for efficient matching
Rich operator support: *, +, ?, |, concatenation, and grouping

Quick Start

Add gregex to your project's Cargo.toml by running:

cargo add --git https://github.com/Saphereye/gregex

Simple Example

use gregex::*;

fn main() {
    // Natural regex syntax - parsed at compile time!
    let pattern = regex!("(a|b)+c");
    
    // Use standard regex API methods
    assert!(pattern.is_match("abc"));      // Find pattern anywhere
    assert!(pattern.is_match("prefix_abc_suffix"));
    assert_eq!(pattern.find("xabcy"), Some((1, 4)));  // Get match position
}

API Methods

Gregex provides a standard regex API similar to Rust's regex crate:

| Method | Description | Example |
| --- | --- | --- |
| `is_match(text)` | Check if pattern exists in text | `pattern.is_match("hello")` |
| `find(text)` | Get first match position | `pattern.find("text")` → `Some((start, end))` |
| `find_iter(text)` | Iterator over all matches | `pattern.find_iter("text").collect()` |
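
To illustrate how the three methods relate, here is a minimal sketch in plain Rust that does not use gregex itself. It assumes leftmost, non-overlapping matching and `(start, end)` offsets with an exclusive end, which is consistent with the `find("xabcy")` example above but is not confirmed by the source. The functions `match_at`, `find`, `is_match`, and `find_all` are hypothetical stand-ins, and the anchored matcher is hand-written for the single pattern `(a|b)+c`.

```rust
// Hypothetical anchored matcher for the pattern (a|b)+c: returns the end
// offset of a match that starts exactly at `start`, if there is one.
fn match_at(text: &[u8], start: usize) -> Option<usize> {
    let mut i = start;
    while i < text.len() && (text[i] == b'a' || text[i] == b'b') {
        i += 1;
    }
    // At least one a/b must have been read, and the next byte must be c.
    if i > start && i < text.len() && text[i] == b'c' {
        Some(i + 1)
    } else {
        None
    }
}

// Unanchored search: try every start offset from `from`; the leftmost match wins.
fn find(text: &str, from: usize) -> Option<(usize, usize)> {
    let bytes = text.as_bytes();
    (from..bytes.len()).find_map(|s| match_at(bytes, s).map(|e| (s, e)))
}

// "Does the pattern occur anywhere?" is just "is there a first match?".
fn is_match(text: &str) -> bool {
    find(text, 0).is_some()
}

// All non-overlapping matches, resuming the search at the end of each match.
fn find_all(text: &str) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let mut pos = 0;
    while let Some((s, e)) = find(text, pos) {
        out.push((s, e));
        pos = e; // matches of this pattern are never empty, so this always advances
    }
    out
}

fn main() {
    assert!(is_match("prefix_abc_suffix"));
    assert_eq!(find("xabcy", 0), Some((1, 4)));
    println!("{:?}", find_all("abc--bac")); // prints [(0, 3), (5, 8)]
}
```

Whether gregex's `find_iter` reports overlapping matches or handles empty matches differently is not stated here, so treat the iteration strategy as one reasonable choice rather than a description of the library.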

Regex Syntax Reference

When using string-based syntax with regex!("..."), the following operators are supported:

| Syntax | Description | Example | Matches |
| --- | --- | --- | --- |
| `a`, `b`, `c` | Literal characters | `regex!("abc")` | "abc" |
| `ab` | Concatenation (implicit) | `regex!("hello")` | "hello" |
| `a\|b` | Alternation (OR) | `regex!("a\|b")` | "a" or "b" |
| `a*` | Kleene star (zero or more) | `regex!("a*")` | "", "a", "aa", ... |
| `a+` | Plus (one or more) | `regex!("a+")` | "a", "aa", "aaa", ... |
| `a?` | Question (zero or one) | `regex!("a?")` | "" or "a" |
| `(...)` | Grouping for precedence | `regex!("(ab)+")` | "ab", "abab", ... |
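
The table implies the usual precedence: postfix operators (`*`, `+`, `?`) bind tightest, then concatenation, then alternation, with parentheses overriding all three. The following recursive-descent parser is a sketch of that precedence only. It is not gregex's parser, and `Ast`, `Parser`, and `parse` are hypothetical names introduced for illustration.

```rust
// Hypothetical AST for the operators listed in the table above.
#[derive(Debug, PartialEq)]
enum Ast {
    Lit(char),
    Concat(Box<Ast>, Box<Ast>),
    Alt(Box<Ast>, Box<Ast>),
    Star(Box<Ast>),
    Plus(Box<Ast>),
    Opt(Box<Ast>),
}

struct Parser {
    chars: Vec<char>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<char> {
        self.chars.get(self.pos).copied()
    }

    // alternation := concat ('|' concat)*    (lowest precedence)
    fn alternation(&mut self) -> Result<Ast, String> {
        let mut left = self.concat()?;
        while self.peek() == Some('|') {
            self.pos += 1;
            let right = self.concat()?;
            left = Ast::Alt(Box::new(left), Box::new(right));
        }
        Ok(left)
    }

    // concat := postfix+    (implicit, stops at '|' or ')')
    fn concat(&mut self) -> Result<Ast, String> {
        let mut left = self.postfix()?;
        while let Some(c) = self.peek() {
            if c == '|' || c == ')' {
                break;
            }
            let right = self.postfix()?;
            left = Ast::Concat(Box::new(left), Box::new(right));
        }
        Ok(left)
    }

    // postfix := atom ('*' | '+' | '?')*    (highest precedence)
    fn postfix(&mut self) -> Result<Ast, String> {
        let mut node = self.atom()?;
        while let Some(c) = self.peek() {
            node = match c {
                '*' => Ast::Star(Box::new(node)),
                '+' => Ast::Plus(Box::new(node)),
                '?' => Ast::Opt(Box::new(node)),
                _ => break,
            };
            self.pos += 1;
        }
        Ok(node)
    }

    // atom := literal | '(' alternation ')'
    fn atom(&mut self) -> Result<Ast, String> {
        match self.peek() {
            Some('(') => {
                self.pos += 1;
                let inner = self.alternation()?;
                if self.peek() != Some(')') {
                    return Err(format!("expected ')' at position {}", self.pos));
                }
                self.pos += 1;
                Ok(inner)
            }
            Some(c) if !"|)*+?".contains(c) => {
                self.pos += 1;
                Ok(Ast::Lit(c))
            }
            other => Err(format!("unexpected {:?} at position {}", other, self.pos)),
        }
    }
}

fn parse(pattern: &str) -> Result<Ast, String> {
    let mut p = Parser { chars: pattern.chars().collect(), pos: 0 };
    let ast = p.alternation()?;
    if p.pos != p.chars.len() {
        return Err(format!("unexpected {:?} at position {}", p.peek(), p.pos));
    }
    Ok(ast)
}

fn main() {
    // "ab|c" groups as (ab)|c; "a(b|c)" needs the explicit parentheses.
    println!("{:?}", parse("ab|c"));
    println!("{:?}", parse("a(b|c)"));
    // Postfix operators apply to the nearest atom: "ab+" is a(b+), not (ab)+.
    println!("{:?}", parse("ab+"));
}
```

In this sketch, `regex!("ab+")` would correspond to `a` followed by one or more `b`, which is why the table's `(ab)+` example needs the group to repeat the whole sequence.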

Wildcard Patterns

Note: The . wildcard (match any character) and .* patterns are not currently supported in the parser. However:

Use (a|b|c)* to match specific character sets with repetition
Use alternation (a|b|c)+ for one-or-more of specific characters
The is_match() method searches for the pattern anywhere in the text, so pattern.is_match(text) already behaves like matching .*pattern.* against the whole text in a standard regex engine

Future Enhancement: Full wildcard support (. and \w, \d, etc.) is planned for a future version.

Usage Examples

1. String-Based Syntax (Recommended)

The most natural and recommended way to use Gregex:

use gregex::*;

fn main() {
    // Simple patterns
    let pattern = regex!("a+@b+");
    assert!(pattern.is_match("aaa@bbb"));
    assert!(pattern.is_match("prefix_aa@bb_suffix"));

    // Complex patterns with operators
    let identifier = regex!("(a|b)(a|b|c)*");
    assert!(identifier.is_match("abc"));
    assert!(identifier.is_match("bca"));

    // Find match positions
    let pattern = regex!("a+b?c*");
    if let Some((start, end)) = pattern.find("xyzaabccxyz") {
        println!("Found match from {} to {}", start, end);
    }

    // Nested grouping
    let nested = regex!("((a|b)+c)*");
    assert!(nested.is_match("acbc"));
}

Examples

Run the included examples to see gregex in action:

Basic Operator Examples

These examples demonstrate individual regex operators:

# Basic concatenation (matching "abc")
cargo run --example 01_basic_concatenation

# Alternation/OR operator (a|b|c)
cargo run --example 02_alternation

# Kleene star - zero or more (a*)
cargo run --example 03_kleene_star

# Plus operator - one or more (a+)
cargo run --example 04_plus_operator

# Question operator - zero or one (a?)
cargo run --example 05_question_operator

# Grouping and operator precedence
cargo run --example 06_grouping_and_precedence

Advanced Examples

# Complete API methods demonstration
cargo run --example 07_api_methods

# Compile-time NFA construction verification
cargo run --example 08_compile_time_construction

Use Case Examples

Real-world applications demonstrating practical pattern matching:

# Validate programming identifiers
cargo run --example usecase_identifier_validator

# Match URL-like paths
cargo run --example usecase_simple_url_matcher

# Search for patterns in text
cargo run --example usecase_text_search

How It Works

Gregex uses Glushkov's construction algorithm to convert regular expressions into NFAs:

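Linearization: Each symbol occurrence in the regex is assigned a unique index
Set Construction: Computes the prefix set (symbols that can start a match), the suffix set (symbols that can end a match), the factors set (pairs of symbols that can appear consecutively), and whether the expression is nullable (matches the empty string)
NFA Generation: Constructs the NFA from these sets
Simulation: Runs the input string through the NFA to determine if it matches

For a regex with n symbol occurrences, this construction produces an NFA with exactly n + 1 states and no epsilon transitions, which keeps the simulation step simple.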
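
The steps above can be sketched in plain Rust. This is an illustrative sketch, not gregex's internal code: `Node`, `Info`, `analyze`, and `accepts` are hypothetical names, the pattern `(a|b)+c` is linearized by hand rather than parsed, everything runs at run time instead of inside a procedural macro, and `accepts` checks the whole input rather than searching within it. In the code, `first`, `last`, and `follow` correspond to the prefix, suffix, and factors sets.

```rust
use std::collections::BTreeSet;

// Hypothetical linearized AST: each symbol carries its position index (1-based).
enum Node {
    Sym(usize),
    Cat(Box<Node>, Box<Node>),
    Alt(Box<Node>, Box<Node>),
    Plus(Box<Node>),
}

type Set = BTreeSet<usize>;

// The Glushkov ingredients for a (sub)expression.
struct Info {
    nullable: bool,              // does it match the empty string?
    first: Set,                  // positions that can start a match
    last: Set,                   // positions that can end a match
    follow: Vec<(usize, usize)>, // pairs (p, q): q may come directly after p
}

fn analyze(n: &Node) -> Info {
    match n {
        Node::Sym(p) => Info {
            nullable: false,
            first: Set::from([*p]),
            last: Set::from([*p]),
            follow: Vec::new(),
        },
        Node::Alt(l, r) => {
            let (a, b) = (analyze(l), analyze(r));
            let mut follow = a.follow;
            follow.extend(b.follow);
            Info {
                nullable: a.nullable || b.nullable,
                first: &a.first | &b.first,
                last: &a.last | &b.last,
                follow,
            }
        }
        Node::Cat(l, r) => {
            let (a, b) = (analyze(l), analyze(r));
            let mut follow = a.follow;
            follow.extend(b.follow);
            // Anything that ends the left side may be followed by anything
            // that starts the right side.
            for &p in &a.last {
                for &q in &b.first {
                    follow.push((p, q));
                }
            }
            Info {
                nullable: a.nullable && b.nullable,
                first: if a.nullable { &a.first | &b.first } else { a.first.clone() },
                last: if b.nullable { &a.last | &b.last } else { b.last.clone() },
                follow,
            }
        }
        Node::Plus(inner) => {
            let mut a = analyze(inner);
            // Repetition: the end of one iteration may be followed by the
            // start of the next.
            for &p in &a.last {
                for &q in &a.first {
                    a.follow.push((p, q));
                }
            }
            a
        }
    }
}

// Simulate the NFA: state 0 is the start state, state p > 0 means
// "the symbol at position p was just read". Whole-input match only.
fn accepts(info: &Info, symbols: &[char], input: &str) -> bool {
    let mut current = Set::from([0]);
    for c in input.chars() {
        let mut next = Set::new();
        if current.contains(&0) {
            next.extend(info.first.iter().copied().filter(|&q| symbols[q] == c));
        }
        for &(p, q) in &info.follow {
            if current.contains(&p) && symbols[q] == c {
                next.insert(q);
            }
        }
        current = next;
    }
    current.iter().any(|&s| if s == 0 { info.nullable } else { info.last.contains(&s) })
}

fn main() {
    // Linearized form of (a|b)+c: a -> 1, b -> 2, c -> 3.
    let ast = Node::Cat(
        Box::new(Node::Plus(Box::new(Node::Alt(
            Box::new(Node::Sym(1)),
            Box::new(Node::Sym(2)),
        )))),
        Box::new(Node::Sym(3)),
    );
    let symbols = ['\0', 'a', 'b', 'c']; // index 0 is a placeholder for the start state
    let info = analyze(&ast);
    println!("first = {:?}, last = {:?}", info.first, info.last);
    println!("follow = {:?}", info.follow);
    assert!(accepts(&info, &symbols, "abac"));
    assert!(!accepts(&info, &symbols, "c"));
}
```

With three symbol occurrences, the sketch uses four states (0 through 3), matching the n + 1 state count of Glushkov's construction.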

Testing

Run the comprehensive test suite:

cargo test --workspace

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
