Constructing Parsers
Character Matchers
CombinedParsers.AnyChar — Function
AnyChar() = AnyValue(Char)
CombinedParsers.AnyValue — Type
AnyValue(T=Char)
Parser matching exactly one x::T, returning the value.
julia> AnyChar()
. AnyValue
::Char
CombinedParsers.Bytes — Type
Bytes{N,T} <: NIndexParser{N,T}
Fast parsing of a fixed number N of indices. The matched vector is converted to T with reinterpret(T,match)[1] if isbitstype(T), or with the T(match) constructor otherwise.
Provide Base.get(parser::Bytes{N,T}, sequence, till, after, i, state) where {N,T} for custom conversion.
The byte order (endianness) can be swapped by mapping bswap over the parser:
julia> map(bswap, Bytes(2,UInt16))([0x16,0x11])
0x1611
julia> Bytes(2,UInt16)([0x16,0x11])
0x1116
CombinedParsers.ValueMatcher — Type
A ValueMatcher matches the value at point c if and only if ismatch(c, parser). A ValueMatcher{T} is an NIndexParser{1,T} and has state_type MatchState.
See AnyValue, ValueIn, and ValueNotIn.
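The byte-order remark for Bytes above can be checked with Base alone. The following is a minimal sketch that uses no CombinedParsers API; the variable names are illustrative, and ltoh/ntoh are used so that the values do not depend on the byte order of the host.

```julia
bytes = [0x16, 0x11]

# Reinterpret the two bytes as one UInt16 in the host's native byte order.
native = reinterpret(UInt16, bytes)[1]

# Read the same two bytes explicitly as little-endian and as big-endian.
little = ltoh(native)
big = ntoh(native)

# bswap reverses the byte order of a value, turning one reading into the other.
swapped = bswap(little)
```

On a little-endian host, native equals little, which is why the unmapped Bytes(2,UInt16) example returns 0x1116 there.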
CombinedParsers.CharIn — Function
CharIn(a...; kw...) = ValueIn{Char}(a...; kw...)
CombinedParsers.UnicodeClass — Type
UnicodeClass(unicode_category::Symbol...)
Used in ValueIn and ValueNotIn; succeeds if the char at the cursor is in one of the given Unicode classes.
julia> match(ValueIn(:L), "aB")
ParseMatch("a")
julia> match(ValueIn(:Lu), "aB")
ParseMatch("B")
julia> match(ValueIn(:N), "aA1")
ParseMatch("1")
Supported Unicode classes
julia> for (k,v) in CombinedParsers.unicode_class
println(":",k, " is a ",v[1],", ", v[2],".")
end
:L is a Letter, any kind of letter from any language.
:Ll is a Lowercase Letter, a lowercase letter that has an uppercase variant.
:Lu is a Uppercase Letter, an uppercase letter that has a lowercase variant.
:Lt is a Titlecase Letter, a letter that appears at the start of a word when only the first letter of the word is capitalized.
:L& is a Cased Letter, a letter that exists in lowercase and uppercase variants (combination of Ll, Lu and Lt).
:Lm is a Modifier Letter, a special character that is used like a letter.
:Lo is a Other Letter, a letter or ideograph that does not have lowercase and uppercase variants.
:M is a Mark, a character intended to be combined with another character (e.g. accents, umlauts, enclosing boxes, etc.).
:Mn is a Non Spacing Mark, a character intended to be combined with another character without taking up extra space (e.g. accents, umlauts, etc.).
:Mc is a Spacing Combining Mark, a character intended to be combined with another character that takes up extra space (vowel signs in many Eastern languages).
:Me is a Enclosing Mark, a character that encloses the character it is combined with (circle, square, keycap, etc.).
:Z is a Separator, any kind of whitespace or invisible separator.
:Zs is a Space Separator, a whitespace character that is invisible, but does take up space.
:Zl is a Line Separator, line separator character U+2028.
:Zp is a Paragraph Separator, paragraph separator character U+2029.
:S is a Symbol, math symbols, currency signs, dingbats, box-drawing characters, etc..
:Sm is a Math Symbol, any mathematical symbol.
:Sc is a Currency Symbol, any currency sign.
:Sk is a Modifier Symbol, a combining character (mark) as a full character on its own.
:So is a Other Symbol, various symbols that are not math symbols, currency signs, or combining characters.
:N is a Number, any kind of numeric character in any script.
:Nd is a Decimal Digit Number, a digit zero through nine in any script except ideographic scripts.
:Nl is a Letter Number, a number that looks like a letter, such as a Roman numeral.
:No is a Other Number, a superscript or subscript digit, or a number that is not a digit 0–9 (excluding numbers from ideographic scripts).
:P is a Punctuation, any kind of punctuation character.
:Pc is a Connector Punctuation, a punctuation character such as an underscore that connects words.
:Pd is a Dash Punctuation, any kind of hyphen or dash.
:Ps is a Open Punctuation, any kind of opening bracket.
:Pe is a Close Punctuation, any kind of closing bracket.
:Pi is a Initial Punctuation, any kind of opening quote.
:Pf is a Final Punctuation, any kind of closing quote.
:Po is a Other Punctuation, any kind of punctuation character that is not a dash, bracket, quote or connector.
:C is a Other, invisible control characters and unused code points.
:Cc is a Control, an ASCII or Latin-1 control character: 0x00–0x1F and 0x7F–0x9F.
:Cf is a Format, invisible formatting indicator.
:Cs is a Surrogate, one half of a surrogate pair in UTF-16 encoding.
:Co is a Private Use, any code point reserved for private use.
:Cn is a Unassigned, any code point to which no character has been assigned.
CombinedParsers.ValueIn — Type
ValueIn(x)
Parser matching exactly one element c (character) in a sequence, if and only if _ismatch(c,x).
julia> a_z = ValueIn('a':'z')
[a-z] ValueIn
::Char
julia> parse(a_z, "a")
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> ac = CharIn("ac")
[ac] ValueIn
::Char
julia> parse(ac, "c")
'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
julia> l = CharIn(islowercase)
[islowercase(...)] ValueIn
::Char
julia> parse(l, "c")
'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
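The two-letter class names listed above are the Unicode general category abbreviations that Julia also reports for a Char. As a rough illustration of how a one-letter class such as :L covers its two-letter subclasses, here is a sketch with a hypothetical helper in_unicode_class. It is not part of CombinedParsers, and it relies on the unexported Base.Unicode.category_abbrev function.

```julia
# Hypothetical helper (not a CombinedParsers function): test a character
# against a one- or two-letter Unicode general category such as :L, :Lu or :Nd.
function in_unicode_class(c::Char, class::Symbol)
    abbrev = Base.Unicode.category_abbrev(c)   # e.g. "Ll" for 'a'
    # A one-letter class matches every category that starts with that letter.
    return startswith(abbrev, String(class))
end

lower_is_letter = in_unicode_class('a', :L)
upper_is_uppercase = in_unicode_class('B', :Lu)
lower_is_uppercase = in_unicode_class('a', :Lu)
digit_is_number = in_unicode_class('1', :N)
```

The combined class :L& (Ll, Lu and Lt) is not a prefix of any abbreviation, so this prefix test does not cover it.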
CombinedParsers.CharNotIn — Function
CharNotIn(a...; kw...) = ValueNotIn{Char}(a...; kw...)
CombinedParsers.ValueNotIn — Type
ValueNotIn{T}(label::AbstractString, x)
Parser matching exactly one element (character) in a sequence, if and only if it is not in x.
ValueNotIn([label::AbstractString="", ]x...)
ValueNotIn{T}([label::AbstractString="", ]x...)
Flattens x with CombinedParsers.flatten_valuepatterns, and tries to infer T if not provided.
julia> a_z = CharNotIn('a':'z')
[^a-z] ValueNotIn
::Char
julia> ac = CharNotIn("ca")
[^ca] ValueNotIn
::Char
Respects boolean logic:
julia> CharNotIn(CharNotIn("ab"))("a")
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
Respects boolean logic:
julia> CharIn(CharIn("ab"))("a")
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> CharIn(CharNotIn("bc"))("a")
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> parse(CharNotIn(CharIn("bc")), "a")
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
CombinedParsers.ismatch — Function
ismatch(c,p)
Returns _ismatch(c, p).
ismatch(c::MatchingNever,p)
Returns false.
CombinedParsers._ismatch — Function
_ismatch(x::Char, set::Union{Tuple,Vector})::Bool
Returns _ismatch(x,set...).
_ismatch(x, f, r1, r...)
Checks if x matches any of the options f, r1, r...: if ismatch(x,f), returns true; otherwise returns _ismatch(x, r1, r...).
_ismatch(x)
Returns false (out of options).
_ismatch(x, p)
Returns x==p.
_ismatch(c,p::Function)
Returns p(c).
_ismatch(c,p::AnyValue)
Returns true.
_ismatch(c,p::Union{StepRange,Set})
Returns c in p.
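The method list above describes a recursion over a variable number of options. A minimal Base-only sketch of that dispatch pattern could look as follows; the function name matches is hypothetical, AbstractRange is used here as an assumption in place of StepRange, and this is not the library's actual code.

```julia
# Hypothetical re-creation of the dispatch pattern (not CombinedParsers code).
matches(x) = false                                      # out of options
matches(x, p) = x == p                                  # plain value
matches(x, p::Function) = p(x)                          # predicate
matches(x, p::Union{AbstractRange,Set}) = x in p        # collection membership
matches(x, p::Union{Tuple,Vector}) = matches(x, p...)   # splat into options
# Three or more arguments: try the first option, then recurse on the rest.
matches(x, f, r1, r...) = matches(x, f) || matches(x, r1, r...)

in_range = matches('c', 'a':'z')
by_second_option = matches('A', islowercase, 'A')
no_option_fits = matches('A', islowercase, 'b')
from_vector = matches('q', ['x', 'q'])
```

Because each option kind has its own method, supporting a new kind of option only requires adding one more two-argument method.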
Base.Broadcast.broadcasted — Function
Base.broadcasted(::typeof((&)), x::ValueNotIn, y::ValueNotIn)
Character matchers m like Union{ValueIn,ValueNotIn,T}, or any type T providing an ismatch(m::T,c::Char)::Bool method, represent a "sparse" bitarray for all characters.
Please consider the broadcast API a draft; you are invited to comment on it.
julia> CharNotIn("abc") .& CharNotIn("z")
[^abcz] ValueNotIn
::Char
julia> CharIn("abc") .& CharNotIn("c")
[ab] ValueIn
::Char
CombinedParsers.flatten_valuepatterns — Function
flatten_valuepatterns(x...)
Used in ValueMatcher constructors.
Heuristic is roughly:
- collect ElementIterators in a Set
- collect everything else in a Tuple (Functions etc.)
- in the process the label is concatenated
- return all that was collected as Tuple{String, <:Set, <:Tuple} or Tuple{String, <:Set} or Tuple{String, <:Tuple}.
Repeating
CombinedParsers.Repeat — Type
Repeat(minmax::UnitRange, x...)
Repeat(x...; min=0,max=Repeat_max)
Repeat(min::Integer, x...)
Repeat(min::Integer,max::Integer, x...)
Parser repeating pattern x min:max times.
julia> Repeat(2,2,'a')
a{2} |> Repeat
::Vector{Char}
julia> Repeat(3,'a')
a{3,} |> Repeat
::Vector{Char}
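The printed forms a{2} and a{3,} above are PCRE quantifiers, so the min:max semantics can be cross-checked with Julia's built-in Regex. This is only a regex analogue with illustrative variable names, not a use of the Repeat parser itself.

```julia
exactly_two = r"^a{2}$"      # regex counterpart of Repeat(2,2,'a')
three_or_more = r"^a{3,}$"   # regex counterpart of Repeat(3,'a')

two_ok = occursin(exactly_two, "aa")
two_too_long = occursin(exactly_two, "aaa")
three_ok = occursin(three_or_more, "aaaaa")
three_too_short = occursin(three_or_more, "aa")
```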
Base.:| — Method
(|)(x::AbstractToken{T}, default::Union{T,Missing})
Operator syntax for Optional(x, default=default).
julia> parser("abc") | "nothing"
|🗄 Either
├─ abc
└─ nothing
::SubString{String}
julia> parser("abc") | missing
abc? |missing
::Union{Missing, SubString{String}}
CombinedParsers.Repeat1 — Function
Repeat1(x)
Parser repeating pattern x one time or more.
Repeat1(f::Function,a...)
Abbreviation for Base.map(f,Repeat1(a...)).
CombinedParsers.Optional — Type
Optional(parser;default=defaultvalue(result_type(parser)))
Parser that always succeeds. If parser succeeds, return the result of parser with the cursor behind the match. If parser does not succeed, return default with the cursor unchanged.
julia> match(r"a?","b")
RegexMatch("")
julia> parse(Optional("a", default=42),"b")
42
CombinedParsers.defaultvalue — Function
defaultvalue(T::Type)
Default value if Optional<:CombinedParser is skipped.
- T<:AbstractString: ""
- T<:Vector{E}: E[]
- T<:CombinedParser: Always()
- otherwise missing
get will return a CombinedParsers._copy of defaultvalue.
CombinedParsers._copy — Function
_copy(x)
copy(x) if and only if ismutable(x); used when the defaultvalue of an Optional is returned by get.
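The reason for copying a mutable default is aliasing: if every skipped Optional returned the same Vector object, mutating one result would silently change the default for all later parses. A Base-only sketch of the idea, with the hypothetical name safe_copy standing in for the behaviour described for _copy:

```julia
# Hypothetical stand-in for the described behaviour (not the library function).
safe_copy(x) = ismutable(x) ? copy(x) : x

default = Char[]               # a mutable default, like E[] for Vector{E}

result = safe_copy(default)    # what a caller would receive
push!(result, 'x')             # the caller mutates its own result

default_still_empty = isempty(default)
immutable_passthrough = safe_copy(42)
```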
CombinedParsers.Lazy — Type
Lazy(x::Repeat)
Lazy(x::Optional)
Lazy x repetition matching (instead of default greedy).
julia> german_street_address = !Lazy(Repeat(AnyChar())) * Repeat1(' ') * TextParse.Numeric(Int)
🗄 Sequence
├─ .*? AnyValue |> Repeat |> Lazy |> !
├─ \ + |> Repeat
└─ <Int64>
::Tuple{SubString{String}, Vector{Char}, Int64}
julia> german_street_address("Konrad Adenauer Allee    42")
("Konrad Adenauer Allee", [' ', ' ', ' ', ' '], 42)
PCRE @re_str
julia> re"a+?"
a+? |> Repeat |> Lazy
::Vector{Char}
julia> re"a??"
a?? |missing |> Lazy
::Union{Missing, Char}
CombinedParsers.Repeat_stop — Function
Repeat_stop(p,stop)
Repeat_stop(p,stop; min=0, max=Repeat_max)
Repeat p until stop (NegativeLookahead), not matching stop. Sets the cursor before stop. Tries min:max times. Returns the results of p.
julia> p = Repeat_stop(AnyChar(),'b') * AnyChar()
🗄 Sequence
├─ 🗄* Sequence[2] |> Repeat
│ ├─ (?!b) NegativeLookahead
│ └─ . AnyValue
└─ . AnyValue
::Tuple{Vector{Char}, Char}
julia> parse(p,"acbX")
(['a', 'c'], 'b')
See also NegativeLookahead
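Both lazy repetition and Repeat_stop have direct PCRE counterparts, which the printed forms (.*? and (?!b)) already hint at. The following sketch uses only Julia's built-in Regex to show the difference between lazy and greedy matching, and the "repeat while not followed by stop" pattern. The regexes are hand-written analogues, not output of CombinedParsers.

```julia
address = "Konrad Adenauer Allee    42"

# Lazy: the street part stops as early as possible, leaving all spaces to " +".
lazy = match(r"^(.*?) +(\d+)$", address)

# Greedy: the street part swallows all but one of the separating spaces.
greedy = match(r"^(.*) +(\d+)$", address)

# Repeat_stop analogue: consume any character that is not the start of "b",
# then match one more character, which is the stop character itself.
stop = match(r"^((?:(?!b).)*)(.)", "acbX")
```

With the lazy quantifier the street name comes out without trailing spaces, which is what the Lazy(Repeat(AnyChar())) example above relies on.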
CombinedParsers.Repeat_until — Function
Repeat_until(p,until, with_until=false; wrap=identity, min=0, max=Repeat_max)
Repeat p until the until parser matches (with Repeat_stop), and set the point after the match of until.
Return a Vector{result_type(p)} if with_until==false, otherwise a Tuple{Vector{result_type(p)},result_type(until)}.
To transform the Repeat_stop(p) parser head, provide a function(::Vector{result_type(p)}) in the wrap keyword argument, e.g.
julia> p = Repeat_until(AnyChar(),'b') * AnyChar()
🗄 Sequence
├─ 🗄 Sequence[1]
│ ├─ (?>🗄*) Sequence[2] |> Repeat |> Atomic
│ │ ├─ (?!b) NegativeLookahead
│ │ └─ . AnyValue
│ └─ b
└─ . AnyValue
::Tuple{Vector{Char}, Char}
julia> parse(p,"acbX")
(['a', 'c'], 'X')
julia> parse(Repeat_until(AnyChar(),'b';wrap=MatchedSubSequence),"acbX")
"ac"
See also NegativeLookahead
Base.join — Function
Base.join(x::Repeat,delim, infix=:skip)
Parser matching repeated x.parser separated by delim.
julia> parse(join(Repeat(AnyChar()),','),"a,b,c")
3-element Vector{Char}:
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
julia> parse(join(Repeat(AnyChar()),',';infix=:prefix),"a,b,c")
('a', [(',', 'b'), (',', 'c')])
julia> parse(join(Repeat(AnyChar()),',';infix=:suffix),"a,b,c")
([('a', ','), ('b', ',')], 'c')
Base.join(x::CombinedParser,delim; kw...)
Shorthand for join(Repeat(x),delim; kw...).
Base.join(f::Function, x::CombinedParser, delim; kw...)
Shorthand for Base.map(f,join(x,delim; kw...)).
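To make the three result shapes of the infix options easier to compare, here is a Base-only sketch that rebuilds them from split. The helper names are hypothetical, items are assumed to be single characters, and nothing here calls CombinedParsers.

```julia
# Hypothetical helpers that only mimic the result shapes shown above.
items(str, delim) = [only(part) for part in split(str, delim)]

# infix=:skip  -> just the items, delimiters dropped
skip_shape(str, delim) = items(str, delim)

# infix=:prefix -> first item, then (delimiter, item) pairs
function prefix_shape(str, delim)
    xs = items(str, delim)
    return (first(xs), [(delim, x) for x in xs[2:end]])
end

# infix=:suffix -> (item, delimiter) pairs, then the last item
function suffix_shape(str, delim)
    xs = items(str, delim)
    return ([(x, delim) for x in xs[1:end-1]], last(xs))
end

skipped = skip_shape("a,b,c", ',')
prefixed = prefix_shape("a,b,c", ',')
suffixed = suffix_shape("a,b,c", ',')
```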
Atomic
CombinedParsers.Atomic — Type
Atomic(x)
A parser matching x, and failing when required to backtrack (behaving like an atomic group in regular expressions).
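The atomic-group behaviour referred to here can be observed with Julia's built-in Regex, which supports the PCRE syntax (?>...). This is a regex analogue for illustration, not a CombinedParsers call.

```julia
# Ordinary group: a+ first takes "aaa", then gives one 'a' back so "ab" can match.
backtracking = match(r"a+ab", "aaab")

# Atomic group: once (?>a+) has taken its 'a's it never gives any back,
# so the following "ab" cannot match and the whole pattern fails.
atomic = match(r"(?>a+)ab", "aaab")
```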
Sequences
CombinedParsers.Sequence — Type
Sequence{P,S,T}
of parts::P, sequence_state_type==S with sequence_result_type==T.
Sequence(parts::CombinedParser...; tuplestate=true)
of parts, sequence_state_type(p; tuplestate=tuplestate) with sequence_result_type.
Sequences can alternatively be created with *
julia> german_street_address = !Repeat(AnyChar()) * ' ' * TextParse.Numeric(Int)
🗄 Sequence
├─ .* AnyValue |> Repeat |> !
├─ \
└─ <Int64>
::Tuple{SubString{String}, Char, Int64}
julia> german_street_address("Some Avenue 42")
("Some Avenue", ' ', 42)
Indexing (transformation) can be defined with
julia> e1 = Sequence(!Repeat(AnyChar()), ' ',TextParse.Numeric(Int))[1]
🗄 Sequence[1]
├─ .* AnyValue |> Repeat |> !
├─ \
└─ <Int64>
::SubString{String}
julia> e1("Some Avenue 42")
"Some Avenue"
State is managed as sequence_state_type(parts; tuplestate). Overwrite it to optimize state types for special cases.
Base.:* — Method
CombinedParsers.sSequence — Function
sSequence(x...)
Simplifying Sequence: flattens nested Sequences and removes Always assertions.
julia> Sequence('a',CharIn("AB")*'b')
🗄 Sequence
├─ a
└─ 🗄 Sequence
├─ [AB] ValueIn
└─ b
::Tuple{Char, Tuple{Char, Char}}
julia> sSequence('a',CharIn("AB")*'b')
🗄 Sequence
├─ a
├─ [AB] ValueIn
└─ b
::Tuple{Char, Char, Char}
See also Sequence
This function will be removed and replaced with a keyword argument.
CombinedParsers.@seq — Macro
@seq(x...)
Create a sequence interleaved with whitespace (horizontal or vertical). The result_type omits whitespace.
CombinedParsers.sequence_result_type — Function
sequence_result_type(::Type{T}) where {T<:Tuple}
Tuple type, internally used for Sequence result_type.
CombinedParsers.sequence_state_type — Function
sequence_state_type(pts::Type; tuplestate=true)
- MatchState if all fieldtypes are MatchState,
- otherwise if tuplestate, a tuple type with the state_type of parts,
- or Vector{Any} if !tuplestate.
Todo: NCodeunitsState instead of MatchState might increase performance.
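How a Sequence collects the results of its parts into one tuple can be sketched with a few lines of plain Julia. In this toy model a parser is just a function (s, i) returning (value, next_index) or nothing. The names char and seq are hypothetical, and the sketch deliberately ignores backtracking and the state types discussed above.

```julia
# Toy model, not the library's implementation:
# a parser is a function (s, i) -> (value, next_index) or nothing.
char(c::Char) = (s, i) -> (i <= ncodeunits(s) && s[i] == c) ? (c, nextind(s, i)) : nothing

function seq(parsers...)
    return function (s, i)
        values = ()
        for p in parsers
            r = p(s, i)
            r === nothing && return nothing   # one failing part fails the sequence
            values = (values..., r[1])        # results accumulate as a tuple
            i = r[2]                          # continue after the part's match
        end
        return (values, i)
    end
end

ab = seq(char('a'), char('b'))
matched = ab("abc", 1)
failed = ab("acb", 1)
```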
Recursive Parsers with Either
CombinedParsers.Delayed — Function
Delayed(T::Type) = Either{T}().
CombinedParsers.Either — Type
Either{S,T}(p) where {S,T} = new{typeof(p),S,T}(p)
Parser that tries matching the provided parsers in order, accepting the first match, and fails if all parsers fail.
This parser has no == and hash methods because it can recurse.
julia> match(r"a|bc","bc")
RegexMatch("bc")
julia> parse(Either("a","bc"),"bc")
"bc"
julia> parse("a" | "bc","bc")
"bc"
Base.:| — Method
(|)(x::AbstractToken, y)
(|)(x, y::AbstractToken)
(|)(x::AbstractToken, y::AbstractToken)
Operator syntax for Either(x, y; simplify=true).
julia> 'a' | CharIn("AB") | "bc"
|🗄 Either
├─ a
├─ [AB] ValueIn
└─ bc
::Union{Char, SubString{String}}
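The "try in order, accept the first match" rule of Either can be illustrated with a toy ordered choice in plain Julia. As an assumption for this sketch, a parser is a function (s, i) returning (value, next_index) or nothing. The names lit and either are hypothetical, and unlike the library this toy never backtracks into a later option after an earlier one has matched.

```julia
# Toy model, not the library's implementation.
lit(str) = (s, i) -> startswith(SubString(s, i), str) ? (str, i + ncodeunits(str)) : nothing

function either(parsers...)
    return function (s, i)
        for p in parsers
            r = p(s, i)
            r === nothing || return r   # first successful option wins
        end
        return nothing                  # all options failed
    end
end

a_or_bc = either(lit("a"), lit("bc"))
second_option = a_or_bc("bc", 1)
no_option = a_or_bc("x", 1)

# Order matters: "a" is tried before "ab", so only "a" is consumed.
ordered = either(lit("a"), lit("ab"))("ab", 1)
```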
CombinedParsers.@syntax — Macro
@syntax name = expr
Convenience macro defining a CombinedParser name=expr and custom parsing macro @name_str.
julia> @syntax a = AnyChar();
julia> a"char"
'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
@syntax for name in either; expr; end
Parser expr is added to either with pushfirst!. If either is undefined, it will be created. If either == :text || either == Symbol(:) the parser will be added to CombinedParser_globals variable in your module.
julia> @syntax street_address = Either(Any[]);
julia> @syntax for german_street_address in street_address
Sequence(!!Repeat(AnyChar()),
" ",
TextParse.Numeric(Int)) do v
(street = v[1], no=v[3])
end
end
🗄 Sequence |> map(#50) |> with_name(:german_street_address)
├─ .* AnyValue |> Repeat |> ! |> map(intern)
├─ \
└─ <Int64>
::NamedTuple{(:street, :no), Tuple{String, Int64}}
julia> german_street_address"Some Avenue 42"
(street = "Some Avenue", no = 42)
julia> @syntax for us_street_address in street_address
Sequence(TextParse.Numeric(Int),
" ",
!!Repeat(AnyChar())) do v
(street = v[3], no=v[1])
end
end
🗄 Sequence |> map(#52) |> with_name(:us_street_address)
├─ <Int64>
├─ \
└─ .* AnyValue |> Repeat |> ! |> map(intern)
::NamedTuple{(:street, :no), Tuple{String, Int64}}
julia> street_address"50 Oakland Ave"
(street = "Oakland Ave", no = 50)
julia> street_address"Oakland Ave 50"
(street = "Oakland Ave", no = 50)
CombinedParsers.substitute — Function
substitute(name::Symbol)
Define a parser substitution.
substitute(parser::CombinedParser)
Apply parser substitution, respecting scope in the defined tree:
- Parser variables are defined within scope of Eithers, for all its NamedParser options.
- Substitution parsers are replaced with parser variables.
- strip_either1 is used to simplify in a second phase.
Substitution implementation is experimental pending feedback.
todo: scope NamedParser objects in WrappedParser, Sequence, etc.?
julia> Either(:a => !Either(
:b => "X",
:d => substitute(:b),
substitute(:c)),
:b => "b",
:c => substitute(:b)
) |> substitute
|🗄 Either
├─ |🗄 Either |> ! |> with_name(:a)
│ ├─ X |> with_name(:b)
│ ├─ X |> with_name(:b) |> with_name(:d)
│ └─ b |> with_name(:b) |> with_name(:c)
├─ b |> with_name(:b)
└─ b |> with_name(:b) |> with_name(:c)
::SubString{String}
Example
With substitute you can write recursive parsers in a style inspired by (E)BNF. CombinedParsers.BNF.ebnf uses substitute.
julia> def = Either(:integer => !Either("0", Sequence(Optional("-"), substitute(:natural_number))),
:natural_number => !Sequence(substitute(:nonzero_digit), Repeat(substitute(:digit))),
:nonzero_digit => re"[1-9]",
:digit => Either("0", substitute(:nonzero_digit)))
|🗄 Either
├─ |🗄 Either |> ! |> with_name(:integer)
│ ├─ 0
│ └─ 🗄 Sequence
│ ├─ \-? |
│ └─ natural_number call substitute!
├─ 🗄 Sequence |> ! |> with_name(:natural_number)
│ ├─ nonzero_digit call substitute!
│ └─ * digit call substitute! |> Repeat
├─ [1-9] ValueIn |> with_name(:nonzero_digit)
└─ |🗄 Either |> with_name(:digit)
├─ 0
└─ nonzero_digit call substitute!
::Union{Nothing, Char, SubString{String}}
julia> substitute(def)
|🗄 Either
├─ |🗄 Either |> ! |> with_name(:integer)
│ ├─ 0
│ └─ 🗄 Sequence
│ ├─ \-? |
│ └─ 🗄 Sequence |> ! |> with_name(:natural_number) # branches hidden
├─ 🗄 Sequence |> ! |> with_name(:natural_number)
│ ├─ [1-9] ValueIn |> with_name(:nonzero_digit)
│ └─ |🗄* Either |> with_name(:digit) |> Repeat
│ ├─ 0
│ └─ [1-9] ValueIn |> with_name(:nonzero_digit)
├─ [1-9] ValueIn |> with_name(:nonzero_digit)
└─ |🗄 Either |> with_name(:digit)
├─ 0
└─ [1-9] ValueIn |> with_name(:nonzero_digit)
::Union{Char, SubString{String}}
Base.push! — Function
Base.push!(x::Either, option)
Push option to x.options as the parser tried next if x fails.
Recursive parsers can be built with push! to Either.
See also pushfirst! and @syntax.
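The reason push! enables recursion is that the choice object can exist before its options do, so an option may refer back to the choice itself. A Base-only toy version of that idea is sketched below. Choice, lit and seq are hypothetical names, and a parser here only returns the next index or nothing.

```julia
# Toy model, not the library's implementation:
# a parser is a callable (s, i) -> next_index or nothing.
lit(str) = (s, i) -> startswith(SubString(s, i), str) ? i + ncodeunits(str) : nothing

# An ordered choice whose option list can still grow after construction.
struct Choice
    options::Vector{Any}
end
Choice() = Choice(Any[])

function (c::Choice)(s, i)
    for p in c.options
        r = p(s, i)
        r === nothing || return r
    end
    return nothing
end

seq(ps...) = (s, i) -> begin
    for p in ps
        i = p(s, i)
        i === nothing && return nothing
    end
    i
end

expr = Choice()                                      # declared first, still empty
push!(expr.options, lit("x"))                        # base case
push!(expr.options, seq(lit("("), expr, lit(")")))   # recursive case refers to expr

balanced = expr("((x))", 1)
unbalanced = expr("((x)", 1)
```

Here expr recognises x wrapped in any number of balanced parentheses, a language that needs the self-reference.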
Base.push!(x::WrappedParser{<:Either}, option)
Push option to x.options of repeated inner parser.
Base.pushfirst! — Function
Base.pushfirst!(x::WrappedParser{<:Either}, option)
Push option as first x.options of repeated inner parser.
CombinedParsers.either_result_type — Function
return tuple(statetype,resulttype)
Parser generating parsers
CombinedParsers.FlatMap — Type
FlatMap{P,S,Q<:Function,T} <: CombinedParser{S,T}
Like Scala's fastparse FlatMap. See after.
CombinedParsers.after — Function
after(right::Function,left::AbstractToken)
after(right::Function,left::AbstractToken,T::Type)
Like Scala's fastparse FlatMap.
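The flat-map idea is that the result of the left parser decides which parser runs next. A toy version in plain Julia may make that concrete. The names lit, either, flatmap and choose_next are hypothetical, a parser is assumed to be a function (s, i) returning (value, next_index) or nothing, and this is not how the library implements FlatMap.

```julia
# Toy model, not the library's implementation.
lit(str) = (s, i) -> startswith(SubString(s, i), str) ? (str, i + ncodeunits(str)) : nothing

function either(parsers...)
    return function (s, i)
        for p in parsers
            r = p(s, i)
            r === nothing || return r
        end
        return nothing
    end
end

function flatmap(right, left)
    return function (s, i)
        r = left(s, i)
        r === nothing && return nothing
        value, next_i = r
        # The value parsed on the left selects the parser used on the right.
        return right(value)(s, next_i)
    end
end

choose_next(v) = v == "same" ? lit("same") : lit("different")
p = flatmap(choose_next, either(lit("same"), lit("but")))

same_twice = p("samesame", 1)
but_then_different = p("butdifferent", 1)
mismatch = p("samedifferent", 1)
```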
julia> saying(v) = v == "same" ? v : "different";
julia> p = after(saying, String, "same"|"but")
🗄 FlatMap
├─ |🗄 Either
│ ├─ same
│ └─ but
└─ saying
::String
julia> p("samesame")
"same"
julia> p("butdifferent")
"different"
Assertions
CombinedParsers.AtStart — Type
AtStart()
Parser succeeding if and only if at index 1, with result_type AtStart.
julia> AtStart()
re"^"
CombinedParsers.AtEnd — Type
AtEnd()
Parser succeeding if and only if at last index, with result_type AtEnd.
julia> AtEnd()
re"$"
CombinedParsers.Always — Type
Always()
Assertion parser matching always and not consuming any input. Returns Always().
julia> Always()
re""
CombinedParsers.Never — Type
Never()
Assertion parser matching never.
julia> Never()
re"(*FAIL)"
Look behind
CombinedParsers.Lookbehind — Function
Lookbehind(does_match::Bool, p)
PositiveLookbehind if does_match==true, NegativeLookbehind otherwise.
CombinedParsers.PositiveLookbehind — Type
PositiveLookbehind(parser)
Parser that succeeds if and only if parser succeeds before cursor. Consumes no input. The match is returned. Useful for checks like "must be preceded by parser, don't consume its match".
CombinedParsers.NegativeLookbehind — Type
NegativeLookbehind(parser)
Parser that succeeds if and only if parser does not succeed before cursor. Consumes no input. The lookbehind parser itself is returned as match, as the example output shows. Useful for checks like "must not be preceded by parser, don't consume its match".
julia> la=NegativeLookbehind("keep")
re"(?<!keep)"
julia> parse("peek"*la,"peek")
("peek", re"(?<!keep)")
Look ahead
CombinedParsers.Lookahead — Function
Lookahead(does_match::Bool, p)
PositiveLookahead if does_match==true, NegativeLookahead otherwise.
CombinedParsers.PositiveLookahead — Type
PositiveLookahead(parser)
Parser that succeeds if and only if parser succeeds, but consumes no input. The match is returned. Useful for checks like "must be followed by parser, but don't consume its match".
julia> la=PositiveLookahead("peek")
re"(?=peek)"
julia> parse(la*AnyChar(),"peek")
("peek", 'p')
CombinedParsers.NegativeLookahead — Type
NegativeLookahead(parser)
Parser that succeeds if and only if parser does not succeed, but consumes no input. parser is returned as match. Useful for checks like "must not be followed by parser, don't consume its match".
julia> la = NegativeLookahead("peek")
re"(?!peek)"
julia> parse(la*AnyChar(),"seek")
(re"(?!peek)", 's')
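The printed forms of these assertions, such as (?=peek), (?!peek) and (?<!keep), are PCRE lookaround syntax, so their "check but do not consume" behaviour can be reproduced with Julia's built-in Regex. The patterns below are hand-written analogues for illustration only.

```julia
# Positive lookahead: "peek" must follow, but the dot still matches the 'p'.
positive_ahead = match(r"^(?=peek).", "peek")

# Negative lookahead: succeeds on "seek", fails at the start of "peek".
negative_ahead = match(r"^(?!peek).", "seek")
negative_ahead_fails = match(r"^(?!peek).", "peek")

# Negative lookbehind after a literal: fine after "peek", fails after "keep".
negative_behind = match(r"peek(?<!keep)", "peek")
negative_behind_fails = match(r"keep(?<!keep)", "keep")
```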
Logging and Side-Effects
CombinedParsers.NamedParser — Type
NamedParser{P,S,T} <: WrappedParser{P,S,T}
Struct with
name::Symbol
parser::P
doc::String
CombinedParsers.with_name — Function
with_name(name::Symbol,x; doc="")
A parser labelled with name. Labels are useful in printing and logging.
See also: @with_names, with_name, log_names
CombinedParsers.@with_names — Macro
@with_names
Sets the names of parsers within a begin/end block to match the variables they are assigned to.
For example:
julia> @with_names foo = AnyChar()
. AnyValue |> with_name(:foo)
::Char
julia> parse(log_names(foo),"ab")
match foo@1-2: ab
^
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
CombinedParsers.log_names — Function
log_names(x,names=true; exclude=nothing)
Rebuild parser replacing NamedParser instances with with_log parsers. Log all NamedParser instances if names==true or name in names and not name in exclude.
See also: with_log, log_parser, deepmap_parser
CombinedParsers.log_parser — Function
log_parser(message::Type, x::CombinedParser, a...; kw...)
log_parser(message::Function, x::CombinedParser, a...; kw...)
Transform parser including logging statements for sub-parsers of type message or for which calling message does not return nothing.
CombinedParsers.with_log — Function
with_log(s::AbstractString,p, delta=5;nomatch=false)
Log matching process of parser p, displaying delta characters left of and right of match.
If nomatch==true, also log when parser does not match.
See also: log_names, with_effect
CombinedParsers.with_effect — Function
with_effect(f::Function,p,a...)
Call f(sequence,before_i,after_i,state,a...) if p matches, f(sequence,before_i,before_i,nothing,a...) otherwise.
Other
CombinedParsers.MappedSequenceParser — Type
MappedSequenceParser(f::F,parser::P) where {F<:Function,P}
Match parser on CharMappedString(f,sequence), e.g. in a caseless parser.
CombinedParsers.MemoizingParser — Type
MemoizingParser{P,S,T}
WrappedParser memoizing all match states. For slow parsers with a lot of backtracking this parser can help improve speed.
(Sharing a good example where memoization makes a difference is appreciated.)
CombinedParsers.WithMemory — Type
WithMemory(x) <: AbstractString
String wrapper with memoization of next match states for parsers at indices. Memoization is sometimes recommended as a way of improving the performance of parser combinators (like state machine optimization and compilation for regular languages).
A clear performance gain has not been demonstrated so far, probably because the cost of allocating state memory for caching is often greater than the cost of recomputing a match. If you have a case where performance benefits from this, let me know!
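The trade-off described here, cache lookups and allocation versus recomputation, is easiest to see in a toy memoizer. The sketch below is plain Julia with hypothetical names (memoize, slow_digit) and a Dict keyed by input and index. It is not how MemoizingParser or WithMemory are implemented.

```julia
# Toy model, not the library's implementation: wrap a parser function
# (s, i) -> result so that repeated calls at the same position hit a cache.
function memoize(p)
    cache = Dict{Tuple{String,Int},Any}()
    return (s::String, i::Int) -> get!(() -> p(s, i), cache, (s, i))
end

calls = Ref(0)   # counts how often the wrapped parser really runs

slow_digit(s, i) = (calls[] += 1;
                    i <= ncodeunits(s) && isdigit(s[i]) ? (s[i], i + 1) : nothing)

fast_digit = memoize(slow_digit)

first_result = fast_digit("42", 1)
second_result = fast_digit("42", 1)   # served from the cache
```

Every cached entry costs a hash lookup and an allocation, so for a parser as cheap as slow_digit the cache is unlikely to pay off, in line with the remark above.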