|
I don't use regex for parsing, but I do use it for lexing, just because it's the most compact and quickest way to get a bunch of match rules into a list.
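A minimal sketch of that rule-list approach in Python (the token names and rules here are mine, purely illustrative): each rule is a named regex alternative, and the lexer is just one combined pattern.

```python
import re

# Illustrative token rules -- a compact "list of match rules" as described above.
RULES = [
    ("NUM", r"\d+"),
    ("ID",  r"[A-Za-z_]\w*"),
    ("OP",  r"[+\-*/=]"),
    ("WS",  r"\s+"),
]
# Combine into one master pattern with named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in RULES))

def lex(text):
    """Yield (token_name, lexeme) pairs; whitespace is skipped.
    Note: characters matching no rule are silently skipped by finditer."""
    for m in MASTER.finditer(text):
        if m.lastgroup != "WS":
            yield (m.lastgroup, m.group())

print(list(lex("x = 42 + y")))
# [('ID', 'x'), ('OP', '='), ('NUM', '42'), ('OP', '+'), ('ID', 'y')]
```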
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
I tried it and tbh the first few things work really well. But then you start adding more constructs, and have to account for the fact that you can have a do-while or while-do and they can be nested, and in my case the syntax doesn't require a terminating ; after the last statement in a loop etc.
And it all spirals into exponential madness.
|
|
|
|
|
Part of that is that you're using the wrong tool. Just from your description I can tell you're using an NFA-based regex engine.
That's not great for lexing. For lexing you want a good old DFA, with no backtracking.
Here are your main operators - this is how simple it is:
[] () | + * ?
You can pretty much do what you need with those in the case of lexing.
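To make the DFA point concrete, here's a toy table-driven lexer sketch in Python (all names are mine, illustrative only). Each input character causes exactly one table lookup and one state change; there is no backtracking.

```python
# A hand-rolled DFA lexer sketch. Each step is a single table lookup --
# no backtracking, unlike an NFA-based regex engine.

def kind(c):
    """Map a character to its character class."""
    if c.isdigit():
        return "digit"
    if c.isalpha() or c == "_":
        return "alpha"
    return "other"

# DFA transition table: (state, char-class) -> next state.
TABLE = {
    ("start", "digit"): "num",
    ("start", "alpha"): "ident",
    ("num",   "digit"): "num",
    ("ident", "alpha"): "ident",
    ("ident", "digit"): "ident",
}
ACCEPT = {"num": "NUM", "ident": "ID"}

def lex(text):
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        state, start = "start", i
        # Run the DFA as far as it will go -- maximal munch.
        while i < len(text) and (state, kind(text[i])) in TABLE:
            state = TABLE[(state, kind(text[i]))]
            i += 1
        if state in ACCEPT:
            tokens.append((ACCEPT[state], text[start:i]))
        else:  # single character with no transition: treat as an operator
            tokens.append(("OP", text[i]))
            i += 1
    return tokens
```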
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
honey the codewitch wrote: it is virtually always worth it to hand roll your own parser
Of course 'worth' is relative.
But I am rather sure that most experienced compiler/interpreter writers, those that do it for reasons beyond just a toy, always hand-modify the results.
honey the codewitch wrote: and to create a context free grammar to describe that language anyway.
If you are going to call it a language then you probably really must do that.
honey the codewitch wrote: I spent a long time to come up with the above 3 little points
I worked on an internal company product years ago where the original developer didn't understand any of that.
He, literally, did not even write a real parser. Rather, the interpreter re-read the source code every time. So a loop would re-process the 'while' text on each iteration. No surprise that the users constantly complained about the speed.
honey the codewitch wrote: I didn't get saddled financially for it though, so yay for tha
The only class I took after my college degree was an introduction to Compiler Theory. I consider it the best class I ever took. Also the most fun.
|
|
|
|
|
jschell wrote: If you are going to call it a language then you probably really must do that.
Umm, I do?
Lexicon:
Context-Free Grammar (CFG) - The formal document describing the structure of the language.
Language - A Chomsky type-2 language, i.e. one describable with a CFG.
Parser - A stack-based automaton (a pushdown automaton) that, given an input grammar, can parse the corresponding language.
Maybe you just didn't understand me or something.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
All this talk of Context-Free Grammars has me wondering if there's such a thing as a context-dependent grammar?
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
There are. Chomsky type 1 (context-sensitive) and type 0 (unrestricted) languages are context dependent.
They model human language.
You can attack them with an Earley-style approach, but that is not practical to parse on a real system. It's strictly theoretical.
It's also possible to parse context-sensitive languages with a context-free grammar using a GLR parser. You will get multiple trees back, due to the ambiguity such languages have without context. Your job is then to decide which tree is valid.
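The "multiple trees" point comes from ambiguity, and ambiguity shows up already in context-free grammars. Below is a toy Earley recognizer in Python (grammar and names are mine, purely illustrative -- a teaching sketch, not a production parser). The grammar S -> S '+' S | 'a' is deliberately ambiguous, and Earley handles it anyway:

```python
# Toy Earley recognizer for a deliberately ambiguous CFG (illustrative only).
GRAMMAR = {
    "S": [["S", "+", "S"], ["a"]],  # ambiguous: "a+a+a" has two parse trees
}

def recognize(words, start="S"):
    # chart[i] holds Earley items: (lhs, rhs, dot-position, origin-index)
    chart = [set() for _ in range(len(words) + 1)]
    for rhs in GRAMMAR[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(len(words) + 1):
        changed = True
        while changed:  # run predict/complete to a fixed point
            changed = False
            for (lhs, rhs, dot, org) in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in GRAMMAR:       # predict
                    for prod in GRAMMAR[rhs[dot]]:
                        item = (rhs[dot], tuple(prod), 0, i)
                        if item not in chart[i]:
                            chart[i].add(item); changed = True
                elif dot == len(rhs):                            # complete
                    for (l2, r2, d2, o2) in list(chart[org]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            item = (l2, r2, d2 + 1, o2)
                            if item not in chart[i]:
                                chart[i].add(item); changed = True
        if i < len(words):                                       # scan
            for (lhs, rhs, dot, org) in chart[i]:
                if dot < len(rhs) and rhs[dot] == words[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, org))
    # Accept if a start production spans the whole input.
    return any(lhs == start and dot == len(rhs) and org == 0
               for (lhs, rhs, dot, org) in chart[-1])
```

A full Earley parser would also record back-pointers so it could enumerate all the parse trees, which is where the "you get multiple trees back" situation arises.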
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
Sure, English is a prime example. Note the "prime" ambiguity that can only be solved in a context-dependent manner.
Some might argue that English doesn't really have a grammar per se, but mostly a collection of use cases and exceptions.
Mircea
|
|
|
|
|
Interesting question. So I went looking.
The following programming language claims that at least some of it is context dependent.
Chapter 4 - Expressions[^]
"Words such as sell, at, and read have different meanings in different contexts. The words are relative expressions -- their meaning is context dependent."
|
|
|
|
|
It's indeed possible to apply context along a narrow parse path. In fact, even the C grammar requires this, because introducing a new struct or typedef introduces a new name that the grammar must thereafter treat specially. It can be handled by "hacking" the parser in one particular area so that it can apply a specific, narrow kind of context. That's how context is represented in that particular case. However, a generalized mechanism for context is not really feasible.
Chomsky type 1 and type 0 languages require context throughout in order to parse. They need something like an Earley grammar. Implementations of Earley grammars that have been proposed write new context-free grammars on the fly during the parse. The problem with that is that turning a CFG into an actual parser takes a long time; generating the tables is expensive. It's simply not practical. There are better approaches to language processing that don't use this technology at all; see AI speech recognition.
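The C typedef situation described above is the classic "lexer hack": the parser feeds names it has declared back to the lexer, so the same identifier is classified differently depending on what came before. A tiny Python sketch (all names here are mine, grossly simplified for illustration):

```python
# Sketch of the C "lexer hack": parse context feeds back into token
# classification. Everything here is illustrative, not real C parsing.

typedef_names = {"size_t"}  # pretend this was seeded by earlier declarations

def classify(word):
    """Classify an identifier -- a context-dependent decision."""
    if word == "typedef":
        return "TYPEDEF"
    if word in typedef_names:
        return "TYPE_NAME"   # only because a typedef was seen earlier
    return "IDENT"

def parse_typedef(tokens):
    """Handle 'typedef <existing-type> <new-name> ;' (grossly simplified)."""
    assert tokens[0] == "typedef"
    typedef_names.add(tokens[2])  # the feedback into the lexer's table

parse_typedef(["typedef", "int", "myint", ";"])
print(classify("myint"))  # TYPE_NAME
print(classify("other"))  # IDENT
```

The same identifier string lexes differently before and after the typedef is seen, which is exactly the narrow, special-cased kind of context the paragraph above describes.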
Edit: Looks like someone attacked it a different way as well: https://www.sciencedirect.com/science/article/pii/S2590118422000697[^]
(paywalled)
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
honey the codewitch wrote: Looks like someone attacked it a different way as wel
Interesting. It goes beyond what you said, pointing out that multiple languages are a mix. Although perhaps that is not surprising.
From the link you posted.
"Furthermore, despite the strength of CFGs, some aspects of modern programming languages cannot be modeled with context-free grammars alone as some language constructs depend on the wider context they appear in [12]. Most such cases must be dealt with during or after parsing using more or less formalized techniques, e.g., name resolution, type checking, etc., which are far less formalized than the parser itself. "
|
|
|
|
|
honey the codewitch wrote: Umm, I do?
I meant that as a general comment for all of those out there that create their own language. I suspect that at least some of them don't use a BNF.
|
|
|
|
|
Oh I see. I thought you meant I was being inconsistent about calling a language a language.
BNF and EBNF are just "file formats," for lack of a better term. They look like:
Foo:= Bar "+" Baz;
Bar:= "Bar";
Baz:= { "Baz" }+;
(an inexact representation, as the specs are imprecise, kind of like regex)
It's just a format for a context-free grammar specification. EBNF and BNF are the most well-known formats, which is why I mentioned them.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
honey the codewitch wrote: BNF and EBNF are just
Yes. Like you, I have created my own languages in the past. One time formally, but most times just ad hoc.
|
|
|
|
|
To parse C++, I used recursive descent and even wrote the lexical analysis routines from scratch. Fixing bugs is fairly straightforward, whereas I wouldn't have a clue how to fix them in a bottom-up parser. It's about 13K lines of code, including comments and blanks, and anyone familiar with C++ can look at the code and probably figure it out with relative ease. The "compiler" part, however, which has to understand what the parsed code is doing in order to perform static analysis, is much larger, probably 3x that size.
robust-services-core/src/ct/Parser.cpp · GitHub[^]
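For readers unfamiliar with the technique: in recursive descent, each grammar rule becomes an ordinary function, which is why the code stays readable and debuggable. A tiny Python sketch (my own illustrative grammar, nothing from the linked parser) for expr := term (('+'|'-') term)* with term := '(' expr ')' | NUMBER:

```python
# Minimal recursive-descent sketch: one function per grammar rule.

def parse_expr(tokens, pos=0):
    """expr := term (('+'|'-') term)*  -- returns (value, next position)."""
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, pos

def parse_term(tokens, pos):
    """term := '(' expr ')' | NUMBER"""
    if tokens[pos] == "(":
        value, pos = parse_expr(tokens, pos + 1)
        assert tokens[pos] == ")", "expected ')'"
        return value, pos + 1
    return int(tokens[pos]), pos + 1

print(parse_expr(["1", "+", "(", "2", "-", "3", ")"]))  # (0, 7)
```

When a bug shows up, the call stack mirrors the grammar, which is the debuggability advantage over table-driven bottom-up parsers mentioned above.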
|
|
|
|
|
Yeah, in my Slang parser the parsing was the easy part. Establishing a CodeDOM based AST after the fact was not.
For example:

class Foo {
    static public int Bar;
}
If I do Foo.Bar, do I create a CodeFieldReferenceExpression, a CodePropertyReferenceExpression, or even a CodeMethodReferenceExpression?
I have to crawl the existing code I've already resolved to an AST to determine what Bar is, so I can then make the appropriate object.
It's not compiling; it's doing something completely different, but it's very compiler-like.
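That resolution step can be sketched as a symbol-table lookup that picks the node type (a hypothetical mock-up in Python; the names loosely mirror CodeDOM's node classes but nothing here is the actual Slang implementation):

```python
# Illustrative member-resolution sketch: decide which reference node to
# build by consulting symbols already resolved from the AST crawl.

symbols = {  # pretend this was populated while walking class Foo
    ("Foo", "Bar"): "field",
    ("Foo", "Baz"): "property",
    ("Foo", "Qux"): "method",
}

NODE_FOR = {
    "field":    "CodeFieldReferenceExpression",
    "property": "CodePropertyReferenceExpression",
    "method":   "CodeMethodReferenceExpression",
}

def resolve_member_ref(type_name, member):
    """Pick the reference-node type for type_name.member, or fail if unknown."""
    kind = symbols.get((type_name, member))
    if kind is None:
        raise NameError(f"{type_name}.{member} not yet resolved")
    return NODE_FOR[kind]

print(resolve_member_ref("Foo", "Bar"))  # CodeFieldReferenceExpression
```

The hard part in practice is populating that table in the right order, since Foo.Bar may be referenced before the declaration of Bar has been visited.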
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
Warning: Some of you will be disgusted by this post (and if there are follow ups, they might be disgusting as well).
There are farms in some parts of the world growing plants - mostly vegetables, but maybe some fruits as well - in 100% inorganic beds. The roots do not dig down into natural soil, but cling to balls of expanded clay aggregate, being 'fed' by a nutrient solution washing over them. Anyone raising critical remarks about this is immediately classified along with 'tree huggers' and the like.
For animals we have much more concern. Even though we kill them and cut their bodies into small, plastic-wrapped pieces, we have regulations for how we kill them. Ideally, they shall not be aware that they are being killed for our purposes. So we either knock them out with poison gas before the actual killing, or we let it happen so fast that we think they won't notice.
Many people know that if you connect electrodes to the ends of the muscles (read: meat) of a newly slaughtered animal, the muscle will contract. A muscle receives the energy it needs to do its work through the blood (and some minor channels), including the proteins it needs to grow. Its needs are fairly well understood by modern medicine. We can feed the muscle what it needs to exercise and grow bigger, just like we feed those plants in beds of expanded clay balls with nutrients dissolved in the water flushing their roots, and with electricity we can exercise the muscles (meat) to grow as big as a weightlifter's.
Assume that we slaughter that piglet (or whatever), chopping its head off. We keep its body as a single unit, connecting tubes to its arteries and veins, adding to the blood stream whatever the body has consumed and then some for growth. Electrical signals stimulate muscle activity to provoke more muscle growth. For a living animal, a significant part of the fodder is used to grow hooves and other parts of limited usage; if we remove those early, we do not waste fodder on them. Of course, there is no brain, which usually consumes a lot of energy. Many of the entrails are also useless once the muscles get what they need directly through the arteries. Most or all of the digestive system serves no purpose and can be removed; maintaining it is a waste of energy and protein. The ideal is a body where all the energy and protein fed into it goes into building the muscles - the meat that we want to eat.
I don't know whether farming science is yet ready for this kind of meat production: slaughtering the animals while they are still piglets, lambs, or young calves, and after slaughter feeding nutrients to the bodies that remain of them, in a way that makes those bodies grow to full size, though with all the non-meat-producing parts removed.
But if this is, or becomes, possible, how morally acceptable is it? Can we do this once the animal's head has been cut off, i.e. once the animal has been slaughtered? If we can feed plants clinging to expanded clay balls a flow of nutrients that makes them grow, what is the difference in feeding a similar flow of nutrients to the body of a slaughtered animal?
|
|
|
|
|
Would that still qualify to be called meat, or would it just be rubber?
|
|
|
|
|
Small atrocities lead to bigger atrocities.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
For each system that is removed - brain, digestive, vascular - it would need to be replaced by a more efficient one.
Does the brain of the animal use more or less energy than the controller boards needed to run the system?
How much energy does it take to produce the correct solutions consumed by the muscle?
And to filter and separate the toxins produced by exercising the muscles?
You would lose a ton of redundancy. One bad power outage and the whole barn/warehouse is spoiled?
I will trust that millions of years of evolution have solved these problems for us simply and elegantly.
I think the vat of protein approach will win. Probably yeast based.
Or go Ringworld and create C-H-O-N factories.
|
|
|
|
|
Ringworld - now there is a trilogy I have not read in a long, long time.
Charlie Gilley
“They who can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.” BF, 1759
Has never been more appropriate.
|
|
|
|
|
Where are you getting this information?
1. Killing animals for fertilizer is quite inefficient, and not standard in the US to my knowledge.
Using their fecal waste is much more common as a renewable fertilizer,
especially when the source is large mammals such as cattle, sheep, pigs, etc. - i.e. honey wagons (poop vans).
2. Organic vs. inorganic is complicated. Hard to trace unless purposely done, and probably not relevant.
In the end, fertilizers are chemicals, whether naturally or synthetically produced.
"A little time, a little trouble, your better day"
Badfinger
|
|
|
|
|
If you've already gone this far, why not just take muscle from a prize meat animal and grow it in a vat, cutting off pieces as necessary? There is no reason why a clump of cells could not live for a very long time, given appropriate care.
Kobe beef for the masses. Yum!
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|
|
Daniel Pfeffer wrote: There is no reason why a clump of cells could not live for a very long time
Apoptosis is one good reason.
The only current solution is to use an immortal cell line, like the HeLa line, and I really don't think that people are going to want to eat those.
“That which can be asserted without evidence, can be dismissed without evidence.”
― Christopher Hitchens
|
|
|
|
|
GuyThiebaut wrote: Apoptosis is one good reason.
Well, no one ever claimed that I was a biologist...
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
|
|
|