I started on the Strict project two days ago and the past 48 hours have been immensely joyful. Not only do I get to work with my friend Alexandre Bencz, but I am also getting to know several other, highly competent developers, all of whom have a shared passion for programming language design and implementation. The team now consists of five people, including Ben, so stuff is happening and quite fast even!
Obviously, the first few days are very hectic and overwhelming, because of all the new tools, procedures, habits, attitudes, and standards that a newcomer has to learn. I personally expect to learn more from joining this project than I’ve learned on my own during the past 5 years. Nothing speeds up your own development like working with people who share your interests and bring all kinds of useful skills.
Strict is a great vision, which we are currently working hard to both document and formalize as the compiler begins to take shape. An example of this is that Ben added a draft grammar in BNF format yesterday. This helps everybody on the team and outside to quickly visualize what the project is all about.
The next steps we’re working on are to formalize things like the Intermediate Representation (IR), and discussions about the upcoming VM have begun as well, while Alexandre is busy working on the compiler itself. Harald is looking into IDE support, and things will happen there soon as well.
My role is, given the short period of employment so far, still a bit uncertain, but I do love to write documentation, so perhaps that will be part of what I do. I hope to return to these pages before too long, with more exciting news from and documentation for the Strict project, and I am already working on various updates to the existing documentation.
Last month we had some new people starting to work on Strict, but sadly after a few weeks progress came to a halt again. We got a new tokenizer and a clearer vision of which parts are next and most important (IDE, VM), however the new people are doing their own thing again. Progress is much slower again, as I am still super busy with growing the team and with AI projects. There will be a few months of training and getting the new guys on the AI projects up to speed, but hopefully afterwards I will have more time for myself to continue with the next important parts of Strict.
I noticed that even with compiler and programming language experts it was still hard to get the functionality, syntax and vision across. In the beginning everyone was hyped, but once the hard problems come up we need longevity to finish the parts, and when the vision is not in everyone's head, it gets hard. So my main focus is to write a functional prototype from start to finish (which was the plan anyway). Maybe a minimum viable product is possible now to get all the parts working (language parsing, running code, IDE experience, SCrunch, etc.) and then even play around with the AI generating some of this code. I am thinking of 8 kyu/7 kyu CodeWars level, basic kinds of problems: hello world, loops, simple state machines, some conversion of arrays and lists. A huge part is string manipulation and lists/collections/queries/etc., which is probably best left out of this first prototype iteration.
Once that works, I will show it to some programmers and see what they think and if it clicks easier than the current iteration of ideas. And maybe focus more on hiring, the C# Compiler Job position is still open in the meantime.
Strict is easy to read and write, there is usually only one way to do things and it doesn't need fluff like end-of-line characters. Blocks are indented and have no start, end or brackets (like in Python). All lines are expressions and have to evaluate to true, otherwise the execution and even compilation stops at this point. Callers can use catch blocks to check for this.
These grammar files are not really used to generate any lexer, parser, tokenizer. They are here for informational purposes and to generate syntax highlighting like for Textmate (.tmLanguage), which can be imported to Visual Studio Code, Textmate, Atom, Ace, Sublime, etc.
We recently got some interest again in developing Strict and got some freelancer help. Our job posts for full-time Compiler Engineers and a C# TDD Developer for our main project are still open: The intelligent robot arm.
Last time I talked about parsing libraries like Pidgin, Sprache and Superpower. The main idea still stands: Don't use the external lexer/parser code generator tools. Instead use parser combinators and do everything in one go, the current code base shows this very nicely. To be honest I was a bit stuck last year with the super simple approach of just fixing one test at a time until I ran into trouble with not looking forward or backward in the parser, which is very much needed for expressions in method bodies. We now have a custom tokenizer (thanks to Alexandre) and parsing solution again and things seem to work out.
The main reason nothing has happened with Strict this year yet is simply that I have been busy 24/7 with the AI and robotics work, there was absolutely no time for anything else. Plus we have recently been trying to add some employees and there are a lot of interviews, teaching, learning, code reviews, etc. going on. Abir has been helping a lot with that recently.
The Strict documentation is still mostly valid, even my C# Coding Guidelines from 2012 are still used for every new programmer that joins the team and they have not changed much in the past 10 years. However recently in interviews applicants noticed that we could be more clear about the current state, what works, what is next, what are the immediate next steps. Hopefully this blog post helps a bit. I will also edit the Documentation once we have more things working (e.g. the tokenizer work from today), I hope the other Strict-ers can also join the fun and help out with writing up what is going on. Wiki and Websites will always be important for Strict as the source code is not allowed to contain any comments, it all has to be on the web instead (AI won't read or understand that anyway atm).
Coverage not at 100%
Instead of pushing the coverage back to 100% on top of the mess I left behind last year (the tokenizer only worked for simple use cases because I commented out problems and barely got it up and working again), I moved all the commented-out code and TODOs back in, and we should fix them one by one. Not much work really. However, there is plenty of unfinished stuff in both the backend (e.g. C# or C++ code generation) and the virtual machine (mostly not done, just some low-level tests).
I added some CUDA experiments late last year and they are very promising: we could easily parallelize any code that makes sense to parallelize (big loops, neural networks, math, matrices) by running on the GPU, the CPU or both. We also have quite a lot of decent computers in the office and could connect them all up with our own networking stack (Tachyon, very much a faster version of SignalR), similar to NCrunch work queue servers. This is not easy and we will probably revisit it much later this year. However, the CUDA stuff has made some advances this year: we created our own internal repositories for our engine and AI work to handle CUDA code more easily. It is still mostly hand-written, but there is also great help from libraries like cuDNN that provide most of the math we need for neural networks. Maybe in Q3 we can check this out for Strict as well.
This month (May 2021) our plan is to get all of the important low-level parts up and running. There will be a lot of learning, teaching and discussions around many smaller problems like memory management, string handling, math and numbers. Next up is doing small hello world programs, expressions, and finally solving some 8kyu and 7kyu codewars.com katas in Strict.
June is all about integrating Strict as quickly as possible into IDEs, most importantly Visual Studio Code, but also Visual Studio 2019, IntelliJ and others via the Language Server Protocol. We have some early stuff working from last year, but as usual there is a lot of fiddly work to be done to get it all nice and shiny, especially SCrunch, nice auto-completion, always-on compilation, super fast speed, easy refactoring, debugging and all the other great features any decent IDE brings.
In July we want to revisit some old use cases and talk about new use cases we can then accomplish with the language, maybe focus on compiling Strict with Strict and see what's missing. Maybe networking, maybe parallelization, concurrency or building neural networks with CUDA, who knows, we will find out. Most likely we have to go through the existing backlog and see if we are ready to give the language to other programmers and let them solve some katas with it.
Obviously it all depends on how much time the freelancers and I can spend on this and how successful we are. The most important goals as always are (in this order):
1. Clean Code with Tests written first!
2. Super fast always-on compilation (I am talking nanoseconds here, which is not possible with any existing backend, so it has to happen in our own Virtual Machine)
3. Very short and easy to understand code (our strict rules will mostly enforce this)
4. Almost all aspects of the language should be functional (deterministic, no inheritance, composition, most things are only calculated once and reused all the time). About 10% mutable fields, and methods modifying them, will be allowed for special problems and optimizations, but this is not the norm.
5. Running the code also must be fast, comparable to C++, and all impacted tests are always executed (later with slower integration tests that only run on the CI server or at check-in time). This includes parallelization, concurrency, networking, CUDA and lots and lots of optimizations
6. And finally our main goal is to build AIs and let Strict be controlled by an AI as well -> we will start with normal Neural Networks like the ones we already write and maintain, up to evolutionary systems and meta parameters.
Till next time. I plan to blog about the progress weekly from now on, which also gives us a good overview of our progress.
Btw: Abir and I do weekly Sharp Clean Code 1h live streams on https://twitch.tv/deltaengine and talk about closely related things as well, mostly solving some interesting codewars kata or TDD problem.
There are always times when something important has to be fixed or be ready for a presentation, release or milestone. In these times the temptation is very strong to just quickly hack it together and test test test until it works. Short term this is fine and this is pretty much how any Game Jam works, sadly for most games it leads to throw-away code which most people just notice when they start the next project.
We just had such a week last week and I tried to steer the team away from hacking it quickly together for the presentation/milestone. It was still stressful and I didn't really have time to finish my refactoring work on Strict. For two weeks now I have been in the process of changing the parsing to the Pidgin library, which works great, but I still have to go through most lines of code, throw away stuff, fix tests and coverage, etc.
Pidgin is a pretty good library similar to Sprache or Superpower with even better performance. It is very similar to the monadic Parsec parser from the Haskell world, which combines lexing and parsing into a bunch of functions to find expressions this way. Debugging and developing parsing this way is much more comfortable than going the lexer/parser route or using external lexer/parser code generator tools. Those work great if you want to do exactly what many have done before you, they generate much better code than a newbie can write himself and it will perform much better. However, Strict is not doing much in the traditional way and I am still constantly changing how things work, and the more complicated it is to change the parsing, the more work it is. Originally I was writing my own parsing (as you can see from the earlier commits) and I might continue with that later on, but for now it is nicer to have something working quickly to experiment with until the language is more fleshed out. Pidgin is very well tested and fast. Only method bodies need complex parsing in Strict and they are evaluated lazily when needed: most code is not executed and there is no point in loading it or getting it ready. This makes loading files in Strict much faster than in any other programming language, you can load as many files as you like in parallel, more similar to database or JSON loading and less like C++ compiling.
A good example is the strict ruleset for source code in Strict: we do not want multiple ways to write code (very similar to Python, just more strict and even more basic). There shouldn't be multiple ways to format your code, write loops or indent code or blocks for your conditions. Since the end goal is to generate code via the Stricti AI, there should be the least possible number of variations leading to compiling code that produces the right results (most preferably there should be exactly one solution).
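To make the "bunch of functions" idea a bit more concrete, here is a minimal Python sketch of the parser combinator style. It is not Pidgin's API and not the Strict code base; all names (`literal`, `take_while`, `sequence`, `either`, `parse_addition`) are made up for illustration, and it only handles a single `operand + operand` expression.

```python
from typing import Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[object, int]]]

def literal(expected: str) -> Parser:
    """Match an exact piece of text at the current position."""
    def parse(text, pos):
        if text.startswith(expected, pos):
            return expected, pos + len(expected)
        return None
    return parse

def take_while(predicate: Callable[[str], bool]) -> Parser:
    """Match one or more characters that satisfy the predicate."""
    def parse(text, pos):
        end = pos
        while end < len(text) and predicate(text[end]):
            end += 1
        return (text[pos:end], end) if end > pos else None
    return parse

def sequence(*parsers: Parser) -> Parser:
    """Run parsers one after another and collect their values."""
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def either(*parsers: Parser) -> Parser:
    """Return the result of the first parser that succeeds."""
    def parse(text, pos):
        for parser in parsers:
            result = parser(text, pos)
            if result is not None:
                return result
        return None
    return parse

# Lexing and parsing happen in the same functions, there is no separate token stream.
operand = either(take_while(str.isdigit), take_while(str.isalpha))
addition = sequence(operand, literal(" + "), operand)

def parse_addition(text: str):
    """Parse 'left + right' with exactly one space around the operator."""
    result = addition(text, 0)
    if result is None or result[1] != len(text):
        return None  # reject partial matches and any extra whitespace
    left, _, right = result[0]
    return (left, "+", right)
```

Changing the grammar here means changing or recombining a few small functions, which is the property that makes this style comfortable while the language is still in flux.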
Own company cryptocurrency token
Another small side project I had over the past two weekends was to create a cryptocurrency token for our company. We have a little internal economy going, basically giving employees a way to earn extra story points from sprints or just a thank you for doing good work. Originally I tried creating an Ethereum smart contract, but fees are crazy high, plus things are still very hard to do and test. After looking around for a while (I haven't done much crypto work for about a year) I went back to Neo and some other smart contract coins like Waves, which I immediately liked. It fits our economy and idea very well, and it also gives the new employees an easy way to get started and learn all about crypto. Things are heating up again recently, Ethereum went up 50% this month, Bitcoin just did a 20% move as well.
However the token was still not a good idea, the experiment ran fine, I got everyone to be their own bank, handle the tokens and explained how to use them. One guy played around with trading, but everyone else didn't do anything last week with it, it seems it still felt like Monopoly Money for most guys here. So I tried to assign a value to the token, but that didn't really work either, no one was exchanging it or even "getting" it.
So this week I discussed all these points with the team and we decided to switch to a stablecoin instead (and burn all our tokens). This takes out all the fluctuations and makes it very clear how much each point is worth. Also, if one stablecoin dollar is exactly one USD, it is clear what it means, even if it is still hard to understand for people who have not done anything in crypto yet .. well, learning by doing I guess.
The other change was to move away from everyone being their own bank and back to the MyDashWallet bot system, where the bot has full control over your account and shares the private keys with you if you also want to have control. This way the Telegram bot we are using internally (like several others we have written before) can do whatever you want very easily and securely: tipping, receiving or sending coins, exchanging, raining, price information and many other cool features.
Had a short presentation today at our local crypto meetup and everyone got it immediately there and was very impressed, hopefully the employees will get it as well when using it more :)
Coverage back to 100%
Similar to our company work, where we had to clean up last week's presentation work to get everything nice and clean again (all tests passing, coverage back to 100%, any dummy or hacky code removed immediately), I still have to do the same for Strict. I am still kinda stuck in the MethodBody parsing, which has to be rewritten as the old LineLexer and Tokenizer parsing logic doesn't make much sense anymore. It should hopefully be finished by tomorrow, and I will try to blog more in August about progress there. We are also discussing creating another blog for our Towers game development starting up right now (or in general game engine development, VR, games, etc.). On my old blog I had a lot of categories, my focus is still just this blog and hopefully the other blogs can be done by other team members.
Still working on the package loading code from the last blog entry. The main issue was the dummy repositories system I built a few days ago to grab code from a fixed folder, which didn't exist on the CI server. So instead of hacking another quick solution, the code was changed to download any repository from GitHub and provide it in a local StrictPackages cache folder. This works very well and is also efficient, but there are so many problems to be solved, not just the caching and when to redownload the cached folders, but also a huge amount of testing and CI issues that took a long time to fix:
All good now, very fast for development and the CI server will just pull any github repository older than 1h and keep using it for all its tests, later with versioning and https://packages.strict.dev it will work much nicer. Also packages should not just be github repositories, but also be compiled and versioned, which will be much easier to download and use. Currently package management is not very high on the priority list, it just needs to work so testing can go on.
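The "pull anything older than 1h" rule is simple enough to sketch. The following Python snippet is only an illustration of that caching decision, not the actual Strict code: the function names, the organization/repository folder layout and the use of the folder modification time as "last download" are all assumptions.

```python
import os
import time

# Assumed threshold, taken from the "older than 1h" rule described above.
CACHE_MAX_AGE_SECONDS = 60 * 60

def cache_folder_for(cache_root: str, organization: str, repository: str) -> str:
    """Hypothetical layout: <cache_root>/<organization>/<repository>."""
    return os.path.join(cache_root, organization, repository)

def is_stale(downloaded_at: float, now: float,
             max_age: float = CACHE_MAX_AGE_SECONDS) -> bool:
    """A cached copy is stale once it is older than max_age seconds."""
    return now - downloaded_at > max_age

def needs_download(folder: str, now=None) -> bool:
    """Download if the folder is missing or its last modification is too old."""
    if not os.path.isdir(folder):
        return True
    current = time.time() if now is None else now
    return is_stale(os.path.getmtime(folder), current)
```

Keeping `is_stale` free of file system access makes the age rule itself trivial to unit test, the folder check is the only part that needs a real directory.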
Once packages work, the first obvious use case is to grab Types from them. As explained in the last blog post, any public type (any package publicly available and any upper-case type in it) is always available in all .strict files, there is no need to import anything, the whole universe is always available. This is pretty cool when writing code and discovering existing types and features, but it makes type discovery quite a challenge and requires a ton of high-level optimizations and caching plus low-level code that performs very well going through the code trees. This is the picture from the last blog post:
The final implementation is actually just one expression body, but took me multiple days to find all the issues and write a lot of tests to cover all the required features. And even with it working now, the performance is not that great yet, see below for more optimizations.
FindDirectType is just a foreach loop over all types defined directly in the package (not in any sub folder, those are sub packages). It is about twice as fast as a similar Find or FirstOrDefault LINQ query. It is also usually inlined and only used in a few places:
public Type? FindDirectType(string name)
{
	foreach (var type in types)
		if (type.Name == name)
			return type;
	return null;
}
Next the FindType method skips over any private name (when a type starts with a lower-case letter) because no other package would be allowed to use it anyway. The final line first searches all child packages recursively via FindDirectType again, also excluding the context we are coming from (usually our package we jumped into from the Parent.FindType search).
private Type? FindTypeInChildrenPackages(string name, Context? searchingFromPackage)
{
	foreach (var child in children)
		if (child != searchingFromPackage)
		{
			var childType = child.FindDirectType(name) ??
				(children.Count > 0 ? child.FindTypeInChildrenPackages(name, searchingFromPackage) : null);
			if (childType != null)
				return childType;
		}
	return null;
}
Not the prettiest code, but it works and performs its job well. This was actually the most difficult part, as I initially used FindType here recursively and had a lot of problems with sub trees searching the same parent again or parents going into the same children over and over again (lots of StackOverflowExceptions).
The first rule of optimization is to measure. I pretty much knew that the main issue would be searching from the root package to all children, so this is where I added the cache. This high-level optimization already gave a good boost (10-100x faster depending on the use case), and it will probably matter even more in the long run when there are hundreds of packages and thousands or millions of files.
This is my first line-by-line profiling on the finished working code with all tests green and ContextTests.LoadingTypesOverAndOverWillAlwaysQuicklyReturnTheSame used to check the performance of doing 1 million calls to FindType. Without the cache it is around twice as slow (and sometimes would time out NCrunch, so the cache is really good), but as you can see from the profile result, that is not really the main problem.
It seems only 39% of the time is even spent in the code I wrote, most of it is wasted on system, string and collection code. First order of business is to reduce the amount of string manipulation and maybe inline a few more properties and methods that just pass data around (with line-by-line profiling there is a lot of overhead, so I switched to sample profiling mode).
Digging deeper into the performance results I saw a lot of Enumerators being created and disposed, so I started removing any foreach loop or LINQ query, and if there was any string manipulation or comparison, I tried to remove or simplify it. Profiling a bit more after some optimizations showed that most time in my example was spent in the Root package checking the cache, which means it works very well already: almost no time is spent in the tree and the only optimization left is to make the cache faster.
After replacing Dictionary with FastDictionary it was time to profile again and surprise, surprise, it was 3 times slower. I guess .NET Core 3.1 is already optimized quite well. I remembered that I could still make string operations about twice as fast by using StringComparer.Ordinal, like this blog post talks about, except it didn't help either and made the code about 20% slower than just using the non-StringComparer methods. The last thing I tried was replacing char.IsLower with some custom if code, which made that part a bit faster as well, but I reverted it because .NET Core is quite optimized and good at this and much more capable than a quick if check (it took about 3% of all string check time, so not important anyway).
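To show how excluding the package we came from avoids the endless parent/child ping-pong, here is a small Python sketch of the same search order. It is a simplified model and not a port of the C# code: the `Package` class, the tuple return value and the exact order of the private-name check are assumptions made for the example.

```python
class Package:
    """Simplified stand-in for a Strict package in a package tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.types = []  # names of the types defined directly in this package
        if parent is not None:
            parent.children.append(self)

    def find_direct_type(self, name):
        """Only look at types defined directly in this package."""
        for type_name in self.types:
            if type_name == name:
                return (self.name, type_name)
        return None

    def find_type_in_children(self, name, searching_from=None):
        """Depth-first search of sub packages, skipping where we came from."""
        for child in self.children:
            if child is searching_from:
                continue  # already searched, do not walk back into it
            found = (child.find_direct_type(name)
                     or child.find_type_in_children(name, searching_from))
            if found is not None:
                return found
        return None

    def find_type(self, name, searching_from=None):
        """Own types first, then children, then up to the parent."""
        found = self.find_direct_type(name)
        if found is not None:
            return found
        if name[:1].islower():
            return None  # private names are never visible from other packages
        found = self.find_type_in_children(name, searching_from)
        if found is None and self.parent is not None:
            # Tell the parent which child we are, so it does not search us again.
            found = self.parent.find_type(name, searching_from=self)
        return found
```

Because every upward step passes `self` along as the package to skip, each sub tree is visited at most once per lookup, which is what prevents the recursion from looping.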
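The cache itself can be sketched in a few lines. This Python version is only an illustration with made-up names (`CachedTypeLookup`, `tree_searches`); it assumes the tree search is passed in as a plain function and, unlike a real implementation, never invalidates entries when packages change.

```python
class CachedTypeLookup:
    """Wraps a slow tree search with a dictionary cache at the root."""

    def __init__(self, search_tree):
        self._search_tree = search_tree  # slow function: name -> type or None
        self._cache = {}
        self.tree_searches = 0  # counts how often the slow path was taken

    def find_type(self, name):
        if name in self._cache:
            return self._cache[name]
        self.tree_searches += 1
        found = self._search_tree(name)
        # Misses are cached too (an assumption), so repeated lookups of
        # unknown names do not walk the whole tree again.
        self._cache[name] = found
        return found
```

Counting the slow-path calls instead of measuring time keeps the check deterministic: a million lookups of the same name should result in exactly one tree search.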
This is the final result, I spent over an hour trying out the above optimizations and just made it worse, so back to this version, good enough (78ms for 1 million FindType calls):
Tons of changes have been made in the last few days to load packages with all types and all their methods. The reason and use case was trying to put Strict into production already. Having a few unit tests work and experimenting around with simplified language ideas is all nice and good, but useless in the long run if I can't prove it works with actual code in the real world. It is obviously way too early to tell. However, nothing is preventing us from writing some tests that assume we can run the existing code already.
Initially I tried running some code in a quick self-written interpreter (ala virtual machine) like shown in most compiler/interpreter books, and getting some simple state machine and calculator parser and interpreter off the ground is not that hard if you have done it a few times. That is not very exciting for me or Strict, so I was looking for a full solution instead. The much older Strict parser and interpreter was written in NRefactory and then later ported to Roslyn (many many years ago when it first came out) and also used Irony for the SNF parsing. That code still works, but is quite complex and not very similar to the new functional way. We also got the Strict SDK running in Go and that is working fine too, but we don't have an interpreter/virtual machine there yet, just some backend code to generate source code in another language, which has quite a lot of issues as well (e.g. for C++ code to compile each type must work and currently it just isn't done yet).
Back to the problem at hand: Loading packages, which contain class types and sub packages. Types contain methods and all the statements are in those.
A good example for a package is the Strict.Base package, which gives us all the base types we usually need anyway (reduced the implementation to what is working now, there will be more types soon).
For now Any.strict (providing ComputeHashCode and IsEqualTo methods) was removed as we don't want to force everyone to implement those or autogenerate them for everything. Every type should get a hashcode, equal checks and conversion to text (ala ToString) automatically anyway.
Any trait: Basis for all classes, is always implemented. Provides to HashCode and to Text (both automatically implemented by default in the compiler, can be overwritten)
Mutable trait: Does not implement anything, just provides the compiler with the knowledge that this changes and is not threadsafe (and should be avoided)
Number class: Most used type for anything that requires computation, provides number manipulation methods and to / fromText, etc.
Character class: Needed for text, basically a number, but will be implemented as utf8 char
Count class: Mutable version of Number, which is only used in a single thread, often optimized away
HashCode class: Just implements number and stores the hashcode in the implementation (usually as int)
Text class: List of characters with a bunch of helpful text methods (implemented as string obviously)
Input trait: For getting data, usually from stdin, also reading files or any input device
Output trait: For writing data, usually to stdout, stderr or any file, display, data, etc.
Log class: Implements Output and by default writes to the Console (but the user can provide his own implementation, which would change usages)
App trait: Entry point for all apps (there can only be one per package, which must be in the main namespace), requires Run to be implemented
This should be enough to create a console app. If a file is a class or trait usually doesn't matter except when you try to implement it for a new class, where only traits are shown and allowed. Classes are used via has keyword as members. On purpose most complicated methods and features have been left out (localization and culture stuff, we always assume international ISO formats for now). Also no Type, Function or Iterator features yet. Again: We don't want to replace any framework here, just provide the basis so simple programs can be understood and generated by machines.
is(any) returns Boolean
to returns HashCode
to returns Text
Defines all the methods available in any type (everything automatically implements Any). These methods don't have to be implemented by any class, they will be automatically implemented with default behavior if not provided. In the current iteration I removed the method keyword as it is obvious that returns is only used for methods (and None methods are easy to spot as well). Often Any is replaced by a specific type or trait to be more useful in an implementation, for example Input.
+(other) returns Number
	+(5) is 5
	Number(3) + 4 is 7
	return self + other
-(other) returns Number
	-(5) is -5
	Number(3) - 2 is 1
	return self - other
/(other) returns Number
	/(50) is 0
	Number(1) / 20 is 0.05
	return self / other
*(other) returns Number
	Number(3) * 4 is 12
	return self * other
>(other) returns Is
	test(0) is false
	test(3) is true
	return self > other
>=(other) returns Is
	test(0) is true
	return self >= other
<(other) returns Is
	test(0) is false
	test(3) is true
	return self < other
<=(other) returns Is
	test(0) is true
	return self <= other
Currently implements all the basic math operations. Conversion to Text is done in that class.
test(7) is '7'
return '0' + number
test("b") is 'b'
to returns Text
test('a') is "a"
'7' is not valid yet, maybe Character will become private (thus character), not sure if there are any usecases outside Text for this. Converting numbers to Characters is helpful and getting the first Character from text is also good, same as converting back to Text.
implement Number
Increase
	Count(5).Increase is 6
	self = self + 1
Decrease
	Count(3).Decrease is 2
	self = self - 1
Here we can test methods that return None because they modify the state of itself (the Number), but we still allow the shortcut testing because we know that we talk about the thing before the None method call. This works everywhere else just as well (even with chaining). ++ or -- are not valid operators in Strict.
Nothing here yet except a number, probably will stay that way and the Any autoimplementation of to HashCode will just xor each member (with some optimizations for complex things like Text).
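As a rough idea of what "just xor each member" means, here is a tiny Python sketch. The function name and the 32-bit mask are assumptions (the text only says the hash code is usually stored as an int), and the optimizations for complex members like Text are not covered.

```python
def combine_hash_codes(member_hash_codes):
    """Xor all member hash codes together and keep the result within 32 bits."""
    result = 0
    for code in member_hash_codes:
        result ^= code
    return result & 0xFFFFFFFF
```

One known property of plain xor is that it ignores member order and that two equal members cancel each other out, so a real implementation would probably mix in the member position as well.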
	test(45) is "45"
	return stream digit from digits(number)
digits(number) returns Iterator<Number>
	test(1) is (1)
	test(123) is (1, 2, 3)
	if number / 10 > 0
		yield digits(number / 10)
	yield number % 10
+(other) returns Text
	+("more") is "more"
	"Hey" + " " + "you" is "Hey you"
	return self.Characters + other.Characters
See the blog post June 17, 2020 As Simple As Possible for details. Because Characters ends with s, the type Character is used as an Iterator (readonly array). The + method adds two texts by using the + method for Iterators, which will just create a new bigger list containing both parts.
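The recursive digits method from the Text listing maps nicely onto a Python generator. This sketch assumes non-negative whole numbers and integer division (`//`), which is an assumption about how the Strict `/` is meant to behave here, and the helper name `text_from_number` is made up for the example.

```python
def digits(number):
    """Yield the decimal digits of a non-negative integer, most significant first."""
    if number // 10 > 0:
        yield from digits(number // 10)
    yield number % 10

def text_from_number(number):
    """Build the text by converting each digit to its character, like '0' + digit."""
    return "".join(chr(ord("0") + digit) for digit in digits(number))
```

The two test lines from the listing translate directly: `list(digits(123))` gives the digits 1, 2, 3 and `text_from_number(45)` gives the text "45".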
Read returns Any
Typical example of a trait in Strict, it is super short and easy to read. When loading files Iterator<Text> or Iterator<Number> might be more useful than just Any, but anything is allowed and can be limited when implementing.
Log implements Output via generic specification implements Output<Text>, so only text entries can be written (lines). The log trait is not implemented in Strict yet, the backend will provide us with a ConsoleLog version that will be injected. For testing we need a MockLog thingy as well and I am currently thinking about enforcing writing Mock implementation classes in Strict when using external classes.
Another very simple trait just telling us to implement Run, which is the entry point for our package (in case we want to run it, most packages will just be libraries).
All this was just done to force me to implement pre-loading types in a package for the current LoadStrictBaseTypes test, then pre-load each of the implementations, members and methods (which might use other not yet loaded types from the same package). And then do the same for the methods, which are evaluated lazily, only when they are needed. All types and methods used in a method body need to be available to compile correctly.
This is not easy at all, I tried several approaches and had to revisit and update this a few times until it all made sense and worked, luckily unit tests helped me stay sane. The following picture shows the typical search steps and optimizations done. It is different from simple binary searches or finding types in other languages because in Strict any public type can be used at any place. There is much more to be done to make this work by discovering types from packages.strict.dev, more on that later.
As described last week I tried to simplify the Strict syntax and get some low-level type, member and method parsing working in a new simplified repository: https://github.com/strict-lang/Strict
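Lazy evaluation of method bodies can be pictured like this. The Python sketch below is only a model with hypothetical names (`Method`, `parse_count`), and the "parsing" is a stand-in that just splits each line into words, the point is only that nothing happens until the body is first requested.

```python
class Method:
    """Keeps the raw body lines and parses them only on first access."""

    def __init__(self, name, body_lines):
        self.name = name
        self._body_lines = body_lines
        self._parsed_body = None
        self.parse_count = 0  # how often the body was actually parsed

    @property
    def body(self):
        if self._parsed_body is None:
            self.parse_count += 1
            # Stand-in for real parsing: strip the indentation, split into words.
            self._parsed_body = [line.strip().split(" ")
                                 for line in self._body_lines]
        return self._parsed_body
```

Loading a package then only costs reading the signatures, and methods that are never called never pay for parsing their bodies.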
It took a few evenings to make sense of it, now we got a pretty decent simple packages, type, members and method definition parsing system in around 250 lines. No lexing or actual tokenized parsing is going on, Strict is very strict about the syntax and we can assume a lot of things and just abort if a file doesn't match the expected pattern.
With methods, however, there is obviously a lot of flexibility and even more rules, so this approach isn't going to work. Using a full parser is not the best choice either, as it allows way too flexible input (ignoring whitespaces, comments, tabs, spaces, extra spaces at end of lines or files), which we want to avoid. The goal is still to get a 1:1 mapping of compiled packages back to their source code without losing anything going back and forth. Plus we want to find one true solution to a problem and not allow many possible ways to do it (which is impossible to achieve, but at least we can limit the search space a lot).
So I looked around in old code (including the SDK in Go, older Strict versions in C#, Lua, Python and C++). Code I found ranged from domain-specific languages, simple state machine parsers and regex parsers to cool projects like Sprache and Superpower and of course the many available full-fledged parsers (ANTLR, Irony, etc.), but nothing really fit out of the box. I tried plugging in some old code and got a few lines working, but I wasn't happy with the extra complexity.
I started back from the beginning with a very simple lexer that spits out tokens, which are then consumed by the MethodParser. Some parts might even be merged because the lexer isn't really doing much and the tokens have to be in an expected order anyway. But error reporting is nice this way and I am not sure about the complexity yet, and we might be better off separating lexer, tokens and parsing so applying things like the Observer pattern stays easy (I have no use case for that yet, so it is not implemented).
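Rejecting flexible whitespace is easy to check line by line. The following Python sketch shows one possible set of checks. The concrete rules and error messages are assumptions derived from the whitespace problems listed above (it also ignores that text literals might legitimately contain double spaces), so treat it as an illustration and not as the actual Strict rules.

```python
def find_formatting_problem(line):
    """Return a description of the first formatting violation, or None if clean."""
    content = line.lstrip("\t")  # everything after the leading tabs
    if content.startswith(" "):
        return "indentation must use tabs only"
    if "\t" in content:
        return "tabs are only allowed as indentation"
    if content.endswith(" "):
        return "trailing space"
    if "  " in content:
        return "double space between tokens"
    return None
```

With rules like these there is exactly one valid way to write a given line, which is what makes a lossless round trip between source and compiled form realistic.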
We use the lexer for each line and always start looking at the tabs first, we start at 1 and go deeper for nested statements (if, for, stream), there is no space following this token, but a space must be between every other token except ( and ).
test is the first one in any method
( and ) are needed to pass in arguments to test and method calls
is is our comparer (ala ==, which doesn't exist in Strict)
let allows to create scoped assignments (ala const, reassignment isn't allowed in Strict without the mutable keyword)
identifier to name let assignments, also might be a type, unknown here (actually we could know this and classify this different maybe)
= assign values to let, has or parameters, any expression more complex than a const value can only happen in methods (we don't have a complex parser at member or parameter level anyway)
+ example binary operator for now
returns also has been removed in the last post, the last statement in a method must either be a non-return statement and thus makes the method not return anything (None) or return a value of a specific type.
if, for, etc. coming soon.
MethodCall test or any other method call, currently must include () to tell the parser this is a method call as opposed to a member or let
LetAssignment assigns a value or expression to a local field
Return ends the function and can return a value
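The line rules described above (tabs first, single spaces between tokens, parentheses may touch their neighbours) could be sketched roughly like this in Python. This is not the real lexer, just my reading of the rules; lex_line, classify and UnexpectedInput are made-up names, and only the handful of tokens listed above are covered.

```python
import re

KEYWORDS = {"test", "is", "let", "return"}
PARENTHESES = {"(", ")"}
# One token: a parenthesis, an operator, a number or a word.
TOKEN = re.compile(r"[()=+]|[0-9]+|[A-Za-z]+")


class UnexpectedInput(Exception):
    """Raised instead of trying to recover from sloppy formatting."""


def classify(value):
    if value in KEYWORDS:
        return "keyword"
    if value.isdigit():
        return "number"
    if value.isalpha():
        return "identifier"
    return "symbol"


def lex_line(line):
    """Return (tab depth, [(kind, value), ...]) for one method body line."""
    text = line.lstrip("\t")
    depth = len(line) - len(text)
    if depth < 1:
        raise UnexpectedInput("method body lines start with at least one tab")
    tokens = []
    position = 0
    after_space = True  # also true at line start, so a leading space is rejected
    while position < len(text):
        if text[position] == " ":
            if after_space:
                raise UnexpectedInput(f"unexpected space at column {position}")
            after_space = True
            position += 1
            continue
        match = TOKEN.match(text, position)
        if match is None:
            raise UnexpectedInput(f"unexpected character {text[position]!r}")
        value = match.group()
        touching = bool(tokens) and not after_space
        if touching and value not in PARENTHESES and tokens[-1][1] not in PARENTHESES:
            raise UnexpectedInput(f"missing space before {value!r}")
        tokens.append((classify(value), value))
        after_space = False
        position = match.end()
    if after_space:
        raise UnexpectedInput("empty line or trailing space")
    return depth, tokens
```

Calling lex_line on a line with one leading tab followed by test(1) is 2 gives depth 1 and the six tokens test, (, 1, ), is, 2, while doubled spaces or number+number without spaces are rejected right away.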
As you can see all this is still very easy and allows me to experiment around with different ideas very quickly.
test(1) is 2
let doubled = number + number
The let is obviously useless and would be optimized away (which means the source code would change to return number + number automatically and more optimizations based on that). The whole method doesn't make much sense and probably won't be allowed in some future version, i.e. removing and inlining all code would make it much clearer (especially by just replacing it with 2 * number).
All code can be found at the usual location, coverage is 100%, TeamCity does a lot of nice extra checks and the code is still very clean, nice and short: https://github.com/strict-lang/Strict
While exploring options yesterday and today for creating a great editor experience for Strict, I discovered some new options. We already have a VSCode integration that provides basic syntax highlighting and works for writing a few lines, but it is not a fun experience at all if you are used to full-fledged IDEs. The Strict IntelliJ plugin we recently got working is good enough for some basic Auto Completion / IntelliSense, but there are a thousand little issues, which makes the experience not very good yet (which is why it is not released yet and we have no one using or working on it daily atm, as opposed to the sdk, Strict and VSCode code bases). I am by no means a Java guru and don't really like working on top of the IntelliJ platform sdk, so I am unsure when this plugin is going to be improved.
What sounds very promising is the Language Server Protocol and the growing number of implementations. In some early experiments, a Strict language server works in Visual Studio Code, Visual Studio 2019 and even IntelliJ (plus a lot of other IDEs and editors that support it, like Emacs, Vim, Atom, whatever people like to use). More on that in the next blog post.
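For anyone wondering why one server can work in so many editors: the Language Server Protocol is just JSON-RPC 2.0 messages, each prefixed with a Content-Length header. The following Python sketch shows only that framing and is not the Strict language server; frame and read_frame are names I made up, and the initialize parameters are the bare minimum.

```python
import json


def frame(message):
    """Encode one JSON-RPC message the way LSP sends it over stdio or a socket."""
    body = json.dumps(message).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_frame(data):
    """Decode the first message in data, return (message, remaining bytes)."""
    header, separator, rest = data.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("header is not terminated by an empty line")
    length = None
    for field in header.split(b"\r\n"):
        name, _, value = field.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.decode("ascii").strip())
    if length is None or len(rest) < length:
        raise ValueError("missing Content-Length or incomplete body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# The first request any editor sends to a language server.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {"processId": None, "rootUri": None, "capabilities": {}},
}
```

The editor side never needs to know which language it is talking about; it just sends framed requests like initialize and renders whatever completions or diagnostics come back.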
As Simple As Possible, but not simpler
"Everything should be as simple as it can be but not simpler!" - Albert Einstein
While trying to get the Strict language server plugin up and running for testing, I still noticed some pain points. I am currently preparing the upcoming work for the new employee Mahmoud (starting tomorrow). I can explain away most design decisions, but there are some open issues plus some simplifications that Merlin and I talked about in the past that are not implemented yet. So instead of continuing with the current sdk in Go, I thought why not try starting to bootstrap the Strict compiler directly in Strict .. but no, it's not ready yet, I got stuck very quickly.
The sdk code base is already too large for quick experiments, so I just created a new one in C# (where I feel most comfortable until Strict is hopefully more usable later this year) and kept staring at the very old design, the redesign from last year and the redesign from this year (in Go). The main thing I noticed is that many checks are just not needed and Strict is very clear on what is valid code and what isn't, so why not get away with no lexing or parsing at the file level at all.
We know a .strict file is describing a type. A type can either be just a trait (think interface) describing what should be implemented, or it is a class optionally implementing one or multiple traits. From the outside it doesn't really matter, you want to use some functionality like Account, Count, Computation, Number, Iteration, etc.
Everything automatically derives from the Any.strict trait, which looks like this (notice there is no implementation):
Either a file contains no implementations, then it is a trait, or it has just implementations, which is most files. Let's look at some String.strict examples while simplifying the language.
From(5) is "5"
From(123) is "123"
let result = create StringBuilder()
while number > 0
	result = "0" + (number % 10) + result
	number = number / 10
This was an early implementation idea, close to the current String.strict code. You can see it starts with a bunch of tests to make sure what we are doing makes sense and works. Strict enforces at least one test condition for every method (which can be any expression returning true; anything else would fail the test and thus compilation).
Here we implement the generic trait Sequence with the Character class, which is used in the next line to create an array (which is immutable like everything else not marked with the mutable trait). Next we have a special factory method called From, which has no method and no return type as it is a factory method to construct this type based on a number.
Next we create a result, which is not a class name, so here we see a type definition for the first time, as the compiler can't figure out what we mean by result automatically (string, text, name would all be strings, stringBuilder would be a StringBuilder, but that is long and ugly). The StringBuilder internally keeps a mutable array of characters we can append to, which is useful in this use case. Now we use a simple formula to add each base-10 digit at the beginning of result, then reduce the number by a factor of 10. Finally we return the StringBuilder, which has a to method to give us a String, which matches the characters defined above.
Now there are several problems with this code. First of all, the number can't be mutated, as everything is immutable by default in Strict. We can change that by making it mutable. Next, we don't even have while loops; there is currently only one form of loop, which is the good old for loop.
Let's skin the code another way:
test(5) is "5"
test(123) is "123"
create result StringBuilder
for digit in Range(0, Log10(number))
	result = "0" + (number % 10) + result
	number = number / 10
Ok, here we removed factory and just named it from, which is a reserved keyword anyway to convert stuff to something else. We also added a mutable to the number (which is still of type Number) to allow changing it in our loop. The tests look better as they directly tell us what we are asserting (btw: complex tests with multiple lines can be written as indented code blocks like everything else). Also, calling yourself and typing the method name again and again isn't productive, so let's just use the test keyword and pass the parameters directly in here.
Next I have renamed let to create and removed all the assignment stuff and the parentheses as there is nothing we pass as parameters. The loop is now a for loop and got the Range going over the digits of the number and still does the same logic inside the loop.
This is still not very functional and it seems I am still trying to low level optimize, which should be the job of the compiler and not the coder. Let's try to go a more functional approach.
test(45) is "45"
return stream digit from digits(number)
test(1) is (1)
test(123) is (1, 2, 3)
if number / 10 > 0
	yield digits(number / 10)
yield number % 10
Here we use streams, which are not documented well yet. I just added the streams page. Basically they grab any array, collection, sequence or data and pass it through the pipe in the lines below. Here we simply create a Character for each digit (which does the "0" + number thing for us). The stream combines it all back to an array of characters, which automatically matches our String we wanted to build (any type can be constructed by supplying the has members, no need to write any method, constructor or factory like that, as usual this is forbidden in Strict ^^).
This is not done yet and will be changed many times. I am currently just experimenting with parsing the above code to see if the AST that pops out makes any sense.
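For readers who want to play with the recursive digits idea before Strict can run it, a rough Python equivalent uses a generator. This is only an analogy to the Strict streams, not how they are implemented; text_from is a made-up name, and I am assuming that number / 10 in the Strict code means integer division.

```python
def digits(number):
    """Yield the base-10 digits of a non-negative integer, most significant first."""
    if number // 10 > 0:
        # Everything left of the last digit comes first.
        yield from digits(number // 10)
    yield number % 10


def text_from(number):
    """Turn each digit into its character and join them back into one text."""
    return "".join(chr(ord("0") + digit) for digit in digits(number))
```

The same two tests from the Strict version hold here: digits(1) produces just 1, digits(123) produces 1, 2, 3, and text_from(45) gives "45".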
Anyway, methods contain code that needs to be parsed, everything else (implement, has, from, method) we can make up from simple rules, which is what I am currently trying at https://github.com/strict-lang/Strict
One final note: I completely removed imports as the Context that is used to parse a file knows all types already and if any unknown type is used in a .strict file, the parsing (and thus compiler) stops. There is probably some ordering that needs to be done and the optional build.yml file needs to allow users to point to more than just the default repository for all known types.
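A tiny Python sketch of what "the Context knows all types already" could mean in practice. This is not the real implementation: Context, find_type, UnknownType and the package names used in the usage note below are all hypothetical, and the real lookup has more search steps and optimizations.

```python
class UnknownType(Exception):
    """Parsing (and thus compiling) stops when a type cannot be found."""


class Context:
    """A package that knows its own types and asks its parent for the rest."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.types = set()

    def add_type(self, type_name):
        self.types.add(type_name)

    def find_type(self, type_name):
        context = self
        while context is not None:
            if type_name in context.types:
                return f"{context.name}.{type_name}"
            context = context.parent
        raise UnknownType(f"{type_name} is not known in {self.name}")


def check_used_types(context, used_type_names):
    """Resolve every type a file uses without any import lines, or stop."""
    return [context.find_type(name) for name in used_type_names]
```

With a base context that contains Number and an app context on top of it that contains Account, a file in the app can use both names without importing anything, and any typo in a type name stops parsing immediately.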
This was just two nightly code sessions, with most of the time spent thinking about simplifications and what makes sense. This repository will stay in flux for some time and should not be considered stable (the sdk repo works and is usable, and we will still fix any bugs there until the new repo is remotely usable). The main goal here is to make the editor support and language server implementation much easier and also to think about what makes sense while adding some code we can compile and run soon (using as much as possible from existing blocks).
Today's goal is just to get it all green on TeamCity CI (Continuous Integration), which is still complaining about some ugly comments, some small issues and not having 100% coverage yet .. no biggie.
My employee Abir poked me to blog more about Strict and the decisions I recently took for the IDE development. My old blog at https://BenjaminNitschke.com (2004-2016) is pretty much dead and I never found time to continue talking about game development in recent years, which is mostly because our company just didn't do much game development work. All paid work for the past 5 years has been outside of creating games (aside from one mobile game app exception). We still did improve the Delta Engine, participated in Game Jams, worked on our Towers RTS game and stayed connected in the space. However, this only happened when there was free time, and paid work was always more important to keep the company and employees alive. Now we are doing much better with 10 employees at Delta Engine atm, our AI and Robotics work is getting off the ground and use cases are in sight. However, no employee I currently have is a game developer atm (including me). So my focus in life will be work plus https://strict.dev
I am happy to announce that a new employee, Mahmoud, will help me with all this starting next week, continuing work on Strict, the IDE, compiling, SCrunch, etc. I am pumped to get more speed on the road again.
IntelliJ vs Visual Studio Code
Merlin and I had this discussion earlier this year already; it was actually about which IDEs Strict should support. Merlin is coming from the Java world and really likes IntelliJ, which I agree is the most productive IDE for Java and other JDK languages, and we also use GoLand to develop https://github.com/strict-lang/sdk, so we decided to focus on that first. I use Visual Studio day in and day out and I am by far the most productive in C#, utilizing many tools (ReSharper, NCrunch) and workflows that only exist in Visual Studio. I did actually write an earlier version of Strict with the Irony Compiler Kit in .NET over 10 years ago and started writing a Visual Studio Extension, but it was not easy to fix all issues and constantly add features. We also worked at Delta Engine on Visual Studio Extensions around 7 years ago and it was a mess and very hard to maintain. I checked back around a year ago and it is still not a great development experience. So IntelliJ was chosen instead, which is a full-fledged IDE (and since we are using ReSharper, similar in features and hotkeys to what we use daily in Visual Studio) with many amazing features we want for Strict, and the platform sdk docs are very good as well: https://www.jetbrains.org/intellij/sdk/docs/intro/welcome.html
The downside to IntelliJ (and the Java world in general) is the complexity: there is way too much code, so many little issues you have to know about, so many annoying patterns and copy+pasting until stuff works, which is very different from what we want to achieve with Strict. Development slowed down to a halt, Merlin was annoyed, I had never worked with IntelliJ plugins and the learning curve is steep. I created a freelancer project to finish the blocking issues and found a developer who could finish it after a lot of back and forth: https://www.freelancer.com/projects/java/Add-Smart-Auto-Completion-IntelliJ/details
We also have created a simple Visual Studio Code plugin to support the .strict file format and syntax highlighting: https://github.com/strict-lang/vscode-strict
IntelliJ
Very powerful, full-fledged IDE
Not as open and free as VS Code, thus smaller community outside of the Java world, but would still be free for Strict users (IntelliJ community is free)
Great docs, lots of people comfortable with it
Really Bloated, I really hate downloading 500MB+ just to launch gradle, grab some new sdk, get some IntelliJ version, etc.
Visual Studio Code
Free, small, slick, open source
Greatest plugin ecosystem of all IDEs, really small and fast plugins, similar to Chrome plugins, just nice
Not as powerful, haven't seen a really nice IDE experience or language implementation as good as Visual Studio or IntelliJ provide
Visual Studio 2019
Community edition is free as well
IMO best IDE for the past 20 years by far, pretty much any professional C# or C++ developer uses it (especially game developers, which is the world I am coming from)
Lots of commercial and professional plugins
Not as popular in the open source community, also most of the things are stuck in their world, you can't just reuse features of some plugin as they are mostly closed source
Extension development is not so nice and not so many people do it, smaller things are ok, but complex stuff is painful
Going the Visual Studio Code route for now, I will post more when I find some time this or next week about my little language service and running it with Strict.