From 3377fb9ba36800fd3f40616e3906d0f22228c740 Mon Sep 17 00:00:00 2001 From: Runxi Yu Date: Thu, 5 Dec 2024 20:51:59 +0800 Subject: New repo locations and file structure --- .gitignore | 2 +- Makefile | 14 +++-- README.md | 143 +++-------------------------------------------- language_description.md | 146 ++++++++++++++++++++++++++++++++++++++++++++++++ reasoning | 10 ---- 5 files changed, 164 insertions(+), 151 deletions(-) create mode 100644 language_description.md delete mode 100644 reasoning diff --git a/.gitignore b/.gitignore index 7bd2dc8..2d19fc7 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1 @@ -/README.html +*.html diff --git a/Makefile b/Makefile index 2a4029d..dcd9a2c 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,11 @@ -.PHONY: upload +.PHONY: upload default -README.html: README.md - pandoc --mathml -so README.html -c style.css README.md +default: language_description.html -upload: README.html style.css - rsync --mkpath README.html style.css runxiyu.org:/var/www/docs/e2/ +.SUFFIXES: .md .html + +.md.html: + pandoc --mathml -so $@ -c style.css $< + +upload: language_description.html style.css + rsync --mkpath language_description.html style.css runxiyu.org:/var/www/docs/e2/ diff --git a/README.md b/README.md index f61f934..655b1fe 100644 --- a/README.md +++ b/README.md @@ -1,135 +1,8 @@ ---- -title: $e^2$ language testing space -author: Test_User and Runxi Yu ---- - -Note: The name "$e^2$" (or "e2" in plain text) is subject to change. - -Many languages attempt to be "memory safe" by processes such as reference -counting, borrow checking, and mark-and-sweep garbage collection. These, for -the most part, are guided towards preventing programmer error that causes -use-after-frees, memory leaks, and similar conditions. We hereby refer to them -as "conventional memory safety features". 
- -However, in most cases, languages other than assembly (including these allegedly -memory safe languages) do not handle stack overflows correctly; -although dynamic allocation failures could be easily handled, correctly-written -programs could crash when running out of stack space, with no method to detect -this condition and fail gracefully. - -Conventional memory safety features are not our priority, but we may choose to -include them in the future, likely with reference counting while allowing weak -pointers to be labelled. - -## General idea of how it should work - -We haven't decided on general syntax yet. We generally prefer C-like syntax, -although syntax inspired from other languages are occasionally used when -appropriate; for example, multiple return values look rather awkward in the C -syntax, so perhaps we could use Go syntax for that (`func f(param1, param2) -(return1, return2)`), although we'd prefer naming parameters with `type -identifier` rather than `identifier type`. - -For stack safety: When defining a function, the programmer must specify what to -do if the function could not be called (for example, if the stack is full). For -example, `malloc` for allocating dynamic memory would be structured something -like follows: - -```e2 -func malloc(size_t s) (void*) { - /* What malloc is supposed to do */ - return ptr; -} onfail { - return NULL; -} -``` - -If something causes `malloc` to be uncallable, e.g. if there is insufficient -stack space to hold its local variables, it simply returns NULL as if it failed. - -Other functions may have different methods of failure. Some might return an -error, so it might be natural to set their error return value to something like -`ESTACK`: - -```e2 -func f() (err) { - return NIL; -} onfail { - return ESTACK; -} -``` - -The above lets us define how functions should fail due to insufficient stack. 
-This pattern is also useful outside of functions as a unit, therefore we -introduce the following syntax for generic stack failure handling: - -```e2 -either { - /* Do something */ -} onfail { - /* Do something else, perhaps returning errors */ -} -``` - -Note that the `onfail` block must not fail; therefore, the compiler must begin -to fail functions, whenever subroutines that those functions call have `onfail` -blocks that would be impossible to fulfill due to stack size constraints. - -Functions can be marked as `nofail`, in either the function definition or when -calling it. A `nofail` specification when calling it overrides the function -definition. - -```e2 -nofail func free() () { - /* What free is supposed to do */ -} -``` - -This will ensure that calling `free` can never fail due to lack of stack space. -If such a case were to present itself, the compiler must make the caller fail -instead. This is recursive, and thus you cannot create a loop of `nofail` functions. -You may use `canfail` to be explicit about the reverse in function definitions, -or to override a function when calling it. In the latter case, if the function -does not define an `onfail` section, you must wrap it in a `either {...} onfail -{...}` block. - -## Overflow/underflow handling - -Integer overflow/underflow is *usually* undesirable behavior. - -Simple arithmetic operators return two values. The first is the result of the -operation, and the second is the overflowed part, which is a boolean in -addition/subtraction and the carried part in multiplication; but for division, -it is the remainder. The second return may be ignored. - -Additionally, we define a new syntax for detecting integer overflow on a wider -scope: -```e2 -int y; -try { - /* Perform arithmetic */ - y = x**2 + 127*x; -} on_overflow { - /* Do something else */ -} -``` -The overflow is caught if and only if it is not handled at the point of the -operation and has not been handled at an inner `on_overflow`. 
- -## Other non-trivial differences from C - -1. Instead of `errno`, we use multiple return values to indicate errors where - appropriate. -2. Minimize undefined behavior, and set stricter rules for - implementation-defined behavior. -3. Support compile-time code execution. -4. More powerful preprocessor. -5. You should be able to release variables from the scope they are in, and not - only be controllable by code blocks, so stack variables can be released in - differing orders. -6. Strings are not null-terminated by default. -7. There is no special null pointer. -8. No implicit integer promotion. -9. Void pointers of varying depth (such as `void **`) can be implicitly casted - to pointers of the same or deeper depth (such as `void **` -> `int ***`, - but not `void **` -> `int *`). +# The e² programming language + +- [Home page](https://docs.runxiyu.org/e2/) +- [C implementation](https://docs.runxiyu.org/tau2/) +- [Git repositories](https://git.runxiyu.org/e2/) +- [Ticket tracker](https://todo.sr.ht/~runxiyu/e2/) +- [Announcement list](https://lists.sr.ht/~runxiyu/e2-announce/) +- [Development list](https://lists.sr.ht/~runxiyu/e2-devel/) diff --git a/language_description.md b/language_description.md new file mode 100644 index 0000000..fb77a16 --- /dev/null +++ b/language_description.md @@ -0,0 +1,146 @@ +--- +title: $e^2$ language description +author: Test_User and Runxi Yu +--- + +## Introduction + +Many languages attempt to be "memory safe" by processes such as reference +counting, borrow checking, and mark-and-sweep garbage collection. These, for +the most part, are guided towards preventing programmer error that causes +use-after-frees, memory leaks, and similar conditions. We hereby refer to them +as "conventional memory safety features". 
+ +However, in most cases, languages other than assembly (including these allegedly +memory safe languages) do not handle stack overflows correctly; +although dynamic allocation failures could be easily handled, correctly-written +programs could crash when running out of stack space, with no method to detect +this condition and fail gracefully. + +Conventional memory safety features are not our priority, but we may choose to +include them in the future, likely with reference counting while allowing weak +pointers to be labelled. + +## General syntax + +We haven't decided on general syntax yet. We generally prefer C-like syntax, +although syntax inspired from other languages are occasionally used when +appropriate; for example, multiple return values look rather awkward in the C +syntax, so perhaps we could use Go syntax for that (`func f(param1, param2) +(return1, return2)`), although we'd prefer naming parameters with `type +identifier` rather than `identifier type`. + +## Stack safety + +When defining a function, the programmer must specify what to do if the +function could not be called (for example, if the stack is full). For example, +`malloc` for allocating dynamic memory would be structured something like +follows: + +```e2 +func malloc(size_t s) (void*) { + /* What malloc is supposed to do */ + return ptr; +} onfail { + return NULL; +} +``` + +If something causes `malloc` to be uncallable, e.g. if there is insufficient +stack space to hold its local variables, it simply returns NULL as if it failed. + +Other functions may have different methods of failure. Some might return an +error, so it might be natural to set their error return value to something like +`ESTACK`: + +```e2 +func f() (err) { + return NIL; +} onfail { + return ESTACK; +} +``` + +The above lets us define how functions should fail due to insufficient stack. 
+This pattern is also useful outside of functions as a unit, therefore we +introduce the following syntax for generic stack failure handling: + +```e2 +either { + /* Do something */ +} onfail { + /* Do something else, perhaps returning errors */ +} +``` + +Note that the `onfail` block must not fail; therefore, the compiler must begin +to fail functions, whenever subroutines that those functions call have `onfail` +blocks that would be impossible to fulfill due to stack size constraints. + +Functions can be marked as `nofail`, in either the function definition or when +calling it. A `nofail` specification when calling it overrides the function +definition. + +```e2 +nofail func free() () { + /* What free is supposed to do */ +} +``` + +This will ensure that calling `free` can never fail due to lack of stack space. +If such a case were to present itself, the compiler must make the caller fail +instead. This is recursive, and thus you cannot create a loop of `nofail` functions. +You may use `canfail` to be explicit about the reverse in function definitions, +or to override a function when calling it. In the latter case, if the function +does not define an `onfail` section, you must wrap it in a `either {...} onfail +{...}` block. + +`nofail` exists because if you can get into a situation where there's no way to +free resources you no longer need, you have done something wrong. If the +language doesn't give you a way to not do the above, the language has done +something wrong. `free()`, `close()`, unlocking, and other such should be +marked as `nofail`, so that you don't run out of stack space trying to call +them, resulting in inability to free resources. It's good for situations where +failing to call a function partway through is deemed (by the programmer) +undesirable, and useful for times when you don't want to deal with failure. + +## Overflow/underflow handling + +Integer overflow/underflow is *usually* undesirable behavior. 
+ +Simple arithmetic operators return two values. The first is the result of the +operation, and the second is the overflowed part, which is a boolean in +addition/subtraction and the carried part in multiplication; but for division, +it is the remainder. The second return may be ignored. + +Additionally, we define a new syntax for detecting integer overflow on a wider +scope: +```e2 +int y; +try { + /* Perform arithmetic */ + y = x**2 + 127*x; +} on_overflow { + /* Do something else */ +} +``` +The overflow is caught if and only if it is not handled at the point of the +operation and has not been handled at an inner `on_overflow`. + +## Other non-trivial differences from C + +1. Instead of `errno`, we use multiple return values to indicate errors where + appropriate. +2. Minimize undefined behavior, and set stricter rules for + implementation-defined behavior. +3. Support compile-time code execution. +4. More powerful preprocessor. +5. You should be able to release variables from the scope they are in, and not + only be controllable by code blocks, so stack variables can be released in + differing orders. +6. Strings are not null-terminated by default. +7. There is no special null pointer. +8. No implicit integer promotion. +9. Void pointers of varying depth (such as `void **`) can be implicitly casted + to pointers of the same or deeper depth (such as `void **` -> `int ***`, + but not `void **` -> `int *`). diff --git a/reasoning b/reasoning deleted file mode 100644 index 090090d..0000000 --- a/reasoning +++ /dev/null @@ -1,10 +0,0 @@ -Some reasons for some stuff done here. Not going to be a part of the specification, just ideas for why. - -nofail: - If you can get into a situation where there's no way to free resources you no longer need, you have done something wrong. - If the language doesn't give you a way to do the above, the language has done something wrong. 
- free(), close(), unlocking, and other such should be marked as `nofail`, so that you don't run out of stack space trying to call them, resulting in inability to free resources. - - Good for situations where failing to call a function partway through is deemed (by the programmer) undesirable. - - Also usable for times when you don't want to have to deal with failure. -- cgit v1.2.3