Refactor how builtin AST nodes are created.
Having custom C++ to create built-in symbols introduced a lot of weird
complexity around scopes and source locations. It was also a kludge,
because it meant the compiler could rely on idiosyncrasies that are not
exposed to the language itself (e.g., INVALID_FUNCTION was a type
completely inaccessible to scripts).

It also posed another problem. For historical reasons the compiler still
separates the concept of "AST Node" and "Symbol", and trying to merge
them is proving very difficult to do in small steps. Builtin symbols
made this even harder because they had no corresponding AST nodes.

The new mechanism for builtins is to embed SourcePawn-language scripts
into C++ code as strings, and then inject those strings into the parser.
This is very easy to do thanks to all of our recent refactorings. And,
again thanks to recent refactorings, we even get correct file/line
information if these builtins happen to have an error.
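The idea can be illustrated with a small standalone sketch (hypothetical names; the actual implementation is the `BuiltinGenerator` class added by this commit): builtin declarations are accumulated as ordinary SourcePawn source text in a C++ string, which is then registered as an in-memory file and fed through the normal parser.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Standalone sketch (hypothetical API): builtins are plain SourcePawn text
// accumulated in a buffer, which the compiler later opens as an in-memory
// source file and parses like any other script.
class BuiltinSketch {
  public:
    void AddDefine(const std::string& key, const std::string& value) {
        buffer_ += "#define " + key + " " + value + "\n";
    }
    void AddBuiltinConstants() {
        // Cells are 32-bit, so cellmax/cellmin match INT_MAX/INT_MIN.
        buffer_ += "const int EOS = 0;\n";
        buffer_ += "const int cellmax = " + std::to_string(INT_MAX) + ";\n";
        buffer_ += "const int cellmin = " + std::to_string(INT_MIN) + ";\n";
    }
    const std::string& text() const { return buffer_; }

  private:
    std::string buffer_;
};
```

Because the generated text flows through the regular lexer, any error inside a builtin is reported with real file/line information rather than a synthesized location.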

There are some user-visible changes as a result of this.

1. Command-line macros (e.g., spcomp.exe X=Y) will now create an in-memory
   script, like:

```
     #define X Y
```

  This will improve diagnostics a bit.
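   A hedged sketch of that translation (the helper name is hypothetical, and
   the assumption that a bare `X` with no `=` defaults to `1` is mine, not
   confirmed by this commit):

```cpp
#include <string>

// Hypothetical helper: turn a command-line macro argument ("X=Y", or a
// bare "X", assumed here to default to 1) into a #define line for the
// generated in-memory script.
std::string MacroArgToDefine(const std::string& arg) {
    auto eq = arg.find('=');
    if (eq == std::string::npos)
        return "#define " + arg + " 1\n";
    return "#define " + arg.substr(0, eq) + " " + arg.substr(eq + 1) + "\n";
}
```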

2. Built-in defines and constants are now created via a separate
   in-memory script that looks like this:

```
     #define __BINARY_PATH__ "test.smx"
     #define __BINARY_NAME__ "test.smx"
     #define __DATE__ "10/21/2023"
     #define __TIME__ "23:19:59"
     const int EOS = 0;
     const int cellmax = 2147483647;
     const int cellmin = -2147483648;
```

3. The "using" keyword has been removed.
   The __nullable__ and destructor syntax for handles has been
   re-introduced, and SourceMod will have to bring back the Handle
   methodmap. Because the native alias syntax no longer exists, CloseHandle
   will not suffice, and SourceMod will also have to add a "Handle.~Handle"
   native mapping.
dvander committed Oct 22, 2023
1 parent 0387700 commit 71f30ab
Showing 39 changed files with 267 additions and 252 deletions.
1 change: 1 addition & 0 deletions compiler/AMBuilder
@@ -6,6 +6,7 @@ module = binary.Module(builder, 'compiler')
module.sources += [
'assembler.cpp',
'array-helpers.cpp',
'builtin-generator.cpp',
'compile-context.cpp',
'code-generator.cpp',
'data-queue.cpp',
1 change: 0 additions & 1 deletion compiler/ast-types.h
@@ -31,7 +31,6 @@
FOR_EACH(PstructDecl) \
FOR_EACH(TypedefDecl) \
FOR_EACH(TypesetDecl) \
FOR_EACH(UsingDecl) \
FOR_EACH(IfStmt) \
FOR_EACH(ExprStmt) \
FOR_EACH(ReturnStmt) \
55 changes: 55 additions & 0 deletions compiler/builtin-generator.cpp
@@ -0,0 +1,55 @@
// vim: set ts=8 sts=4 sw=4 tw=99 et:
//
// Copyright (c) 2023 AlliedModders LLC
//
// This software is provided "as-is", without any express or implied warranty.
// In no event will the authors be held liable for any damages arising from
// the use of this software.
//
// Permission is granted to anyone to use this software for any purpose,
// including commercial applications, and to alter it and redistribute it
// freely, subject to the following restrictions:
//
// 1. The origin of this software must not be misrepresented; you must not
// claim that you wrote the original software. If you use this software in
// a product, an acknowledgment in the product documentation would be
// appreciated but is not required.
// 2. Altered source versions must be plainly marked as such, and must not be
// misrepresented as being the original software.
// 3. This notice may not be removed or altered from any source distribution.

#include "builtin-generator.h"

#include "compile-context.h"

namespace sp {

BuiltinGenerator::BuiltinGenerator(CompileContext& cc)
: cc_(cc)
{}

void BuiltinGenerator::AddDefine(const std::string& key, const std::string& value) {
buffer_ += "#define ";
buffer_ += key;
buffer_ += " ";
buffer_ += value;
buffer_ += "\n";
}

void BuiltinGenerator::AddBuiltinConstants() {
buffer_ += "const int EOS = 0;\n";
buffer_ += "const int cellmax = " + std::to_string(INT_MAX) + ";\n";
buffer_ += "const int cellmin = " + std::to_string(INT_MIN) + ";\n";
}

void BuiltinGenerator::AddDefaultInclude() {
if (cc_.default_include().empty())
return;
buffer_ += "#tryinclude <" + cc_.default_include() + ">\n";
}

std::shared_ptr<SourceFile> BuiltinGenerator::Generate(const std::string& name) {
return cc_.sources()->Open(name, std::move(buffer_));
}

} // namespace sp
13 changes: 12 additions & 1 deletion compiler/builtins.h → compiler/builtin-generator.h
@@ -1,5 +1,4 @@
// vim: set ts=8 sts=4 sw=4 tw=99 et:
// Pawn compiler - File input, preprocessing and lexical analysis functions
//
// Copyright (c) 2023 AlliedModders LLC
//
@@ -21,14 +20,26 @@

#pragma once

#include "source-file.h"

namespace sp {

class CompileContext;

class BuiltinGenerator final {
public:
explicit BuiltinGenerator(CompileContext& cc);

void AddDefine(const std::string& key, const std::string& value);

void AddBuiltinConstants();
void AddDefaultInclude();

std::shared_ptr<SourceFile> Generate(const std::string& name);

private:
CompileContext& cc_;
tr::string buffer_;
};

} // namespace sp
1 change: 0 additions & 1 deletion compiler/code-generator.cpp
@@ -219,7 +219,6 @@ CodeGenerator::EmitStmt(Stmt* stmt)
case StmtKind::TypedefDecl:
case StmtKind::TypesetDecl:
case StmtKind::EnumDecl:
case StmtKind::UsingDecl:
case StmtKind::PstructDecl:
case StmtKind::StaticAssertStmt:
case StmtKind::PragmaUnusedStmt:
91 changes: 41 additions & 50 deletions compiler/lexer.cpp
@@ -73,13 +73,18 @@ static constexpr int kLitcharUtf8 = 0x1;
// Do not error, because the characters are being ignored.
static constexpr int kLitcharSkipping = 0x2;

bool Lexer::PlungeQualifiedFile(const std::string& name) {
auto fp = OpenFile(name);
bool Lexer::PlungeQualifiedFile(const token_pos_t& from, const std::string& name) {
auto fp = OpenFile(from, name);
if (!fp)
return false;
if (fp->included())
return true;

PlungeFile(fp);
return true;
}

void Lexer::PlungeFile(std::shared_ptr<SourceFile> fp) {
assert(!IsSkipping());
assert(skiplevel_ == ifstack_.size()); /* these two are always the same when "parsing" */

@@ -91,23 +96,21 @@ bool Lexer::PlungeQualifiedFile(const std::string& name) {
}

auto pos = current_token()->start;

state_.entry_preproc_if_stack_size = ifstack_.size();
PushLexerState();
EnterFile(std::move(fp), pos);
return true;
}

std::shared_ptr<SourceFile> Lexer::OpenFile(const std::string& name) {
std::shared_ptr<SourceFile> Lexer::OpenFile(const token_pos_t& from, const std::string& name) {
AutoCountErrors detect_errors;

if (auto sf = cc_.sources()->Open(name))
if (auto sf = cc_.sources()->Open(from, name))
return sf;

static const std::vector<std::string> extensions = {".inc", ".p", ".pawn"};
for (const auto& extension : extensions) {
auto alt_name = name + extension;
if (auto sf = cc_.sources()->Open(alt_name))
if (auto sf = cc_.sources()->Open(from, alt_name))
return sf;
if (!detect_errors.ok())
return nullptr;
@@ -116,10 +119,11 @@ std::shared_ptr<SourceFile> Lexer::OpenFile(const std::string& name) {
}

bool
Lexer::PlungeFile(const std::string& name, int try_currentpath, int try_includepaths)
Lexer::PlungeFile(const token_pos_t& from, const std::string& name, int try_currentpath,
int try_includepaths)
{
if (try_currentpath) {
if (PlungeQualifiedFile(name))
if (PlungeQualifiedFile(from, name))
return true;

// failed to open the file in the active directory, try to open the file
@@ -129,7 +133,7 @@ Lexer::PlungeFile(const std::string& name, int try_currentpath, int try_includep
auto parent_path = current_path.parent_path();
if (!parent_path.empty()) {
auto new_path = parent_path / name;
if (PlungeQualifiedFile(new_path.string()))
if (PlungeQualifiedFile(from, new_path.string()))
return true;
}
}
@@ -138,7 +142,7 @@ Lexer::PlungeFile(const std::string& name, int try_currentpath, int try_includep
auto& cc = CompileContext::get();
for (const auto& inc_path : cc.options()->include_paths) {
auto path = fs::path(inc_path) / fs::path(name);
if (PlungeQualifiedFile(path.string()))
if (PlungeQualifiedFile(from, path.string()))
return true;
}
}
@@ -156,8 +160,11 @@ std::string StringizePath(const fs::path& in_path) {
return path;
}

void Lexer::SetFileDefines(const std::string& file) {
fs::path path = fs::canonical(file);
void Lexer::SetFileDefines(const std::shared_ptr<SourceFile> file) {
if (file->is_builtin())
return;

fs::path path = fs::canonical(file->name());

auto full_path = StringizePath(path);
auto name = StringizePath(path.filename());
@@ -853,9 +860,16 @@ bool Lexer::MaybeHandleLineContinuation() {
return true;
}

// Handles EOF; this may switch to the next file in the queue.
void Lexer::HandleEof() {
assert(!more());

if (prev_state_.empty() && !file_queue_.empty()) {
auto file = ke::PopFront(&file_queue_);
EnterFile(std::move(file), {});
return;
}

if (prev_state_.empty()) {
freading_ = false;
if (!ifstack_.empty())
@@ -896,7 +910,7 @@ void Lexer::HandleEof() {
assert(skiplevel_ == ifstack_.size());

assert(!IsSkipping()); /* idem ditto */
SetFileDefines(state_.inpf->name());
SetFileDefines(state_.inpf);
}
}

@@ -1112,6 +1126,7 @@ const char* sc_tokens[] = {"*=",
"int64",
"interface",
"intn",
"INVALID_FUNCTION",
"let",
"methodmap",
"namespace",
@@ -1183,7 +1198,8 @@ const char* sc_tokens[] = {"*=",
"-include-path-",
"-end of line-",
"-declaration-",
"-macro"};
"-macro-",
"-maybe-label-"};

Lexer::Lexer(CompileContext& cc)
: cc_(cc)
@@ -1208,8 +1224,16 @@ Lexer::~Lexer() {
}
}

void Lexer::Init(std::shared_ptr<SourceFile> sf) {
void Lexer::AddFile(std::shared_ptr<SourceFile> sf) {
file_queue_.emplace_back(std::move(sf));
}

void Lexer::Init() {
assert(!file_queue_.empty());

freading_ = true;

auto sf = ke::PopFront(&file_queue_);
EnterFile(std::move(sf), {});
}

@@ -2272,39 +2296,6 @@ bool Lexer::DeleteMacro(Atom* atom) {
return true;
}

void
declare_handle_intrinsics()
{
// Must not have an existing Handle methodmap.
auto& cc = CompileContext::get();
Atom* handle_atom = cc.atom("Handle");
if (methodmap_find_by_name(handle_atom)) {
report(156);
return;
}

methodmap_t* map = methodmap_add(cc, nullptr, handle_atom);
map->nullable = true;

declare_methodmap_symbol(cc, map);

auto atom = cc.atom("CloseHandle");
if (auto sym = FindSymbol(cc.globals(), atom)) {
auto dtor = new methodmap_method_t(map);
dtor->target = sym;
dtor->name = cc.atom("~Handle");
map->dtor = dtor;
map->methods.emplace(dtor->name, dtor);

auto close = new methodmap_method_t(map);
close->target = sym;
close->name = cc.atom("Close");
map->methods.emplace(close->name, close);
}

map->is_bound = true;
}

DefaultArg::~DefaultArg()
{
delete array;
@@ -2332,7 +2323,7 @@ void Lexer::EnterFile(std::shared_ptr<SourceFile>&& sf, const token_pos_t& from)
state_.pos = state_.start;
state_.line_start = state_.pos;
SkipUtf8Bom();
SetFileDefines(state_.inpf->name());
SetFileDefines(state_.inpf);

state_.inpf->set_included();

19 changes: 13 additions & 6 deletions compiler/lexer.h
@@ -143,6 +143,7 @@ enum TokenKind {
tINT64,
tINTERFACE,
tINTN,
tINVALID_FUNCTION,
tLET,
tMETHODMAP,
tNAMESPACE,
@@ -220,6 +221,7 @@ enum TokenKind {
tNEWDECL, /* for declloc() */
tENTERED_MACRO, /* internal lexer command */
tMAYBE_LABEL, /* internal lexer command, followed by ':' */
// Make sure to update the token list in lexer.cpp.
tLAST_TOKEN_ID
};

@@ -296,6 +298,9 @@ class Lexer
Lexer(CompileContext& cc);
~Lexer();

void AddFile(std::shared_ptr<SourceFile> sf);
void PlungeFile(std::shared_ptr<SourceFile> sf);

int lex();
int lex_same_line();
bool match_same_line(int tok);
@@ -310,12 +315,12 @@ class Lexer
void lexpush();
void lexclr(int clreol);

void Init(std::shared_ptr<SourceFile> sf);
void Init();
void Start();
bool PlungeFile(const std::string& name, int try_currentpath, int try_includepaths);
std::shared_ptr<SourceFile> OpenFile(const std::string& name);
bool PlungeFile(const token_pos_t& from, const std::string& name, int try_currentpath,
int try_includepaths);
std::shared_ptr<SourceFile> OpenFile(const token_pos_t& from, const std::string& name);
bool NeedSemicolon();
void AddMacro(const char* pattern, const char* subst);
void LexStringContinuation();
void LexDefinedKeyword();
bool HasMacro(Atom* atom);
@@ -372,10 +377,10 @@ class Lexer
void LexStringLiteral(full_token_t* tok, int flags);
void LexSymbol(full_token_t* tok, Atom* atom);
bool MaybeHandleLineContinuation();
bool PlungeQualifiedFile(const std::string& name);
bool PlungeQualifiedFile(const token_pos_t& from, const std::string& name);
full_token_t* PushSynthesizedToken(TokenKind kind, const token_pos_t& pos);
void SynthesizeIncludePathToken();
void SetFileDefines(const std::string& file);
void SetFileDefines(const std::shared_ptr<SourceFile> file);
void EnterFile(std::shared_ptr<SourceFile>&& fp, const token_pos_t& from);
void FillTokenPos(token_pos_t* pos);
void SkipLineWhitespace();
@@ -455,6 +460,7 @@ class Lexer
bool deprecated;
};
std::shared_ptr<MacroEntry> FindMacro(Atom* atom);
void AddMacro(const char* pattern, const char* subst);
bool DeleteMacro(Atom* atom);
bool EnterMacro(std::shared_ptr<MacroEntry> macro);
bool IsInMacro() const { return state_.macro != nullptr; }
@@ -515,6 +521,7 @@ class Lexer
};

LexerState state_;
std::deque<std::shared_ptr<SourceFile>> file_queue_;
tr::vector<LexerState> prev_state_;

// Set if tokens are being lexed into a new token cache.