Eudora/compiling

from HTYP, the free directory anyone can edit if they can prove to me that they're not a spambot

As of 2021-09-10, obstacles to compiling the Eudora source to work under Linux include:

1. It depends on MFC, which is proprietary and expensive (can be illegally obtained, but for dev/test purposes only)
1a. MFC needs to be recompiled for Linux, which requires finding some way to emulate Microsoft-specific language extensions (__try and __except).
2. It depends on the Stingray Objective Toolkit, which is so expensive they don't even quote a price on the web site, and which is obscure enough that it has so far proven impossible to pirate. (What is needed are the headers and the DLL; the headers are included as part of the Eudora source, and the DLL is probably available as part of the Eudora end-user install, but we haven't been able to find it yet.)

notes from Kara

The following was scraped off Kara's Discord monologuing, 2021/09/10-12, in case it's useful to anyone:

I'd forgotten that thread_local is a thing. I think it probably is possible with macros after all

ideally all the modification needed to the code to make the __try construct work is to add an extra #include to each file with __try in it. Which can be easily automated

How big should the stack be? As in, how many try blocks deep should the program be able to enter into before overflowing the stack? dynamically resizing it would be inefficient both in terms of execution performance and of amount of code needed
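One way to read the fixed-size option above is a per-thread array of jump buffers with an explicit overflow check. The following is only a sketch under that assumption: MAX_TRY_DEPTH, try_stack_push, try_stack_pop and try_stack_depth are hypothetical names, and 256 is simply the value the draft code further down uses.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { MAX_TRY_DEPTH = 256 };  /* assumed limit, same value as the draft below */

/* One fixed-size stack of jump buffers per thread, so nothing is resized. */
static _Thread_local jmp_buf try_frames[MAX_TRY_DEPTH];
static _Thread_local int try_depth;

/* Reserves the next frame. Returns NULL instead of overflowing when
   MAX_TRY_DEPTH try blocks are already active on this thread. */
jmp_buf *try_stack_push(void) {
    if (try_depth == MAX_TRY_DEPTH) {
        return NULL;
    }
    return &try_frames[try_depth++];
}

/* Releases the innermost frame when its try block is left. */
void try_stack_pop(void) {
    assert(try_depth > 0);
    try_depth--;
}

/* Number of try blocks currently active on this thread. */
int try_stack_depth(void) {
    return try_depth;
}
```

With a check like this, nesting too deep becomes a reportable error rather than memory corruption, which may matter more than the exact limit chosen.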

no, I once again don't think it can be implemented by macros, because the signal handler needs to know the value of the except expression in order to determine where to send control flow to, but the except statement is syntactically separate from the __try, which is where the signal handler needs to be defined. So it looks like AST transformation is the only option

unless you can longjmp back into the signal handler after it returns? No, wait, that would also not work because you can only longjmp to places still on the stack. So I need something that can parse C and C++, transform the AST, and then either convert back to C/C++ or send further to be compiled
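The stack constraint mentioned here can be shown with a minimal sketch. All names in it (recovery_point, fail_deep_inside, run_guarded) are made up for illustration. The jump is valid only because run_guarded, the function that called setjmp, has not returned yet when longjmp runs.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf recovery_point;  /* hypothetical name */
static int failure_code;        /* value handed from the failure site to the recovery site */

/* Jumps back to recovery_point. Valid only while the function that
   called setjmp on it (run_guarded) is still on the call stack. */
static void fail_deep_inside(int code) {
    assert(code != 0);
    failure_code = code;
    longjmp(recovery_point, 1);
}

/* Returns 0 if the guarded work finished, else the failure code. */
int run_guarded(int should_fail) {
    if (setjmp(recovery_point) != 0) {
        /* Reached a second time, via longjmp from fail_deep_inside. */
        return failure_code;
    }
    if (should_fail != 0) {
        fail_deep_inside(should_fail);
    }
    return 0;
}
```

Once run_guarded has returned, recovery_point refers to a dead stack frame and jumping to it is undefined behaviour, which is why a handler that has already returned cannot be re-entered this way.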

I think the best approach would be to write a gcc plugin that transforms the code as part of the compilation timeline

Making progress. Now I need to figure out the correspondence between the POSIX signal codes in this table (https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html) and the Microsoft exception codes in this table (https://docs.microsoft.com/en-us/windows/win32/Debug/getexceptioncode)


I think I figured out the exception handling emulation, but this is going to be a nightmare to test

This is what I have so far. It probably has a ton of bugs, and it's not even compilable in its current state, but it's a start

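#define _POSIX_C_SOURCE 200809L  // for sigaction and siginfo_t
#include <setjmp.h>
#include <signal.h>
#include <threads.h>

// Filter results, using the same values as MSVC
#define EXCEPTION_EXECUTE_HANDLER 1
#define EXCEPTION_CONTINUE_SEARCH 0
#define EXCEPTION_CONTINUE_EXECUTION (-1)

enum { max_jmp_depth = 256 };  // an enum constant is a valid array size in C
thread_local int jmp_depth;
thread_local jmp_buf jmps[max_jmp_depth];
thread_local int (*exprs[max_jmp_depth])(int);  // filters return an EXCEPTION_* value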

int sig_code_to_ext_code(int sig_code) {
    // TODO: translate the POSIX signal code into a Microsoft exception code
    return 0;
}

void except_action(int signum, siginfo_t *info, void *context) {
    int exc_mode = exprs[jmp_depth - 1](sig_code_to_ext_code(info->si_code));
    if (exc_mode == EXCEPTION_CONTINUE_EXECUTION) {
        return;
    }
    longjmp(jmps[--jmp_depth], exc_mode == EXCEPTION_CONTINUE_SEARCH ? 1 : 2);
}

void finally_action(int signum, siginfo_t *info, void *context) {
    longjmp(jmps[--jmp_depth], 1);
}

// Note: handlers probably need to be repeated for
// SIGINT, SIGTERM, SIGILL, SIGFPE, and SIGABRT

// Representation of AST transform
__try {block1} __except (except_expr) {block2} ->

{
    struct sigaction old;
    struct sigaction now;
    now.sa_sigaction = except_action;
    // SA_NODEFER: the handler leaves via longjmp, so SIGSEGV must not stay blocked
    now.sa_flags = SA_SIGINFO | SA_NODEFER;
    sigemptyset(&now.sa_mask);
    sigaction(SIGSEGV, &now, &old);
    // TODO define function from except_expr, get function pointer
    exprs[jmp_depth] = fn_pointer;
    int jmp_code = setjmp(jmps[jmp_depth++]);
    if (jmp_code == 0) {block1; jmp_depth--;}
    sigaction(SIGSEGV, &old, NULL);
    if (jmp_code == 1) {
        // filter said "continue search": pass the signal to the outer handler
        raise(SIGSEGV);
    } else if (jmp_code == 2) {
        // filter said "execute handler"; skipped when block1 finished normally
        block2;
    }
}

// Representation of AST transform
__try {block1} __finally {block2} ->
{
    struct sigaction old;
    struct sigaction now;
    now.sa_sigaction = finally_action;
    // SA_NODEFER: the handler leaves via longjmp, so SIGSEGV must not stay blocked
    now.sa_flags = SA_SIGINFO | SA_NODEFER;
    sigemptyset(&now.sa_mask);
    sigaction(SIGSEGV, &now, &old);
    int jmp_code = setjmp(jmps[jmp_depth++]);
    if (jmp_code == 0) {block1; jmp_depth--;}
    sigaction(SIGSEGV, &old, NULL);
    { block2; }
    if (jmp_code == 1) {
        // block1 was interrupted: re-raise for the outer handler
        raise(SIGSEGV);
    }
}

Non-local control flow can be quite confusing to work with, so I'd be extremely surprised if I didn't get anything fundamentally wrong

The official implementation probably uses the call stack directly for the data and thus avoids needing global (to the executing thread) variables. Afaict each signal needs to be handled separately, so the SIGSEGV there is only one of six
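Handling each of the six signals separately does not have to mean six copies of the setup code. A loop over a table of signal numbers is one option, sketched below. The names guarded_signals, install_guarded, restore_guarded and count_signal are hypothetical, and the list is SIGSEGV plus the five signals named in the code comment above.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction and siginfo_t */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* SIGSEGV plus the five signals listed in the draft's comment. */
static const int guarded_signals[] = {
    SIGSEGV, SIGINT, SIGTERM, SIGILL, SIGFPE, SIGABRT
};
enum { GUARDED_COUNT = sizeof guarded_signals / sizeof guarded_signals[0] };

/* Hypothetical handler, used only to show that installation worked. */
volatile sig_atomic_t signals_seen;
void count_signal(int signum, siginfo_t *info, void *context) {
    (void)signum; (void)info; (void)context;
    signals_seen++;
}

/* Installs `handler` for every guarded signal and saves the previous
   dispositions in saved[0..GUARDED_COUNT-1]. Returns 0 on success. */
int install_guarded(void (*handler)(int, siginfo_t *, void *),
                    struct sigaction saved[]) {
    assert(handler != NULL && saved != NULL);
    struct sigaction now = {0};
    now.sa_sigaction = handler;
    now.sa_flags = SA_SIGINFO;
    sigemptyset(&now.sa_mask);
    for (int i = 0; i < GUARDED_COUNT; i++) {
        if (sigaction(guarded_signals[i], &now, &saved[i]) != 0) {
            return -1;
        }
    }
    return 0;
}

/* Puts back the dispositions saved by install_guarded. */
int restore_guarded(const struct sigaction saved[]) {
    for (int i = 0; i < GUARDED_COUNT; i++) {
        if (sigaction(guarded_signals[i], &saved[i], NULL) != 0) {
            return -1;
        }
    }
    return 0;
}
```

In the transform, the `old` variable would then become an array of GUARDED_COUNT entries, with install_guarded and restore_guarded standing in for the two sigaction calls.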

ugh, this mismatch between POSIX signal codes and MS exception codes is annoying. Some of them match up, but then there's also 5 POSIX codes that I couldn't find a clear MS code for, and 11 MS codes that I couldn't find a clear POSIX code for

since the conditions they each interrupt in response to are only half the same
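For the codes that do line up, a partial mapping could look like the sketch below. This is a guess at a reasonable pairing, not a table taken from Kara's notes. The function name signal_to_exception_code is hypothetical, and it takes the signal number as well as si_code, because si_code values are only meaningful per signal. The numeric values are the ones the Windows SDK assigns to these exception codes. Anything without a clear counterpart returns 0.

```c
#define _POSIX_C_SOURCE 200809L  /* for the si_code constants */
#include <assert.h>
#include <signal.h>
#include <stdint.h>

/* Exception code values as published in the Windows SDK headers. */
#define EXCEPTION_DATATYPE_MISALIGNMENT  0x80000002u
#define EXCEPTION_ACCESS_VIOLATION       0xC0000005u
#define EXCEPTION_IN_PAGE_ERROR          0xC0000006u
#define EXCEPTION_ILLEGAL_INSTRUCTION    0xC000001Du
#define EXCEPTION_FLT_DIVIDE_BY_ZERO     0xC000008Eu
#define EXCEPTION_FLT_INEXACT_RESULT     0xC000008Fu
#define EXCEPTION_FLT_INVALID_OPERATION  0xC0000090u
#define EXCEPTION_FLT_OVERFLOW           0xC0000091u
#define EXCEPTION_FLT_UNDERFLOW          0xC0000093u
#define EXCEPTION_INT_DIVIDE_BY_ZERO     0xC0000094u
#define EXCEPTION_INT_OVERFLOW           0xC0000095u
#define EXCEPTION_PRIV_INSTRUCTION       0xC0000096u

/* Returns the closest Windows exception code for a signal and its
   si_code, or 0 when there is no clear counterpart (an assumption of
   this sketch, e.g. for SIGINT, SIGTERM and SIGABRT). */
uint32_t signal_to_exception_code(int signum, int si_code) {
    assert(signum > 0);  /* signal numbers are positive */
    switch (signum) {
    case SIGSEGV:
        return EXCEPTION_ACCESS_VIOLATION;
    case SIGBUS:
        return si_code == BUS_ADRALN ? EXCEPTION_DATATYPE_MISALIGNMENT
                                     : EXCEPTION_IN_PAGE_ERROR;
    case SIGILL:
        return si_code == ILL_PRVOPC ? EXCEPTION_PRIV_INSTRUCTION
                                     : EXCEPTION_ILLEGAL_INSTRUCTION;
    case SIGFPE:
        switch (si_code) {
        case FPE_INTDIV: return EXCEPTION_INT_DIVIDE_BY_ZERO;
        case FPE_INTOVF: return EXCEPTION_INT_OVERFLOW;
        case FPE_FLTDIV: return EXCEPTION_FLT_DIVIDE_BY_ZERO;
        case FPE_FLTOVF: return EXCEPTION_FLT_OVERFLOW;
        case FPE_FLTUND: return EXCEPTION_FLT_UNDERFLOW;
        case FPE_FLTRES: return EXCEPTION_FLT_INEXACT_RESULT;
        case FPE_FLTINV: return EXCEPTION_FLT_INVALID_OPERATION;
        default:         return 0;
        }
    default:
        return 0;
    }
}
```

The SIGBUS fallback to EXCEPTION_IN_PAGE_ERROR is the weakest pairing here. Codes such as EXCEPTION_STACK_OVERFLOW have no dedicated signal at all, since a stack overflow normally arrives as a plain SIGSEGV.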

i just realized that it's almost impossible to search for information about actual stack overflows

Where do I go from here?

Maybe I really don't need the AST. But I would have to scan the text. What's the best way to look for a pattern in the source code but not in a literal string if it happens to appear there too?
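One answer to the question above is a small hand-written scanner that tracks just enough lexical state to skip comments, string literals and character literals, and only compares whole identifiers. The sketch below is an assumption about how that could look, with a hypothetical function name find_code_words. It deliberately ignores raw string literals, backslash line continuations, trigraphs and C++14 digit separators, so it is not a full tokenizer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Finds every occurrence of `word` in `src` that is a whole identifier
   outside comments, string literals and character literals. Stores up
   to max_offsets byte offsets in `offsets` and returns the total count. */
size_t find_code_words(const char *src, const char *word,
                       size_t *offsets, size_t max_offsets) {
    assert(src != NULL && word != NULL && word[0] != '\0');
    size_t wlen = strlen(word);
    size_t found = 0;
    size_t i = 0;
    while (src[i] != '\0') {
        char c = src[i];
        if (c == '/' && src[i + 1] == '/') {            /* line comment */
            while (src[i] != '\0' && src[i] != '\n') i++;
        } else if (c == '/' && src[i + 1] == '*') {     /* block comment */
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/')) i++;
            if (src[i] != '\0') i += 2;
        } else if (c == '"' || c == '\'') {             /* string or char literal */
            i++;
            while (src[i] != '\0' && src[i] != c) {
                if (src[i] == '\\' && src[i + 1] != '\0') i++;  /* escaped char */
                i++;
            }
            if (src[i] != '\0') i++;                    /* closing quote */
        } else if (is_ident_char(c)) {                  /* identifier or number */
            size_t start = i;
            while (is_ident_char(src[i])) i++;
            if (i - start == wlen && strncmp(src + start, word, wlen) == 0) {
                if (found < max_offsets) offsets[found] = start;
                found++;
            }
        } else {
            i++;
        }
    }
    return found;
}
```

Calling it with "__try" on a file's contents would give the offsets where a rewrite needs to start. Matching the braces of the following block would need the same skipping logic.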

I wish C had anonymous functions

Would make this a lot easier since i wouldn't have to algorithmically make up an identifier for the function and find somewhere in the top-level scope to put its definition

Okay, I think I figured out a way to make that part work by using a non-standard gcc extension... which is still nonstandard, but at least it's open source and runs on Linux. But I still need to write the source transform thingy without messing with the contents of string literals
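The notes do not say which gcc extension is meant. Nested functions are one candidate, since they let the filter be defined exactly where the __except expression appeared, so the sketch below assumes that. It compiles with GCC only. The names call_filter, guarded_region and filter are hypothetical, and call_filter merely stands in for the signal handler.

```c
#include <assert.h>
#include <stddef.h>

#define EXCEPTION_EXECUTE_HANDLER 1
#define EXCEPTION_CONTINUE_SEARCH 0

/* Hypothetical stand-in for the signal handler, which would call the
   registered filter with the translated exception code. */
static int call_filter(int (*filter)(unsigned), unsigned code) {
    assert(filter != NULL);
    return filter(code);
}

/* Stands for a function whose body originally contained
   __try { ... } __except (code == 0xC0000005 ? 1 : 0) { ... } */
int guarded_region(unsigned code) {
    /* GCC extension: a function defined inside another function. The
       name is local to this block, so it cannot clash with anything
       at file scope and no top-level definition has to be generated. */
    int filter(unsigned exc_code) {
        return exc_code == 0xC0000005u ? EXCEPTION_EXECUTE_HANDLER
                                       : EXCEPTION_CONTINUE_SEARCH;
    }
    return call_filter(filter, code);
}
```

A nested function may also read the enclosing function's local variables, which real __except filters often do. In that case GCC builds a trampoline on the stack when the function's address is taken, which requires an executable stack and may be a problem on hardened systems.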

I wonder if there's a way to have the C preprocessor output the tokenized form

Like, I think I've figured a fair bit of it out, but I don't know the best way to do the actual source code transformation


The stuck point is needing to work around the MSVC-specific language extension to compile on gcc

If I had something to tokenize a C/C++ file the same way as the preprocessor does internally, then it wouldn't be too hard