Converting 400 line VBA function to C++

I am trying to convert a 400 line VBA function to C++. I can't work out how to replicate this system in C++. I tried to convert each "gosub xxx" into a "void xxx(void)" so that I get access to the overall function's scoped variables, but get the error "Local functions are illegal". I have done some research and found that local functions are OK in C but not in C++. They talk about lambdas but I can't get my head around them. I would really like some guidance on how to approach this challenge.

In VBA I use blocks of code which are all less than one screen in size so each section of logic is easily readable and understandable. For example this is the main loop:

Do While strState <> strEndParse
    Call getChar
    If strState = "beginToken" Then
        GoSub beginToken
    ElseIf strState = "collectTerm" Then
        GoSub collectTerm
    ElseIf strState = "collectString" Then
        GoSub collectString
    ElseIf strState = "collectLabel" Then
        GoSub collectLabel
    ElseIf strState = "lookOper" Then
        GoSub lookOper
    ElseIf strState = "collectCommand" Then
        GoSub collectCommand
    ElseIf strState = "gotEscape" Then
        GoSub gotEscape
    Else
        strState = strEndParse
    End If
Loop

Then each of the subroutines (beginToken, collectTerm, ... etc) are about one screen's length in code, so are easy to read and understand. All these subroutines above are only ever called in one place, so they could be placed inline, but then I get a 400 line module which is impossible to debug. Some of these subroutines call little utility subroutines. These are called in multiple places and are only relevant to this function.

Thanks in advance
"Local functions are illegal"

Did you write something like this?
int main() {

  void xxx(void) { // ERROR
    // do xxx
  }

  xxx(); // was "GoSub xxx"
}


The error is that you can't define one function inside another function. You have to define it outside:

void xxx() {
  // do xxx
}

int main() {
  xxx(); // was "GoSub xxx"
}



PS. In C++, a function declared with an empty parameter list, void xxx(), really does take no arguments; the void inside the parentheses is not necessary.
Apparently GoSub is not really a subroutine call, but more of an automatically-returning goto. The "Call" statement is a real subroutine call. So one (ugly) way to code it would be:

void YourFunction()
{
    while (strState != strEndParse)
    {
        getChar();
        if (strState == "beginToken")
            goto beginToken;
        else if (strState == "collectTerm")
            goto collectTerm;
        else
            strState = strEndParse;
gosub_return: ;
    }

    ...

    return; // so you don't fall into the following code

beginToken:
    ...
    goto gosub_return;

collectTerm:
    ...
    goto gosub_return;
}

This is some very ugly C++, though!
Just for fun:
#include <iostream>
struct subr_stack { int size = 0; int labels[32] {}; } my_stack ; 
#define GOSUB(subr) do { my_stack.labels[my_stack.size++] = __LINE__; goto subr; case __LINE__:; } while(false)
#define RETURN() do { next_subr = my_stack.labels[--my_stack.size]; goto top; } while (false)
#define BEGIN_PROCEDURE { int next_subr = 0; top: switch (next_subr) { default: 
#define END_PROCEDURE }}

int main()
{
  BEGIN_PROCEDURE
    std::cout << "here";
    GOSUB(foo);
    std::cout << ", and back again\n";
    return 0;
    
    foo:
      std::cout << ", there";
      RETURN();
  END_PROCEDURE
}
That's insane! Very creative use of a switch.
@OP
Translating directly from VB to C++ is the hard way to do this job. It would be much easier, faster, and less likely to end up as a complete C++ mess if you first understand and describe what the program is doing, e.g. via a roughed-out pseudocode plan, and then implement that plan in C++.

Of course that requires an understanding of simple C++ coding, so that means going to a tutorial and learning some C++ fundamentals, e.g. switches, decisions and functions.
DizzyDon wrote:
Apparently GoSub is not really a subroutine call, but more of an automatically-returning goto.


Hi DizzyDon, you write some great answers in your posts, but I was wondering what the difference is between an "automatically-returning goto" and an ordinary function call (the Call statement in VBA). Maybe some state is saved on the stack with the Call statement? What is wrong with making them function calls in C++? It's just that I believe we shouldn't even mention goto to beginners!

@naturtle

Not sure why this is a struggle for you. One does have to declare or define a function before using it, as keskiverto mentioned.
Here is another way to do it, it means main is near the beginning of the file. I personally dislike having to go looking for the main function, finding it 380 lines down because all the functions are defined first.

// #include directives go here
void xxx() ; //declaration

int main() {
  xxx(); // was "GoSub xxx"
}

void xxx() { // definition
  // do xxx
}


Here is my abbreviated attempt at translation:

while (strState != strEndParse) {
    getChar();
    if (strState == "beginToken") {
        beginToken();
    }
    else if (strState == "collectTerm") {
        collectTerm();
    }
    //  ......
    else {
        strState = strEndParse;
    }
 
}


Notice that I always put braces, even if there is only one statement. This may save someone one day when they add more code.
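
A small sketch of why the braces matter. The function and variable names are made up for illustration and are not taken from the original code:

```cpp
#include <string>

// Hypothetical helper: counts the digits in text and remembers the last one.
// Suppose the `if` originally guarded only `++count;` and `lastDigit = c;`
// was added later. Without the braces, only `++count;` would belong to the
// `if`, and `lastDigit = c;` would run for every character.
int countDigits(const std::string& text, char& lastDigit) {
    int count = 0;
    for (char c : text) {
        if (c >= '0' && c <= '9') {
            ++count;
            lastDigit = c;   // added later; still guarded thanks to the braces
        }
    }
    return count;
}
```

Calling countDigits("a1b22c", last) leaves the last digit in last only because both statements sit inside the braces.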
What is wrong with making them function calls in C++? It's just that I believe we shouldn't even mention goto to beginners!

The difference is that the gosub subroutines are inside of another function and therefore have access to its variables.
And I agree that goto is to be avoided in general and especially by beginners.
It was just a way to make it work, but as againtry said, it would be best to rewrite the whole thing as C++.
@DizzyDon

Alright, cool, thanks for your reply :+)
DizzyDon wrote:
The difference is that the gosub subroutines are inside of another function and therefore have access to its variables.
naturtle wrote:
They talk about lambdas but I can't get my head around them.

A lambda is an object that can be called like a function, and it can be defined inside another function.
https://en.cppreference.com/w/cpp/language/lambda
A lambda can capture variables from the enclosing function, which gives it access to them.
When you call a lambda you can also pass arguments, just as with any other function you call.
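
A minimal sketch of a capturing lambda, assuming a made-up word-counting function as a stand-in for the big VBA routine (countWords and handleChar are hypothetical names, not from the original code):

```cpp
#include <string>

// Hypothetical stand-in for the large VBA function: counts words in a line.
int countWords(const std::string& line) {
    int words = 0;          // local state shared with the "subroutine"
    bool inWord = false;

    // [&] captures the locals above by reference, so the lambda can read and
    // modify them, much like a GoSub block sees the enclosing function's
    // variables. The char parameter is passed like any function argument.
    auto handleChar = [&](char c) {
        if (c == ' ') {
            inWord = false;
        } else if (!inWord) {
            inWord = true;
            ++words;
        }
    };

    for (char c : line) {
        handleChar(c);      // was "GoSub handleChar"
    }
    return words;
}
```

With [=] instead of [&] the lambda would work on its own copies of words and inWord, and the enclosing function would never see the changes, so by-reference capture is the closer match to GoSub.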

TheIdeasMan wrote:
Here is another way to do it, it means main is near the beginning of the file. I personally dislike having to go looking for the main function, finding it 380 lines down because all the functions are defined first.

Since lambdas have to be defined before (or while) calling them, one would still have the "400-line module" where the DO WHILE LOOP is at the end.
Furthermore, what values they actually capture is, let's say, "versatile". (Read: I haven't got my head around all the details.)

In other words, regular standalone functions that take the necessary data as by-value or by-reference parameters are the way to get those neat blocks of code that are easy to debug too.
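
A rough sketch of that approach, assuming the shared variables can be bundled into one struct. ParseState and the function bodies are invented for illustration; only the name collectTerm echoes the original post:

```cpp
#include <cstddef>
#include <string>

// Hypothetical bundle of the variables the VBA function shared with its GoSubs.
struct ParseState {
    std::string input;        // text being parsed
    std::size_t pos = 0;      // index of the next character to read
    std::string token;        // token currently being collected
};

// Each former GoSub block becomes a function that receives the state by
// reference (const reference when it only needs to read it).
bool atEnd(const ParseState& st) {
    return st.pos >= st.input.size();
}

void skipSpaces(ParseState& st) {
    while (!atEnd(st) && st.input[st.pos] == ' ') {
        ++st.pos;
    }
}

void collectTerm(ParseState& st) {
    st.token.clear();
    while (!atEnd(st) && st.input[st.pos] != ' ') {
        st.token += st.input[st.pos++];
    }
}

// Returns the next space-separated token, or an empty string at end of input.
std::string nextToken(ParseState& st) {
    skipSpaces(st);
    collectTerm(st);
    return st.token;
}
```

Each function fits on a screen and can be stepped through in a debugger on its own, and a forward declaration lets the main loop sit above them in the file.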
Thank you all for your really helpful responses and the discussion between you.

I have decided to go with the @DizzyDon method in spite of it being ugly as all hell and in spite of my dislike of "goto". When I reworked my code I found that all the "gosub xxx" were able to be converted into "goto xxx" with a "goto gosub_return" at the end of each code block.

@mbozzi - your post absolutely blew me away. What a creative solution. I don't need it for this module, but I will put it on file in case I need it elsewhere.

@TheIDeasMan - I ended up with this as my main loop, but I have taken on your comment about always putting {} around a single statement and have done that in all other places.
        if (strState == "beginToken") goto beginToken;
        else if (strState == "collectTerm") goto collectTerm;
        else if (strState == "collectString") goto collectString;
        else if (strState == "collectLabel") goto collectLabel;
        else if (strState == "lookOper") goto lookOper;
        else if (strState == "collectCommand") goto collectCommand;
        else if (strState == "gotEscape") goto gotEscape;
        else {
            handleError(strModule, strState, "Unknown state");  // tec019
            strState = strEndParse;
        }
gosub_Return: ;  // a label must be followed by a statement, even an empty one


Interesting - my module is now 334 lines long in C++ compared to 385 lines in VBA.
@TheIDeasMan - I ended up with this as my main loop, ....


But now you have goto, which is explicitly what we don't want you to have. Again, see againtry's comment about not converting from VBA directly, but rather understanding the problem and designing the C++ from the start.

The problem of functions needing access to shared variables can be solved by putting the variables and functions into a class, because member functions have direct access to member variables. But don't fall into the trap of making a God class that does everything. Beginner coders often think that OOP is easy (classes with variables and functions), but it can actually be a little tricky to get right.

Another thing to mention is the Finite State Machine; you could research some examples to look at. Another thing to research is Design Patterns.
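
To make the class and state machine ideas concrete, here is a hedged sketch, not a translation of the OP's parser. Tokenizer, its states and its splitting rule (runs of letters versus runs of digits) are all assumptions chosen to keep the example short:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical tokenizer: splits input into runs of letters and runs of digits.
class Tokenizer {
public:
    std::vector<std::string> run(const std::string& input) {
        tokens_.clear();
        current_.clear();
        state_ = State::BeginToken;
        for (char c : input) {
            switch (state_) {   // one handler per state, like the GoSub chain
                case State::BeginToken:    beginToken(c);    break;
                case State::CollectWord:   collectWord(c);   break;
                case State::CollectNumber: collectNumber(c); break;
            }
        }
        flush();                // emit whatever was still being collected
        return tokens_;
    }

private:
    enum class State { BeginToken, CollectWord, CollectNumber };

    static bool isLetter(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
    static bool isDigit(char c)  { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

    void beginToken(char c) {
        if (isLetter(c))     { current_ = c; state_ = State::CollectWord; }
        else if (isDigit(c)) { current_ = c; state_ = State::CollectNumber; }
        // anything else is treated as a separator and skipped
    }

    void collectWord(char c) {
        if (isLetter(c)) { current_ += c; }
        else             { flush(); beginToken(c); }
    }

    void collectNumber(char c) {
        if (isDigit(c)) { current_ += c; }
        else            { flush(); beginToken(c); }
    }

    // Stores the finished token (if any) and returns to the start state.
    void flush() {
        if (!current_.empty()) { tokens_.push_back(current_); current_.clear(); }
        state_ = State::BeginToken;
    }

    // Shared state: every member function can use these without parameters.
    State state_ = State::BeginToken;
    std::string current_;
    std::vector<std::string> tokens_;
};
```

For example, Tokenizer().run("abc123 x9") yields the four tokens abc, 123, x and 9. The state is an enum rather than a string, so dispatch is a switch instead of a chain of string comparisons.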

.... but I have taken on your comment about always putting {} around a single statement and have done that in all other places.


Except that you didn't do it for the snippet you posted.

Note that mbozzi's code was just for fun, it demonstrates how C++ can be turned into code that looks like another language. Do not do this. You won't get any credit for it, whether from a teacher or a colleague or a code maintainer. It could be a fire-able offence at work, at least an interview terminator.
A single routine which shares state with sub-routines can be modeled as this:
#include <iostream>

void do_thing()
{
  class 
  { 
    // any state which should be accessible by subroutines goes here:
    int shared_state; 

    // "nested functions" go here
    void first_subroutine() { shared_state = 42; }
    void second_subroutine() { shared_state = 123; }

  public:
    void go() // enclosing function goes here
    {
      first_subroutine(); std::cout << shared_state << '\n';
      second_subroutine(); std::cout << shared_state << '\n';
    }
  } implementation;
  
  implementation.go(); 
}

int main() { do_thing(); }


Note that the C++ specification uses classes to define lambda expressions. But lambda expressions came first (not in the C++ spec, but in the history of computing). They were fundamental even in Church and Turing's time, while classes came later.

You could also use lambdas to the same effect:
#include <iostream>
#include <string>

int main()
{
  double x = 0;
  std::cout << x << '\n';
  
  // In the lambda below, in order:
  //   my_subroutine is the name of the nested function,
  //   (double first_parameter) is its parameter list,
  //   std::string (after the ->) is its return type.
  // For now the rest of the hieroglyphs can be ignored.
  auto my_subroutine = [&](double first_parameter) 
    mutable -> std::string
  { // here's the function's body
    x = first_parameter;
    return "my string";
  };
  
  std::string s = my_subroutine(42.0);
  std::cout << s << '\n';
  std::cout << x << '\n';
}


From a C++ perspective my_subroutine is an object with class type. Because this class has a member function with the special name operator(), objects with that type can appear on the left of an argument list and be called just like a normal function, as in
my_subroutine(42.0)
In this way it is very similar to the first snippet.

Now I feel bad for even mentioning the goto "solution".
If the OP would share his original VBA code, it could probably be rewritten to not only be proper C++ but to be faster, too. For instance the fixed strings could probably be enums, which would compare much faster.
Note that mbozzi's code was just for fun, it demonstrates how C++ can be turned into code that looks like another language. Do not do this. You won't get any credit for it, whether from a teacher or a colleague or a code maintainer. It could be a fire-able offence at work, at least an interview terminator.


Spoil-sport!

I seem to remember reading somewhere that one of the main authors for a major unix program (a shell??) used something similar to emulate a Basic language in C and then wrote the code using that. Has anyone any other info on this?
Bourne abused macros to make the original sh program look like ALGOL-68.
https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh

LOCAL REGPTR	syncase(esym)
	REG INT	esym;
{
	skipnl();
	IF wdval==esym
	THEN	return(0);
	ELSE	REG REGPTR	r=getstak(REGTYPE);
		r->regptr=0;
		LOOP wdarg->argnxt=r->regptr;
		     r->regptr=wdarg;
		     IF wdval ORF ( word()!=')' ANDF wdval!='|' )
		     THEN synbad();
		     FI
		     IF wdval=='|'
		     THEN word();
		     ELSE break;
		     FI
		POOL
		r->regcom=cmd(0,NLFLG|MTFLG);
		IF wdval==ECSYM
		THEN	r->regnxt=syncase(esym);
		ELSE	chksym(esym);
			r->regnxt=0;
		FI
		return(r);
	FI
}

@DizzyDon

That's gold ! So that's where the fi and esac came from :+)

Edit:

I managed to teach myself C before going to university, where I had to do FORTRAN77 followed by Pascal. The BEGIN and END gave me the shits :+D. I do not want to dis these languages, they have their merits :+)
seeplus wrote:
Spoilsport!


Yep, reading it as I see it. Imagine you are a C++ tutor or lecturer, and all the assignments look like VBA or Python !

I guess if one wants to make a compiler for a new language they could use LLVM for that - much more work though !

https://tomassetti.me/a-tutorial-on-how-to-write-a-compiler-using-llvm/
Bourne abused macros to make the original sh program look like ALGOL-68.


Yep - that's what I was thinking about. :) :)

I loved Pascal when I used it in the 70's, 80's and 90's! I also liked Fortran 77 - but never got to use it in real production code. I hated Cobol and disliked Algol 68 - which has now sunk without trace. Yes!
The BEGIN and END gave me the shits


These were introduced in Algol-60 as the reserved words begin and end (or on some systems begin end). Algol-60 also introduced the ubiquitous ;

Yes, Algol-68 introduced fi, elif, od and esac as clause terminators, not least to solve the 'dangling else' problem with if statements.