Get sizeof(unsigned char *)

I have 2 functions that should send raw data to a socket connection. I know that writing to a socket is done with `write(sfd, buf, buf_len)`.

The first function converts a string to an unsigned char *.

   void dbSend(string input) {
      cout << "String :" << input << endl;
      unsigned char * cData = new unsigned char [input.length() +1];
      strncpy((char *)cData, input.c_str(), input.length() +1);
      dbSend(cData);
   }


The second function will be used to send the unsigned char data to the socket, but for now it should only display the size of the char array:

   void dbSend(unsigned char * raw) {
      int buf_len = sizeof(raw);
      cout << "buf_len: " << buf_len << endl;
      cout << "buf:     " << raw << endl;
   }


The following function calls:
dbSend(user); dbSend("stringTest"); dbSend(md5);

result in:
String :Tester
buf_len: 8
buf:     Tester
String :stringTest
buf_len: 8
buf:     stringTest
buf_len: 8
buf:     i�ز���K�V��E�&�v(�gI��?�L��Y


For buf_len, I would expect values to be 7, 11 and 31.

How can I get the size of the array?
If the char array is a proper C string, having a null terminator, then using the C library's strlen function should give you the proper string size.

Using sizeof is likely returning the size of the pointer, not the number of characters in the char array.
strlen() doesn't compile with type unsigned char* - just char*. If it's really necessary to have the function with an unsigned char* param (which should probably be const unsigned char*) then you'll need to cast to const char* for the strlen() argument. You can do something like:

#include <string>
#include <iostream>
#include <cstring>

using namespace std::string_literals;

void dbSend(const unsigned char* raw) {
	const auto buf_len {std::strlen(reinterpret_cast<const char*>(raw))};

	std::cout << "buf_len: " << buf_len << '\n';
	std::cout << "buf:     " << raw << '\n';
}

void dbSend(const std::string& input) {
	dbSend(reinterpret_cast<const unsigned char*>(input.c_str()));
}

int main() {
	dbSend("string test"s);
}


Note that if you pass a const char* as an argument to these functions, it will match the std::string overload (as std::string has a constructor that takes a const char*) and not the unsigned char* version.
Just after posting my question, I found this explanation on the difference between sizeof() and strlen():
Evaluation size: sizeof() is a compile-time expression giving you
the size of a type or a variable’s type. It doesn’t care about
the value of the variable. Strlen on the other hand, gives you
the length of a C-style NULL-terminated string.


And while trying to implement strlen() I already struggled with the unsigned char ;-(

I ended up with this:
    void dbSend(unsigned char * raw) {
      int buf_len = strlen((const char *) raw) + 1;
      cout << "buf_len: " << buf_len << " buf: " << raw << endl;
    }
    void dbSend(string input) {
      unsigned char * cData = new unsigned char [input.length() +1];
      strncpy((char *)cData, input.c_str(), input.length() +1);
      dbSend(cData);
    }


Your code is smaller but before I start using it, I will first look for the meaning of reinterpret_cast.

Thanx
strlen() doesn't compile with type unsigned char*

As I said "If the char array is a proper C string."

An unsigned char array is not a proper C string; as you point out doing a cast can convert it to one.
@Bengbers - as your code uses new, you need to also use delete[] to free the allocated memory - otherwise there is a memory leak. reinterpret_cast<> is the C++ way of doing pointer casting. Your code uses (const char*) which is the C cast way. C++ also has static_cast<>, const_cast<> and dynamic_cast<>

reinterpret_cast is very, very close to C style casts of (type) (target) style.
the other casts from C++ are significantly different from C style casts more often than not.

sizeof reports the size of the object you apply it to. If that is a pointer, it gives the size of the pointer and has NO knowledge of the data being pointed to. If the sizeof target is a complex type, like a class or struct, that *contains* a pointer, the same idea carries over: you get the size of the pointer plus the size of the other members, again ignoring anything pointed to by one or more pointers. This seems to astonish beginners with std::string and vector, both of which are small, fixed-size objects because their data usually lives behind pointers.

Allocating memory with new and releasing with delete is expensive in a tight loop. If you are sending and fetching nonstop, consider keeping the memory around forever, instead of new/delete pairs in the inner frequently executing code.

This may be low enough level that C style strings are justified, but it's not necessary. C++ strings can interface with networking and C interfaces if you want; the gap is bridged very cleanly.
@seeplus

Using Eclipse, I created a new C/C++ project, replaced "Hello World" with your program and build the project.
It took less than 1 minute to run the program.

But when I try to copy these lines:
#include <string>
#include <iostream>
#include <cstring>

using namespace std::string_literals;

to my code, I get an error, saying that string_literals is not a namespace.

Why is this line not accepted in my BasexSocket.h? (And where can I find more information on using string_literals)?

Ben
The web has info on them.
It's doing things like "hello world"s, which makes the quoted text a C++ string, so you can access its members like .length() and so on. There are 4 or 5 suffixes in play.

Does it work without the namespace? Could be a compiler quirk. It should be on any compiler now; it's C++11, I think?
Without the namespace, compiling gives this result:
Building in: /home/bengbers/Thuis/Werkbank/std2/build/default
make -f ../../Makefile
clang++ -c -O2 -o std2.o /home/bengbers/Thuis/Werkbank/std2/std2.cpp
/home/bengbers/Thuis/Werkbank/std2/std2.cpp:19:23: error: no matching literal operator for call to 'operator""s' with arguments of types 'const char *' and 'unsigned long', and no matching literal operator template
dbSend("string test2"s);
^
1 error generated.
make: *** [../../Makefile:22: std2.o] Fout 1
Build complete (2 errors, 0 warnings): /home/bengbers/Thuis/Werkbank/std2/build/default


I have included /usr/include/c++/11 in the path
Using string literals (operator""s) requires C++14 or later to work.
https://en.cppreference.com/w/cpp/string/basic_string/operator%22%22s
C++17 introduced string literals for string_view (operator""sv).
https://en.cppreference.com/w/cpp/string/basic_string_view/operator%22%22sv
@George
After selecting dialect c++14, the code compiles and executes fine (but the editor still warns of an error: Symbol 'string_literals' could not be resolved).
Always specify the latest standard, c++20 is the current standard. If you don't have it, then update.

If your school/organisation doesn't have it, then that is sad.
TheIdeasMan wrote:
Always specify the latest standard, c++20 is the current standard.

In Visual Studio, large chunks of the C++20 additions, such as std::ranges or std::format, only work with /std:c++latest. Using /std:c++20 results in lots of "can't find" compile errors.

@Bengbers, what is your compiler? You may be using one that is out-dated.

Specifically what version/revision? Just saying Clang or GCC or whatever doesn't give an indication of whether it is C++14/17/20 compliant.
@George

Yes, whatever one has to do to get the current/latest to work on one's machine.

@Bengbers

From the shell, what does clang++ -v show?

Somewhere in the make file, one should be able to set the c++ std with -std=c++20 , but it depends on version of the compiler you have. Earlier versions won't have the later standards.

Also compile with at least -Wall and -Wextra

How much code do you have? Can you compile the whole thing from the shell? Then you can easily specify whatever compilation options you want. But it is worth figuring out how to do that in the build system, such as make or cmake et al.
The C-style cast expression resolves to a reinterpret_cast only if the requirements of none of the other compile-time cast operators would be satisfied.

C-style cast expression
When the C-style cast expression is encountered, the compiler attempts to interpret it as the following cast expressions, in this order:
a) const_cast<new-type>(expression);
b) static_cast<new-type>(expression), with extensions....
c) static_cast (with extensions) followed by const_cast;
d) reinterpret_cast<new-type>(expression);
e) reinterpret_cast followed by const_cast.
The first choice that satisfies the requirements of the respective cast operator is selected...
https://en.cppreference.com/w/cpp/language/explicit_cast
@TheIdeasMan

> clang++ -v

clang version 13.0.0 (Fedora 13.0.0-3.fc35)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/11
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/11
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64

In Eclipse I have now selected dialect ISO C++20 (-std=c++2a)
I have selected Linux GCC toolchain
@Bengbers

Ok cool, that is a recent compiler, clang++ 13 using g++ 11 should cope with c++20, so you should be able to get the above examples to work. With gcc toolchain one should be able to do -std=c++20 rather than -std=c++2a . The latter is the experimental version before it was released officially.

but the editor still warns of an error: Symbol 'string_literals' could not be resolved


I have had problems with the Eclipse code analysis not being up to date with the compiler standard; sometimes the plugin versions are not quite right. It was one of the issues which turned me away from Eclipse. I use Visual Studio Code now on my Fedora (Kinoite) 35, and it works rather well using cmake.


By the way, clang++ 14.0 is out now, but you may not see it in the software updates until Fedora 36 arrives in a couple of weeks' time (it's in beta testing at the moment). I built clang++ 15.0 from the latest source code yesterday; it took a couple of hours. I also built gcc version 12.0.1 20220410 (experimental). The new official version of gcc 12 is due out very soon as well. Mainly these latest versions give further support for c++23, with a few extra pieces of c++20.

I hope this helps :+)
Bengbers wrote:
After selecting dialect c++14, the code compiles and executes fine (but the editor still warns of an error: Symbol 'string_literals' could not be resolved).

It sounds like something similar to my experience with Visual Studio's Intellisense. It sometimes bugs out and reports as errors/warnings code that VSC++ happily compiles without complaint.

The Intellisense issues are really numerous with C++20 support, it is at this time experimental. After a few repeated warnings/errors that aren't I note the whinging and then ignore them if there are no compile-time problems.

As TheIdeasMan mentioned, use -std=c++20 instead of -std=c++2a.