Counting Unicode characters in an array?

How can I count all the characters in an array that contains Unicode characters? strlen() gives too high a value and counting with isprint() gives too low a one; I can't find the right function. The string in the code has 51 characters, but strlen() gives 57 and isprint() gives 37 :). And yes, this is homework.

char my_String [] {"ù322dA@@kyUMGyy3t5u7UMEy~{yyEC€}{y6(y6(y6(y4ty67"};
Use Unicode.
char is usually an 8-bit type, holding ASCII or single bytes of a longer encoding.
wchar_t is a wide character type that can hold a Unicode character.
See this example:
https://www.cplusplus.com/reference/cwchar/wcslen/

Better still, use C++ strings if you can.
What encoding is this? UTF-8?

https://stackoverflow.com/questions/3586923/counting-unicode-characters-in-c
Jerry Coffin wrote:
In UTF-8, a non-leading byte always has the top two bits set to 10, so just ignore all such bytes.
I don't know the encoding. Our dumb teacher gave us this. There are 4 more strings, and we need to create a 2D array with them. I can't make sense of any of it. Yes, I created the 2D array, but this Unicode thing is blowing my mind. We have only just learned C-style strings, but...
If 51 is the correct answer, then the answers in the link I posted appear to do the correct calculations.
#include <iostream>
int main()
{
	char my_String [] {"ù322dA@@kyUMGyy3t5u7UMEy~{yyEC€}{y6(y6(y6(y4ty67"};
	
	char* p = my_String;
	
	// https://stackoverflow.com/a/3586973
	int count = 0;
	while (*p != 0)
	{
		if ((*p & 0xc0) != 0x80)
			++count;
		++p;
	}
	
	std::cout << "count = " << count << '\n';
}

count = 51

(note to others: the forum isn't rendering the three characters at the beginning, at least not for me)
The three characters at the very beginning of the string that do not display (because they are control characters) are:

c2 8f   U+008F: SINGLE SHIFT THREE
c2 8d   U+008D: REVERSE LINE FEED
c2 8d   U+008D: REVERSE LINE FEED

That seems kind of strange.
And the string in general is very weird.
The assignment makes no sense.
The assignment really doesn't make any sense. Thank you very much to everyone who tried to help. I am exhausted, though, so I will mark this as solved.
@Ganado thanks, but I'm using a 2D array and I don't think this works with it. Thanks for your effort.
If I am not mistaken, the first 3 characters didn't print correctly on the forum, which makes it impossible to fix here.
We would need the hex values or something similar to see what they really are. The box with a ? is the font's symbol for an unknown character...
if I am not mistaken

Obviously you are mistaken.
Do you not read other people's posts?
Why the fury? :)
@jonnin I think they're just trash. I will not include them.
No, I don't always read all the replies. My bad on that.
Assuming they are trash may or may not make sense. But anything at this point appears to be guessing, one as good or bad as another.
Did you try the wide string length function?
I'm using a 2D array, so there are 4 more of these. I tried to use it but got a lot of errors.

char dizim [][60] 
    {
        {"ù322dA@@kyUMGyy3t5u7UMEy~{yyEC€}{y6(y6(y6(y4ty67"},           
        {"6(y6(y6(y6(y6(y6(y6(3y6(3y6(3yEk€}|y1#+y15dff67fgg3ddd)=?"}, 
        {"#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#&&%(JK33Y)("},
        {"+yrmjyJIHkGFFk|zxy.y-y-njfy5'y0\"y,y+37y+33cuy[}s43"},
        {"7y+37y+37yy+37y+y+ykgc#+y1#+y6(6(3yE37y-65/gt46&"}
    };
Show more of your attempt and explain what exactly is wrong, otherwise we are just guessing.
Apparently "dizim" means "syntax" in Turkish.
None of this makes sense.
It just means sequence/array, I think.
#include <cwchar>    // wcslen
#include <iostream>

using namespace std;

int main ()
{
  wchar_t  dizim [][60] 
    {
        {L"ù322dA@@kyUMGyy3t5u7UMEy~{yyEC€}{y6(y6(y6(y4ty67"},           
        {L"6(y6(y6(y6(y6(y6(y6(3y6(3y6(3yEk€}|y1#+y15dff67fgg3ddd)=?"}, 
        {L"#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#&&%(JK33Y)("},
        {L"+yrmjyJIHkGFFk|zxy.y-y-njfy5'y0\"y,y+37y+33cuy[}s43"},
        {L"7y+37y+37yy+37y+y+ykgc#+y1#+y6(6(3yE37y-65/gt46&"}
    };
 	
	for(int i = 0; i < 5; i++)
		cout << wcslen(dizim[i]) << endl;  // length in wide characters, not bytes

  return 0;
}




48
57
52
50
48


But trying to print the text itself doesn't produce anything meaningful.
@DizzyDon you're being interesting tonight :)). Who broke your heart? Anyway, I translated the whole thing into English. Here is the full assignment:

#include <iostream>
#include <cstring>
#include <cctype>

using namespace std;

int width {60}; 
int height {5}; 

void letter_finder (char my_string [][60]) 
{
    cout << endl;

    int letters_perline {};     
    int line_counter {};       
    int total_letter_counter {}; 

    for (int i {}; i < height; i++)
    {
        line_counter++;
        letters_perline = 0;

        for (int k {}; k < width; k++)
            if (isalpha (static_cast<unsigned char>(my_string [i][k])))  // negative char values are undefined for isalpha
            {
                letters_perline++;
                total_letter_counter++;
            }
        cout << "In line " << line_counter << " there are " << letters_perline << " letters." << endl;      
    }
    cout << "All lines have " << total_letter_counter << " letters." << endl;
}

void digit_finder (char my_string [][60]) 
{
    cout << endl;

    int digits_perline {};
    int line_counter {};
    int total_digit_counter {};

    for (int i {}; i < height; i++ )
    {
        line_counter++;   
        digits_perline = 0;

        for (int k {}; k < width; k++)
            if (isdigit(static_cast<unsigned char>(my_string [i][k])))  // negative char values are undefined for isdigit
            {
                digits_perline++;
                total_digit_counter++;     
            }
        cout << "In line " << line_counter << " there are " << digits_perline << " digits." << endl;    
    }       
    cout << "All lines have " << total_digit_counter << " digits." << endl;
}

int main ()
{
    char my_string [][60] 
    {
        {"ù322dA@@kyUMGyy3t5u7UMEy~{yyEC€}{y6(y6(y6(y4ty67"},           
        {"6(y6(y6(y6(y6(y6(y6(3y6(3y6(3yEk€}|y1#+y15dff67fgg3ddd)=?"},  
        {"#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#+y1#&&%(JK33Y)("},
        {"+yrmjyJIHkGFFk|zxy.y-y-njfy5'y0\"y,y+37y+33cuy[}s43"},
        {"7y+37y+37yy+37y+y+ykgc#+y1#+y6(6(3yE37y-65/gt46&"}
    };

    digit_finder (my_string);
    letter_finder (my_string);
    
    cout << endl;

    int total_characters {};                 

    for (int i {}; i < height; i++ )     
    {                                         
        for (int k {}; k < width; k++)
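            if (my_string [i][k] != '\0')
            {
                // Cast to unsigned char: passing a negative char to isdigit is undefined.
                if (isdigit(static_cast<unsigned char>(my_string [i][k])))
                    my_string [i][k] = '.';
                else
                    my_string [i][k] = ':';
            }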
        total_characters += strlen(my_string[i]);
    }      

    cout << "In total there are " << total_characters << " characters." << endl;
    cout << "\nConverted array down there: " << endl;

    for (int i {}; i < height; i++ )     
    {  
        cout << endl;
        for (int k {}; k < width; k++)
            cout << my_string [i][k];
    }      

    return 0;
}
@jonnin it looks like I forgot the L prefixes down there. What is their purpose? And thanks, it looks like it's working. But the conversion is broken now.