The longest substring

The purpose is to find the length of the longest substring in a given string for which I wrote the inefficient algorithm below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>

int main() {
	std::string str{ "treatment" };
	std::vector<char> vec;

	size_t max{ 0 };

	for (size_t i = 0; i < str.size(); ++i) {
		auto f = std::find(begin(vec), end(vec), str[i]);
		if (f != end(vec)) {
			vec.erase(begin(vec), f+1);
			vec.push_back(str[i]);
			max = std::max(max, i-max);
		}
		else vec.push_back(str[i]);
		max = std::max(max, vec.size());
	}

	std::cout << max << '\n';
	return 0;
}


Will you give me an 'idea' how to write a better algorithm, in terms of efficiency, please?
Last edited on
> Will you give me an 'idea' how to write a better algorithm, in terms of efficiency, please?
You typically start by putting some search terms into a search engine, and read up on the subject.
https://duckduckgo.com/?q=longest+substring+&t=newext&atb=v296-1&ia=web

For example https://www.geeksforgeeks.org/length-of-the-longest-substring-without-repeating-characters/
Here you will find examples in increasing order of efficiency.

Also this for your bookmarks.
https://xlinux.nist.gov/dads/
I sort of have 2 modes. I have the toy problem mode, like the ones frequently posted here, which have no use to me in practice, and I have real problem mode, where I have to do something for my job or some other serious task.
In job mode, I look up the best way.
In toy mode, I actively try to find it myself. I often fail those (does not work out at all) or have a suboptimal solution -- its a difficult task to do from scratch. You can try to do it on paper, look for patterns you can exploit to avoid work, look for something that is made up of trivial known algorithms like a lookup table or adapted sort or whatever. There isnt a recipe to follow to find a better way, it just takes messing with the problem for a while looking for a way to crack it.

also, you have to be very careful about talking about efficiency. Doing more operations but running faster is very possible, and valid. Doing less operations is a valid measure too. Deciding what counts as an operation is important too: an expensive comparison that skips a simple copy or swap, you have to count the compares instead of the swaps when it is more expensive. Theoretical performance and actual wall clock results can diverge here, is the point.
@salem c, thank you for your links. Yeah, I should have searched for the exercise in advance before posting it here to discuss a better algorithm. Thanks also for the link for my bookmarks! That looks useful and massive. I don't know how to use it.

I found the algorithm below in the link geeksforgeeks. It is very smart!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>

int solution(std::string str) {

	int res{ 0 }, i { 0 };
	std::vector<int> lastIndex(256, -1);

	for (int j = 0; j < str.size(); j++) {

		i = std::max(i, lastIndex[str[j]] + 1);
		res = std::max(res, j - i + 1);
		lastIndex[str[j]] = j;
	}
	return res;
}

 int main() { 
	 
	 std::cout << solution("programming") << '\n';
	 return 0;
 }
Last edited on
@jonnin, I see what you said. Thank you for your pieces of advice and mentioning lookup table and adapted sort. These have been used much in exercises I've posted here thus far. They're vey useful and great. What other useful tools do you have in mind to suggest, please?
at that point you have half or so of the low hanging fruit ... so many of those problems use just those 2 concepts.
Next up is where you have to actually get to work, which means busting out an algorithms book ... the more common algorithms you know, the better off you will be. Another common theme in those competition problems are sequences/series so knowing how to generate the Nth term of a series without computing the N before it is a good study. Unfortunately, that can be across the board from easy to difficult to not really possible. Past that, numerical methods/ approximations are great as well -- things that converge on an answer, taylor polys, newtons/similar method... babylonian sqrt ...

on the other extreme, a number of the problems touch the NP complete problems (travelling salesman, prime factorization, etc) with very carefully chosen conditions that prevent you from having to run the brute force approach; you have to find the trick in their limitations that make it solvable in a low big-O.

and then there are the classics, like variations on the 8 queens problem. They mix them up and the trick is recognizing that its just an alteration of that kind of problem, then you have to alter the solver...

and there are the bit twiddle ones. The tricks that people used to do with raw bits... are many and amazing.

there is always a trick. Most of these problems can be solved in a page of code or less. Any time you find yourself solving it by double, triple, and worse nested looping, many pages of code, or general brute force, you are wrong :)

I am no expert on these kinds of problems. The screwy wording annoys me, and what few of them I do are when someone posts one here that seems halfway interesting.
Last edited on
Topic archived. No new replies allowed.