### Big O notation regarding the math

Hi guys,

A confession I must make, I don't have the slightest inclination of how the math behind big-oh notation works.

I can somewhat understand why we call it asymptotic notation. We call it as such because we study how our curve whether it be linear, exponential, etc grows towards our vertical asymptote, which can represent time or space. Our horizontal asymptote will be our input of the function. So as our input increases we study how our curve responds.

In the first example, we are analysing selection sort, which is f(n) = n^2

it also mentions that selection sort can be represented by f(n) = (n^2/2)-(n/2)

This is where my understanding breaks down and where the math comes in.

The formal definition is as follows : f(N) - O(g(N)), if there exists positive constants c, N0 such that f(N) <= c . g(N) for all N >= N0

And just below the author tries to explain it but I'm still not entirely concrete with the explanation.

 The formal definition is useful when you need to perform a math proof. For example, the time complexity for selection sort can be defined by the function f(n) = n²/2-n/2 as we have discussed in the previous section. If we allow our function g(n) to be n², we can find a constant c = 1, and a N₀ = 0, and so long as N > N₀, N² will always be greater than N²/2-N/2. We can easily prove this by subtracting N²/2 from both functions, then we can easily see N²/2 > -N/2 to be true when N > 0. Therefore, we can come up with the conclusion that f(n) = O(n²), in the other selection sort is “big O squared”.

in regards to the above, what does the following mean; f(N) - O(g(N)), if there exists positive constants c, N0 such that f(N) <= c . g(N) for all N >= N0

and why does the author choose the constant as 1? And also, why does the author choose n0 as 0?

Thanks :)
Last edited on
 ```f(N) = O(g(N)) if f(N) < C.g(N) for some constant C and sufficiently large N f(N) = o(g(N)) if f(N)/g(N) -> 0 as N -> infinity f(N) ~ g(N) ("f twiddles g") if f(N)/g(N) -> 1 as N -> infinity```
Hi lastchance :)

Still seems like gibberish to me.

Also still not entirely sure why the author (maybe arbitrarily??) choose 1 as the constant and 0 as N0 and what do the constant and N0 represent?

Additionally, the author mentions " If we allow our function g(n) to be n², we can find a constant c = 1, and a N₀ = 0, and so long as N > N₀, N² will always be greater than N²/2-N/2. " but more specifically "o long as N > N₀, N² will always be greater than N²/2-N/2." May guess is that N represents the input such as a size of a string or size of a vector but what does N₀ represent?

Thanks
Last edited on
Just a follow up, let's say f(n) = 4n+3 and g(n) = n

4n + 3 = n iff there exists constants c and n0 such that f(n) <= c*g(n) for all n < n0

I'm guessing c is the constant that we multiply n by and n0 will be the lowest input value that satisfies the equation? ( is that correct?)

so to find n0 we solve the inequality,

let's try c = 5

=> 4n + 3 <= 5n
=> 3 <= 5n - 4n
=> 3 <= n
=> n >= 3

so n0 = 3 and c = 5

so I'm guessing f(n) = 4n+3 and g(n) = n holds true when c = 5 and n >= 3 ??

but now let's change g(n), f(n) still = 4n+3, and g(n) = n^2

so how do we prove that these two are not equal?

c = let's say 100?
=> 4n + 3 <= 100*n^2
=> ???

but we can't really simplify the above equation much more, so does this mean that g(n) does not equal f(n)??

 Also still not entirely sure why the author (maybe arbitrarily??) choose 1 as the constant and 0 as N0
There's no particular reason. One just needs to find any constant that satisfies the inequality. Now, if your question is how one finds the constant, that's a different (and much more difficult) question.

 => 4n + 3 <= 100*n^2 => ???

0 <= 100 n^2 - 4 n - 3
By quadratic formula, the zeroes of that second degree polynomial are (approximately)
n=-0.154
n=0.194
Since the sign of second degree coefficient is positive, the parabola is upwards, thus the above inequality is approximately
n <= -0.154 or n >=0.194
But since we only care about the positive side:
n >=0.194

Since we've found a value for c and n0 such that c n^2 >= 4 n + 3 for all n > n0 (in particular, c = 100 and n0 = 1), n^2 is indeed an upper bound on 4 n + 3.
(Needless to say, it's not a particularly tight upper bound.)
Last edited on
Just a follow up, let's say f(n) = 4n+3 and g(n) = n

Well, if n > 3 (say) then 3 < n.
So, 4n+3 < 4n+n
So, 4n+3 <5n
So, f(n) < 5 g(n)
So, for all n > 3, f(n) < C.g(n)
So, f(n) = O(g(n))

Alternatively,
f(n)/g(n) = (4n+3)/n = 4+3/n -> 4 as n->infinity
So, by definition of the limit, for "sufficiently large" n (which is problem-dependent),
f(n)/g(n) < 4+epsilon
where epsilon is a strictly positive number, but as small as you like. Hence, for large enough n,
f(n)/g(n) < C,
where C is anything strictly bigger than 4.

f(n)/g(n) = (4n+3)/n2 = (4/n+3/n2) which tends to 0 as n-> infinity.
So, f(n) = o(n2), ("little o"), which is rather stronger than f(n)=O(n) ("big O").

Your bounding constant (C) and first n0 are not unique and are often chosen, on a problem-dependent basis, for ease of proving bounds. As @Helios pointed out, simply to determine complexity they don't have to give particularly tight bounds.

You could, of course, use the "twiddles" notation:
4n+3 ~ 4n as n->infinity

To be honest, your link isn't that good. You would be better with an Analysis textbook. His "Big O doesn't matter" section would be more accurately titled "Big O isn't everything".
Last edited on
Starting to click now, let me give you a run down or synopsis of my understanding.

So big O is used to set the upper bounds on an algorithm (space or time), this upper bound is the longest an algorithm will take to run, with our example : f(n) = 4n+3 and g(n) = n

we find that if C = 5 and n0 = 3, then 4n+3 = O(n) or in other words f(n) = O(g(n)) where g(n) = n

A question that sprints to mind is that of a not so tight upper bound, as Helios mentioned " c n^2 >= 4 n + 3 for all n > n0 (in particular, c = 100 and n0 = 1), n^2 is indeed an upper bound on 4 n + 3."

so we could set an upper bound of anything greater than f(n)? it could be abnormally bigger than f(n) but it would still be correct just not a tight upper bound?

I'm also guessing n0 represents the input such as a vector that has a size greater then 3?

Thanks guys :)
 so we could set an upper bound of anything greater than f(n)? it could be abnormally bigger than f(n) but it would still be correct just not a tight upper bound?
An upper bound could be any function that grows at all faster than the function in question, yes. The upper bound would have to be the slowest-growing function that can overtake the function in question. Big O notation is usually understood to denote the upper bound of an algorithm. If the upper bound isn't completely known, one might say "the worst case time complexity of the Foo algorithm has been proven to be no worse than O(2^n)".

 I'm also guessing n0 represents the input such as a vector that has a size greater then 3?
No, the value is arbitrary. Since one of the functions you're comparing has been multiplied by a constant that has been chosen arbitrarily (c), the value of n0 that has been obtained is unrelated to any concrete magnitude.
makes sense, although the constant is still a little perplexing to me.

let's say we have a function f(n) = 2n+3, we could say that g(n) = n^2, again, as helios mentioned, this g(n) would work but it's not a tight upper bound at all.

A more appropriate upper bound ( g(n) = n ) would be to find the constant that when multiplied with g(n) satisfies our conditions. For argument sake, we could choose the constant 8,

=> f(n) <= c.g(n)
=> 2n+3 <= 8 * n
=> 3 <= 8n - 2n
=> 3 <= 6n

=> (3/6) <= (6n/6)
=> 0.5 <= n
=> n >= 0.5

I'm not entirely sure if n0 has to strictly be an integer and if so 8 would not constitute as a correct value for the constant in this case.

but assuming n0 can be an integer, then we can say f(n) = O( g(n) ) when c = 8 and n0 = 0.5

meaning the function f(n) when c = 8 and n0 = 0.5 will not grow faster than g(n).

What perplexes me, is that, we can choose a ridiculously large constant and always make f(n) <= g(n)

we could have f(n) = n^5 and g(n) = n, we could then make f(n) = O( g(n) ) if we multiply n by a constant such as 10000000000000

this is what is confusing me about the constant.

 I'm not entirely sure if n0 has to strictly be an integer and if so 8 would not constitute as a correct value for the constant in this case.
Since the inequality holds for all n >= 0.5, then it also holds for all n >= 1.

 What perplexes me, is that, we can choose a ridiculously large constant and always make f(n) <= g(n) we could have f(n) = n^5 and g(n) = n, we could then make f(n) = O( g(n) ) if we multiply n by a constant such as 10000000000000
No, given n^5 and c n, there's no constant c so large that n^5 cannot eventually overtake c n. In fact, for any pair of polynomials of degrees n and n+1, the polynomial degree n+1 always overtakes the polynomial degree n. For example, there's sufficiently large value of a such that if n > a then 2^-1000 n^2 > 2^1000 n
 And just below the author tries to explain it but I'm still not entirely concrete with the explanation.

All the article is doing is showing graphically, and in writing, the reason why higher order complexity overrides the need to consider (after a certain point, N0) lower terms to determine the overall complexity.

f, and g are complexity functions describing how the complexity changes with increasing n.

BTW: if f is 2n + 3, and g is n^2, you would proceed as follows to avoid divide by zero problems (Algebra 101):
1. 2n+3 <= n^2
2. n^2 - 2n - 3 >= 0
3. (n - 3)(n + 1) >= 0
4. So, n >= 3 (Note n + 1 will always by > 0)

Or for n < 3, f < g ... try it.

You can play around and fiddle with the algebra if you want to introduce c.The principle is exactly the same.

Last edited on
in concept only, a lot of the big-O math overlaps with the convergence / divergence of series/equations tricks. If you study that branch of math, you will see how the math tries to isolate a dominate term. Its important for proofs and rigor I suppose, but I found it useless in practice (you can almost always plug in numbers and see if it converges or not with minimal work whereas doing it officially and carefully on paper can be tricky )
@Helios starting to click now

 What perplexes me, is that, we can choose a ridiculously large constant and always make f(n) <= g(n) we could have f(n) = n^5 and g(n) = n, we could then make f(n) = O( g(n) ) if we multiply n by a constant such as 10000000000000 No, given n^5 and c n, there's no constant c so large that n^5 cannot eventually overtake c n. In fact, for any pair of polynomials of degrees n and n+1, the polynomial degree n+1 always overtakes the polynomial degree n. For example, there's sufficiently large value of a such that if n > a then 2^-1000 n^2 > 2^1000 n

if we did say f(n) = n^5 and g(n) = n, then n will not be a upper bound as f(n) will eventually grow larger than g(n) no matter what constant we choose, so if we chose 10000000000000(1e+13) as the constant, 1e+13 * n will hold true until n = 10001, at n = 10000, 10000^5 = 1e+16 and also, 1e+13 * 10000 =1e+16

thus f(n) will surpass g(n) when n = 10001, therefore f(n) != O( g(n) )

Is this correct?

* just a note that math is wrong, 10000^5 != 1e+16 but rather, 1e+20, but, what I'm trying to say is... f(n) will eventually surpass g(n)

Thanks
Last edited on
Yes.
Thanks guys, evidently enough, I've never worked as a dev but from what I've gathered, big-oh is rarely used by devs. Obviously there is use cases, you don't want to write an algorithm to be of exponential time but have you used it much in practice after you've coded a solution that may be used in production?
 but have you used it much in practice

When you're coding, you should have an idea at least of Big O. It's not difficult to get an idea at least of the time complexity of your algorithm when you're coding it.

Obviously, if you code a product and you have bubble sort in there and it's running slow, you'd want to use something faster like quick sort. Going online, you'll find the Big O of the common algorithms and that's usually good enough.

When its not obvious what the time complexity is and you can't find it online, you can just try out a couple different algorithms and time them to see which runs faster if you really don't want to do the math.

However, understand even the math can be a bit deceptive. O(100000) will reduce to O(1) while O(10n) will reduce to O(n). So while time complexity for one algorithm may be better than another, its completely possible that the better time complexity takes longer to finish for smaller data sets.

Since O(100n) reduces to O(n), compare it to O(n^2) and you'll go with O(n). But notice that the first algorithm won't actually perform better than O(n^2) until n is greater than 100!

So use common sense and logic. Big O and other notations are good tools for generalizations, but the best algorithm depends on your application.