What is a computational problem?

In our day-to-day lives, we often ask questions we want answers to. "Is this number odd?" "What is the shortest route from point A to point B?" "Does this program have any bugs?" All of these questions have answers (for suitable inputs), so it becomes a matter of how to figure out that answer.

For some questions, we can figure out the answers ourselves. Still, it's often desirable to automate the process of figuring out the answer. Instead of trying to chart a driving rout by looking at roads on a paper map (remember those?), it's easier to put in your endpoints into Google Maps and let it do the work for you. We could read through the programs we write to look for bugs (that's what a code review is), but the more a computer can automate this process, the more productive we are.

In this course, we will formalize what kinds of questions we might want to ask (using the concept of a problem), and what it means to automate finding the answer (using the concept of a program). Armed with these concepts, we will explore the foundational concerns of theoretical computer science:

Given a problem, can we create a program to automatically figure out the answer? Note that we're not concerned about if we're smart enough to create such a program, but if such a program is theoretically possible to create.
If it is possible to create a program to find the solution to a problem, can we create an efficient program? It's no use if the program takes too long to find the answer.

In this first chapter, we will formalize exactly what kinds of questions we would like to find the answer to.

🧠 STUDY TIP

How to use this book

If you're comfortable with theoretical mathematics, you already know what to do: understand the concepts and practice using them. But, if you're new to the world of math at this level, follow these steps closely:

Read the definitions of each key concept carefully. Write it down to reinforce the definition in your mind.
Rephrase the book's definition. Your version doesn't need to be perfect, but it's a way to think about the concept in a way that makes sense to you. You'll still refer back to the book definition when working with that concept, but your definition is a personal reminder.
Come up some examples of that concept and make sure they fit the book definition. When we wer talking about "questions we want the answer to" at the beginning of this chapter, did you come up with any other questions you want answered?
Play around with the interactive examples following the definition. They're in the book not as a test, but as a way to build up your understanding.

Math isn't linear, and as you gain more experience, you can and should jump around. But while you're starting out, I recommend fully understanding each definition before moving onto the next. I've tried to provide ample exercises so you can master each concept right as it's presented in the book.

Definitions are not the end of your learning, but they are crucial! Using the same definition means you and I will always refer to the same concept. You're free, even encouraged, to seek out more resources online. But be careful: sometimes others use the same word for slightly different concepts. Use those resources, but make sure you understand this book's definitions as you make it through this course.

Computational problems

And with that, we come to our first definition of the course: the computational problem .

A computational problem is a mathematical function that takes in a string and outputs a string.

Technically, we haven't defined what a "string" is, but for now, think of a string in your favorite programming language.

🧠 STUDY TIP

Recall what you should do for the definition above:

Write down the definition, by hand if possible.
Write down your own version of this definition.
Come up with a few examples of functions that map strings to strings.
Play with the examples below.

Here are a couple of example computational problems to help you understand the concept.

Consider the problem MultiplyByTwo :

a string representing an integer \(N\) in base-10

a string representing the integer \(2N\) in base-10. If the input is not a valid base-10 integer, return the string "no".

Try it! Fill in the inputs and outputs for MultiplyByTwo in the table below. Remember to surround your strings in double quotes to denote they are strings.

`"2"`
`"5"`
`"231"`
		`"8"`
		`"1036"`
`"not a number"`

Consider the problem LessThan :

a string of the form "N < M", where \(N\) and \(M\) are both base-10 representations of integers

the string "yes" if the input is correctly formatted and \(N\) is numerically less than \(M\). Otherwise, return the string "no".

Try it! Fill in the inputs and outputs for LessThan in the table below. Remember to surround your strings in double quotes to denote they are strings.

`"5 < 6"`
`"1024 < 49212"`
`"24 < -1"`
`"24 > 2"`

Consider the problem PolynomialEval :

a string consisting of: space-separated list of integers \(c_m, c_{m-1}, ..., c_0\) in base-10, followed by a semicolon, followed by one base-10 integer \(N\)

a base-10 string represention of the integer that results from evaluating the polynomial \(c_m x^m + c_{m-1} x^{m-1} + ... + c_0\) at \(x = N\). If the input is not formatted correctly, return the string "no".

Try it! Fill in the inputs and outputs for PolynomialEval in the table below. Remember to surround your strings in double quotes to denote they are strings.

`"2;5"`
`";4"`
`"5 2;4"`
`"2 3 4;3"`
`"2 3 4"`
`";"`

The LessThan problem above is interesting because it's a yes/no question. As we'll see throughout this course, these problems are important enough to warrant their own name:

A decision problem is a computational problem that only ever outputs the string "yes" and "no".

🧠 STUDY TIP

I didn't put any interactive examples below because you got some practice earlier. Still, follow the other steps like rephrasing the above definition and coming up with your own examples of decision problems!

Hint: you can create a decision problem similar to PolynomialEval . What yes/no question might we ask about polynomials?

Instances are inputs

As we'll see in later parts of the course, some answers are more useful than others. For example, if a problem asked, "what is the shortest path between point A and B?", we might care more about the scenarios when there actually is a path between those points. For this reason, we can classify the inputs to our problems:

An instance is an input to a computational problem. Instances come in two flavors:

A negative instance is an input to a problem that results in the output being "no".
A positive instance is any other input. For decision problems, this is equivalent to any input that results in the output being "yes", but for other computational problems, the output is anything other than "no".

Note: you may notice that for some inputs, the error message mentions an alphabet. We will cover this concept in the next section.

Recall the definition of the PolynomialEval problem above. The definition requires a very specific input format, and any string that doesn't conform to that format results in a "no". What are some positive instances to the PolynomialEval problem?

The positive instances for LessThan are easier to think about. Because the problem is a decision problem (only returns "yes" and "no"), positive instances to LessThan are simply the inputs that result in an output of "yes". What are some positive instances to the LessThan problem?

As an aside, what happens if "no" is a valid answer to the question being asked? For example, what if the problem is to find the first two letter word in the input, and the input is "long no yes"? This input yields an output of "no", but we probably don't want to classify this input as a negative instance.

We could adjust our definition of a negative instance, or we can simply avoid the issue altogether. Change the problem so that when you find a two-letter word in the input, return it surrounded by angle brackets. In the above example, the output would be "<no>". Only if there are no two-letter words in the input would the output be the string "no". Insisting our problems are defined this way makes our terminology simpler.

Strings are made from alphabets

It may feel like a huge restriction that we limit ourselves to functions that take in strings and spit out strings. After all, how can we navigate from point A to point B on a map if we can't even pass in the map as an input? As it turns out, strings are more powerful than you might imagine!

But, in order to understand how powerful strings are, let's formalize the notion of a string. We start by defining an alphabet:

An alphabet is a finite set of unique symbols.

With this we can define a string:

A string is a finite sequence of zero or more symbols from an alphabet.

An empty string is a string consisting of zero symbols. An empty string is denoted either as "" or as \(\epsilon\).

In this course, we always write strings surrounded by double quotes to show they are strings.

Here are some examples of alphabets, along with strings using those alphabets.

Alphabet	Sample strings
\(\{\mathtt{0}, \mathtt{1}\}\)	\(\epsilon\)
	`"0"`
	`"1"`
	`"00"`
	`"01"`
	`"10"`
	`"11"`
	`"1010110101001"`
\(\{\mathtt{A}, \mathtt{B}, \mathtt{C}, ..., \mathtt{Z}\}\)	\(\epsilon\)
	`"A"`
	`"K"`
	`"HELLO"`
	`"JCVVONANJOCOAWUOHOUEJRRANEHU"`
ASCII characters	\(\epsilon\)
	`"I can use spaces"`
	`"I can even use symbols: - + ?"`
	`"Because the ASCII alphabet includes the newline character, strings can span multiple lines. They can even contain blank lines."`

String encodings

Now that we know what a string is, we can return to our question: how do we work with entities like graphs if we can only pass in strings to our computational problems and get strings out? It turns out that we can represent the entities we care about as strings. As long as our string contains all the information about the entity, the string is as good as the original entity.

This way of representing something as a string is called a string encoding:

An encoding is a way to represent some entity in another form.

In this course, we focus on string encodings , which is a way to represent some entity as a string. The resulting string is called the encoded string .

In order to understand how we can represent arbitrary entities as strings, we have to ask: what is the important information about this entity that needs to be captured to recreate this entity unambigously? The following examples will walk you through this process.

Let's consider an undirected graph:

Here, we've represented the graph as a drawing. We want to capture all the information present in this drawing, only in text form. To do that, let's ask: what are the key features of this graph? If you were to describe this graph to a classmate over the phone, how would you describe it?

Reveal answer

You would want to describe the following features:

What nodes are in the graph, such as the fact that there are nodes A, B, all the way to H.
Which nodes are connected to each other, such as the fact that C connects to A, D, E and F.

You could also describe where the nodes are placed visually in the drawing, but remember that moving around the nodes doesn't change the graph's structure. So, we'll avoid describing how the graph is laid out, just its nodes and connections.

With the important parts of the graph identified, we can write down this key information in a string. There are many ways to do this, but let's start with a simple translation: write down the different edges of the graph, one after another.

For example, A connects to C, so write down A,C. Repeat for the other edges. What would this look like?

Reveal answer

You could have chosen many different ways of listing out the edges, but in the following encoded string, our encoding (the rules for translating the graph into a string) assumes we'll write edges with commas, then separate different edges with a space.

If your screen is narrow, you may need to scroll the string horizontally to see it all.

"A,C B,D C,D C,E C,F D,G E,F E,H F,G G,H"

Try hovering over or tapping edges in the graph or the string, and see the corresponding edge be highlighted in the other figure.

The above encoding has some interesting properties:

For an undirected graph, each edge is written once in the encoded string. Whichever "direction" we choose for the encoded edge is arbitrary. For example, the encoded string will only contain A,C or C,A but not both.
If there is a node with no connected neighbors, we have no way to represent it in the encoded string! Can you think of some ways to fix this problem?

Let's try another encoding, one that differs with respect to both properties above. We can represent the graph as adjacency lists: write each node name, a colon, then all neighboring nodes. Do this for each node, one per line. What would the encoded string look like for the above graph using this encoding?

Reveal answer

Here's the graph encoded using the second encoding:

"A: C
 B: D
 C: A D E F
 D: B C G
 E: C F H
 F: C E G
 G: D F H
 H: E G"

See if you can spot how both properties from above are different for this encoding compared to the first one.

Try it! Use the edge-based encoding (the first encoding above) to represent the following graph as a string. There are multiple answers, see if you can find more than one.

As seen in the above example, we can choose to encode our entities in many different ways. For example, as long as our encoding for graphs can unambigously represent any graph, we are free to choose any encoding. In this course, we will use the edge-based encoding for graphs.

Another limitation you may have noticed is that our definition of computational problem only allows for a single input and a single output. But what if we want to take in multiple pieces of data? The PolynomialEval problem has already worked around this (can you figure out how?), but the next example shows a string encoding of multiple values that comes in handy.

The book What can be computed? defines the following encoding for representing two strings as a single string. There is the Encode as Single String (ESS) function to convert two strings into a single string, and the inverse DEcode from Single String (DESS) function to extract the two original strings out of the encoded single string:

\begin{align} \mathrm{ESS}(I_1, I_2) &= \mathrm{len}(I_1) + \mathtt{space} + I_1 + I_2 \\ \mathrm{DESS}(I) &= (I_1, I_2) \end{align}

Let's see how ESS can be used to encode two strings as one:

`("ABCD", "EFG")`		`"4 ABCDEFG"`
`("Hello", "world")`
`("Input 1", "Input 2")`
`("Second is empty", "")`
`("", "First is empty")`

Next, try decoding the encoded strings by breaking up the encoded string into its constituent parts:

`"4 ABCDEFG"`		`("ABCD", "EFG")`
`3 Twostrings`
`12 Strings withspaces in them`
`6 string`
`0 string`

With this encoding, we are no longer restricted to single values in our inputs. Our single-string inputs can encode two values, each of which can recursively encode more values!

Languages

A language (or a formal language ) is a set of strings using an alphabet.

Consider the alphabet \(\Sigma = \{\mathtt{0}, \mathtt{1}, \mathtt{2}, \mathtt{3}, \mathtt{4}, \mathtt{5}, \mathtt{6}, \mathtt{7}, \mathtt{8}, \mathtt{9}\}\).

Define the language MultipleOfThree as the set of strings \(S\) using \(\Sigma\) where:

\(S\) is the decimal (base-10) representation of an integer.
The represented integer is a multiple of \(3\).

Provide some examples of strings in the MultipleOfThree language. Remember to use double-quotes around your strings to indicate they are strings.

Next, provide some examples of strings not in the MultipleOfThree language.

The language above is an infinite set of strings. But notice the definition of the term language doesn't mention anything about the size of the set. A language can just as well be a finite set, as shown below:

Consider the alphabet \(\Sigma = \{\mathtt{0}, \mathtt{1}\}\).

Define the language SmallBinary as the set of strings \(S\) using \(\Sigma\) that are exactly two characters long.

How many strings are in this language?
What are they?

Reveal answer

There are exactly four strings in this language. They are:

"00"
"01"
"10"
"11"

Based on the examples above, it's tempting to think all languages are defined using some rules or patterns that constrain the strings in that language. But look again at the definition of a language: there is no restriction that the strings in the set are constrained by any rule.

Try your hand at this language where the strings are composed of decimals integers (strings of 0-9). Try some random guesses and see if you can find a pattern.

Found the pattern yet?

Reveal answer

There is no pattern! Some strings are just in the language, and some strings are not. There is no rhyme or reason to it.

This example is implemented by randomly accepting a string as part of the language or not. If you compare your results with your friends, you will see different results.