1) The Setup
So far in C/C++, we've talked about general syntax/practice, variables and their implications, and pointers. We also talked about a basic form of string representation, the char array, and how we can declare one:
char Lalphabet[] = "abcdefghijklmnopqrstuvwxyz";
which makes a char array which we could access and/or manipulate using standard array indexing syntax such as:
Lalphabet[4] = 'H'; //change the element at index 4 (the fifth character) to an H
or
Serial.println(Lalphabet[6]); //print out the element at index 6 (the seventh character)
We could use this in a working example on your ESP32 as shown below. Note that we use sizeof (see footnote 1) to determine the size of the char array so that we index through it properly.
char Lalphabet[] = "abcdefghijklmnopqrstuvwxyz";
void setup() {
  Serial.begin(115200);
}
void loop() {
  Serial.println("The Alphabet:");
  for (int i = 0; i < sizeof(Lalphabet); i++) {
    Serial.println(Lalphabet[i]);
  }
  delay(1000); //bad practice, but ok for this example
}
And the output in the Serial Monitor would be the following ad nauseam (try it):
The Alphabet:
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z

The Alphabet:
a
b
....
There's one subtle thing we should notice here: where is that extra blank line coming from between the letter z and the beginning of the next "The Alphabet" being printed? If we analyze our for loop, we'll notice we print one line each time through. There are 26 letters in the English alphabet, so we should be printing 26 times plus once for "The Alphabet", but you'll see that on each iteration of loop above you get 28 things printed! 26 letters, one blank line, and one line saying "The Alphabet". Something is off.
Let's do the following which will repeatedly print out to the serial monitor the size of our char array:
char Lalphabet[] = "abcdefghijklmnopqrstuvwxyz";
void setup() {
  Serial.begin(115200);
}
void loop() {
  Serial.println(sizeof(Lalphabet));
  delay(1000); //delay is bad practice, but ok for this example
}
When you run this you'll get:
27
27
27
27
27
...
Oh jeez. The alphabet array we made had 26 letters in it (each letter being a char, which has a size of one byte, so we'd expect 26 bytes in size). But this char array we made is reported to have a size of 27. Where is that extra byte coming from? This is where we get into the idea of representing strings in C/C++!
2) C-strings (char arrays)
In C, any time you want to represent a "string" of text you do it with a char array. As we've seen, C is lower level, and therefore we need to be more conscious of how a string lives in memory (whereas Python abstracts a lot of that for you).
Note: The Arduino String class exists and can work with the ESP32 (our microcontroller). As a class, 6.08 and 6.S08 used this previously. It provides a lot of easy-to-use features, especially if you're coming from a Pythonic environment, but it does really start to clog up the works in terms of deliverables and growing within the C/C++ language. We will be using C-strings in 6.08 and expect you to do so on the assignments in the class. The traditional C-style string has been part of C since the beginning and will always work, so it is also a good thing to know. It also lets us focus more on pointers, which is always a good thing!
The first thing to note is that a character and a string are not the same thing in C as they are in Python. In Python, a single character is really just a string that is one character long. Also in Python, single quotes and double quotes (' vs. ") mean the same thing.
In C/C++, a single character must be indicated using single quotes (') whereas a string must be indicated using double quotes ("). A string is just an array of characters. So you do stuff like the following:
char text_[] = "Some text"; //create char array...use double quotes
char x = 'a'; //create char (size of one byte)...use single quotes
Just like with any array in C you can specify the size at the declaration (and this size can be larger than what you need at the moment):
char text_[300] = "Some text"; //create char array that is 300 long, but currently only has "Some text" string in it
char x = 'a'; //create char (size of one byte)
2.1) NULL Character
A properly structured C-string consists of an array of characters terminated by a final character known as the "NULL" character, which can be written as '\0' in the language. This character is implicit in string creation, so when you type out "Some text", what that really refers to is a character array that is 10 characters long: nine for the characters in "Some text" and one for the implicit NULL character. '\0' means "String is done." and each string you create should always have one.
The location of that null character is often used by other functions and operators in C to know when to stop reading in memory. For example, below we create a character array and load it with the string "There". This means it starts out with five characters and the NULL at the end. If you print it in the Serial monitor you'll see "There" printed.
If you then replace the character at index 3 with a null character as shown below and print again, the printing "stops" early! That's because the terminator is now earlier in the string.
char input_2[] = "There"; //create a string that is five characters and a null terminator.
Serial.println(input_2); //prints There
input_2[3] = '\0';
Serial.println(input_2); //prints The
The null terminator is VERY important. Accidentally deleting it can cause problems in printing and other things, because functions working with the string then have no idea where the valid characters stop and other memory begins. Careful management and replacement of the null can of course be done by hand, but there are also some nice general libraries that will help us do standard C operations while managing those C-strings.
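To make the "stop at the null" idea concrete, here is a minimal sketch in plain C of how a function might walk a char array until it hits the terminator. The name count_until_null is hypothetical (made up for this illustration, not part of any library), and the function assumes the array really does contain a '\0' somewhere.

```c
#include <stddef.h>

// Hypothetical helper for illustration: count the characters that come
// before the first null terminator. Assumes text is properly terminated.
size_t count_until_null(const char *text) {
    size_t count = 0;
    while (text[count] != '\0') {  // stop at the terminator, wherever it is
        count++;
    }
    return count;
}
```

With an array holding "There" this counts 5 characters; after writing '\0' into index 3 it counts 3, even though the array itself still occupies 6 bytes.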
3) string.h
When working with char arrays, a standard library in C that can help a lot with char array manipulation is the string.h library. It includes 22 helpful functions that can aid in merging, sorting, and parsing char arrays in a safe and productive manner. In order to use it, you need only do the following towards the top of your file:
#include<string.h>
And then use the functions as desired. Oftentimes in C you can't just mess with your string in the same way as you can in Python. Python does a lot for you, whereas in C, there's much less abstraction.
4) C-String Cheat Sheet
We could go through a whole bunch of char array examples, but that would take a while. You should know how to do some Python if you're in this course, so we're going to map some Python functionality to equivalent C functionality! Think of this as a language translator (see footnote 2).
4.1) Creation
If you want to create a string in Python you just go and do
message = "Hi"
message_b = "Hello."
There are a few ways to do this in C/C++:
char message[] = "Hi"; //created three-byte C-string comprising characters 'H', 'i', and '\0' (the null character)
char message2[3] = "Hi"; //created three-byte C-string comprising characters 'H', 'i', and '\0' (the null character)
char message3[10] = "Hi"; //created ten-byte char array, that starts with a three-byte C-string comprising characters 'H', 'i', and '\0' (the null character)
All three of these have created a char array containing the string "Hi" (which is three bytes in C, including our terminating null character). The third declaration, message3, however, has placed that null-terminated string within a larger ten-byte char array. Because the array was initialized from a string literal, the remaining seven bytes are set to zero (null characters).
Note in C you may also do the following:
char message[20]; //create a char array but don't put anything in it. It exists in memory, but if declared inside a function its contents are whatever values were already there!
4.2) Re-Assigning a String
Let's say you want your string to be a completely different new string. In Python that's easy:
message = "Hi"
message = "Hello" #changed my mind
In C/C++ there is the very helpful function strcpy (https://www.tutorialspoint.com/c_standard_library/c_function_strcpy.htm) from the string.h library, which takes a pointer to the char array you want to overwrite, followed by the content you'd like to overwrite it with:
#include<string.h>
//then in function somewhere:
char message[40] = "Hi I Like C"; //create 40-long char array holding null-terminated string of length 12 (11 chars, and null-terminator)
Serial.println(message); //Will print "Hi I Like C"
strcpy(message,"No...I love C"); //overwrite with 14 long string with 13 characters and null (still fits inside 40)
Serial.println(message); //Will print "No...I love C"
Care must be taken here. What will happen if you instead do the following on the ESP32 (actually run this!):
#include <string.h>
void setup() {
  Serial.begin(115200);
}
void loop() {
  char message[] = "Hi I Like C"; //12 long (11 char one null)
  Serial.println(message);
  strcpy(message, "No...I love C. It is the best.Seriously. The best");
  Serial.println(message);
}
When you do, you'll get the following (or something similar):
Hi I Like C
No...I love C. I?? riously. The best
Stack smashing protect failure!
abort() was called at PC 0x400d59a8 on core 1
Backtrace: 0x400853ec:0x3ffb1f20 0x40085619:0x3ffb1f40 0x400d59a8:0x3ffb1f60 0x400d0b71:0x3ffb1f80 0x400d1749:0x3ffb1fb0 0x4008950d:0x2e796c73
Rebooting...
ets Jun 8 2016 00:22:57
Basically your ESP crashed! (So cool!) This is because you tried to copy a string into a char array that wasn't large enough to hold it! strcpy does not check the size of its destination, so the copy itself happened and ran past the end of the 12-byte message array into neighboring memory. The system at large only respects the boundaries that are declared for the message array, so it most likely overwrote the parts of the string that went beyond the original 12-long boundary (hence the garbage in the second printed line). Since the Serial.println function uses the null terminator to figure out how much to read and print, it printed whatever had been written there, and once the corrupted memory was detected ("Stack smashing protect failure!") the program was aborted and the ESP rebooted.
So the lesson is, always be sure the character array you're writing to is large enough to hold what you want to put into it.
The following will run fine, for example:
#include <string.h>
void setup() {
  Serial.begin(115200);
}
void loop() {
  char message[100] = "Hi I Like C"; //12 long (11 char one null) in a 100-long char array!
  Serial.println(message);
  strcpy(message, "No...I love C. It is the best.Seriously. The best");
  Serial.println(message);
}
No stack-smashing.
Note that, depending on how badly you mismanage your strings and null terminators, you may also get an error like this:
Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception was unhandled.
Core 1 register dump:
PC : 0x400014fd PS : 0x00060130 A0 : 0x800da4e0 A1 : 0x3ffb1b70
A2 : 0x00000048 A3 : 0x00000044 A4 : 0x000000ff A5 : 0x0000ff00
A6 : 0x00ff0000 A7 : 0xff000000 A8 : 0x00000000 A9 : 0x3ffb1e40
A10 : 0x00000003 A11 : 0x00060123 A12 : 0x00060120 A13 : 0x00000020
A14 : 0x00000020 A15 : 0x00000000 SAR : 0x0000001f EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000048 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xffffffff
Backtrace: 0x400014fd:0x3ffb1b70 0x400da4dd:0x3ffb1b80 0x400d8645:0x3ffb1e90 0x400d0b86:0x3ffb1f50 0x400d0bf1:0x3ffb1f90 0x400d17d5:0x3ffb1fb0 0x4008950d:0x3ffb1fd0
This arises from similar situations, so if you see it, check what you're doing with your strings!
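One way to guard against this kind of overflow is to compare lengths before calling strcpy. The sketch below is only an illustration in plain C: copy_if_fits is a hypothetical helper name (it is not part of string.h), and it assumes the caller passes the true size of the destination array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper for illustration: copy src into dest only if it fits,
// counting the null terminator. Returns true on success; on failure dest is
// left untouched.
bool copy_if_fits(char *dest, size_t dest_size, const char *src) {
    if (strlen(src) + 1 > dest_size) {
        return false;  // src plus its '\0' would run past the end of dest
    }
    strcpy(dest, src);  // safe now: we just checked the size
    return true;
}
```

You would call it as copy_if_fits(message, sizeof(message), "new text") from the scope where message is declared as an array, so that sizeof reports the array's full size.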
4.3) Appending
Let's say you want to add something onto the end of your String! (A common occurrence!)
In Python you'd do:
message = "Hi"
message += " there!" #append onto the string
whereas in C one way to do it, using the string.h library, is with strcat:
#include<string.h>
//then inside of a function:
char message[40] = "Hi"; //pick a size that's large enough for any expected use!
strcat(message," there!"); //good if there is enough space pre-allocated for message!
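If you can't be sure ahead of time that the addition fits, the standard strncat function lets you cap how many characters get appended. Here is a minimal sketch in plain C; append_truncating is a hypothetical name chosen for this example, and it assumes dest already holds a properly null-terminated string.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper for illustration: append as much of addition as fits,
// cutting it short rather than overflowing dest.
// Assumes dest already holds a null-terminated string within dest_size bytes.
void append_truncating(char *dest, size_t dest_size, const char *addition) {
    size_t used = strlen(dest);          // characters already in dest
    if (used + 1 >= dest_size) {
        return;                          // no room for even one more character
    }
    size_t room = dest_size - used - 1;  // keep one byte for the terminator
    strncat(dest, addition, room);       // strncat always writes a final '\0'
}
```

Silently truncating is a design choice: it keeps the program alive, but the string may be cut short, so whether that beats refusing the append depends on what you are building.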
4.4) String Formatting
Let's say you need to inject non-string information into a string for showing to the outside world!
There are several ways to do this in Python. The more modern way is:
num_of_cats = 2 #variable holding integer!
message = "I have {} cats".format(num_of_cats)
#will print as: "I have 2 cats"
In C/C++ you'd do this using sprintf (see footnote 3):
char message[40]; //pre-allocate sufficient space to store any expected string
int num_of_cats = 2; //integer expressing number of cats
sprintf(message,"I have %d cats", num_of_cats); //write the formatted text into message
//Will Serial.println as "I have 2 cats"
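Just like strcpy, sprintf does not know how big message is. The standard library also provides snprintf, which takes the size of the destination and never writes more than that many bytes. The sketch below is a plain C illustration; format_cat_count is a hypothetical wrapper name made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper for illustration: format the cat message into out
// without ever writing more than out_size bytes.
// Returns 1 if the whole message fit, 0 if it had to be cut short.
int format_cat_count(char *out, size_t out_size, int num_of_cats) {
    // snprintf returns the length the full message would need (without the null)
    int needed = snprintf(out, out_size, "I have %d cats", num_of_cats);
    return needed >= 0 && (size_t)needed < out_size;
}
```

If the return value is 0, out still holds a valid, null-terminated (but shortened) string, provided out_size is at least 1.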
4.5) Length of String
In Python if you want to know the length of your string you just do:
message = "cats"
len_of_string = len(message) #variable holding integer!
In C/C++, you can do something very similar using the strlen function, another part of the string.h library. It returns the number of characters before the first null terminator in a character array. It is distinctly different from sizeof, mentioned earlier. For example:
char message[40] = "Cats";
Serial.println(message); //prints "Cats"
Serial.println(sizeof(message)); //prints 40 (size of array)
Serial.println(strlen(message)); //prints 4...since there are four characters in "Cats"
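The two numbers work together: the array's size tells you its capacity, strlen tells you how much is in use. One catch is that inside a function that receives a char*, sizeof reports the size of the pointer rather than the array, so the capacity has to be passed in separately. A minimal sketch in plain C (chars_remaining is a hypothetical name for this example):

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper for illustration: how many more characters can be
// added to buffer while still leaving room for the null terminator.
// buffer_size must be passed in by the caller; sizeof(buffer) in here would
// only give the size of a pointer. Assumes buffer is null-terminated.
size_t chars_remaining(const char *buffer, size_t buffer_size) {
    size_t used = strlen(buffer);   // characters currently in the string
    return buffer_size - used - 1;  // capacity minus content minus the '\0'
}
```

For char message[40] = "Cats"; this gives 40 - 4 - 1 = 35 characters of room.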
4.6) Clearing Out
Since we'll often use strings in order to collect data/build up data, we'll sometimes want to empty out our string. In Python you can just do:
message = "cats"
message = "" #empty now
In C there are a couple of ways. One way is to do:
char message[40] = "Cats";
strcpy(message,"");//empty now.
This basically places a null character right at the start of the character array, so when anything reads the string starting at the pointer message, it immediately terminates. A slightly more robust way, and one which is useful in case you've been using and reusing your char array, is to set all values inside of the buffer to 0 (which is the ASCII value of null). This can be done like so:
memset(message, 0, sizeof(message)); //write 0 (or '\0') to all bytes in char array!
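To see the difference between the two approaches, you can count how many bytes in the array are still non-zero after each one. This is just an illustrative sketch in plain C; count_nonzero_bytes is a hypothetical helper written for this example.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper for illustration: count how many bytes in the whole
// array are not zero, regardless of where the string "ends".
size_t count_nonzero_bytes(const char *buffer, size_t buffer_size) {
    size_t count = 0;
    for (size_t i = 0; i < buffer_size; i++) {
        if (buffer[i] != 0) {
            count++;
        }
    }
    return count;
}
```

Starting from char message[40] = "Cats";, calling strcpy(message, "") leaves three non-zero bytes behind ('a', 't', 's' at indices 1 to 3) even though strlen(message) is 0, while the memset version leaves none.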
5) Further Readings
6) Try It Out
Manipulating char arrays with functions looks scarier than it is. Since all char arrays are just arrays, and the variable name of an array acts as a pointer to the beginning of the array when you pass it to a function (see the earlier pointer section), if you want to pass char arrays to a function as inputs you simply need to list them as pointers.
You cannot directly return a char array as a data type. Instead, you pass a char array in as an argument and the function changes it. Consider the little function below, combiner. It takes in two char arrays (its inputs), and uses a third argument as an output. You can then use that output as needed (note this code will run on your ESP32 just as is):
#include <string.h>
char input_1[200] = "one";
char input_2[200] = " two";
char output[500];
void setup() {
  Serial.begin(115200);
}
void loop() {
  Serial.println("starting:");
  Serial.println(input_1);
  Serial.println(input_2);
  combiner(input_1, input_2, output);
  Serial.println(output);
}
//Take in two char arrays and merge them into output (assume it is large enough):
void combiner(char* input_1, char* input_2, char* output) {
  strcpy(output, input_1);
  strcat(output, input_2);
}
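The same input-pointer/output-pointer pattern also works when you handle the characters one at a time yourself instead of calling string.h functions. As a small sketch in plain C (upper_copy is a hypothetical function made up for this illustration, and like combiner it assumes output is large enough):

```c
#include <ctype.h>
#include <stddef.h>

// Hypothetical example for illustration: copy input into output one
// character at a time, converting letters to upper case along the way.
// Assumes input is null-terminated and output is large enough.
void upper_copy(const char *input, char *output) {
    size_t i = 0;
    while (input[i] != '\0') {                              // walk until the terminator
        output[i] = (char)toupper((unsigned char)input[i]);
        i++;
    }
    output[i] = '\0';                                       // terminate the output ourselves
}
```

Note the last line: when you build a string by hand, nothing adds the null terminator for you, so the function has to write it.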
Now it is your turn to write a function that manipulates char arrays.
The function, interleaver, should take in two inputs:
char* input_1: A properly null-terminated char array of unknown size
char* input_2: A second, properly null-terminated char array of unknown size
And produce one output (in the form of a third char array handed to it):
char* output: which you may assume is sufficiently large for the function's needs.
The function should interleave the letters of the two strings, starting with the first input, going back and forth.
For example if you gave it two strings, "Hi there" and "Cats" for input_1 and input_2, respectively, you should produce "HCia ttshere" as an output.
The function should be able to handle strings of 0 content (e.g. char thing[] = "";)
Before you get started...do you want infinite submissions? Run test cases on your own system! Your ESP is a perfect testing platform. For example to test the code below using the first test case do the following, where the output will show up on your Serial port! This is much better than debugging in the submission box.
#include <string.h>
void setup() {
  Serial.begin(115200);
  Serial.println("testing");
}
void loop() {
  //test case (taken from page or write your own!)
  Serial.println("Starting Test Case:");
  char in1[] = "ABCD";
  char in2[] = "1234";
  char storage[300];
  memset(storage, 0, sizeof(storage)); //fill array with all nulls
  interleaver(in1, in2, storage);
  Serial.println(storage);
  delay(500);
}
void interleaver(char* input_1, char* input_2, char* output) {
  /* Sorry not giving you the code*/
}
Then running this with my correct code (which I left out) I get this printed in the serial monitor:
Starting Test Case:
A1B2C3D4
Starting Test Case:
A1B2C3D4
Starting Test Case:
A1B2C3D4
Starting Test Case:
A1B2C3D4
...
So try this for every question you need to write. Real-world problems will never have convenient test cases to run; you'll need to make them yourself, so it is good to get practice doing this now!
Do not include Serial.print calls in your submitted code, since that library doesn't exist in the C++ sandbox we run on the server!
Footnotes
1. sizeof takes in any variable in C/C++ and returns the number of bytes of that data structure. An int is four bytes long on the ESP32, so sizeof(5), for example, will return 4. If you have an array of integers that is four long, sizeof(array) will be 16 (four values, each being four bytes long). With char arrays, each char is one byte.
2. And just like any translator, it isn't perfect and is a product of its creators, so don't take this as 100% perfect.
3. sprintf is part of the stdio.h library, which is included by default in an Arduino .ino file.