Suppose you have a list of text data in Excel columns that you want to work with and you need to extract only a portion of the text following a specific pattern. This pattern can vary, like a specific character, a particular number of characters, or whatever your actual data is.
It is the very first and important step that you can decipher the actual pattern of your data so that you can build the correct formula to manipulate the data. See for instance the sample column of data below:
In the sample above, how would extract the text AFTER the first underscore character? For instance in cell A1, how will you extract joker_12568adcf?
Using the MID Function
MID function returns a specific number of characters from a text string, starting at the position you specify, based on the number of characters you specify. But one thing to bear in mind, MID always counts each character, whether single-byte or double-byte, as 1, no matter what the default language setting is.
Its syntax is:
MID(text, start_num, num_chars)
where:
- Text (Required) The text string containing the characters you want to extract.
- Start_num (Required) The position of the first character you want to extract in text. The first character in text has start_num 1, and so on.
- Num_chars Required. Specifies the number of characters you want MID to return from text.
Example:
=MID(A3,7,20) will return the twenty characters starting from the 7th character in the text in A3.
Using the FIND Function
FIND function locates one text string within a second text string, and return the number of the starting position of the first text string from the first character of the second text string. Remeber, FIND always counts each character, whether single-byte or double-byte, as 1, no matter what the default language setting is.
Its syntax is:
FIND(find_text, within_text, [start_num])
where:
- Find_text Required. The text you want to find.
- Within_text Required. The text containing the text you want to find.
- Start_num Optional. Specifies the character at which to start the search. The first character in within_text is character number 1. If you omit start_num, it is assumed to be 1.
Example:
FIND(“M”,A2) will find the first occurence of the letter M contained in cell A2.
Using the LEN Function
LEN returns the number of characters in a text string. Its only parameter is the text entry which is the subject of the calculation of the number of characters.
The Working Formula
In order to correctly extract the characters to the right of the first instance of the underscore in our example above, we will be utilizing a combination of the functions described above.
The exact formula to be entered in B1 in our sample above is:
=MID(A1,FIND("_",A1)+1,LEN(A1))
As a result of the formula, all texts after the first underscore ( _ ) are extracted in column B.
The sample Excel file with the working formula can be downloaded below.