MS Excel 2003/2007/2010/2013 Tips: How to delete part of text strings in cells using formula

Printer-friendly versionPrinter-friendly versionSend by emailSend by emailPDF versionPDF version

Suppose you have thousands of cells or rows of cells containing text codes with identifiable patterns, you can use Excel to manipulate those text codes using formulas. This is useful when the common method of "Find & Replace" won't be fit for the scenario.

Deleting text in cells using formulas

For instance, look at these complex codes:

2013-A1734-B12-CA310-PC1173-L32-60
2013-A1622-B18-CC110-PC2280-L10-60
2013-A4850-B22-CA310-PC4486-L16-70
2013-A4680-B20-CD160-PC1030-L27-80

2013-A4680-B20-CD160-PC1030-L27-80-A
2013-A4680-B20-CD160-PC1030-L27-80-C

In various data composition, these codes can mean for instance:

2013 = Year
A1734 = Accounting document code (internal)
B12 = Billing document code (internal)
CA310 = Product Line code
PC1173 = Production code
L32
= Production Line code
60 = Production batch number

A = Misc data pointer

Regardless of how to "compose" your code, this article will be helpful if you want to work with patterned codes such as the list above in Excel exploiting Excel's built-in functions and formulas.

In this example, the goal is to remove ALL characters to the right of a marker. Our marker here will be the dashes. You may notice that the codes have 6 to 7 dashes. In this illustrative example, we will try removing ALL characters to the right of the fifth (5th) dash using a formula.

Original Code: 2013-A4680-B20-CD160-PC1030-L27-80-A   

Desired Output: 2013-A4680-B20-CD160-PC1030

In this sample, all characters after the fifth dash will be removed using combination of functions in a formula.

Raw data showing original codes

The formula to be used here in order to achieve the goal is this:

=LEFT(B2,FIND("*",SUBSTITUTE(B2,"-","*",5))-1)

The logic here is shown in the various functions used.

 

The Substitute() funtion part: SUBSTITUTE(B2,"-","*",5)

The Substitute function has the syntax:

SUBSTITUTE(text, old_text, new_text, [instance_num])

What the function does is it "substitutes" new_text for old_text in a text string. Use SUBSTITUTE when you want to replace specific text in a text string, in our case, since there are six (6) to seven (7) dashes in the code, we need to set a marker that signal Excel "where" to start deleting characters. In the formula, we will substitute the fifth dash with an asterisk. The result of the formula when B2 contains 2013-A4680-B20-CD160-PC1030-L27-80-A is 2013-A4680-B20-CD160-PC1030*L27-80-A.

The SUBSTITUTE function syntax has the following arguments:

  • Text  Required. The text or the reference to a cell containing text for which you want to substitute characters.
  • Old_text  Required. The text you want to replace.
  • New_text  Required. The text you want to replace old_text with.
  • Instance_num  Optional. Specifies which occurrence of old_text you want to replace with new_text. If you specify instance_num, only that instance of old_text is replaced. Otherwise, every occurrence of old_text in text is changed to new_text.

 

The Find() part: FIND("*",SUBSTITUTE(B2,"-","*",5))

The FIND function locates one text string within a second text string, and return the number of the starting position of the first text string from the first character of the second text string.

It has the syntax:

FIND(find_text, within_text, [start_num])

The FIND and FINDB function syntax has the following arguments:

  • Find_text  Required. The text you want to find.
  • Within_text  Required. The text containing the text you want to find.
  • Start_num  Optional. Specifies the character at which to start the search. The first character in within_text is character number 1. If you omit start_num, it is assumed to be 1.

In our example, the first function to execute is the inner function - the substitute function. That result of the substitute function for instance if B2 contains 2013-A4680-B20-CD160-PC1030-L27-80-A is 2013-A4680-B20-CD160-PC1030*L27-80-A. Now, the FIND function will locate the position of the asterisk (*). When only the FIND function is executed in B2, it will give 28.

Raw data shoing results of FIND function

Lastly, the final formula is LEFT() function

 

The LEFT() function part: LEFT(B2,FIND("*",SUBSTITUTE(B2,"-","*",5))-1)

The LEFT function returns the first character or characters in a text string, based on the number of characters you specify. It has the syntax:

LEFT(text, [num_chars])

The LEFT function syntax has the following arguments:

  • Text  Required. The text string that contains the characters you want to extract.
  • Num_chars  Optional. Specifies the number of characters you want LEFT to extract.
    • Num_chars must be greater than or equal to zero.
    • If num_chars is greater than the length of text, LEFT returns all of text.
    • If num_chars is omitted, it is assumed to be 1.

In our example above: 

SUBSTITUTE(B2,"-","*",5) => 2013-A4680-B20-CD160-PC1030*L27-80-A

FIND("*",SUBSTITUTE(B2,"-","*",5)) => 28

Therefore, the LEFT function is left with:

=LEFT(B2,28-1)

This simply means that Excel will only display the first 27 characters contained in the cell B2.

Raw data showing the use of the LEFT function

 

You can download the sample Excel file below.

MS Office Software: