String Manipulation

String manipulation is another feature that Python provides. Manipulating strings is essential whenever a program needs to produce output in a specific format.

Accessing Strings

In Python, as in other programming languages, you can access the individual characters of a string by calling a specific index of that string. Let me give you a simple example:

Let’s create a string variable called stringVar that stores the string “My string”. Each character of the string, including the space, sits at its own index:

 

M y   s t r i n g
0 1 2 3 4 5 6 7 8

 

Every string starts at index 0. In this string, the indexes run from 0 to 8. Imagine that we want to print each character of the string on a new line. We could do that in two ways. The first uses the slice operator [] (explained later) inside a while loop:

stringVar = "My string"
index = 0
# Walk through the string one index at a time
while index < len(stringVar):
    print(stringVar[index])
    index += 1

 

Output:
M
y
 
s
t
r
i
n
g

 

The second example is the following:

stringVar = "My string"
# The loop variable holds each character in turn, not a position
for character in stringVar:
    print(character)

 

Output:
M
y
 
s
t
r
i
n
g

 

Now, let’s compare these two examples. In the first example, we reach each character by indexing into the string with a counter that we increment ourselves. In the second example we don’t need to index into the string at all, because the for loop already iterates over its characters. Both examples print each character of the string on a new line, but the second one is simpler and takes fewer lines of code.
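When both the position and the character are needed, the two approaches can be combined with the built-in enumerate() function. The following is only a small sketch; the names string_var and pairs are illustrative and do not come from the examples above.

```python
string_var = "My string"

# enumerate() yields (position, character) pairs for the string
pairs = list(enumerate(string_var))

for position, character in pairs:
    # repr() makes the space at position 2 visible
    print(position, repr(character))
```

This keeps the simplicity of the for loop while still giving access to the index that the while loop tracked by hand.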

Basic Operations

Python provides several useful operators for manipulating strings.

 

Operator | Description
+ | Concatenation operator. It is used to add or join strings together.
* | Repetition operator. It allows you to concatenate multiple copies of a string.
[] | Slice operator. As shown in the example above, it returns the character at a specified index.
[:] | Range slice operator. It returns the characters in a given range, from the start index up to, but not including, the end index.
in | Membership operator. It returns a Boolean value: True if a substring is inside the string, False otherwise.
not in | Works the same way as the in operator, but returns True if the substring is not inside the string.
r/R | Raw string prefix. Backslashes in the string are treated as literal characters instead of starting escape sequences. It is mostly used for strings that contain a directory path.
% | Format operator. It performs string formatting by inserting values and variables into a string at placeholders such as %s.

Now, let’s take a look at examples that use each of these string operators.

# Concatenation operator in strings (+)
variable1 = "My"
variable2 = "string"
print(variable1 + variable2)

 

Output: Mystring

 

# Repetition operator in strings (*)
variable = "string"
print(variable * 3)

 

Output: stringstringstring

 

# Slice operator in strings []
variable = "MY"
print(variable[0])
print(variable[1])

 

Output: 
M
Y
 

 

# Range slice operator in strings [:]
variable = "My string"
print(variable[3:9])

 

Output: string

 

# in operator in strings
variable = "My string"
print("string" in variable)

 

Output: True

 

# not in operator in strings
variable = "My string"
print("cow" not in variable)

 

Output: True

 

# r/R prefix in strings (backslashes are kept as they are)
print(r'C:\Users\User1\Desktop')

 

Output: C:\Users\User1\Desktop
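The effect of the r prefix is easiest to see on a path that happens to contain escape sequences. The sketch below uses a made-up Windows-style path (an assumption for illustration only) in which \n and \t would otherwise be read as a newline and a tab.

```python
regular = "C:\new_folder\table.txt"   # \n and \t are escape sequences here
raw = r"C:\new_folder\table.txt"      # backslashes stay literal

# Each escape sequence collapses two typed characters into one
print(len(regular))
print(len(raw))
print(raw)
```

The raw version is two characters longer because both backslashes survive as ordinary characters.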

 

# % operator in strings
stringVar = "My string"
print("Printing the following string: %s" % stringVar)

 

Output: Printing the following string: My string
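The % operator can also fill in several placeholders at once when the values are passed as a tuple. This is a minimal sketch; the variable names and the message text are illustrative assumptions.

```python
name = "My string"
length = len(name)
ratio = 2 / 3

# Values are matched to placeholders in order:
# %s for strings, %d for integers, %.2f for floats with two decimals
message = "%s has %d characters (ratio %.2f)" % (name, length, ratio)
print(message)
```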


Each example above corresponds to one of the operators explained in the table.

String Slices

As already explained, two operators can produce string slices: the slice operator [] and the range slice operator [:]. The examples of how each one works are the same as shown above.

# Slice operator in strings []
variable = "MY"
print(variable[0])
print(variable[1])

 

Output: 
M
Y

 

# Range slice operator in strings [:]
variable = "My string"
print(variable[3:9])

 

Output: string
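The range slice operator also accepts omitted bounds, negative indexes and an optional step. The following sketch shows a few common variations on the same string; the result variable names are illustrative.

```python
variable = "My string"

first_two = variable[:2]        # omitted start defaults to 0
from_three = variable[3:]       # omitted end defaults to the end of the string
last_char = variable[-1]        # negative indexes count from the end
every_other = variable[::2]     # the third value is the step
reversed_str = variable[::-1]   # a step of -1 walks the string backwards

print(first_two, from_three, last_char, every_other, reversed_str, sep="|")
```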


Functions and Methods

There is another way to manipulate strings, which is by using the built-in functions and methods in Python. Let’s take a look at the different functions and methods that we can use:

 

Function/Method | Description
casefold() | Returns a casefolded copy of the string. It is similar to lower() but more aggressive, and is intended for caseless comparisons.
center(width, fillchar) | Returns the string centered in a field of the given width, padded on the left and right with fillchar (a space by default).
count(substring, begin, end) | Returns the number of non-overlapping occurrences of a substring inside the string, with the option to search only between the indexes begin and end.
encode(encoding='utf-8', errors='strict') / decode(encoding='utf-8', errors='strict') | encode() returns the string encoded as a bytes object, and decode(), which is a method of bytes objects, returns the decoded string. The default encoding is UTF-8; the user can optionally choose another encoding and the error-handling mode.
endswith(suffix, begin=0, end=len(string)) | Returns a Boolean value: True if the string ends with the specified suffix.
expandtabs(tabsize=8) | Returns a copy of the string in which each tab character is replaced by spaces up to the next tab stop.
find(substring, beginIndex, endIndex) | Returns the index of the first match of a substring inside the string. It returns -1 if the substring is not found.
format(*args, **kwargs) | Returns a formatted string. You can format a string by using braces {}, which are replaced with the intended values.
index(substring, begin, end) | Works the same way as find(), except that it raises a ValueError if the substring is not found.
isalnum() | Returns True if all characters in the string are alphanumeric and the string is not empty.
isalpha() | Returns True if all characters in the string are alphabetic and the string is not empty.
isdecimal() | Returns True if all characters are decimal characters, that is, characters that can be used to form numbers in base 10.
isdigit() | Returns True if all characters are digits. This includes decimal characters and digit-like characters such as superscript digits.
isidentifier() | Returns True if the string is a valid identifier. An identifier is any name given to variables, functions, classes, etc.
islower() | Returns True if all cased characters in the string are lowercase and there is at least one cased character.
isnumeric() | Returns True if all characters of the string are numeric characters. This includes digits and other numeric characters such as fractions.
isprintable() | Returns True if all characters in the string are printable or the string is empty, and False if the string contains a non-printable character such as a newline.
isupper() | Returns True if all cased characters in the string are uppercase and there is at least one cased character.
isspace() | Returns True if there is only whitespace inside the string and the string is not empty.
istitle() | Returns True if the string is a titlecased string.
join(iterable) | Concatenates the strings of an iterable object (list, tuple, string, etc.), using the string it is called on as the separator.
len(string) | Built-in function that returns the size (length) of a string. The size is equal to the number of characters inside the string.
ljust(width, fillchar) | Justifies the string to the left and fills the remaining space on the right with fillchar.
lower() | Returns a copy of the string converted to lowercase.
lstrip(chars) | Returns a copy of the string with leading characters removed. The chars parameter is optional; if it is not provided, leading whitespace is removed.
partition(sep) | Splits the string at the first occurrence of sep. It returns a tuple with the part before the separator, the separator itself, and the part after it.
maketrans() | Returns a mapping table that translate() can use. The table can be built from a dictionary.
replace(old, new, count) | Replaces an old substring with a new substring inside the string. The old and new parameters are required, but count is optional.
rfind(substring, start, end) | Similar to find(), but it returns the index of the last match. The start and end parameters are optional.
rindex(substring, start, end) | Similar to index(), but it returns the index of the last match. The start and end parameters are optional.
rjust(width, fillchar) | Justifies the string to the right and fills the remaining space on the left with fillchar. Basically, it is the reverse of ljust().
rstrip(chars) | Returns a copy of the string with trailing characters removed. The chars parameter is optional; if it is not provided, trailing whitespace is removed.
rsplit(sep=None, maxsplit=-1) | Returns a list of substrings, using sep as the delimiter. If the user doesn’t specify the separator, whitespace is used. This method is equal to split(), except that splitting starts from the right.
split(sep=None, maxsplit=-1) | Returns a list of substrings and works the same way as rsplit(), with the difference that splitting starts from the left.
splitlines() | Splits the string into lines; in other words, the split is based on line breaks such as the newline character (\n).
startswith(prefix, start, end) | Returns a Boolean value: True if the string starts with the given prefix.
swapcase() | Returns a copy of the string with the case of all characters inverted.
title() | Returns a titlecased copy of the string.
translate(table) | Returns a copy of the string in which each character has been mapped through the given table, such as one created by maketrans().
upper() | Returns a copy of the string with all characters converted to uppercase.
zfill(width) | Pads the string with 0 digits on the left until it reaches the given width.
rpartition(sep) | Splits the string at the last occurrence of sep. It returns a tuple with the part before the separator, the separator itself, and the part after it.

 

Now, let’s take a look at examples that use these string manipulation functions and methods.

 

# casefold method
variable1 = "MY STRING"
new_variable1 = variable1.casefold()
print(new_variable1)

 

Output: my string
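For plain ASCII text, casefold() and lower() give the same result. The difference only shows up with certain non-ASCII characters. The sketch below uses the German word "Straße" as an illustrative assumption; it is not part of the examples above.

```python
word = "Stra\u00dfe"   # "Straße"; \u00df is the German sharp s

lowered = word.lower()      # the sharp s is already lowercase, so it stays
folded = word.casefold()    # casefold() maps the sharp s to "ss"

print(lowered)
print(folded)

# Casefolded strings can be compared without regard to case
same = folded == "STRASSE".casefold()
print(same)
```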

 

# center method
variable = "My string"
new_variable = variable.center(25,"@")
print(new_variable)

 

Output: @@@@@@@@My string@@@@@@@@

 

# count method
variable = "My string string string string string"
new_variable = variable.count("s")
print("Count = " + str(new_variable))

 

Output: Count = 5
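count() also accepts a substring longer than one character and an optional index range. A small sketch of both options follows, along with a reminder that matches are counted without overlapping; the variable names are illustrative.

```python
variable = "My string string string string string"

whole = variable.count("string")            # search the entire string
partial = variable.count("string", 0, 16)   # search only indexes 0 to 15
overlap = "aaaa".count("aa")                # matches do not overlap

print(whole, partial, overlap)
```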

 

# encode/decode method
variable = "My string"
encoded_variable = variable.encode()
print(encoded_variable)
decoded_variable = encoded_variable.decode()
print(decoded_variable)

 

Output: 
b'My string'
My string

 

# endswith method
variable = "My string0"
end_with_variable = variable.endswith("0")
print(end_with_variable)

 

Output:True

 

# expandtabs method
variable = "My \t string \t is \t unique"
print(variable.expandtabs(1))
print(variable.expandtabs(3))
print(variable.expandtabs(5))

 

Output: 
My   string   is   unique
My     string   is    unique
My    string    is   unique

 

# find method
variable = "My string"
print(variable.find("string"))

 

Output: 3

 

# format method
variable1 = "My"
variable2 = "string"
print("{} and {}".format(variable1, variable2))

 

Output: My and string

 

# index method
variable = "My string"
print(variable.index("string"))

 

Output: 3
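find() and index() return the same position when the substring is present; they differ only when it is missing. The following minimal sketch searches for a substring that does not occur in the string ("cow" is an arbitrary choice).

```python
variable = "My string"

# find() signals a missing substring with -1
missing_find = variable.find("cow")

# index() signals a missing substring by raising ValueError
try:
    variable.index("cow")
    index_raised = False
except ValueError:
    index_raised = True

print(missing_find)
print(index_raised)
```

find() is convenient when a missing substring is an expected case, while index() suits code where a missing substring should be treated as an error.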

 

# isalnum method
variable = "Mystring123"
print(variable.isalnum())

 

Output: True

 

# isalpha method
variable = "Mystring"
print(variable.isalpha())

 

Output: True

 

# isdecimal method
variable = "1234567890"
print(variable.isdecimal())

 

Output: True

 

# isdigit method
variable = "100"
print(variable.isdigit())

 

Output: True

 

# isidentifier method
variable = "Mystring"
print(variable.isidentifier())

 

Output: True

 

# isnumeric method
variable = "1234567890"
print(variable.isnumeric())

 

Output: True
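For ordinary digits such as "1234567890", isdecimal(), isdigit() and isnumeric() all return True, so the examples above do not show how they differ. The sketch below compares them on three single characters chosen for illustration: a plain digit, a superscript two and the fraction one half.

```python
samples = ["5", "\u00b2", "\u00bd"]   # "5", superscript two, one half

# Collect (isdecimal, isdigit, isnumeric) for each character
results = {ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric()) for ch in samples}

for ch, flags in results.items():
    print(repr(ch), flags)
```

Each method accepts a wider set of characters than the one before it: isdecimal() is the strictest and isnumeric() the most permissive.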

 

# isprintable method
variable = "My string"
print(variable.isprintable())

 

Output: True
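It is worth checking how isprintable() treats an empty string and a string with a line break. A minimal sketch, with illustrative variable names:

```python
empty_result = "".isprintable()              # an empty string counts as printable
newline_result = "My\nstring".isprintable()  # \n is a non-printable character
space_result = "My string".isprintable()     # a plain space is printable

print(empty_result, newline_result, space_result)
```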

 

# isupper method
variable = "MYSTRING"
print(variable.isupper())

 

Output: True

 

# isspace method
variable = "    "
print(variable.isspace())

 

Output: True

 

# istitle method
variable = "My String"
print(variable.istitle())

 

Output: True

 

# join method
variable = "-"
number_list = ['1','2','3','4','5','6','7','8','9']
print(variable.join(number_list))

 

Output: 1-2-3-4-5-6-7-8-9
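join() only accepts strings, which is why the list above holds '1', '2', and so on rather than numbers. If the items are integers, they can be converted first. This is a small sketch under that assumption; the list contents are illustrative.

```python
numbers = [1, 2, 3]

# "-".join(numbers) would raise TypeError because the items are ints,
# so each item is converted to a string before joining
joined = "-".join(str(n) for n in numbers)
print(joined)
```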

 

# len function (a built-in function, not a string method)
variable = "My string"
print(len(variable))

 

Output: 9

 

# ljust method
variable = "Mystring"
print(variable.ljust(25,"&"))

 

Output: Mystring&&&&&&&&&&&&&&&&&

 

# lower method
variable = "MY STRING"
print(variable.lower())

 

Output: my string

 

# lstrip method
additional_var = "!!!"
variable = "       My string"
print(additional_var + variable.lstrip())

 

Output: !!!My string

 

# partition method
variable = "Mystring"
print(variable.partition("string"))

 

Output: ('My', 'string', '')

 

# maketrans method
dictionary = {"a": "1", "b": "2", "c": "3", "d": "4"}
variable = "abcd"
print(variable.maketrans(dictionary))

 

Output: {97: '1', 98: '2', 99: '3', 100: '4'}

 

# replace method (replaces the substring "string" with "new string")
variable = "My string"
print(variable.replace("string", "new string"))

 

Output: My new string

 

# rfind method
variable = "My string"
print(variable.rfind("string"))

 

Output: 3

 

# rindex method
variable = "My string"
print(variable.rindex("My"))

 

Output: 0

 

# rjust method
variable = "My string"
print(variable.rjust(25,"#"))

 

Output: ################My string

 

# rstrip method
variable = "My string       "
additional_string = "!!!"
print(variable.rstrip() + additional_string)

 

Output: My string!!!
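lstrip() and rstrip() can also remove characters other than spaces when the optional parameter is given. The parameter is treated as a set of characters, not as a prefix or suffix. A minimal sketch with a made-up decorated string:

```python
variable = "--==My string==--"

# Any leading "-" or "=" is removed, in whatever order they appear
left = variable.lstrip("-=")

# Any trailing "-" or "=" is removed, in whatever order they appear
right = variable.rstrip("-=")

print(left)
print(right)
```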

 

# rsplit method
variable = "My string"
print(variable.rsplit())

 

Output: ['My', 'string']

 

# split method
variable = "My string"
print(variable.split())

 

Output: ['My', 'string']
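With the default arguments, split() and rsplit() return the same list. The direction only matters when maxsplit limits the number of splits. The sketch below uses an illustrative comma-separated string and allows a single split.

```python
variable = "one,two,three"

# Only one split is allowed, so the side it starts from matters
left_split = variable.split(",", 1)
right_split = variable.rsplit(",", 1)

print(left_split)
print(right_split)
```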

 

# splitlines method
variable = "My \n string is \n my \n string"
print(variable.splitlines())

 

Output: ['My ', ' string is ', ' my ', ' string']

 

# swapcase method
variable = "My string"
print(variable.swapcase())

 

Output: mY STRING

 

# startswith method
variable = "My string"
print(variable.startswith("My"))

 

Output: True

 

# title method
variable = "my STRING"
print(variable.title())

 

Output: My String

 

# translate method
dictionary = {"a":"1","b":"2","c":"3"}
variable = "abc"
trans_table = variable.maketrans(dictionary)
print("Trans table: " + str(trans_table))
# translate() needs the table returned by maketrans(), not the original dictionary
translated_string = variable.translate(trans_table)
print("Translated string: " + translated_string)

 

Output: 
Trans table: {97: '1', 98: '2', 99: '3'}
Translated string: 123
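maketrans() can also build the table from two strings of equal length instead of a dictionary, with an optional third string of characters to delete. A minimal sketch, assuming an illustrative input string:

```python
# Map a->1, b->2, c->3 and delete every d
table = str.maketrans("abc", "123", "d")

translated = "abcdabcd".translate(table)
print(translated)
```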

 

# upper method
variable = "my string"
print(variable.upper())

 

Output: MY STRING

 

# zfill method
variable = "My string"
print(variable.zfill(25))

 

Output: 0000000000000000My string

 

# rpartition method
variable = "My string is just a string"
print(variable.rpartition("string"))

 

Output: ('My string is just a ', 'string', '')
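The difference between partition() and rpartition() shows when the separator occurs more than once, as it does in this string. The sketch below runs both on the same string and also shows what partition() returns when the separator is missing ("cow" is an arbitrary choice).

```python
variable = "My string is just a string"

first = variable.partition("string")    # splits at the first occurrence
last = variable.rpartition("string")    # splits at the last occurrence
missing = variable.partition("cow")     # separator not found

print(first)
print(last)
print(missing)
```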

 

 
