String Compression

Given an array of characters chars, compress it using the following algorithm:

Begin with an empty string s. For each group of consecutive repeating characters in chars:

  • If the group's length is 1, append the character to s.
  • Otherwise, append the character followed by the group's length.

The compressed string s should not be returned separately, but instead, be stored in the input character array chars. Note that group lengths that are 10 or longer will be split into multiple characters in chars.

After you are done modifying the input array, return the new length of the array.

You must write an algorithm that uses only constant extra space.

Problem Statement

In short, we must replace each run of consecutive identical characters in 'chars' with the character followed by the run's length, omitting the length when it is 1. The result is written into the front of the input array 'chars' rather than returned separately, and the function returns the new length of the array. The algorithm must use only constant extra space. A run length of 10 or more occupies one array slot per digit.

Examples

Example 1:

Input: chars = ["a","a","b","b","c","c","c"]
Output: Return 6, and the first 6 characters of the input array should be: ["a","2","b","2","c","3"]
Explanation: The groups are "aa", "bb", and "ccc". This compresses to "a2b2c3".

Example 2:

Input: chars = ["a"]
Output: Return 1, and the first character of the input array should be: ["a"]
Explanation: The only group is "a", which remains uncompressed since it's a single character.

Example 3:

Input: chars = ["a","b","b","b","b","b","b","b","b","b","b","b","b"]
Output: Return 4, and the first 4 characters of the input array should be: ["a","b","1","2"].
Explanation: The groups are "a" and "bbbbbbbbbbbb". This compresses to "ab12".
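
Example 3 shows a group length of 12 stored as the two characters "1" and "2". As a small illustration of that splitting rule, the sketch below turns a count into its digit characters. The helper name count_to_chars is hypothetical and not part of the problem.

```python
from typing import List


def count_to_chars(count: int) -> List[str]:
    """Split a group length into one character per decimal digit.

    Hypothetical helper for illustration only.
    """
    return list(str(count))


# A group of twelve "b" characters contributes "b", "1", "2" to the result.
digits = count_to_chars(12)
```

A count below 10 yields a single character, so the same helper covers both cases.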

Clarifying Questions

  1. What is the size range of the input array 'chars'?
  2. Can we assume that the input array 'chars' contains only uppercase and lowercase English letters?
  3. Should the function modify the original input array or create a new one?
  4. Is it acceptable to modify the input array in place, including removing elements?
  5. Can we assume that the input array 'chars' is not null or empty?
  6. Is it required to return the compressed string s, or just the new length of the array?
  7. Is it necessary to preserve the order of characters in the compressed string?
  8. Can we use any built-in functions or libraries in our solution?

String Compression Solutions

There are different approaches to solving this problem. Some of them include:

  1. Two-pointers
  2. Stack
  3. String Concatenation

Two-pointers Approach:

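This approach uses a read pointer to scan the array and a write pointer to mark where the next compressed character goes. We keep track of the current character and its count. Whenever the current character changes, we write the previous character (and its count, if greater than 1) into the array at the write pointer and reset the count. Because a compressed group is never longer than the group itself, the write pointer never overtakes the read pointer, so no unread input is overwritten.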

Algorithm:

  1. Initialize currentChar to the first character in the array, count to 1, and a write index to 0.
  2. Iterate through the array from the second character.
  3. If the current character is the same as the previous character, increment the count.
  4. If the current character is different from the previous character, write the previous character at the write index, followed by each digit of its count if the count is greater than 1, advancing the write index after every write.
  5. Update currentChar to the current character and reset the count to 1.
  6. After the loop, write the last character and, if its count is greater than 1, the digits of its count in the same way. Return the write index as the new length.
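
The steps above can be sketched in Python as follows. This is one possible in-place variant, not a definitive implementation: it scans each group with an inner loop instead of tracking currentChar across iterations, and the names compress, read and write are chosen here for illustration. Converting the count with str() uses a few temporary characters (one per digit), which is usually accepted as constant extra space for this problem.

```python
from typing import List


def compress(chars: List[str]) -> int:
    """Compress chars in place and return the new length."""
    n = len(chars)
    read = 0   # scans the input
    write = 0  # next slot for compressed output

    while read < n:
        current = chars[read]
        group_start = read
        # Advance read past every character of the current group.
        while read < n and chars[read] == current:
            read += 1
        count = read - group_start

        # Write the character, then its count digit by digit if count > 1.
        chars[write] = current
        write += 1
        if count > 1:
            for digit in str(count):
                chars[write] = digit
                write += 1

    return write


chars = ["a", "a", "b", "b", "c", "c", "c"]
new_length = compress(chars)
# Only chars[:new_length] is meaningful after the call.
```

Entries beyond the returned length are leftovers from the original input and should be ignored by the caller.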

Time complexity: O(n), where n is the length of the input array. This is because we only iterate over the array once.

Space complexity: O(1), since we only use constant extra space.

Stack Approach:

This approach involves using a stack to keep track of the current group of repeating characters. We can then iterate through the array, push the current character onto the stack if it matches the top element, and pop the stack if it doesn't match. Whenever we pop the stack, we append the compressed characters to the output string.

Algorithm:

  1. Initialize an empty stack and push the first character in the array onto the stack.
  2. Iterate through the array from the second character.
  3. If the current character is the same as the top element of the stack, push the current character onto the stack.
  4. If the current character is different from the top element of the stack, pop every element off the stack while counting them, then append the popped character to the output string, followed by the count if it is greater than 1. Then push the current character onto the now-empty stack.
  5. Repeat steps 3-4 for each remaining character in the array.
  6. After the loop, pop the remaining characters from the stack and append the character to the output string, followed by the count if it is greater than 1. Copy the output string into the front of chars and return its length.
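
A minimal Python sketch of the stack idea is shown below, under the assumption that the stack only ever holds the current group. The names compress_with_stack and flush are hypothetical, and a list of characters stands in for the output string. Clearing the stack in one call is used as a shortcut for popping every element.

```python
from typing import List


def compress_with_stack(chars: List[str]) -> int:
    """Stack-based compression; uses O(n) extra space."""
    if not chars:
        return 0

    stack: List[str] = []  # holds the current group of equal characters
    out: List[str] = []    # compressed output, built separately

    def flush() -> None:
        # Empty the stack and emit its character plus the group size.
        char = stack[-1]
        count = len(stack)
        stack.clear()
        out.append(char)
        if count > 1:
            out.extend(str(count))  # one entry per digit

    for ch in chars:
        if stack and stack[-1] != ch:
            flush()
        stack.append(ch)
    flush()  # emit the final group

    # Copy the result into the front of the input array.
    chars[:len(out)] = out
    return len(out)
```

Because both the stack and the output grow with the input, this version does not meet the constant-extra-space requirement and is mainly useful for comparison.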

Time complexity: O(n), where n is the length of the input array. This is because we iterate over the array once, and each character is pushed onto and popped from the stack at most once.

Space complexity: O(n), since we use a stack to store the current group and a separate output string for the result. The stack is largest when all characters are identical, in which case it holds n elements. The output string is largest when all characters are unique, in which case it holds n characters.

String Concatenation Approach:

This approach involves concatenating the compressed characters directly to the output string instead of appending them to an intermediate data structure. We can then iterate through the array and update the output string accordingly.

Algorithm:

  1. Initialize an empty output string and variables currentChar and count to the first character in the array and 1, respectively.
  2. Iterate through the array from the second character.
  3. If the current character is the same as the previous character, increment the count.
  4. If the current character is different from the previous character, append the previous character to the output string, followed by its count if the count is greater than 1.
  5. Update currentChar to the current character and reset the count to 1.
  6. After the loop, append the last character to the output string, followed by its count if the count is greater than 1.
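
The steps above might look like this in Python. It is a sketch only: the name compress_with_string is hypothetical, and the final copy back into chars is an added step so that the function still matches the problem's in-place contract.

```python
from typing import List


def compress_with_string(chars: List[str]) -> int:
    """Build the compressed form in a separate string; O(n) extra space."""
    if not chars:
        return 0

    s = ""
    current_char = chars[0]
    count = 1

    for ch in chars[1:]:
        if ch == current_char:
            count += 1
        else:
            # Group ended: append the character and, if needed, its count.
            s += current_char + (str(count) if count > 1 else "")
            current_char = ch
            count = 1

    # Append the final group.
    s += current_char + (str(count) if count > 1 else "")

    # Copy the compressed string into the front of the input array.
    chars[:len(s)] = s
    return len(s)
```

Python strings are immutable, so each += may copy the string built so far. Collecting the parts in a list and joining them once at the end is the usual way to keep the running time linear.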

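Time complexity: O(n), where n is the length of the input array, provided that appending to the string takes amortized constant time. We iterate over the array only once, but in languages with immutable strings each append may copy the whole string, so a string builder or a list of parts is needed to keep the total cost linear.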

Space complexity: O(n), since we use a string to store the compressed characters. In the worst case, when all characters are unique, the string will contain n characters.

In terms of time and space complexity, the Two-Pointer Approach is the most optimal. It has a time complexity of O(n) and a space complexity of O(1), so it solves the problem efficiently and is the only one of the three approaches that meets the constant-extra-space requirement.

The Stack Approach and String Concatenation Approach both have O(n) space complexity, which violates the problem's constant-extra-space constraint and can be a problem for larger input arrays. They are useful for understanding the problem, but they are not acceptable final solutions.

Overall, the Two-Pointer Approach is the best choice for solving this problem.
