Ops Scripting w. Python: Frequency 2

Tracking Frequency in Python: Part II

Updated: 2020–05–24 for clarity

Previously, I presented the problem of how to count frequency using a frequency hash, or a dictionary (dict) in Python.

In this article, I will present solutions and discuss the Python language features they use.

The Solutions

These solutions use a collection loop. I call it a collection loop because the loop construct iterates over a collection, which in our case is the collection of lines from the file.

Solution 1: Basic Collection Loop

We open the file and iterate line by line in this example. To keep things simple, we will not handle errors:
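A minimal sketch of opening a file and looping over its lines might look like the following. The sample data and the temporary file are assumptions made so the example runs anywhere; a real script would open its own passwd-style data file instead.

```python
import os
import tempfile

# Hypothetical sample data in /etc/passwd format (7 colon-separated fields).
SAMPLE = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n"
)

# Write the sample to a temporary file so there is something to open.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(SAMPLE)
    path = tmp.name

lines_seen = []

# The collection loop: a file object yields one line per iteration,
# and each line still carries its trailing newline.
with open(path) as data_file:
    for line in data_file:
        lines_seen.append(line)

os.remove(path)  # clean up the temporary file
print(len(lines_seen), "lines read")
```

The with statement closes the file when the block ends, so no explicit close call is needed.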

For every line, we only care about the shell, without a newline character polluting our string. So we do a few operations: strip off the newline, split the string into a list, and index into the list to pick out the shell. This can be broken up into these steps:

line = line.rstrip()          # strip trailing newline
line_items = line.split(':')  # split up line by ':' divider
shell = line_items[6]         # take the 7th item (index 6)

This can all be done in a single line.

shell = line.rstrip().split(':')[6]
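To see what each step produces, here is a small worked example. The sample line is hypothetical and simply follows the seven-field passwd layout.

```python
# Hypothetical sample line in /etc/passwd format (7 colon-separated fields).
line = "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n"

stripped = line.rstrip()          # trailing newline removed
line_items = stripped.split(":")  # list of 7 strings
shell = line_items[6]             # last field, the login shell

# The chained one-liner gives the same result as the three steps.
one_liner = line.rstrip().split(":")[6]

print(line_items)
print(shell)
```

Note that a line with fewer than seven fields would raise an IndexError at the indexing step, which is one of the errors we are choosing not to handle here.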

Now that we have the shell, we need to check whether we actually got one. Sometimes, though rarely, there may be no shell defined for that user, in which case the field is an empty string.

if shell:
    # do stuff with that shell as a key
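As a quick illustration of why this check works, consider a hypothetical line whose last field is empty. An empty string is falsy in Python, so the body of the if is skipped.

```python
# Hypothetical line where the shell field (the 7th) is empty.
line = "bob:x:1001:1001:Bob:/home/bob:\n"
shell = line.rstrip().split(":")[6]

# An empty string is falsy, so this line takes the else branch.
if shell:
    result = "shell: " + shell
else:
    result = "no shell defined"

print(result)
```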

Each item in the dictionary will have a key that represents the shell, and a value that represents how often that shell is used in our data file.

We simply need to increment the value. Because a Python dict does not initialize a value the first time a key is used (incrementing a missing key raises a KeyError), we have to do this manually.

# initialize new key if key doesn't exist in dict
if shell not in counts:
    counts[shell] = 0
# increment the count
counts[shell] += 1
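Putting the pieces together, Solution 1 could be sketched as follows. The list of sample lines is an assumption that stands in for the open file, so the example is self-contained.

```python
# Hypothetical sample lines in /etc/passwd format; a real script would
# iterate over an open file object instead of this list.
lines = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n",
    "bob:x:1001:1001:Bob:/home/bob:\n",  # no shell defined
]

counts = {}

for line in lines:
    shell = line.rstrip().split(":")[6]
    if shell:
        # initialize new key if key doesn't exist in dict
        if shell not in counts:
            counts[shell] = 0
        # increment the count
        counts[shell] += 1

for shell, count in counts.items():
    print(shell, count)
```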

Solution 2: Dict get Method

Instead of conditionally setting the frequency count, we can use the get method that comes with the dict class. This returns a default value if the key is not found, which should be 0; otherwise it returns the stored value. Either way, we add one to the result and store it back to increase the count.
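A minimal sketch of this approach, again using a hypothetical list of sample lines in place of the file:

```python
# Hypothetical sample lines in /etc/passwd format.
lines = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n",
]

counts = {}

for line in lines:
    shell = line.rstrip().split(":")[6]
    if shell:
        # get returns the current count, or 0 if the key is missing;
        # the sum is then stored back under the same key.
        counts[shell] = counts.get(shell, 0) + 1

print(counts)
```

Calling get never adds the key by itself; the assignment on the left-hand side is what creates or updates the entry.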

Solution 3: DefaultDict

Another method is to auto-initialize any key that is referenced for the first time to 0, using a dict subclass called defaultdict from the collections module.
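A minimal sketch, assuming int as the default factory (calling int() with no arguments returns 0) and the same hypothetical sample lines as stand-in data:

```python
from collections import defaultdict

# Hypothetical sample lines in /etc/passwd format.
lines = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n",
]

# int() returns 0, so every missing key starts at 0 on first access.
counts = defaultdict(int)

for line in lines:
    shell = line.rstrip().split(":")[6]
    if shell:
        counts[shell] += 1  # no explicit initialization needed

print(dict(counts))
```

One thing to keep in mind: merely reading a missing key with counts[key] inserts it with the default value, so use the in operator or get when you only want to look.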

With this, Python now behaves like languages that auto-initialize missing keys, but it is more powerful, as we can control the default with a custom lambda.
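To illustrate the custom lambda, here is a sketch where each missing key starts out as a small record rather than a plain 0. The record layout (count and users) is purely an assumption for illustration and goes slightly beyond counting.

```python
from collections import defaultdict

# Hypothetical sample lines in /etc/passwd format.
lines = [
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n",
]

# The factory can be any zero-argument callable; it runs once for each
# missing key, so every shell gets its own fresh record.
stats = defaultdict(lambda: {"count": 0, "users": []})

for line in lines:
    fields = line.rstrip().split(":")
    user, shell = fields[0], fields[6]
    if shell:
        stats[shell]["count"] += 1
        stats[shell]["users"].append(user)

for shell, record in stats.items():
    print(shell, record["count"], record["users"])
```

For plain counting, defaultdict(lambda: 0) and defaultdict(int) behave the same way.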


From these solutions, you should have picked up the following takeaways for Python:

  • Collection Loop (for ... in)
  • Splitting a String
  • List Slicing (or indexing in this case)
  • Testing whether a variable is initialized
  • 3 ways to initialize a default value in the dict class
  • the in operator

In the next article, I will show how to use lambda and dict comprehensions to solve the same problem.
