
Ops Scripting w. Python: Frequency 2
Tracking Frequency in Python: Part II
Updated: 2020-05-24 for clarity
Previously, I presented the problem of how to count frequency using a frequency hash, or dict in Python.
In this article, I will present solutions and discuss the Python language features they rely on.
The Solutions
These solutions use a collection loop, for. I call it a collection loop because the loop construct iterates over a collection, which in our case is a collection of lines from the passwd file.
Solution 1: Basic Collection loop
We open the file and iterate line by line in this example. To keep things simple, we will not handle errors.
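A minimal sketch of that loop might look like the following. The function name read_lines and the default path /etc/passwd are my assumptions for illustration, and this is not necessarily the listing that originally appeared here.

```python
def read_lines(path="/etc/passwd"):
    """Collect every line of the file using a collection loop (no error handling)."""
    lines = []
    # the with block closes the file for us once the loop is done
    with open(path) as passwd_file:
        # iterating over a file object yields one line at a time
        for line in passwd_file:
            lines.append(line)  # each line still carries its trailing newline
    return lines
```

The file object is itself the collection here, so there is no need to read the whole file into memory before looping.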
For every line, we only care about the shell, without a newline character polluting our string. So we do a few operations: strip off the newline, split the string into a list, and pick the seventh item out of that list. This can be broken up into these steps:
line = line.rstrip() # strip newline
line_items = line.split(':') # split up line by ':' divider
shell = line_items[6] # take the 7th item (index 6)
This can all be done in a single line.
shell = line.rstrip().split(':')[6]
Now that we have the shell, we need to check whether we actually got one. Sometimes, though rarely, there may not be a shell defined for that user.
if shell:
    # do stuff with that shell as a key
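To see why this check matters, here is a small sketch using two made-up passwd lines (the accounts and the helper name extract_shell are hypothetical). When the last field is empty, the split still produces a seventh item, but it is an empty string, which Python treats as false.

```python
def extract_shell(line):
    """Return the 7th colon-separated field of a passwd-style line."""
    return line.rstrip().split(':')[6]

# hypothetical sample lines in passwd format, not real accounts
with_shell = "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n"
without_shell = "bob:x:1001:1001:Bob:/home/bob:\n"

for line in (with_shell, without_shell):
    shell = extract_shell(line)
    if shell:
        print("shell found:", shell)
    else:
        print("no shell defined")
```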
Each item in the counts dictionary will have a key that represents the shell, and a value that represents the frequency of that shell in our data file passwd.
We simply need to increment the value. Python does not initialize a dict value the first time its key is used (incrementing a missing key raises a KeyError), so we have to do this manually.
# initialize new key if key doesn't exist in dict
if shell not in counts:
    counts[shell] = 0
# increment the count
counts[shell] += 1
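Putting the pieces of Solution 1 together, a sketch of the whole thing could look like this. I have assumed a function named count_shells_basic that accepts any collection of lines, so it works with an open file object as well as a plain list. That packaging is my choice for illustration.

```python
def count_shells_basic(lines):
    """Count how often each shell appears in passwd-style lines."""
    counts = {}
    for line in lines:
        shell = line.rstrip().split(':')[6]
        if shell:
            # initialize new key if key doesn't exist in dict
            if shell not in counts:
                counts[shell] = 0
            # increment the count
            counts[shell] += 1
    return counts
```

To run it against the real file, pass the open file object, for example inside a with open('/etc/passwd') block.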
Solution 2: Dict get Method
Instead of conditionally setting the frequency count, we can use the get method that comes with the dict class. It returns the value for the key if the key is found; otherwise it returns a default that we supply, which should be 0. Either way, we add one to the result and store it back to increase the count.
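A sketch of this approach, under the same assumption as before that the function (here named count_shells_get, a name I made up) receives a collection of lines:

```python
def count_shells_get(lines):
    """Count shells using dict.get to supply a default of 0."""
    counts = {}
    for line in lines:
        shell = line.rstrip().split(':')[6]
        if shell:
            # get returns the current count, or 0 if the key is not there yet
            counts[shell] = counts.get(shell, 0) + 1
    return counts
```

Note that get by itself never adds the key to the dict; the assignment on the left-hand side is what stores the new count.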
Solution 3: DefaultDict
Another method is to auto-initialize every key to 0 the first time it is referenced, using a dict subclass called defaultdict from the collections library.
With this, Python behaves like languages that auto-initialize missing keys, but it is more powerful, as we can control the default with a custom factory function such as a lambda.
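Here is a sketch of the defaultdict version. The function name count_shells_defaultdict is my own, and I use int as the default factory since calling int() with no arguments returns 0. The last lines show the same default expressed as a lambda.

```python
from collections import defaultdict

def count_shells_defaultdict(lines):
    """Count shells; keys referenced for the first time start at 0."""
    counts = defaultdict(int)  # int() returns 0, so every new key starts at 0
    for line in lines:
        shell = line.rstrip().split(':')[6]
        if shell:
            counts[shell] += 1  # no manual initialization needed
    return counts

# the default can also come from a custom lambda
counts_from_lambda = defaultdict(lambda: 0)
counts_from_lambda['/bin/bash'] += 1
```

One thing to keep in mind is that merely reading a missing key from a defaultdict creates that key with the default value.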
Conclusion
From these solutions, you should have picked up the following takeaways for Python:
- Collection loop (for)
- Splitting a string
- List slicing (or indexing in this case)
- Testing whether a variable is initialized
- Three ways to initialize a default value in the dict class
- The in operator
In the next article, I will show how to use lambda and dict comprehensions to solve the same problem.